-
Notifications
You must be signed in to change notification settings - Fork 18.3k
Closed
Labels
Description
by kq1quick:
The language specification currently states: Tokens Tokens form the vocabulary of the Go language. There are four classes: identifiers, keywords, operators and delimiters, and literals. White space, formed from spaces (U+0020), horizontal tabs (U +0009), carriage returns (U+000D), and newlines (U+000A), is ignored except as it separates tokens that would otherwise combine into a single token. While breaking the input into tokens, the next token is the longest sequence of characters that form a valid token. The second sentence is wrong: it states that there are "four" classes and then proceeds to enumerate 5. The third sentence is no longer true with the Dec 22 release. At a minimum, the assertion above that carriage returns are whitespace must be removed because they now no longer server to separate tokens and are ignored otherwise: they significantly affect the parse. See issues 453 and 454 and http://groups.google.com/group/golang-nuts/t/9eb0c7c1f20db921 (particularly my post). More explicitly, it should discuss the fact that newlines are sometimes just whitespace and sometimes parse elements and that there are special rules throughout the latter portions of the language specification detailing when they are not just whitespace. To this end, newlines may form the 6th class of tokens.