diff --git a/source/lex.tex b/source/lex.tex
index 8005b33374..b31ed7f413 100644
--- a/source/lex.tex
+++ b/source/lex.tex
@@ -116,8 +116,8 @@
 to the file.
 \item
 The source file is decomposed into preprocessing
-tokens\iref{lex.pptoken} and sequences of whitespace characters
-(including comments). A source file shall not end in a partial
+tokens\iref{lex.pptoken} and whitespace\iref{lex.comment}.
+A source file shall not end in a partial
 preprocessing token or in a partial comment.
 \begin{footnote}
 A partial preprocessing
@@ -129,10 +129,6 @@
 would arise from a source file ending with an unclosed \tcode{/*}
 comment.
 \end{footnote}
-Each comment\iref{lex.comment} is replaced by one \unicode{0020}{space} character. New-line characters are
-retained. Whether each nonempty sequence of whitespace characters other
-than new-line is retained or replaced by one \unicode{0020}{space} character is
-unspecified.
 As characters from the source file are consumed
 to form the next preprocessing token
 (i.e., not being consumed as part of a comment or other forms of whitespace),
@@ -518,6 +514,8 @@
 \indextext{comment!\tcode{//}}%
 The characters \tcode{//} start a comment, which terminates immediately before the
 next new-line character.
+Each comment is replaced by one \unicode{0020}{space} character;
+new-line characters are retained.
 \begin{note}
 The comment characters \tcode{//}, \tcode{/*},
 and \tcode{*/} have no special meaning within a \tcode{//} comment and
@@ -525,6 +523,29 @@
 characters \tcode{//} and \tcode{/*} have no special meaning within a
 \tcode{/*} comment.
 \end{note}
+
+\indextext{whitespace}%
+\pnum
+Preprocessing tokens can be separated by whitespace;
+this consists of comments, or whitespace characters
+(\unicode{0020}{space},
+\unicode{0009}{character tabulation},
+new-line,
+\unicode{000b}{line tabulation}, and
+\unicode{000c}{form feed}), or both.
+\begin{note}
+In certain circumstances during translation phase 4, as described in \ref{cpp},
+whitespace (or the absence thereof) serves as more than
+preprocessing token separation.
+Whitespace can appear within a preprocessing token only as part of
+a \grammarterm{header-name} or
+between the quotation characters in a character literal or string literal.
+\end{note}
+
+\pnum
+Whether each nonempty sequence of whitespace characters other than new-line
+is retained or replaced by one \unicode{0020}{space} character is unspecified.
+
 \indextext{comment|)}
 
 \rSec1[lex.pptoken]{Preprocessing tokens}
@@ -562,22 +583,6 @@
 If a \unicode{0027}{apostrophe}, a \unicode{0022}{quotation mark},
 or any character not in the basic character set
 matches the last category, the program is ill-formed.
-Preprocessing tokens can be separated by
-\indextext{whitespace}%
-whitespace;
-\indextext{comment}%
-this consists of comments\iref{lex.comment}, or whitespace characters
-(\unicode{0020}{space},
-\unicode{0009}{character tabulation},
-new-line,
-\unicode{000b}{line tabulation}, and
-\unicode{000c}{form feed}), or both.
-As described in \ref{cpp}, in certain
-circumstances during translation phase 4, whitespace (or the absence
-thereof) serves as more than preprocessing token separation. Whitespace
-can appear within a preprocessing token only as part of a header name or
-between the quotation characters in a character literal or
-string literal.
 
 \pnum
 Each preprocessing token that is converted to a token\iref{lex.token}