[DO NOT MERGE] Reorder core basics #7153
The grammar for universal-character-name is oddly sandwiched into the middle of the subclause that describes the different character sets used by the standard. To improve the flow, extract that grammar into its own subclause. The extraction makes two other clarifying changes. First, describe this new subclause as 'a way to name any element of the translation character set using just the basic character set' rather than simply 'a way to name other characters'. Second, remove the 'one of' in the grammar where there is only one option to choose.
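As a small illustration of the 'name any element of the translation character set using just the basic character set' wording, the sketch below spells two non-basic characters with universal-character-names. The variable names are hypothetical and chosen only for this example; the checks run at compile time.

```cpp
// Hypothetical names, for illustration only.
// Each literal is written with basic-character-set characters alone,
// yet names an arbitrary element of the translation character set.
constexpr char32_t e_acute  = U'\u00E9';      // short form: \u plus 4 hex digits
constexpr char32_t grinning = U'\U0001F600';  // long form: \U plus 8 hex digits

// The value of each literal is the code point that was named.
static_assert(e_acute == 0x00E9, "\\u00E9 names U+00E9");
static_assert(grinning == 0x1F600, "\\U0001F600 names U+1F600");
```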
The current contents of [basic.pre] jump between specifying different things. This PR moves all the specification of names to the front, followed by the specification of entities. There are two main benefits: (1) the specification for when two names are the same is a list of 4 rules that correspond to the 4 things that can form a name; the connection is much clearer when the paragraphs are adjacent and the list is sorted into the same order; (2) in this form, even though all the words are the same, the reordered and merged paragraphs fit on a single page. The very last paragraph was forced over a page break in the original layout.
This change puts all the specification for assembling and transforming the source of a program ([lex], [cpp], and [modules]) ahead of the basic core specification of how to interpret that source.
This PR colocates [lex], [cpp], and modules to put all the parts that talk about assembling and translating a program together. In doing so, it rearranges the subclauses in [lex] and introduces subclauses that can be cross-referenced for each phase of translation. Metadata is introduced to identify the first and last core clause when cross-references want the first/last property rather than the specific clause itself. The subclause on comments, [lex.comment], is merged into the new [lex.phase.3] as comments feature only during translation phase three.
…on unit The definition of program at the top of [basic.link] should move to the front of [lex.separate] so that it is defined before its first usage, and also clarifies that the phases of translation produce. Similarly, move the definition of the grammar production translation-unit to the top of the first clause to actually use it, [module.unit]. Finally, retitle [basic.link] as just Linkage, rather than programs and linkage.
While I like this rearrangement, the placing of the predefined macro names feels odd, as I got too used to it being the very last part of the Core specification, before the Library intro, and it feels lost buried in the middle of this combined clause.
Not having looked at the details, do we finally differentiate the grammar for preprocessor tokens from those for phase 7 tokens? "identifier" does double duty here.
Structurally, pp-tokens in phases 3 and 4 are more clearly distinct from tokens in phase 7, but identifier still does double duty, as all I was comfortable doing was moving words around, not changing them. It would be much easier to write a tiny paper or CWG issue to address concerns with identifier after this change though. I certainly run into identifier issues on proposals I am writing that touch on this space.
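To make the "double duty" concrete, here is a minimal sketch (the macro and variable names are hypothetical, not taken from the PR). During preprocessing, `int` is only an identifier pp-token, so it can be assembled by token pasting or stringized like any other identifier; it is treated as a keyword only once pp-tokens are converted to tokens in phase 7.

```cpp
// Hypothetical helper macros, for illustration only.
#define CAT(a, b) a##b       // paste two pp-tokens into one
#define STR_IMPL(x) #x       // stringize a macro argument
#define STR(x) STR_IMPL(x)

// Phase 4: `in` and `t` are identifier pp-tokens; pasting yields the
// identifier pp-token `int`. Phase 7 then treats it as the keyword int.
constexpr CAT(in, t) answer = 42;

// Phase 4 also lets us stringize `int`, because at that point it is just
// an identifier pp-token. "int" is three characters plus the terminator.
constexpr unsigned long spelled_size = sizeof(STR(int));

static_assert(answer == 42, "pasted pp-tokens became the keyword int");
static_assert(spelled_size == 4, "stringized spelling is \"int\"");
```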
Closing this PR in favor of #7272 that does not try to mess with the text for phases of translation, but more freely reorganizes the core clauses.
This PR shows a vision for re-ordering the Core clauses around program construction for C++26. This is based on the pre-St Louis draft, NOT the current working draft. The intent is to show direction, and re-apply useful edits if this direction gains approval.
There are a variety of small edits, but the significant ones are:
This is a response to #2252 where in practice I could not find a satisfactory way to integrate the primitive parts of [lex] with [basic], so instead made the best attempt to clarify the source aspects of program creation that I could.
Overall, I find this proposed structure very helpful, as I better understood a number of parsing issues, mostly preprocessing issues that I did not always realise were preprocessing, but cannot be sure how much of that understanding comes from being involved in making the transformation itself.