[DO NOT MERGE] Reorder core basics #7153

AlisdairM · 2024-07-23T11:09:54Z

This PR shows a vision for re-ordering the Core clauses around program construction for C++26. This is based on the pre-St Louis draft, NOT the current working draft. The intent is to show direction, and re-apply useful edits if this direction gains approval.

There are a variety of small edits, but the significant ones are:

give each phase of translation a stable label so they can be individually referenced
arrange the lex subclauses to better follow the phases of translation
merge the specification of comments directly into phase 3, as that is the only place they are used
split the literals subclause in two, to reflect that string literals are handled in the preprocessor, while arithmetic literals belong to phase 7
move all the preprocessor tokens (phase 3) below a new subtitle to better group them
merge [cpp] into place as phase 4
move modules adjacent to [lex], preceding [basic]
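
To illustrate the split between preprocessor-level and phase 7 literals, here is a minimal sketch. The macro name `STRINGIZE` and the variable names are made up for this example and are not part of the PR. It relies on `1.2.3abc` matching the pp-number grammar even though it could never become a valid arithmetic literal in phase 7, and on adjacent string literals being concatenated before phase 7.

```cpp
#include <string_view>

// Hypothetical helper: turns its argument's preprocessing tokens into a string literal.
#define STRINGIZE(x) #x

// `1.2.3abc` is a single pp-number in phase 3. It is never converted to a
// token, because the # operator consumes it in phase 4, so no error arises.
constexpr std::string_view odd_number = STRINGIZE(1.2.3abc);
static_assert(odd_number == "1.2.3abc");

// Adjacent string literals are concatenated in phase 6, before phase 7
// analyzes the translation unit as tokens.
constexpr std::string_view joined = "phase " "six";
static_assert(joined == "phase six");
```

Writing `double d = 1.2.3abc;` directly would be ill-formed, since the pp-number would then have to convert to a literal token in phase 7.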

This is a response to #2252 where in practice I could not find a satisfactory way to integrate the primitive parts of [lex] with [basic], so instead made the best attempt to clarify the source aspects of program creation that I could.

Overall, I find this proposed structure very helpful: I now better understand a number of parsing issues, mostly preprocessing issues that I had not always realised were preprocessing. However, I cannot be sure how much of that understanding comes from having made the transformation myself.

The grammar for universal-character-name is oddly sandwiched into the middle of the subclause talking about the different character sets used by the standard. To improve the flow, extract that grammar into its own subclause. In the extraction, I make two other clarifying changes. First, describe this new subclause as 'a way to name any element of the translation character set using just the basic character set' rather than simply 'a way to name other characters'. Second, remove the 'one of' in the grammar where there is only one option to choose.
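
As a small illustration of that description, the sketch below names U+00E9 using only basic character set elements. The variable name `via_ucn` is hypothetical and chosen just for this example.

```cpp
#include <string_view>

// U+00E9 (LATIN SMALL LETTER E WITH ACUTE), spelled with a
// universal-character-name so the source needs only the basic character set.
constexpr std::u32string_view via_ucn = U"\u00E9";

// The UCN designates exactly one element of the translation character set.
static_assert(via_ucn.size() == 1);
static_assert(via_ucn[0] == char32_t{0xE9});

// In UTF-8 the same character takes two code units, plus the terminating null.
static_assert(sizeof(u8"\u00E9") == 3);
```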

…and UCNs

The current contents of [basic.pre] jump between specifying different things. This PR moves all the specification of names to the front, followed by the specification of entities. There are two main benefits: (1) the specification for when two names are the same is a list of 4 rules that correspond to the 4 things that can form a name, and the connection is much clearer when the paragraphs are adjacent and the list is sorted into the same order; (2) in this form, even though all the words are the same, the reordered and merged paragraphs fit on a single page. The very last paragraph was forced over a page break in the original layout.

This change puts all the specification for assembling and transforming the source of a program ([lex], [cpp], and [modules]) ahead of the basic core specification of how to interpret that source.

This PR colocates [lex], [cpp], and modules to put all the parts that talk about assembling and translating a program together. In doing so, it rearranges the subclauses in [lex] and introduces subclauses that can be cross-references for each phase of translation. Metadata is introduced to identify the first and last core clause when cross-references want the first/last property rather than the specific clause itself. The subclause on comments, [lex.comment], is merged into the new [lex.phase.3] as comments feature only during translation phase three.

…on unit The definition of program at the top of [basic.link] should move to the front of [lex.separate] so that it is defined before its first use, and also clarifies that the phases of translation produce. Similarly, move the definition of the grammar production translation-unit to the top of the first clause to actually use it, [module.unit]. Finally, retitle [basic.link] as just Linkage, rather than programs and linkage.

AlisdairM · 2024-07-23T11:20:11Z

While I like this rearrangement, the placing of the predefined macro names feels odd, as I got too used to it being the very last part of the Core specification, before the Library intro, and it feels lost buried in the middle of this combined clause.

jensmaurer · 2024-07-23T16:45:42Z

Not having looked at the details, do we finally differentiate the grammar for preprocessor tokens from those for phase 7 tokens? "identifier" does double duty here.

AlisdairM · 2024-07-23T16:55:45Z

Structurally, pp-tokens in phases 3 and 4 are more clearly distinct from tokens in phase 7, but identifier still does double duty, as all I was comfortable doing was moving words around, not changing them.

It would be much easier to write a tiny paper or CWG issue to address concerns with identifier after this change though. I certainly run into identifier issues on proposals I am writing that touch on this space.
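
A minimal sketch of the double duty discussed above, with hypothetical macro names `STRINGIZE` and `PASTE` that do not come from the PR: during phases 3 and 4 a keyword spelling is just an identifier preprocessing token, and it is only treated as a keyword once it is converted to a token in phase 7.

```cpp
#include <string_view>

// Hypothetical helpers for this illustration.
#define STRINGIZE(x) #x
#define PASTE(a, b) a##b

// In phase 4, `while` is an ordinary identifier pp-token, so it can be
// stringized like any other name.
constexpr std::string_view spelled = STRINGIZE(while);
static_assert(spelled == "while");

// Pasting the identifiers `in` and `t` yields the pp-token `int`, which is
// recognized as a keyword only when converted to a token in phase 7.
constexpr PASTE(in, t) answer = 42;
static_assert(answer == 42);
```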

AlisdairM · 2024-09-30T00:00:02Z

Closing this PR in favor of #7272 that does not try to mess with the text for phases of translation, but more freely reorganizes the core clauses.

AlisdairM added 10 commits July 10, 2024 10:23

[lex] Reorganize contents to follow grammar and phases of translation

6092385

[lex.charset] Introduced parent [lex.char] clause for character sets …

3d060e5

…and UCNs

[std.tex] Reorder preprocessor and modules before basic

52aedd5

This change puts all the specification for assembling and transforming the source of a program ([lex], [cpp], and [modules]) ahead of the basic core specification of how to interpret that source.

[cpp] Merge preprocessor directives into lex

5f88eff

Further work cleaning up preprocessor tokens

a149f60

Links to appropriate phase of translation

a467f30

AlisdairM mentioned this pull request Jul 23, 2024

Reorder [basic] before [lex] #2252

Open

AlisdairM closed this Sep 30, 2024

AlisdairM deleted the reorder_core_basics branch October 23, 2024 23:27
