Principle: information accumulation #875

zygoloid · 2021-10-08T21:35:24Z

This proposal addresses the question of whether we should perform a fully top-down compilation (like in C), a mostly top-down compilation (like in C++), or whether we should allow information from later in the same source file to be used in earlier program constructs (like in Rust, Swift, Java, C#, Haskell, and so on).

The proposed direction is:

- Entities declared later in the same source file cannot be used earlier; top-down semantics apply everywhere.
    - As an exception, class member function bodies are parsed as if they appeared after the class.
- Forward declarations can be used to separate interface from implementation and to allow entities to be used before they are defined.
- The behavior of the program is nonetheless required to be the same as if we had a globally-consistent rule: it's always a hard error to depend on any information that is not known or that is provided later.
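
To make the rules above concrete, here is a minimal sketch in C++ rather than Carbon, since Carbon's syntax for forward declarations is not shown in this description. C++ behaves similarly on these three points; the names `IsEven`, `IsOdd`, and `Counter` are invented for illustration.

```cpp
#include <cassert>

// Forward declaration: lets IsEven call IsOdd before IsOdd is defined.
bool IsOdd(unsigned n);

bool IsEven(unsigned n) { return n == 0 ? true : IsOdd(n - 1); }

// The definition is provided later in the same file.
bool IsOdd(unsigned n) { return n == 0 ? false : IsEven(n - 1); }

class Counter {
 public:
  // The body uses `count_`, which is declared further down. This works
  // because member function bodies are processed as if they appeared
  // after the class.
  int Next() { return ++count_; }

 private:
  int count_ = 0;
};

// Without a prior declaration, a use before the declaration is an error:
//   int Early() { return Late(); }  // error: `Late` is not declared yet
//   int Late() { return 1; }
```

The forward declaration is what permits the cyclic reference between `IsEven` and `IsOdd`; everything else is read strictly top-down.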

Fixes #472.

chandlerc

Some early thoughts on the draft here...

proposals/p0875.md

chandlerc · 2021-10-09T01:17:19Z

Thinking at a high level about whether there are tradeoffs not mentioned here...

I think there are more disadvantages than you cover for disallowing separate declarations and definitions:

- Familiarity with C++ would be hit significantly, as this is heavily used to separate implementation details today.
- Readability is, I think, more impacted than you are really covering. For example, forced nesting of definitions (or deep indentation more generally) is often cited as reducing readability. Losing the ability to define members out-of-line seems likely to have sharp consequences here, especially for nested class members, etc.

It may be worth noting that these don't seem to have much to do with top-down processing...

proposals/p0875.md

fowles · 2021-10-13T00:25:36Z

proposals/p0875.md

+In order to support this and still permit cyclic references between entities, we
+would need to permit separate declaration and definition.
+
+_Comprehensibility:_ This rule is simple to explain, and has no special cases.


This rule is only simple to explain if declaration vs. definition is simple to explain, which, as noted above for C++, is not true.

I think this is really a problem with separate declaration and definition rather than with the top-down rule. I've tried to separate the two out and added a discussion of this there.

zygoloid · 2021-10-16T00:58:00Z

@chandlerc I know this isn't your preferred direction; how opposed are you to this approach? In our most recent weekly meeting, we had a 5:2 preference for this direction over a more top-down model.

josh11b · 2021-10-20T16:58:35Z

proposals/p0875.md

+impl fn F() {}
+```
+
+_Comprehensibility:_ In general, determining whether two declarations declare the


I think this would be greatly simplified if we require overloads to be defined together with a dedicated syntax.

Yep. And this is mentioned now below in the simple compilation area.

Mentioning that option below makes it all the more confusing that it's not mentioned here.

Also, given that these sub-alternatives are relevant to multiple goals (at least "comprehensibility" and "compilation", but arguably some of the others as well), they should probably be introduced prior to the goal subsections.

I think the text below and this suggestion, while superficially similar, are actually not the same thing. The text below is talking about a possible implementation strategy, particularly for name mangling, where functions are identified by ordinal position / declaration order in the API file, without restricting where in that API file they might be declared. This suggestion, as I understand it, is about matching declaration to definition by requiring that all declarations be provided together in a contiguous source utterance, and using a similar contiguous utterance to define the overload set in the same order. These options are compatible with each other, but are independent choices.

I've added a writeup of the dedicated syntax option here.

chandlerc

Just an initial quick pass at the text here with a few comments inline....

proposals/p0875.md

chandlerc · 2022-01-08T03:07:08Z

Yep. And this is mentioned now below in the simple compilation area.

docs/project/principles/information_accumulation.md

proposals/p0875.md

jonmeow

Sorry, I know my comments are late, but I was looking again because of fowles' comments and thought I'd make a few...

proposals/p0875.md

docs/project/principles/information_accumulation.md

proposals/p0875.md

geoffromer · 2022-01-10T19:36:37Z

Mentioning that option below makes it all the more confusing that it's not mentioned here.

Also, given that these sub-alternatives are relevant to multiple goals (at least "comprehensibility" and "compilation", but arguably some of the others as well), they should probably be introduced prior to the goal subsections.

proposals/p0875.md

jonmeow

I added a thumbs up on other responses, not resolving where I didn't start the thread.

proposals/p0875.md

jonmeow · 2022-01-12T17:22:30Z

@chandlerc In discussion about this, you brought up an example of Swift code where a lot of functions referenced a class which was only defined at the bottom of the file. Trying to understand those functions required reading the class definition, which was out of a logical reading order. With the accumulation rule as given, the class would more likely be defined at the top of the file, and read first.

I would note though, I think the accumulation rule can also get in the way of BLUF-style reading, forcing implementation details to come first and the crux of the implementation to come last. For example, consider how we've implemented name lookup in Carbon:
https://github.com/carbon-language/carbon-lang/blob/trunk/executable_semantics/interpreter/resolve_names.cpp

To summarize layout here:

1. Forward declarations of AddExposedNames to address mutual recursion.
2. AddExposedNames implementations.
3. Forward declarations of ResolveNames to address mutual recursion.
4. ResolveNames implementations, calling AddExposedNames.
5. The primary ResolveNames(AST) function, which is the public API, and calls the other ResolveNames APIs.
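
The shape of that layout can be sketched as follows. This is not the actual contents of resolve_names.cpp: `Node`, `CountNames`, `CountNamesInChildren`, and `ResolveNamesSketch` are hypothetical stand-ins, chosen only to show how declare-before-use pushes helpers to the top and the public entry point to the bottom.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an AST node.
struct Node {
  std::string name;
  std::vector<Node> children;
};

// 1. Forward declaration, needed for mutual recursion between the helpers.
static int CountNamesInChildren(const Node& node);

// 2. Helper implementations come first, because they must be declared
//    before the code that calls them.
static int CountNames(const Node& node) {
  return (node.name.empty() ? 0 : 1) + CountNamesInChildren(node);
}

static int CountNamesInChildren(const Node& node) {
  int total = 0;
  for (const Node& child : node.children) total += CountNames(child);
  return total;
}

// 3. The public entry point, which is the high-level summary of the file,
//    ends up last.
int ResolveNamesSketch(const Node& root) { return CountNames(root); }
```

Putting `ResolveNamesSketch` first would require forward-declaring every helper it calls, which is the trade-off discussed in the rest of this comment.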

From a BLUF perspective, I'd like to put the public function first: it's the high-level summary of what's going on. Then, dig into the function it calls, with the small helpers at the end.

From my perspective, C++'s name lookup rule exerts a strong pressure to put small helper functions at the top of the file, and the key implementation details at the bottom -- I view this as undesirable because I think people would prefer to see the main implementation, then walk through its calls.

If the leaning is that classes should have public APIs first and private APIs last, I think that reflects similar BLUF preferences: the public APIs are the main implementation, the private APIs are the helpers. I think the difference is that C++ has trained us to accept a different layout between class and free functions.

This pattern continues into header files, for example when it's necessary to provide implementation-detail classes above public classes just to ensure name lookup is satisfied. Sometimes forward declarations can be used to partially address this, but the forward declarations would still be first and would still be less important to public API users; also, either language limitations or a developer's preference for minimizing forward declarations can prevent their use.

Allowing more arbitrary layouts will certainly allow style issues as in the Swift example, where developers may provide wholly illogical layouts. I think flexibility in name lookup can also allow better API layouts, where developers aren't forced to put private implementation details above public APIs just because name lookup requires it.

proposals/p0875.md

josh11b · 2022-01-14T19:54:23Z

proposals/p0875.md

+
+As in C++, member functions of nested classes would be deferred until the
+outermost class is complete. Unlike in C++, this deferral would apply only to
+the bodies of member functions, and not to default arguments or other contexts


I don't know what the word "contexts" here means.

Expanded to a complete description of the difference from C++.
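
For readers less familiar with the C++ rule being compared against, here is a small C++ sketch; the names `Outer`, `Inner`, `Advance`, `StepTwice`, and `kStep` are invented for illustration. In C++, both member function bodies and default arguments are looked up as if the class were complete. According to the quoted proposal text, the Carbon rule would defer only the function bodies, so the default-argument use shown here would not carry over.

```cpp
#include <cassert>

struct Outer {
  // C++: default arguments are processed as if the class were complete, so
  // `kStep`, declared below, is visible here.
  int Advance(int by = kStep) { return value + by; }

  struct Inner {
    // C++: the body of a nested class's member function is deferred until
    // the outermost class (`Outer`) is complete.
    static int StepTwice() { return 2 * kStep; }
  };

  static constexpr int kStep = 3;
  int value = 10;
};
```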

chandlerc

Haven't finished going through all the alternatives text yet, but at least covered the main principle and some of the proposal. I'd like to see if there is any way to compress or be more brief on the alternatives without losing information. But if not, it does seem better to capture the information; it may just be worth making sure it summarizes reasonably well.

docs/project/principles/information_accumulation.md

proposals/p0875.md

chandlerc

(Just wanted to note that other than a minor wording tweak and the re-ordering of some of the alternatives, I think I'm basically happy here. But I'd like to give folks with comment threads a chance to look at the updated version and see if at least their comments are addressed -- I understand that the direction isn't going to be universally liked, but I want to make sure the write-up doesn't have actively confusing points.)

geoffromer

I'm broadly happy with this; just a few minor suggestions for clarification.

proposals/p0875.md

chandlerc

I think this looks good. I think it aligns well with the leads' decision. I know we're not going to hit full consensus here, but I'm not seeing a lot of critical comment threads left. I think this LG to land as-is. The proposal itself I think tries to capture the borderline nature of a bunch of these tradeoffs, and the principle is I think simple and to the point and well aligned with the leads' decision.

Ship it!

…type is known. (#1352) Following #875, diagnose any use of a name prior to the point where it is introduced and its type is known. This works by performing name resolution on a top-level class, interface, or impl twice: the first pass performs name-resolution for everything other than nested function bodies, and the second pass performs name resolution on function bodies. At the moment, the second pass does a superset of the work done by the first pass, and as a consequence, some identifier expressions now have their target set twice to the same thing.
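
The two-pass approach described in that commit message can be illustrated with a small model. This is a hedged sketch, not the explorer's actual implementation: `Member`, `signature_uses`, `body_uses`, and `ResolveClassSketch` are hypothetical names, and names are modeled as plain strings.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one class member: its name, the names used in its
// declaration (outside any function body), and the names used in its body.
struct Member {
  std::string name;
  std::vector<std::string> signature_uses;
  std::vector<std::string> body_uses;
};

// Returns true if every use refers to a name that is visible at that point.
bool ResolveClassSketch(const std::vector<Member>& members) {
  std::set<std::string> declared;

  // Pass 1: walk members in source order, skipping function bodies. A name
  // becomes usable only after the declaration that introduces it.
  for (const Member& m : members) {
    for (const std::string& use : m.signature_uses) {
      if (declared.count(use) == 0) return false;  // used before introduced
    }
    declared.insert(m.name);
  }

  // Pass 2: resolve names inside function bodies, which are treated as if
  // they appeared after the class, so every member name is visible.
  for (const Member& m : members) {
    for (const std::string& use : m.body_uses) {
      if (declared.count(use) == 0) return false;  // unknown name
    }
  }
  return true;
}
```

In this model a body may refer to a member declared later in the class, while a declaration may refer only to names introduced above it.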

zygoloid added the proposal label Oct 8, 2021

zygoloid requested a review from a team October 8, 2021 21:35

zygoloid added this to Draft in Proposals via automation Oct 8, 2021

google-cla bot added the cla: yes label Oct 8, 2021

zygoloid added 2 commits October 8, 2021 14:35

Filling out template with PR 875

38ec620

Initial incomplete sketch

235cb64

chandlerc reviewed Oct 9, 2021

fowles reviewed Oct 13, 2021

Respond to review comments and add a proposed direction based on a poll at our weekly meeting.

a31257a

zygoloid moved this from Draft to RFC in Proposals Oct 16, 2021

zygoloid marked this pull request as ready for review October 16, 2021 00:53

github-actions bot added the proposal rfc label Oct 16, 2021

josh11b reviewed Oct 20, 2021

josh11b mentioned this pull request Oct 29, 2021

Constraints for generics (generics details 3) #818

Merged

chandlerc mentioned this pull request Jan 5, 2022

How much complexity should we invest in deduced/inferred return types? #1008

Closed

Update based on recent discussions.

2d8066c

zygoloid requested a review from a team as a code owner January 8, 2022 02:44

chandlerc reviewed Jan 8, 2022

chandlerc self-requested a review January 8, 2022 03:11

fowles reviewed Jan 9, 2022

jonmeow reviewed Jan 10, 2022

Address review comments and undo mess that prettier made.

7eac76e

geoffromer reviewed Jan 10, 2022

More responses to review comments.

9e81697

jonmeow reviewed Jan 10, 2022

zygoloid added 3 commits January 10, 2022 15:09

Add section describing how the selected alternative was chosen.

85682d2

Simplify example to assume less about undecided language questions.

3fa2ae4

More responses to review comments.

e98f783

geoffromer reviewed Jan 12, 2022

Update based on review comments.

7642dcf

josh11b reviewed Jan 14, 2022

Be more explicit about how the Carbon rule differs from the C++ rule.

570dfc2

chandlerc mentioned this pull request Feb 19, 2022

Open question: Calling functions defined later in the same file #472

Closed

chandlerc reviewed Feb 25, 2022

Respond to review comments.

7aff473

chandlerc reviewed Mar 2, 2022

geoffromer reviewed Mar 3, 2022

Apply suggestions from code review.

89d9113

Co-authored-by: Geoff Romer <gromer@google.com>

geoffromer approved these changes Mar 9, 2022

chandlerc approved these changes Mar 16, 2022

zygoloid merged commit 8259f76 into carbon-language:trunk Mar 16, 2022

Proposals automation moved this from RFC to Accepted Mar 16, 2022

zygoloid deleted the principle-information-accumulation branch March 16, 2022 21:42

github-actions bot added the proposal accepted label and removed the proposal rfc label Mar 16, 2022

This was referenced Jun 29, 2022

Make the names of declarations unusable before the point where their type is known. #1352

Merged

Design overview update part 5: Names #1347

Merged

zygoloid mentioned this pull request Jul 19, 2022

Forward declaration #1416

Closed

This was referenced Oct 13, 2022

explorer crashes attempting Value -> TupleValue cast #1394

Closed

Clarifications for name lookup in classes #2286

Closed

jonmeow added a commit to jonmeow/carbon-lang that referenced this pull request Oct 13, 2022

Update name lookup with regards to carbon-language#875.

6af5930

jonmeow mentioned this pull request Oct 13, 2022

Allow unqualified name lookup for class members #2287

Merged

josh11b mentioned this pull request Dec 17, 2022

Matching forward declaration of functions and classes #2477

Open
