Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Module system enabling separate compilation #416

Open
Kha opened this issue Apr 22, 2021 · 5 comments
Open

Module system enabling separate compilation #416

Kha opened this issue Apr 22, 2021 · 5 comments
Labels
enhancement New feature or request

Comments

@Kha
Copy link
Member

Kha commented Apr 22, 2021

Currently import truly imports everything from a module, including private declarations and transitively imported declarations. Thus if we make any change at all that is reflected in the .olean output file, we have no choice but to recompile every downstream module.

This issue usually is solved by use of a "proper" module system that allows sufficient information hiding, e.g. by exporting only the signatures of declarations (and not even that for private declarations). In the simplest case, this can be done literally in a "signature" .olean file, in which case the build system can forego rebuilding downstream modules (that in fact import only the signature file) if the file is unchanged. However, note that this does not a priori prevent rebuilds on other kinds of changes such as adding a new (public) declaration, which would be much harder to prevent.

A module system with proper information hiding would not only avoid rebuilds, but also follow general good software practices and enable further ones such as semantic versioning. However, some aspects of Lean (and similar ITPs) complicate the design of such a system:

  • Definitional unfolding is inherently anti-modular. If we want to prove something about a definition from another module, that definition must either be marked to export its body (which abbrev, [reducible], and similar should automatically do), or we have to import the full .olean file after all. Any library aiming to make the most out of the module system should strive to prove all relevant properties about a definition inside the definition's module so that downstream modules can rely on these theorems instead of having to unfold the definition. However, if a module is split into multiple parts merely for code organization, then importing the full .olean files within that group should be fine, retaining access to the definition bodies within the group and separate compilation outside of it. If a definition is sufficiently simple/stable that changes to it are not expected, it can still be fully exported using some marker as mentioned above, of course.
  • Meta-programs are even worse: if we want to run an imported function at compile time, we necessarily depend on its body and transitively on the bodies of all referenced functions. If we want separate compilation in the presence of metaprograms, I don't see any way but to put them into separate (output) files. Manually doing so is not realistic since every single notation, which we definitely want to use interspersed with non-meta code, is a metaprogram. Thus it seems we need a staging approach where metaprograms are marked as such (Racket would call this the "phase" of the declaration) and exported into different output files from regular declarations. Declarations can then be imported from specific phases into specific phases (so that metaprograms can make use of regular functions) and the build system creates file dependencies accordingly, though I'm not sure we'd need Racket's full, very general Int-indexed phase system for our use cases.
  • Further (meta)data has to be exported for compilation. In the case of [inline] and [specialize], there is no way around exporting the definition bodies (or rather their IR) if we want to these attributes to work cross module. Then there are some cases where the body affects the ABI, e.g. in type synonyms such as def MyUnboxedUInt := UInt16. Finally, some cross-module information we are currently using such as lifted closed terms should probably be limited/disabled, at least by default, so it does not defeat separate compilation.
  • Finally, some parts of the existing implementation would have to be refined to make the most out of separate compilation. For example, adding a new private declaration could still trigger downstream rebuilds if any (exported) autogenerated names (such as from macros) change. @leodemoura and I have already discussed basing macro scopes not on a module-local counter but a declaration-local one based on a (very short) hash of the declaration name (note that they are already based on the module name to make them globally unique). In the unlikely case of a (module-local) conflict between these hashes, we would increment the second one, which could in theory again lead to more rebuilds than absolutely necessary, but should hardly matter in practice.
@Kha Kha added the enhancement New feature or request label Apr 22, 2021
@eternaleye
Copy link

It may be worth looking at Musa Al-hassy's "Do-it-Yourself Module Systems" (PhD thesis and defense slides). It targets Agda, and so much of it is likely applicable to Lean as well, while providing a powerful model of how the functionality of module systems can be achieved without introducing a second "module language" with all of its downsides - while also providing powerful tools for composition and reinterpretation of interfaces.

@Kha
Copy link
Member Author

Kha commented Aug 30, 2021

@eternaleye This issue is about second-class, unparameterized modules in the sense of compilation units, so I'm not sure that work is applicable here. We certainly don't want to represent module signatures (shallowly) as Lean types.

@ydewit
Copy link
Contributor

ydewit commented Feb 11, 2022

Would this be a relevant pointer to Racket's phase separation?

@Kha
Copy link
Member Author

Kha commented Feb 11, 2022

@ydewit That, and https://dl.acm.org/doi/abs/10.1145/583852.581486

@ydewit
Copy link
Contributor

ydewit commented Jun 28, 2022

What is the meaning of a "signature" (.olean) described above?

I found the answer to my own question so removing to reduce the noise on the issue.

ChrisHughes24 pushed a commit to ChrisHughes24/lean4 that referenced this issue Dec 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Module system
Awaiting triage
Development

No branches or pull requests

3 participants