Skip to content
This repository has been archived by the owner on Oct 25, 2023. It is now read-only.

feat: precompileModules #47

Closed
wants to merge 15 commits into from
Closed

feat: precompileModules #47

wants to merge 15 commits into from

Conversation

Kha
Copy link
Member

@Kha Kha commented Jan 20, 2022

This is the more immediately important part of leanprover/lean4#949. It was less complicated in the end than I feared, though I went for the most direct implementation for now. It might be a good idea to refactor the three recursive module builders into a single one.

This should probably have a test, but it will have to wait until leanprover/lean4#959 is released.

@Kha Kha force-pushed the precomp-modules branch 2 times, most recently from 05fd2f7 to ce3472f Compare January 21, 2022 13:14
@Kha
Copy link
Member Author

Kha commented Jan 21, 2022

Fun facts: on Linux, linking against a library as in leanc ... ./build/ir/My.Dep.so embeds this exact relative path in the NEEDED field, so the DSO is found automatically iff the cwd isn't changed. On Windows, we do the same thing as on other platforms and link against the dependent DLLs directly, but we are very lucky that this works because lld has only supported this since its very latest version!

@Kha
Copy link
Member Author

Kha commented Jan 21, 2022

iff the cwd isn't changed

Thinking about this again, this could be an issue if the server and Lake are run from different directories because of a non-standard srcDir. So... don't do that for now.

Lake/Config/Package.lean Outdated Show resolved Hide resolved
@tydeu
Copy link
Member

tydeu commented Jan 21, 2022

@Kha On a more general note, it doesn't really make sense allow precompiling modules for olean packages (i.e.,, buildOleanTarget and moduleOleanTarget). The whole point of the olean vs olean & c versions is that that the former doesn't build native code. Thus, I think that precompiling should only occur in the olean & c version.

@Kha
Copy link
Member Author

Kha commented Jan 21, 2022

Would that extend to the respective facets? Meaning, building the oleans facet would not use precompilation even if explicitly opted in by the user? I would find that counterintuitive. Again, precompilation to me is an internal optimization independent of what is requested as the actual build result.

@tydeu
Copy link
Member

tydeu commented Jan 21, 2022

Again, precompilation to me is an internal optimization independent of what is requested as the actual build result.

I don't see it this way. Precompilation it not simply an optimization, it has significant effects on how a Lean module is interpreted -- extern constants from imports can now be executed by the interpreter without crashing. Thus many package may potentially only build if a precompilation is enabled.

Meaning, building the oleans facet would not use precompilation even if explicitly opted in by the user? I would find that counterintuitive.

The package facet is intended to indicate exactly what outputs are to be built for modules. The olean facet was designed to mean that Lake only builds olean files for modules. It was intended to be used by libraries which wish to avoid the overhead of building native code they don't use. Since precompilation requires building native code, it shouldn't be compatible with such a setting (imo).

@Kha
Copy link
Member Author

Kha commented Jan 21, 2022

What's the solution then? Do we need a new facet just for precompilation?

Lake/Build/Actions.lean Outdated Show resolved Hide resolved
@Kha
Copy link
Member Author

Kha commented Jan 21, 2022

What's the solution then? Do we need a new facet just for precompilation?

Will I need a different syntax for compiling a single module from mathlib4 than from other projects? I'm afraid this seems pointlessly user-hostile to me.

@tydeu
Copy link
Member

tydeu commented Jan 21, 2022

@Kha

What's the solution then? Do we need a new facet just for precompilation?

Adding a module facet for a module's precompiled library would be a nice future addition (so that it can be built explicitly through the CLI). However, rolling it into the olean & c facet, as you have already done, is sufficient for now.

Also, with this change #46, the more logical split for package (library) facets would be a pure module build (i.e., just the cross-platform module olean and ilean files) vs native module build (i.e., cross-platform olean/ilean plus native c/o/so) to replace the current olean/olean&c split.

Will I need a different syntax for compiling a single module from mathlib4 than from other projects? I'm afraid this seems pointlessly user-hostile to me.

No, all these changes are primarily internal, they don't significantly effect the API/UI. Even the additional facet would just provide finer control. It wouldn't make standard compilation any different.

let mut env := #[("LEAN_PATH", some oleanPath.toString)]
if !precompiledPath.isEmpty && System.Platform.isWindows then
let path := (← IO.getEnv "PATH").getD ""
-- set up search path for locating dependent DLLs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably this shouldn't be just precompiledPath, but soPath and also include DLL directories from dependencies.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by soPath? I haven't looked into package dependencies yet, true. Frankly I'm fine for now if it works on mathlib4 😄 (esp. if it works well enough so that we can benchmark and detect important bottlenecks).

Copy link
Member

@tydeu tydeu Jan 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kha I'm actually a little confused about how this works, why do you even need to change the PATH? Why is passing the full path to --load-dynlib not sufficient?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full path works, but it's that library's dependencies that are searched by name in PATH. We only pass the direct dependencies' libraries to lean, which ensures the cmdline length is reasonably bounded.

Copy link
Member

@tydeu tydeu Jun 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kha Why would command line length be a concern? If passing the entire dependency closure works, that seems like a better solution to me than altering PATH. Furthermore, from the code in recBuildModulePrecompTargetWithLocalImports, it looks like that such an approach already necessary on Windows, so keeping it consistent across platforms seems like another benefit.

Note: I am not asking you change the PR, I just want to verify that such a solution is feasible for when I implement this feature myself (hopefully) this week,

EDIT: Using the transitive closure would also solve the problem for package dependencies as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Command line lengths are always a concern for practically unbounded input. For the linker we can always use response files, which we have yet to invent for Lean. But it will probably take a while for dependency trees to get large enough for this to blow up, so passing the entire dependency tree sounds good to me.

I don't understand the "consistent across platforms" comment, other platforms should not unnecessarily be burdened with Windows limitations.

Copy link
Member

@tydeu tydeu Jun 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kha Maybe this is due to some experience I lack, but I still don't see why lengths are a problem. Is there some major performance or security issue with lengths that go even into the multiples of MBs range? It seems like that still wouldn't be a major burden to memory or parse time (though I can see why one might not want to log it).

For the cross-platform code, sure, special casing an operating system when there is clear advantage to for doing so makes sense. However, if multiple approaches are reasonable and one is cross-platform, I would go with it. I think simpler code is a virtue and, if it doesn't have a significant cost, should be preferred.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now I understand the confusion! Command line lengths are unfortunately quite limited in practice: https://devblogs.microsoft.com/oldnewthing/20031210-00/?p=41553. But I think we should fall under the 32K case, so we should be good for a good while. Still, people (especially people with unusual build setups like in Nixpkgs and GHC that create an unusual number of dependent libraries) do run into these limits on all platforms from time to time, which is why they invented response file to hide their humongous command-line arguments in.

Copy link
Member

@tydeu tydeu Jun 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kha Thanks! I learn something new everyday. I am somewhat surprised that such a limitation hasn't been lifted yet (either on Windows or Unix), but I imagine there are complexities there I do not realize.

We should fall under 32K for now. However, if we assume an average of 50-80 bytes per argument, that only allows somewhere between 400-640 modules to be loaded by Lean or linked by the linker, so we will likely hit that cap rather soon. I can easily see mathlib surpassing it relatively quickly (and definitely by the time the port is finished). Thus, response files for CLIs (both Lean and the linker) seem like an important feature to consider soon.

@gebner
Copy link
Member

gebner commented Jan 21, 2022

From a mathlib point of view, it's important that precompilation can be selectively enabled for only some of the modules. That is, produce *.so for everything in Mathlib/Tactic, but for nothing else.

(EDIT: would that even work? Should we throw an error if a file has precompilation enabled but it's dependencies don't?)

@tydeu
Copy link
Member

tydeu commented Jan 21, 2022

@gebner Mathlib tactics can depend on other non-tactic parts of the library, though, right? So those dependencies would also need their native components built. As such, it seems like it would just be reasonable to build native components for everything for the time being.

@Kha
Copy link
Member Author

Kha commented Jan 21, 2022

However, rolling it into the olean & c facet, as you have already done, is sufficient for now.

Ok. Just to be clear, does that mean we should keep buildOleanTarget as in this PR for now or not?

@tydeu
Copy link
Member

tydeu commented Jan 21, 2022

@Kha buildOleanTarget and Package.moduleOleanTarget should be reverted to what they where before the PR, leaving buildOleanAndCTarget and Package.moduleOleanAndCTarget as the only ones that do precompilation.

@Kha
Copy link
Member Author

Kha commented Jan 21, 2022

You're not describing a complete solution. Should the if be pushed upwards then, so that e.g. "oleans" uses buildOleanAndCTarget? That looks like a lot of code churn. And what should buildOleanTarget do then if precompileModules is set?

@tydeu
Copy link
Member

tydeu commented Jan 21, 2022

@Kha Sorry for not explaining this well enough. Here is summary of what I was suggesting:

  • buildOleanTarget, moduleOleanTarget, and PackageFacet.oleans/: always just build oleans (regardless of if precompileModules is set). Reason: The whole point of these options is to not build native components.

  • buildOleanAndCTarget, moduleOleanAndCTarget, and other package facets (i.e.,, staticLib, sharedLib, binary): precompile the native components if precompileModules is set. A separate PackageFacet could be added here to avoid building the static / shared library at the end (when running lake build), but that seems excessive at the moment for this PR.

@tydeu
Copy link
Member

tydeu commented Jan 21, 2022

An alternative solution would be to float the if out (as you suggested).

In that case there would be a separate buildPrecompTarget and modulePrecompTarget methods. PackageFacet.oleans would then need to be transformed something like PackageFacet.leanLib that builds precompTarget if precompileModules is set and oleanTarget otherwise. In addition, buildImportsAndDeps (i.e., print-paths) would need to be changed to take into account precompileModules when deciding whether to build oleans or oleans&c, or to do precompilation.

In fact, I think the PR as it stands needs to do some checking in buildImportsAndDeps (as it currently never does recompilation).

@Kha
Copy link
Member Author

Kha commented Jan 22, 2022

Hmm, perhaps we need to take a step back. I wanted to go with a straightforward It Just Works solution (... modulo dependencies) for now, but if that is not acceptable, I'd rather rewrite the code like I believe it will have to be in the near future anyway, as alluded to in the initial message. Specifically, I posit:

  • It is very likely that we will want to precompile some but not all modules in the future, either with a more fine-grained precompileModules option like Gabriel alluded to or, preferably, by a phase-separating module system that tells the build system exactly where the tactics are (...see also my WITS talk later today, perhaps it would have been more efficient for the discussion to wait for that one)
  • There is no reasonable use case for a facet or target that simply fails if precompilation is activated. If the module says it wants to be precompiled, that's exactly what the build system should do whenever necessary. While I can accept that a facet called oleans may not be expected to do that, this implies that it should be renamed to something more generic like Mac proposed.

If we accept both of these, the implication to me is that there should be a single target for recursively building modules that decides whether to build .c and .pre.so files on a per-module basis. Specifically,

  • .c files are built for any module that is in the closure of a shared library or binary target (the current oleanAndC)
  • .c and .pre.so files are built for any module in the closure of the modules to be precompiled

Thoughts? Also happy to try and resolve this in a call instead if it helps.

@tydeu
Copy link
Member

tydeu commented Jan 23, 2022

@Kha

Thoughts? Also happy to try and resolve this in a call instead if it helps.

I agree largely with what you said here, but I feel like we have different mental models of this change (and the related concepts in Lake) and it is leading us to talk past one another. A call seems like a good idea, as it is quicker and easier to have the back-and-forth necessary for a discussion like this in a synchronous call than this asynchronous discussion thread.

@Kha
Copy link
Member Author

Kha commented Jan 23, 2022

Great. Let's arrange a time & date on Zulip? Anyone else want to join?

@gebner
Copy link
Member

gebner commented Jan 23, 2022

I haven't read the whole discussion in depth. Some general notes:

I think it's okay to merge this implementation now, even if it is still "prototype" quality. It would be certainly be useful to try it out on mathlib4. The only hard requirement is that package depending on mathlib4 continue to work (i.e., precompilation needs to work across package dependencies, I didn't try out whether that works already).

More fine-grained control of what modules are compiled is not an immediate requirement now. It will only become important when mathlib gets more math content; say late spring / early summer.

Regarding large refactorings: I still think it would be very useful to base lake on (a Lean version of) shake, because that makes it much easier to add custom build targets and dependencies of lean files. If I understood you correctly on zulip, such a shake port is already on your todo-list. It might make sense to move that up before rewriting things twice.

Great. Let's arrange a time & date on Zulip? Anyone else want to join?

I'll gladly join unless you want to keep the meeting small.

@Kha
Copy link
Member Author

Kha commented Jan 26, 2022

I redid my measurements on mathlib4 after the above commit, which should help with incremental compilation. They are:

full build w/ precompiledModules: 19.09s
without rebuilding .os (rm -rf build/lib): 11.08s
without relinking (rm -rf build/lib/Mathlib): 10.45s
without loading dynlibs (precompileModules := false): 10.03s

So the C compilation overhead really is the largest by far, but we should assume it is mostly avoided when recompiling. Linking & loading are each around 5% overhead.

@Kha
Copy link
Member Author

Kha commented Jan 26, 2022

Note that we could slash linking overhead by 75% simply by invoking lld directly

image

(though then we would be back to the "how to find standard library search paths on Linux" issue)

@Kha
Copy link
Member Author

Kha commented Jan 26, 2022

One final performance observation for now: according to nixprof, the "overhead" of load-dynlibs is in fact -3.2% on average. I'm not quite sure why this speedup isn't manifested in the Lake build; I randomly picked a few Lake cmdlines with --load-dynlibs, and removing it did slow lean down. In both build configurations, perf reports a <1% overhead spent in lake itself, so that's not it either.

@Kha
Copy link
Member Author

Kha commented Apr 29, 2022

Ok, understandable. I probably would have merged it by myself by now if the need for the feature would have manifested, but it seems we're still not quite there yet :) .

@tydeu
Copy link
Member

tydeu commented Jun 13, 2022

@Kha I was looking through the PR as I plan on implementing a similar approach this week and I had a question. Have the Std.RBMap utilities you included the PR been added to the Lean core since the last update of this PR? If not, it seems like it would make more sense to add them there rather than here (especially for you, since you have write access to to Lean repo).

@Kha
Copy link
Member Author

Kha commented Jun 13, 2022

done leanprover/lean4@bda871d

@tydeu tydeu added this to the v3.2.0 milestone Jun 14, 2022
@tydeu
Copy link
Member

tydeu commented Jun 22, 2022

@Kha I have finally finished my version of precompiled modules and was thinking of using the test you had written in this PR. However, I am not exactly sure what it tests. What difference in behavior should one be looking for from a working version of the example test vs a non-working version?

@Kha
Copy link
Member Author

Kha commented Jun 22, 2022

Well if the test fails, we know something went wrong :) . But we could do something simple with symbols like

@[export foo]
def fooImpl := 42

@[extern "foo"]
opaque foo : Nat
import Foo

#eval foo

@tydeu tydeu closed this in 23938ed Jun 22, 2022
@Kha
Copy link
Member Author

Kha commented Jun 23, 2022

Whether to compile each module into a native shared library that is loaded whenever the module is imported

The last part doesn't seem to be the case currently. If I enable precompilation for Aesop (a prime example for a library-level precompilation use case btw) and then import it into my project which does not have precompilation enabled, no Aesop dynlibs are loaded.

tydeu added a commit that referenced this pull request Jun 24, 2022
a feature of #47 I had hetherto missed
Kha pushed a commit to Kha/lean4 that referenced this pull request Jul 17, 2023
Kha pushed a commit to Kha/lean4 that referenced this pull request Jul 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants