Effects on modular compilation of compile time user code #1483

Closed
jakemac53 opened this issue Mar 1, 2021 · 25 comments
Labels
static-metaprogramming Issues related to static metaprogramming

Comments

@jakemac53
Contributor

jakemac53 commented Mar 1, 2021

A general concern for using Dart as a static metaprogramming language is that we will need the full code available for any dependencies that are executed at compile time.

Today, both of our modular build systems (bazel and build_runner) declare all their dependencies up front, and when compiling for the web, as well as when running analysis with the analyzer (bazel only), we supply only what we refer to as the outline of the dependencies. These outlines define the API surface area and enough information to evaluate constant expressions (or read already-evaluated constants), but not method bodies, comments, or other unnecessary information.
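
As a rough illustration of what outlining keeps and drops (the real outline formats are binary and tool-specific; the class below is invented, and the comments just mark which parts survive):

  // A library as the compiler sees it with full sources available.
  class Greeter {
    // Kept in the outline: const values are needed so downstream
    // libraries can evaluate constant expressions.
    static const String defaultName = 'world';

    // Kept in the outline: the signature is part of the API shape.
    String greet(String name) {
      // Dropped from the outline: method bodies (and comments) are not
      // needed to compile libraries that merely import this one.
      return 'Hello, $name!';
    }
  }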

In order to support running Dart code at compile time (either as a part of static metaprogramming or enhanced constant evaluation), we would have to provide the full kernel or Dart source code for any deps that are a part of compile time execution.

Downsides of providing full Dart inputs

There are some downsides to simply providing the full Dart files for all transitive dependencies:

Invalidation

The largest benefit we get from using outlines is actually that they significantly improve our invalidation story. Because they contain less information, build systems which use file contents for invalidation (build_runner, bazel) get major benefits. It means that libraries which depend on the library you are editing only have to be recompiled if you change the API shape of your library (basically, if you edit something public, although it's a bit more complicated than that). Editing method bodies or comments is very efficient. We would lose these properties if we just switched to using full Dart inputs without mitigating this issue somehow.
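
For example (a hypothetical library, just to illustrate which kinds of edits invalidate dependents under outline-based invalidation):

  // Imported by many downstream libraries.
  int discountedPrice(int price) {
    // Editing only this body leaves the outline unchanged, so only this
    // library is recompiled; dependents are untouched.
    return price - price ~/ 10;
  }

  // By contrast, changing the signature above (say, to take and return a
  // double) changes the API shape, so every dependent must be recompiled.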

Individual action efficiency

Outlines are significantly smaller than full Dart files, and they also typically encapsulate the outlines of several Dart libraries in a single file. Not using outlines for the analyzer means that we would have both more inputs and larger inputs to all actions.

For kernel compilation we could still use a kernel file as input (so the same total number of files), but they would be larger than the outlines.

jakemac53 added the static-metaprogramming label Mar 1, 2021
@jakemac53
Contributor Author

cc @natebosch

@jakemac53
Contributor Author

jakemac53 commented Mar 1, 2021

Proposed Solution: Split compile time and runtime deps

The general idea is to provide full Dart files (or full kernel files) for dependencies which are imported (transitively) by any library which is used at compile time (in a macro or as a part of constant evaluation). All other deps would only be provided as outlines.

This doesn't require any language feature; it could be done fully through other static configuration (in bazel) or potentially automatic detection (in build_runner). We may want to investigate a language feature to specify compile time imports, but the benefit would likely be limited to more consistent tree-shaking of compile time imports in the runtime application.

This does not fully solve the issue, but it should significantly mitigate the issue:

  • In the case of static metaprogramming, it limits the amount of code that has to be provided as a full dependency to the code which is imported directly by the libraries that implement the static metaprogramming. Application code would be able to grow in an unbounded fashion and it would have roughly the same compile time characteristics as it does today (in terms of invalidation and input size), even if the application uses static metaprogramming.
  • For constant evaluation, it limits the deps that have to be provided as a full dependency to those that are used in constant evaluation of a given library.

Concrete proposal for specific build systems

Bazel and similar build systems

For static metaprogramming libraries

In build systems such as bazel where we require up front configuration prior to reading any Dart files, the dart_library rule would have an additional flag indicating that the library has apis that will need to run at compile time. Whenever this library is used as a dependency, we would provide it and all of its transitive deps as full dependencies and not outlines to all analyzer, kernel, and DDC actions.

For constant evaluation

In build systems such as bazel where we require up front configuration prior to reading any Dart files, we should split up the deps section of the dart_library rule into two explicit sections - one for runtime deps and one for compile time deps. The compile time deps would be provided as full kernel inputs.
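
A hypothetical BUILD file sketch of both ideas above; the `is_macro_library` flag and the `compile_time_deps` attribute are invented for illustration and do not exist in any current rule set:

  # Invented flag: this library's code runs at compile time, so it and its
  # transitive deps are always provided as full kernel, never outlines.
  dart_library(
      name = "validators",
      srcs = ["validators.dart"],
      is_macro_library = True,
  )

  dart_library(
      name = "my_app",
      srcs = ["my_app.dart"],
      # Ordinary deps: provided as outlines, exactly as today.
      deps = ["//some/runtime/dep"],
      # Invented attribute: deps executed during constant evaluation,
      # provided as full kernel inputs.
      compile_time_deps = [":validators"],
  )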

cc @davidmorgan

Build_runner and more dynamic build systems

Build systems that have the ability to more dynamically choose inputs after reading Dart files may choose to automatically detect libraries that define static metaprogramming features or do const evaluation, removing the need for explicit configuration. This is actually likely a requirement for being able to use libraries that were created without knowledge of build_runner.
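
As a sketch of what such detection could look like (purely hypothetical; `needsFullDeps` and the package prefix are invented, and build_runner has no such API today):

  // Hypothetical heuristic: a library whose imports include a known
  // macro-defining package needs its deps provided as full sources
  // rather than outlines.
  const Set<String> macroPackagePrefixes = {'package:macros/'};

  bool needsFullDeps(Iterable<String> importUris) => importUris
      .any((uri) => macroPackagePrefixes.any((p) => uri.startsWith(p)));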

@leafpetersen
Member

cc @johnniwinther

@jensjoha

jensjoha commented Apr 7, 2021

  • In the case of static metaprogramming [...]
  • For constant evaluation [...]

Can someone explain to me what the characteristics of each of these terms are? What do you mean when you say 'metaprogramming' and what do you mean when you say 'constant evaluation'?

Proposed Solution: Split compile time and runtime deps

The general idea is to provide full Dart files (or full kernel files) for dependencies which are imported (transitively) by any library which is used at compile time (in a macro or as a part of constant evaluation). All other deps would only be provided as outlines.

Am I the only one worried about the usage at compile time just increasing until eventually everything has to be full files?
To me this sounds like something that will look like it works up front because it's not used, but when it actually gets into users' hands the compile will just get slower and slower with no (from the user's perspective) good reason why.

What is suggested here - as I see it - is basically to create a feature we will then (have to) tell the user not to use (or stuff will become slow).

(As a semi-unrelated side note - if I recall correctly - I saw somewhere that constant evaluation should also be allowed to read files, which would mean - for things to work as if it was a single-shot compile - that that file (and its content) would have to be tracked as well. It goes without saying that I - and the complexity of all 'build systems', as well as compile time speed - wouldn't like that.)

  • For constant evaluation, it limits the deps that have to be provided as a full dependency to those that are used in constant evaluation of a given library.

(Which will probably grow over time to eventually be all of them.)

Concrete proposal for specific build systems

Bazel and similar build systems

[...]

Build_runner and more dynamic build systems

[...]

Please remember that the incremental compiler is also impacted by this (it too is basically split into outline and full when running with the 'experimental invalidation' that is becoming the default). You might argue that it's just a "dynamic build system", but I certainly think it's more helpful to have it top of mind.

@jakemac53
Contributor Author

Can someone explain to me what the characteristics of each of these terms are? What do you mean when you say 'metaprogramming' and what do you mean when you say 'constant evaluation'?

Metaprogramming

The general idea here is that you have code defined in some library that supports "metaprogramming". We don't have a concrete proposal, but you could imagine defining a function that takes an AST node and modifies it.

The important distinction here is that the only code running at compile time is code that lives in the library that implements metaprogramming features, and not code that uses those metaprogramming libraries. In blaze etc. we could just create, for instance, a dart_macro_library rule, and that library and its transitive deps would always be provided as full kernel files, but an app can freely use it as much as it wants without incurring additional penalty.
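
Since there is no concrete proposal yet, here is a purely invented sketch of the shape being described (none of these types or methods exist):

  // Invented stand-in for a compiler-provided AST node.
  abstract class ClassAstNode {
    String get name;
    void addMethod(String source);
  }

  // A "macro" is just ordinary Dart code, executed at compile time, that
  // modifies the AST node it is handed.
  void addDebugToString(ClassAstNode clazz) {
    clazz.addMethod("String toString() => 'Instance of ${clazz.name}';");
  }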

Constant evaluation

Specifically here I am talking about allowing you to invoke arbitrary functions in constant expressions. This is more concerning because it could happen in any library anywhere in the program, causing all of the transitive deps of any APIs they call to be pulled in as full kernel files.

Note the distinction though: it is of any APIs they call - it isn't all transitive deps, just the ones actually executed at compile time in const expressions.
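
For example (hypothetical "enhanced const"; the commented-out line is not legal Dart today, and all names here are invented):

  // lib/version.dart - an ordinary function with no const machinery.
  int majorVersion(String version) => int.parse(version.split('.').first);

  // lib/app.dart - under enhanced const, the invocation below would run
  // at compile time, pulling in majorVersion and whatever it calls as
  // full kernel - but only those deps, not the app's whole transitive
  // closure.
  // const int major = majorVersion('2.13.0');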

Am I the only one worried about the usage at compile time just increasing until eventually everything has to be full files?

I don't share the same level of concern here, for a few reasons:

  • It will only be the libraries which are actually used for constant evaluation that have to be provided as full dependencies, not all transitive dependencies of a module.
  • This is handled on a per-module basis, so only modules which actually use the feature would incur any cost at all, and the cost would be immediately visible by looking at the compile time deps section of the rule.

To me this sounds like something that will look like it works up front because it's not used, but when it actually gets into users' hands the compile will just get slower and slower with no (from the user's perspective) good reason why.

I think this is heavily mitigated by the explicit compile time deps separation.

What is suggested here - as I see it - is basically to create a feature we will then (have to) tell the user not to use (or stuff will become slow).

It is worth noting that all kernel compilation outside of DDC today already uses full kernel files. So I don't foresee any sort of apocalyptic scenario here with regards to build times.

For instance, we should also be able to do the same "used input" tracking we already do during the interpretation of these constants, and mark most of the transitive inputs as not actually used, which would help invalidation a lot in some of the more pathological cases.

(As a semi-unrelated side note - if I recall correctly - I saw somewhere that constant evaluation should also be allowed to read files, which would mean - for things to work as if it was a single-shot compile - that that file (and its content) would have to be tracked as well. It goes without saying that I - and the complexity of all 'build systems', as well as compile time speed - wouldn't like that.)

Allowing the reading of files will be addressed separately, and if we can't do it efficiently (and safely) then we just won't allow it. To start, all of dart:io would probably be blocked.

@scheglov
Contributor

scheglov commented Apr 8, 2021

Can someone explain to me what the characteristics of each of these terms are? What do you mean when you say 'metaprogramming' and what do you mean when you say 'constant evaluation'?

Metaprogramming

The general idea here is that you have code defined in some library that supports "metaprogramming". We don't have a concrete proposal, but you could imagine defining a function that takes an AST node and modifies it.

The important distinction here is that the only code running at compile time is code that lives in the library that implements metaprogramming features, and not code that uses those metaprogramming libraries. In blaze etc. we could just create, for instance, a dart_macro_library rule, and that library and its transitive deps would always be provided as full kernel files, but an app can freely use it as much as it wants without incurring additional penalty.

I don't understand. If we have a package foo that implements some meta-programming feature, e.g. processing all classes with names starting with Foo, and we have a package bar that declares a class named FooBar, then to analyze bar the analyzer will need to execute code from foo, so the action that analyzes bar will need foo and all of its transitive dependencies.

I somewhat understand how this could work with code generation currently - we have a separate action that depends on foo and its dependencies, and on bar and its dependencies, and then adds files to bar. Then a separate action analyzes bar with the newly added files. This way the analyzer action does not depend on foo. I'm currently thinking about meta-programming as code generation without the actual code being stored anywhere. So, if we don't store the generated code, the generator should live together with the analyzer in the same action.

@jensjoha

jensjoha commented Apr 8, 2021

This is more concerning because it could happen in any library anywhere in the program, causing all of the transitive deps of any APIs they call to be pulled in as full kernel files.

Note the distinction though: it is of any APIs they call - it isn't all transitive deps, just the ones actually executed at compile time in const expressions.

If an arbitrary function is executed at compile time, and that function is changed we don't know what it has been changed into. It might now call other functions that it didn't before. In fact it might call all functions that it can possibly reach. Those might not have been called before and they too might call whatever they can reach. That certainly seems like the potential for all transitive dependencies to me.

If looking only at the incremental compiler for a minute, (with advanced invalidation) what happens is this:

  • We try to figure out if the outline (for some definition of the outline) has changed or not.
  • If so, we have to invalidate every transitive dependency and recompile it all from scratch.
  • If not we only recompile that single file.

Said another way, if you modify a "regular function" we will only recompile that file.

With the proposed constant evaluation (as I understand it) this no longer works.
We will instead - if the outline hasn't changed - have to figure out which function(s) have changed, and whether those are used anywhere for constant evaluation. If so, we will either have to pretend that the outline has changed and recompile all transitive dependencies, or try to transitively "smart invalidate" those and figure out if/which transitive dependencies it can have had an effect on. All of that depends on us being able to somehow know which functions are called in constant evaluation, something I'm not certain we can do efficiently, and something I very much doubt we can know when initializing from dill.

Am I the only one worried about the usage at compile time just increasing until eventually everything has to be full files?

I don't share the same level of concern here, for a few reasons:

  • It will only be the libraries which are actually used for constant evaluation that have to be provided as full dependencies, not all transitive dependencies of a module.

Except - as noted above - we can't know what is now needed for constant evaluation before having compiled.

  • This is handled on a per-module basis, so only modules which actually use the feature would incur any cost at all, and the cost would be immediately visible by looking at the compile time deps section of the rule.

This is very specific to (I'm guessing) blaze.

It is worth noting that all kernel compilation outside of DDC today already uses full kernel files. So I don't foresee any sort of apocalyptic scenario here with regards to build times.

Again this is about something like blaze. Also - and I may be reading this wrong - it seems to say it's okay to make all DDC users wait (much?) longer.

For instance, we should also be able to do the same "used input" tracking we already do during the interpretation of these constants, and mark most of the transitive inputs as not actually used, which would help invalidation a lot in some of the more pathological cases.

As noted internally (and above) I don't think it's a given that we can do this.

@jakemac53
Contributor Author

I don't understand. If we have a package foo that implements some meta-programming feature, e.g. processing all classes with names starting with Foo, and we have a package bar that declares a class named FooBar, then to analyze bar the analyzer will need to execute code from foo, so the action that analyzes bar will need foo and all of its transitive dependencies.

Right, what I was trying to call out is that the amount of code that needs to be executed at compile time doesn't grow with the size of the app; it's always just foo and its transitive deps. This is, in other words, identical to the current codegen behavior, where only Builder code executes at "compile" time (the build script itself is static, and imports all builders).

I somewhat understand how this could work with code generation currently - we have a separate action that depends on foo and its dependencies, and on bar and its dependencies, and then adds files to bar. Then a separate action analyzes bar with the newly added files. This way the analyzer action does not depend on foo. I'm currently thinking about meta-programming as code generation without the actual code being stored anywhere. So, if we don't store the generated code, the generator should live together with the analyzer in the same action.

None of this is clear yet, we may still want to output real files to disk as there are a lot of advantages to that. It makes the IDE experience more seamless if it can navigate to real files as an example.

Likely the code generation would happen in the same analyzer action though, and it would likely also happen in the kernel actions. Possibly we would try to share that, but it's likely faster in blaze at least to just duplicate the work.

@jakemac53
Contributor Author

If an arbitrary function is executed at compile time, and that function is changed we don't know what it has been changed into. It might now call other functions that it didn't before. In fact it might call all functions that it can possibly reach. Those might not have been called before and they too might call whatever they can reach. That certainly seems like the potential for all transitive dependencies to me.

Yes, the set of functions actually invoked at compile time could be identical to the set of transitive dependencies, but that is unlikely. That would essentially imply the entire program being one const object. Maybe a few programs would fall under that category, but not most (and for the ones that did, it would be explicitly what the user was asking for anyways?).

If looking only at the incremental compiler for a minute, (with advanced invalidation) what happens is this:

  • We try to figure out if the outline (for some definition of the outline) has changed or not.
  • If so, we have to invalidate every transitive dependency and recompile it all from scratch.
  • If not we only recompile that single file.

Said another way, if you modify a "regular function" we will only recompile that file.

With the proposed constant evaluation (as I understand it) this no longer works.
We will instead - if the outline hasn't changed - have to figure out which function(s) have changed, and whether those are used anywhere for constant evaluation. If so, we will either have to pretend that the outline has changed and recompile all transitive dependencies, or try to transitively "smart invalidate" those and figure out if/which transitive dependencies it can have had an effect on. All of that depends on us being able to somehow know which functions are called in constant evaluation, something I'm not certain we can do efficiently, and something I very much doubt we can know when initializing from dill.

I don't understand why this would be challenging. You could add a single boolean flag to each function AST node, indicating whether it was evaluated at compile time. This would get serialized to kernel as a part of the node (there is some question here about modular builds; it could likely be omitted there though). In the constant evaluator (or macro expander), whenever you handle a function invocation you flip the bit on that AST node to true.

Whenever a function body is modified, you check the bit, invalidate all transitive deps if it was set, and then flip the bit back to false. I might be glossing over something here, but it sounds pretty tractable to me?
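
A minimal sketch of that bookkeeping, with invented names (real kernel AST nodes and CFE invalidation look nothing like this):

  // Invented stand-in for a function node in the kernel AST.
  class FunctionNode {
    final String name;
    bool evaluatedAtCompileTime = false; // the proposed single bit
    FunctionNode(this.name);
  }

  // The constant evaluator (or macro expander) flips the bit whenever
  // it invokes the function.
  void onCompileTimeInvocation(FunctionNode fn) {
    fn.evaluatedAtCompileTime = true;
  }

  // When a function body is edited: if the bit is set, dependents must
  // be invalidated transitively; either way the bit is reset.
  bool shouldInvalidateDependents(FunctionNode fn) {
    final invalidate = fn.evaluatedAtCompileTime;
    fn.evaluatedAtCompileTime = false;
    return invalidate;
  }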

Except - as noted above - we can't know what is now needed for constant evaluation before having compiled.

This is why we require an explicit compile time deps section in bazel, separate from normal deps. In build systems that require static configuration of inputs we would have to require the user to configure this (tools can fill this in though, but it needs to be static build config).

Again this is about something like blaze. Also - and I may be reading this wrong - it seems to say it's okay to make all DDC users wait (much?) longer.

We don't know what the impact would be. My implication here is that DDC is actually the outlier - all the other workflows already use full dill files today. Ultimately, no, we would not accept a significant regression in DDC performance for this feature.

Note that externally, Flutter DDC users use the frontend server (and incremental compiler). We could invest in switching internal users over to this as well, although it doesn't play nicely with codegen so it's a lot of work. Codegen is heavily used for Angular apps internally, so it would need to work well, and that is the main reason we haven't switched yet.

@jensjoha

jensjoha commented Apr 9, 2021

Yes, the set of functions actually invoked at compile time could be identical to the set of transitive dependencies, but that is unlikely. That would essentially imply the entire program being one const object.

The entire program does not have to be one const object (or I'm misunderstanding what you mean by that). There has to be one thing used from each library (at least in the CFE, where we invalidate/recompile at the library level, not finer grained than that). Of course "every library" might be unlikely, but whether it's n or n/2 doesn't really matter.

Maybe a few programs would fall under that category, but not most (and for the ones that did, it would be explicitly what the user was asking for anyways?).

To me, even if it's just some select users, or only once in a while, making it slower because of a feature that we give them is not okay. I doubt the user would explicitly ask for the compile to be slow. The user has been given a feature that the user then uses --- and is being penalized for it.

I don't understand why this would be challenging. You could add a single boolean flag to each function AST node, indicating whether it was evaluated at compile time. This would get serialized to kernel as a part of the node (there is some question here about modular builds; it could likely be omitted there though). In the constant evaluator (or macro expander), whenever you handle a function invocation you flip the bit on that AST node to true.

Whenever a function body is modified, you check the bit, invalidate all transitive deps if it was set, and then flip the bit back to false. I might be glossing over something here, but it sounds pretty tractable to me?

What you're suggesting might work if we always started from scratch (and always 'went forward'), but we don't:
Your suggestion would mean that compiling one library changes other libraries, because they're now used in constant evaluation (e.g. user code would change package:flutter code).
That means that the dill file(s) we've outputted are no longer valid. So we can no longer initialize from them. Nor can we utilize the concatenation feature of dill files. And for that matter, the VM saying reject to a change and us having to go back to the previous state no longer works because the old libraries have changed.

@jakemac53
Contributor Author

To me, even if it's just some select users, or only once in a while, making it slower because of a feature that we give them is not okay. I doubt the user would explicitly ask for the compile to be slow. The user has been given a feature that the user then uses --- and is being penalized for it.

This isn't as black and white as you are painting it; it's all about the trade-offs. Most language features are this way, and come at a cost. For instance, one could argue async/await should not exist because it comes with an overhead compared to raw futures. But the trade-offs in terms of usability/readability are generally considered worth it.

In this case, one concrete trade-off is doing more work at compile time in order to do less work at runtime. That very well might be worth it, especially if it means your production app is faster.

We should be evaluating the costs (build time, implementation, etc) versus the benefits and the proposal should live or die based on those metrics.

That means that the dill file(s) we've outputted are no longer valid.

I think I am missing something here; in all the examples I have seen using the frontend server there is only a single dill file. Is that being incrementally serialized to disk?

This sounds similar to the modular build problem though. I would suggest that whenever we are in a mode where we are outputting kernel files in a depth-first order (in terms of the library dependency graph), that we should not serialize this bit at all. Whenever initializing from a dill file that does not have this bit set, you would revert the very first build to the more pessimistic invalidation strategy.

And for that matter, the VM saying reject to a change and us having to go back to the previous state no longer works because the old libraries have changed

Can you explain this case a bit more? I am not familiar with the details of how it works.

@munificent
Member

I doubt the user would explicitly ask for the compile to be slow. The user has been given a feature that the user then uses --- and is being penalized for it.

You could make the exact same argument about static types. Static checking makes compilation a lot slower, and users definitely didn't ask for it to be slower. The more types they use, the more types have to be checked. Dart would compile much faster if it were fully dynamically typed. Of course, it would compile to slower code, but the compilation would be faster. Which is exactly the case with being able to do more execution at compile time that currently has to happen at runtime.

You could also make the same argument about compiler optimizations. But users regularly build with -O3 and ask for more and more sophisticated optimization passes.

You're right that users don't want slow compiles. But they do want the useful behavior that the compiler is busy spending its time producing for them. It's up to us to design features that provide as much value as possible and implementations that spend that time as efficiently as possible. But it's not the case that any additional work happening at compile time is strictly user hostile. It's simply another kind of cost that needs to provide enough benefit.

Users like stuff happening at build time because that catches and reports errors early and makes their runtime smaller and faster. They just want to spend that build time as efficiently as possible.

@leafpetersen
Member

I doubt the user would explicitly ask for the compile to be slow. The user has been given a feature that the user then uses --- and is being penalized for it.

You could make the exact same argument about static types.

Yes, but... we make a lot of compromises in the Dart type system in the name of fast compilation. Perhaps rightly, perhaps wrongly, but we do. I sort of think this discussion is getting a little philosophical in a way that I'm not sure is moving us forward. I think the concern that @jensjoha is raising is entirely valid (and I think we are all broadly in agreement on it): namely, that we cannot afford to ship a feature here which results in significant degradation of the user experience. Our goal here (and this thread has been really helpful towards this) is to understand what we can do, and what it will cost.

Personally, I think I would set out a few concrete goals here for anything that we ship in this space:

  • The feature should be largely pay as you go: users who do not use the feature don't pay a significant cost for its existence
  • The feature should have a predictable impact. There shouldn't be weird performance cliffs where adding a single use of the feature suddenly slows all the builds down dramatically.
  • We should be confident that reasonable uses of the feature impose a performance penalty commensurate with the benefit to the user.

I realize that the last in particular is very squishy. What is a reasonable use? What is commensurate? I think looking at existing codegen uses is a good start. If existing uses of codegen can be eliminated, and made faster in the process, then that's a clear win under these criteria. There is an obvious danger, already called out here though, that the convenience of the feature may lead to much more ubiquitous usage. This is something we will need to try to account for analytically or experimentally.

Good discussion!

@jensjoha

This isn't as black and white as you are painting it; it's all about the trade-offs. Most language features are this way, and come at a cost. For instance, one could argue async/await should not exist because it comes with an overhead compared to raw futures. But the trade-offs in terms of usability/readability are generally considered worth it.

To my mind these are very different. async/await is slower (has more overhead) when executing it. That also means that if you, say, were to execute tests that didn't include any such path, you didn't pay a thing. If you execute a path that has exactly one such usage, you pay for exactly one such usage.
With this, the payment is up front at compile time no matter the actual usage --- if you change a file to fix something and that file is used by const evaluation somewhere (if we can track it), your compile just got slower, even if what you are about to do (say run a specific test, or try some click-order in your flutter app) doesn't exercise it.

In this case, one concrete trade-off is doing more work at compile time in order to do less work at runtime. That very well might be worth it, especially if it means your production app is faster.

Sure. Everything is a trade-off. What I'm worried about is that when looking at this, the looking is very narrow. For production apps, where - I'm assuming - incremental compilation is not really a thing (I'm, maybe naively, assuming that one does a single-shot build for a release, which also suddenly has to go through AOT phases etc.), the trade-off looks very different than for the hot-reload scenario.

I think I am missing something here; in all the examples I have seen using the frontend server there is only a single dill file. Is that being incrementally serialized to disk?

Well... Things are complicated. We also do "incremental serialization" so that we don't have to re-serialize (for instance) the flutter package if it doesn't change. Your suggestion allows user code to change used packages (e.g. the flutter package), so we couldn't do that anymore.
When changing a test for instance (iteratively writing your test case or similar) and running flutter test, this makes a big difference.

This sounds similar to the modular build problem though. I would suggest that whenever we are in a mode where we are outputting kernel files in a depth-first order (in terms of the library dependency graph), that we should not serialize this bit at all. Whenever initializing from a dill file that does not have this bit set, you would revert the very first build to the more pessimistic invalidation strategy.

Now we're again - it seems - moving to a blaze or blaze-like build system.
Also, to me at least, making the first build slower isn't okay either. That could be every test run, for instance.

Can you explain this case a bit more? I am not familiar with the details of how it works.

The protocol is that the server is asked to compile something and the VM then responds with either accept or reject.
When accepting, the server "joins" the "current world" (i.e. all of the latest libraries sent to the VM), and when rejecting it throws the incremental compiler away and starts over, initializing from the "current world" (i.e. all the libraries accepted by the VM). This is done so that if the VM rejects an update, the VM and the incremental compiler still agree about what the world looks like. But if the rejected compile changed other libraries (by setting bits), the world we'd go back to would be false.

You could make the exact same argument about static types.

I'd argue yes and no here. Yes, static types make it slower. But - at least to my knowledge - it's proportional: every library becomes some factor slower to compile.
For the constant evaluation, though, it could mean that without it we could compile 1 file, but with it we'd have to compile n files.

You could also make the same argument about compiler optimizations. But users regularly build with -O3 and ask for more and more sophisticated optimization passes.

This makes me more uneasy, because this - to me at least - focuses very much on a different scenario than the one I worry about. The way I read this, you're talking about a one-shot (AOT) compile, where the trade-offs are naturally entirely different.

But it's not the case that any additional work happening at compile time is strictly user hostile. It's simply another kind of cost that needs to provide enough benefit.

That's not the argument I'm trying to make. The argument I'm trying to make is that I'm afraid the trade-offs involved are being looked at one-sidedly.

Users like stuff happening at build time because that catches and reports errors early and makes their runtime smaller and faster. They just want to spend that build time as efficiently as possible.

Again this depends very much on the scenario. For hot reload and test runs etc. I don't think this is generally true. Also, I don't see how the proposed constant evaluation catches and reports more errors at compile time --- the way I see it, it will only make debugging any errors they find later that much more challenging.

  • The feature should be largely pay as you go: users who do not use the feature don't pay a significant cost for its existence

Everything else being equal, for the incremental compilation to still be sound we would (at least):

  • Have to disable advanced invalidation.
  • Have to disable incremental serialization.

We might to some extent be able to engineer our way out of some of this (but certainly not all of it), but I think it will come at a significant cost, both in engineering time (building out the necessary tracking) and in runtime cost and memory usage (the extra overhead of doing the tracking).

@jakemac53
Contributor Author

jakemac53 commented Apr 12, 2021

With this, the payment is up front at compile time no matter the actual usage --- if you change a file to fix something and that file is used by const evaluation somewhere (if we can track it), your compile just got slower, even if what you are about to do (say run a specific test, or try some click-order in your flutter app) doesn't exercise it.

This is still paying for actual usage, but yes, the usage is at a distance, not at the function definition site. If the tests don't do any const eval using that function, it won't incur any cost. It is true that this won't take into account the runtime usage of constants.

What I'm worried about is that when looking at this, the looking is very narrow

This particular issue was filed to look at the effects on modular compilation, which is why the discussion/proposal I had is more focused on that explicitly. The general proposal does call out all the other use cases, such as the frontend server (or incremental compiler in general) and expression evaluation as well. We are definitely taking a broad look here, but we hadn't had any concerns raised yet on the incremental compilation side of things, so we didn't yet have an issue filed. I think you have raised enough concerns that it should definitely have its own issue/solutions proposed, but that is why this issue/proposal seems so one-sided :).

I will go ahead and create that issue today with my proposed solution and we can iterate on that there.

The protocol is that the server is asked to compile something and the VM then responds with either accept or reject.
When accepting, the server "joins" the "current world" (i.e. all of the latest libraries sent to the VM), and when rejecting it throws the incremental compiler away and starts over, initializing from the "current world" (i.e. all the libraries accepted by the VM). This is done so that if the VM rejects an update, the VM and the incremental compiler still agree about what the world looks like. But if the rejected compile changed other libraries (by setting bits), the world we'd go back to would be false.

It doesn't have to happen in this issue, but I am curious to understand more here: in what scenarios would the VM reject a change? However, I think this gives further weight to the idea of never serializing this bit. It does not have any semantic meaning for the program; it is purely used for better invalidation. If the bit is never serialized, the VM would never see it no matter what state it is in, and it wouldn't matter if the bit internally changed in the incremental compiler between runs.

This makes me more uneasy, because this - to me at least - focuses very much on a different scenario than the one I worry about. The way I read this, you're talking about a one-shot (AOT) compile, where the trade-offs are naturally entirely different.

Yeah, I agree the release compiles have entirely different trade-offs than dev builds. We could create a separate issue to track any concerns with those types of builds as well; I don't expect this proposal to have as much of an effect on them though.

Everything else being equal, for the incremental compilation to still be sound we would (at least):

  • Have to disable advanced invalidation.
  • Have to disable incremental serialization.

I think the proposal here does allow both of these things to be enabled. Or at least I am not convinced it doesn't :).

@munificent
Member

I agree very much with your general point that we need to take incremental/hot-reload iteration time into account when doing cost/benefit analysis. I think that is implicit in how Jake and I think about the feature—we fully understand that hot reload is one of Dart's most valuable features—but it's always good to have it re-emphasized.

Also, I don't see how the proposed constant evaluation catches and reports more errors at compile time --- the way I see it, it will only make debugging any errors they find later that much more challenging.

Any "runtime" error thrown during constant evaluation becomes a compile time error. This means that existing code that does validation and throws on failure could potentially turn into compile-time validation. For example, right now the language gives you no compile-time validation that your RegExp strings are actually valid regular expressions. If you forget to escape a ( or something, you don't find out until runtime. That's because RegExp doesn't have a const constructor.

If we are able to make that constructor const (which may be hard in particular for RegExp because it relies on native code, but consider the general concept here), then many of those instantiations can become const. That in turn means any parsing and validation in the constructor body happens at compile time and validation failures show up eagerly in the IDE.
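
As a concrete sketch of that idea (hypothetical: RegExp has no const constructor today, so the commented-out line is not legal Dart):

  // Today: the invalid pattern only throws a FormatException when this
  // line actually executes at runtime.
  final broken = RegExp(r'(unclosed');

  // Under enhanced const (hypothetical), construction - and the pattern
  // validation in the constructor body - would happen at compile time,
  // surfacing the error eagerly in the IDE:
  // const broken = RegExp(r'(unclosed');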

Basically, any class that does useful validation in its constructor body can't make that constructor const today, which means that validation only happens at runtime. Enhanced const allows const constructors with interesting bodies, and that in turn lets that validation run at compile time.

Conversely, some classes simply don't validate everything that they could in order to enable their constructors to be const. For example, the Tolerance class in Flutter is documented to say:

/// The arguments should all be positive values.

But it doesn't actually validate that at all. Enhanced const would let them write:

  const Tolerance({
    this.distance = _epsilonDefault,
    this.time = _epsilonDefault,
    this.velocity = _epsilonDefault,
  }) {
    if (distance <= 0.0) throw ArgumentError.value(distance, 'distance');
    if (time <= 0.0) throw ArgumentError.value(time, 'time');
    if (velocity <= 0.0) throw ArgumentError.value(velocity, 'velocity');
  }

@jensjoha

/// The arguments should all be positive values.

But it doesn't actually validate that at all. Enhanced const would let them write:

  const Tolerance({
    this.distance = _epsilonDefault,
    this.time = _epsilonDefault,
    this.velocity = _epsilonDefault,
  }) {
    if (distance <= 0.0) throw ArgumentError.value(distance, 'distance');
    if (time <= 0.0) throw ArgumentError.value(time, 'time');
    if (velocity <= 0.0) throw ArgumentError.value(velocity, 'velocity');
  }

The same thing is already possible without enhanced const:

  const Tolerance({
    this.distance = _epsilonDefault,
    this.time = _epsilonDefault,
    this.velocity = _epsilonDefault,
  })  : assert(distance >= 0),
        assert(time >= 0),
        assert(velocity >= 0);

which gives:

$ out/ReleaseX64/dart pkg/front_end/t.dart
foo.dart:2:31: Error: Constant evaluation error:
  const Tolerance foo = const Tolerance(distance: -1);
                              ^
foo.dart:33:25: Context: This assertion failed.
  })  : assert(distance >= 0),
                        ^
foo.dart:2:19: Context: While analyzing:
  const Tolerance foo = const Tolerance(distance: -1);

when trying something like const Tolerance foo = const Tolerance(distance: -1);.

(And also - to me - this makes it seem like you expect there to be lots of usage, which again stokes my worries.)

@munificent
Member

which gives:

$ out/ReleaseX64/dart pkg/front_end/t.dart
foo.dart:2:31: Error: Constant evaluation error:
  const Tolerance foo = const Tolerance(distance: -1);
                              ^
foo.dart:33:25: Context: This assertion failed.
  })  : assert(distance >= 0),
                        ^
foo.dart:2:19: Context: While analyzing:
  const Tolerance foo = const Tolerance(distance: -1);

Interesting. I thought that was prohibited because you can't perform operations on doubles in const expressions. Maybe comparisons are allowed?

@scheglov
Contributor

I think the intention is that you are not allowed to execute custom (i.e. arbitrary) code, but it is fine to call code where we know what it does and that it does it consistently.
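
That matches what the language allows today: for instance, comparisons on doubles are permitted in constant expressions because their semantics are fully known to the compiler. A minimal example:

  // Legal today: `>=` on numeric constants is a permitted constant
  // operation, unlike a call into arbitrary user-written code.
  const bool valid = 1.5 >= 0.0;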

@Levi-Lesches

@jakemac53, @munificent, how compatible would you say #1565 is with these concepts? (Details for the proposal are under the "Implementation" section.)

Borrowing from @scheglov, this is roughly what it looks like:

We have a separate action that depends on generator and its dependencies, and on hand-written and its dependencies, and then adds files to hand-written. Then a separate action analyzes hand-written with newly generated files. This way the analyzer action does not depend on generator.

So I didn't get into incremental or modular compilation. Instead, I kept the code generation separate from the actual code, such that each phase has its own compilation and execution.

@jakemac53
Contributor Author

@Levi-Lesches that sounds similar to how the current build_runner codegen works - it is possible we may want some shared thing that expands macros, but it would come at a very large cost. That thing has to fully parse and analyze the program in order to expand macros, and if that parsing/analysis is not shared with the compiler or analyzer then it is all duplicated work.

Basically, yes, it could help the analysis/compilation steps, but the new steps which are added still have the same problem that analysis/compilation had before, and the total number of steps is roughly doubled. Each "step" in modular compilation carries a fair bit of overhead, so doubling the length of the "critical path" (the longest sequential path of actions required to compile the app) does come with its own problems and is probably not desirable.

@Levi-Lesches

Makes sense. So instead of starting and stopping, we'd like to find a way to say "we know everything we need to generate this code", then once it's generated, pick up where we left off?

@jakemac53
Contributor Author

jakemac53 commented May 10, 2021

Makes sense. So instead of starting and stopping, we'd like to find a way to say "we know everything we need to generate this code", then once it's generated, pick up where we left off?

There are a couple of pieces to this - but yes, we would want to avoid restarting the compiler entirely after generating each piece of code, as is currently suggested. This isn't actually directly related to the problem at hand in this issue though - which is the problem of needing full source inputs in order to run macros, which negatively affects invalidation and the size and quantity of inputs. Your proposal doesn't make that fundamental piece any better or worse really (as far as I can tell at least).

@Levi-Lesches

Your proposal doesn't make that fundamental piece any better or worse really (as far as I can tell at least).

Okay, that's good. I don't have any experience working with compilers, so I tried going for something as neutral as possible. I'm gonna stay subscribed to this issue and hopefully learn something.

@davidmorgan
Contributor

I suspect this is obsolete, if not please reopen.
