-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hot-Reload can cause crashes with old closures in the heap #41482
Comments
This looks like an easier version of TEST_CASE(IsolateReload_PendingStaticCall_DefinedToNSM). This was definitely handled before, but likely broken by kernel. |
I don't know how it was handled before, but it could be handled in the flow graph building code: Once hot-reload is done we clear out all code. Then we continue running, which will invoke the closure. That will trigger unoptimized compile of closure. The flow graph builder can then see that the call to the constructor (or any static method) is invalid due to mismatch in signatures (or types) and can call NSM directly. /cc @alexmarkov @mraleph WDYT? |
Out of curiosity I did a bit of archeology: So it seems we have never really supported the case when the code is not recompiled from scratch.
Checking number of parameters is easy (though it would still require changing flow graph builder to stop trusting Kernel, it's a bit of a slippery slope) - but with types things would get complicated, we would essentially need to duplicate some of the front end code (especially because static types of expressions are not serialised into kernel). How about a different approach: in unoptimized code our static calls are already going through a stub. There is even a code which is supposed to rebind static target on reload, and this code even does some argument descriptor validation, which does not actually do anything if argument descriptors does not match (there is TODO to redirect to NSM). Can it be that fixing this TODO would actually fix the crash? |
@mraleph What about types? If we keep number of positional/named parameters but change the types of parameters, it becomes unsound, because we have old callers not providing the right type the new callee expects. |
@mkustermann rebinder has access to the old target - so it can compare the signature and check if new one is a valid override for the old one. |
It is not sufficient to check static calls. Old code was compiled with different class hierarchy and different surrounding code, and it may not be valid with hot-reloaded code. Executing old code of a closure is totally unsound and may cause crashes. For example, consider the following code: class A {}
class B extends A {}
main() {
A foo() => new B();
// hot-reload here
A a = foo();
} In the closure code the instance of class It could be hot-reloaded to class A {}
class B {}
... Executing old code of a closure breaches soundness, although no signatures changed. Some time ago we discussed this problem in the context of hot-reloading bytecode, and I remeber that @rmacnak-google had a prototype to match closure instances to the new hot-reloaded code if possible, otherwise throw an error if unmatched closure is executed. This would guarantee that we will not attempt to execute invalid old code. @rmacnak-google could you remind what was the result of that experiment? |
@mraleph Right, IsolateReload_PendingStaticCall_DefinedToNSM never passed. It is a more difficult version of the test Martin provided, which historically did pass in the VM's frontend. Executing old code, either through a stored closure or activations still on the stack, or executing additional code, such as loaded through eval or loadUri, can breach soundness because they assume but do not verify the world of types is the same. The experiment @alexmarkov mentioned is in https://dart-review.googlesource.com/c/sdk/+/111316/ It seemed okay for the few cases I tried in Flutter Gallery, but my intuition is still that this will reject more valid changes that than it will prevent invalid changes, which is par for Dart 2. Soundness will also require disallowing reload except at the event loop boundary, which means fix-and-continue debugging will be impossible. |
Following. |
Does not verify argument types, which remains a source of unsoundness. Bug: #37517 Bug: #41482 Change-Id: I2ebfd36370bc86f9f8a0e2c2463a2f5b72a50388 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/143541 Commit-Queue: Ryan Macnak <rmacnak@google.com> Reviewed-by: Alexander Markov <alexmarkov@google.com>
Does not verify argument types, which remains a source of unsoundness. Bug: #37517 Bug: #41482 Change-Id: I2ebfd36370bc86f9f8a0e2c2463a2f5b72a50388 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/143541 Commit-Queue: Ryan Macnak <rmacnak@google.com> Reviewed-by: Alexander Markov <alexmarkov@google.com>
Is this issue fixed now after 847a127 ? can it be closed ? |
No - there are still potential issues with old closures. They were compiled to kernel under the old static knowledge (e.g. class hierarchy). Even after hot-reload, the re-compilation of old closures in the VM will use the old kernel and therefore assume the old static knowledge is still valid, which can be no longer the case after hot-reload. (Alex provided an example in #41482 (comment) ) We should find someone to look into this, possibly in collaboration with CFE team. Ideally the CFE (which knows about the static knowledge) could give the vm a hot-reload diff, containing additional information about closures. The VM could then refuse the reload if there is an old closure in the heap which the CFE told us is no longer valid. |
@mkustermann
|
@johnniwinther is going to take a look at the suggestion made in #41482 and see if it is feasible for CFE to provide that information. Based on that investigation we will decide on how to proceed on this issue. |
@jensjoha Can you look into whether we can provide information as suggested in #41482 (comment) ? |
Either I don't understand, or it's also a problem for non-closures. E.g. if I have this flutter library:
and run it, pressing the + 3 times I get this terminal output:
(looking at the code, notice that lookAtMe contains a closure claiming to return an A returning a B (which at this point is an A) and a field of claimed type A containing a B (which at this point is an A). Now apply these changes:
hot reload, press the + 1 time and the terminal now says this:
Isn't this the situation - both for closures and fields - that is expressed in #41482 (comment) and thus a problem not centered around closures? (neither of these seem to crash the VM though --- is that the only concern?) (pressing it twice more gives
i.e. the old stuff is cleared and the new stuff is set and everything is fine again) |
What's the status on this? |
After further discussions with @rmacnak-google and @alexmarkov it is clear that fixing this would require some prototyping to evaluate the options available and the fix is going to take longer so moving this issue out of the October milestone and the plan is to complete this in Q4. |
Any updates here? Is this issue still coming up in practice? (I.e., should this be a P1?) |
I don't think this needs to be a P1. |
//cc @rmacnak-google @alexmarkov do we still think this is something that we will plan to fix in the near term, should it still be a 'p2' ? |
The following flutter issues reporting SEGVs
might be caused by this problem.
This example vm/cc test can cause a segfault:
Tentatively assigning to @rmacnak-google due to hot-reload.
The text was updated successfully, but these errors were encountered: