-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Races between background compiler and unopt switchable calls implementation #37595
Comments
fyi - @danrubel - looks like a fasta issue? |
Also cc @johnniwinther, @stefantsov, @jensjoha who are the recent contributors to https://github.com/dart-lang/sdk/blob/master/pkg/front_end/lib/src/fasta/kernel/expression_generator.dart. |
This looks very weird. The call |
@mkustermann: To me this looks VM related giving the call stack doesn't match the Dart code. |
If this is caused by a VM mis-optimization we would need to be able to reproduce this to find out what's going on. /cc @a-siva |
I concur with @johnniwinther's analysis of the problem. This is most likely a VM bug, something related to method dispatch - potentially optimizing compiler related. Given the timing of when it started surfacing it might be related to @rmacnak-google's changes in the method dispatch (enabling switchable calls in the JIT mode). Ryan, do you have any insight here? You fixed some bugs in the optimizing compiler after landing - does this look like one of those? |
@jensjoha Pointed out that this happens sometimes on our CQ try jobs as well. For example see this failed build:
Though so far I was unable to reproduce this issue. |
@rmacnak-google do you think it makes sense to turn off FLAG_unopt_monomorphic_calls and FLAG_unopt_megamorphic_calls for a week in the VM to see if these sporadic crashes stop when the flags are off. There is of course the issue of having to deal with the hot reload benchmark regressing when we turn off these flags. |
I think I've identified the underlying bug. After 1e24fe7 our optimizing compiler starts to use the megamorphic caches as input for making optimization decisions.
We've added locking around the megamorphic cache to prevent mutator and BG compiler accessing it concurrently. What we've not ensured is that BG compiler always sees a consistent view of the megamorphic cache: The BG compiler consults the megamorphic cache several times during the optimization of a function. Since the mutator is running concurrently, the BG compiler can get different answers each time it looks at the megamorphic cache. This is precisely what seems to happen here:
Here's a small diff which seems to fix it: diff --git a/runtime/vm/compiler/call_specializer.h b/runtime/vm/compiler/call_specializer.h
index e9a0a6d580..2b1e4ee25d 100644
--- a/runtime/vm/compiler/call_specializer.h
+++ b/runtime/vm/compiler/call_specializer.h
@@ -117,12 +117,6 @@ class CallSpecializer : public FlowGraphVisitor {
protected:
void InlineImplicitInstanceGetter(Definition* call, const Field& field);
- SpeculativeInliningPolicy* speculative_policy_;
- const bool should_clone_fields_;
-
- private:
- bool TypeCheckAsClassEquality(const AbstractType& type);
-
// Insert a check of 'to_check' determined by 'unary_checks'. If the
// check fails it will deoptimize to 'deopt_id' using the deoptimization
// environment 'deopt_environment'. The check is inserted immediately
@@ -133,6 +127,12 @@ class CallSpecializer : public FlowGraphVisitor {
Environment* deopt_environment,
Instruction* insert_before);
+ SpeculativeInliningPolicy* speculative_policy_;
+ const bool should_clone_fields_;
+
+ private:
+ bool TypeCheckAsClassEquality(const AbstractType& type);
+
// Insert a Smi check if needed.
void AddCheckSmi(Definition* to_check,
intptr_t deopt_id,
diff --git a/runtime/vm/compiler/jit/jit_call_specializer.cc b/runtime/vm/compiler/jit/jit_call_specializer.cc
index e60517614c..7f013022f1 100644
--- a/runtime/vm/compiler/jit/jit_call_specializer.cc
+++ b/runtime/vm/compiler/jit/jit_call_specializer.cc
@@ -167,7 +167,20 @@ void JitCallSpecializer::VisitInstanceCall(InstanceCallInstr* instr) {
// Type propagation has not run yet, we cannot eliminate the check.
// TODO(erikcorry): The receiver check should use the off-heap targets
// array, not the IC array.
- AddReceiverCheck(instr);
+
+ // After we determined `targets.HasSingleTarget()` the mutator might have
+ // updated the megamorphic cache by adding more entries with *different*
+ // targets.
+ //
+ // We therefore have to ensure the class check we insert is only valid for
+ // precisely the [targets] classes we have based our decision upon.
+ //
+ // (i.e. we cannot use [AddReceiverCheck]/[AddCheckClass], since it
+ // internally consults megmorphic cache gain, which can return a superset
+ // of the classes - possibly with different targets)
+ //
+ AddCheckClass(instr->Receiver()->definition(), targets, instr->deopt_id(), instr->env(), instr);
+
// Call can still deoptimize, do not detach environment from instr.
const Function& target =
Function::ZoneHandle(Z, unary_checks.GetTargetAt(0)); Though there might be more places where the compiler doesn't see a consistent picture of the megamorphic cache. => We should ensure that during one compilation, the compiler always sees the megamorphic cache in the same state (similarly to how we do this for ICData). |
Related to #37575 |
…idTargets we made the HasSingleTarget decision on This is a temporary fix until we implement some abstraction to ensure the BG compiler has always a consistent view of megamorphic caches it's interested in. Issue #37595 Change-Id: I98ec29db7a4ecd5776f724e11d50fb4535bacdd1 Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/113694 Commit-Queue: Martin Kustermann <kustermann@google.com> Reviewed-by: Ryan Macnak <rmacnak@google.com>
Marking this as D25 as the bug itself is in D25 /cc @aadilmaan |
A fix from kustermann@ has landed which fixes this issue, we have another issue opened at #37575 for more code health/cleanup work. |
I don't think the issue is resolved - @mkustermann's fix only fixed one race that was reproducible. I think there are more races hiding. I am reopening this and marking this a P0 and D25 blocker. Here is at least one more race that I found: note that This means that in general due to a race we might end up with a clone that has a single check in entries but is marked as megamorphic (because right after we clone This means that some methods in call specializer (as an example |
…ackground compiler thread. This is to prevent the compiler from becoming confused by mutations from the Dart execution thread. Bug: #37575 Bug: #37595 Change-Id: I43c42feff54d76b780beb6e8d627f1b1641f4bee Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/114054 Commit-Queue: Ryan Macnak <rmacnak@google.com> Reviewed-by: Martin Kustermann <kustermann@google.com>
@rmacnak-google is this issue fixed now? |
The race is much less likely now, but there's still work to rationalize the boundary between runtime and compiler access to type feedback. |
One known (but indeed much less likely) race remains - @mkustermann said he is going to prepare a quick patch for it. |
cl/114322 does that. |
…ng ICData to be megamorphic. The mutator has to ensure all writes to the ICData are flushed before we set the megamorphic bit. Similarly the BG compiler has to ensure (during cloning of ICData) that all loads from the ICData happen after we've determined whether the megamorphic bit has been set. This allows us to maintain the invariant that if the ICData is marked as megamorphic, the entries in it will have reached FLAG_max_polymorphic_checks. Issue #37595 Change-Id: Idec7aef8ca1369668f5c9c57b7b32759acbd270d Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/114322 Commit-Queue: Martin Kustermann <kustermann@google.com> Reviewed-by: Vyacheslav Egorov <vegorov@google.com>
Great. Is this closeable at this point? |
The known races are fixed, so we can close it. Though we still need to pursue the refactoring tracked at #37575 to avoid other potential bugs (where the compiler makes decisions purely based on ICData but uses ICData + MegamorphicCache for inserting class checks). |
Compling
dev\bots\test.dart
on Windows from this commit:flutter/flutter@63ffbfb
...caused this crash:
The text was updated successfully, but these errors were encountered: