Incremental CFE error from DDK #36644

Closed
vsmenon opened this issue Apr 16, 2019 · 16 comments

@vsmenon
Member

commented Apr 16, 2019

When editing Flutter Web Gallery or similar, I occasionally get the following stack trace:

Crash when compiling packages/flutter_web.examples.gallery/gallery/demos.dart,
at character offset 2207:
packages/flutter_web.examples.gallery/demo/material/chip_demo.dart: No class info for ChipDemo
#0      ClosedWorldClassHierarchy._buildInterfaceMembers (package:kernel/class_hierarchy.dart:1008:7)
#1      ClosedWorldClassHierarchy.getInterfaceMembers (package:kernel/class_hierarchy.dart:649:12)
#2      ClosedWorldClassHierarchy.getInterfaceMember (package:kernel/class_hierarchy.dart:643:25)
#3      TypeInferrerImpl.ensureAssignable (package:front_end/src/fasta/type_inference/type_inferrer.dart:641:39)
#4      ClosureContext.handleReturn (package:front_end/src/fasta/type_inference/type_inferrer.dart:332:20)
#5      InferenceVisitor.visitReturnJudgment (package:front_end/src/fasta/kernel/inference_visitor.dart:1694:20)
#6      ReturnJudgment.acceptInference (package:front_end/src/fasta/kernel/kernel_shadow_ast.dart:1199:20)
#7      ShadowTypeInferrer.inferStatement (package:front_end/src/fasta/kernel/kernel_shadow_ast.dart:1572:24)
#8      TypeInferrerImpl.inferLocalFunction (package:front_end/src/fasta/type_inference/type_inferrer.dart:1466:5)
#9      InferenceVisitor.visitFunctionNodeJudgment (package:front_end/src/fasta/kernel/inference_visitor.dart:449:21)
#10     InferenceVisitor.visitFunctionExpression (package:front_end/src/fasta/kernel/inference_visitor.dart:464:27)
#11     FunctionExpression.accept1 (package:kernel/ast.dart:3747:43)
#12     ShadowTypeInferrer.inferExpression (package:front_end/src/fasta/kernel/kernel_shadow_ast.dart:1526:18)
#13     TypeInferrerImpl.inferInvocation.<anonymous closure> (package:front_end/src/fasta/type_inference/type_inferrer.dart:1250:28)
#14     TypeInferrerImpl._forEachArgument (package:front_end/src/fasta/type_inference/type_inferrer.dart:1828:15)
#15     TypeInferrerImpl.inferInvocation (package:front_end/src/fasta/type_inference/type_inferrer.dart:1243:5)
#16     InferenceVisitor.visitConstructorInvocation (package:front_end/src/fasta/kernel/inference_visitor.dart:174:36)
#17     ConstructorInvocation.accept1 (package:kernel/ast.dart:3025:43)
#18     ShadowTypeInferrer.inferExpression (package:front_end/src/fasta/kernel/kernel_shadow_ast.dart:1526:18)
#19     InferenceVisitor.inferElement (package:front_end/src/fasta/kernel/inference_visitor.dart:817:40)
#20     InferenceVisitor.visitListLiteralJudgment (package:front_end/src/fasta/kernel/inference_visitor.dart:883:25)
#21     ListLiteralJudgment.acceptInference (package:front_end/src/fasta/kernel/kernel_shadow_ast.dart:967:20)
#22     ShadowTypeInferrer.inferExpression (package:front_end/src/fasta/kernel/kernel_shadow_ast.dart:1524:18)
#23     InferenceVisitor.visitVariableDeclarationJudgment (package:front_end/src/fasta/kernel/inference_visitor.dart:2003:16)
#24     VariableDeclarationJudgment.acceptInference (package:front_end/src/fasta/kernel/kernel_shadow_ast.dart:1737:20)
#25     ShadowTypeInferrer.inferStatement (package:front_end/src/fasta/kernel/kernel_shadow_ast.dart:1572:24)
#26     InferenceVisitor.visitBlockJudgment (package:front_end/src/fasta/kernel/inference_visitor.dart:90:16)
#27     BlockJudgment.acceptInference (package:front_end/src/fasta/kernel/kernel_shadow_ast.dart:207:20)
#28     ShadowTypeInferrer.inferStatement (package:front_end/src/fasta/kernel/kernel_shadow_ast.dart:1572:24)
#29     TypeInferrerImpl.inferFunctionBody (package:front_end/src/fasta/type_inference/type_inferrer.dart:1175:5)
#30     BodyBuilder.finishFunction (package:front_end/src/fasta/kernel/body_builder.dart:768:20)
#31     DietListener.listenerFinishFunction (package:front_end/src/fasta/source/diet_listener.dart:803:14)
#32     DietListener.buildFunctionBody (package:front_end/src/fasta/source/diet_listener.dart:837:7)
#33     DietListener.endTopLevelMethod (package:front_end/src/fasta/source/diet_listener.dart:312:5)
#34     Parser.parseTopLevelMethod (package:front_end/src/fasta/parser/parser.dart:2407:14)
#35     Parser.parseTopLevelMemberImpl (package:front_end/src/fasta/parser/parser.dart:2308:14)
#36     Parser.parseTopLevelDeclarationImpl (package:front_end/src/fasta/parser/parser.dart:556:14)
#37     Parser.parseUnit (package:front_end/src/fasta/parser/parser.dart:347:15)
#38     SourceLoader.buildBody (package:front_end/src/fasta/source/source_loader.dart:278:14)
<asynchronous suspension>
#39     Loader.buildBodies (package:front_end/src/fasta/loader.dart:186:15)
<asynchronous suspension>
#40     KernelTarget.buildComponent.<anonymous closure> (package:front_end/src/fasta/kernel/kernel_target.dart:301:20)
<asynchronous suspension>
#41     withCrashReporting (package:front_end/src/fasta/crash.dart:122:24)
<asynchronous suspension>
#42     KernelTarget.buildComponent (package:front_end/src/fasta/kernel/kernel_target.dart:299:12)
<asynchronous suspension>
#43     IncrementalCompiler.computeDelta.<anonymous closure> (package:front_end/src/fasta/incremental_compiler.dart:308:28)
<asynchronous suspension>
#44     CompilerContext.runInContext.<anonymous closure>.<anonymous closure> (package:front_end/src/fasta/compiler_context.dart:122:46)
#45     new Future.sync (dart:async/future.dart:224:31)
#46     CompilerContext.runInContext.<anonymous closure> (package:front_end/src/fasta/compiler_context.dart:122:19)
#47     _rootRun (dart:async/zone.dart:1124:13)
#48     _CustomZone.run (dart:async/zone.dart:1021:19)
#49     _runZoned (dart:async/zone.dart:1516:10)
#50     runZoned (dart:async/zone.dart:1463:12)
#51     CompilerContext.runInContext (package:front_end/src/fasta/compiler_context.dart:121:12)
#52     IncrementalCompiler.computeDelta (package:front_end/src/fasta/incremental_compiler.dart:127:20)
<asynchronous suspension>
#53     _compile (package:dev_compiler/src/kernel/command.dart:253:64)
<asynchronous suspension>
#54     compile (package:dev_compiler/src/kernel/command.dart:41:18)
<asynchronous suspension>
#55     compile (package:dev_compiler/src/compiler/shared_command.dart:401:12)
#56     _CompilerWorker.performRequest.<anonymous closure> (file:///Users/vsm/dart/sdk/pkg/dev_compiler/bin/dartdevc.dart:57:15)
#57     _rootRun (dart:async/zone.dart:1124:13)
#58     _CustomZone.run (dart:async/zone.dart:1021:19)
#59     _runZoned (dart:async/zone.dart:1516:10)
#60     runZoned (dart:async/zone.dart:1463:12)
#61     _CompilerWorker.performRequest (file:///Users/vsm/dart/sdk/pkg/dev_compiler/bin/dartdevc.dart:56:24)
<asynchronous suspension>
#62     AsyncWorkerLoop.run.<anonymous closure> (package:bazel_worker/src/worker/async_worker_loop.dart:33:41)
#63     _rootRun (dart:async/zone.dart:1124:13)
#64     _CustomZone.run (dart:async/zone.dart:1021:19)
#65     _runZoned (dart:async/zone.dart:1516:10)
#66     runZoned (dart:async/zone.dart:1463:12)
#67     AsyncWorkerLoop.run (package:bazel_worker/src/worker/async_worker_loop.dart:33:26)
<asynchronous suspension>
#68     main (file:///Users/vsm/dart/sdk/pkg/dev_compiler/bin/dartdevc.dart:28:57)
<asynchronous suspension>
#69     _startIsolate.<anonymous closure> (dart:isolate-patch/isolate_patch.dart:296:32)
#70     _RawReceivePortImpl._handleMessage (dart:isolate-patch/isolate_patch.dart:171:12)

This appears to be unrelated to the constant flag. I'm hitting this in bleeding edge. @kevmoo has hit the same in dev (a few days old).

@vsmenon

Member Author

commented Apr 16, 2019

fyi - @stefantsov @jmesserly @jakemac53 @natebosch

I can trigger this reliably by running Flutter Web Gallery with webdev alpha and repeatedly editing files. @kevmoo just hit something similar.

@vsmenon

Member Author

commented Apr 16, 2019

This seems to be an issue with how we're saving state? I can paper over this by tossing the lastResult:

diff --git a/pkg/dev_compiler/bin/dartdevc.dart b/pkg/dev_compiler/bin/dartdevc.dart
index fbe71142d1..e8b24d4e69 100755
--- a/pkg/dev_compiler/bin/dartdevc.dart
+++ b/pkg/dev_compiler/bin/dartdevc.dart
@@ -51,11 +51,17 @@ class _CompilerWorker extends AsyncWorkerLoop {
     var args = _startupArgs.merge(request.arguments);
     var output = StringBuffer();
     var context = args.reuseResult ? lastResult : null;
-    lastResult = await runZoned(() => compile(args, previousResult: context),
-        zoneSpecification:
-            ZoneSpecification(print: (self, parent, zone, message) {
-      output.writeln(message.toString());
-    }));
+    lastResult = await runZoned(() => compile(args, previousResult: context));
+
+    if (!lastResult.success) {
+      // Clear the state and try again.
+      print('RETRY');
+      lastResult = await runZoned(() => compile(args, previousResult: null),
+          zoneSpecification:
+              ZoneSpecification(print: (self, parent, zone, message) {
+        output.writeln(message.toString());
+      }));
+    }
     return WorkResponse()
       ..exitCode = lastResult.success ? 0 : 1
       ..output = output.toString();

@vsmenon vsmenon added the p1-high label Apr 16, 2019

@vsmenon vsmenon self-assigned this Apr 16, 2019

@vsmenon

Member Author

commented Apr 16, 2019

Here's a cleaned up version of the above:

https://dart-review.googlesource.com/c/sdk/+/99442

This prevents crashes with webdev - ddk, but we still need to understand the root cause.

@vsmenon

Member Author

commented Apr 16, 2019

We'll land the above, but keep chasing on this.

@vsmenon

Member Author

commented Apr 16, 2019

From @jakemac53 :

Another note here - it appears if we limit to only running a single kernel worker then things are fine. That might help a bit in terms of debugging the issue.

If there are any built-in assumptions in the front end about only one process running at a time, then that would cause problems (does it write any files to disk as it's running?).

@jmesserly

Member

commented Apr 16, 2019

Jake and I made a lot of progress here. Here's what we found so far:

  • ChipDemo is in the same file being edited (chip_demo.dart).
  • The error comes from compiling demos.dart, which depends on chip_demo.dart and is in another module (i.e. has a distinct Kernel summary).
  • ChipDemo does exist in the ClosedWorldClassHierarchy, but it's a different Class object that does not compare ==. Presumably one of them is a stale copy.
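
The identity mismatch in the last bullet can be illustrated with a minimal sketch. `Class` and `ClassHierarchySketch` here are hypothetical stand-ins written for this example, not package:kernel's real types; the only assumption carried over from the findings above is that the hierarchy is keyed by object identity, so a second instance with the same name is not found.

```dart
// Hypothetical stand-in for a kernel class node. It does not override
// operator==, so two instances are equal only when they are identical.
class Class {
  final String name;
  Class(this.name);
}

// Hypothetical hierarchy that stores per-class info keyed by instance.
class ClassHierarchySketch {
  final Map<Class, String> _infoFor = {};

  void add(Class cls) => _infoFor[cls] = 'info for ${cls.name}';

  String infoFor(Class cls) {
    final info = _infoFor[cls];
    if (info == null) throw StateError('No class info for ${cls.name}');
    return info;
  }
}

void main() {
  final inHierarchy = Class('ChipDemo'); // instance the hierarchy was built from
  final seenByInference = Class('ChipDemo'); // same name, different object

  final hierarchy = ClassHierarchySketch()..add(inHierarchy);

  print(hierarchy.infoFor(inHierarchy)); // info for ChipDemo
  try {
    hierarchy.infoFor(seenByInference);
  } on StateError catch (e) {
    print(e.message); // No class info for ChipDemo
  }
}
```

Which of the two instances is the stale one is exactly what the sketch cannot tell you; it only shows why a same-named copy produces the "No class info" failure.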

So we started investigating invalidation logic, and found that CFE has two files, ddc.dart and bazel_worker.dart. They both have a method called initializeIncrementalCompiler but they are different. The one in ddc.dart seems to be missing some logic, such as the Bazel worker digests.

I'm trying to switch DDC over to the bazel_worker method, and see if that fixes the problem.

(Another thing we're not sure about: whether the chip_demo file is getting read from source. It shouldn't be, because it comes from an input Kernel file. But that might explain why we get two copies of the same ChipDemo class.)

@jmesserly

Member

commented Apr 16, 2019

> Another note here - it appears if we limit to only running a single kernel worker then things are fine. That might help a bit in terms of debugging the issue.

A theory about this: when we have 1 worker, the jobs get sent in a consistent order. So that gives the incremental compiler a chance to see all of the changes consistently. With multiple workers, the compile jobs are getting parallelized, so the incremental compiler might "miss" an update. If we have a bug in the invalidation/state update logic, it might manifest only when the compile jobs are run in a particular order on a worker.

@vsmenon vsmenon assigned jmesserly and unassigned vsmenon Apr 16, 2019

@vsmenon

Member Author

commented Apr 16, 2019

The above has landed. It just hides the bug (reasonably well, judging from testing, but I'm still concerned about this).

@jmesserly is chasing down a proper fix.

@jmesserly

Member

commented Apr 17, 2019

Update: tried switching to the bazel_worker.dart initializeIncrementalCompiler (with small changes so DDC can pass in some options) but haven't gotten it working with DDC's backend yet. Still investigating.

Update 2: got it working, but it doesn't fix the problem. I'll keep investigating.

Update 3: investigated the "loadFromDill" list that @jakemac53 was trying to log earlier, and it does look correct: only chip_demo's Kernel file is marked for reload. So it seems some amount of reloading is happening, but we have a stale reference somewhere. My current theory is that the ClassHierarchy update logic (see applyTreeChanges) isn't working, or is being run at the wrong time. Also, I can see now why this only fails in DDC and not the Kernel file builder: the failure is in "KernelTarget.buildComponent" (called from incremental_compiler.dart), which only runs if "outlineOnly" is false. bazel_worker.dart passes "true" for that.

@jmesserly

Member

commented Apr 17, 2019

I have to leave to meet someone, but I'll pick this up later tonight. My best guess currently: it's something related to class hierarchy invalidation. When I added this print:

  @override
  ClassHierarchy applyTreeChanges(Iterable<Library> removedLibraries,
      Iterable<Library> ensureKnownLibraries,
      {Component reissueAmbiguousSupertypesFor}) {
    // Remove all references to the removed classes.
    for (Library lib in removedLibraries) {
      if (!knownLibraries.contains(lib)) {
        print('NOT FOUND ${lib.importUri}');
        continue;
      }
// ...

It thinks "package:flutter_web.examples.gallery/demo/material/chip_demo.dart" is not found. Maybe it's using the wrong Library object (the new instance, rather than the old one)? I wonder if it would find the Library if it looked it up by Uri, rather than by the instance.
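
The difference between looking a library up by instance and by Uri can be sketched as follows. This is only an illustration under stated assumptions: `Library` is a hypothetical stand-in with just an `importUri`, `findByUri` is a helper invented for the example, and nothing here is taken from the real `applyTreeChanges` implementation.

```dart
// Hypothetical stand-in for a kernel library node, compared by identity.
class Library {
  final Uri importUri;
  Library(this.importUri);
}

/// Returns the known library whose import URI matches [lib], or null.
/// (Helper invented for this sketch.)
Library? findByUri(Iterable<Library> known, Library lib) {
  for (final candidate in known) {
    if (candidate.importUri == lib.importUri) return candidate;
  }
  return null;
}

void main() {
  final uri = Uri.parse(
      'package:flutter_web.examples.gallery/demo/material/chip_demo.dart');
  final oldInstance = Library(uri); // what the hierarchy already knows
  final newInstance = Library(uri); // what is passed in as "removed"

  final knownLibraries = <Library>{oldInstance};

  // Instance-based lookup misses, which matches the NOT FOUND print.
  print(knownLibraries.contains(newInstance)); // false

  // URI-based lookup recovers the instance that is actually registered.
  print(identical(findByUri(knownLibraries, newInstance), oldInstance)); // true
}
```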

edit: the problem with this theory is that it doesn't explain why we failed to add the correct new data. I saw it repopulate the class hierarchy with lots of libraries, including chip_demo. But that instance must've been out of date too, because it didn't match the version we later looked up during type inference.

@jmesserly

Member

commented Apr 17, 2019

Update: still haven't tracked down the root cause. I've been able to eliminate several theories, based on experimental fixes that didn't work. Will look at it more tomorrow, unless someone tracks it down in the meantime.

dart-bot pushed a commit that referenced this issue Apr 17, 2019

[kernel_worker] retry on failure
Workaround for #36644

See similar for dartdevc: https://dart-review.googlesource.com/c/sdk/+/99442

Change-Id: Id02549a7bd8110b4691a724fa8930ede477464f8
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/99735
Commit-Queue: Vijay Menon <vsm@google.com>
Reviewed-by: Jake Macdonald <jakemac@google.com>
@jmesserly

Member

commented Apr 18, 2019

Some progress. Applying this patch:

--- a/pkg/front_end/lib/src/fasta/source/source_loader.dart
+++ b/pkg/front_end/lib/src/fasta/source/source_loader.dart
@@ -802,6 +802,12 @@ class SourceLoader extends Loader<Library> {
       for (LibraryDependency dependency in library.dependencies) {
         if (libraries.add(dependency.targetLibrary)) {
           workList.add(dependency.targetLibrary);
+
+          var target = this.read(dependency.targetLibrary.importUri, 0).target;
+          if (target != dependency.targetLibrary) {
+            print('LOOKUP FAIL: ${dependency.targetLibrary.importUri} '
+                'found distinct library ${target.hashCode}, instead of ${dependency.targetLibrary.hashCode}');
+          }
         }
       }

This prints that the lookup failed. This is the cause of the class hierarchy lookup failure, because we computeFullComponent by walking the library dependencies. This gives us a different Library instance than the one we obtained from the dill file.

If we added "target" in the code above to the workList (rather than dependency.targetLibrary) that would most likely fix the ClassHierarchy lookup bug, but it doesn't explain why this data got corrupted in the first place. IncrementalCompiler appears to be designed to avoid this problem, by invalidating any downstream libraries that depend on an invalidated one. So we shouldn't be getting incorrect library dependencies.
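
As a rough sketch of the "add target to the workList" idea: the traversal below resolves every dependency through a URI-keyed table before queueing it, so a stale instance hanging off a dependency edge is never visited. All names (`Library`, `reachable`, the `canonical` map) are hypothetical and stand in for the loader's own lookup; this is not the SourceLoader code, and as noted above it would only mask whatever corrupted the dependency edges.

```dart
// Hypothetical stand-in for a kernel library with outgoing dependencies.
class Library {
  final Uri importUri;
  final List<Library> dependencies = [];
  Library(this.importUri);
}

/// Walks dependencies from [roots], replacing each library with the instance
/// that [canonical] holds for its import URI (if any) before visiting it.
Set<Library> reachable(Iterable<Library> roots, Map<Uri, Library> canonical) {
  final seen = <Library>{};
  final workList = <Library>[];
  for (final root in roots) {
    final lib = canonical[root.importUri] ?? root;
    if (seen.add(lib)) workList.add(lib);
  }
  while (workList.isNotEmpty) {
    final library = workList.removeLast();
    for (final dep in library.dependencies) {
      // Queue the canonical instance rather than the edge's own target.
      final target = canonical[dep.importUri] ?? dep;
      if (seen.add(target)) workList.add(target);
    }
  }
  return seen;
}

void main() {
  final chipUri = Uri.parse('package:example/chip_demo.dart');
  final staleChip = Library(chipUri); // left over on a dependency edge
  final freshChip = Library(chipUri); // instance loaded from the dill file
  final demos = Library(Uri.parse('package:example/demos.dart'))
    ..dependencies.add(staleChip);

  final canonical = {demos.importUri: demos, chipUri: freshChip};
  final libs = reachable([demos], canonical);

  print(libs.contains(freshChip)); // true
  print(libs.contains(staleChip)); // false
}
```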

It looks like the "target" variable is coming from the dill file for chip_demo (which seems correct), but I'm not sure where the other one is coming from, or what library the import is from (presumably demos.dart). What could be going wrong: if we have leftover data from when we compiled "chip_demo" from source, maybe we're finding that rather than the data from the summary.

I have to go now, but I'll keep digging into it.

edit: even on the first successful compile, there are (at least) 6 distinct versions of chip_demo. All of them are reached from an export in "flutter_web.examples.gallery/demo/material/material.dart" (which is itself exported from all.dart, and that is imported from demos.dart, the file that fails to compile). In the initial compile, the "lookup failures" start from things downstream from all.dart (including demos.dart, although it succeeds that time). On the failed compile, the "lookup failures" start from all.dart itself, and there are again at least 6 versions of chip_demo.

dart-bot pushed a commit that referenced this issue Apr 18, 2019

[kernel_worker] retry on failure
Workaround for #36644

See similar for dartdevc: https://dart-review.googlesource.com/c/sdk/+/99442

Change-Id: Id02549a7bd8110b4691a724fa8930ede477464f8
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/99735
Commit-Queue: Vijay Menon <vsm@google.com>
Reviewed-by: Jake Macdonald <jakemac@google.com>

@vsmenon vsmenon modified the milestones: Dart 2.3, Dart 2.4 Apr 22, 2019

@vsmenon vsmenon assigned kmillikin and unassigned jmesserly Apr 23, 2019

@vsmenon

Member Author

commented Apr 23, 2019

@kmillikin - can you or someone on your team please take a look? We've covered this up with bandaids for now, but the underlying issue remains worrisome. A real fix here may be worth cherry-picking into Dart 2.3 / Flutter.

@kmillikin

Member

commented Apr 23, 2019

/cc @jensjoha

@jensjoha

Contributor

commented Apr 24, 2019

@kmillikin

Member

commented Apr 25, 2019

This issue is due to Kernel canonical names. Canonical names are the way that we link Kernel modules together. The original implementation was that we built them just before serializing to a .dill file and we discarded them just after --- Kernel in-memory normally didn't have canonical names.

We've switched to a model where we do have canonical names in memory (with the incremental compiler) and we transfer libraries between components, which entails transferring ownership of canonical name subtrees. So we have to be very careful about ownership of canonical names to ensure that we reuse already-created Kernel nodes when we need to and create new replacement nodes when we need to. CL 100380 should fix this problem.
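
To make the ownership concern concrete, here is a small sketch of a name tree in which each name may be bound to at most one node at a time. `CanonicalNameSketch` and `Node` are hypothetical classes written for this illustration and do not reproduce package:kernel's actual canonical-name implementation or the change in CL 100380; the point is only that replacing a node requires releasing the old binding first, while reusing a node means binding the same object again.

```dart
// Hypothetical stand-in for any Kernel node that a name can refer to.
class Node {
  final String label;
  Node(this.label);
}

// Hypothetical name tree: each name owns its children and at most one node.
class CanonicalNameSketch {
  final String name;
  final Map<String, CanonicalNameSketch> _children = {};
  Node? _boundNode;

  CanonicalNameSketch(this.name);

  /// Returns the child called [name], creating it on first use.
  CanonicalNameSketch child(String name) =>
      _children.putIfAbsent(name, () => CanonicalNameSketch(name));

  Node? get node => _boundNode;

  /// Binds this name to [node]; rebinding to a different node is an error.
  void bindTo(Node node) {
    if (_boundNode != null && !identical(_boundNode, node)) {
      throw StateError('$name is already bound to a different node');
    }
    _boundNode = node;
  }

  /// Releases the current binding so a replacement node can take the name.
  void unbind() => _boundNode = null;
}

void main() {
  final root = CanonicalNameSketch('root');
  final name = root.child('chip_demo.dart').child('ChipDemo');

  final oldClass = Node('ChipDemo (before edit)');
  final newClass = Node('ChipDemo (after edit)');
  name.bindTo(oldClass);

  try {
    name.bindTo(newClass); // two owners for one name: rejected
  } on StateError catch (e) {
    print(e.message); // ChipDemo is already bound to a different node
  }

  name.unbind();
  name.bindTo(newClass); // explicit replacement succeeds
  print(identical(
      root.child('chip_demo.dart').child('ChipDemo').node, newClass)); // true
}
```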

A related problem is that we have to be careful to build transformations so that they work with the incremental compiler. The Flutter widget inspector, for instance, transforms classes by adding a field to them. It needs to ensure that it correctly creates this new field to properly replace a previously-created field for the same class. A full fix for this issue is probably to provide a restricted API for third-party transformations.

@jakemac53 encountered another issue with CL 100380, mentioned in the comments there, that we have not been able to reproduce.

@aadilmaan aadilmaan modified the milestones: Dart 2.3, D24 Release Apr 25, 2019

@dart-bot dart-bot closed this in ed8e425 Apr 26, 2019
