Skip to content

[flutter_tools] RPCError: removeBreakpoint: (-32602) invalid breakpoint id #111045

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
christopherfujino opened this issue Sep 6, 2022 · 19 comments
Labels
c: crash Stack traces logged to the console P1 High-priority issues at the top of the work list tool Affects the "flutter" command-line tool. See also t: labels.

Comments

@christopherfujino
Copy link
Contributor

#3 crasher on stable/3.3.0
with command: flutter debug_adapter

RPCError: removeBreakpoint: (-32602) invalid breakpoint id bp/5532#98:0
at new _OutstandingRequest(vm_service.dart:1746)
at VmService._call(vm_service.dart:2262)
at VmService.removeBreakpoint(vm_service.dart:2127)
at IsolateManager._sendBreakpoints.<anonymous closure>(isolate_manager.dart:633)
at Future.forEach.<anonymous closure>(future.dart:646)
at Future.doWhile.<anonymous closure>(future.dart:703)
at _rootRunUnary(zone.dart:1399)
at _CustomZone.runUnary(zone.dart:1300)
at _CustomZone.runUnaryGuarded(zone.dart:1209)
at _CustomZone.bindUnaryCallbackGuarded.<anonymous closure>(zone.dart:1246)
at _rootRunUnary(zone.dart:1399)
at <asynchronous gap>(async)
at _CustomZone.bindUnaryCallbackGuarded.<anonymous closure>(zone.dart:1246)
at <asynchronous gap>(async)
@christopherfujino christopherfujino added c: crash Stack traces logged to the console tool Affects the "flutter" command-line tool. See also t: labels. P1 High-priority issues at the top of the work list labels Sep 6, 2022
@christopherfujino christopherfujino moved this to Top Stable (3.3) Crashers in go/flutter-tools-project (deprecated) Sep 6, 2022
@christopherfujino
Copy link
Contributor Author

christopherfujino commented Sep 6, 2022

Stable 3.3.0 uses dds version 2.2.5:

https://github.com/flutter/flutter/blob/stable/packages/flutter_tools/pubspec.yaml#L13

Source: https://github.com/dart-lang/sdk/tree/4ac76c7f35f9cafb05fd09a67db3f363b9737ff0/pkg/dds

Looks like the call at IsolateManager._sendBreakpoints.<anonymous closure>(isolate_manager.dart:633) is from https://github.com/dart-lang/sdk/blob/4ac76c7f35f9cafb05fd09a67db3f363b9737ff0/pkg/dds/lib/src/dap/isolate_manager.dart#L633

      // Before doing async work, take a copy of the breakpoints to remove
      // and remove them from the list, so any subsequent calls here don't
      // try to remove the same ones.
      final breakpointsToRemove = existingBreakpointsForIsolateAndUri.toList();
      existingBreakpointsForIsolateAndUri.clear();
      await Future.forEach<vm.Breakpoint>(breakpointsToRemove,
          (bp) => service.removeBreakpoint(isolateId, bp.id!));

@DanTup do you think this is a race condition?

@DanTup
Copy link
Contributor

DanTup commented Sep 7, 2022

@christopherfujino hmm, unsure. It looks like the code was written to avoid races and I can't see any obvious path. How often is this occurring? Is it across all platforms? I'll see if I can trigger it. If you see anyone report that they can repro this, I'd be very interested in a log file (captured with the Capture Debugging Logs command).

I enabled SDK DAPs for ~10% of users in the last Dart-Code release. If this is happening lots we can push an update to Dart-Code to revert that until we have a fix.

@DanTup
Copy link
Contributor

DanTup commented Sep 7, 2022

invalid breakpoint id bp/5532#98:0

Also - do you know if all instances of this have IDs that start bp/? It looks like VM breakpoints starts with "breakpoint" and when using Chrome/DWDS it's bp/, so I wonder if it's web-specific (although I haven't been able to repro there either yet).

@DanTup
Copy link
Contributor

DanTup commented Sep 7, 2022

I think I've managed to trigger this running in iOS Simulator (so not web), however it doesn't crash it seems to be handled. What I did was add two breakpoints that resolve to the same line (a line of code, and the blank line before it).

The VM returns the same breakpoint ID from the second request to add a breakpoint (although to complicate things, it seems like this only happens if the code has been run previously and compiled - it seems to return a new breakpoint if not), and we blindly keep track of it as a second breakpoint. When we come to remove the breakpoints, we then try to remove it twice (but after the first removal, the ID is not longer valid).

The fix for that is fairly simple, but I can't repro via web (a breakpoint on the blank line is entirely rejected) so I'll continue to see if I can understand how it happened above before fixing.

For reference, my steps to reproduce (which triggers an error if recording logs, but does not crash for me):

  • Create counter app
  • Add a blank line above _counter++
  • Run Capture Debugging Logs to start logging
  • Run the app on iOS Simulator
  • Click the button so that the counter increment runs
  • Add a breakpoint to the blank line
  • Add a breakpoint to _counter++
  • Remove the first breakpoint
  • Remove the second breakpoint
  • Click Cancel to open the log file

In the log file, find the last instance of removeBreakpoint and there's an error recorded where it failed to remove (because it was already previously removed). This error is silent to the user though (and does not crash the tool).

@DanTup
Copy link
Contributor

DanTup commented Sep 7, 2022

Based on this code it looks like breakpoint IDs for web are always scriptid/line/col (the example above was bp/5532#98:0). So adding a breakpoint on a different line that happens to resolve to the same location doesn't seem like it will work for web.

Other thoughts:

  • We somehow had two isolates live when this happened, so we tried to remove the same breakpoint twice (web only has one isolate, but I think during a restart it may exit and a new one be produced, so maybe something happened there)
  • We had two files open in VS Code that somehow represented the same script ID, so we had the ID reused that way

(None of this explains why it was a crash in that instance, but perhaps when this occurred the original call had a different call stack to my repro, one that doesn't handle errors?)

@DanTup
Copy link
Contributor

DanTup commented Sep 7, 2022

(@christopherfujino sorry for the noisy notifications 🙃)

I'm still unable to reproduce this, but I am now convinced there are some races and we just need error handling around this:

  1. It's not the issue for web, but removing a breakpoint for an isolate that has gone will fail. This means that whatever we do in the debug adapter, there is always a change that right after we start sending a removeBreakpoint call to the VM, the isolate exited. This means we can never assume it will always pass (and therefore, we need to handle errors when calling removeBreakpoint, exactly like we already do with the subsequent addBreakpoint calls)
  2. When a breakpoint exits, even if we get the event before we try to update any breakpoints, there are awaits in the processing of the exit event (for ex. we need to ensure any initialisation of the isolate completes before we try to clean it up), so there's always an opportunity for another request from the client that modifies breakpoints to try and get in there.

I'm not certain, but I suspect the issue above is triggered by the second. Maybe some timing of a restart (which pretends to terminate and re-create an isolate on web) and breakpoint update overlapped and triggered this.

I'll get a fix into DDS. If this issue is occurring frequently and you think we should roll back the 10% of users on the new DAPs until this fix is shipped, let me know.

Edit: CL at https://dart-review.googlesource.com/c/sdk/+/257962/.

DanTup added a commit to Dart-Code/Dart-Code that referenced this issue Sep 7, 2022
@christopherfujino
Copy link
Contributor Author

(@christopherfujino sorry for the noisy notifications upside_down_face)

I'm still unable to reproduce this, but I am now convinced there are some races and we just need error handling around this:

  1. It's not the issue for web, but removing a breakpoint for an isolate that has gone will fail. This means that whatever we do in the debug adapter, there is always a change that right after we start sending a removeBreakpoint call to the VM, the isolate exited. This means we can never assume it will always pass (and therefore, we need to handle errors when calling removeBreakpoint, exactly like we already do with the subsequent addBreakpoint calls)
  2. When a breakpoint exits, even if we get the event before we try to update any breakpoints, there are awaits in the processing of the exit event (for ex. we need to ensure any initialisation of the isolate completes before we try to clean it up), so there's always an opportunity for another request from the client that modifies breakpoints to try and get in there.

I'm not certain, but I suspect the issue above is triggered by the second. Maybe some timing of a restart (which pretends to terminate and re-create an isolate on web) and breakpoint update overlapped and triggered this.

I'll get a fix into DDS. If this issue is occurring frequently and you think we should roll back the 10% of users on the new DAPs until this fix is shipped, let me know.

Edit: CL at https://dart-review.googlesource.com/c/sdk/+/257962/.

Thanks for the quick fix!

@DanTup
Copy link
Contributor

DanTup commented Sep 8, 2022

I think this was closed accidentally by the SDK change being pushed to a repo. Re-opening until the DDS change is in Flutter.

@DanTup DanTup reopened this Sep 8, 2022
@christopherfujino
Copy link
Contributor Author

I think this was closed accidentally by the SDK change being pushed to a repo. Re-opening until the DDS change is in Flutter.

Thanks

@itsjustkevin
Copy link
Contributor

Interesting @DanTup I did not close this manually, I wonder if this was due to the recent roll.

@DanTup
Copy link
Contributor

DanTup commented Sep 8, 2022

Yeah, it was GitHub. I think you pushed to your sdk fork on GitHub (https://github.com/itsjustkevin/sdk) which included my commit that has the fix in DDS (https://github.com/itsjustkevin/sdk/commit/94a64a01f630e4a96b6b3cf3ec3c9a6b3f5f3445). GitHub saw my commit message (which contains "fixes ") and since you have access to close this issue, assumed a fix had been merged (you can close issues in Repo A from Repo B.. it doesn't seem to matter whether Repo B is just a fork or not).

I don't know why it didn't happen when the SDK is synced up to the official repo (the same commit already existed at dart-lang/sdk@94a64a0 yesterday), but I should be more careful with my commit messages (I was only thinking about Gerrit when I pushed it 😄).

(I've accidentally closed several of issues this way - and sometimes I've only noticed because people pinged me afterwards 😬)

@christopherfujino
Copy link
Contributor Author

Interesting @DanTup I did not close this manually, I wonder if this was due to the recent roll.

I think it's because the CL noted "Fixes #111045", and maybe your fork picked up this commit before copybara to dart-lang/sdk?

@itsjustkevin
Copy link
Contributor

itsjustkevin commented Sep 8, 2022

cc @iinozemtsev

@christopherfujino
Copy link
Contributor Author

Update on this issue:

  1. @DanTup published a new version of dart-code that stops the experiment using flutter debug-adapter, so only users explicitly using it from the command line should be affected. This will be re-enabled once the upstream dds fix is rolled to Flutter stable.
  2. The upstream DDS fix was landed in dart-lang/sdk@94a64a0

This issue can be closed once:

  1. a new version of DDS has been published
  2. this version has been autorolled to the framework.

Regarding publishing a new version of DDS, @bkonyi could you do that?

@DanTup
Copy link
Contributor

DanTup commented Sep 14, 2022

so only users explicitly using it from the command line should be affected

To be clear, I wouldn't expect any users to be explicitly running this command from the command line - though perhaps you mean users that have configured other (non-VS Code) clients to spawn the debug adapter using the flutter debug_adapter command.

This will be re-enabled once the upstream dds fix is rolled to Flutter stable.

My plan is to re-enable it based on the version of the SDK once it rolls into Flutter, so that master/beta users will have it re-enabled (if they have the updated DDS) sooner than a stable release, just to help get additional testing/validation before then.

@christopherfujino out of interest, has there been a significant reduction in the number of crash reports since the original influx?

@bkonyi
Copy link
Contributor

bkonyi commented Sep 15, 2022

@christopherfujino I'll publish a new DDS today once this CL lands.

@bkonyi
Copy link
Contributor

bkonyi commented Sep 15, 2022

DDS v2.3.0 has been published.

@christopherfujino
Copy link
Contributor Author

@DanTup I don't see a single report on the latest 3.3.2 stable!

DDS v2.3.0 has been published.

I will now close this issue as the upstream patch has been rolled to the framework. Thanks all!

@github-actions
Copy link

github-actions bot commented Oct 4, 2022

This thread has been automatically locked since there has not been any recent activity after it was closed. If you are still experiencing a similar issue, please open a new bug, including the output of flutter doctor -v and a minimal reproduction of the issue.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 4, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
c: crash Stack traces logged to the console P1 High-priority issues at the top of the work list tool Affects the "flutter" command-line tool. See also t: labels.
Projects
Development

No branches or pull requests

4 participants