Flutter app freezes and VM service stops responding when adding breakpoints to multiple isolates at the same time #54650

Closed
DanTup opened this issue Jan 17, 2024 · 31 comments
Assignees
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. cherry-pick-candidate Candidates to be cherry-picked P1 A high priority bug; for example, a single project is unusable or has many test failures type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) vm-service The VM Service Protocol, both the specification and its implementation

Comments

@DanTup
Collaborator

DanTup commented Jan 17, 2024

This was raised at Dart-Code/Dart-Code#4926 and seems to be affecting a number of users. I haven't managed to reproduce it yet.

The users can initially add breakpoints fine, but at some point when they try to modify the breakpoints, the Flutter app freezes and stops responding. In the logs, it seems that the VM Service never responds to the calls to addBreakpointWithScriptUri.

Here are the interesting parts of a log that logged the DAP + VM Service traffic from @mellowcello77:

// User modifies breakpoints in VS Code, causes debug adapter to send setBreakpoints request
[18:55:20] [DAP] [Info] ==> {"command":"setBreakpoints","arguments":{"source":{"adapterData":{"type":"@Script","id":"libraries/@168202577/scripts/package%3Aapp%2Ffeatures%2Ffoo%2Fservices%2Ffoo_view_svc.dart/18d1745dbfc","fixedId":true,"uri":"package:app/features/foo/services/foo_view_svc.dart"},"name":"package:app/features/foo/services/foo_view_svc.dart","path":"/Users/username/Code/aaaaaaaaa_app/lib/features/foo/services/foo_view_svc.dart"},"lines":[236,237,238,239],"breakpoints":[{"line":236},{"line":237},{"line":238},{"line":239}],"sourceModified":false},"type":"request","seq":18}

// Debug adapter removes old breakpoints
[18:55:20] [General] [Info] [stage macos (macOS)] ==> [VM] {"jsonrpc":"2.0","id":"77","method":"removeBreakpoint","params":{"isolateId":"isolates/5063336331412151","breakpointId":"breakpoints/4"}}
[18:55:20] [General] [Info] [stage macos (macOS)] ==> [VM] {"jsonrpc":"2.0","id":"78","method":"removeBreakpoint","params":{"isolateId":"isolates/4500830224445031","breakpointId":"breakpoints/4"}}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","method":"streamNotify","params":{"streamId":"Debug","event":{"type":"Event","kind":"BreakpointRemoved","isolateGroup":{"type":"@IsolateGroup","id":"isolateGroups/2879679844578432","name":"main.dart","number":"2879679844578432","isSystemIsolateGroup":false},"isolate":{"type":"@Isolate","id":"isolates/5063336331412151","name":"main","number":"5063336331412151","isSystemIsolate":false,"isolateGroupId":"isolateGroups/2879679844578432"},"timestamp":1705492520797,"breakpoint":{"type":"Breakpoint","fixedId":true,"id":"breakpoints/4","enabled":true,"breakpointNumber":4,"resolved":true,"location":{"type":"SourceLocation","script":{"type":"@Script","fixedId":true,"id":"libraries/@168202577/scripts/package%3Aapp%2Ffeatures%2Ffoo%2Fservices%2Ffoo_view_svc.dart/18d1745dbfc","uri":"package:app/features/foo/services/foo_view_svc.dart"},"tokenPos":7760,"line":236,"column":9}}}}}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","result":{"type":"Success"},"id":"77"}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","method":"streamNotify","params":{"streamId":"Debug","event":{"type":"Event","kind":"BreakpointRemoved","isolateGroup":{"type":"@IsolateGroup","id":"isolateGroups/2879679844578432","name":"main.dart","number":"2879679844578432","isSystemIsolateGroup":false},"isolate":{"type":"@Isolate","id":"isolates/4500830224445031","name":"_compute","number":"4500830224445031","isSystemIsolate":false,"isolateGroupId":"isolateGroups/2879679844578432"},"timestamp":1705492520797,"breakpoint":{"type":"Breakpoint","fixedId":true,"id":"breakpoints/4","enabled":true,"breakpointNumber":4,"resolved":true,"location":{"type":"SourceLocation","script":{"type":"@Script","fixedId":true,"id":"libraries/@168202577/scripts/package%3Aapp%2Ffeatures%2Ffoo%2Fservices%2Ffoo_view_svc.dart/18d1745dbfc","uri":"package:app/features/foo/services/foo_view_svc.dart"},"tokenPos":7760,"line":236,"column":9}}}}}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","result":{"type":"Success"},"id":"78"}
[18:55:20] [General] [Info] [stage macos (macOS)] ==> [VM] {"jsonrpc":"2.0","id":"79","method":"removeBreakpoint","params":{"isolateId":"isolates/5063336331412151","breakpointId":"breakpoints/5"}}
[18:55:20] [General] [Info] [stage macos (macOS)] ==> [VM] {"jsonrpc":"2.0","id":"80","method":"removeBreakpoint","params":{"isolateId":"isolates/4500830224445031","breakpointId":"breakpoints/5"}}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","method":"streamNotify","params":{"streamId":"Debug","event":{"type":"Event","kind":"BreakpointRemoved","isolateGroup":{"type":"@IsolateGroup","id":"isolateGroups/2879679844578432","name":"main.dart","number":"2879679844578432","isSystemIsolateGroup":false},"isolate":{"type":"@Isolate","id":"isolates/5063336331412151","name":"main","number":"5063336331412151","isSystemIsolate":false,"isolateGroupId":"isolateGroups/2879679844578432"},"timestamp":1705492520798,"breakpoint":{"type":"Breakpoint","fixedId":true,"id":"breakpoints/5","enabled":true,"breakpointNumber":5,"resolved":true,"location":{"type":"SourceLocation","script":{"type":"@Script","fixedId":true,"id":"libraries/@168202577/scripts/package%3Aapp%2Ffeatures%2Ffoo%2Fservices%2Ffoo_view_svc.dart/18d1745dbfc","uri":"package:app/features/foo/services/foo_view_svc.dart"},"tokenPos":7867,"line":237,"column":9}}}}}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","result":{"type":"Success"},"id":"79"}
[18:55:20] [General] [Info] [stage macos (macOS)] ==> [VM] {"jsonrpc":"2.0","id":"81","method":"removeBreakpoint","params":{"isolateId":"isolates/5063336331412151","breakpointId":"breakpoints/6"}}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","method":"streamNotify","params":{"streamId":"Debug","event":{"type":"Event","kind":"BreakpointRemoved","isolateGroup":{"type":"@IsolateGroup","id":"isolateGroups/2879679844578432","name":"main.dart","number":"2879679844578432","isSystemIsolateGroup":false},"isolate":{"type":"@Isolate","id":"isolates/4500830224445031","name":"_compute","number":"4500830224445031","isSystemIsolate":false,"isolateGroupId":"isolateGroups/2879679844578432"},"timestamp":1705492520798,"breakpoint":{"type":"Breakpoint","fixedId":true,"id":"breakpoints/5","enabled":true,"breakpointNumber":5,"resolved":true,"location":{"type":"SourceLocation","script":{"type":"@Script","fixedId":true,"id":"libraries/@168202577/scripts/package%3Aapp%2Ffeatures%2Ffoo%2Fservices%2Ffoo_view_svc.dart/18d1745dbfc","uri":"package:app/features/foo/services/foo_view_svc.dart"},"tokenPos":7867,"line":237,"column":9}}}}}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","result":{"type":"Success"},"id":"80"}
[18:55:20] [General] [Info] [stage macos (macOS)] ==> [VM] {"jsonrpc":"2.0","id":"82","method":"removeBreakpoint","params":{"isolateId":"isolates/4500830224445031","breakpointId":"breakpoints/6"}}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","method":"streamNotify","params":{"streamId":"Debug","event":{"type":"Event","kind":"BreakpointRemoved","isolateGroup":{"type":"@IsolateGroup","id":"isolateGroups/2879679844578432","name":"main.dart","number":"2879679844578432","isSystemIsolateGroup":false},"isolate":{"type":"@Isolate","id":"isolates/5063336331412151","name":"main","number":"5063336331412151","isSystemIsolate":false,"isolateGroupId":"isolateGroups/2879679844578432"},"timestamp":1705492520799,"breakpoint":{"type":"Breakpoint","fixedId":true,"id":"breakpoints/6","enabled":true,"breakpointNumber":6,"resolved":true,"location":{"type":"SourceLocation","script":{"type":"@Script","fixedId":true,"id":"libraries/@168202577/scripts/package%3Aapp%2Ffeatures%2Ffoo%2Fservices%2Ffoo_view_svc.dart/18d1745dbfc","uri":"package:app/features/foo/services/foo_view_svc.dart"},"tokenPos":7940,"line":238,"column":9}}}}}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","result":{"type":"Success"},"id":"81"}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","method":"streamNotify","params":{"streamId":"Debug","event":{"type":"Event","kind":"BreakpointRemoved","isolateGroup":{"type":"@IsolateGroup","id":"isolateGroups/2879679844578432","name":"main.dart","number":"2879679844578432","isSystemIsolateGroup":false},"isolate":{"type":"@Isolate","id":"isolates/4500830224445031","name":"_compute","number":"4500830224445031","isSystemIsolate":false,"isolateGroupId":"isolateGroups/2879679844578432"},"timestamp":1705492520799,"breakpoint":{"type":"Breakpoint","fixedId":true,"id":"breakpoints/6","enabled":true,"breakpointNumber":6,"resolved":true,"location":{"type":"SourceLocation","script":{"type":"@Script","fixedId":true,"id":"libraries/@168202577/scripts/package%3Aapp%2Ffeatures%2Ffoo%2Fservices%2Ffoo_view_svc.dart/18d1745dbfc","uri":"package:app/features/foo/services/foo_view_svc.dart"},"tokenPos":7940,"line":238,"column":9}}}}}
[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","result":{"type":"Success"},"id":"82"}

// Debug adapter sends addBreakpointWithScriptUri to add new breakpoints
[18:55:20] [General] [Info] [stage macos (macOS)] ==> [VM] {"jsonrpc":"2.0","id":"83","method":"addBreakpointWithScriptUri","params":{"isolateId":"isolates/5063336331412151","scriptUri":"file:///Users/username/Code/aaaaaaaaa_app/lib/features/foo/services/foo_view_svc.dart","line":236}}
[18:55:20] [General] [Info] [stage macos (macOS)] ==> [VM] {"jsonrpc":"2.0","id":"84","method":"addBreakpointWithScriptUri","params":{"isolateId":"isolates/4500830224445031","scriptUri":"file:///Users/username/Code/aaaaaaaaa_app/lib/features/foo/services/foo_view_svc.dart","line":236}}

// Requests never complete...


// User terminates, which sends terminate to debug adapter
[18:55:27] [DAP] [Info] ==> {"command":"terminate","arguments":{"restart":false},"type":"request","seq":19}

// Debug adapter tells `flutter run` process to shutdown, which it does
[18:55:27] [General] [Info] [stage macos (macOS)] ==> [Flutter] [{"id":1,"method":"app.stop","params":{"appId":"2bb7504f-3d48-432a-90b9-2b030e7b1fa9"}}]
[18:55:27] [DAP] [Info] <== {"seq":8796,"type":"event","body":{"message":"<== [Flutter] [+9344 ms] DevFS: Deleting filesystem on the device (file:///var/folders/8w/11_jl3f52ld0ccqrkl1rt1jh0000gn/T/aaaaaaaaa_appNfDbkW/aaaaaaaaa_app/)\n"},"event":"dart.log"}
[18:55:27] [DAP] [Info] <== {"seq":8797,"type":"event","body":{"category":"stdout","output":"[+9344 ms] DevFS: Deleting filesystem on the device (file:///var/folders/8w/11_jl3f52ld0ccqrkl1rt1jh0000gn/T/aaaaaaaaa_appNfDbkW/aaaaaaaaa_app/)\n"},"event":"output"}
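One way to spot the hanging calls in a log like this is to pair each request id with its response and report whatever is left over. The following is only a minimal sketch: it assumes the log format shown above (`==> [VM]` for requests, `<== [VM]` for responses and events, each followed by one complete JSON object), and `unansweredRequests` and `sampleLog` are hypothetical names, not part of Dart-Code or the SDK.

```dart
import 'dart:convert';

const _requestMarker = '==> [VM] ';
const _responseMarker = '<== [VM] ';

/// Returns a map of request id -> method for every VM Service request in
/// [logLines] that has no matching response.
Map<String, String> unansweredRequests(Iterable<String> logLines) {
  final pending = <String, String>{};
  for (final line in logLines) {
    final isRequest = line.contains(_requestMarker);
    final marker = isRequest ? _requestMarker : _responseMarker;
    final start = line.indexOf(marker);
    if (start == -1) continue; // Not VM Service traffic (for example DAP lines).
    final message = jsonDecode(line.substring(start + marker.length))
        as Map<String, dynamic>;
    final id = message['id'];
    if (id == null) continue; // streamNotify events carry no id.
    if (isRequest) {
      pending['$id'] = '${message['method']}';
    } else {
      pending.remove('$id');
    }
  }
  return pending;
}

// Four lines copied from the log above: one answered request and the two
// addBreakpointWithScriptUri calls that never complete.
const sampleLog = [
  r'[18:55:20] [General] [Info] [stage macos (macOS)] ==> [VM] {"jsonrpc":"2.0","id":"82","method":"removeBreakpoint","params":{"isolateId":"isolates/4500830224445031","breakpointId":"breakpoints/6"}}',
  r'[18:55:20] [General] [Info] [stage macos (macOS)] <== [VM] {"jsonrpc":"2.0","result":{"type":"Success"},"id":"82"}',
  r'[18:55:20] [General] [Info] [stage macos (macOS)] ==> [VM] {"jsonrpc":"2.0","id":"83","method":"addBreakpointWithScriptUri","params":{"isolateId":"isolates/5063336331412151","scriptUri":"file:///Users/username/Code/aaaaaaaaa_app/lib/features/foo/services/foo_view_svc.dart","line":236}}',
  r'[18:55:20] [General] [Info] [stage macos (macOS)] ==> [VM] {"jsonrpc":"2.0","id":"84","method":"addBreakpointWithScriptUri","params":{"isolateId":"isolates/4500830224445031","scriptUri":"file:///Users/username/Code/aaaaaaaaa_app/lib/features/foo/services/foo_view_svc.dart","line":236}}',
];

void main() {
  // Prints the ids and methods that were sent but never answered.
  unansweredRequests(sampleLog).forEach((id, method) {
    print('no response for id $id ($method)');
  });
}
```

For the excerpt above this should leave only ids 83 and 84, both `addBreakpointWithScriptUri`. A truncated or multi-line log entry would make `jsonDecode` throw, so a real script would need to guard against that.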

My guess is that there is some issue triggered by concurrent requests to add breakpoints to each isolate. I don't know if it's a Flutter-specific issue or Dart so I thought I'd start here.
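To make "concurrent requests" concrete, here is a rough sketch of a client putting one `addBreakpointWithScriptUri` request per isolate in flight before any response arrives. It is not the debug adapter's actual code: the `Transport` typedef, `addBreakpointToAll`, and the fake transport in `main` are assumptions for illustration, and only the method name and its `isolateId`, `scriptUri` and `line` parameters come from the log above.

```dart
import 'dart:async';
import 'dart:convert';

/// Hypothetical transport: sends one JSON-RPC request and completes with the
/// decoded response. A real client would write to the VM Service connection.
typedef Transport = Future<Map<String, dynamic>> Function(String request);

int _nextId = 1;

/// Builds a request with the same shape as the ones in the log above.
String addBreakpointRequest(String isolateId, String scriptUri, int line) {
  return jsonEncode({
    'jsonrpc': '2.0',
    'id': '${_nextId++}',
    'method': 'addBreakpointWithScriptUri',
    'params': {'isolateId': isolateId, 'scriptUri': scriptUri, 'line': line},
  });
}

/// Sends the request for every isolate without awaiting earlier responses,
/// so all requests are in flight at the same time.
Future<List<Map<String, dynamic>>> addBreakpointToAll(
  Transport send,
  List<String> isolateIds,
  String scriptUri,
  int line,
) {
  return Future.wait([
    for (final isolateId in isolateIds)
      send(addBreakpointRequest(isolateId, scriptUri, line)),
  ]);
}

Future<void> main() async {
  final sent = <String>[];

  // Fake transport: records the request and replies after a short delay.
  // The result body is a placeholder, not the full VM Service response.
  Future<Map<String, dynamic>> fakeSend(String request) async {
    sent.add(request);
    await Future<void>.delayed(const Duration(milliseconds: 10));
    final id = (jsonDecode(request) as Map<String, dynamic>)['id'];
    return {'jsonrpc': '2.0', 'id': id, 'result': {'type': 'Breakpoint'}};
  }

  final responses = addBreakpointToAll(
    fakeSend,
    ['isolates/5063336331412151', 'isolates/4500830224445031'],
    'file:///Users/username/Code/aaaaaaaaa_app/lib/features/foo/services/foo_view_svc.dart',
    236,
  );
  print('requests in flight before any response: ${sent.length}');
  print('responses received: ${(await responses).length}');
}
```

Awaiting each `send` in turn instead of using `Future.wait` would serialize the requests, which is one way to test whether concurrency matters for a repro.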

I'll try again to see if I can get a repro now I have a slightly better understanding of the issue.

(@bkonyi any ideas here?)

@bkonyi
Contributor

bkonyi commented Jan 17, 2024

Nope, no ideas off the top of my head. We'll need a repro to make any progress on this, unfortunately.

@mellowcello77

@DanTup I'm willing to share the project as it is; it just needs Docker to fully reproduce. But I'll also try to strip as much from it as I can tomorrow and see what happens.

@DanTup
Collaborator Author

DanTup commented Jan 17, 2024

@mellowcello77 can you strip out the bits that require docker? If the freeze just occurs when setting breakpoints (and having isolates), seems like it shouldn't be necessary.

Please also don't share anything sensitive/confidential. The simpler the repo the better.

Can you also confirm your OS, Flutter version and the device you're running on (if a physical device, can you confirm whether the issue also repros on the desktop device or a simulator)? Thanks!

@a-siva a-siva added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) vm-service The VM Service Protocol, both the specification and its implementation P1 A high priority bug; for example, a single project is unusable or has many test failures P2 A bug or feature request we're likely to work on needs-info We need additional information from the issue author (auto-closed after 14 days if no response) labels Jan 17, 2024
@mkustermann
Member

/cc @aam

@DanTup
Collaborator Author

DanTup commented Jan 18, 2024

Thanks to info posted at Dart-Code/Dart-Code#4926 (comment) by @QCIPaulCardno, I think I've managed to get a reliable repro for this. I'm currently running it on an M1 Mac using the macOS desktop device (so far I have been unable to repro on Intel Mac or Windows) with current Flutter stable.

The code spawns some isolates that just count, with the counts shown on the app:

import 'dart:isolate';

import 'package:flutter/material.dart';

void main() {
  runApp(const MainApp());
}

class MainApp extends StatefulWidget {
  const MainApp({super.key});

  @override
  State<MainApp> createState() => _MainAppState();
}

class _MainAppState extends State<MainApp> {
  final _counters = List.filled(10, 0);

  @override
  void initState() {
    super.initState();

    for (var i = 0; i < _counters.length; i++) {
      // BP 1
      var myReceivePort = ReceivePort();

      // BP 2
      Isolate.spawn<SendPort>(_count, myReceivePort.sendPort);

      // BP 3
      myReceivePort.listen((message) {
        setState(() {
          _counters[i] = message;
        });
      });
    }
  }

  static void _count(SendPort mySendPort) async {
    for (var i = 0;; i++) {
      mySendPort.send(i);
      await Future.delayed(const Duration(milliseconds: 100));
    }
  }

  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        body: Column(
          crossAxisAlignment: CrossAxisAlignment.start,
          children: [
            for (var (index, counter) in _counters.indexed)
              Text('Isolate $index: $counter'),
          ],
        ),
      ),
    );
  }
}

When the app is running, I add breakpoints to the line under // BP 1 and then // BP 2. When I add the second breakpoint, the app freezes. Note: neither of the lines I'm adding breakpoints to is executing at that point (both had already completed before I started adding breakpoints).

breakpoint_freeze.mov

Complete DAP + VM log here:

Dart-Code-Log-2024-00-04 11-26-54.txt

@DanTup
Collaborator Author

DanTup commented Jan 18, 2024

I wasn't able to repro with the sample above on Intel Mac or on Windows. I don't know if that means it's specific to ARM Mac though or if there might just be some timing issue or something that differs between them.

@mellowcello77

mellowcello77 commented Jan 18, 2024

@DanTup I just ran your repro code and it does exactly the same on my M2 as in your video. Tested on both iOS and macOS.

@bkonyi
Contributor

bkonyi commented Jan 18, 2024

@aam do you think you can take a look? I don't have access to an M1 device.

@aam
Contributor

aam commented Jan 18, 2024

@aam do you think you can take a look? I don't have access to an M1 device.

I would be surprised if it's Mac or M1 specific. I will try to reproduce it though.
If folks are able to reproduce this reliably, then getting a list of all threads in the Dart VM process would be very helpful.
On Android you can run kill -SIGQUIT [vm_app_pid] and then pull the trace_nn file from /data/anr

@DanTup
Collaborator Author

DanTup commented Jan 18, 2024

It's the macOS Desktop device where I can repro this, no Android involved. It'll probably be faster if you can reproduce it, but otherwise I'm happy to run commands and provide more info if I can (note: I've not used lldb/gdb much so I might need explicit steps).

@a-siva
Contributor

a-siva commented Jan 18, 2024

@DanTup do you think you will be able to reproduce the same thing using DevTools? (I want to take VS Code out of the equation.)

@DanTup
Collaborator Author

DanTup commented Jan 18, 2024

I can certainly give it a go.

@a-siva a-siva removed the P1 A high priority bug; for example, a single project is unusable or has many test failures label Jan 18, 2024
@aam
Contributor

aam commented Jan 18, 2024

@DanTup wrote

It's the macOS Desktop device where I can repro this

Ah, then can you try lldb -p <pid>, where <pid> is the process ID of the Flutter app, and then capture the output of bt all?

@DanTup
Collaborator Author

DanTup commented Jan 18, 2024

Yep, I can repro with DevTools (with the app run from the terminal, no VS Code running). I had to toggle the second breakpoint a few times (I did not do this particularly quickly):

devtools_repro.mov

@DanTup
Collaborator Author

DanTup commented Jan 18, 2024

@aam does this look right?

danny@Dannys-MacBook-Air isolate_breakpoint_freeze_repro % lldb -p 1051 bt all
(lldb) process attach --pid 1051
Process 1051 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00000001852f7f14 libsystem_kernel.dylib`mach_msg2_trap + 8
libsystem_kernel.dylib`mach_msg2_trap:
->  0x1852f7f14 <+8>: ret    

libsystem_kernel.dylib`macx_swapon:
    0x1852f7f18 <+0>: mov    x16, #-0x30
    0x1852f7f1c <+4>: svc    #0x80
    0x1852f7f20 <+8>: ret    
Target 0: (isolate_breakpoint_freeze_repro) stopped.
Executable module set to "/Users/danny/Dev/TestStuff/isolate_breakpoint_freeze_repro/build/macos/Build/Products/Debug/isolate_breakpoint_freeze_repro.app/Contents/MacOS/isolate_breakpoint_freeze_repro".
Architecture set to: arm64-apple-macosx-.
(lldb) 

@aam
Contributor

aam commented Jan 18, 2024

@DanTup thanks, but I expect to see more threads (named DartWorker and such) in the Flutter Dart VM process. You should be able to see them if you run thread list in lldb, I hope.

@DanTup
Collaborator Author

DanTup commented Jan 18, 2024

Ah, seems like bt all includes all threads if I run it after attaching, but if I added it to the command it did not.

Here's thread list:

(lldb) thread list
Process 1011 stopped
* thread #1: tid = 0x1d6c, 0x000000018f677f14 libsystem_kernel.dylib`mach_msg2_trap + 8, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  thread #2: tid = 0x1d84, 0x000000018f679bc8 libsystem_kernel.dylib`__workq_kernreturn + 8
  thread #3: tid = 0x1d89, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'io.flutter.ui'
  thread #4: tid = 0x1d8a, 0x000000018f677f14 libsystem_kernel.dylib`mach_msg2_trap + 8, name = 'io.flutter.raster'
  thread #5: tid = 0x1d8b, 0x000000018f677f14 libsystem_kernel.dylib`mach_msg2_trap + 8, name = 'io.flutter.io'
  thread #6: tid = 0x1d8c, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'io.worker.1'
  thread #7: tid = 0x1d8d, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'io.worker.2'
  thread #8: tid = 0x1d8e, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'io.worker.3'
  thread #9: tid = 0x1d8f, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'io.worker.4'
  thread #10: tid = 0x1d90, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'io.worker.5'
  thread #11: tid = 0x1d91, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'io.worker.6'
  thread #12: tid = 0x1d92, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'io.worker.7'
  thread #13: tid = 0x1d93, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'io.worker.8'
  thread #14: tid = 0x1d94, 0x000000018f67e060 libsystem_kernel.dylib`kevent + 8, name = 'dart:io EventHandler'
  thread #15: tid = 0x1d95, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'Dart Profiler ThreadInterrupter'
  thread #16: tid = 0x1d96, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'Dart Profiler SampleBlockProcessor'
  thread #17: tid = 0x1d9b, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'DartWorker'
  thread #18: tid = 0x1d9c, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'DartWorker'
  thread #19: tid = 0x1d9e, 0x000000018f677f14 libsystem_kernel.dylib`mach_msg2_trap + 8, name = 'com.apple.NSEventThread'
  thread #20: tid = 0x1dc0, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'DartWorker'
  thread #21: tid = 0x1dc1, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'DartWorker'
  thread #22: tid = 0x1dc4, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'DartWorker'
  thread #23: tid = 0x1dc5, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'DartWorker'
  thread #24: tid = 0x1dc6, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'DartWorker'
  thread #25: tid = 0x1dc7, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'DartWorker'
  thread #26: tid = 0x1dc8, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'DartWorker'
  thread #27: tid = 0x1dc9, 0x000000018f67b710 libsystem_kernel.dylib`__psynch_cvwait + 8, name = 'DartWorker'

And the full output of bt all is here:

https://gist.github.com/DanTup/cea9c51d1ad75472d01767c8c0c88528
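As noted above, appending `bt all` directly to the `lldb -p <pid>` command line did not dump every thread, while running it after attaching did. A possible way to script the capture is to pass the command through lldb's `--one-line` (`-o`) option together with `--batch`, as in this sketch. The helper name `lldbBacktraceArgs` is hypothetical, and this exact invocation is an assumption that was not run in this thread.

```dart
import 'dart:io';

/// Builds lldb arguments that attach to [pid], run `bt all` once attached,
/// and then exit because of `--batch`.
List<String> lldbBacktraceArgs(int pid) => [
      '-p', '$pid', // Attach to the running Flutter app process.
      '--batch', // Quit after the one-line commands have run.
      '-o', 'bt all', // Backtrace of every thread, not only the selected one.
    ];

Future<void> main(List<String> args) async {
  if (args.length != 1 || int.tryParse(args.single) == null) {
    stderr.writeln('usage: dart run bt_all.dart <pid>');
    exitCode = 64;
    return;
  }
  final result =
      await Process.run('lldb', lldbBacktraceArgs(int.parse(args.single)));
  stdout.write(result.stdout);
  stderr.write(result.stderr);
}
```

Redirecting the output to a file gives something that can be attached to an issue in the same way as the gist above.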

@aam aam removed the needs-info We need additional information from the issue author (auto-closed after 14 days if no response) label Jan 18, 2024
@aam
Contributor

aam commented Jan 18, 2024

The deadlock is caused by thread #3 (RemoveBreakpoint) acquiring the WriteRwLocker for breakpoint_locations_lock and then requesting a safepoint, while thread #17 is waiting in HasBreakpoint to acquire a ReadRwLocker for the same breakpoint_locations_lock (which #3 holds), so it never yields to the safepoint requested by #3.
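The circular wait described above can be pictured as a cycle in a wait-for graph. The sketch below is only a model of that reasoning, not VM code: `findCycle` is a hypothetical helper, and the graph reduces "requesting a safepoint" to "waiting for thread #17", whereas a real safepoint operation waits on every mutator thread.

```dart
/// Finds a cycle in a wait-for graph, where `waitsFor[a]` lists the threads
/// that `a` cannot make progress without. Returns the cycle as a list of
/// thread names (first name repeated at the end), or null if there is none.
List<String>? findCycle(Map<String, List<String>> waitsFor) {
  final path = <String>[]; // Threads on the current depth-first path.
  final seen = <String>{}; // Threads that have been visited at least once.

  List<String>? visit(String node) {
    final index = path.indexOf(node);
    if (index != -1) return [...path.sublist(index), node];
    if (!seen.add(node)) return null; // Fully explored earlier, no cycle.
    path.add(node);
    for (final next in waitsFor[node] ?? const <String>[]) {
      final cycle = visit(next);
      if (cycle != null) return cycle;
    }
    path.removeLast();
    return null;
  }

  for (final node in waitsFor.keys) {
    final cycle = visit(node);
    if (cycle != null) return cycle;
  }
  return null;
}

void main() {
  final waitsFor = {
    // #3 holds the write lock and waits for #17 to reach a safepoint.
    'thread #3 (RemoveBreakpoint)': ['thread #17 (HasBreakpoint)'],
    // #17 is blocked on the read lock that #3 holds, so it never checks in.
    'thread #17 (HasBreakpoint)': ['thread #3 (RemoveBreakpoint)'],
  };
  print(findCycle(waitsFor)?.join(' -> ') ?? 'no deadlock');
}
```

Removing either edge, for example by letting a thread that is blocked on the lock count as being at a safepoint, or by not holding the lock while requesting the safepoint, breaks the cycle in this model. Which approach the actual fix takes is not stated in this thread.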

@DanTup
Collaborator Author

DanTup commented Jan 18, 2024

A user reported this issue in the vs-code channel on the Flutter discord and believes they don't have multiple isolates, so it might be that multiple isolates is not required to trigger this (but perhaps it makes it more likely, as we'll be sending more add/remove breakpoint requests).

@aam
Contributor

aam commented Jan 18, 2024

I'm not sure why this was not encountered before; people only started to report it recently.

@DanTup
Collaborator Author

DanTup commented Jan 18, 2024

Yeah, I wondered that too. At first I thought maybe it was due to more people using the new SDK debug adapters in the latest release (up from 50% to 100%) and the behaviour being slightly different there. However, reports are that it occurs with the legacy adapters too, and the poster on Discord suggested it was happening in both Android Studio and VS Code.

So far all the reports I've seen (and the only place I can repro) are on ARM Macs. I don't know whether that's coincidence or whether something makes it more likely to trigger there.

@jacob314
Member

jacob314 commented Jan 19, 2024

@bkonyi bkonyi added P1 A high priority bug; for example, a single project is unusable or has many test failures and removed P2 A bug or feature request we're likely to work on labels Jan 19, 2024
@aam aam self-assigned this Jan 19, 2024
@mellowcello77

mellowcello77 commented Jan 22, 2024

Is there anything we can do in the meantime, maybe a downgrade? I can't touch breakpoints in my projects, which is, of course, making it super hard to debug this way. Thanks again for looking into this.

@bkonyi
Contributor

bkonyi commented Jan 22, 2024

Is there anything we can do in the meantime, maybe a downgrade? I can't touch breakpoints in my projects, which is, of course, making it super hard to debug this way. Thanks again for looking into this.

Unfortunately not. This bug has seemingly been present for a while, so it's not clear why developers are suddenly encountering it so frequently. We do have a fix ready and we should see if we can get it into a hotfix release.

@a-siva
Contributor

a-siva commented Jan 22, 2024

Is this a cherry pick candidate ?

@bkonyi
Contributor

bkonyi commented Jan 22, 2024

Is this a cherry pick candidate ?

I think so, it seems like it's impacting quite a few users.

@a-siva a-siva added the cherry-pick-candidate Candidates to be cherry-picked label Jan 22, 2024
@jacob314
Member

I think this should be cherry-picked, given that 6 people have now given a thumbs up to the VS Code issue and given how severe the issue is when it occurs.

@aam
Contributor

aam commented Jan 23, 2024

The fix for this landed in flutter flutter/flutter@676e322, @DanTup would you be able to give it a try?

@a-siva
Contributor

a-siva commented Jan 23, 2024

A cherry pick request for this has been filed here #54699

copybara-service bot pushed a commit that referenced this issue Jan 23, 2024
…ions locks and Ensure setting breakpoints is lock-safe.

TEST=DartAPI_BreakpointLockRace and DeoptimizeFramesWhenSettingBreakpoint

Fixes
#54650
flutter/flutter#140878

Acquire reload operation scope when deoptimizing the world to ensure locks can be acquired for compilation.
Set up scope for operations that can be run while the world is deoptimized and stopped to avoid races.
Ensure code stays unoptimized when single stepping, prevent other isolates from reoptimizing it.

Bug: #54650 and flutter/flutter#140878
Change-Id: I9a88096f15a34b645281e5b2b3805a73dd93672e
Cherry-pick: https://dart-review.googlesource.com/c/sdk/+/347420 and https://dart-review.googlesource.com/c/sdk/+/345743
Cherry-pick-request: #54699
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/347650
Reviewed-by: Alexander Aprelev <aam@google.com>
Commit-Queue: Kevin Chisholm <kevinjchisholm@google.com>
@DanTup
Collaborator Author

DanTup commented Jan 23, 2024

The fix for this landed in flutter flutter/flutter@676e322, @DanTup would you be able to give it a try?

I just tested with the change before that and confirmed I could still reproduce the issue (although it did require toggling the breakpoint a few times).

Then I tried on latest and I was not able to reproduce the issue even with a significant amount of very quick breakpoint toggling. As far as I can tell, the issue is definitely fixed. Thanks! :-)

copybara-service bot pushed a commit that referenced this issue Jan 23, 2024
…ns locks and Ensure setting breakpoints is lock-safe.

Acquire reload operation scope when deoptimizing the world to ensure locks can be acquired for compilation.
Set up scope for operations that can be run while the world is deoptimized and stopped to avoid races.
Ensure code stays unoptimized when single stepping, prevent other isolates from reoptimizing it.

Fixes:
#54650
flutter/flutter#140878

TEST=DartAPI_BreakpointLockRace and DeoptimizeFramesWhenSettingBreakpoint

Bug: #54650 and flutter/flutter#140878
Cherry-pick: https://dart-review.googlesource.com/c/sdk/+/347420 and https://dart-review.googlesource.com/c/sdk/+/345743
Cherry-pick-request: #54699
Change-Id: Ia4bc883121dac978fbb76027906a810000ef1138
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/347760
Reviewed-by: Alexander Aprelev <aam@google.com>
Commit-Queue: Siva Annamalai <asiva@google.com>
@DanTup
Collaborator Author

DanTup commented Jan 25, 2024

I just verified the issue is no longer reproducible for me using the new Flutter stable (3.16.9). If you were hitting this, please run flutter upgrade and try again.


7 participants