having a potential deadlock in code #1

zylthinking · 2021-08-25T04:36:47Z

That can be produced when main drop reactor early than in threads. e.g

reactor.lock().map(|mut r: MutexGuard<Box<Reactor>>| r.wake(id)).unwrap(); 
thread::sleep(Duration::from_secs(6)); 
drop(reactor);

and

block_on(mainfut);
drop(reactor);
thread::sleep(Duration::from_secs(12));

The text was updated successfully, but these errors were encountered:

cfsamson · 2021-08-25T22:19:17Z

As long as there are tasks queued, dopping the reactor should only decrease the refcount since it's wrapped in an Arc and each task needs a copy of the reactor to work. I'm not sure, I understand the deadlock scenario fully here. Could you provide an example I can test using the final example, for example using the playground (although I guess it will time out with the sleep intervals in your code above I can copy it and run in locally): https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=91a5cdf856530eecfb46345a36fb5bbb

zylthinking · 2021-08-27T15:19:32Z

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=cf33e19fda62de240ff9cf24fe50c6be ***@***.***/0?redirect=https%3A%2F%2Fplay.rust-lang.org%2F%3Fversion%3Dstable%26mode%3Ddebug%26edition%3D2018%26gist%3Dcf33e19fda62de240ff9cf24fe50c6be&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D) I add some logs to show what happened

…

On Aug 26 2021, at 6:19 am, Carl Fredrik Samson ***@***.***> wrote: As long as there are tasks queued, dopping the reactor should only decrease the refcount since it's wrapped in an Arc and each task needs a copy of the reactor to work. I'm not sure, I understand the deadlock scenario fully here. Could you provide an example I can test using the final example, for example using the playground (although I guess it will time out with the sleep intervals in your code above I can copy it and run in locally): https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=91a5cdf856530eecfb46345a36fb5bbb ***@***.***/1?redirect=https%3A%2F%2Fplay.rust-lang.org%2F%3Fversion%3Dstable%26mode%3Ddebug%26edition%3D2018%26gist%3D91a5cdf856530eecfb46345a36fb5bbb&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D) — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub ***@***.***/2?redirect=https%3A%2F%2Fgithub.com%2Fcfsamson%2Fexamples-futures%2Fissues%2F1%23issuecomment-905914655&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D), or unsubscribe ***@***.***/3?redirect=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FAAIEPMZ5TBITCTARDT52QWLT6VT67ANCNFSM5CYGKCXA&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D). Triage notifications on the go with GitHub Mobile for iOS ***@***.***/4?redirect=https%3A%2F%2Fapps.apple.com%2Fapp%2Fapple-store%2Fid1477376905%3Fct%3Dnotification-email%26mt%3D8%26pt%3D524675&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D) or Android ***@***.***/5?redirect=https%3A%2F%2Fplay.google.com%2Fstore%2Fapps%2Fdetails%3Fid%3Dcom.github.android%26utm_campaign%3Dnotification-email&recipient=cmVwbHkrQUFJRVBNNE9ORFNBQktQTFUyUVpDTlY3R0tSTzdFVkJOSEhEVVZPTjY0QHJlcGx5LmdpdGh1Yi5jb20%3D).

cfsamson · 2021-08-29T13:01:03Z

Thanks. Good catch!

I think the major issue is that I use drop to shut down the reactor, but can't guarantee that the shutdown signal isn't sent from one of the child threads (as you showed when the refcount reaches 1 in one of the child threads instead of the main thread as intended).

The issue is probably a bit contrived since I fire off a seperate thread to handle each event for this example as we don't have any real I/O and use sleeping threads to simulate the waiting. In a real Reactor you wouldn't fire off threads that way and rely on sleep for wakeup. One solution is to treat the timer threads as fire-and-forget since they're only created to be able to simulate waiting for an external event. We don't have to join them.

If we want to keep with best practices and join all spawned threads (even the ones that we create only for simulation). I think the cleanest solution is to not rely on Drop for cleanly shutting down the Reactor but instead provide an instance method for it so if we remove the Drop impl and instead add an instance method called shutdown on Reactor:

    fn shutdown(&mut self) {
        println!("really drop, sending quit");
        self.dispatcher.send(Event::Close).unwrap();
        println!("waiting for executor thread done");
        self.handle.take().map(|h| h.join().unwrap()).unwrap();
        println!("executor thread ended");
    }

And in "main" we need to call this instead of relying on drop to shutdown the reactor:

    println!("drop 1");
    reactor.lock().map(|mut r| r.shutdown()).unwrap();
    println!("just wait long to observe 2 threads will blocked forever");
    thread::sleep(Duration::from_secs(1000));

Playground link: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=67550c19e85246827cdb8422b63ceb44

What do you think? Do you think it's better to rewrite Drop or just treat the timer threads as fire-and-forget by not joining them?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

having a potential deadlock in code #1

having a potential deadlock in code #1

zylthinking commented Aug 25, 2021

cfsamson commented Aug 25, 2021

zylthinking commented Aug 27, 2021 via email

cfsamson commented Aug 29, 2021 •

edited

Loading

having a potential deadlock in code #1

having a potential deadlock in code #1

Comments

zylthinking commented Aug 25, 2021

cfsamson commented Aug 25, 2021

zylthinking commented Aug 27, 2021 via email

cfsamson commented Aug 29, 2021 • edited Loading

cfsamson commented Aug 29, 2021 •

edited

Loading