Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock when sending circular message #13

Closed
stoically opened this issue Jun 9, 2020 · 13 comments · Fixed by #15
Closed

Deadlock when sending circular message #13

stoically opened this issue Jun 9, 2020 · 13 comments · Fixed by #15
Assignees
Labels
limitation A limitation in xtra
Milestone

Comments

@stoically
Copy link
Contributor

stoically commented Jun 9, 2020

When sending async messages like A -[send]-> B -[send]-> A, xtra will be stuck in a deadlock. After researching a bit it seems that this is expected behavior. There was notify mentioned to mitigate that, but that doesn't seem to work in my case. If it's possible a pointer/example would be helpful.

It'd also nice if it'd be somehow possible to print a warning or something when this happens, because keeping track of who sends messages to whom in an actor system can get quite hard. Going by that reddit thread I can see that it's not an easy problem to solve.

Kudos again for xtra as it is!

@Restioson
Copy link
Owner

Restioson commented Jun 9, 2020

Can you send the code that reproduces the problem and one with notify?

For me, the following example prints an endless stream of Got A / Got B:

use xtra::prelude::*;

struct MyActor;

impl Actor for MyActor {}

#[derive(Debug)]
enum CircularMessage {
    A,
    B,
}

impl CircularMessage {
    fn swap(self) -> CircularMessage {
        match self {
            CircularMessage::A => CircularMessage::B,
            CircularMessage::B => CircularMessage::A,
        }
    }
}

impl Message for CircularMessage {
    type Result = ();
}

impl SyncHandler<CircularMessage> for MyActor {
    fn handle(&mut self, m: CircularMessage, ctx: &mut Context<Self>) {
        println!("Got {:?}", m);
        ctx.notify_later(m.swap())
    }
}

#[tokio::main]
async fn main() {
    let addr = MyActor.spawn();
    addr.do_send(CircularMessage::A).unwrap();
    loop {} // prevent exit
}

For interest's sake, it does not seem to leak memory, either. The program's memory stays pinned at around ~160kb.

@stoically
Copy link
Contributor Author

stoically commented Jun 9, 2020

Sure, that's what I was trying to do:

use async_trait::async_trait;
use xtra::prelude::*;

struct Initialized(Address<ActorA>);
impl Message for Initialized {
    type Result = ();
}

struct Hello;
impl Message for Hello {
    type Result = ();
}

struct ActorA {
    actor_b: Address<ActorB>,
}
impl Actor for ActorA {}

#[async_trait]
impl Handler<Hello> for ActorA {
    async fn handle(&mut self, _: Hello, _: &mut Context<Self>) {
        println!("ActorA: Hello");
        self.actor_b.send(Hello).await.unwrap();
    }
}

struct ActorB;
impl Actor for ActorB {}

#[async_trait]
impl Handler<Initialized> for ActorB {
    async fn handle(&mut self, m: Initialized, _: &mut Context<Self>) {
        println!("ActorB: Initialized");
        let actor_a = m.0;
        actor_a.send(Hello).await.unwrap();
    }
}

#[async_trait]
impl Handler<Hello> for ActorB {
    async fn handle(&mut self, _: Hello, _: &mut Context<Self>) {
        println!("ActorB: Hello");
    }
}

#[tokio::main]
async fn main() {
    let actor_b = ActorB {}.spawn();
    let actor_a = ActorA {
        actor_b: actor_b.clone(),
    }
    .spawn();
    actor_b.send(Initialized(actor_a.clone())).await.unwrap();
}

ActorB never receives the Hello message. Then tried multiple ways to use notify_later in that scenario but nothing prevented the deadlock. Maybe I was just missing the correct way to do it?

@Restioson
Copy link
Owner

I think the issue here might be that send waits for the return of a message. This probably causes a deadlock where they are each waiting for a response before being able to handle the next message. In this trivial example, this is solvable by replacing send with do_send, or not awaiting on send (though do_send is preferred as it is faster and cleaner).

Once this change is applied, I get the following output:

ActorB: Initialized
ActorA: Hello
ActorB: Hello

and then the program exits successfully.

The recommendation for notify_later was about sending messages to the actor itself. notify_later acts a little like ctx.address().unwrap().do_send(...), enqueueing a message to be processed after the current queue is processed.

Does this solve your actual usecase too?

@stoically
Copy link
Contributor Author

stoically commented Jun 9, 2020

Thanks for the explanation, that makes sense.

Does this solve your actual usecase too?

Unfortunately it does not - I do need to use send and await it since the return values are required.

@Restioson
Copy link
Owner

Unfortunately it does not - I do need to use send and await it since the return values are required.

Off the top of my head, might it be possible to have an InitializationComplete message sent from the second actor? Maybe there is a more elegant way of solving this with regards to the actual usecase. Do there need to be two return values either-way? Could you send a snippet of your usecase?

@stoically
Copy link
Contributor Author

stoically commented Jun 9, 2020

The Initialized message isn't what's deadlocking. It also deadlocks with the initialized message being do_send. What's locking it is the fact that ActorB wants to know something from ActorA which in turn needs to ask ActorB something to have a full picture of what to answer. And ActorB can't give that information initially since that depends on the message interpretation from ActorA.

@Restioson
Copy link
Owner

Restioson commented Jun 9, 2020

Could you do this:

ActorA sends information needed for the request to ActorB. It doesn't wait for a response.
ActorB sends information request to ActorA. It does wait for response.

Or similar?

@stoically
Copy link
Contributor Author

stoically commented Jun 9, 2020

In my usecase ActorB is consumer code and ActorA is library code. The Hello from ActorB needs an answer, since that's the consumer asking for information it needs. The message from the library before giving an answer is to get additional information for a complete answer.

@Restioson
Copy link
Owner

Restioson commented Jun 9, 2020

Would this work:

ActorB sends "Hello" to ActorA. ActorB waits for response.
ActorA sends whatever info the request needs. It doesn't wait for a response.
ActorB receives this, and makes the request, with the additional info.

If not, could we discuss on the rust discord to reduce back-and-forthing, perhaps?

@stoically
Copy link
Contributor Author

There could certainly be a different modelling that works, but what leaves me worried is that it might lock at runtime without compile-time warning, which could get problematic especially when the code base grows, since it doesn't get easier to have the full picture of which ways messages could flow.

And sure, feel free to DM me on discord.

@Restioson
Copy link
Owner

I've just opened #15, which allows interleaving messages using the strategy I explained to you in Discord DMs. Applying this to your example prints the following:

ActorB: Initialized
ActorA: Hello
ActorB: Hello

Process finished with exit code 0

Could you test it out with your real use case?

@Restioson Restioson added this to the 0.4.0 milestone Jun 10, 2020
@Restioson Restioson added the limitation A limitation in xtra label Jun 10, 2020
@Restioson Restioson self-assigned this Jun 10, 2020
@stoically
Copy link
Contributor Author

Looks like an elegant solution, and works. Thanks!

@Restioson
Copy link
Owner

Released in 0.4.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
limitation A limitation in xtra
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants