A Child Actor not always killed, when its parent is killed #27

Closed
efibutov opened this Issue Jun 21, 2018 · 5 comments

efibutov commented Jun 21, 2018

@kquick
Looks like a bug...

from thespian.actors import Actor, ActorExitRequest, ActorSystem


class Child(Actor):
    def receiveMessage(self, msg, sender):
        print('Im child')


class Parent(Actor):
    def receiveMessage(self, msg, sender):
        self.createActor(Child)



asys = ActorSystem('multiprocTCPBase')
p = asys.createActor(Parent)
asys.tell(p, ActorExitRequest())
asys.shutdown()

Environment:
Ubuntu 18.04
Python 3.6.5

Running the code above many times leaves orphaned processes behind (visible in htop).

Owner

kquick commented Jun 22, 2018

Hi @efibutov,

Thank you for your report: there is indeed a timing window where an actor created during the shutdown can get left behind.

The asys.tell does not wait (even for the ActorExitRequest to be transmitted), so it doesn't have a significant effect on the system, but the asys.shutdown() should cause all the Actors to be shut down. However, it does this by delivering an ActorExitRequest to each running actor, and your Parent actor will always create a new child whenever it gets a message. Thus, the following events occur:

  1. Parent created
  2. ActorExitRequest sent to parent
  3. Parent receives ActorExitRequest, creates child
  4. Parent finishes handling ActorExitRequest and exits
  5. Child starts and attempts to connect to parent... which is no longer there
  6. Child should exit

The issue is with step 6, and I will get this fixed shortly.

To avoid this problem (even after the fix), I recommend rewriting the Parent as:

from thespian.actors import Actor, ActorSystemMessage

class Parent(Actor):
    def receiveMessage(self, msg, sender):
        if not isinstance(msg, ActorSystemMessage):
            self.createActor(Child)

The ActorSystemMessage is a base class for all of the messages that might be delivered from or relative to the Actor System itself (see http://thespianpy.com/doc/using.html#outline-container-hH-bb3655d6-66df-42d5-9486-e81c8687e9d6 for more details).
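
To make the sequence above concrete, here is a minimal toy model of it, offered as a hedged sketch rather than a description of Thespian's internals. `ToySystem`, `ToyActor`, `SystemMessage`, `ExitRequest`, `NaiveParent`, `GuardedParent`, and `leftover_after_shutdown` are hypothetical stand-ins defined only for this illustration. The model is single-threaded and ignores the process and connection handling where the real bug lived; it only shows why an actor created while its parent handles an exit request can be missed by a shutdown that notifies the actors running at that moment, and why the isinstance guard sidesteps that.

```python
# Toy, single-threaded model of the shutdown sequence. None of these classes
# belong to Thespian; they are hypothetical stand-ins for illustration only.

class SystemMessage:
    """Stand-in for Thespian's ActorSystemMessage base class."""


class ExitRequest(SystemMessage):
    """Stand-in for Thespian's ActorExitRequest."""


class ToySystem:
    def __init__(self):
        self.running = []  # actors currently alive in this toy system

    def create(self, actor_cls):
        actor = actor_cls(self)
        self.running.append(actor)
        return actor

    def shutdown(self):
        # Notify only the actors that are running when shutdown begins.
        for actor in list(self.running):
            actor.receive(ExitRequest())
            self.running.remove(actor)  # the actor exits after handling it
        return self.running  # anything created in the meantime is left over


class ToyActor:
    def __init__(self, system):
        self.system = system

    def receive(self, msg):
        pass


class Child(ToyActor):
    pass


class NaiveParent(ToyActor):
    def receive(self, msg):
        # Creates a child for every message, including the exit request.
        self.system.create(Child)


class GuardedParent(ToyActor):
    def receive(self, msg):
        # Same guard as the recommended Parent: skip system messages.
        if not isinstance(msg, SystemMessage):
            self.system.create(Child)


def leftover_after_shutdown(parent_cls):
    system = ToySystem()
    system.create(parent_cls)
    return len(system.shutdown())


print(leftover_after_shutdown(NaiveParent))    # 1
print(leftover_after_shutdown(GuardedParent))  # 0
```

In the toy model the naive parent leaves one child behind because the child is created after the shutdown loop has taken its snapshot of running actors, while the guarded parent never creates one in response to the exit request.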

@kquick kquick added the bug label Jun 22, 2018


kquick commented Jun 23, 2018

Fixed in 723fce0.

@pjz

This comment has been minimized.


kquick commented Jul 27, 2018

Thanks for the link, @pjz.

There's a lot of good background in that post, but Trio targets a somewhat different model than the Actor model. At a high level, the ActorSystem in Thespian is roughly equivalent to Trio's nursery (and to borrow a good idea from Trio, I'll look into adding context manager functionality to the Thespian ActorSystem object soon). The difference, however, is that Trio expects each concurrent function to run once to completion as part of the current process, and blocks until all concurrent functions have completed, whereas Thespian Actors support multiple message interactions, external processes (potentially remote), and extended lifetimes.

The Trio model is effectively a subset of the Actor model and could be implemented in Thespian (assuming the soon-to-be-added context manager functionality for simplicity):

class Target1(Actor):
    def receiveMessage(self, msg, sender):
        if not isinstance(msg, ActorSystemMessage):
            self.send(sender, response_for(msg))

class Target2(Actor): ...similar...

with ActorSystem(...) as asys:
    t1 = asys.createActor(Target1)
    t2 = asys.createActor(Target2)
    # createActor returns an actor address, so messages are sent via the system
    asys.tell(t1, ...requestmsg1...)
    asys.tell(t2, ...requestmsg2...)
    r1 = asys.listen()
    r2 = asys.listen()

with the caller then determining which request each of r1 and r2 answers, since the two responses may arrive in either order.
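
One way to do that matching is to have each request carry an identifier that the actor echoes back in its response. The following is a sketch under stated assumptions, not part of Thespian: `Response`, `collect_responses`, and `fake_listen` are hypothetical names, and `fake_listen` stands in for a listen call that is assumed here to return None once nothing more arrives.

```python
import queue
from dataclasses import dataclass


@dataclass
class Response:
    """Hypothetical reply message that echoes the id of its request."""
    request_id: int
    result: str


def collect_responses(listen, expected_ids):
    """Call listen() until every expected request id has been answered.

    `listen` is any zero-argument callable returning the next message,
    or None when nothing more arrives (an assumption of this sketch).
    """
    pending = set(expected_ids)
    results = {}
    while pending:
        reply = listen()
        if reply is None:
            break  # gave up waiting; some ids stay unanswered
        if isinstance(reply, Response) and reply.request_id in pending:
            results[reply.request_id] = reply.result
            pending.discard(reply.request_id)
    return results


# Simulated inbox: the responses arrive in the opposite order to the requests.
inbox = queue.Queue()
inbox.put(Response(request_id=2, result="from Target2"))
inbox.put(Response(request_id=1, result="from Target1"))


def fake_listen():
    try:
        return inbox.get_nowait()
    except queue.Empty:
        return None


results = collect_responses(fake_listen, [1, 2])
print(results[1], "|", results[2])
```

Keying the results by request id means the caller does not depend on arrival order, at the cost of requiring every actor to copy the id into its reply.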

[The particular problem @efibutov reported lies in the Thespian ActorSystem's internal management of startup/shutdown sequencing, and it arises because the Actor model does not block the way Trio's does. The problem therefore doesn't necessarily occur in the Trio model (hrm... how does it handle nesting of nursery access?), although there are still plenty of concurrency issues that both have to solve.]


kquick commented Aug 2, 2018

Fixed in release 3.9.3 (https://github.com/kquick/Thespian/releases/tag/thespian-3.9.3). Thank you for the report.

@kquick kquick closed this Aug 2, 2018
