"async stop" seems to conflict with the "respawn" feature #529

Closed
thefab opened this issue Sep 10, 2013 · 9 comments · Fixed by #536

Comments

@thefab
Contributor

thefab commented Sep 10, 2013

During an "async stop" with a large graceful timeout, it seems that, depending on the context, some processes can be respawned between the "kill -TERM" and the "kill -KILL".

I'm not completely sure about this, but I am seeing suspicious behaviour with this kind of watcher:

#!/usr/bin/env python

import signal
import random
import sys
import time


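def handler(signum, frame):
    # Simulate a slow graceful shutdown: wait 3 to 10 seconds after SIGTERM.
    value = random.randint(3, 10)
    print("waiting %i seconds..." % value)
    time.sleep(value)
    print("exiting")
    sys.exit(0)

# Install the handler so SIGTERM triggers the delayed exit above.
signal.signal(signal.SIGTERM, handler)


print("starting")
# Main loop: stay alive until a signal arrives.
while True:
    print("start sleep(1)...")
    time.sleep(1)
    print("end sleep(1)")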

and with this config file:


[circus]
check_delay = 5
endpoint = tcp://127.0.0.1:5555
pubsub_endpoint = tcp://127.0.0.1:5556
stats_endpoint = tcp://127.0.0.1:5557
httpd = False
debug = False
loglevel = DEBUG

[watcher:dummy]
cmd = dummy.py
args = foo
warmup_delay = 0
numprocesses = 5
graceful_timeout = 30
stdout_stream.class = FileStream
stdout_stream.filename = test1.log
stderr_stream.class = FileStream
stderr_stream.filename = test2.log

I'm not certain, because I am still not comfortable with the circus source, but some kind of "stopping flag" seems to be missing between the kill -TERM and the kill -KILL.

This issue blocks #515 for us.

And of course, this issue is not solved by the #528 fix.

Thanks
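
The "stopping flag" idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the circus implementation: `SketchWatcher`, `begin_stop` and `process_exited` are hypothetical names, and processes are modelled as plain integers instead of real child processes.

```python
class SketchWatcher(object):
    """Hypothetical stand-in for a process watcher (not the circus class)."""

    def __init__(self, numprocesses):
        self.numprocesses = numprocesses
        # Fake pids standing in for running child processes.
        self.processes = set(range(numprocesses))
        self._next_pid = numprocesses
        # Set once SIGTERM has been sent; blocks respawning until a restart.
        self._stopping = False

    def begin_stop(self):
        """Mark the point where "kill -TERM" is sent to the children."""
        self._stopping = True

    def process_exited(self, pid):
        """Record that one child has exited."""
        self.processes.discard(pid)

    def manage_processes(self):
        """Periodic check: refill the pool unless a stop is in progress.

        Returns the number of processes spawned by this call.
        """
        if self._stopping:
            return 0
        spawned = 0
        while len(self.processes) < self.numprocesses:
            self.processes.add(self._next_pid)
            self._next_pid += 1
            spawned += 1
        return spawned
```

With this guard, a child that exits during the graceful timeout is not replaced: after `begin_stop()`, `manage_processes()` spawns nothing, whereas without the flag the periodic check would refill the pool before the "kill -KILL" step.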

@almet
Contributor

almet commented Sep 11, 2013

Can you tell me if 7e4e3cf fixes your issue? https://github.com/mozilla-services/circus/tree/fix-529

@thefab
Contributor Author

thefab commented Sep 11, 2013

Sounds good, @ametaireau! We are going to run a "real life" test in a few hours to be sure.

@almet
Contributor

almet commented Sep 11, 2013

Cool. Either way, this will need tests as well :)

@ghost

ghost commented Sep 11, 2013

It looks like _stopping should be initialized in Watcher.__init__:

Traceback (most recent call last):
  File "/home/synext/lib/python2.7/site-packages/zmq/eventloop/minitornado/ioloop.py", line 799, in _run
    self.callback()
  File "/home/synext/lib/python2.7/site-packages/circus-0.9.3-py2.7.egg/circus/arbiter.py", line 533, in manage_watchers
    watcher.manage_processes()
  File "/home/synext/lib/python2.7/site-packages/circus-0.9.3-py2.7.egg/circus/util.py", line 319, in _log
    return func(self, *args, **kw)
  File "/home/synext/lib/python2.7/site-packages/circus-0.9.3-py2.7.egg/circus/watcher.py", line 438, in manage_processes
    and not self._stopping):
AttributeError: 'Watcher' object has no attribute '_stopping'
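
The failure mode in this traceback can be reproduced in miniature. The sketch below assumes the attribute was only created when a stop began; `BrokenWatcher` and `FixedWatcher` are hypothetical classes written for illustration and are not taken from the circus source.

```python
class BrokenWatcher(object):
    """Hypothetical watcher whose flag is first created inside stop()."""

    def __init__(self):
        self.processes = {}

    def stop(self):
        # First place the attribute is ever assigned.
        self._stopping = True

    def manage_processes(self):
        # Raises AttributeError if stop() was never called on this instance.
        return not self._stopping


class FixedWatcher(BrokenWatcher):
    """Same watcher, with the flag initialized up front."""

    def __init__(self):
        BrokenWatcher.__init__(self)
        # Defined from construction, so periodic checks can always read it.
        self._stopping = False
```

Calling `manage_processes()` on a fresh `BrokenWatcher` raises the same kind of `AttributeError`, while a fresh `FixedWatcher` returns `True` (respawning allowed) until `stop()` is called.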

@tarekziade
Member

Good catch, thanks. We will add a test that covers this.

@almet
Contributor

almet commented Sep 12, 2013

Fixed in the branch. https://github.com/mozilla-services/circus/tree/fix-529

@tarekziade
Member

Added a PR at #536. LGTM. @ametaireau, I will let you merge.

@almet
Contributor

almet commented Sep 12, 2013

Okay, adding a test before that.

@almet
Contributor

almet commented Sep 12, 2013

Should be fixed.


3 participants