New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't use argparse.FileType #141
Conversation
Thanks and apologies for changing this without asking. :D I'll merge this this weekend. |
No apology necessary! The FileType setup is cleaner, but I didn't know it wouldn't serialize until I tried it on my Windows box. Thanks for your consideration of the PR! |
Commit 38a23e3 addresses #137. Introducing a slight delay between worker process kills lets the process end cleanly (cleaner?) and prevents Windows from returning SIGTERM (15) instead of the intended return code (RET_IMPORT/2). I ran this a large number of times on Windows and Linux and wasn't able to get it to fail, but it can be easily reverted if you see any issues. |
dramatiq/cli.py
Outdated
os.kill(proc.pid, signum) | ||
if proc.is_alive(): | ||
os.kill(proc.pid, signum) | ||
# For Windows, give the process time to clean up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we potentially use os.waitpid
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears that this issue is related to timing and how Windows handles the kill signal. The child process sometimes terminates between the check (proc.is_alive()
) and the kill. If the main process sends the kill signal, Windows assigns the signal (SIGTERM is 15 in this case) to the child process return code, so the main process return code gets set to 15.
Depending upon the process timing, the child may have terminated before proc.is_alive()
, so we never send a kill signal and don't unintentionally modify the return code--the happy path. Occasionally, the process hasn't terminated and gets killed, causing the intermittent failures.
Using a one-second timeout on a join call is cleaner and achieves the same result.
@ryansm1 sorry, I accidentally pushed the wrong thing (the current master) to your branch and now GH won't let me rectify that. I'll open a new PR with your changes and some of my own additions on top. |
Using argparse.FileType for argparse's log-file argument stores an unserializable TextIOWrapper on args (see https://docs.python.org/3/library/argparse.html#argparse.FileType). This breaks the spawn multiprocessing creation method (i.e. Windows).
This commit reverts to the prior method of handling args.log_file and attempts to close the log file handle when exiting.