New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't use argparse.FileType #141

Closed
wants to merge 0 commits into
base: master
from

Conversation

2 participants
@ryansm1
Copy link
Contributor

ryansm1 commented Nov 26, 2018

Using argparse.FileType for argparse's log-file argument stores an unserializable TextIOWrapper on args (see https://docs.python.org/3/library/argparse.html#argparse.FileType). This breaks the spawn multiprocessing creation method (i.e. Windows).

This commit reverts to the prior method of handling args.log_file and attempts to close the log file handle when exiting.

@Bogdanp

This comment has been minimized.

Copy link
Owner

Bogdanp commented Nov 28, 2018

Thanks and apologies for changing this without asking. :D

I'll merge this this weekend.

@ryansm1

This comment has been minimized.

Copy link
Contributor

ryansm1 commented Nov 29, 2018

No apology necessary! The FileType setup is cleaner, but I didn't know it wouldn't serialize until I tried it on my Windows box. Thanks for your consideration of the PR!

@ryansm1

This comment has been minimized.

Copy link
Contributor

ryansm1 commented Nov 30, 2018

Commit 38a23e3 addresses #137. Introducing a slight delay between worker process kills lets the process end cleanly (cleaner?) and prevents Windows from returning SIGTERM (15) instead of the intended return code (RET_IMPORT/2).

I ran this a large number of times on Windows and Linux and wasn't able to get it to fail, but it can be easily reverted if you see any issues.

os.kill(proc.pid, signum)
if proc.is_alive():
os.kill(proc.pid, signum)
# For Windows, give the process time to clean up

This comment has been minimized.

@Bogdanp

Bogdanp Dec 1, 2018

Owner

Could we potentially use os.waitpid here?

This comment has been minimized.

@ryansm1

ryansm1 Dec 3, 2018

Contributor

It appears that this issue is related to timing and how Windows handles the kill signal. The child process sometimes terminates between the check (proc.is_alive()) and the kill. If the main process sends the kill signal, Windows assigns the signal (SIGTERM is 15 in this case) to the child process return code, so the main process return code gets set to 15.

Depending upon the process timing, the child may have terminated before proc.is_alive(), so we never send a kill signal and don't unintentionally modify the return code--the happy path. Occasionally, the process hasn't terminated and gets killed, causing the intermittent failures.

Using a one-second timeout on a join call is cleaner and achieves the same result.

@Bogdanp Bogdanp closed this Dec 22, 2018

@Bogdanp Bogdanp force-pushed the ryansm1:master branch from c299da6 to 5399932 Dec 22, 2018

@Bogdanp

This comment has been minimized.

Copy link
Owner

Bogdanp commented Dec 22, 2018

@ryansm1 sorry, I accidentally pushed the wrong thing (the current master) to your branch and now GH won't let me rectify that. I'll open a new PR with your changes and some of my own additions on top.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment