Control-C does not stop worker processes #3805

Closed
simonster opened this issue Jul 23, 2013 · 8 comments

@simonster
Member

I keep filing bugs about ^C and various functionality, so please set me straight if this isn't supposed to work, but when I press ^C, it doesn't seem to halt worker processes:

 ➜ julia git:(master) ✗>julia -p 4
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "help()" to list help topics
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.2.0-2763.r9fd52dd58
 _/ |\__'_|_|_|\__'_|  |  Commit 9fd52dd58 2013-07-23 17:26:57
|__/                   |  x86_64-apple-darwin12.4.0

julia> @everywhere println("hello world")
hello world
    From worker 2:  hello world
    From worker 3:  hello world
    From worker 4:  hello world

julia> pmap(x->while true; end, 1:3)
^CERROR: interrupt
 in yield at multi.jl:1436
 in wait at task.jl:105
 in wait at task.jl:28
 in sync_end at multi.jl:1138
 in pmap at multi.jl:1146

At this point, the worker processes are still executing and taking up a core each. If I try to run something on them, Julia just hangs:

julia> @everywhere println("hello world")
hello world

If I press ^C again, I can get back to the REPL, but the workers continue to run even after I exit Julia. Setting err_stop=true, err_retry=false in pmap doesn't seem to help, and I get the same behavior with @everywhere.

@JeffBezanson
Sponsor Member

I recall at one point we almost moved the detach to multi.jl:1064 (the ssh case only) but I think we decided against it to make the local & remote cases more consistent. Certainly for local processes we can have a way to send a signal, and possibly for remote processes too.

@JeffBezanson
Sponsor Member

The idea would be to have a separate mechanism for signaling workers, since ^C by itself is ambiguous. Sometimes you might want to just interrupt a local loop and leave alone background things.

@amitmurthy
Contributor

How about

  • Each worker process also writes its OS pid to stdout, which is then parsed and stored in the Worker object on pid 1.
  • stopworkers(;pids = workers()) would send a SIGINT to the workers using the system kill command, run over ssh for remote workers or as a local command for local workers.
  • I couldn't find an equivalent to the kill command for sending a signal to a process on Windows. We may have to write a helper executable that wraps the appropriate Windows API call.
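A minimal sketch of the command-building half of this proposal, assuming the worker's OS pid has already been collected on pid 1 (for example from `getpid()` on the worker). The name `sigint_command` and its `host` keyword are hypothetical and exist only for illustration; nothing here is part of Julia's API, and the sketch only builds the command without running it.

```julia
# Hypothetical helper (not part of Julia): build the system command that
# would deliver SIGINT to a worker identified by its OS pid.
# `host === nothing` means a local worker; otherwise the command is wrapped in ssh.
function sigint_command(ospid::Integer; host::Union{Nothing,AbstractString}=nothing)
    local_cmd = `kill -INT $ospid`
    return host === nothing ? local_cmd : `ssh $host kill -INT $ospid`
end

# Inspect the commands without executing them.
println(sigint_command(1234))
println(sigint_command(1234; host="node1"))
```

A `stopworkers`-style function could then call `run` on each command. As the last bullet notes, this relies on a `kill` program and so does not cover Windows.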

@JeffBezanson
Sponsor Member

If we can keep the Process object for the local workers, we can use the kill function on it, which already supports Windows via libuv.
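As a rough illustration of signaling through a kept Process object, the sketch below uses `sleep` as a stand-in for a local worker. It assumes a Unix-like system with a `sleep` program and a Julia 1.x-style API (`run(...; wait=false)`, `Base.SIGINT`); the API at the time of this thread differed.

```julia
# `sleep 60` stands in for a local worker process (assumption for illustration).
p = run(`sleep 60`; wait=false)   # returns a Base.Process without blocking

kill(p, Base.SIGINT)              # the signal is delivered through libuv
ok = success(p)                   # waits for the child to exit and reports whether it succeeded

println("exited cleanly: ", ok)
println("still running:  ", process_running(p))
```

Because the Process handle is held by the parent, no OS pid parsing or external `kill` program is needed for local workers.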

@JeffBezanson
Sponsor Member

Maybe we could have ^C interrupt only the REPL process if it's in the middle of running something, but send it to all workers if typed at the prompt with nothing running locally. That way, if you have "background" tasks running on workers and you enter while true end, ^C will only break out of that loop, but otherwise it will be sent everywhere.
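The proposed rule can be written down as a small decision function. This is only a sketch of the policy described above; `interrupt_targets` and its return fields are hypothetical names, and how the REPL would detect that something is running locally is left open.

```julia
# Hypothetical policy sketch: decide who receives ^C.
# If the REPL process is busy, only the local computation is interrupted;
# if ^C is typed at an idle prompt, it is forwarded to every worker.
function interrupt_targets(local_busy::Bool, worker_pids::AbstractVector{<:Integer})
    if local_busy
        return (interrupt_local = true, workers = Int[])
    else
        return (interrupt_local = false, workers = collect(Int, worker_pids))
    end
end

println(interrupt_targets(true,  [2, 3, 4]))
println(interrupt_targets(false, [2, 3, 4]))
```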

@amitmurthy
Contributor

Sounds good to me.

^C^C can be used to interrupt both a local waiting pmap and stuck remote workers.

Will the detection of whether anything is running locally be done in the sigint_handler in init.c?

@simonster
Member Author

Maybe ^C should also interrupt the workers in a @sync block? I feel like ^C should have the same behavior for pmap as for map.

@JeffBezanson
Sponsor Member

We now have interrupt for this.
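For readers on a current Julia, a hedged usage sketch: `interrupt` lives in the Distributed standard library in Julia 1.x (it was in Base at the time of this thread). The helper `spin` is a made-up stand-in for long-running work, and exactly how an interrupted worker reacts can vary by version, so no output is shown.

```julia
using Distributed

addprocs(2)   # two local workers, comparable to starting with `julia -p 2`

# Hypothetical long-running job; `sleep` gives the worker regular yield points.
@everywhere function spin()
    while true
        sleep(0.1)
    end
end

# Start the job on every worker without waiting for it to finish.
busy = [remotecall(spin, w) for w in workers()]

# Ask the workers to interrupt whatever they are currently executing,
# which the documentation describes as equivalent to pressing ^C on them.
interrupt(workers())
```

Calling `interrupt()` with no arguments targets all workers, and `interrupt(2, 3)` targets specific worker ids.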


3 participants