Julia can't run in parallel mode: (ERROR: type LocalProcess has no field r_stream) #2109

Closed
mauriciogtec opened this issue May 15, 2017 · 3 comments

mauriciogtec commented May 15, 2017

I installed Julia 0.5.2 and 0.6-rc on WSL without problems, and both run fine in default mode. But when I try to start Julia in parallel mode using

julia -p <#cores>

I get the following error:

ERROR: type LocalProcess has no field r_stream
Stacktrace:
 [1] (::Base.Distributed.##call#19#20)(::Nullable{VersionNumber}, ::WorkerConfig, ::Type{T} where T, ::Int64, ::TCPSocket, ::TCPSocket, ::Base.Distributed.DefaultClusterManager) at ./distributed/cluster.jl:74
 [2] (::Core.#kw#Type)(::Array{Any,1}, ::Type{Base.Distributed.Worker}, ::Int64, ::TCPSocket, ::TCPSocket, ::Base.Distributed.DefaultClusterManager) at ./<missing>:0
 [3] connect_to_peer(::Base.Distributed.DefaultClusterManager, ::Int64, ::WorkerConfig) at ./distributed/process_messages.jl:330
 [4] (::Base.Distributed.##117#118{WorkerConfig,Int64})() at ./task.jl:335
Error [ErrorException("type LocalProcess has no field r_stream")] on 3 while connecting to peer 2. Exiting.
Worker 2 terminated.
Worker 3 terminated.ERROR (unhandled task failure): EOFError: read end of file

ERROR (unhandled task failure): Version read failed. Connection closed by peer.



^Cfatal: error thrown and no exception handler available.
InterruptException()
jl_run_once at /home/centos/buildbot/slave/package_tarball64/build/src/jl_uv.c:132
process_events at ./libuv.jl:82 [inlined]
wait at ./event.jl:216
task_done_hook at ./task.jl:256
unknown function (ip: 0x7f0e16c9b72b)
jl_call_fptr_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:337 [inlined]
jl_call_method_internal at /home/centos/buildbot/slave/package_tarball64/build/src/julia_internal.h:356 [inlined]
jl_apply_generic at /home/centos/buildbot/slave/package_tarball64/build/src/gf.c:1930
jl_apply at /home/centos/buildbot/slave/package_tarball64/build/src/julia.h:1422 [inlined]
finish_task at /home/centos/buildbot/slave/package_tarball64/build/src/task.c:232
start_task at /home/centos/buildbot/slave/package_tarball64/build/src/task.c:275

I am using the Windows 10 Creators Update. I have tried installing Julia from both the PPA and the generic Linux binaries; in either case I get the same error. This is a WSL issue, since the same command runs without problems on a full Linux installation.

sunjoong commented May 15, 2017

@mauriciogtec - I guess it might be a network problem, because "Each worker binds to only one of the local interfaces and listens on the first free port starting from 9009." -- https://docs.julialang.org/en/stable/manual/parallel-computing/#network-requirements-for-localmanager-and-sshmanager

You seem to have a working real Linux system, so you could check which port is listening when you launch Julia in parallel mode there, and compare that with WSL. I suspect the port will not be listening under WSL. If so, it might be similar to my problem, which would mean it is not a Julia issue.
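The port check suggested above could be sketched roughly as follows. This is only an illustration in Python, not something from the Julia docs: the function name `listening_ports`, the default host `127.0.0.1`, and the range of eight ports are all assumptions. Since a worker binds to only one of the local interfaces, the address of that interface may need to be passed instead of the loopback address.

```python
import socket


def listening_ports(host="127.0.0.1", first_port=9009, count=8, timeout=0.5):
    """Return the ports in [first_port, first_port + count) that accept a TCP connection.

    host, first_port and count are illustrative defaults (assumptions); 9009 is
    the starting port mentioned in the quoted documentation.
    """
    open_ports = []
    for port in range(first_port, first_port + count):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an errno value otherwise,
            # instead of raising an exception for a refused connection.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


if __name__ == "__main__":
    # Run this while `julia -p <#cores>` is starting up, once on Linux and once on WSL.
    print(listening_ports())
```

If the list is non-empty on the real Linux system but empty under WSL, that would be consistent with the workers failing to listen there.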

In my problem #1498 (comment), it looked like "Anti-Virus was blocking the program from listening", but in another issue, #1853 (comment), it appears to happen without any anti-virus.

@goldfita

Mine hangs right after running the cmdlineargs test. If I hit Ctrl-C, I get an EXCEPTION_ACCESS_VIOLATION (not sure what the deal is with the path).

unknown function (ip: 000000006508A3F1)
unknown function (ip: 000000006508A3F1)
jl_call_fptr_internal at /cygdrive/c/Users/goldfing/Desktop/cygwin64/home/goldfing/julia/src/cygdrive/c/Users/goldfing/Desktop/cygwin64/home/goldfing/julia/src\julia_internal.h:337 [inlined]

The same build makes it through Base.runtests on another machine. The only difference is the target architecture. I do have different anti-virus programs on the two machines, but most of the rest of the tests pass, and it's clearly using multiple workers right from the start.

@benhillis
Member

Closing out old issues. If you're still having issues on newer versions of Windows, please open a new issue.
