Some tips to solve the segmentation fault error when using validate.py. #28

Open
DarrenRuan opened this issue Apr 28, 2020 · 0 comments

DarrenRuan commented Apr 28, 2020

Here I try to summarize the segmentation fault issue and some potential solutions (although they might not work).

My env: Julia 1.1

The Error Description

#27
signal (11): Segmentation fault
in expression starting at no file:0

#20
pid: 0 traj: 58 / 2062 (then stuck)
signal (11): Segmentation fault
in expression starting at no file:0
GetResult at /home/ilan/minonda/conda-bld/work/Python-3.5.2/Modules/_ctypes/callproc.c:911 [inlined]

Potential Solutions

(I would really appreciate it if you could share your insights here.)

  1. Use Julia v0.6 and run
    cd ~/.julia/lib/v0.6
    rm PyCall.jl
    Reference: https://github.com/sisl/ngsim_env/blob/master/docs/usingTrainedPolicy.md
    (However, I do not know how to install Julia packages on v0.6, e.g. Pkg.add(PackageSpec(url="https://github.com/sisl/Vec.jl")).)
  • How can I use PackageSpec in Julia v0.6?
  • How can I install 'LinearAlgebra' in Julia v0.6? I ask because:

LinearAlgebra is a standard library introduced in Julia v0.7 containing Base.LinAlg from Julia v0.6, so it is not available on Julia v0.6. A package that requires it will not work on Julia 0.6 either. (https://discourse.julialang.org/t/package-linearalgebra/16064)

  2. As the authors suggest for solution 1, we could also rm ~/julia/.julia/compiled/v1.1/PyCall/****.jl. (I still got the error even after doing this.)

  3. Use 'single_process_collect_trajectories' by setting --debug = True. (This failed for me.)

  4. Try different '--n_proc' values and sleep times. (This is really hard to get right.)
    E.g., if there are 4 vCPUs on your machine, it is unclear how to choose settings that match what the authors describe below. (Try --n_proc = 3, 4, or 5.)

Running validate.py occasionally hangs with no error messages or anything like that. Previous experience suggests that this is somehow related to julia processes remaining unfinished and the python script moving on. Looking in validate.py, there is a sleep() call. In the past, we have had some limited success in overcoming the hanging problem by increasing the sleep duration. However, it is not guaranteed. We have been unable to produce a minimal reproducible example of this happening, but the thoughts are that it is related to the machine's load. A higher load means we need to wait longer.

Reference: https://github.com/sisl/ngsim_env/tree/master/scripts/imitation

  5. My question: is it possible to output trajlist (in validate.py) even if one of the processes fails? Could you give me some hints about this? I would really appreciate any response.
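
To make the last question concrete, here is a minimal sketch of the general pattern: start each worker as its own OS process, wait until a shared deadline instead of sleeping for a fixed time, and keep the results of every worker that exited cleanly. It does not reflect how validate.py is actually structured. `WORKER_SRC`, `collect_partial`, the JSON result format, and the simulated crash in worker 1 are all hypothetical stand-ins for illustration.

```python
import json
import subprocess
import sys
import time

# Hypothetical stand-in for one trajectory-collection worker: worker 1 aborts
# hard (similar in effect to a segfault); the others print a JSON result.
WORKER_SRC = """
import json, os, sys
pid = int(sys.argv[1])
if pid == 1:
    os.abort()
print(json.dumps({"pid": pid, "traj": [pid, pid + 1]}))
"""


def collect_partial(n_proc, timeout_s=30.0):
    """Run n_proc workers; return (results of clean exits, ids of failed workers)."""
    procs = []
    for pid in range(n_proc):
        proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SRC, str(pid)],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            text=True,
        )
        procs.append((pid, proc))

    # One shared deadline for all workers, rather than a fixed sleep.
    deadline = time.monotonic() + timeout_s
    trajlist, failed = [], []
    for pid, proc in procs:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            out, _ = proc.communicate(timeout=remaining)
        except subprocess.TimeoutExpired:
            # Worker is hung: kill it and record the failure.
            proc.kill()
            proc.communicate()
            failed.append(pid)
            continue
        if proc.returncode == 0:
            trajlist.append(json.loads(out))
        else:
            # Non-zero or signal exit (e.g. a crash): skip this worker's output.
            failed.append(pid)
    return trajlist, failed


if __name__ == "__main__":
    trajlist, failed = collect_partial(n_proc=3)
    print("collected:", len(trajlist), "failed workers:", failed)
```

Because each worker is isolated in its own process, a crash or hang in one of them only costs that worker's trajectories. Whether the same isolation is achievable when the workers call into Julia through PyCall is an open question that this sketch does not answer.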