CURAND handles are collected early #699
Comments
Random.seed! is not thread safe for CUDA.CURAND.default_rng().
After I got the internal library error, all CURAND calls failed, but only when executed in a thread with an explicit RNG. So I think the bug is in CUDA.CURAND.default_rng():
julia> randn(CUDA.CURAND.default_rng(), 2)
2-element CuArray{Float64, 1}:
-0.026281558163828614
0.3863620119156326
julia> fetch(Threads.@spawn randn(CUDA.CURAND.default_rng(), 2))
ERROR: TaskFailedException
Stacktrace:
[1] wait
@ ./task.jl:317 [inlined]
[2] fetch(t::Task)
@ Base ./task.jl:332
[3] top-level scope
@ threadingconstructs.jl:179
nested task error: CURANDError: internal library error (code 999, CURAND_STATUS_INTERNAL_ERROR)
Stacktrace:
[1] throw_api_error(res::CUDA.CURAND.curandStatus)
@ CUDA.CURAND ~/.julia/dev/CUDA/lib/curand/error.jl:53
[2] seed!(rng::CUDA.CURAND.RNG, seed::UInt64, offset::Int64)
@ CUDA.CURAND ~/.julia/dev/CUDA/lib/curand/random.jl:45
[3] seed!(rng::CUDA.CURAND.RNG)
@ CUDA.CURAND ~/.julia/dev/CUDA/lib/curand/random.jl:38
[4] (::CUDA.CURAND.var"#46#48"{CuContext})()
@ CUDA.CURAND ~/.julia/dev/CUDA/lib/curand/CURAND.jl:54
[5] get!
@ ./iddict.jl:163 [inlined]
[6] default_rng()
@ CUDA.CURAND ~/.julia/dev/CUDA/lib/curand/CURAND.jl:40
[7] (::var"#5#6")()
@ Main ./threadingconstructs.jl:169
julia> fetch(Threads.@spawn CUDA.randn(2))
2-element CuArray{Float32, 1}:
-0.23838419
1.6223384

Also reproduces with

It looks like the RNG handle is getting destroyed, even though I keep it alive from the task finalizer. The reproducer is very finicky, though; maybe it depends on the GC's ability to mark both the task and the RNG as dead at the same time?

Describe the bug
Using CUDA on the master branch.
Random.seed! is not thread safe for CUDA.CURAND.default_rng().
To reproduce
The Minimal Working Example (MWE) for this bug:
Log:
Expected behavior
pass
Version info
Details on Julia:
Additional context
Shall we add this code to the test case?
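A minimal sketch of what such a test might look like, assuming the Test standard library and a working GPU. The testset name is made up for illustration, and the calls are the ones from the REPL session in the comments. Since the reproducer is reported as finicky, a passing run would not prove the bug is gone.

```julia
using CUDA, Test

# Hypothetical testset name; the calls mirror the REPL session from this issue.
@testset "CURAND default_rng inside a task" begin
    # The failing case from the report: default_rng() is looked up from a task.
    # If the task fails, fetch rethrows and the testset records an error.
    x = fetch(Threads.@spawn randn(CUDA.CURAND.default_rng(), 2))
    @test x isa CuArray{Float64,1}
    @test length(x) == 2

    # The convenience method that already worked in the report.
    y = fetch(Threads.@spawn CUDA.randn(2))
    @test y isa CuArray{Float32,1}
    @test length(y) == 2
end
```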