Allocating an array, switching devices, and then triggering GC seems to cause errors. It's unclear to me to what extent this is supposed to work, or whether multi-device use is still too experimental, but it hampers single-process multi-GPU work quite a bit (which otherwise seems very doable), so if there's an easy fix it'd be great to have one.
Here's an MWE (Julia 1.6, CUDA 2.6.1):
julia> using CUDA
julia> device!(0)
julia> x = CUDA.rand(2,2)
2×2 CuArray{Float32, 2}:
 0.386771  0.448549
 0.419093  0.383297
julia> device!(1)
julia> x = nothing
julia> GC.gc(true)
WARNING: Error while freeing CuPtr{Nothing}(0x00002aab9fe30000):
Base.KeyError(key=CUDA.CuPtr{Nothing}(0x00002aab9fe30000))
The bug is easy enough to understand: this line looks up the pointer in the pool for the current device rather than the device on which it was allocated, so the pointer isn't found there.
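To make the failure mode concrete, here is a toy model in plain Julia that needs neither a GPU nor CUDA.jl. It is only a sketch of the mechanism described above, not CUDA.jl's actual pool code: `POOLS`, `CURRENT_DEVICE`, `set_device!`, `alloc!`, `free_current!`, and `free_owned!` are all hypothetical names made up for illustration.

```julia
# Toy model of per-device memory pools. All names are hypothetical;
# this does not reproduce CUDA.jl internals.

const POOLS = Dict{Int,Dict{UInt,Int}}()  # device id => (pointer => size in bytes)
const CURRENT_DEVICE = Ref(0)
const NEXT_PTR = Ref(UInt(0x1000))

set_device!(dev::Int) = (CURRENT_DEVICE[] = dev; nothing)

# Allocate on the current device; return the pointer and the owning device.
function alloc!(bytes::Int)
    ptr = NEXT_PTR[]
    NEXT_PTR[] += UInt(bytes)
    pool = get!(() -> Dict{UInt,Int}(), POOLS, CURRENT_DEVICE[])
    pool[ptr] = bytes
    return ptr, CURRENT_DEVICE[]
end

# Buggy free: looks in the pool of whichever device is current right now.
function free_current!(ptr::UInt)
    pool = get!(() -> Dict{UInt,Int}(), POOLS, CURRENT_DEVICE[])
    return pop!(pool, ptr)  # throws KeyError if ptr belongs to another device
end

# Possible fix: remember the owning device at allocation time and free there.
free_owned!(ptr::UInt, owner::Int) = pop!(POOLS[owner], ptr)

set_device!(0)
ptr, owner = alloc!(16)   # allocated on device 0
set_device!(1)            # switch devices before the free happens

# The buggy path fails with a KeyError, mirroring the warning in the MWE.
buggy_failed = try
    free_current!(ptr)
    false
catch err
    err isa KeyError
end

# Freeing through the recorded owner succeeds regardless of the current device.
free_owned!(ptr, owner)
```

The design point is that the owning device has to travel with the allocation (here as the second return value of `alloc!`), so that a finalizer running after a device switch can still find the right pool.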