When is it safe to free external DevicePtrs? #435

TravisWhitaker · 2019-02-07T04:15:29Z

In this Stack Overflow answer, there's code similar to the following for making an Array out of a DevicePtr:

...
ptx <- PTX.createTargetFromContext ctx
...
fp  <- c_generate_gpu_data
...
arr@(Array _ ad) <- Sugar.allocateArray (Z :. 32) :: IO (Vector Float)
LRU.insertUnmanaged (ptxMemoryTable ptx) ad fp
let r = PTX.runWith ptx $ A.fold (+) 0 (use arr)
print r
free fp

Something that's not quite clear to me is precisely when it's safe to free the device array backing arr. There are a few potential answers:

1. After r is in WHNF, which means the kernel has launched. This is probably not sufficient, because used arrays are allowed to be copied to the remote asynchronously.
2. After r (or part of it) is in normal form, which means the computation has finished. This probably works, but isn't ideal, since I might want to use the same arr again later.
3. After arr (or its ArrayData) is GC'd, which means the host is done with it but it might still be present on the device. Empirically, this seems to work, but I want to double-check that I understand why.
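
The options above hinge on the difference between WHNF and normal form. As a rough illustration in plain Haskell (not Accelerate; `lazyResult`, `forcePair`, `whnfSucceeds` and `nfSucceeds` are hypothetical names), a lazy pair stands in for a result whose outer constructor is available before its payload:

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- A pair whose second component fails when forced. It stands in for a
-- result whose outer constructor exists before the payload is ready.
lazyResult :: (Int, Int)
lazyResult = (1, error "payload not ready")

-- Force both components; for a pair of Ints this is normal form.
forcePair :: (Int, Int) -> ()
forcePair (a, b) = a `seq` b `seq` ()

-- WHNF only exposes the pair constructor, so the payload is never touched.
whnfSucceeds :: IO Bool
whnfSucceeds = do
  r <- try (evaluate lazyResult) :: IO (Either ErrorCall (Int, Int))
  pure (either (const False) (const True) r)

-- Normal form has to evaluate the payload, so the error surfaces.
nfSucceeds :: IO Bool
nfSucceeds = do
  r <- try (evaluate (forcePair lazyResult)) :: IO (Either ErrorCall ())
  pure (either (const False) (const True) r)
```

Here `whnfSucceeds` yields `True` and `nfSucceeds` yields `False`. Whether an Accelerate result in WHNF implies that the kernels have finished is exactly the open question in this issue; the sketch only shows why the two notions can differ.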

Initially the assumption 'host array is unreachable' -> 'device array is reclaimable' seemed insufficient to me, but from what I can tell this is essentially how Data.Array.Accelerate.Array.Remote.Table works. From my understanding, arrays are considered reclaimable once their UniqueArray goes out of scope.

Is it safe to do something like this?

...
myFree :: DevicePtr a -> IO ()
...
-- Like the 'mw' helper in `makeWeakArrayData`, but this probably isn't well-typed.
getUniqueArray :: (ArrayPtrs e' ~ Ptr a') => ArrayEltR e' -> ArrayData e' -> UniqueArray a'
...
ptx <- PTX.createTargetFromContext ctx
...
fp  <- c_generate_gpu_data
...
arr@(Array _ ad) <- Sugar.allocateArray (Z :. 32) :: IO (Vector Float)
let !uad = uniqueArrayData (getUniqueArray arrayElt ad)
addFinalizer uad (myFree fp)
LRU.insertUnmanaged (ptxMemoryTable ptx) ad fp
let r = PTX.runWith ptx $ A.fold (+) 0 (use arr)
print r
-- myFree gets called later, if I'm lucky.
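
The idea of tying the release of an external buffer to the lifetime of a host-side object can be sketched with base alone. This is only an analogy under stated assumptions: host memory from `mallocArray` stands in for the DevicePtr, `free` stands in for `myFree`, and `newExternalBuffer`, `manage` and `demo` are hypothetical names, not Accelerate API.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import Foreign.Concurrent (newForeignPtr)
import Foreign.ForeignPtr (ForeignPtr, finalizeForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (free)
import Foreign.Marshal.Array (mallocArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Hypothetical stand-in for an externally allocated buffer
-- (host memory here, not a real DevicePtr).
newExternalBuffer :: IO (Ptr Float)
newExternalBuffer = mallocArray 32

-- Wrap the raw pointer so the buffer is freed when the wrapper is finalized.
-- The IORef only records that the finalizer ran, for demonstration.
manage :: IORef Bool -> Ptr Float -> IO (ForeignPtr Float)
manage released p = newForeignPtr p (free p >> writeIORef released True)

demo :: IO (Float, Bool, Bool)
demo = do
  released <- newIORef False
  fp <- newExternalBuffer >>= manage released
  x <- withForeignPtr fp $ \p -> poke p 42 >> peek p
  before <- readIORef released
  -- Run the finalizer now so the demo is deterministic; normally the
  -- garbage collector decides when (and whether) this happens.
  finalizeForeignPtr fp
  after <- readIORef released
  pure (x, before, after)
```

Running `demo` gives `(42.0, False, True)`: the buffer is usable while the wrapper is alive and is released exactly once when the finalizer runs. As with the snippet above, nothing guarantees how promptly a GC-driven finalizer fires.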


tmcdonell · 2019-02-18T17:08:42Z

(oops, sorry I forgot to reply to this... feel free to ping me when I lose track of things...)

Hm, that is a good question...

1. I think we want that when r is in WHNF, the final kernel has completed executing (so run should be synchronous, whereas runAsync can return immediately). But, poking around the code right now, I don't think that is actually the case; we have only that all the kernels have been scheduled at this point, but they may not necessarily have completed (since kernels execute asynchronously w.r.t. each other). Under most circumstances I don't think you can observe this distinction, but see more below...
2. Certainly after r is in normal form it will be safe; it definitely needs to synchronise the kernel execution before it can copy the data back.
3. Freeing after arr is GC'd is safe because that can only happen after r gets GC'd as well, so either it is done evaluating with fp or it is not going to do so.

Yes, arrays are considered reclaimable once the UniqueArray goes out of scope. Thinking about this more... there is perhaps an awkward condition here where you force r to WHNF to initiate the computation, but if that computation takes a long time, and you discard r without ever forcing it to normal form, and a GC runs, then I think you can fall into a case where the arrays are reclaimed even though the computation may still be executing... I guess that motivates that (1) should be synchronous w.r.t. kernel execution.
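
The hazard described here is a buffer being released while asynchronous work may still be reading it. A minimal sketch of the synchronise-then-free ordering, using a Haskell thread in place of a GPU kernel (`launchSum` and `sumThenFree` are hypothetical names, and host memory stands in for the device array):

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, readMVar)
import Foreign.Marshal.Alloc (free)
import Foreign.Marshal.Array (mallocArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr)

-- Start a reduction over the buffer on another thread. The returned
-- action blocks until the worker has finished reading the buffer.
launchSum :: Ptr Float -> Int -> IO (IO Float)
launchSum p n = do
  done <- newEmptyMVar
  _ <- forkIO $ do
    xs <- peekArray n p
    putMVar done $! sum xs
  pure (readMVar done)

sumThenFree :: [Float] -> IO Float
sumThenFree xs = do
  let n = length xs
  p <- mallocArray n
  pokeArray p xs
  wait <- launchSum p n  -- work is only scheduled at this point
  r <- wait              -- synchronise: the worker is done with p
  free p                 -- releasing the buffer is safe only from here on
  pure r
```

For example, `sumThenFree [1 .. 32]` returns `528.0`. Moving `free p` above `wait` would reproduce the race in miniature, which is the argument for making (1) synchronous with respect to kernel execution.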

Yes, I think the code you posted at the end should be safe; r is the lynchpin which keeps everything it depends on alive.

tmcdonell · 2019-02-18T17:22:13Z

A further comment... I think v1.2 does implement (1) above as being synchronous w.r.t. kernel execution. I think this problem exists only in the development version, where a lot of this code got rewritten (mostly for the benefit of the CPU backend).

tmcdonell · 2020-09-03T08:42:52Z

I think this can be closed... feel free to reopen if you have further questions

tmcdonell closed this as completed Sep 3, 2020
