High allocations and getindex #150
Ok, now I think that the problem might be related to the definition of grad for getindex: https://github.com/FluxML/Flux.jl/blob/ca1c73ed352d0557a72e12b663fff20332d9aaff/src/tracker/lib/array.jl#L100, where you allocate a new zeroed array every time. It would be nice to be able to create a custom sparse version of this, but sparse CuArrays and views throw errors.
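To make the allocation pattern concrete, here is a minimal sketch in plain Julia (no Flux or CuArrays), assuming the pullback has the shape described above: a zeroed array the size of the input, with the incoming gradient written into the indexed slots. The names `dense_getindex_grad` and `accum_getindex_grad!` are hypothetical and are not part of Flux; the second function only illustrates one possible way to avoid the per-call allocation by reusing a buffer.

```julia
# Dense pullback (assumed shape of the rule discussed above):
# every call allocates a full array the size of `x`.
function dense_getindex_grad(x::AbstractArray, Δ, inds...)
    dx = zero(x)        # fresh zeroed array, same size and eltype as x
    dx[inds...] = Δ     # scatter the incoming gradient into the indexed slots
    return dx
end

# Hypothetical alternative: add into a caller-owned, preallocated buffer.
# Like the dense version, this does not sum contributions from repeated indices.
function accum_getindex_grad!(dx::AbstractArray, Δ, inds...)
    v = view(dx, inds...)
    v .+= Δ             # in-place update, no array the size of dx is created
    return dx
end

# Embedding-style example: pick three columns out of a 4×1000 matrix.
W   = rand(Float32, 4, 1000)
ids = [3, 17, 256]
Δ   = ones(Float32, 4, length(ids))

g_dense = dense_getindex_grad(W, Δ, :, ids)

buf = zeros(Float32, size(W))           # allocated once, reused across calls
accum_getindex_grad!(buf, Δ, :, ids)
```

Both paths yield the same gradient here, but the dense version builds a full 4×1000 array on each call even though only three columns are non-zero. With several large embedding matrices on the GPU, that would be consistent with the GC pressure reported in this issue.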
The output of We now also have support for pushing objects into the GC pool again, see JuliaGPU/CuArrays.jl#275, so this could be used in Flux to early-free those
Thanks for looking into it. Should I use the do syntax or put some
Both. The GC allocator has improved, so please try the master branch of CuArrays. But it didn't change the last example you posted much, so it might be worth adding some
Let's close this, see #137 (comment). If the issue persists, please open a new issue with an updated MWE.
Hi and thanks for your work!
I have a problem that effectively prevents me from training my Flux model with multiple embedding matrices.
It's hard to avoid allocations when indexing into (Flux-tracked) CuArrays, and my training time is dominated by GC time.
Views won't work in these examples that I think illustrate the problem:
I think getindex also has some issues with Flux backprop, since the following script allocates a lot of memory in the Flux.back! calls:
Are there any workarounds I can use?
Please let me know if you need more info.
Thank you so much for looking into this!