TPA-LSTM trains slower on GPU than on CPU #15

sdobber · 2021-01-06T19:02:47Z

Apparently, the construction with Flux.unstack and Flux.stack is much slower than the 'slow' Zygote.Buffer. The latter cannot be used on the GPU due to missing support for array mutation.
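For readers following along, here is a minimal CPU-only sketch of the two constructions being compared. It is not the code from this repository. The function names `via_stack` and `via_buffer` and the toy slice function `f` are made up for illustration, and the split/re-join is written with plain indexing and `cat` as a stand-in for `Flux.unstack` / `Flux.stack`, because the signatures of those two functions differ between Flux versions.

```julia
using Zygote

# Variant 1: split along the time axis (dim 2), process each slice,
# then join the results again. Stand-in for Flux.unstack / Flux.stack.
function via_stack(f, inp)
    slices = [inp[:, t, :] for t in 1:size(inp, 2)]
    outs = map(f, slices)
    # Re-insert the singleton time dimension before concatenating.
    expanded = map(o -> reshape(o, size(o, 1), 1, size(o, 2)), outs)
    return cat(expanded...; dims = 2)
end

# Variant 2: write each processed slice into a Zygote.Buffer, which allows
# differentiable array mutation, and freeze it with `copy` at the end.
function via_buffer(f, inp)
    buf = Zygote.Buffer(inp, size(inp)...)
    for t in 1:size(inp, 2)
        buf[:, t, :] = f(inp[:, t, :])
    end
    return copy(buf)
end

f(x) = 2f0 .* x                  # toy per-timestep operation (assumption)
inp = rand(Float32, 4, 5, 3)     # features × timesteps × batch

y1 = via_stack(f, inp)
y2 = via_buffer(f, inp)

# Both variants can be differentiated with Zygote.
g1 = Zygote.gradient(x -> sum(via_stack(f, x)), inp)[1]
g2 = Zygote.gradient(x -> sum(via_buffer(f, x)), inp)[1]
```

Timing the two `gradient` calls (for example with BenchmarkTools) on realistic sizes would be one way to check the slowdown described above.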

DhairyaLGandhi · 2021-01-19T18:01:09Z

Probably open an issue on Flux.jl with an MWE?

sdobber · 2021-01-19T19:13:55Z

Which issue are you referring to? The fact that stack and unstack are slower, or that Zygote.Buffer does not work? I will try to get an MWE together (though it might take a bit due to another project I need to work on). I have to admit that the mistake could just as well be completely on my side. I'm new to GPU programming, so I'm still learning a lot while slowly moving forward, and I'm probably still doing a lot of things the wrong way 😄

DhairyaLGandhi · 2021-01-19T19:17:03Z

Both? I'm happy to help with network architectures as well. Btw, did you see that we added a reference to this repo on the Flux site? https://fluxml.ai/ecosystem.html#advanced-models

sdobber · 2021-01-26T18:54:49Z

Note to myself: MWE for Zygote.Buffer segfault on GPU:

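```julia
using Flux

# Input on the GPU: features × timesteps × batch
inp = rand(Float32, 137, 10, 1000) |> gpu
# Buffer with the same element type and array type as `inp`
B = Flux.Zygote.Buffer(inp, 137, 9, 1000)
t = 1
x = inp[:, t, :]
# Writing a slice into the buffer is the step that segfaults on the GPU
B[:, t, :] = x
```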

sdobber mentioned this issue Jan 24, 2021

Flux.stack / Flux.unstack slower than Zygote.Buffer FluxML/Flux.jl#1475

Open

sdobber added a commit that referenced this issue Jan 26, 2021

Run TPA-LSTM calculations via Zygote.Buffer

5ee3fe9

This should address issue #15. Update of Flux to newer version stops GPU from segfaulting when writing to a `Zygote.Buffer`.
