
How to run DARNN on GPU #12

Closed
jookty opened this issue Nov 18, 2020 · 2 comments

Comments

jookty commented Nov 18, 2020

I have tried to run the DARNN model on a GPU, but an error occurred. I modified some code as shown below:
```julia
encoder_lstm = Seq(HiddenRecur(Flux.LSTMCell(inp, encodersize) |> gpu))
decoder_lstm = Seq(HiddenRecur(Flux.LSTMCell(1, decodersize) |> gpu))
#= @inbounds =# for t in 1:m.poollength  # for gpu
input = input |> gpu
target = target |> gpu
model = model |> gpu
```

The error message is:

```
Warning: haskey(::TargetIterator, name::String) is deprecated, use Target(; name = name) !== nothing instead.
│ caller = llvm_compat(::VersionNumber) at compatibility.jl:176
└ @ CUDAnative ~/.julia/packages/CUDAnative/ierw8/src/compatibility.jl:176
[ Info: CUDA is on
[ Info: device = gpu
[ Info: Training
[ Info: epochs = 1
ERROR: LoadError: MethodError: no method matching (::var"#loss#30")(::CuArrays.CuArray{Float32,4,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing})
Stacktrace:
[1] macro expansion at /home/jee/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:0 [inlined]
[2] _pullback(::Zygote.Context, ::var"#loss#30", ::CuArrays.CuArray{Float32,4,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing}) at /home/jee/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:13
[3] loss at /home/jee/Projects/Julia/learn/FluxArch/DARNN/e2.jl:82 [inlined]
[4] _pullback(::Zygote.Context, ::var"#loss#30") at /home/jee/.julia/packages/Zygote/1GXzF/src/compiler/interface2.jl:0
[5] pullback(::Function, ::Zygote.Params) at /home/jee/.julia/packages/Zygote/1GXzF/src/compiler/interface.jl:172
[6] gradient(::Function, ::Zygote.Params) at /home/jee/.julia/packages/Zygote/1GXzF/src/compiler/interface.jl:53
[7] train2() at /home/jee/Projects/Julia/learn/FluxArch/DARNN/e2.jl:85
[8] top-level scope at /home/jee/Projects/Julia/learn/FluxArch/DARNN/e2.jl:92
[9] include(::Module, ::String) at ./Base.jl:377
[10] exec_options(::Base.JLOptions) at ./client.jl:288
[11] _start() at ./client.jl:484
```

Would you help me fix the model?

sdobber commented Nov 20, 2020

I would very much like to see these models run on a GPU, but unfortunately I don't have access to a suitable one currently, so I cannot do any development in that direction myself.

If I remember correctly, Flux's RNNs and their CUDA implementations are a weak point in the Flux ecosystem - for example, you cannot change the activation function in a GRU to anything other than tanh, as this is the only combination CUDA supports (which is probably why LSTnet would not work on a GPU with its relu GRU cell). Also, I would expect the Seq(...) part of the DARNN to cause problems, as I am not sure whether its heavy use of Zygote.Buffer is actually compatible with calculations on the GPU. I also cannot guarantee that all model parameters get transferred to the GPU correctly when using |> gpu with the models' nested structs - I guess this would be my starting point for an investigation.
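As a rough illustration of that starting point, here is a minimal sketch (not from the thread; the helper name `all_on_gpu` is hypothetical, and it assumes the CuArrays-era Flux versions used above) for checking whether the trainable parameters of a nested model actually land on the GPU after `|> gpu`:

```julia
using Flux, CuArrays

# Hypothetical helper: returns false if any trainable parameter
# collected by Flux.params is still a CPU Array after `|> gpu`.
# Note: Flux.params only walks structs that are registered with
# Flux (e.g. via @functor / @treelike), so parameters hidden in
# unregistered custom layers would be missed here as well.
function all_on_gpu(model)
    for p in Flux.params(model)
        p isa CuArrays.CuArray || return false
    end
    return true
end

# Usage, assuming `model` is the DARNN model from above:
# model = model |> gpu
# @show all_on_gpu(model)
```

If this returns false after `|> gpu`, the nested structs are the likely culprit and would need to be registered with Flux so that the GPU transfer reaches them.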

jookty commented Nov 23, 2020

Thank you for your kind reply.

@sdobber sdobber closed this as completed in 6bf7ccc Mar 3, 2021