Error in update! for Metal arrays and Adam optimiser #150
Metal really doesn't like Float64 even as an intermediate step, not an array:

```julia
julia> using Metal

julia> x = MtlArray([1 2 3f0])
1×3 MtlMatrix{Float32, Metal.MTL.MTLResourceStorageModePrivate}:
 1.0  2.0  3.0

julia> Float32.(x .+ 4.0)
ERROR: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#broadcast_kernel#26")(::Metal.mtlKernelContext, ::MtlDeviceMatrix{Float32, 1}, ::Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{2}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, Metal.var"#86#87"{Float32}, Tuple{Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{2}, Nothing, typeof(+), Tuple{Base.Broadcast.Extruded{MtlDeviceMatrix{Float32, 1}, Tuple{Bool, Bool}, Tuple{Int64, Int64}}, Float64}}}}, ::Int64) resulted in invalid LLVM IR
Reason: unsupported use of double value
Reason: unsupported use of double value
```

I guess the equivalent for CUDA is slow, but we seldom care for Optimisers. Seems very likely to work with …
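The promotion behind that error is visible on the CPU alone; a small sketch (plain Julia, no Metal needed) of how a Float64 literal widens the broadcast:

```julia
x = Float32[1, 2, 3]

# A Float64 literal promotes every intermediate of the fused broadcast:
eltype(x .+ 4.0)   # Float64 -- this is the "double value" Metal rejects

# A Float32 literal keeps the whole kernel at Float32:
eltype(x .+ 4f0)   # Float32
```

The outer `Float32.(...)` above doesn't help, because the promotion happens inside the fused kernel before the final conversion.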
It turns out that I was using a Float64 learning rate; with Float32 it works instead:

```julia
julia> opt = Optimisers.setup(MyAdam(1f-3), x) |> Flux.gpu
Leaf(MyAdam{Float32}(0.001, (0.9, 0.999), 1.19209f-7), (Float32[0.0, 0.0, 0.0], Float32[0.0, 0.0, 0.0], (0.9, 0.999)))

julia> Optimisers.update!(opt, x, g)
(Leaf(MyAdam{Float32}(0.001, (0.9, 0.999), 1.19209f-7), (Float32[0.0368521, 0.0404, 0.0267676], Float32[0.000135806, 0.000163214, 7.16493f-5], (0.81, 0.998001))), Float32[0.7669642, 0.055360284, 0.8887157])
```

Can we make this more robust?
Xref also #119. It would probably be ideal to follow the eltype of the array (or the state arrays). That would add some complexity to each rule. But if we do that, then the rule need not have a type parameter at all; we could just store all learning rates as …
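One hypothetical shape for that idea (a sketch only — `MyDescent` and `my_apply` are invented names, not Optimisers.jl API): store the hyperparameter as plain Float64 on an unparameterised struct, and narrow it to the array's eltype at the point of use:

```julia
# Hypothetical rule: stores eta as Float64, no type parameter on the struct.
struct MyDescent
    eta::Float64
end

# Narrow the learning rate to eltype(x) before broadcasting, so the
# fused kernel never sees a Float64 scalar next to a Float32 array.
function my_apply(o::MyDescent, x::AbstractArray, dx)
    eta = convert(float(eltype(x)), o.eta)
    return eta .* dx
end
```

For `x::MtlArray{Float32}` this would broadcast a `Float32` learning rate, avoiding the invalid-IR failure above, at the cost of a per-call conversion in every rule.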
Maybe this was closed prematurely; trying after #151:

```julia
julia> using Optimisers, Flux

julia> x = rand(Float32, 3) |> Flux.gpu
3-element MtlVector{Float32, Metal.MTL.MTLResourceStorageModePrivate}:
 0.42997134
 0.18900502
 0.7357338

julia> g = rand(Float32, 3) |> Flux.gpu;

julia> opt = Optimisers.setup(Optimisers.Descent(1e-3), x);

julia> Optimisers.update!(opt, x, g)  # OK, as above
(Leaf(Descent(0.001), nothing), Float32[0.74836177, 0.57043684, 0.4284932])

julia> opt_adam = Optimisers.setup(Flux.Adam(), x)  # Flux.Adam() always made Float64
Leaf(Adam(0.001, (0.9, 0.999), 1.0e-8), (Float32[0.0, 0.0, 0.0], Float32[0.0, 0.0, 0.0], (0.9, 0.999)))

julia> Optimisers.update!(opt_adam, x, g)  # above there's a typo, model isn't defined
ERROR: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#broadcast_kernel#26")(::Metal.mtlKernelContext, ::MtlDeviceVector{Float32, 1}, ::Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Tuple{Base.OneTo{Int64}}, typeof(-), Tuple{Base.Broadcast.Extruded{MtlDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}, Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Tuple{Base.OneTo{Int64}}, typeof(*), Tuple{Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Nothing, typeof(/), Tuple{Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Nothing, typeof(/), Tuple{Base.Broadcast.Extruded{MtlDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(-), Tuple{Int64, Float64}}}}, Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Nothing, typeof(+), Tuple{Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Nothing, typeof(sqrt), Tuple{Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}, Nothing, typeof(/), Tuple{Base.Broadcast.Extruded{MtlDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0}, Nothing, typeof(-), Tuple{Int64, Float64}}}}}}, Float32}}}}, Float32}}}}, ::Int64) resulted in invalid LLVM IR
Reason: unsupported use of double value
Reason: unsupported use of double value
 [23] materialize!(dest::Any, bc::Base.Broadcast.Broadcasted{<:Any})
    @ Base.Broadcast ./broadcast.jl:876 [inlined]
 [24] subtract!(x::MtlVector{…}, x̄::Base.Broadcast.Broadcasted{…})
    @ Optimisers ~/.julia/packages/Optimisers/TxzMn/src/interface.jl:103
 [25] _update!(ℓ::Optimisers.Leaf{…}, x::MtlVector{…}; grads::IdDict{…}, params::IdDict{…})
    @ Optimisers ~/.julia/packages/Optimisers/TxzMn/src/interface.jl:97
 [26] update!(::Optimisers.Leaf{…}, ::MtlVector{…}, ::MtlVector{…})
    @ Optimisers ~/.julia/packages/Optimisers/TxzMn/src/interface.jl:77
```
```
(jl_OTs5qt) pkg> st
Status `/private/var/folders/yq/4p2zwd614y59gszh7y9ypyhh0000gn/T/jl_OTs5qt/Project.toml`
  [587475ba] Flux v0.14.3 `https://github.com/FluxML/Flux.jl.git#master`
  [3bd65402] Optimisers v0.3.0
```

The problem isn't Lines 209 to 213 in 6a4f948:
```julia
julia> lazyg = Optimisers.apply!(Optimisers.Adam(), opt_adam.state, x, g)[2]
Base.Broadcast.Broadcasted{Metal.MtlArrayStyle{1}}(*, (Base.Broadcast.Broadcasted(/, (Base.Broadcast.Broadcasted(/, (Float32[0.17961349, 0.2150245, 0.1685136], Base.Broadcast.Broadcasted(-, (1, 0.8099999785423279)))), Base.Broadcast.Broadcasted(+, (Base.Broadcast.Broadcasted(sqrt, (Base.Broadcast.Broadcasted(/, (Float32[0.0009599409, 0.0013757595, 0.0008449606], Base.Broadcast.Broadcasted(-, (1, 0.9980010128617287)))),)), 1.0f-8)))), 0.001f0))

julia> x .= x .- lazyg
ERROR: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#broadcast_kernel#26")(::Metal.mtlKernelContext, ::MtlDeviceVector{Float32, 1}, ::Base.Broadcast.Broadcasted{Metal.MtlArrayStyle …
```
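The Float64s in that lazy broadcast are scalars like `1 - 0.8099999785423279`: the default betas are Float64, so the accumulated `βt` stays Float64 and leaks into the fused kernel even though every array is Float32. A minimal CPU illustration (assuming the defaults shown above):

```julia
beta = (0.9, 0.999)       # Adam's default betas are Float64 literals
betat = beta .* beta      # one accumulation step: (0.81, 0.998001), still Float64

typeof(1 - betat[1])           # Float64 -- poisons a Float32 GPU broadcast
typeof(1 - Float32(betat[1]))  # Float32 -- narrowing first keeps it clean
```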
The following lines using `Descent` run fine. Now with `Adam` instead: