
GPU error when using Zeros() as bias in Conv layer #1332

Closed
a-r-n-o-l-d opened this issue Sep 10, 2020 · 6 comments · Fixed by #1379

Comments

@a-r-n-o-l-d

When I try to use a Conv layer without a bias (Julia 1.5, CUDA.jl v1.3.3, Flux v0.11.1), I get an error on the GPU (everything works on the CPU).

Minimal example:

using Flux
using CUDA
using Flux: onehotbatch
using Flux: crossentropy
using Flux: convfilter, Zeros

CUDA.allowscalar(false)

batch = (Float32.(rand(32,32,3,4)), onehotbatch([8,5,9,2], 0:9)) |> gpu;
model = Chain(Conv(weight=convfilter((3,3), 3=>64), bias=Zeros()), 
	GlobalMaxPool(), flatten, Dense(64, 10), softmax) |> gpu;

loss(x, y) = crossentropy(model(x), y)
opt = ADAM()
ps = params(model);
gs = gradient(ps) do
	training_loss = loss(batch...)
	return training_loss
end
Flux.Optimise.update!(opt, ps, gs)

Error with the Julia debug level set to 2:

ERROR: a exception was thrown during kernel execution.
Stacktrace:
 [1] Bool at float.jl:73
 [2] convert at number.jl:7
 [3] setindex! at /home/afertin/.julia/packages/CUDA/dZvbp/src/device/array.jl:101
 [4] _setindex! at abstractarray.jl:1176
 [5] setindex! at abstractarray.jl:1153
 [6] broadcast_kernel at /home/afertin/.julia/packages/GPUArrays/eVYIC/src/host/broadcast.jl:62
ERROR: KernelException: exception thrown during kernel execution on device GeForce RTX 2080 Ti
Stacktrace:
 [1] check_exceptions() at /home/afertin/.julia/packages/CUDA/dZvbp/src/compiler/exceptions.jl:93
 [2] prepare_cuda_call() at /home/afertin/.julia/packages/CUDA/dZvbp/src/state.jl:85
 [3] context at /home/afertin/.julia/packages/CUDA/dZvbp/src/state.jl:142 [inlined]
 [4] cufunction(::GPUArrays.var"#broadcast_kernel#18", ::Type{Tuple{CUDA.CuKernelContext,CuDeviceArray{Bool,0,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{},typeof(+),Tuple{…}},Int64}}; name::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/afertin/.julia/packages/CUDA/dZvbp/src/compiler/execution.jl:293
 [5] cufunction(::GPUArrays.var"#broadcast_kernel#18", ::Type{Tuple{CUDA.CuKernelContext,CuDeviceArray{Bool,0,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{},typeof(+),Tuple{…}},Int64}}) at /home/afertin/.julia/packages/CUDA/dZvbp/src/compiler/execution.jl:293
 [6] #launch_heuristic#838 at /home/afertin/.julia/packages/CUDA/dZvbp/src/gpuarrays.jl:19 [inlined]
 [7] launch_heuristic at /home/afertin/.julia/packages/CUDA/dZvbp/src/gpuarrays.jl:17 [inlined]
 [8] copyto! at /home/afertin/.julia/packages/GPUArrays/eVYIC/src/host/broadcast.jl:66 [inlined]
 [9] copyto! at /home/afertin/.julia/packages/GPUArrays/eVYIC/src/host/broadcast.jl:76 [inlined]
 [10] materialize! at ./broadcast.jl:848 [inlined]
 [11] materialize!(::CuArray{Bool,0}, ::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{0},Nothing,typeof(+),Tuple{Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{0},Nothing,typeof(*),Tuple{Float64,CuArray{Bool,0}}},Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{0},Nothing,typeof(*),Tuple{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{0},Nothing,typeof(-),Tuple{Int64,Float64}},Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{0},Nothing,typeof(CUDA.culiteral_pow),Tuple{Base.RefValue{typeof(^)},CuArray{Float32,0},Base.RefValue{Val{2}}}}}}}}) at ./broadcast.jl:845
 [12] apply!(::ADAM, ::CuArray{Bool,0}, ::CuArray{Float32,0}) at /home/afertin/.julia/packages/Flux/05b38/src/optimise/optimisers.jl:176
 [13] update!(::ADAM, ::CuArray{Bool,0}, ::CuArray{Float32,0}) at /home/afertin/.julia/packages/Flux/05b38/src/optimise/train.jl:23
 [14] update!(::ADAM, ::Zygote.Params, ::Zygote.Grads) at /home/afertin/.julia/packages/Flux/05b38/src/optimise/train.jl:29
 [15] top-level scope at REPL[13]:1
@DrChainsaw
Contributor

I wonder if this has anything to do with fmap turning Zeros() into a single element vector or scalar.

Have you tried using bias=Zeros(64) instead?
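An untested sketch of that suggestion, reusing the model definition from the report above (only the bias argument changes):

# Untested: Zeros with an explicit length matching the 64 output channels of the filter
model = Chain(Conv(weight=convfilter((3,3), 3=>64), bias=Zeros(64)),
	GlobalMaxPool(), flatten, Dense(64, 10), softmax) |> gpu;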

If that doesn't work, then this is a separate issue:

julia> cc = Conv(ones(1,1,1,10), Flux.Zeros())
Conv((1, 1), 1=>10)

julia> cc(ones(1,1,1,10))

julia> cc(ones(1,1,1,10)) |> size
(1, 1, 10, 10)

julia> cc32 = Flux.f32(cc)
Conv((1, 1), 1=>10)

julia> cc32(ones(1,1,1,10)) |> size
ERROR: MethodError: no method matching reshape(::Float32, ::Int64, ::Int64, ::Colon, ::Int64)
Closest candidates are:
  reshape(::FillArrays.AbstractFill, ::Union{Colon, Int64}...) at C:\Users\echrska\.julia\packages\FillArrays\tE9Xq\src\FillArrays.jl:209
  reshape(::OffsetArrays.OffsetArray, ::Union{Colon, Int64}...) at C:\Users\echrska\.julia\packages\OffsetArrays\sUnpU\src\OffsetArrays.jl:234
  reshape(::AbstractArray, ::Union{Colon, Int64}...) at reshapedarray.jl:117
  ...
Stacktrace:
 [1] (::Conv{2,4,typeof(identity),Array{Float32,4},Float32})(::Array{Float64,4}) at C:\Users\echrska\.julia\packages\Flux\05b38\src\layers\conv.jl:145
 [2] top-level scope at REPL[19]:1

julia> cc32.bias
0.0f0
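The MethodError above appears to come from the layer's forward pass reshaping the bias: once f32 has turned the Zeros placeholder into a lone Float32, there is no matching reshape method. A minimal illustration, independent of Flux:

julia> reshape(zeros(Float32, 10), 1, 1, :, 1) |> size   # an array bias reshapes fine
(1, 1, 10, 1)

julia> reshape(0.0f0, 1, 1, :, 1)   # a scalar bias, like cc32.bias above, does not
ERROR: MethodError: no method matching reshape(::Float32, ::Int64, ::Int64, ::Colon, ::Int64)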

Happy to do a PR to fix this, but some direction would be helpful. I see the following alternatives:

  1. Throw an error if the bias has the wrong dimensions, even if it is Zeros. Obviously update the documentation.
    • Maybe keep the API as is and just silently change an empty Zeros to an instance with the right dimensions.
  2. Try to get fmap to somehow figure out the right dimensions of the new array when it sees a Zeros with no dimensions.
  3. Don't map Zeros.

I think 1) is the only feasible option of those (a rough sketch of such a check follows below), but I'd like some confirmation before I proceed with the PR.
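A rough sketch of that check; the helper name is hypothetical, and the Zeros behaviour assumed here is that of Flux v0.11, where an empty Zeros() has size ():

# Hypothetical constructor helper (not Flux's actual code): expand a dimensionless
# Zeros() to the right length and reject any other bias of the wrong size.
function check_conv_bias(bias, weight::AbstractArray{T,N}) where {T,N}
    nout = size(weight, N)                  # output channels of the conv filter
    bias isa Flux.Zeros && isempty(size(bias)) && return Flux.Zeros(nout)
    size(bias) == (nout,) || throw(DimensionMismatch(
        "expected a bias of length $nout, got one of size $(size(bias))"))
    return bias
end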

@DhairyaLGandhi
Member

The question is what reshaping it would imply. For this case it seems like some function is causing Zeros to materialize; if we can avoid that, we should be fine. For the OP as well, understanding which function causes the Zeros to materialize would be useful.
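For what it's worth, frames [12]–[14] of the stack trace above already hint at the answer: apply! and update! receive a CuArray{Bool,0}, which suggests that gpu materialized the Zeros() placeholder into a zero-dimensional CuArray that params then collected. A hypothetical check on the reporter's setup (model as defined in the example above):

julia> typeof(model[1].bias)   # expected to show a 0-dimensional CuArray rather than Flux.Zeros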

@DrChainsaw
Contributor

DrChainsaw commented Oct 28, 2020

It seems like it is possible to make Zeros behave like an array of zeros with a given shape. I thought that could be useful so that, in case it does materialize, it will at least be valid.

Preventing it from materializing is of course not mutually exclusive and should perhaps be done to the largest extent possible. Isn't there a risk, though, that it requires a lot of special treatment in packages which should not need to care about it, e.g. CUDA (and in the future ROCArrays, or whatever it ends up being called)?

In my example above it is Adapt.adapt which causes it to materialize. Implementing Adapt.adapt(T, x::Zeros) = x prevents it from materializing. Maybe this is an acceptable solution given what I understand to be the main mission of Adapt. I'm not sure whether it will prevent the CUDA error, but I can verify and fix that if/when I make the PR.
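A sketch of that rule using Adapt.jl's documented extension point, adapt_structure, rather than adapt itself (whether this is the right hook, and where it should live, is exactly the open question here):

import Adapt
# Sketch: leave the Zeros placeholder untouched when the model is adapted, e.g. moved
# to the GPU, so it never materializes into a real array that the optimiser could touch.
Adapt.adapt_structure(to, x::Flux.Zeros) = x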

In summary, I still think option 1 is good to prevent crashes in case it accidentally materializes, but we should also prevent it from materializing, at least in the mapping functions provided by Flux.

@DhairyaLGandhi
Member

In what kinds of areas do you think it would get in the way of CUDA? Preventing it from materializing is also so we don't accidentally train on it.

Adapt for the Zeros() case, together with throwing the error on incorrect dimensions, seems like the right sort of direction to take on this one.

@DrChainsaw
Contributor

preventing it from materializing is also so we don't accidentally train on it.

Ah, how could I not think about that!

Preventing them from materializing is of course the only valid option.

Throwing for incorrect dims (except Zeros, of course) is probably also good to catch errors earlier. As it is now, the error is thrown far from the cause.

I'll try to make a PR for this tonight unless someone else wants to do it.

@DhairyaLGandhi
Member

A PR would be good. We should define the adapt rule in Flux, if it's sufficient.
