
added support for SparseArray input in Dense layer #987

Closed
wants to merge 1 commit into from

Conversation

@ajprabhu09 commented Jan 6, 2020

Fixes issue #965.
The Dense layer is now able to take a SparseArray as input.

@MikeInnes
Member

Can you describe why this is needed a bit more?

It seems like it's the same definition that's already covered by (a::Dense)(x::AbstractArray), since sparse arrays are abstract arrays (and the method body hasn't changed).
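
For reference, a quick REPL check confirms that claim (a sketch; SparseMatrixCSC is the standard-library sparse matrix type):

using SparseArrays

# SparseMatrixCSC subtypes AbstractArray, so the existing
# (a::Dense)(x::AbstractArray) method already covers sparse inputs.
sparse(randn(2, 2)) isa AbstractArray           # true
SparseMatrixCSC{Float64,Int} <: AbstractMatrix  # true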

@ajprabhu09
Author

For some reason, taking gradients of a Dense layer with a dense Array as its weights and a SparseMatrix as input fails.


using Flux, SparseArrays

md = Dense(2, 2)                                    # dense weights
ms = Dense(sparse(randn(2, 2)), sparse(randn(2)))   # sparse weights
xd = randn(2, 2)                                    # dense input
xs = sparse(randn(2, 2))                            # sparse input

gradient(() -> sum(md(xd)), Flux.params(md))  # works
gradient(() -> sum(ms(xs)), Flux.params(ms))  # works
gradient(() -> sum(ms(xd)), Flux.params(ms))  # works
gradient(() -> sum(md(xs)), Flux.params(md))  # fails

I tried using sum(md.σ.(md.W * xs .+ md.b)) in its place, which works. Maybe it has something to do with BLAS calls and SparseArrays in the backward pass?
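
Written out as a runnable check, continuing the session above (md.W, md.b and md.σ are the field names of Dense at the time):

# The workaround from the comment above: applying the weights, bias and
# activation by hand instead of calling md(xs) avoids whatever the Dense
# call method does differently, and the gradient goes through.
gradient(() -> sum(md.σ.(md.W * xs .+ md.b)), Flux.params(md))  # works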

@MikeInnes
Member

Right you are, but this error actually comes from trying to convert the sparse array to Float32 to match the layer:

julia> gradient(x -> sum(Float32.(x)), sparse(randn(2)))
ERROR: MethodError: no method matching zero(::Type{Any})

Your patch avoids the conversion, but really we should just fix this error in Zygote.
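
For reference, a guess at what the patch looks like (the actual diff isn't shown here; this is a hypothetical reconstruction, with the same body as the AbstractArray method):

using Flux, SparseArrays

# Hypothetical sketch of the PR's method: dispatching on sparse arrays
# directly sidesteps the method that converts the input to the weights'
# element type, so no Float32.(x) cast of the sparse array is attempted.
(a::Dense)(x::SparseArrays.AbstractSparseArray) = a.σ.(a.W * x .+ a.b)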

@ajprabhu09
Author

So what should be defined for zero(::Type{Any})? As I understand it, the zero function is simply missing for the type Any?

@MikeInnes
Member

Well, at some point we're calling zero(Any). That can't be right, so we need to dig in and find out why it's happening.
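
One way to start digging (a sketch, assuming the error surfaces in the reverse pass):

using Zygote, SparseArrays

# Split the forward and reverse passes to see which one throws.
y, back = Zygote.pullback(x -> Float32.(x), sparse(randn(2)))
back(ones(Float32, 2))  # if zero(::Type{Any}) appears here, the broadcast
                        # pullback over the sparse array is the culprit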

@racinmat
Contributor

The main problem is caused by mixing Float32 and Float64, which is what leads to the zero(Any) call.
When the types are consistent, it works:

W = randn(Float32, 2, 2)
b = randn(Float32, 2)
md = Dense(W, b)
xs = sparse(randn(Float32, 2, 2))   # Float32 everywhere
gradient(() -> sum(md(xs)), Flux.params(md))

W = randn(Float64, 2, 2)
b = randn(Float64, 2)
md = Dense(W, b)
xs = sparse(randn(Float64, 2, 2))   # Float64 everywhere
gradient(() -> sum(md(xs)), Flux.params(md))

Both of these pass.

But

W = randn(Float32, 2, 2)
b = randn(Float32, 2)
md = Dense(W, b)
xs = sparse(randn(Float64, 2, 2))   # Float64 input, Float32 weights
gradient(() -> sum(md(xs)), Flux.params(md))

W = randn(Float64, 2, 2)
b = randn(Float64, 2)
md = Dense(W, b)
xs = sparse(randn(Float32, 2, 2))   # Float32 input, Float64 weights
gradient(() -> sum(md(xs)), Flux.params(md))

both throw the above-mentioned error.
The main question is: how should we approach it? Flux uses Float32 by default, but Julia itself usually defaults to Float64, so it's very easy to run into this inconsistency.
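
Until that's resolved, one workaround (a sketch; broadcasting Float32 over a sparse array preserves sparsity) is to make the eltypes match before taking gradients:

using Flux, SparseArrays

md = Dense(2, 2)                    # Float32 parameters by default
xs = Float32.(sparse(randn(2, 2)))  # still sparse, now eltype Float32
gradient(() -> sum(md(xs)), Flux.params(md))  # eltypes match, so this works
# Alternatively, convert the model instead, e.g. with Flux.f64(md),
# if your Flux version provides f64.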

@racinmat
Contributor

Zygote itself handles this fine; see:

W = randn(Float32, 2, 2)
b = randn(Float32, 2)
xs = sparse(randn(Float32, 2, 2))
gradient(() -> sum(W*xs .+ b), Params([W,b]))

W = randn(Float32, 2, 2)
b = randn(Float32, 2)
xs = sparse(randn(Float64, 2, 2))
gradient(() -> sum(W*xs .+ b), Params([W,b]))

W = randn(Float64, 2, 2)
b = randn(Float64, 2)
xs = sparse(randn(Float32, 2, 2))
gradient(() -> sum(W*xs .+ b), Params([W,b]))

W = randn(Float64, 2, 2)
b = randn(Float64, 2)
xs = sparse(randn(Float64, 2, 2))
gradient(() -> sum(W*xs .+ b), Params([W,b]))

All of them pass, so I don't think this is a problem with Zygote.

@racinmat
Contributor

Hm, it seems the conversion is the problem. The Dense layer explicitly casts the input data to the parameters' element type in https://github.com/FluxML/Flux.jl/blob/master/src/layers/basic.jl#L138.
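
For context, the linked method is roughly the following (a paraphrased sketch, not the verbatim source; see the link for the exact code):

# When the input's element type doesn't match the Float32/Float64 weights,
# the layer calls itself again on a converted copy T.(x); that broadcasted
# conversion is what fails for sparse inputs.
(a::Dense{<:Any,W})(x::AbstractArray{<:AbstractFloat}) where {T <: Union{Float32,Float64}, W <: AbstractArray{T}} =
  a(T.(x))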

And Zygote can't take the pullback through that cast; replicating it by hand, the following code


W = randn(Float32, 2, 2)
b = randn(Float32, 2)
xs = sparse(randn(Float64, 2, 2))   # Float32 weights, Float64 sparse input
gradient(() -> sum(W*Float32.(xs) .+ b), Params([W,b]))

W = randn(Float64, 2, 2)
b = randn(Float64, 2)
xs = sparse(randn(Float32, 2, 2))   # Float64 weights, Float32 sparse input
gradient(() -> sum(W*Float64.(xs) .+ b), Params([W,b]))

crashes. The question is: should this be fixed in Zygote or in Flux?

@ajprabhu09
Author

I guess if mixed-precision computation were supported, this wouldn't be an issue.

@racinmat
Contributor

I opened an issue on Zygote with an MWE; we'll see: FluxML/Zygote.jl#810

@ToucheSir
Member

Since this turned out to be about eltype mismatches instead of sparse array support, I think it can be safely closed.
