Error taking gradients through REN with nx = 0 and/or nv = 0 #99

Closed · nic-barbara opened this issue on Jul 12, 2023 · 2 comments · Fixed by #101
Assignees: nic-barbara
Labels: bug (Something isn't working)

nic-barbara (Member) commented:

When either nx = 0 or nv = 0, the REN contains zero-dimensional arrays. This is fine for the forward pass, but back-propagation raises an error due to inconsistent gradient dimensions. This is related to JuliaLang/julia#28866.

A minimal example is as follows.

using Flux
using Random
using RobustNeuralNetworks

"""
Test that backpropagation runs and parameters change
"""
batches = 10
nu, nx, nv, ny = 4, 5, 0, 2
γ = 10
ren_ps = LipschitzRENParams{Float64}(nu, nx, nv, ny, γ)
model = DiffREN(ren_ps)

# Dummy data
us = randn(nu, batches)
ys = randn(ny, batches)
data = [(us[:,k], ys[:,k]) for k in 1:batches]   # per-sample (vector) pairs

# Dummy loss function just for testing
function loss(m, u, y)
    x0 = init_states(m, size(u,2))
    x1, y1 = m(x0, u)
    return Flux.mse(y1, y) + sum(x1.^2)
end

# Debug batch updates (the gradient call below throws the error)
opt_state = Flux.setup(Adam(0.01), model)
gs = Flux.gradient(loss, model, us, ys)
Flux.update!(opt_state, model, gs[1])

The error is caused by broadcasting addition with bias vectors in the evaluation of the REN. We need to find a work-around.
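
The failure can apparently be reproduced in isolation. Here is a minimal sketch (my reduction, not taken from the REN code itself), assuming the bias broadcast inside the REN boils down to adding an empty vector to a zero-row matrix with a batch dimension:

using Zygote

b = zeros(0)        # empty bias vector, as when nv = 0
X = randn(0, 10)    # zero-row pre-activations with batch size 10

b .+ X                           # forward pass broadcasts fine, returns a 0×10 matrix
gradient(b -> sum(b .+ X), b)    # DimensionMismatch in the pullback

Because length(b) == 0 matches the length of the (0, 10) cotangent, Zygote's unbroadcast appears to skip the sum over the batch dimension and hands ChainRulesCore a (0, 10) gradient to project onto a size-(0,) vector, which matches the error below. One possible guard (a hypothetical helper, not necessarily the fix in #101) would be to skip the broadcast when the bias is empty, e.g. add_bias(X, b) = isempty(b) ? X : b .+ X.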

nic-barbara self-assigned this on Jul 12, 2023
nic-barbara added the bug label on Jul 12, 2023
nic-barbara linked pull request #101 on Jul 12, 2023 (will close this issue)
nic-barbara (Member, Author) commented:

Error message:

ERROR: DimensionMismatch("variable with size(x) == (0,) cannot have a gradient with size(dx) == (0, 10)")
Stacktrace:
  [1] (::ChainRulesCore.ProjectTo{AbstractArray, NamedTuple{(:element, :axes), Tuple{ChainRulesCore.ProjectTo{Float64, NamedTuple{(), Tuple{}}}, Tuple{Base.OneTo{Int64}}}}})(dx::Matrix{Float64})
    @ ChainRulesCore ~/.julia/packages/ChainRulesCore/a4mIA/src/projection.jl:227
  [2] _project
    @ ~/.julia/packages/Zygote/SuKWp/src/compiler/chainrules.jl:189 [inlined]
  [3] unbroadcast(x::Vector{Float64}, x̄::Matrix{Float64})
    @ Zygote ~/.julia/packages/Zygote/SuKWp/src/lib/broadcast.jl:59
  [4] #1179
    @ ~/.julia/packages/Zygote/SuKWp/src/lib/broadcast.jl:83 [inlined]
  [5] map
    @ ./tuple.jl:222 [inlined]
  [6] #1178
    @ ~/.julia/packages/Zygote/SuKWp/src/lib/broadcast.jl:83 [inlined]
  [7] #3734#back
    @ ~/.julia/packages/ZygoteRules/OgCVT/src/adjoint.jl:71 [inlined]
  [8] Pullback
    @ ~/.julia/dev/RobustNeuralNetworks/src/Wrappers/REN/ren.jl:80 [inlined]
...

nic-barbara (Member, Author) commented:

This error is only raised when us and ys are matrices rather than vectors in the minimal example.
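
For comparison, a hedged sketch (continuing the minimal example above, and assuming the model accepts single-sample vector inputs, as this comment implies): taking gradients sample-by-sample over the per-sample pairs in data runs without error.

# Per-sample vector inputs back-propagate fine; only the batched
# (matrix) broadcast over an empty bias appears to trigger the bug.
for (u, y) in data
    Flux.gradient(loss, model, u, y)    # no DimensionMismatch
end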
