Fix adjoint Iterators.product behavior with nothing #1170

lxvm · 2022-02-18T20:32:24Z

Hi,
This PR fixes the issue illustrated below where a DimesionMismatch occurs depending on the position of nothing arguments to the pullback of Iterators.product.

using Zygote

function aget()
    y, back = Zygote._pullback(Iterators.product, 1:5, 1:3, 1:2)
end

function atest0()
    y, back = aget()
    t0 = [(i, j, k) for i in 1:5, j in 1:3, k in 1:2]
    back(t0)
    # OK
end
function atest1()
    y, back = aget()
    t1 = [(nothing, j, k) for i in 1:5, j in 1:3, k in 1:2]
    back(t1)
    # ERROR: DimensionMismatch("new dimensions (3,) must be consistent with array size 5")
end
function atest2()
    y, back = aget()
    t2 = [(i, nothing, k) for i in 1:5, j in 1:3, k in 1:2]
    back(t2)
    # ERROR: DimensionMismatch("new dimensions (2,) must be consistent with array size 3")
end
function atest3()
    y, back = aget()
    t3 = [(i, j, nothing) for i in 1:5, j in 1:3, k in 1:2]
    back(t3)
    # OK
end
function atest4()
    y, back = aget()
    t4 = [(nothing, nothing, k) for i in 1:5, j in 1:3, k in 1:2]
    back(t4)
    # ERROR: DimensionMismatch("new dimensions (2,) must be consistent with array size 5")
end
function atest5()
    y, back = aget()
    t5 = [(nothing, j, nothing) for i in 1:5, j in 1:3, k in 1:2]
    back(t5)
    # ERROR: DimensionMismatch("new dimensions (3,) must be consistent with array size 5")
end
function atest6()
    y, back = aget()
    t6 = [(i, nothing, nothing) for i in 1:5, j in 1:3, k in 1:2]
    back(t6)
    # OK
end
function atest7()
    y, back = aget()
    t7 = [(nothing, nothing, nothing) for i in 1:5, j in 1:3, k in 1:2]
    back(t7)
    # OK
end

I hope this PR enforces the desired behavior of the pullback in this edge case, but I don't know enough about Zygote to understand where the nothing arguments are originating from (I noticed this bug while trying to differentiate some other code).
I can reference:

With this PR, the results of the tests defined above are:

julia> atest0()
(nothing, [6.0, 12.0, 18.0, 24.0, 30.0], [10.0, 20.0, 30.0], [15.0, 30.0])

julia> atest1()
(nothing, nothing, [10.0, 20.0, 30.0], [15.0, 30.0])

julia> atest2()
(nothing, [6.0, 12.0, 18.0, 24.0, 30.0], nothing, [15.0, 30.0])

julia> atest3()
(nothing, [6.0, 12.0, 18.0, 24.0, 30.0], [10.0, 20.0, 30.0], nothing)

julia> atest4()
(nothing, nothing, nothing, [15.0, 30.0])

julia> atest5()
(nothing, nothing, [10.0, 20.0, 30.0], nothing)

julia> atest6()
(nothing, [6.0, 12.0, 18.0, 24.0, 30.0], nothing, nothing)

julia> atest7()
(nothing, nothing, nothing, nothing)

mcabbott

This looks great, thanks!

Could you add (at least some of) these tests to the tests?

If you can invent a function whose gradient produces this error, that would also be nice to have in the tests.

src/lib/array.jl

Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>

lxvm · 2022-02-22T12:47:32Z

Thank you for reviewing the change!
I've added some tests which should check for the expected behavior.

I can't seem to come up with a mwe of a gradient that leads to this error.
When I find some more time I will try to create one, because I'm not sure where the nothing came from and I believe it is causing my gradients to vanish.
Is this pull request adequate?

The context in which this error arose is in computing a gradient for an electromagnetic simulation (specifically this one: https://github.com/lxvm/DeltaRCWA.jl/blob/1c445ab8d72129350de275ab782283d7ebac221e/scripts/zygote.jl#L41) where I used a product iterator to construct a matrix (the stacktrace from the gradient pointed to here: https://github.com/lxvm/DeltaRCWA.jl/blob/1c445ab8d72129350de275ab782283d7ebac221e/src/sheets.jl#L312).
I am confused about the reason for the error.

mcabbott · 2022-02-22T13:22:02Z

I see, the nothing is from here:

julia> gradient(k⃗ -> k⃗[2], (1,2,3))
((nothing, 1.0, nothing),)

and so a way to get the wrong answer is:

julia> gradient(x -> sum([y[2] * y[3] for y in Iterators.product(x, x, x, x)]), [1,2,3,4])
([224.0, 288.0, 352.0, 416.0],)

julia> ForwardDiff.gradient(x -> sum([y[2] * y[3] for y in Iterators.product(x, x, x, x)]), [1,2,3,4])
4-element Vector{Int64}:
 320
 320
 320
 320

julia> gradient(x -> sum(y[2] * y[3] for y in Iterators.product(x, x, x, x)), [1,2,3,4])  # different path, misses the rule?
([320.0, 320.0, 320.0, 320.0],)

(@v1.8) pkg> st Zygote
Status `~/.julia/environments/v1.8/Project.toml`
  [e88e6eb3] Zygote v0.6.34

Thanks for tracking this down!

lxvm · 2022-02-22T19:15:44Z

Thank you for noticing those things!
I'll see if I can find out why my code calls the rule by using the debugger.

test/lib/array.jl

Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>

lxvm · 2022-02-22T23:45:24Z

The last test looks good!

mcabbott · 2022-02-23T00:39:08Z

Thanks!

Fix adjoint Iterators.product behavior with nothing

999e28f

mcabbott reviewed Feb 19, 2022

View reviewed changes

src/lib/array.jl Outdated Show resolved Hide resolved

lxvm and others added 2 commits February 21, 2022 15:01

Apply suggestions from code review

d7b8b43

Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>

add Iterators.product adjoint tests

748be56

mcabbott reviewed Feb 22, 2022

View reviewed changes

test/lib/array.jl Outdated Show resolved Hide resolved

Update test/lib/array.jl

99445ba

Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>

mcabbott merged commit 8b9d67d into FluxML:master Feb 23, 2022

lxvm mentioned this pull request Jan 2, 2024

Improve adjoint for product and zip #1489

Merged

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix adjoint Iterators.product behavior with nothing #1170

Fix adjoint Iterators.product behavior with nothing #1170

lxvm commented Feb 18, 2022

mcabbott left a comment

lxvm commented Feb 22, 2022

mcabbott commented Feb 22, 2022

lxvm commented Feb 22, 2022

lxvm commented Feb 22, 2022

mcabbott commented Feb 23, 2022

Fix adjoint Iterators.product behavior with nothing #1170

Fix adjoint Iterators.product behavior with nothing #1170

Conversation

lxvm commented Feb 18, 2022

mcabbott left a comment

Choose a reason for hiding this comment

lxvm commented Feb 22, 2022

mcabbott commented Feb 22, 2022

lxvm commented Feb 22, 2022

lxvm commented Feb 22, 2022

mcabbott commented Feb 23, 2022