Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fill can't handle nothing as a recieved gradient #660

Closed
wants to merge 1 commit into from

Conversation

dsweber2
Copy link
Contributor

I was using fill as a way of indexing into another matrix, e.g. x[fill(1,N)...], and this causes (nothing,) to be handed to the pullback of fill. As all the gradient does is sum at the moment, this causes errors trying to sum nothing. I address this by checking the type, and passing on any received nothing, and if the entries are mixed it ignores any nothings in the sum.

@MikeInnes
Copy link
Member

Why would fill get a tuple of nothings as its gradient? That seems fishy and probably points to a different issue.

nothing (not a tuple) is a valid gradient, but the @adjoint macro should already be handling that.

@dsweber2
Copy link
Contributor Author

here's a minimal example of what I was doing that caused the problem.

function example(x,N)
    ax = axes(x)
    extraAxe = ax[2+N:end]
    filledLoc = fill(1, N)
    return x[:, filledLoc..., extraAxe...]
end
y, back = pullback(example, randn(5,3,4,3), 2)
back(y)

which attempts to sum (nothing, nothing). Not sure where the code for it is, but it does make sense that the adjoint of splatting could result in multiple nothings grouped together.

@MikeInnes
Copy link
Member

Ok, thanks for the clarification, I see what's happening. Unfortunately this is two things conspiring, the nothing gradient from indexing and the tuple-ing of those nothings from the splat (so easy to forget that iteration means you have to support tuples as a gradient). I think you already understood that but just so it's spelt out.

The best way to resolve this is probably to use accum_sum which handles nothings; I'll think about it bit I suspect we can't avoid passing the tuple of nothings here.

As an aside, this is quite a strange use of Fill; it would be more typical to use ntuple here since the compiled code will get more benefit from knowing the size of filledLoc rather than its particular value. This is not to say we shouldn't support it, of course.

@dsweber2
Copy link
Contributor Author

dsweber2 commented Jun 1, 2020

it would be more typical to use ntuple

Good point, I should get more in the habit of using tuples. Thanks for looking into it.

@CarloLucibello
Copy link
Member

closing as stale

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants