-
-
Notifications
You must be signed in to change notification settings - Fork 211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add project
utility
#1087
Add project
utility
#1087
Conversation
The conclusion of SciML/SciMLSensitivity.jl#493 is that RecursiveArrayTools needs a 3-line update to be fully compatible with ChainRulesCore 1.0. If there are other issues, it would be helpful to describe and try to isolate them.
Yes, Zygote still allows complex gradients. If you think something is wrong with them, you'll need to provide an example. I think you're proposing to re-break #342 etc, which isn't a great idea. |
#342 has a pretty complete discussion about why the gradients are expected and correct, and with projection, it is possible to project it into the space of numbers the users want, which can be different for different cases. If the gradients are complex, the gradients are complex. Beyond compat with RAT, it shouldn't be the AD that forces projection, but something either the compiler determines via promotion/ conversion or the user opts in via calling project explicitly. The discussion in SciML/SciMLSensitivity.jl#493 also suggests that Zygote should be compatible with the broadcasting mechanism as RAT defines it since it satisfies the Base definition, even if by using fallbacks, which seems reasonable especially considering we were able to AD through it without special handling. |
What 3 lines? I'll just add them.
I thought the current issue was that broadcasting VectorOfArray produces an Array. This is the default behavior of AbstractArray objects though, so that's what I find confusing. If you define an AbstractArray object and just define indexing, |
Maybe we can discuss RAT on an issue there? The "3 lines" I mean are here: SciML/SciMLSensitivity.jl#493 (comment) . The reason projection wants to be involved in broadcasting is that |
This PR seems like it does at least 3 different things. It should be broken up into several thing so each can be reviewed. @ChrisRackauckas and @mcabbott should finish talking about ProjectTo for RecursiveArrayTools in SciML/SciMLSensitivity.jl#493
@DhairyaLGandhi , @mcabbott and I should talk about this in a video call. But trying to talk this out in writing this thread is going to leads to a lot of time spent to reach a resolution that can be reached in a 30 minute call. |
Let's jump on a call in 5 minutes to flesh this out? I think it would be good to get some answers. |
closing as stale |
This PR revives working with RecursiveArrayTools (ref SciML/SciMLSensitivity.jl#493), and as discussed with @ChrisRackauckas , AD systems shouldn't be projecting internally, so we can reinstate RAT/ DiffEqSensitivity.
It is not known ahead of time where in the call stack a custom adjoint may be used or a custom type, we should allow for custom adjoints to manipulate code as they see fit.
Another change is that we allow complex gradients. These work most of the time, and the times they don't can be considered for bug fixes. I have a some methods which do technically do the same thing as
_project
in the accumulation stage rather than adding copies with every invocation of an adjoint.