New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AutoDiff] [SR-14218] Correctly propagate tangent vectors of inout parameters from functions with multiple basic blocks #37861
Conversation
@swift-ci Please test |
Build failed |
@swift-ci Please test macOS platform |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
@@ -2052,6 +2052,8 @@ bool PullbackCloner::Implementation::run() { | |||
SmallVector<SILValue, 8> retElts; | |||
// This vector will contain all indirect parameter adjoint buffers. | |||
SmallVector<SILValue, 4> indParamAdjoints; | |||
// This vector will identify the locations where initialization is needed. | |||
SmallVector<bool, 8> outputsToInitialize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to use a SmallBitVector
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, I've gone ahead and done so.
@swift-ci Please test |
@swift-ci Please test Windows platform |
1 similar comment
@swift-ci Please test Windows platform |
@swift-ci Please test |
Build failed |
@swift-ci Please test macOS platform |
@swift-ci Please test Windows platform |
Build failed |
@swift-ci Please test macOS |
Build failed |
@swift-ci Please test macOS |
Build failed |
@swift-ci Please test macOS platform |
Build failed |
@swift-ci Please test macOS platform |
Build failed |
@swift-ci Please test macOS platform |
Build failed |
@swift-ci Please test macOS platform |
Apparently, the parameter index calculation in apple#37861 was not always correct in presence of other pullback parameters (e.g. captures and non-differentiated args). Collect all inout parameters and collect inout parameter adjoints correctly. Fixes SR-15891
Pullback generation for functions featuring active inout parameters and multiple basic blocks currently does not correctly propagate tangent vectors for the inout parameters. Changes to the tangent vectors of the inout parameters are not being copied back at the end of pullbacks involving control flow. As a workaround, to get correct tangent vectors you currently need to manually copy tangent vectors to an outside holding variable, then manually copy those tangent vectors back in before the function call site.
This pull request adds the missing copies of tangent vectors for inout parameters at the end of pullbacks, leading to correct tangent vectors around functions using inout parameters and control flow. Tests have been added for common permutations of these functions that previously produced incorrect tangent vectors.
In one large differentiable physics simulator with a complex model, this eliminates 80 lines of code for manual tangent vector copies. It also improves gradient descent throughput by 80% due to the removal of overhead from these explicit tangent vector copies.
Most of the credit for this fix belongs to @dan-zheng, who provided the original code to correctly copy these tangent vectors and worked with me to refine the fix.
Resolves SR-14053 and SR-14218.