-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Fix derivatives of norm(p=inf) #78105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Following up on #51099 (comment), we fix these derivatives, as they were incorrect until now. As described in the note, the better solution would be to use vectorised operations on the preprocessing operation when reducing on CPU. It's not clear how difficult that may be. Fixes #67517 [ghstack-poisoned]
🔗 Helpful links
✅ No Failures (0 Pending)As of commit 0055821 (more details on the Dr. CI page): Expand to see more💚 💚 Looks good so far! There are no failures yet. 💚 💚 This comment was automatically generated by Dr. CI (expand for details).Please report bugs/suggestions to the (internal) Dr. CI Users group. |
@ngimel I improved the explanation for why we have to do that trick in I also simplified the formula for the derivative. That's for @albanD to have a look. |
fwiw, this PR fixes some issues that were uncovered in the top PR of the stack after fixing a the forward AD for the |
Following up on #51099 (comment), we fix these derivatives, as they were incorrect until now. As described in the note, the better solution would be to use vectorised operations on the preprocessing operation when reducing on CPU. It's not clear how difficult that may be. Fixes #67517 [ghstack-poisoned]
Following up on #51099 (comment), we fix these derivatives, as they were incorrect until now. As described in the note, the better solution would be to use vectorised operations on the preprocessing operation when reducing on CPU. It's not clear how difficult that may be. Fixes #67517 [ghstack-poisoned]
@pytorchbot merge this |
Hey @lezcano. |
Summary: Following up on #51099 (comment), we fix these derivatives, as they were incorrect until now. As described in the note, the better solution would be to use vectorised operations on the preprocessing operation when reducing on CPU. It's not clear how difficult that may be. Fixes #67517 Pull Request resolved: #78105 Approved by: https://github.com/ngimel Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/0c8c39fa715155190c51016fad5bdfc459ed80b3 Reviewed By: mehtanirav Differential Revision: D36668395 fbshipit-source-id: 5c3d887729279fdcd6dd0e47c4ea6372096a280b
Stack from ghstack:
Following up on #51099 (comment), we fix these derivatives, as they were incorrect until now.
As described in the note, the better solution would be to use vectorised operations on the preprocessing operation when reducing on CPU. It's not clear how difficult that may be.
Fixes #67517