Add batched grad testing to OpInfo #50818
Conversation
Stack from ghstack:

This PR does two things:

1. Add batched grad testing to OpInfo.
2. Improve the error message from `gradcheck` when batched gradient computation fails, so that it includes suggestions for workarounds.

To add batched grad testing to OpInfo, this PR:

- Adds new `check_batched_grad=True` and `check_batched_gradgrad=True` attributes to OpInfo. These are True by default because we expect most operators to support batched gradient computation.
- If `check_batched_grad=True`, then `test_fn_grad` invokes gradcheck with `check_batched_grad=True`.
- If `check_batched_gradgrad=True`, then `test_fn_gradgrad` invokes gradgradcheck with `check_batched_grad=True`.

The improved gradcheck error message looks like the following when an exception is thrown while computing batched gradients: https://gist.github.com/zou3519/5a0f46f908ba036259ca5e3752fd642f

Future

- Sometime in the not-so-near future, we will separate "batched grad testing" from "gradcheck" for the purposes of OpInfo, both to make the testing more granular and so that we can test that the vmap fallback doesn't get invoked (currently, batched gradient testing only tests that the output values are correct).

Test Plan:

- Run tests: `pytest test/test_ops.py -v -k "Gradients"`

ghstack-source-id: 4bb191e
Differential Revision: D25997703
Pull Request resolved: #50818
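To make the two checks described above concrete, here is a minimal sketch of the invocations, using `torch.sin` as a stand-in operator (the operator and inputs are illustrative assumptions, not taken from this PR):

```python
import torch
from torch.autograd import gradcheck, gradgradcheck

# Numerical gradient checks are typically run in double precision.
x = torch.randn(3, 3, dtype=torch.double, requires_grad=True)

# Roughly what test_fn_grad does when the OpInfo has check_batched_grad=True:
# compare analytical vs. numerical gradients, and additionally compute the
# gradients under vmap and compare them against the regular ones.
gradcheck(torch.sin, (x,), check_batched_grad=True)

# Roughly what test_fn_gradgrad does when check_batched_gradgrad=True:
# the same idea, one derivative order higher.
gradgradcheck(torch.sin, (x,), check_batched_grad=True)
```

An operator that does not yet support batched gradients would opt out in its OpInfo entry, e.g. by passing `check_batched_grad=False` to the OpInfo constructor.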
Looks good to me.
One question I have about all these functions for which the batched grad check fails: should we make these hard errors for now, to make sure they don't silently return wrong gradients?
My understanding is that they are hard errors -- are they actually soft errors in the code right now? (Is that what happens when gradcheck runs with raise_exception=False?)
They are hard errors when you run gradcheck, because the gradients don't match (from my understanding; maybe I'm wrong here).
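For context on the `raise_exception` question above, a sketch of the difference, assuming standard `gradcheck` behavior: with the default `raise_exception=True`, a failed check raises an exception (carrying the improved message from this PR when the batched-grad portion fails), while `raise_exception=False` reports the same failure as a `False` return value instead:

```python
import torch
from torch.autograd import gradcheck

x = torch.randn(2, dtype=torch.double, requires_grad=True)

# Default ("hard") mode: any failed check raises an exception.
gradcheck(torch.exp, (x,), check_batched_grad=True)

# "Soft" mode: a failure would be reported as a False return value instead.
ok = gradcheck(torch.exp, (x,), check_batched_grad=True, raise_exception=False)
assert ok  # torch.exp supports batched gradients, so this check passes
```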