[jvpvjp] Batch norm coverage with decomposition #877

samdow · 2022-06-15T15:25:38Z

Adds the decomposition from #675 for forward-over-reverse coverage. As with layer norm, we needed to recompute the mean and variance so that autograd propagates properly (sad), and we needed to return tensors of zeros instead of None (sad).
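For readers unfamiliar with the "recompute" workaround, here is a minimal sketch of the idea, not the code from this PR. It rebuilds the per-channel mean and inverse standard deviation from the input using differentiable ops, so the statistics take part in the autograd graph. The helper name `recompute_batch_stats`, the `(N, C, ...)` layout, and the default `eps` are assumptions made for illustration.

```python
import torch


def recompute_batch_stats(x: torch.Tensor, eps: float = 1e-5):
    """Recompute per-channel mean and inverse std for an (N, C, ...) input.

    Hypothetical helper for illustration; not the functorch decomposition.
    """
    # Reduce over every dimension except the channel dimension (dim 1).
    reduce_dims = [d for d in range(x.dim()) if d != 1]
    mean = x.mean(dim=reduce_dims)
    # Training-mode batch norm normalizes with the biased variance.
    var = x.var(dim=reduce_dims, unbiased=False)
    invstd = torch.rsqrt(var + eps)
    return mean, invstd


x = torch.randn(8, 3, 4, dtype=torch.float64, requires_grad=True)
mean, invstd = recompute_batch_stats(x)
```

Because `mean` and `invstd` are computed from `x` here rather than passed in as saved tensors, they carry `requires_grad=True` and gradients can flow back through them.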

zou3519 · 2022-06-17T20:26:26Z

functorch/_src/decompositions.py

+        grad_weight = torch.zeros_like(weight)  # should be None but doesn't work with vjp
+    else:
+        grad_weight = torch.zeros(())  # should be None but doesn't work with vjp
+
+    if output_mask[2]:
+        grad_bias = grad_output_sum
+    else:
+        grad_bias = torch.zeros_like(grad_output_sum)  # should be None but doesn't work with vjp


sad, but it is what it is
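The zeros-instead-of-None pattern in the snippet above can be sketched in isolation as follows. This is an illustrative reconstruction, not the decomposition itself: the helper name `affine_param_grads`, the `x_hat` argument (the normalized input), the `(input, weight, bias)` ordering of `output_mask`, and the `weight is not None` branch are all assumptions.

```python
import torch


def affine_param_grads(grad_output, x_hat, weight, output_mask):
    """Gradients for batch norm's weight and bias, never returning None.

    Hypothetical helper; output_mask is assumed to be ordered
    (input, weight, bias).
    """
    # Reduce over every dimension except the channel dimension (dim 1).
    reduce_dims = [d for d in range(grad_output.dim()) if d != 1]
    grad_output_sum = grad_output.sum(dim=reduce_dims)

    if output_mask[1]:
        grad_weight = (grad_output * x_hat).sum(dim=reduce_dims)
    elif weight is not None:
        # Placeholder tensor where None would otherwise be returned.
        grad_weight = torch.zeros_like(weight)
    else:
        grad_weight = torch.zeros(())

    if output_mask[2]:
        grad_bias = grad_output_sum
    else:
        grad_bias = torch.zeros_like(grad_output_sum)
    return grad_weight, grad_bias


grad_output = torch.ones(2, 3, 4)
x_hat = torch.randn(2, 3, 4)
weight = torch.ones(3)
gw, gb = affine_param_grads(grad_output, x_hat, weight, [True, False, True])
```

Every output is a tensor regardless of the mask, which is what the inline comments say vjp needs. The cost is an extra allocation for gradients nobody asked for.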

…hen input requires grad"

We want to avoid having to recompute saved_mean and saved_invstd in batch_norm_backward's decomposition in functorch (see pytorch/functorch#877), but also avoid unnecessarily computing forward grads for saved_mean and saved_invstd when they are not needed.

Tested locally with: `python test/test_ops.py -k test_jvpvjp_nn_functional_batch_norm`

Issues:
- not sure if gradgrad in core is missing something, but it is able to pass while the fwgrad_bwgrad comparison fails in functorch

[ghstack-poisoned]
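As context for what a jvpvjp (forward-over-reverse) check exercises, here is a small sketch that pushes a tangent through a VJP computation. It assumes PyTorch 2.x, where `torch.func.jvp` and `torch.func.vjp` are available. It uses a hand-written `normalize` function built from primitive ops as a stand-in for batch norm, so it is not the actual test from `test_ops.py`.

```python
import torch
from torch.func import jvp, vjp

torch.manual_seed(0)


def normalize(x, eps=1e-5):
    # Per-channel normalization of an (N, C) input using only primitive ops.
    mean = x.mean(dim=0)
    var = ((x - mean) ** 2).mean(dim=0)
    return (x - mean) * torch.rsqrt(var + eps)


def vjp_of_normalize(x, cotangent):
    # Reverse mode: pull the cotangent back through normalize.
    _, vjp_fn = vjp(normalize, x)
    return vjp_fn(cotangent)[0]


x = torch.randn(4, 3, dtype=torch.float64)
cotangent = torch.randn(4, 3, dtype=torch.float64)
tangent = torch.randn(4, 3, dtype=torch.float64)

# Forward over reverse: differentiate the VJP in the direction `tangent`.
grad_x, grad_x_tangent = jvp(
    lambda inp: vjp_of_normalize(inp, cotangent), (x,), (tangent,)
)
```

Here `grad_x` is the ordinary VJP result and `grad_x_tangent` is its directional derivative along `tangent`. This works because the mean and variance are computed inside the differentiated function, which is the same reason the decomposition recomputes them.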

facebook-github-bot added the cla signed label Jun 15, 2022

samdow force-pushed the batch_norm_decomp branch from e072fd0 to 44727e7 Compare June 15, 2022 17:52

batch norm forward over reverse coverage with decomposition

825f439

samdow force-pushed the batch_norm_decomp branch 2 times, most recently from 3b68942 to 825f439 Compare June 16, 2022 16:23

zou3519 approved these changes Jun 17, 2022

zou3519 reviewed Jun 17, 2022

samdow merged commit 347334c into main Jun 17, 2022

soulitzer mentioned this pull request Jul 11, 2022

Update batch norm to compute forward grads for saved_mean and saved_var when input requires grad pytorch/pytorch#81293

Closed

zou3519 pushed a commit to zou3519/pytorch that referenced this pull request Jul 20, 2022

[functorch] batch norm forward over reverse coverage with decomposition (pytorch/functorch#877)

d8c020d

bigfootjon pushed a commit to pytorch/pytorch that referenced this pull request Jul 21, 2022

[functorch] batch norm forward over reverse coverage with decomposition (pytorch/functorch#877)

a178624
