TST Adds module test comparing output of cpu and gpu #64694
Conversation
🔗 Helpful links
💊 CI failures summary and remediations. As of commit b6713c5 (more details on the Dr. CI page):
🕵️ 3 new failures recognized by patterns. The following CI failures do not appear to be due to upstream breakages:
Codecov Report
```diff
@@            Coverage Diff             @@
##           master   #64694      +/-   ##
==========================================
- Coverage   66.69%   66.68%   -0.01%
==========================================
  Files         718      718
  Lines       92693    92694       +1
==========================================
- Hits        61817    61814       -3
- Misses      30876    30880       +4
```
test/test_modules.py (Outdated)

```python
module_inputs_cpu = module_info.module_inputs_func(module_info, device="cpu", dtype=dtype,
                                                   requires_grad=True)


def make_leafs_input(items):
```
This is for cases where the inputs are not leaves. For example, in NLLLoss the forward_input args are not leaves:

```python
import torch
from torch.testing import make_tensor

input = make_tensor((15, 10), device='cpu', dtype=torch.float32, requires_grad=True).log_softmax(dim=1)
input.is_leaf
# False
```
Why do you need them to be leaves?
Also, the only way for them not to be leaves is if they require gradients, so this is not a problem for gradcheck.
This PR checks for cpu+gpu parity of .grad, and backward() only accumulates gradients in the leaves:

```python
import torch
from torch.testing import make_tensor
import torch.nn as nn

input = (make_tensor((15, 10), device='cpu', dtype=torch.float32, requires_grad=True)
         .log_softmax(dim=1))
print("input.requires_grad:", input.requires_grad)
target = torch.empty(15, device='cpu').uniform_().mul(10).floor().long()
nll = nn.NLLLoss()
loss = nll(input, target)
loss.backward(retain_graph=True)
print("input.grad:", input.grad)
# input.requires_grad: True
# input.grad: None
```
In that case,

```python
for item in items:
    if isinstance(item, torch.Tensor):
        item.retain_grad()
```
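For reference, a minimal standalone sketch (not code from the PR) showing that calling retain_grad() on the non-leaf input from the NLLLoss example above makes its .grad populated after backward():

```python
import torch
import torch.nn as nn
from torch.testing import make_tensor

# Non-leaf input: log_softmax builds a graph on top of the leaf created by make_tensor.
input = (make_tensor((15, 10), device='cpu', dtype=torch.float32, requires_grad=True)
         .log_softmax(dim=1))
target = torch.empty(15, device='cpu').uniform_().mul(10).floor().long()

# Ask autograd to also keep the gradient on this non-leaf tensor.
input.retain_grad()

loss = nn.NLLLoss()(input, target)
loss.backward()

print("input.is_leaf:", input.is_leaf)            # False
print("input.grad is None:", input.grad is None)  # False: retain_grad() kept the gradient
```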
Looking pretty good, thanks for the update! A few comments below.
test/test_modules.py (Outdated)

```python
def _make_leafs(item):
    if isinstance(item, dict):
        for i in item.values():
            _make_leafs(i)
    elif isinstance(item, tuple):
        for i in item:
            _make_leafs(i)
    else:
        if not isinstance(item, torch.Tensor) or item.is_leaf:
            return
        old_requires_grad = item.requires_grad
```
Hm, this is interesting. Is this needed because of NLLLoss's ModuleInputs? Looks like the input there is created with requires_grad=True then log_softmax() is called, creating a graph which breaks the test.
I wonder if we want this somewhere more general to enforce that all tensors with requires_grad=True coming from module_inputs_funcs are guaranteed to have an empty graph. Wouldn't this be needed for the gradcheck tests as well?
Also, if we do make it more general and call it automagically, it probably should handle lists as well.
> Wouldn't this be needed for the gradcheck tests as well?

gradcheck will check the input and call retain_grad (pytorch/torch/autograd/gradcheck.py, line 663 in c12df2d):

```python
inp.retain_grad()
```
I updated this function to do the same.
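For illustration, a hedged sketch of what such a helper could look like if generalized to also handle lists, as suggested earlier; the name _retain_grad_recursive is hypothetical and not taken from the PR:

```python
import torch

def _retain_grad_recursive(item):
    # Hypothetical generalization: walk dicts, tuples, and lists and call
    # retain_grad() on every tensor that requires grad but is not a leaf,
    # mirroring gradcheck's inp.retain_grad().
    if isinstance(item, dict):
        for value in item.values():
            _retain_grad_recursive(value)
    elif isinstance(item, (tuple, list)):
        for value in item:
            _retain_grad_recursive(value)
    elif isinstance(item, torch.Tensor) and item.requires_grad and not item.is_leaf:
        item.retain_grad()
```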
test/test_modules.py (Outdated)

```python
# === Compare forward output between cpu and gpu ===
with freeze_rng_state():
    cpu_output = cpu_module(*cpu_forward_args, **cpu_forward_kwargs)
with freeze_rng_state():
```
nit: I don't think you need this second freeze_rng_state call
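For context on this nit, here is a simplified sketch of the save/restore behavior a freeze_rng_state-style context manager provides (an illustration under that assumption, not the actual implementation in torch.testing._internal.common_utils). Because the state is restored on exit, code after the first with block already starts from the same RNG state, which is why a second wrapper may be unnecessary:

```python
import contextlib
import torch

@contextlib.contextmanager
def freeze_rng_state_sketch():
    # Save the RNG state on entry and restore it on exit, so code after the
    # `with` block starts from the same state as code inside it did.
    cpu_state = torch.get_rng_state()
    cuda_state = torch.cuda.get_rng_state() if torch.cuda.is_available() else None
    try:
        yield
    finally:
        torch.set_rng_state(cpu_state)
        if cuda_state is not None:
            torch.cuda.set_rng_state(cuda_state)
```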
test/test_modules.py (Outdated)

```python
module_inputs_cpu = module_info.module_inputs_func(module_info, device="cpu", dtype=dtype,
                                                   requires_grad=True)


def _retain_grad(item):
```
Do you mind adding a comment explaining the reason this is necessary?
Depending on what the issue is in #64444 (comment), this PR may also cause issues.
@thomasjpfan FYI a
@jbschlosser I wonder if that is why the number of threads is set to one in pytorch/torch/testing/_internal/common_nn.py (lines 5895 to 5897 in 0561e10).
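The referenced lines are not reproduced here, but assuming they pin the CPU thread count, the public API for that is torch.set_num_threads; a minimal, hedged illustration:

```python
import torch

# Assumption: restricting CPU ops to a single thread removes run-to-run
# variation from parallel reduction order, which can help when comparing
# CPU results against CUDA results with tight tolerances.
torch.set_num_threads(1)
```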
Yeah, good find, we should probably replicate that for the
Finished in #68097
Summary: Continuation of #64694; fixes issues with the diff there
Reviewed By: mruberry
Differential Revision: D32300650
Pulled By: jbschlosser
fbshipit-source-id: f3a5e72b019d4eddd7202854999eab61fffc9006
[ghstack-poisoned]
Continues #61935
This PR adds:
- test_cpu_gpu_results_are_equal, which checks the results from cpu and cuda.
- allow_devices to modules, to filter tests without skipping.

Note this PR does not include gradgrad checks, to keep the PR smaller.
cc @albanD @mruberry @jbschlosser @walterddr
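As a rough standalone illustration of the kind of check such a parity test performs (a simplified sketch using a plain nn.Linear rather than the ModuleInfo machinery from the PR; names and structure here are assumptions, not the actual test code):

```python
import torch
import torch.nn as nn

def check_cpu_gpu_parity():
    # Build the same module and input on CPU, then copy both to CUDA.
    torch.manual_seed(0)
    cpu_module = nn.Linear(10, 5)
    cpu_input = torch.randn(15, 10, requires_grad=True)

    gpu_module = nn.Linear(10, 5).cuda()
    gpu_module.load_state_dict(cpu_module.state_dict())
    gpu_input = cpu_input.detach().cuda().requires_grad_(True)

    # Forward parity.
    cpu_out = cpu_module(cpu_input)
    gpu_out = gpu_module(gpu_input)
    torch.testing.assert_close(cpu_out, gpu_out.cpu())

    # Backward parity: gradients on the inputs and parameters should match too.
    cpu_out.sum().backward()
    gpu_out.sum().backward()
    torch.testing.assert_close(cpu_input.grad, gpu_input.grad.cpu())
    for p_cpu, p_gpu in zip(cpu_module.parameters(), gpu_module.parameters()):
        torch.testing.assert_close(p_cpu.grad, p_gpu.grad.cpu())

if torch.cuda.is_available():
    check_cpu_gpu_parity()
```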