[NVFuser] RuntimeError: ref_id_it != replayed_concrete_ids_.vector().end() INTERNAL ASSERT FAILED #84510
The previous error was resolved; however, the assert fires again when I change the shape of `x`:

```python
import torch

torch._C._jit_set_nvfuser_single_node_mode(True)
torch._C._debug_set_autodiff_subgraph_inlining(False)
torch.manual_seed(0)

def func(x, y, z):
    return (x + y) ** z

func_script = torch.jit.script(func)

x = torch.rand([300, 1, 1, 1, 1], device="cuda").requires_grad_()
y = torch.rand([1, 1, 1, 4], device="cuda")
z = torch.rand([1, 1, 1, 1], device="cuda")

for i in range(10):
    res = func(x, y, z)
    grad = torch.autograd.grad(res, x, torch.ones_like(res))[0]
    res_script = func_script(x, y, z)
    grad_script = torch.autograd.grad(res_script, x, torch.ones_like(res_script))[0]
    print(f"{i}: max_result_error {(res_script-res).abs().max()}, max_grad_error {(grad_script-grad).abs().max()}")
```

Output:
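For context, the failing graph mixes a 5-d and a 4-d operand. A minimal eager-mode sketch (CPU only, no TorchScript; shapes taken from the repro above) shows the broadcast result shape the fuser has to handle:

```python
import torch

# Same shapes as the repro, but on CPU and in eager mode only.
x = torch.rand(300, 1, 1, 1, 1)
y = torch.rand(1, 1, 1, 4)   # right-aligned to [1, 1, 1, 1, 4] for broadcasting
z = torch.rand(1, 1, 1, 1)

res = (x + y) ** z
print(res.shape)  # torch.Size([300, 1, 1, 1, 4])
```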
Versions

I'm using torch nightly, and here is all the version information.
Related issue: aiqm/torchani#628

```python
import torch

def func(x, y, z, w):
    ret = (x + y) * z * w * 2
    return ret

device = "cuda"
x = torch.rand([360, 1, 1, 1, 1], device=device).requires_grad_()
y = torch.rand([1, 1, 1, 4], device=device)
z = torch.rand([1, 1, 1, 1], device=device)
w = torch.rand([1, 1, 8, 1], device=device)
```
Looks like this bug is already fixed in the latest nvfuser, likely by csarofeen#2517; you should see it in the next upstream push. cc @jjsjann123

This is what I get:
@jjsjann123 when will this fix be in a PyTorch release? We have users reporting errors due to this issue on PyTorch 2.0. Currently our workaround is to disable NVFuser.
I don't think we'll actually try to patch this in upstream PyTorch. nvfuser is being deprecated in TorchScript at this point, and upstream has already switched to NNC as the default TorchScript fuser. If you are stuck with PyTorch 2.0, manually disabling nvfuser so that NNC is used sounds like a good way to get unblocked. If you can move to a PyTorch nightly and are feeling exploratory, you can try patching the nvfuser runtime with the nvfuser PyPI package: basically, pip install a PyTorch nightly (choose your CUDA version).

This should hot-swap the nvfuser library that ships with upstream PyTorch. 🤞
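For anyone stuck on an affected release, here is a minimal sketch of the "disable nvfuser, fall back to NNC" workaround mentioned above. It assumes the private TorchScript toggle shipped in PyTorch 2.0-era builds; later releases removed nvfuser from TorchScript entirely, hence the `hasattr` guard:

```python
import torch

# Turn nvfuser off so TorchScript falls back to NNC. The toggle is a private
# API from PyTorch 2.0-era builds; newer releases may not have it at all.
if hasattr(torch._C, "_jit_set_nvfuser_enabled"):
    torch._C._jit_set_nvfuser_enabled(False)
else:
    print("nvfuser toggle not present; this build no longer ships nvfuser in TorchScript")
```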
Thank you for this update!
🐛 Describe the bug
Run with:

Error message:
cc @ngimel @jjsjann123 @zasdfgbnm
Versions
The latest PyTorch nightly.