MPS bug on torch.transpose and torch.log #89673

Comments
I am observing a very similar issue (probably the same root cause) with the following code, which leads to completely wrong results on MPS:

```python
import torch

print(f'Running PyTorch version: {torch.__version__}')

dtype = torch.float32
devices = [torch.device("mps"), torch.device("cpu")]
for device in devices:
    print(f"Using device: {device}")
    source = torch.randn(3, 1088, 2048, dtype=dtype, device=device)
    print("source: ", source.shape, source.dtype, source.device,
          source.cpu().numpy().flatten().min(), source.cpu().numpy().flatten().max())
    target = torch.clamp(torch.moveaxis(source, 0, -1), 0.0, 1.0)
    print("clamp(moveaxis(source)): ", target.shape, target.dtype, target.device,
          target.cpu().numpy().flatten().min(), target.cpu().numpy().flatten().max())
    target = torch.moveaxis(torch.clamp(source, 0.0, 1.0), 0, -1)
    print("moveaxis(clamp(source)): ", target.shape, target.dtype, target.device,
          target.cpu().numpy().flatten().min(), target.cpu().numpy().flatten().max())
```
As of version 2.0.0, my issue is fixed.
There are some other (possibly related) issues, so I won't close this issue.
🐛 Describe the bug
The following code uses `torch.transpose` and `torch.log` to compute the loss value. However, it seems that the application order of these two functions produces different results.
Here is my result:
At least, I think the loss values (`torch.nn.NLLLoss`) should not be negative, because `torch.nn.Softmax` is applied. In addition, the loss values after 10,000 epochs on CPU and on MPS (avoiding the buggy path) are different.
I wonder why this difference happens.
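The original repro code is not reproduced above, but the two claims can be checked independently. A hedged CPU-only sketch (shapes and names are my own assumptions, not the author's): `log` is element-wise, so applying it before or after `transpose` must give identical values, and `NLLLoss` over `log(softmax(...))` is non-negative because log-probabilities are at most 0:

```python
import torch

# log is element-wise and transpose only permutes axes,
# so the two orderings must agree on any correct backend.
x = torch.rand(4, 5) + 0.1  # strictly positive so log is finite
a = torch.log(torch.transpose(x, 0, 1))
b = torch.transpose(torch.log(x), 0, 1)
assert torch.equal(a, b)

# NLLLoss over log-softmax output is non-negative:
# softmax probabilities are <= 1, so their logs are <= 0,
# and NLLLoss negates the picked log-probabilities.
logits = torch.randn(4, 5)
log_probs = torch.log(torch.nn.Softmax(dim=1)(logits))
targets = torch.tensor([0, 1, 2, 3])
loss = torch.nn.NLLLoss()(log_probs, targets)
assert loss.item() >= 0.0
```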
Versions
On my Apple M2 MacBook Air:
cc @kulinseth @albanD @malfet @DenisVieriu97 @razarmehr @abhudev