
Error calculating loss RuntimeError: Native API failed. Native API returns: -50 (PI_ERROR_INVALID_ARG_VALUE) -50 (PI_ERROR_INVALID_ARG_VALUE) #335

Closed
turbobuilt opened this issue Apr 24, 2023 · 5 comments
Labels
ARC ARC GPU Crash Execution crashes

Comments

@turbobuilt

Hi all,

I'm not an expert on anything Intel or compiler related. I have this weird bug and don't know what to do. I'm running PyTorch on Ubuntu 22 with an Arc A770.

Traceback (most recent call last):
File "/home/dev/projects/proj/sub/ndb.py", line 398, in
loss = criterion(output, target)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1190, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/loss.py", line 536, in forward
return F.mse_loss(input, target, reduction=self.reduction)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 3292, in mse_loss
return torch._C._nn.mse_loss(expanded_input, expanded_target, _Reduction.get_enum(reduction))
RuntimeError: Native API failed. Native API returns: -50 (PI_ERROR_INVALID_ARG_VALUE) -50 (PI_ERROR_INVALID_ARG_VALUE)

Any ideas? This happens when computing the loss. I tried making the world's simplest linear layer. I also get these warnings when optimizing:

/usr/local/lib/python3.10/dist-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension:
warn(f"Failed to load image Python extension: {e}")
/usr/local/lib/python3.10/dist-packages/intel_extension_for_pytorch/frontend.py:447: UserWarning: For XPU device, the split master weight is unsupported for now, so temp to disable it
warnings.warn("For XPU device, the split master weight is unsupported for now, so temp to disable it")
/usr/local/lib/python3.10/dist-packages/intel_extension_for_pytorch/frontend.py:457: UserWarning: For XPU device to save valuable device memory, temp to do optimization on inplaced model, so make inplace to be true
warnings.warn(
/usr/local/lib/python3.10/dist-packages/intel_extension_for_pytorch/frontend.py:464: UserWarning: For XPU, the weight prepack and sample input are disabled. The onednn layout is automatically chosen to use
warnings.warn(
/usr/local/lib/python3.10/dist-packages/intel_extension_for_pytorch/optim/_optimizer_utils.py:250: UserWarning: Does not suport fused step for <class 'torch.optim.adam.Adam'>, will use non-fused step
warnings.warn("Does not suport fused step for " + str(type(optimizer)) + ", will use non-fused step")
file_1.size() torch.Size([1, 1787747])

I installed PyTorch as sudo, so I'm not sure if that causes an issue. Any ideas what this error is about?

@turbobuilt
Author

Wow, I have found that this is caused by enabling bfloat16. Not sure why.

@jingxu10 jingxu10 added ARC ARC GPU Crash Execution crashes labels Apr 24, 2023
@jingxu10
Contributor

Could you share sample code that could help us reproduce this error?

@turbobuilt
Author

turbobuilt commented Apr 24, 2023

OK, well, I had to do this before running the data through the model:

    batch = batch.bfloat16()
    target = target.bfloat16()

And it got fixed!

@turbobuilt
Author

These were the lines causing an issue:

    model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)

    with torch.xpu.amp.autocast(enabled=True, dtype=torch.bfloat16):
        output = model(batch)

@tristan-k

I followed the Linux install guide and I'm having the same issue on Ubuntu 22.04 (kernel 6.5) with a UHD 730 (i5-11400). Not sure how to fix this.
