
Fix broken linalg unittests on ARM platform #125438

Open
malfet opened this issue May 2, 2024 · 2 comments

Comments

@malfet
Contributor

malfet commented May 2, 2024

🐛 Describe the bug

If test_linalg.py is run on a Mac M1, 4 tests fail:

======================================================================
ERROR: test_vector_norm_cpu_bfloat16 (__main__.TestLinalgCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_utils.py", line 2757, in wrapper
    method(*args, **kwargs)
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_device_type.py", line 432, in instantiated_test
    raise rte
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_device_type.py", line 419, in instantiated_test
    result = test(self, **param_kwargs)
  File "/Users/nshulga/git/pytorch/pytorch-tmp/test/test_linalg.py", line 1289, in test_vector_norm
    run_test_case(
  File "/Users/nshulga/git/pytorch/pytorch-tmp/test/test_linalg.py", line 1262, in run_test_case
    result_dtype_reference = vector_norm_reference(input, ord, dim=dim, keepdim=keepdim, dtype=norm_dtype)
  File "/Users/nshulga/git/pytorch/pytorch-tmp/test/test_linalg.py", line 1246, in vector_norm_reference
    result = torch.linalg.norm(input_maybe_flat, ord, dim=dim, keepdim=keepdim, dtype=dtype)
RuntimeError: Found dtype Float but expected BFloat16

To execute this test, run the following from the base repo dir:
     python test/test_linalg.py -k test_vector_norm_cpu_bfloat16

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

======================================================================
ERROR: test_vector_norm_cpu_float16 (__main__.TestLinalgCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_utils.py", line 2757, in wrapper
    method(*args, **kwargs)
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_device_type.py", line 432, in instantiated_test
    raise rte
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_device_type.py", line 419, in instantiated_test
    result = test(self, **param_kwargs)
  File "/Users/nshulga/git/pytorch/pytorch-tmp/test/test_linalg.py", line 1289, in test_vector_norm
    run_test_case(
  File "/Users/nshulga/git/pytorch/pytorch-tmp/test/test_linalg.py", line 1262, in run_test_case
    result_dtype_reference = vector_norm_reference(input, ord, dim=dim, keepdim=keepdim, dtype=norm_dtype)
  File "/Users/nshulga/git/pytorch/pytorch-tmp/test/test_linalg.py", line 1246, in vector_norm_reference
    result = torch.linalg.norm(input_maybe_flat, ord, dim=dim, keepdim=keepdim, dtype=dtype)
RuntimeError: Found dtype Float but expected Half

To execute this test, run the following from the base repo dir:
     python test/test_linalg.py -k test_vector_norm_cpu_float16

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

======================================================================
ERROR: test_vector_norm_cpu_float32 (__main__.TestLinalgCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_utils.py", line 2757, in wrapper
    method(*args, **kwargs)
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_device_type.py", line 432, in instantiated_test
    raise rte
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_device_type.py", line 419, in instantiated_test
    result = test(self, **param_kwargs)
  File "/Users/nshulga/git/pytorch/pytorch-tmp/test/test_linalg.py", line 1289, in test_vector_norm
    run_test_case(
  File "/Users/nshulga/git/pytorch/pytorch-tmp/test/test_linalg.py", line 1262, in run_test_case
    result_dtype_reference = vector_norm_reference(input, ord, dim=dim, keepdim=keepdim, dtype=norm_dtype)
  File "/Users/nshulga/git/pytorch/pytorch-tmp/test/test_linalg.py", line 1246, in vector_norm_reference
    result = torch.linalg.norm(input_maybe_flat, ord, dim=dim, keepdim=keepdim, dtype=dtype)
RuntimeError: Found dtype Double but expected Float

To execute this test, run the following from the base repo dir:
     python test/test_linalg.py -k test_vector_norm_cpu_float32

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

======================================================================
FAIL: test_addmv_cpu_float16 (__main__.TestLinalgCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_utils.py", line 2757, in wrapper
    method(*args, **kwargs)
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_device_type.py", line 419, in instantiated_test
    result = test(self, **param_kwargs)
  File "/Users/nshulga/git/pytorch/pytorch-tmp/test/test_linalg.py", line 5611, in test_addmv
    self._test_addmm_addmv(torch.addmv, t, m, v)
  File "/Users/nshulga/git/pytorch/pytorch-tmp/test/test_linalg.py", line 5576, in _test_addmm_addmv
    self.assertEqual(res1, res3)
  File "/Users/nshulga/git/pytorch/pytorch/torch/testing/_internal/common_utils.py", line 3640, in assertEqual
    raise error_metas.pop()[0].to_error(
AssertionError: Tensor-likes are not close!

Mismatched elements: 50 / 50 (100.0%)
Greatest absolute difference: 0.08056640625 at index (35,) (up to 0.001 allowed)
Greatest relative difference: 0.74755859375 at index (1,) (up to 0.001 allowed)

To execute this test, run the following from the base repo dir:
     python test/test_linalg.py -k test_addmv_cpu_float16

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

----------------------------------------------------------------------
Ran 1032 tests in 44.906s

FAILED (failures=1, errors=3, skipped=83)

Versions

CI

cc @mruberry @ZainRizvi @jianyuh @nikitaved @pearu @walterddr @xwang233 @lezcano @snadampal
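
For reference, the failing vector-norm call in the tracebacks above reduces to roughly the following. The shapes, `ord` value, and dtype combination here are assumptions inferred from the error messages, not the exact parameters `test_vector_norm` uses:

```python
# Minimal sketch (assumed shapes/ord): torch.linalg.norm with an explicit,
# wider output dtype, which is what vector_norm_reference passes in the test.
import torch

x = torch.randn(10, dtype=torch.float32)
# Requesting a float64 result from a float32 input; on the affected ARM
# builds this reportedly fails with "Found dtype Double but expected Float".
out = torch.linalg.norm(x, 2, dim=0, keepdim=False, dtype=torch.float64)
print(out.dtype)  # torch.float64 on unaffected builds
```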

@malfet malfet added module: tests Issues related to tests (not the torch.testing module) module: linear algebra Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul labels May 2, 2024
@drisspg drisspg added triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module module: arm Related to ARM architectures builds of PyTorch. Includes Apple M1 labels May 3, 2024
pytorchmergebot pushed a commit that referenced this issue May 3, 2024
Removes the obscure "Issue with numpy version on arm" skip added by #82213
and replaces it with 4 targeted skips:
 - test_addmv for `float16`
 - test_vector_norm for `float16`, `bfloat16` and `float32`

Followups to fix them are tracked in #125438
Pull Request resolved: #125377
Approved by: https://github.com/kit1980
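
A hedged sketch of what one of those targeted skips could look like; the actual decorator or condition used in #125377 may differ, and `IS_ARM64` is assumed here to be the platform check available in `torch.testing._internal.common_utils`:

```python
# Sketch only, not necessarily the mechanism PR #125377 used: skip the
# known-bad dtype/platform combination instead of skipping the whole file.
import unittest
import torch
from torch.testing._internal.common_utils import IS_ARM64

def test_addmv(self, device, dtype):
    if IS_ARM64 and dtype == torch.half:
        raise unittest.SkipTest("addmv is numerically off on ARM, see gh-125438")
    # ... original test body ...
```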
@Aidyn-A
Collaborator

Aidyn-A commented May 3, 2024

The same numerical mismatches in test_addmv_cpu_float16 were observed on Grace CPU (ARM).

@Aidyn-A
Collaborator

Aidyn-A commented May 3, 2024

To be fair, I was the last one to modify vector_norm, in #125175. However, I did not see the Found dtype X but expected Y errors on Grace; only test_addmv_cpu_float16 shows up.
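
For context, the addmv failure is a tolerance miss rather than a dtype error. A standalone check along these lines (shapes and seed are assumptions, not the harness's values) shows the kind of comparison that trips the ~1e-3 tolerance on the affected CPUs:

```python
# Standalone sketch of the comparison _test_addmm_addmv performs; the actual
# test compares several addmv variants, this just contrasts float16 against
# a float32 reference on CPU.
import torch

torch.manual_seed(0)
t = torch.randn(50, dtype=torch.float16)
m = torch.randn(50, 100, dtype=torch.float16)
v = torch.randn(100, dtype=torch.float16)

res_half = torch.addmv(t, m, v)
res_ref = torch.addmv(t.float(), m.float(), v.float()).half()

# On the affected ARM builds the reported max absolute difference (~0.08)
# exceeds the test's ~1e-3 tolerance.
print((res_half - res_ref).abs().max().item())
```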
