
Conversation

@Aidyn-A
Collaborator

@Aidyn-A Aidyn-A commented Dec 13, 2023

According to #107256 (comment), the ops tested in `test_schema_correctness` do not support `torch.float8_e4m3fn` yet. Until they are supported, it is best to skip the test.

cc @mruberry @ZainRizvi @yanbing-j @vkuzo @albanD @kadeng
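
For context, a minimal sketch of the kind of dtype guard this change implies, as a hedged illustration rather than the actual patch (the class and attribute names here are invented for the sketch; in the real suite the test is instantiated once per (device, dtype, op) by the OpInfo machinery):

import unittest
import torch

class SchemaCheckSkipSketch(unittest.TestCase):
    dtype = torch.float8_e4m3fn  # dtype under test (illustrative)

    def test_schema_correctness(self):
        # Hypothetical guard: bail out early for float8_e4m3fn,
        # whose ops lack the kernels the schema checker needs.
        if self.dtype == torch.float8_e4m3fn:
            self.skipTest("ops do not support torch.float8_e4m3fn yet")
        # ... exercise the op under SchemaCheckMode here ...

if __name__ == "__main__":
    unittest.main()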

@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Dec 13, 2023
@pytorch-bot

pytorch-bot bot commented Dec 13, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/115757

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 60e73bc with merge base f727bed:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@Aidyn-A Aidyn-A added module: tests Issues related to tests (not the torch.testing module) module: testing Issues related to the torch.testing module (not tests) module: floatx (formerly float8) For torch.float8_e5m2 and torch.float8_e4m3 and other sub 8-bit float types labels Dec 13, 2023
@albanD
Collaborator

albanD commented Dec 13, 2023

I'm confused, why would we skip tests that pass?

@Aidyn-A
Collaborator Author

Aidyn-A commented Dec 13, 2023

I'm confused, why would we skip tests that pass?

@albanD, it does not skip tests that pass; they all fail. I can add skips to each individual OpInfo instead, if that is what you mean.
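
For reference, a rough sketch of what that per-OpInfo alternative could look like, using the existing `DecorateInfo` skip mechanism (the class and test names match the failing test; which OpInfos would need the entry is an assumption):

import unittest
import torch
from torch.testing._internal.opinfo.core import DecorateInfo

# Skip entry that could be appended to an affected OpInfo's `skips` tuple,
# limiting the skip to this one test and dtype.
float8_skip = DecorateInfo(
    unittest.skip("float8_e4m3fn not supported yet"),
    "TestSchemaCheckModeOpInfo",
    "test_schema_correctness",
    dtypes=(torch.float8_e4m3fn,),
)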

@albanD
Collaborator

albanD commented Dec 13, 2023

But CI is green? :D

@Aidyn-A Aidyn-A requested a review from mruberry as a code owner December 14, 2023 20:00
@Aidyn-A
Collaborator Author

Aidyn-A commented Dec 14, 2023

But CI is green? :D

Right, it is green because CI has no float8-capable devices. We found this failure on an H100 machine.
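
A small probe along these lines could be used to check for fp8-capable hardware; the 8.9 (Ada) / 9.0 (Hopper) capability threshold reflects cuBLASLt's fp8 requirements and is an assumption here, not something this PR adds:

import torch

# Hypothetical capability probe: fp8 matmul kernels require compute
# capability 8.9 (Ada) or 9.0 (Hopper) and newer; standard CI runners
# are older than that, so fp8 paths never actually execute in CI.
def has_fp8_capable_device() -> bool:
    if not torch.cuda.is_available():
        return False
    major, minor = torch.cuda.get_device_capability()
    return (major, minor) >= (8, 9)

print("fp8-capable CUDA device present:", has_fp8_capable_device())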

@Aidyn-A Aidyn-A requested a review from soulitzer December 14, 2023 20:04
@mikaylagawarecki mikaylagawarecki added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Dec 14, 2023
@drisspg
Contributor

drisspg commented Dec 14, 2023

Yeah, we don't currently have any machines capable of testing fp8 in CI/CD. I think this makes sense. Just to confirm: this failed for you while running these tests on H100?

@Aidyn-A
Collaborator Author

Aidyn-A commented Dec 15, 2023

this failed for you while running these tests on H100?

That is right. We are seeing this failure on H100:

=================================== FAILURES ===================================
_ TestSchemaCheckModeOpInfoCUDA.test_schema_correctness_torch__scaled_mm_cuda_float8_e4m3fn _
Traceback (most recent call last):
  File "/home/user/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/home/user/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/user/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/home/user/python/dist-packages/torch/testing/_internal/common_utils.py", line 2536, in wrapper
    method(*args, **kwargs)
  File "/home/user/python/dist-packages/torch/testing/_internal/common_utils.py", line 2536, in wrapper
    method(*args, **kwargs)
  File "/home/user/python/dist-packages/torch/testing/_internal/common_device_type.py", line 428, in instantiated_test
    raise rte
  File "/home/user/python/dist-packages/torch/testing/_internal/common_device_type.py", line 415, in instantiated_test
    result = test(self, **param_kwargs)
  File "/home/user/python/dist-packages/torch/testing/_internal/common_device_type.py", line 945, in dep_fn
    return fn(slf, *args, **kwargs)
  File "/home/user/python/dist-packages/torch/testing/_internal/common_device_type.py", line 908, in test_wrapper
    return test(*args, **kwargs)
  File "/home/user/git/pytorch/test/test_schema_check.py", line 502, in test_schema_correctness
    op(sample.input, *sample.args, **sample.kwargs)
  File "/home/user/python/dist-packages/torch/testing/_internal/opinfo/core.py", line 1105, in __call__
    return self.op(*args, **kwargs)
  File "/home/user/python/dist-packages/torch/_subclasses/schema_check_mode.py", line 172, in __torch_dispatch__
    if any(
  File "/home/user/python/dist-packages/torch/_subclasses/schema_check_mode.py", line 173, in <genexpr>
    has_mutated(a, b, c)
  File "/home/user/python/dist-packages/torch/_subclasses/schema_check_mode.py", line 63, in has_mutated
    and bitwise_equal(before, after)
  File "/home/user/python/dist-packages/torch/_subclasses/schema_check_mode.py", line 52, in bitwise_equal
    return torch.allclose(lhs, rhs, equal_nan=True)
RuntimeError: "mul_cuda" not implemented for 'Float8_e4m3fn'

To execute this test, run the following from the base repo dir:
     python test/test_schema_check.py -k test_schema_correctness_torch__scaled_mm_cuda_float8_e4m3fn
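
Note the failure is not in `_scaled_mm` itself but in SchemaCheckMode's mutation detection: `bitwise_equal` compares the before/after copies of the float8 inputs with `torch.allclose`, whose `rtol * |other|` term needs an elementwise multiply that has no `Float8_e4m3fn` CUDA kernel. A minimal repro sketch (assumes a CUDA device and that the kernel is still missing):

import torch

# The schema checker's bitwise_equal reduces to this allclose call; the
# rtol * |rhs| term requires an elementwise mul, which is unimplemented
# for Float8_e4m3fn on CUDA (hence the error in the traceback above).
a = torch.ones(4, 4, device="cuda").to(torch.float8_e4m3fn)
torch.allclose(a, a, equal_nan=True)
# RuntimeError: "mul_cuda" not implemented for 'Float8_e4m3fn'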

Contributor

@drisspg drisspg left a comment


Thanks!

@Aidyn-A
Collaborator Author

Aidyn-A commented Dec 15, 2023

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 15, 2023
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: Check the merge workflow status here.

guilhermeleobas pushed a commit to guilhermeleobas/pytorch that referenced this pull request Dec 18, 2023
According to pytorch#107256 (comment), the ops tested in `test_schema_correctness` do not support `torch.float8_e4m3fn` yet. Until they are supported, it is best to skip the test.

Pull Request resolved: pytorch#115757
Approved by: https://github.com/drisspg
dmenig pushed a commit to dmenig/pytorch that referenced this pull request Dec 21, 2023
According to pytorch#107256 (comment), the ops tested in `test_schema_correctness` do not support `torch.float8_e4m3fn` yet. Until they are supported, it is best to skip the test.

Pull Request resolved: pytorch#115757
Approved by: https://github.com/drisspg