Conversation

H-Huang
Member

@H-Huang H-Huang commented May 17, 2023

Stack from ghstack (oldest at bottom):

TLDR

Fix decorator to re-enable 26+ distributed tests that were previously being skipped in CI

Explanation

As part of upstreaming UCC, we updated the backend test cases to also include "ucc".

backend_feature["gpu"] = {"nccl", "gloo", "ucc"}
backend_feature["cuda"] = {"nccl", "gloo", "ucc"}
backend_feature["ddp"] = {"nccl", "gloo", "ucc"}

In distributed tests we use a decorator which reads from this config and makes sure all backends are available on the system.

@require_backends_available(DistTestCases.backend_feature["gpu"])

However, UCC is not enabled by default for a subset of CI jobs, so the availability check fails and the entire test is skipped (even if the test is meant for nccl and the backend being tested is nccl).
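A minimal sketch of the failure mode, not the actual PyTorch code: `AVAILABLE`, `BACKEND`, and `require_backends_available_old` are hypothetical stand-ins, and availability is simulated with a plain set instead of querying `torch.distributed`. It assumes a machine where UCC was not built and nccl is the backend under test.

```python
import unittest

# Hypothetical stand-ins for the real environment (assumptions, not PyTorch APIs):
AVAILABLE = {"nccl", "gloo"}  # backends built on this machine; UCC is missing
BACKEND = "nccl"              # backend currently under test

backend_feature = {"gpu": {"nccl", "gloo", "ucc"}}


def require_backends_available_old(backends):
    """Sketch of the old behavior: skip unless every listed backend is available."""
    missing = sorted(b for b in backends if b not in AVAILABLE)
    if missing:
        return unittest.skip(f"backends not available: {missing}")
    return lambda func: func


class ExampleTest(unittest.TestCase):
    @require_backends_available_old(backend_feature["gpu"])
    def test_nccl_only(self):
        self.assertEqual(BACKEND, "nccl")


# Run the test case programmatically and collect the skip reasons.
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest).run(result)
skipped_reasons = [reason for _, reason in result.skipped]
```

Under these assumptions the test is skipped because of "ucc", even though it never uses that backend.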

The fix is to check only that the BACKEND being tested is available.

Changes

  • Change logic to only check if the current backend being used is available
  • Rename require_backends_available -> require_backend_is_available
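The changed logic can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `AVAILABLE` and `BACKEND` are hypothetical stand-ins for the real availability queries and the backend selected for the run, and the membership check against `backends` is an assumption about how the decorator treats tests that do not target the current backend.

```python
import unittest

# Hypothetical stand-ins for the real environment (assumptions, not PyTorch APIs):
AVAILABLE = {"nccl", "gloo"}  # backends built on this machine; UCC is missing
BACKEND = "nccl"              # backend currently under test

backend_feature = {"gpu": {"nccl", "gloo", "ucc"}}


def require_backend_is_available(backends):
    """Sketch: only the backend under test has to be available."""
    # Assumption: a test that does not list the current backend is skipped.
    if BACKEND not in backends:
        return unittest.skip(f"test does not target backend {BACKEND}")
    # Other backends in `backends` (for example "ucc") are not checked.
    if BACKEND not in AVAILABLE:
        return unittest.skip(f"backend {BACKEND} is not available")
    return lambda func: func


class ExampleTest(unittest.TestCase):
    @require_backend_is_available(backend_feature["gpu"])
    def test_nccl_only(self):
        self.assertEqual(BACKEND, "nccl")


# Run the test case programmatically and inspect the outcome.
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest).run(result)
```

With UCC missing but nccl present, the test now runs instead of being skipped.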

@pytorch-bot

pytorch-bot bot commented May 17, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/101704

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit c4db71a:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

H-Huang added a commit that referenced this pull request May 17, 2023
@H-Huang H-Huang added the "topic: not user facing" and "ciflow/trunk" labels May 17, 2023
@H-Huang H-Huang marked this pull request as ready for review May 17, 2023 19:44
Contributor

@rohan-varma rohan-varma left a comment

great catch~

@H-Huang
Member Author

H-Huang commented May 18, 2023

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

Labels

ciflow/trunk, Merged, topic: not user facing

3 participants