Error messages for kernel selection #90783

Closed
wants to merge 1 commit into from

mikekgfb
Contributor

Summary: Error messages for kernel selection

Test Plan: sandcastle & github

Differential Revision: D42008661

@pytorch-bot

pytorch-bot bot commented Dec 13, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90783

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e522f45:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D42008661

mikekgfb added a commit to mikekgfb/pytorch that referenced this pull request Dec 13, 2022
Summary:
Pull Request resolved: pytorch#90783

Error messages for kernel selection

Test Plan: sandcastle & github

Differential Revision: D42008661

fbshipit-source-id: 435eba80b509b4bc84143bad7ee9603820b8a152

mikekgfb added a commit to mikekgfb/pytorch that referenced this pull request Dec 13, 2022
Summary:
Pull Request resolved: pytorch#90783

Error messages for kernel selection

Test Plan: sandcastle & github

Differential Revision: D42008661

fbshipit-source-id: f9639ab716601d31429a8f2d94b848980f3ea08f

mikekgfb added a commit to mikekgfb/pytorch that referenced this pull request Dec 13, 2022
Summary:
Pull Request resolved: pytorch#90783

Error messages for kernel selection

Test Plan: sandcastle & github

Differential Revision: D42008661

fbshipit-source-id: 787f044eb09815eae16060ac5b6bcfec8d312976

Summary:
Pull Request resolved: pytorch#90783

Error messages for kernel selection

Test Plan: sandcastle & github

Differential Revision: D42008661

fbshipit-source-id: d71eb0526713f9095990feff75e4548a89094f70

@mikekgfb
Contributor Author

@pytorchbot merge

@pytorch-bot added the ciflow/trunk label Dec 14, 2022
@pytorchmergebot
Collaborator

Merge failed

Reason: PR #90783 has not been reviewed yet (Rule superuser)

Details for Dev Infra team Raised by workflow job

Contributor

@cpuhrsch left a comment

Why turn all these checks into warnings?

@albanD removed their request for review December 14, 2022 16:23
@mikekgfb
Contributor Author

Why turn all these checks into warnings?

Because today we report the first error from the first kernel. So if you disable mem_efficient and the Flash predicates do not pass, you still get told why the memory-efficient kernel does not work. That's straight up wrong: it's not actionable, and it's misleading.

So, for the case when there's no kernel, I changed it to dump all the reasons for all kernels as warnings, and then error out with "and hence you lose, and don't have an executable kernel".

This gives people something they can actually act on.

@drisspg had previously mentioned he could fix the current issue with some if-spaghetti logic to make sure we give the error for the enabled kernel, but also pointed out that it's just not scalable if we ever add more kernels, because of the combinatorial explosion.
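
For readers skimming the thread, here is a rough sketch of the "warn with every reason, then raise once" pattern described above. It is written in plain Python for brevity and is not the code in this PR; `Kernel`, `rejection_reasons`, `select_kernel`, and the "disabled at runtime" message are all hypothetical names chosen for illustration, and treating a disabled kernel as one more reported reason is an assumption.

```python
import warnings
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# All names here are hypothetical, for illustration only; this is not PyTorch's API.
Params = Dict[str, object]
# A predicate returns None if the kernel can run, else a human-readable reason.
Predicate = Callable[[Params], Optional[str]]


@dataclass
class Kernel:
    name: str
    enabled: bool = True
    predicates: List[Predicate] = field(default_factory=list)


def rejection_reasons(kernel: Kernel, params: Params) -> List[str]:
    """Return every reason this kernel cannot be used (empty list means usable)."""
    if not kernel.enabled:
        # Assumption: a disabled kernel is reported as its own reason.
        return ["disabled at runtime"]
    reasons = []
    for predicate in kernel.predicates:
        reason = predicate(params)
        if reason is not None:
            reasons.append(reason)
    return reasons


def select_kernel(kernels: List[Kernel], params: Params) -> str:
    """Pick the first usable kernel; if none is usable, warn with all reasons, then raise."""
    collected: List[str] = []
    for kernel in kernels:
        reasons = rejection_reasons(kernel, params)
        if not reasons:
            return kernel.name
        collected.extend(f"{kernel.name}: {reason}" for reason in reasons)
    # Only reached when no kernel is usable: report everything, then fail once.
    for message in collected:
        warnings.warn(message)
    raise RuntimeError("No available kernel; see the preceding warnings.")
```

With this shape, adding another kernel means adding one entry to the list rather than another branch in the error-reporting logic, which is the scalability point raised above.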

@mikekgfb
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Collaborator

Merge failed

Reason: 2 additional jobs have failed, first few of them are: trunk, trunk / linux-focal-rocm5.2-py3.8 / test (default, 1, 2, linux.rocm.gpu)

Details for Dev Infra team Raised by workflow job

@mikekgfb
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Labels
ciflow/trunk Trigger trunk jobs on your pull request fb-exported Merged

4 participants