reland fast TypeMeta/ScalarType conversion #45544
Differential Revision: [D24006482](https://our.internmc.facebook.com/intern/diff/D24006482) [ghstack-poisoned]
Hi @jeffdaily - I'm hoping you can advise me on next steps on a mysterious ROCm error this PR is triggering. (@ngimel tells me you're the person to ask in such situations :) You can see the failure in this log. Here's the relevant excerpt (the "HEY" messages are me bisecting - the position of the initial MIOpen error oddly precedes any actual op invocation, although the python output might just be out of sync with what's getting thrown out of MIOpen):
Hoping there are some obvious next things to try - I'm new to troubleshooting ROCm errors, so any guidance would be super appreciated (and happy to provide any details on the PR that might be useful). Thanks!
Hi @bhosmer. I'll take a look at this today. And yes, @sunway513 and I are the best points of contact for ROCm issues.
Thank you Jeff! If there's any info I can supply to help troubleshoot, of course just LMK :)
@bhosmer A few observations while I'm still digging. Looking over the diff for this PR, I'm not completely familiar with the need (or lack) of various C10 macros such as C10_EXPORT, C10_TYPENAME_CONSTEXPR, but I do see between the original and new code some inconsistency in their use. Perhaps we're getting some different behavior between the host compiler (gcc) and our device compiler hipcc (hip-clang). Looking at the top of c10/util/TypeIndex.h, gcc will take the and our hipcc will take the |
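To make the host/device concern above concrete, here is a minimal sketch of how one header can compile differently under gcc and a clang-based compiler such as hip-clang. The names `DEMO_BRANCH`, `DEMO_CONSTEXPR`, `demo_type_index`, and `demo_branch` are hypothetical and chosen only for illustration. The condition shown is an assumption, not the actual logic at the top of c10/util/TypeIndex.h.

```cpp
// Hypothetical illustration only: these macros are NOT the real c10 definitions.
// clang also defines __GNUC__ for compatibility, so __clang__ must be tested first.
#if defined(__clang__)
#define DEMO_BRANCH "clang-like compiler"
#define DEMO_CONSTEXPR            // assumed: constexpr path disabled on this branch
#elif defined(__GNUC__)
#define DEMO_BRANCH "gcc"
#define DEMO_CONSTEXPR constexpr  // assumed: constexpr path enabled on this branch
#else
#define DEMO_BRANCH "other compiler"
#define DEMO_CONSTEXPR
#endif

// The same declaration is constexpr under one compiler and an ordinary inline
// function under the other, so host and device builds can disagree about
// whether the value is available at compile time.
inline DEMO_CONSTEXPR int demo_type_index() { return 42; }

// Reports which preprocessor branch this translation unit took.
inline const char* demo_branch() { return DEMO_BRANCH; }
```

Printing `demo_branch()` from both a host-compiled and a device-compiled translation unit is one quick way to confirm which branch each compiler actually takes.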
@jeffdaily yes! I totally agree that using On C10_EXPORT, I think the only change in this PR is a removal from a definition that became (unconditionally) |
Trying to understand this PR. You've changed the size of |
Yup, you're exactly right. IIRC the reason TensorImpl went down a full 64 is bc |
Your PR has somehow tickled a latent bug (almost 2 years old) in our MIOpen integration. I'm testing a fix in our fork before submitting an upstream PR.
Oh man, I can't thank you enough for tracking this down so quickly!
@jeffdaily btw, I'm happy to do an upstream PR for the fix if that's easier!
PR #46852 ready for review to unblock this PR.
In the end, no changes here, right? Just the upstream rocm bugfix.
@ezyang yes, just needed the upstream rocm bugfix.
Stack from ghstack:
This is a reland of #44965, which was reverted because it triggered a latent issue in the ROCm integration, since fixed by @jeffdaily as noted in comments below (thanks Jeff!). Diff is identical except for rebasing and removal of a duplicate definition that Jeff spotted during troubleshooting. See #44965 for description and comments.
Differential Revision: D24006482