[ROCm] AMDGPU XLA compiler bugfixes, HLO slice sorting #39106
Deleting temporary files after compilation
Deterministic sorting of HLO slices
Workaround for a memory overrun in HIP
Workaround for HIP's expectation that all modules are different
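To illustrate the "deterministic sorting" item: a typical cause of this kind of nondeterminism is ordering items by pointer address or by hash-container iteration order, both of which can vary between runs. Whether that is the cause here is an assumption; the PR text does not say. The sketch below sorts by content-derived keys only. `SliceInfo` and `SortSlicesDeterministically` are hypothetical names for illustration, not XLA types or functions.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for an HLO slice descriptor; not an XLA type.
struct SliceInfo {
  std::string buffer_name;  // stable, content-derived identifier
  int64_t offset;           // byte offset within the buffer
  int64_t size;             // byte length of the slice
};

// Orders slices by (buffer_name, offset, size). No pointer values or
// hash-iteration order take part in the comparison, so two runs over the
// same set of slices produce the same sequence.
inline void SortSlicesDeterministically(std::vector<SliceInfo>& slices) {
  std::stable_sort(slices.begin(), slices.end(),
                   [](const SliceInfo& a, const SliceInfo& b) {
                     return std::tie(a.buffer_name, a.offset, a.size) <
                            std::tie(b.buffer_name, b.offset, b.size);
                   });
}
```

Code generated by walking the sorted vector then depends only on slice contents, not on the order in which the slices happened to be collected.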
Hi, for commits we generally try to have a self-descriptive message; is it possible to update the title of this PR (or, if there are multiple bugfixes/features, to split it up)?
@ekuznetsov139 Can you please check @cheshire's comments and keep us posted? Thanks!
@ekuznetsov139 Can you please check @cheshire's comments and resolve conflicts? Thanks!
@ekuznetsov139 Additionally, tests would be great, if possible.
Tests for which part?
All of it?
@ekuznetsov139 Can you please check @cheshire's comments and resolve conflicts? Thanks!
@ekuznetsov139 Gentle ping.
@ekuznetsov139 Any update on this PR, please? Thanks!
@gbaned I occasionally come here, look at my changes, try to imagine how any of them (let alone all of them) could be covered with unit tests, and generally fail. Do not close the PR, I'm sure I'll come up with something eventually.
@ekuznetsov139 From your initial message it seems this PR is doing at least 3 different things. I think it would be simpler to split it first.
@ekuznetsov139 Can you please check @cheshire's comments and resolve conflicts? Thanks!
I've resubmitted part of this PR as #41641.
This PR:
Implements an HSACO cache, to avoid rerunning (expensive) AMDGPU compilation for identical modules
Makes sure that the AMDGPU backend deletes temporary files after compilation
Implements deterministic sorting of HLO slices (without it, identical HLOs may result in different IRs)
Supplies workarounds for two bugs in the ROCm backend
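As a rough illustration of the HSACO cache idea in the list above, here is a minimal in-memory sketch, assuming the cache is keyed on the module's IR text plus the target GPU architecture. The class `HsacoCacheSketch`, its methods, and the `CompileFn` callback are hypothetical names for this example; they are not the PR's actual implementation or any XLA API.

```cpp
#include <cstdint>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory cache of compiled GPU binaries (illustration only).
class HsacoCacheSketch {
 public:
  using Hsaco = std::vector<uint8_t>;
  using CompileFn = std::function<Hsaco(const std::string& ir)>;

  // Returns the cached binary for (ir, gpu_arch); calls `compile` on a miss.
  Hsaco GetOrCompile(const std::string& ir, const std::string& gpu_arch,
                     const CompileFn& compile) {
    // The full IR text is part of the key, so a hash collision cannot
    // return the binary of a different module.
    const std::string key = gpu_arch + '\n' + ir;
    std::lock_guard<std::mutex> lock(mu_);
    auto it = cache_.find(key);
    if (it != cache_.end()) {
      ++hits_;
      return it->second;
    }
    ++misses_;
    // Simplification: compiling under the lock serializes all compilations.
    Hsaco binary = compile(ir);
    cache_.emplace(key, binary);
    return binary;
  }

  int hits() const {
    std::lock_guard<std::mutex> lock(mu_);
    return hits_;
  }
  int misses() const {
    std::lock_guard<std::mutex> lock(mu_);
    return misses_;
  }

 private:
  mutable std::mutex mu_;
  std::unordered_map<std::string, Hsaco> cache_;
  int hits_ = 0;
  int misses_ = 0;
};
```

A cache like this only pays off if identical HLOs produce byte-identical IR, which is presumably why the deterministic slice sorting ships in the same PR.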