New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.
Already on GitHub? Sign in to your account
rocm: 4.0.1 -> 4.1.0 #117450
rocm: 4.0.1 -> 4.1.0 #117450
Conversation
Result of 2 packages marked as broken and skipped:
1 package failed to build:2 packages skipped due to time constraints:
1 package built successfully:
1 suggestion:
Note that build failures may predate this PR, and could be nondeterministic or hardware dependent. Result of 6 packages skipped due to time constraints:
3 packages built successfully:
1 suggestion:
|
This also migrates this derivation from the old deprecated ROC-smi repository to rocm_smi_lib.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested OpenCL and rocm-smi and they are still working. Thanks!
postPatch = '' | ||
# Upstream ROCm is installed in an /opt directory. For this reason, | ||
# it does not completely follow FHS layout, creating top-level | ||
# rocm_smi, oam, and bindings top-level directories. Since rocm-smi | ||
# is a package that is typically installed, we change the paths to | ||
# follow FHS more closely. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
postPatch = '' | |
# Upstream ROCm is installed in an /opt directory. For this reason, | |
# it does not completely follow FHS layout, creating top-level | |
# rocm_smi, oam, and bindings top-level directories. Since rocm-smi | |
# is a package that is typically installed, we change the paths to | |
# follow FHS more closely. | |
# Upstream ROCm is installed in an /opt directory. For this reason, | |
# it does not completely follow FHS layout, creating top-level | |
# rocm_smi, oam, and bindings top-level directories. Since rocm-smi | |
# is a package that is typically installed, we change the paths to | |
# follow FHS more closely. | |
postPatch = '' |
This reduces the final derivation size a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But than half of the comments are in the indented block and the others outside?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they are written in the phase the are treated as part of the command and updating them causes a rebuild because they are included in the drv. This also grows the size of the drv file which should be avoided especially for bigger comments. If they comments are just a line or two this is not a big problem but for bigger ones this can be avoided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the size of the drv is much of an issue, given that the whole derivation is 3KiB. Rebuilds when changing comments is annoying, but these comments don't really get updated anyway. Unless the comments are not relevant or accurate anymore, but then other parts of the derivation will change as well.
Of course, if a phase has a single comment, I would agree that it is better to put it outside the phase itself. But when a phase has several comments that are part of a 'discourse', I think it is nicer to have them in the same scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clinfo
looks good for me, and hip-4.1.0 in the overlay seems to be working, too.
Thanks for the reviews and testing the changes on actual hardware! |
Just as a heads up, there may be issues on Vega 20 hardware. I only have an RX580, so I don't know if we're hitting this. ETA: Right, so there are patches that are yet to make it into a kernel that are needed for ROCm 4.1 on Radeon VII hardware. Users must either stay on ROCm 4.0, or use an older kernel that works with the 4.1 rock-dkms. We do not have updated packaging for the dkms route, but I think the overlay used to support that. It could be revived if there is a need and the kernel patch upstreaming takes much longer. |
Thanks for the heads-up! I am not sure how we could best solve this. I guess that people who have Radeon VII hardware could use an older revision of nixpkgs for rocm-opencl-icd. I am also a bit worried about future testing, since Polaris GPUs are not officially supported anymore, I guess support could break in the future. Does anyone in the Nix ecosystem have a Vega 7nm, Vega 10, or MI100? |
I don鈥檛 want to drop a burden on anyone, but let me just gently ping @athas and @Wulfsta as two interested parties. A reasonable(?) hope is to stay the course for now, hope that ROCm picks up RDNA2 support within a month or two, and that one of the people who won the able-to-buy-a-GPU lottery comes along to help out by running each new release. |
I'm using ROCm from Nixpkgs these days, although I may switch back to this repository for HIP. I don't have a Vega 20-based GPU, though, so I wouldn't have any opinion anyway. I hope to get an MI100 soon, but it'll probably go in a CentOS system, and I doubt the rest of my lab would be pleased to have me Nixify it! |
@acowley Please, feel free to ping me whenever - I have certainly done it to you enough! I can try to test this sometime in the next few days, but I am particularly busy right now and likely will end up delaying. Also, I have to figure out how to use this right from nixpkgs - I am still using some rocm-nixos fork or branch at this point... Edit: I have a Radeon VII if anyone ever needs me to test something for nixpkgs - but I can't promise it will happen quickly due to my IRL schedule. |
@Wulfsta Thanks, I think everything is known for this release. The main thing is that when new releases come around, we want people to be able to run them at least once before we update nixpkgs. I feel as though OpenCL will be fine since they鈥檙e using ROCm for that in every driver. But in the event that Polaris / gfx803 / RX580 totally loses support, we鈥檇 need to figure out something. I don鈥檛 know for certain, but I think the support issue on AMD鈥檚 side is mostly the specialized kernels for ML middleware that they don鈥檛 want to do for Polaris because it鈥檚 different enough from Vega and no HPC customers are using Polaris. |
Motivation for this change
Update to ROCm 4.1.0. I cannot test this since my Radeon card is on a shelf 馃槈 .
Do not merge this PR unless it is approved by both @acowley and @Flakebi .
Things done
sandbox
innix.conf
on non-NixOS linux)nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
./result/bin/
)nix path-info -S
before and after)