[NCCL] v2.18.5 #7491

Merged
merged 5 commits into from
Oct 24, 2023

tylerjthomas9
Contributor

@tylerjthomas9 tylerjthomas9 commented Oct 4, 2023

Release notes

From my understanding, @maleadt or someone else must upload the private NCCL binaries to the Yggdrasil servers. The page to get to the NCCL download links is available here.

The goal of bumping NCCL is to add multi-GPU training to XGBoost.jl. I have a working NCCL+XGBoost_jll build with CUDA v11 here.

@maleadt
Collaborator

maleadt commented Oct 9, 2023

IIUC, looking at Nix, we could build NCCL from https://github.com/NVIDIA/nccl.
The reason to try this is that I'm moving away from uploading private binaries. With all other CUDA binaries, we can use the redist sources, but the NCCL binaries aren't part of that.

@findmyway
Contributor

I'm wondering what else is required to get this newer version released?

> With all other CUDA binaries, we can use the redist sources, but the NCCL binaries aren't part of that.

Given that, is it fine to manually upload it for the time being?

@maleadt
Collaborator

maleadt commented Oct 23, 2023

> Given that, is it fine to manually upload it for the time?

I'll do it one last time. I've uploaded the following sources:

nccl_2.18.5-1+cuda11.0_aarch64.txz: 2b7ba3a646fdcac66544e99b39cb1720c7b6eaa74a89f753413dcbb450bd4c36
nccl_2.18.5-1+cuda11.0_ppc64le.txz: 70d5156fa8908182d88be9dd0007b4f8fb6bc92509846b7c2579179f6fbe3595
nccl_2.18.5-1+cuda11.0_x86_64.txz: 9e8db61e16db0ed937bd116471ad4963f83d3dab588f9b8fb499a869a3fbe374
nccl_2.18.5-1+cuda12.0_aarch64.txz: aa7b11bbbadb011b155b447dabcc8ee1b521f1ff34a39ef27e766ebe36f28c05
nccl_2.18.5-1+cuda12.0_ppc64le.txz: 32fc6734ed3306b5ad956f503704d186cfcdd8ca21cb0466f9fe5c512f9c799c
nccl_2.18.5-1+cuda12.0_x86_64.txz: d3add341537dbf737352c603fdb2098beccde2c127d77cd8d2d99af8852cef47
nccl_2.18.5-1+cuda12.2_aarch64.txz: 452d6a9511c4b4fe5ceb537f31d53b4f5c95608c743116d493119dfe83307ed2
nccl_2.18.5-1+cuda12.2_ppc64le.txz: c5584c80f0744897e602237947fb31fada2741ec34c872b61ad36f96606b47db
nccl_2.18.5-1+cuda12.2_x86_64.txz: e8f181ad8acc9a46290cf76d9fc4a635a97c5b7c025cd0d1b49cedb71a3fc968
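A local copy of one of these archives can be checked against the posted values before a recipe is updated. The following is a minimal sketch, assuming the 64-hex-digit values above are SHA-256 digests; the helper names `sha256_of` and `matches_posted_hash` are hypothetical, and only two of the nine entries are copied in for brevity.

```python
import hashlib
from pathlib import Path

# Two of the hashes posted in this thread; the remaining entries can be
# added the same way. Assumption: these are SHA-256 digests.
EXPECTED_SHA256 = {
    "nccl_2.18.5-1+cuda11.0_x86_64.txz":
        "9e8db61e16db0ed937bd116471ad4963f83d3dab588f9b8fb499a869a3fbe374",
    "nccl_2.18.5-1+cuda12.2_x86_64.txz":
        "e8f181ad8acc9a46290cf76d9fc4a635a97c5b7c025cd0d1b49cedb71a3fc968",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_posted_hash(path):
    """True only if the filename is known and its digest equals the posted one."""
    expected = EXPECTED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected
```

Because the lookup is keyed on the exact filename, a misnamed file is rejected even when its contents are correct.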

@tylerjthomas9
Contributor Author

tylerjthomas9 commented Oct 23, 2023

> I'll do it one last time. I've uploaded the following sources:

Thank you for doing this. Something funky is going on with the x86_64 sources.

I have been experimenting with building NCCL from source, but I am hitting some errors with the builds. I will keep pushing on that front for future builds.

@maleadt
Collaborator

maleadt commented Oct 23, 2023

Looks like it's just the hashes that are wrong; you can use what I posted above.

@tylerjthomas9
Contributor Author

I have the hashes you posted (and they match the ones I calculated). I tried switching to the hashes that Buildkite was seeing, but they changed again on the next build, so it was likely hashing the error response you get when you try to download the sources without a valid session.
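A hash that changes from one build to the next suggests the downloaded bytes are not the archive at all. One quick way to tell is to look at the first bytes of the file: a `.txz` is an xz-compressed tarball, and every xz stream starts with the same six-byte magic number. This is a minimal sketch; the function name `looks_like_xz` is hypothetical, and it inspects a file that has already been saved locally.

```python
# First six bytes of every .xz stream: 0xFD, '7', 'z', 'X', 'Z', 0x00.
XZ_MAGIC = b"\xfd7zXZ\x00"


def looks_like_xz(path):
    """Return True if the file starts with the xz magic number."""
    with open(path, "rb") as f:
        return f.read(len(XZ_MAGIC)) == XZ_MAGIC
```

An HTML or plain-text error response fails this check immediately, which separates "wrong file served" from "right file, wrong hash".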

@maleadt
Collaborator

maleadt commented Oct 23, 2023

You're using the wrong filenames, nccl_2.18.5-1+12.2_x86_64.txz instead of nccl_2.18.5-1+cuda12.2_x86_64.txz.
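Since the mismatch came down to a missing `cuda` prefix, generating the names from their parts avoids typing them by hand. The sketch below assumes the pattern visible in the uploaded file list, `nccl_<version>-1+cuda<cuda>_<arch>.txz`; the function name `nccl_source_name` and its `build` parameter are hypothetical.

```python
def nccl_source_name(nccl_version, cuda_version, arch, build="1"):
    """Build an archive name following the pattern of the uploaded sources."""
    return f"nccl_{nccl_version}-{build}+cuda{cuda_version}_{arch}.txz"


# All nine combinations listed earlier in this thread.
ALL_NAMES = [
    nccl_source_name("2.18.5", cuda, arch)
    for cuda in ("11.0", "12.0", "12.2")
    for arch in ("aarch64", "ppc64le", "x86_64")
]
```

The generated list can then be compared against the filenames used in the recipe to catch a dropped prefix before CI does.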

@tylerjthomas9 tylerjthomas9 marked this pull request as ready for review October 23, 2023 21:59
@maleadt maleadt added the cuda 🕹️ Builders related to Nvidia CUDA label Oct 24, 2023
@maleadt maleadt merged commit 2eb24c4 into JuliaPackaging:master Oct 24, 2023
12 checks passed
amontoison pushed a commit to amontoison/Yggdrasil that referenced this pull request Nov 27, 2023