cudaPackages_12: 12.4 -> 12.8 #390885
(force-pushed from 5085730 to ed2719b)
I successfully built
@SomeoneSerge I think something went wrong with your review: most of the packages that failed in your review build for me (for example
(force-pushed from 809ae9c to 68baed4)
I don't have the computing power to run
I guess that it's up to @SomeoneSerge and/or @ConnorBaker to make a call on this.
One of your commits is missing a "c" at the beginning (
Yes, I think the machine must've run out of storage...
I'll just voice my more general stance on this, for the future: if people become unresponsive, it's perfectly reasonable to move things forward without their feedback, assuming due diligence. Gaetan pointed out on Matrix that this update is becoming a blocker for other work, such as fixing mistral-rs. We've already gone the route of "update and face the fallout" before, and I think we must do so again until we have the resources to do more than minimum maintenance.
# NVIDIA B100 Accelerated
archName = "Blackwell";
computeCapability = "10.0a";
isJetson = false;
minCudaVersion = "12.8";
dontDefaultAfter = null;
maxCudaVersion = null;
Instead of removing metadata let's add some boolean for filtering it out in default capabilities computation.
So you are suggesting setting, let's say, dontDefaultAfter = "12.0", with a comment?
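To make the suggestion concrete, here is a minimal, self-contained sketch of how a default-capabilities computation could filter on dontDefaultAfter while keeping the metadata entry. It reuses the field names from the snippet under review; the helper names (atLeast, atMost, isSupported, isDefault), the Ada entry, and the exact comparison semantics are assumptions for illustration, not the nixpkgs implementation.

```nix
# Hypothetical sketch: not the actual nixpkgs code.
let
  cudaVersion = "12.8";

  # Field names follow the reviewed snippet; the values are illustrative.
  gpus = [
    {
      archName = "Ada";
      computeCapability = "8.9";
      minCudaVersion = "11.8";
      dontDefaultAfter = null;
      maxCudaVersion = null;
    }
    {
      archName = "Blackwell";
      computeCapability = "10.0a";
      minCudaVersion = "12.8";
      # Assumed semantics: never part of the defaults for CUDA newer than this.
      dontDefaultAfter = "12.0";
      maxCudaVersion = null;
    }
  ];

  # builtins.compareVersions returns -1, 0 or 1.
  atLeast = a: b: builtins.compareVersions a b >= 0;
  atMost = a: b: builtins.compareVersions a b <= 0;

  # A GPU is supported when cudaVersion lies within [minCudaVersion, maxCudaVersion].
  isSupported = gpu:
    atLeast cudaVersion gpu.minCudaVersion
    && (gpu.maxCudaVersion == null || atMost cudaVersion gpu.maxCudaVersion);

  # A GPU is a default target when it is supported and not excluded by dontDefaultAfter.
  isDefault = gpu:
    isSupported gpu
    && (gpu.dontDefaultAfter == null || atMost cudaVersion gpu.dontDefaultAfter);
in
{
  supported = map (gpu: gpu.computeCapability) (builtins.filter isSupported gpus);
  default = map (gpu: gpu.computeCapability) (builtins.filter isDefault gpus);
}
```

Evaluating this with nix-instantiate --eval --strict should list both capabilities under supported and only "8.9" under default, so the Blackwell metadata stays available for users who request it explicitly.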
Let's fix commit messages and metadata, and then merge?
The
Not torch. It builds with older CUDA with no issues.
Okay, I will update the PR later today and merge. If you don't mind, I will add myself to the CUDA maintainers team in nixpkgs so I can catch the fallout, if any.
…12.0a to save space RTX 50{7,8,9}0 is Compute Capability 12.0
Diff LGTM, let's YOLO it?
Give me a few hours. I am running the latest build and will merge then.
Welcome to the team! Consider opening a PR to https://github.com/NixOS/nixos-homepage/ if you haven't yet.
So it seems the fallout was small. The new build failures are listed here: https://hydra.nix-community.org/eval/356757
and all are caused by the failure in the
What puzzles me is that the same failure already happened earlier with older CUDA and was reported here: #348386.
I will investigate and try to fix the issue in the upcoming days.
It was easier than I thought: a GCC 14 compatibility issue, fixed in #393413.
I finally managed to fix the onnxruntime build with CUDA 12.8, which was, AFAIK, the only blocker for changing the default.
The fix is to turn off LTO when linking with CUDA, because otherwise linking takes too much time and then ends up in an infinite loop anyway.
Also, in the past I enthusiastically added support for Compute Capability 10.0 and 10.1 GPUs, but this causes linking errors for onnxruntime (the resulting shared library exceeds 2 GB, similar to what we face with magma). I removed them and kept only CC 12.0 to save space (ollama upstream, for example, also builds only for 12.0 and not for CC 10.x), which makes it possible for the onnxruntime shared library to link again.
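For anyone who wants to reproduce the two changes locally rather than through this PR, the following is a rough sketch, not the diff itself. It assumes the nixpkgs config options cudaSupport and cudaCapabilities, and it assumes that onnxruntime's CMake option for link-time optimization is named onnxruntime_ENABLE_LTO; please check both against your nixpkgs revision and the upstream CMakeLists before relying on it.

```nix
# Hypothetical local expression: a sketch, not the change made in this PR.
let
  pkgs = import <nixpkgs> {
    config = {
      allowUnfree = true; # the CUDA toolkit is unfree
      cudaSupport = true;
      # Build device code only for CC 12.0 to keep the shared library small.
      cudaCapabilities = [ "12.0" ];
    };
  };
in
pkgs.onnxruntime.overrideAttrs (old: {
  # Assumed flag name; appended last so it overrides any earlier setting.
  cmakeFlags = (old.cmakeFlags or [ ]) ++ [
    "-Donnxruntime_ENABLE_LTO=OFF"
  ];
})
```

Saved as, say, onnxruntime-cuda.nix (a made-up file name), it can be built with nix-build onnxruntime-cuda.nix. Restricting cudaCapabilities this way affects every CUDA package in that pkgs instance, not just onnxruntime.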