Pin nvidia driver version #691
base: main
Checked this is idempotent.
ansible/roles/cuda/tasks/install.yml
Outdated
ansible.builtin.shell:
  cmd: >-
    dnf module info nvidia-driver:{{ cuda_nvidia_driver_stream }} |
    grep -F {{ cuda_nvidia_driver_version }}.el{{ ansible_distribution_major_version }}.{{ ansible_architecture }}
I think some packages don't have el9 suffix (distro independent) so does that mean this isn't a complete list?
Oh yeah, like `nvidia-imex-570-0:570.124.06-1.x86_64`.
So `dnf module info nvidia-driver:570-open` only appears to show packages ending `.el9.x86_64`, `el9.noarch` and `.x86_64`. So maybe it's enough to just suffix the version with a `.` when grepping - I was trying to avoid the case where you are after e.g. `570-0:570.124.06-1` and just grepping for that also gets you `570-0:570.124.06-10`.
edit: I'd missed the fact it's an el9 repo, so I think this should be ok.
What do you think?
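To illustrate the prefix-match concern, here is a minimal sketch using made-up sample lines rather than real `dnf module info` output (the `-10` release is hypothetical). A fixed-string grep for the bare version also matches a longer release number, while appending a `.` does not.

```shell
# Hypothetical sample of package names; the "-10" release is invented
# purely to show the prefix problem.
sample='nvidia-imex-570-0:570.124.06-1.x86_64
nvidia-imex-570-0:570.124.06-10.x86_64'

version='570.124.06-1'

# Bare version: counts both lines, because "-1" is a prefix of "-10".
bare_matches=$(printf '%s\n' "$sample" | grep -cF "$version")

# Version plus a trailing ".": counts only the exact release.
dotted_matches=$(printf '%s\n' "$sample" | grep -cF "${version}.")

echo "bare: $bare_matches, dotted: $dotted_matches"
```

With `-F` the `.` is matched literally, so no escaping is needed.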
Annoyingly, there are a few i686 packages too (which I guess we don't need).
Oh good spot, I'd missed those 🤦
So might need to filter for x86_64 and noarch on top of what you suggested? 🤔
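A rough sketch of what that combined filter could look like, assuming the package names arrive one per line. The `example-*` names and the `filter_packages` helper are invented for illustration and are not taken from the PR.

```shell
# Hypothetical package list; only the nvidia-imex line comes from the thread.
sample='nvidia-imex-570-0:570.124.06-1.x86_64
example-libs-0:570.124.06-1.el9.x86_64
example-libs-0:570.124.06-1.el9.i686
example-config-0:570.124.06-1.el9.noarch'

version='570.124.06-1'
arch='x86_64'

# Keep lines containing "<version>." that end in the wanted arch or noarch.
filter_packages() {
  grep -F "${version}." | grep -E "\.(${arch}|noarch)\$"
}

kept=$(printf '%s\n' "$sample" | filter_packages)
printf '%s\n' "$kept"
```

Here the i686 line is dropped while the distro-independent `.x86_64` package and the `noarch` package are kept.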
I guess we can just `| unique` the list?
Doh, no: grepping for just the version + `.` obviously returns the entire package name.
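A rough shell analogue of why de-duplicating does not help, using `sort -u` as a stand-in for the Jinja `unique` filter and invented package names. Because each matched line is the full package name including its architecture, the x86_64 and i686 entries are distinct strings and both survive.

```shell
# Hypothetical lines: the same package built for two architectures.
sample='example-libs-0:570.124.06-1.el9.x86_64
example-libs-0:570.124.06-1.el9.i686'

# grep returns whole lines, so de-duplicating them removes nothing here.
unique_count=$(printf '%s\n' "$sample" | grep -F '570.124.06-1.' | sort -u | grep -c .)

echo "$unique_count"
```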
or the other option is to just install the i686 packages too (not sure how much bloat that adds)
Hmm, even with this we get:
Depsolve Error occurred:
Problem 1: conflicting requests
- nothing provides cuda-drivers-570 = 570.133.20 needed by cuda-drivers-fabricmanager-570-570.133.20-1.x86_64 from cuda-rhel9-x86_64
Problem 2: package cuda-drivers-fabricmanager-570.133.20-1.x86_64 from cuda-rhel9-x86_64 requires cuda-drivers-fabricmanager-570 = 570.133.20, but none of the providers can be installed
- conflicting requests
- nothing provides cuda-drivers-570 = 570.133.20 needed by cuda-drivers-fabricmanager-570-570.133.20-1.x86_64 from cuda-rhel9-x86_64
Strange. I do note that one isn't installed with the standard `dnf module install nvidia-driver`.
TODO