[release-22.04] Update NVIDIA signing key for package repos #1167

Merged

Conversation

ajdecon
Collaborator

@ajdecon ajdecon commented May 2, 2022

See
https://developer.nvidia.com/blog/updating-the-cuda-linux-gpg-repository-key/
for details on the key changes.

This commit:

  • Removes the nvidia-ml repo, which is deprecated and will not be
    updated
  • Updates the nvidia_cuda and nvidia_dcgm roles to use the new key and
    install workflow
  • Updates roles/requirements.txt to point to an updated version of
    nvidia.nvidia_driver

@ajdecon ajdecon requested a review from dholt May 2, 2022 17:00
@ajdecon ajdecon merged commit 1a7b13c into NVIDIA:release-22.04 May 2, 2022
@cocakohler

cocakohler commented May 3, 2022

@ajdecon Just to understand it fully: I deployed DeepOps 22.04 4 days ago. Should I expect issues with my installation? How can I update to the latest 22.04.1 without redeploying the whole K8s cluster?

@ajdecon
Collaborator Author

ajdecon commented May 3, 2022

@cocakohler :

The only changes are to the package repository configuration for the NVIDIA apt and yum repositories. You do not need to re-deploy your cluster, and this change should not impact the operation of your cluster. However, you will need to apply the key rotation changes in order to run package updates on software provided by these repositories.

Only three components in DeepOps are impacted by these changes: the NVIDIA driver, CUDA, and DCGM. If you installed any of them, you should be able to update the configuration by re-running the appropriate component playbook (nvidia-driver.yml, nvidia-cuda.yml, or nvidia-dcgm.yml).
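
As a rough illustration of the re-run step, the sketch below maps each impacted component to its playbook and prints the matching ansible-playbook command. The playbook file names come from the comment above; the `playbooks/nvidia-software` directory, the component keys, and the `rerun_commands` helper are assumptions made for this example, so adjust them to match your checkout and add whatever inventory or limit options your deployment normally uses.

```python
import shlex

# Component -> playbook file name, as listed in the comment above.
# The short component keys are hypothetical labels for this sketch.
PLAYBOOKS = {
    "driver": "nvidia-driver.yml",
    "cuda": "nvidia-cuda.yml",
    "dcgm": "nvidia-dcgm.yml",
}


def rerun_commands(installed, playbook_dir="playbooks/nvidia-software"):
    """Return one ansible-playbook command line per impacted component.

    playbook_dir is an assumed location; pass the directory that holds
    the playbooks in your own DeepOps checkout.
    """
    commands = []
    for component in installed:
        if component not in PLAYBOOKS:
            raise ValueError(f"not impacted by the key rotation: {component}")
        path = f"{playbook_dir}/{PLAYBOOKS[component]}"
        commands.append(shlex.join(["ansible-playbook", path]))
    return commands


if __name__ == "__main__":
    # Example: a cluster where only the driver and DCGM were installed.
    for line in rerun_commands(["driver", "dcgm"]):
        print(line)
```

The helper only builds command strings and does not run anything, so the output can be reviewed before it is executed on the provisioning node.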

Alternatively, you can apply these changes manually using the instructions in the blog post.

If you are running DGX OS 5, please see the DGX OS instructions.

@dholt dholt mentioned this pull request Aug 23, 2022