
A10 vs A100 Ubuntu AKS Images Nvidia Driver and Cuda Version differences #3994


Closed
lynkz-matt-psaltis opened this issue Nov 11, 2023 · 3 comments

@lynkz-matt-psaltis

I'm trying to understand the path forward for our CUDA workloads. A10 machines (e.g. Standard_NV6ads_A10_v5) are currently pinned to NVIDIA-SMI 510.73.08, Driver Version: 510.73.08, CUDA Version: 11.6, whereas A100 machines are at CUDA 12.2.

From what I understand, the 510 driver is no longer supported by NVIDIA and is not compatible with the CUDA 12 Compatibility Package (see Table 3 at https://docs.nvidia.com/deploy/cuda-compatibility/).

My understanding, from running nvidia-smi on the latest AKS Ubuntu 22.04 images, is that the Azure team preinstalls the NVIDIA drivers for GPU-enabled workloads. What's the right path here? Should the team upgrade the preinstalled driver to an NVIDIA-supported version, or should I cut my own images with newer drivers?
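
For anyone who wants to check a node against a driver floor, here is a minimal Python sketch. It parses the version banner that nvidia-smi prints (the format quoted above) and compares the driver against a minimum you supply. The function names are hypothetical, and the `525.60.13` threshold is an assumption based on my reading of NVIDIA's CUDA 12.x release notes; please verify it against NVIDIA's compatibility tables before relying on it.

```python
import re

# Matches the banner printed by nvidia-smi, e.g.
# "NVIDIA-SMI 510.73.08 Driver Version: 510.73.08 CUDA Version: 11.6"
HEADER_RE = re.compile(r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)")

# ASSUMPTION: minimum Linux driver for CUDA 12.x as I understand NVIDIA's
# release notes. Treat it as a placeholder and confirm it yourself.
ASSUMED_MIN_DRIVER_FOR_CUDA_12 = "525.60.13"


def parse_header(text):
    """Return (driver_version, cuda_version) found in nvidia-smi output."""
    match = HEADER_RE.search(text)
    if match is None:
        raise ValueError("no 'Driver Version' / 'CUDA Version' banner found")
    return match.group(1), match.group(2)


def version_tuple(version):
    """Turn '510.73.08' into (510, 73, 8) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def driver_at_least(driver, minimum):
    """True if the driver version is greater than or equal to the minimum."""
    return version_tuple(driver) >= version_tuple(minimum)


if __name__ == "__main__":
    banner = "NVIDIA-SMI 510.73.08 Driver Version: 510.73.08 CUDA Version: 11.6"
    driver, cuda = parse_header(banner)
    ok = driver_at_least(driver, ASSUMED_MIN_DRIVER_FOR_CUDA_12)
    print(f"driver={driver} cuda={cuda} meets_assumed_cuda12_floor={ok}")
```

On a real node you would pass the captured output of `nvidia-smi` to `parse_header` instead of the hard-coded banner. Versions are compared as integer tuples rather than as strings, so `510.73.08` and `525.60.13` order correctly regardless of zero padding.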

@lynkz-matt-psaltis
Author

Also found: #3364

@caffeinism

Ridiculously low driver version. Any update?

@lynkz-matt-psaltis
Author

Looks like the latest image has been updated to driver v535.54.03 with CUDA 12.2.

VM image: AKSUbuntu-2204gen2containerd-202312.06.0

Thanks team!
