This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

What's the point of NVIDIA Driver Container if it requires >= 418.81.07 NVIDIA Driver as dependency in the first place? #1681

Closed
junwang-wish opened this issue Sep 17, 2022 · 6 comments

Comments

@junwang-wish

This is a general question about the motivation for the NVIDIA Driver Container. My understanding is that it is one of three ways to install the NVIDIA Driver, supposedly the most portable and easiest one.

However it is surprising that NVIDIA Driver Container requires NVIDIA Container Toolkit as a dependency, and NVIDIA Container Toolkit requires >= 418.81.07 NVIDIA Driver as a dependency...

If NVIDIA Driver Container's purpose is to make NVIDIA Driver installation easy, why does it require an installed driver in the first place?

@elezar
Member

elezar commented Sep 19, 2022

I think that might be an issue with our documentation.

The NVIDIA Driver Container does not need the NVIDIA Container Toolkit to run, but the container toolkit must be configured to specify a different root option when the NVIDIA Driver Container is used.

@junwang-wish
Author

Thanks @elezar, this is the line in the documentation (underlined in red) that could be misleading:
[Screenshot: Screen Shot 2022-09-19 at 11 38 41 AM]

@nvjmayo
Contributor

nvjmayo commented Sep 19, 2022

For the GPU Operator's use case, the Driver Container is loaded first, then the Container Toolkit container is loaded. This satisfies the dependency correctly. The Driver Container itself does not depend on Container Toolkit, although it can work with it.

To load the Driver Container manually, such as through docker, you would want to load the Driver Container first. If you choose to use Container Toolkit (you do not have to use it), you'd follow the instructions for configuring /etc/nvidia-container-runtime/config.toml. You would configure the root setting before running any of your application containers that need Container Toolkit. This means you can make your edits before or after installing Driver Container and/or Container Toolkit, but always before running work with docker run --gpus all ...
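
As a rough illustration of the root setting mentioned above, here is a minimal Python sketch that enables a `root` entry in the text of a config.toml. The sample file contents, the `[nvidia-container-cli]` section name, and the `/run/nvidia/driver` path are assumptions for illustration and are not taken from this thread; check the config file on your own system. `set_driver_root` is a hypothetical helper, not part of any NVIDIA tool.

```python
import re

# Assumed example contents of /etc/nvidia-container-runtime/config.toml.
# The real file on a given system may look different.
SAMPLE_CONFIG = """\
disable-require = false

[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
"""


def set_driver_root(config_text: str, root: str = "/run/nvidia/driver") -> str:
    """Return config_text with the first `root = ...` line set to `root`.

    A commented-out entry (`#root = ...`) is uncommented. The default path is
    an assumption: it should point at wherever the Driver Container exposes
    the driver files on the host.
    """
    # Match only within a single line; [ \t]* avoids swallowing blank lines.
    pattern = re.compile(r"^[ \t]*#?[ \t]*root[ \t]*=.*$", re.MULTILINE)
    if not pattern.search(config_text):
        raise ValueError("no root entry found; add one to the config by hand")
    new_line = f'root = "{root}"'
    # A callable replacement keeps backslashes in `root` from being interpreted.
    return pattern.sub(lambda _match: new_line, config_text, count=1)


if __name__ == "__main__":
    print(set_driver_root(SAMPLE_CONFIG))
```

The function only transforms text, so it can be tried on a copy of the file first; writing the result back to /etc/nvidia-container-runtime/config.toml would need root privileges and, as noted above, has to happen before any application container that relies on Container Toolkit is started.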

@nvjmayo
Contributor

nvjmayo commented Sep 19, 2022

@junwang-wish
It's my fault that GPU Driver Container's documentation is not in a good state. Sorry for the inconvenience!

@junwang-wish
Author

No worries, thanks for the clarification @nvjmayo
Unfortunately, though, following the doc for installing and using the Driver Container, with a correct /etc/nvidia-container-runtime/config.toml and with Container Toolkit installed, leads to some errors shown here (I opened it as another issue, so feel free to close this one):

NVIDIA/nvidia-container-toolkit#184

@nvjmayo
Contributor

nvjmayo commented Sep 19, 2022

Thank you!

@nvjmayo nvjmayo closed this as completed Sep 19, 2022