[BUG] - spawn by jupyterhub on K8s ; tensorflow doesn't recognize the GPU cards #1831
Comments
Hi, @EajksEajks!
I don't think you're running this inside the container, because Overall, I also don't think our images are designed to support GPU properly.
This is how Python works: if you import the same module a second time in the same process, Python does not execute it again, which is why you only see the message the first time.
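To illustrate the import caching described here, the following is a minimal sketch using only the standard library. The module name `noisy_demo_module` and the helper `import_twice` are hypothetical, made up for this demo; the module just prints a line when its body runs, standing in for the message TensorFlow emits on first import.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Hypothetical module name, used only for this demonstration.
MODULE_NAME = "noisy_demo_module"


def import_twice() -> str:
    """Import a throwaway module twice and return everything it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        # The module body prints a line every time it is actually executed.
        Path(tmp, f"{MODULE_NAME}.py").write_text("print('module body executed')\n")
        sys.path.insert(0, tmp)
        captured = io.StringIO()
        try:
            with contextlib.redirect_stdout(captured):
                importlib.import_module(MODULE_NAME)  # first import: body runs
                importlib.import_module(MODULE_NAME)  # second import: found in sys.modules
        finally:
            # Clean up so the demo leaves the interpreter state as it found it.
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)
        return captured.getvalue()


if __name__ == "__main__":
    print(import_twice(), end="")
```

The second import finds the module already registered in `sys.modules` and returns it without re-running its body, so any import-time message appears only once per process. Restarting the kernel starts a new process, so the message would show up again on the next first import.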
Hi, @mathbunnyru I was expecting tensorflow-notebook to support GPU cards out of the box, as it is pretty inefficient to do machine learning without proper hardware. Moreover, I was misled by the JupyterHub installation instructions, which mention how to assign GPU cards to spawned notebooks. On the K8s cluster we are running on, the gpu-operator from NVIDIA is installed and the GPUs are easily found, as typing !nvidia-smi in the notebook shows. Now I understand that the CUDA libs are simply not installed :-) So I'll have a look at the projects you mention to find a way to have them installed. Note that it's a pity that people have to take the source code of your notebook to generate a new one with the CUDA libs installed. It would make much more sense for tensorflow-notebook to support GPU cards. Thx for your help.
I understand your frustration. The thing is, we're building a whole bunch of images, not just one. I think it's actually possible, and can be done in this project without hurting anyone. But I haven't yet seen a PR that tries to achieve it. I'm also not very sure about NVIDIA's license and how we can use their images. For now, I think the easiest way is to just use the project I mentioned.
What docker image(s) are you using?
tensorflow-notebook
OS system and architecture running docker image
ubuntu 20.04 / amd64
What Docker command are you running?
Dell PowerEdge R740 w/ 2 Nvidia A30 GPU cards
Host OS = Ubuntu 20.04.5
Kubernetes Cluster = 1.25.3
jupyterhub for K8s = 2.0.0
tensorflow-notebook = 2022-11-15
The container is spawned by jupyterhub.
How to Reproduce the problem?
Spawn a server requesting access to 1 or 2 Nvidia A30 GPU cards.
Under the notebook spawned by Jupyter Hub, in a terminal,
nvidia-smi lists the requested amount of GPUs (1 or 2).
In a notebook,
returns 0.
Note that there is also another strange behavior. When I import tensorflow the first time, I get the following message, but when I import it a second time right away, it doesn't complain anymore.
Command output
No response
Expected behavior
No response
Actual behavior
tensorflow doesn't recognize any GPU card although nvidia-smi does.
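A gap like this (the driver tool works, but TensorFlow sees no GPU) can be narrowed down with a small diagnostic. The sketch below is an assumption-laden illustration, not part of the original report: the helper `gpu_report` and the library list `CANDIDATE_LIBS` are hypothetical, and which CUDA libraries a given TensorFlow build actually needs depends on that build. It only checks whether `nvidia-smi` is on the PATH, whether the system linker can locate a few commonly needed libraries, and, if TensorFlow is importable, how many GPUs it lists.

```python
import ctypes.util
import importlib.util
import shutil

# Assumption: these short names cover the driver library (cuda), the CUDA
# runtime (cudart) and cuDNN (cudnn). The exact set depends on the TensorFlow build.
CANDIDATE_LIBS = ("cuda", "cudart", "cudnn")


def gpu_report() -> dict:
    """Collect a few hints about why a framework may not see the GPU."""
    report = {
        # True if the nvidia-smi executable can be found on PATH.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # Library name or path if the system linker can locate it, else None.
        "libraries": {name: ctypes.util.find_library(name) for name in CANDIDATE_LIBS},
        # Filled in below only when TensorFlow is installed.
        "tensorflow_gpus": None,
    }
    if importlib.util.find_spec("tensorflow") is not None:
        try:
            import tensorflow as tf

            report["tensorflow_gpus"] = len(tf.config.list_physical_devices("GPU"))
        except Exception as exc:  # a broken install should not abort the report
            report["tensorflow_gpus"] = f"import failed: {exc!r}"
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

If `nvidia_smi_on_path` is true while `cudart` or `cudnn` resolve to `None` and `tensorflow_gpus` is 0, that would be consistent with the driver being exposed to the container without the CUDA user-space libraries being installed in the image. Note that `ctypes.util.find_library` only searches the usual system locations, so libraries installed elsewhere (for example inside a conda environment) may be reported as missing even when they are present.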
Anything else?
No response