This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

"exec: \"nvidia-smi\": executable file not found in $PATH". #457

Closed
DianaYu0601 opened this issue Sep 2, 2017 · 9 comments

Comments

@DianaYu0601

DianaYu0601 commented Sep 2, 2017

Hi,

I'm on an Ubuntu 14 machine.
nvidia-smi works both on my host machine and in the nvidia/cuda image, but not in my own image.

diana@brick:/etc/default$ nvidia-docker run --rm nvidia/cuda nvidia-smi
Sat Sep  2 08:48:00 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:01:00.0     Off |                  N/A |
|  0%   47C    P8    12W / 200W |   1863MiB /  8112MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 0000:02:00.0     Off |                  N/A |
|  0%   47C    P8     7W / 200W |      0MiB /  8114MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 1080    Off  | 0000:05:00.0     Off |                  N/A |
|  0%   48C    P8     7W / 200W |      0MiB /  8114MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

The error goes like this:

diana@brick:/etc/default$ nvidia-docker run --rm ubuntu-tfserving nvidia-smi
container_linux.go:247: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH"
docker: Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH".

I tried docker volume rm -f nvidia_driver_375.39 as suggested in other answers, but it was not helpful.

The output of sudo journalctl -n -u nvidia-docker looks like this:

diana@brick:/etc/default$ sudo journalctl -n -u nvidia-docker
-- Logs begin at Wed 2017-07-26 10:19:26 CST, end at Sat 2017-09-02 16:52:25 CST. --
Sep 02 16:33:28 brick nvidia-docker-plugin[30535]: /usr/bin/nvidia-docker-plugin | 2017/09/02 16:33:28 Loading NVIDIA management library
Sep 02 16:33:28 brick systemd[1]: Started NVIDIA Docker plugin.
Sep 02 16:33:29 brick nvidia-docker-plugin[30535]: /usr/bin/nvidia-docker-plugin | 2017/09/02 16:33:29 Discovering GPU devices
Sep 02 16:33:30 brick nvidia-docker-plugin[30535]: /usr/bin/nvidia-docker-plugin | 2017/09/02 16:33:30 Provisioning volumes at /var/lib/nvidia-docker/volumes
Sep 02 16:33:30 brick nvidia-docker-plugin[30535]: /usr/bin/nvidia-docker-plugin | 2017/09/02 16:33:30 Serving plugin API at /var/lib/nvidia-docker
Sep 02 16:33:30 brick nvidia-docker-plugin[30535]: /usr/bin/nvidia-docker-plugin | 2017/09/02 16:33:30 Serving remote API at localhost:3476
Sep 02 16:33:35 brick nvidia-docker-plugin[30535]: /usr/bin/nvidia-docker-plugin | 2017/09/02 16:33:35 Received activate request
Sep 02 16:33:35 brick nvidia-docker-plugin[30535]: /usr/bin/nvidia-docker-plugin | 2017/09/02 16:33:35 Plugins activated [VolumeDriver]
Sep 02 16:48:00 brick nvidia-docker-plugin[30535]: /usr/bin/nvidia-docker-plugin | 2017/09/02 16:48:00 Received mount request for volume 'nvidia_driver_375.66'
Sep 02 16:48:00 brick nvidia-docker-plugin[30535]: /usr/bin/nvidia-docker-plugin | 2017/09/02 16:48:00 Received unmount request for volume 'nvidia_driver_375.66'
diana@brick:/etc/default$ nvidia-docker -v
Docker version 17.03.2-ce, build f5ec1e2
diana@brick:/etc/default$ docker volume list
DRIVER              VOLUME NAME
local               0a5edae96ce74f70a56457df315bf3589602428021b0827cbe50615427af4009
local               1c9edb7a524eb537ced44380cb872c1adc43b700d28a5947b0f01a90c5733c3b
local               1d0fedbd4c32c846e427006cc68c397f93fe130c6eab047492c63a23f9f90966
local               35c04b5f4c3d1c45526af21418051610adbe53b524a76234dc48766ab056ab48
local               3975b61bd044067da63cf0c20f1c20f3b228d8c67f600c080ec6b1db113c07d9
local               3dca8aacf69dcd4ea9fcfa0c27336760489922bc9cb7664537ba9a30ab2d1b39
local               4addaa1d4ca4a91942dd304d59251c2df3cc5fb1541fbe3639878b25dfa8b716
local               5b2b82db1e085a6647ba2105f05ceb12899838ebef00734e8111cfc2307dde1b
local               61fd317c9f1d5e38ebb0dbc07812fe6ec632f91a55b8465e682df678c1efb565
local               d9d3d2e7801a505fd766c471b2fe9a312e65d5e3e94f6d661a8ee37d2acebb68
local               dcf92f591845d748b1662229492f871524d0413243ee637e138269d55e28073b
local               ee13cb0ea07ab630db843860fb6e5966da48bdda43b56919b91c93f3b70f414f
local               f2af508db5612f0c355a73c4880444176f2758d1bfed2261d53ae83e12d3362b
nvidia-docker       nvidia_driver_375.66

By the way, I tried sudo find $(docker volume inspect -f '{{.Mountpoint}}' nvidia_driver_367.57) -name nvidia-smi, and it came back like this:

diana@brick:/etc/default$ sudo find $(docker volume inspect -f '{{.Mountpoint}}' nvidia_driver_367.57) -name nvidia-smi
Error: No such volume: nvidia_driver_367.57
diana@brick:/etc/default$ which nvidia-cuda-mps-control nvidia-cuda-mps-server nvidia-debugdump nvidia-persistenced nvidia-smi
/usr/bin/nvidia-cuda-mps-control
/usr/bin/nvidia-cuda-mps-server
/usr/bin/nvidia-debugdump
/usr/bin/nvidia-persistenced
/usr/bin/nvidia-smi
diana@brick:/etc/default$ systemd-run -t --user which nvidia-cuda-mps-control nvidia-cuda-mps-server nvidia-debugdump nvidia-persistenced nvidia-smi
Running as unit run-r48b4eb46cebb4b58a450d15337d611fa.service.
Press ^] three times within 1s to disconnect TTY.
/usr/bin/nvidia-cuda-mps-control
/usr/bin/nvidia-cuda-mps-server
/usr/bin/nvidia-debugdump
/usr/bin/nvidia-persistenced
/usr/bin/nvidia-smi

But I don't know what this means or how to fix it. @3XX0, can you help me with this one?
Thanks a lot!

@3XX0
Member

3XX0 commented Sep 7, 2017

Um, maybe something has overridden the PATH environment variable in your image?
nvidia-smi should be under /usr/local/nvidia/bin inside the container.
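For example, a quick sanity check (using the ubuntu-tfserving image name from the report above) would be to look at PATH and at the driver mount point from inside the container, something along these lines:

nvidia-docker run --rm ubuntu-tfserving env | grep PATH
nvidia-docker run --rm ubuntu-tfserving ls /usr/local/nvidia/bin

If PATH does not contain /usr/local/nvidia/bin, or that directory is missing inside the container, exec of nvidia-smi will fail with exactly the error above.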

@ryanolson

What is the base image for your tensorflow-serving Dockerfile? This is a common error if you do not build an image from one of our base CUDA images.

ubuntu@runc:~$ nvidia-docker run --rm -ti ubuntu:16.04 nvidia-smi
docker: Error response from daemon: oci runtime error: container_linux.go:262: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH".
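A minimal sketch of the fix (the CUDA/cuDNN tag below is only an example, pick whatever your tensorflow-serving build actually needs):

# Dockerfile: start from an NVIDIA CUDA base image instead of plain ubuntu,
# so the CUDA/driver paths and the nvidia-docker volume label are inherited.
FROM nvidia/cuda:8.0-cudnn5-devel-ubuntu16.04
# ... install tensorflow-serving and copy your models here ...

Then rebuild and re-run:

docker build -t ubuntu-tfserving .
nvidia-docker run --rm ubuntu-tfserving nvidia-smi

The nvidia/cuda images put /usr/local/nvidia/bin and /usr/local/cuda/bin on PATH and carry the label that nvidia-docker 1.0 checks before mounting the driver volume, which is why the plain ubuntu:16.04 example above fails the same way.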

@DianaYu0601
Author

Yeah, that seems to be the key problem.
I assumed nvidia-docker would import the system CUDA path into the Docker container.
Thanks a lot!
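A quick way to see the difference, in case anyone else hits this, is to compare the environment baked into the two images (ubuntu-tfserving is my image from above):

docker inspect --format '{{.Config.Env}}' nvidia/cuda
docker inspect --format '{{.Config.Env}}' ubuntu-tfserving

The nvidia/cuda image lists /usr/local/nvidia/bin and /usr/local/cuda/bin in its PATH, while an image built from plain ubuntu does not.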

@lxzh

lxzh commented Sep 12, 2017

@DianaYu0601 Have you solved this issue?

@DianaYu0601
Author

DianaYu0601 commented Sep 12, 2017

@ljf1239848066 Yes, it was solved by what @ryanolson pointed out above:

What is the base image for your tensorflow-serving Dockerfile? This is a common error if you do not build an image from one of our base CUDA images.

@lxzh

lxzh commented Sep 12, 2017

@DianaYu0601 Thanks for your reply.
I got this problem with a pure nvidia/cuda image.

@eliorc

eliorc commented Nov 4, 2017

I seem to have a similar problem, but I don't really understand the answer.

nvidia-smi works flawlessly.
My nvidia docker plugin is up and running.

Whenever I run sudo nvidia-docker run --rm nvidia/cuda nvidia-smi I get the error:

container_linux.go:247: starting container process caused "exec: "nvidia-smi": executable file not found in $PATH"
/usr/bin/docker-current: Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "exec: "nvidia-smi": executable file not found in $PATH".

And I can see a similar mount/unmount sequence in my nvidia-docker-plugin log...

How do I fix this?
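In case it matters, this is roughly what I can still check on my side (the curl endpoint is the plugin's remote API that shows up in the journalctl output above as "Serving remote API at localhost:3476"):

docker pull nvidia/cuda
curl -s http://localhost:3476/docker/cli
docker inspect --format '{{.Config.Env}}' nvidia/cuda

The curl call should print the --volume-driver/--volume/--device arguments the plugin injects; if it prints nothing, or if the image's PATH has no /usr/local/nvidia/bin, that would at least explain why nvidia-smi is not found.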

@jakubLangr

@DianaYu0601 so what was the fix?

@flx42
Member

flx42 commented Nov 14, 2017

Please try our latest version, 2.0. And open a new issue if you still have problems.
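Roughly, with nvidia-docker2 installed and the nvidia runtime registered with Docker, the equivalent of the failing commands above becomes:

docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi

and the nvidia-docker wrapper shipped with 2.0 still accepts the old form, nvidia-docker run --rm nvidia/cuda nvidia-smi.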

NVIDIA locked and limited conversation to collaborators Nov 14, 2017