Running in Docker #1

Closed
discordianfish opened this issue May 29, 2018 · 2 comments

@discordianfish

Hi,

I'm trying to run this in Docker. I've created a Dockerfile here: https://github.com/discordianfish/nvidia_gpu_prometheus_exporter/blob/master/Dockerfile

I've bind-mounted /opt/nvidia/lib64 into the container and set up ld.so.conf to find it, yet the exporter still fails:

2018/05/29 17:39:20 Couldn't initialize gonvml: could not load NVML library. Make sure NVML is in the shared library search path.
# cat /etc/ld.so.conf.d/nvidia.conf 
/usr/local/nvidia/lib64
root@nvidia-exporter-ng2g9:~# ls /usr/local/nvidia/lib64
libEGL.so		 libGLESv1_CM_nvidia.so.1	libGLdispatch.so.0  libnvcuvid.so.390.46		 libnvidia-fbc.so	     libnvidia-ml.so.390.46
libEGL.so.1		 libGLESv1_CM_nvidia.so.390.46	libOpenCL.so	    libnvidia-cfg.so			 libnvidia-fbc.so.1	     libnvidia-opencl.so.1
libEGL.so.1.1.0		 libGLESv2.so			libOpenCL.so.1	    libnvidia-cfg.so.1			 libnvidia-fbc.so.390.46     libnvidia-opencl.so.390.46
libEGL_nvidia.so.0	 libGLESv2.so.2			libOpenCL.so.1.0    libnvidia-cfg.so.390.46		 libnvidia-glcore.so.390.46  libnvidia-ptxjitcompiler.so
libEGL_nvidia.so.390.46  libGLESv2.so.2.1.0		libOpenCL.so.1.0.0  libnvidia-compiler.so.390.46	 libnvidia-glsi.so.390.46    libnvidia-ptxjitcompiler.so.1
libGL.la		 libGLESv2_nvidia.so.2		libOpenGL.so	    libnvidia-egl-wayland.so.1		 libnvidia-gtk2.so.390.46    libnvidia-ptxjitcompiler.so.390.46
libGL.so		 libGLESv2_nvidia.so.390.46	libOpenGL.so.0	    libnvidia-egl-wayland.so.1.0.2	 libnvidia-gtk3.so.390.46    libnvidia-tls.so.390.46
libGL.so.1		 libGLX.so			libcuda.so	    libnvidia-eglcore.so.390.46		 libnvidia-ifr.so	     libvdpau_nvidia.so
libGL.so.1.7.0		 libGLX.so.0			libcuda.so.1	    libnvidia-encode.so			 libnvidia-ifr.so.1	     tls
libGLESv1_CM.so		 libGLX_indirect.so.0		libcuda.so.390.46   libnvidia-encode.so.1		 libnvidia-ifr.so.390.46     vdpau
libGLESv1_CM.so.1	 libGLX_nvidia.so.0		libnvcuvid.so	    libnvidia-encode.so.390.46		 libnvidia-ml.so	     xorg
libGLESv1_CM.so.1.2.0	 libGLX_nvidia.so.390.46	libnvcuvid.so.1     libnvidia-fatbinaryloader.so.390.46  libnvidia-ml.so.1
root@nvidia-exporter-ng2g9:~# ldconfig  -v|grep nvidia-ml     
ldconfig: Path `/lib/x86_64-linux-gnu' given more than once
ldconfig: Path `/usr/lib/x86_64-linux-gnu' given more than once
ldconfig: /lib/x86_64-linux-gnu/ld-2.24.so is the dynamic linker, ignoring

	libnvidia-ml.so.1 -> libnvidia-ml.so.390.46
root@nvidia-exporter-ng2g9:~# ldconfig  -v|grep nvidia   
ldconfig: Path `/lib/x86_64-linux-gnu' given more than once
ldconfig: Path `/usr/lib/x86_64-linux-gnu' given more than once
ldconfig: /lib/x86_64-linux-gnu/ld-2.24.so is the dynamic linker, ignoring

/usr/local/nvidia/lib64:
	libnvidia-glcore.so.390.46 -> libnvidia-glcore.so.390.46
	libnvidia-tls.so.390.46 -> libnvidia-tls.so.390.46
	libEGL_nvidia.so.0 -> libEGL_nvidia.so.390.46
	libnvidia-gtk3.so.390.46 -> libnvidia-gtk3.so.390.46
	libnvidia-gtk2.so.390.46 -> libnvidia-gtk2.so.390.46
	libnvidia-fatbinaryloader.so.390.46 -> libnvidia-fatbinaryloader.so.390.46
	libnvidia-opencl.so.1 -> libnvidia-opencl.so.390.46
	libnvidia-compiler.so.390.46 -> libnvidia-compiler.so.390.46
	libnvidia-ml.so.1 -> libnvidia-ml.so.390.46
	libGLESv2_nvidia.so.2 -> libGLESv2_nvidia.so.390.46
	libnvidia-ptxjitcompiler.so.1 -> libnvidia-ptxjitcompiler.so.390.46
	libnvidia-cfg.so.1 -> libnvidia-cfg.so.390.46
	libnvidia-ifr.so.1 -> libnvidia-ifr.so.390.46
	libnvidia-egl-wayland.so.1 -> libnvidia-egl-wayland.so.1.0.2
	libGLX_nvidia.so.0 -> libGLX_nvidia.so.390.46
	libnvidia-fbc.so.1 -> libnvidia-fbc.so.390.46
	libnvidia-eglcore.so.390.46 -> libnvidia-eglcore.so.390.46
	libnvidia-glsi.so.390.46 -> libnvidia-glsi.so.390.46
	libnvidia-encode.so.1 -> libnvidia-encode.so.390.46
	libGLESv1_CM_nvidia.so.1 -> libGLESv1_CM_nvidia.so.390.46
/usr/local/nvidia/lib64/tls: (hwcap: 0x8000000000000000)
	libnvidia-tls.so.390.46 -> libnvidia-tls.so.390.46
@discordianfish
Author

I've actually figured this out myself with strace: it looks like I was simply missing permissions to access /dev/nvidia. Maybe the error returned in this case could be improved?
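
For anyone hitting the same message, here is a rough sketch of a check that can be run inside the container before reaching for strace. It tries to open each device node for reading and writing, which should surface both file-permission problems and a container runtime refusing device access. The helper name `check_dev_access` is made up for this example, and the device paths in the last line are an assumption: the exact set of /dev/nvidia* nodes depends on the driver and the number of GPUs.

```shell
#!/bin/sh
# check_dev_access is a hypothetical helper, not part of the exporter.
# For each path given, it reports whether the path exists and whether it
# can actually be opened read-write by the current user.
# Returns non-zero if any path is missing or cannot be opened.
check_dev_access() {
    status=0
    for dev in "$@"; do
        if [ ! -e "$dev" ]; then
            echo "missing: $dev"
            status=1
        elif ( : <>"$dev" ) 2>/dev/null; then
            # The open succeeded in a subshell; the descriptor is closed on exit.
            echo "ok: $dev"
        else
            echo "denied: $dev"
            status=1
        fi
    done
    return "$status"
}

# Assumed device nodes; adjust to whatever exists on your host.
check_dev_access /dev/nvidiactl /dev/nvidia0 || true
```

A "missing" line suggests the device was never passed into the container, while "denied" suggests it is present but cannot be opened, which matches what I saw in strace.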

@rohitagarwal003
Member

Added instructions to make debugging easier: https://github.com/mindprince/nvidia_gpu_prometheus_exporter#running-inside-a-container
