'too many levels of symbolic links' with --runtime=nvidia enabled on large image #588
Comments
Is the NVIDIA driver installed inside the image?
I always try to avoid (re)installing NVIDIA drivers, because it is generally asking for trouble in the container at build time and runtime (because of transparency/awareness of the NVIDIA driver of the underlying system).
root@a21424a7b7d7:/# cat /proc/driver/nvidia/version
The underlying container images on top of which this image is built don't have this problem (cuda & deploy-nvidia-docker).
Yes, that's what I meant. It should not be installed, but maybe it is in your Dockerfile.
For instance, I'm having this issue with the following simple Dockerfile:
FROM nvidia/cuda:9.0-base-ubuntu16.04
RUN apt-get update && apt-get install --no-install-recommends -y nvidia-384 libcuda1-384
I get the following:
So, there is a problem with your image. But we can probably have a better error message or fallback here.
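One way to spot this in an image's build recipe is to search the Dockerfile for driver package names. This is only a rough sketch: `find_driver_installs` is a hypothetical helper name, and the pattern assumes driver packages are named like the `nvidia-384` and `libcuda1-384` packages in the Dockerfile above.

```shell
# Hypothetical helper: print the numbered lines of a Dockerfile that mention
# versioned NVIDIA driver packages (nvidia-<digits> or libcuda1-<digits>).
# Exits non-zero when nothing matches, like grep itself.
find_driver_installs() {
    grep -nE '(nvidia|libcuda1)-[0-9]+' "$1"
}

# Usage:
#   find_driver_installs Dockerfile
```

A base image reference such as `nvidia/cuda:9.0-base-ubuntu16.04` does not match the pattern, so only package installs are reported.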
Thanks, will look into this. Even though I could not find a trace of a driver in the container image, I presume this is the case.
Inside the container, try
Yup, bingo: nvidia-387. Will change this behaviour, thanks! :)
root@0719ec7dda56:/# dpkg -l | grep -E 'cuda|nvidia'
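The `dpkg -l | grep` check above can be narrowed to installed driver packages only. A minimal sketch, assuming driver packages follow the `nvidia-<version>` / `libcuda1-<version>` naming seen in this thread; `list_driver_packages` is a hypothetical helper name, not part of any NVIDIA tooling.

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# installed ("ii") packages that look like versioned NVIDIA driver packages.
list_driver_packages() {
    awk '$1 == "ii" {
        name = $2
        sub(/:.*/, "", name)   # drop an architecture suffix such as ":amd64"
        if (name ~ /^(nvidia|libcuda1)-[0-9]+$/) print name
    }'
}

# Usage inside the container:
#   dpkg -l | list_driver_packages
```

Packages such as `nvidia-cuda-toolkit` are deliberately not matched, since they are not the driver itself.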
Filed an issue to libnvidia-container.
1. Issue or feature description
I am working on a fairly fat image for a GPU compute cluster + desktop and found that nvidia-docker2 has an issue starting somewhat larger images with multiple image layers. Without --runtime=nvidia the image works fine, even in Rancher.
2. Steps to reproduce the issue
docker run --runtime=nvidia -p 6080 twobombs/cudacluster sh /root/run
container_linux.go:265: starting container process caused "process_linux.go:368: container init caused "process_linux.go:351: running prestart hook 1 caused \"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --utility --require=cuda>=8.0 --pid=28288 /var/lib/docker/overlay2/46c7244e1bb0e481c9dabcc77d025fb5897d0de3ba272dbc51eec1f9dd599634/merged]\\nnvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/46c7244e1bb0e481c9dabcc77d025fb5897d0de3ba272dbc51eec1f9dd599634/merged/usr/bin/nvidia-smi: too many levels of symbolic links\\n\"""
docker: Error response from daemon: oci runtime error: container_linux.go:265: starting container process caused "process_linux.go:368: container init caused "process_linux.go:351: running prestart hook 1 caused \"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --utility --require=cuda>=8.0 --pid=28288 /var/lib/docker/overlay2/46c7244e1bb0e481c9dabcc77d025fb5897d0de3ba272dbc51eec1f9dd599634/merged]\\nnvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/46c7244e1bb0e481c9dabcc77d025fb5897d0de3ba272dbc51eec1f9dd599634/merged/usr/bin/nvidia-smi: too many levels of symbolic links \\n\""".
ERRO[0000] error waiting for container: context canceled
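The error names `usr/bin/nvidia-smi` under the container's root filesystem. A quick way to inspect that path is to start the image without `--runtime=nvidia` and check whether the file already exists and whether it is a symbolic link. The sketch below is diagnostic only: `describe_path` is a hypothetical helper name, and it makes no claim about how `nvidia-container-cli` handles the path internally.

```shell
# Hypothetical helper: report whether a path is a symlink (and its target),
# an existing non-symlink entry, or missing.
describe_path() {
    path=$1
    if [ -L "$path" ]; then
        echo "symlink: $path -> $(readlink "$path")"
    elif [ -e "$path" ]; then
        echo "exists: $path"
    else
        echo "missing: $path"
    fi
}

# Usage inside a container started without --runtime=nvidia:
#   describe_path /usr/bin/nvidia-smi
```

In an image without a driver installed, the path would be expected to show up as missing.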
3. Information to attach (optional if deemed irrelevant)
Linux 1604 4.4.0-87-generic #110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
dmesg: Nothing has been logged yet.
nvidia-smi -a
==============NVSMI LOG==============
Timestamp : Sun Dec 24 17:02:21 2017
Driver Version : 384.98
Attached GPUs : 4
GPU 00000000:06:00.0
Product Name : GeForce GTX 590
Product Brand : GeForce
Display Mode : N/A
Display Active : N/A
Persistence Mode : Disabled
Accounting Mode : N/A
Accounting Mode Buffer Size : N/A
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-64cb8516-eb24-ea73-40f2-68f31d0f97ca
Minor Number : 1
VBIOS Version : 70.10.42.00.02
MultiGPU Board : N/A
Board ID : N/A
GPU Part Number : N/A
Inforom Version
Image Version : N/A
OEM Object : N/A
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU Virtualization Mode
Virtualization mode : N/A
PCI
Bus : 0x06
Device : 0x00
Domain : 0x0000
Device Id : 0x108810DE
Bus Id : 00000000:06:00.0
Sub System Id : 0x83A31043
GPU Link Info
PCIe Generation
Max : N/A
Current : N/A
Link Width
Max : N/A
Current : N/A
Bridge Chip
Type : N/A
Firmware : N/A
Replays since reset : N/A
Tx Throughput : N/A
Rx Throughput : N/A
Fan Speed : 40 %
Performance State : P12
Clocks Throttle Reasons : N/A
FB Memory Usage
Total : 1472 MiB
Used : 524 MiB
Free : 948 MiB
BAR1 Memory Usage
Total : N/A
Used : N/A
Free : N/A
Compute Mode : Default
Utilization
Gpu : N/A
Memory : N/A
Encoder : N/A
Decoder : N/A
Encoder Stats
Active Sessions : N/A
Average FPS : N/A
Average Latency : N/A
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
GPU Current Temp : 44 C
GPU Shutdown Temp : N/A
GPU Slowdown Temp : N/A
GPU Max Operating Temp : N/A
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : N/A
Power Draw : N/A
Power Limit : N/A
Default Power Limit : N/A
Enforced Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Video : N/A
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Video : N/A
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Processes : N/A
GPU 00000000:07:00.0
Product Name : GeForce GTX 590
Product Brand : GeForce
Display Mode : N/A
Display Active : N/A
Persistence Mode : Disabled
Accounting Mode : N/A
Accounting Mode Buffer Size : N/A
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-77bcf3f2-d4b9-b6cf-4653-a47de1a9514b
Minor Number : 0
VBIOS Version : 70.10.42.00.02
MultiGPU Board : N/A
Board ID : N/A
GPU Part Number : N/A
Inforom Version
Image Version : N/A
OEM Object : N/A
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU Virtualization Mode
Virtualization mode : N/A
PCI
Bus : 0x07
Device : 0x00
Domain : 0x0000
Device Id : 0x108810DE
Bus Id : 00000000:07:00.0
Sub System Id : 0x83A31043
GPU Link Info
PCIe Generation
Max : N/A
Current : N/A
Link Width
Max : N/A
Current : N/A
Bridge Chip
Type : N/A
Firmware : N/A
Replays since reset : N/A
Tx Throughput : N/A
Rx Throughput : N/A
Fan Speed : 0 %
Performance State : P12
Clocks Throttle Reasons : N/A
FB Memory Usage
Total : 1474 MiB
Used : 73 MiB
Free : 1401 MiB
BAR1 Memory Usage
Total : N/A
Used : N/A
Free : N/A
Compute Mode : Default
Utilization
Gpu : N/A
Memory : N/A
Encoder : N/A
Decoder : N/A
Encoder Stats
Active Sessions : N/A
Average FPS : N/A
Average Latency : N/A
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
GPU Current Temp : 40 C
GPU Shutdown Temp : N/A
GPU Slowdown Temp : N/A
GPU Max Operating Temp : N/A
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : N/A
Power Draw : N/A
Power Limit : N/A
Default Power Limit : N/A
Enforced Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Video : N/A
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Video : N/A
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Processes : N/A
GPU 00000000:43:00.0
Product Name : GeForce GTX 590
Product Brand : GeForce
Display Mode : N/A
Display Active : N/A
Persistence Mode : Disabled
Accounting Mode : N/A
Accounting Mode Buffer Size : N/A
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-5ea1d88a-6467-587f-f09c-fe800c7f3447
Minor Number : 3
VBIOS Version : 70.10.37.00.02
MultiGPU Board : N/A
Board ID : N/A
GPU Part Number : N/A
Inforom Version
Image Version : N/A
OEM Object : N/A
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU Virtualization Mode
Virtualization mode : N/A
PCI
Bus : 0x43
Device : 0x00
Domain : 0x0000
Device Id : 0x108810DE
Bus Id : 00000000:43:00.0
Sub System Id : 0x83A31043
GPU Link Info
PCIe Generation
Max : N/A
Current : N/A
Link Width
Max : N/A
Current : N/A
Bridge Chip
Type : N/A
Firmware : N/A
Replays since reset : N/A
Tx Throughput : N/A
Rx Throughput : N/A
Fan Speed : 41 %
Performance State : P12
Clocks Throttle Reasons : N/A
FB Memory Usage
Total : 1474 MiB
Used : 73 MiB
Free : 1401 MiB
BAR1 Memory Usage
Total : N/A
Used : N/A
Free : N/A
Compute Mode : Default
Utilization
Gpu : N/A
Memory : N/A
Encoder : N/A
Decoder : N/A
Encoder Stats
Active Sessions : N/A
Average FPS : N/A
Average Latency : N/A
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
GPU Current Temp : 48 C
GPU Shutdown Temp : N/A
GPU Slowdown Temp : N/A
GPU Max Operating Temp : N/A
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : N/A
Power Draw : N/A
Power Limit : N/A
Default Power Limit : N/A
Enforced Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Video : N/A
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Video : N/A
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Processes : N/A
GPU 00000000:44:00.0
Product Name : GeForce GTX 590
Product Brand : GeForce
Display Mode : N/A
Display Active : N/A
Persistence Mode : Disabled
Accounting Mode : N/A
Accounting Mode Buffer Size : N/A
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-a2a63dce-dd34-3302-e2b4-cd1d4bbdee98
Minor Number : 2
VBIOS Version : 70.10.37.00.01
MultiGPU Board : N/A
Board ID : N/A
GPU Part Number : N/A
Inforom Version
Image Version : N/A
OEM Object : N/A
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU Virtualization Mode
Virtualization mode : N/A
PCI
Bus : 0x44
Device : 0x00
Domain : 0x0000
Device Id : 0x108810DE
Bus Id : 00000000:44:00.0
Sub System Id : 0x83A31043
GPU Link Info
PCIe Generation
Max : N/A
Current : N/A
Link Width
Max : N/A
Current : N/A
Bridge Chip
Type : N/A
Firmware : N/A
Replays since reset : N/A
Tx Throughput : N/A
Rx Throughput : N/A
Fan Speed : 0 %
Performance State : P0
Clocks Throttle Reasons : N/A
FB Memory Usage
Total : 1474 MiB
Used : 73 MiB
Free : 1401 MiB
BAR1 Memory Usage
Total : N/A
Used : N/A
Free : N/A
Compute Mode : Default
Utilization
Gpu : N/A
Memory : N/A
Encoder : N/A
Decoder : N/A
Encoder Stats
Active Sessions : N/A
Average FPS : N/A
Average Latency : N/A
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending : N/A
Temperature
GPU Current Temp : 65 C
GPU Shutdown Temp : N/A
GPU Slowdown Temp : N/A
GPU Max Operating Temp : N/A
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : N/A
Power Draw : N/A
Power Limit : N/A
Default Power Limit : N/A
Enforced Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Video : N/A
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Video : N/A
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Processes : N/A
docker version
Client:
Version: 17.09.1-ce
API version: 1.32
Go version: go1.8.3
Git commit: 19e2cf6
Built: Thu Dec 7 22:24:23 2017
OS/Arch: linux/amd64
Server:
Version: 17.09.1-ce
API version: 1.32 (minimum version 1.12)
Go version: go1.8.3
Git commit: 19e2cf6
Built: Thu Dec 7 22:23:00 2017
OS/Arch: linux/amd64
Experimental: false
NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
dpkg -l '*nvidia*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-======================-================-================-=================================================
ii libnvidia-container-to 1.0.0~alpha.2-1 amd64 NVIDIA container runtime library (command-line to
ii libnvidia-container1:a 1.0.0~alpha.2-1 amd64 NVIDIA container runtime library
ii nvidia-375 384.98-0ubuntu0~ amd64 Transitional package for nvidia-384
ii nvidia-384 384.98-0ubuntu0~ amd64 NVIDIA binary driver - version 384.98
un nvidia-common (no description available)
ii nvidia-container-runti 1.1.0+docker17.0 amd64 NVIDIA container runtime
ii nvidia-cuda-dev 7.5.18-0ubuntu1 amd64 NVIDIA CUDA development files
ii nvidia-cuda-doc 7.5.18-0ubuntu1 all NVIDIA CUDA and OpenCL documentation
ii nvidia-cuda-gdb 7.5.18-0ubuntu1 amd64 NVIDIA CUDA Debugger (GDB)
ii nvidia-cuda-toolkit 7.5.18-0ubuntu1 amd64 NVIDIA CUDA development toolkit
un nvidia-docker (no description available)
ii nvidia-docker2 2.0.1+docker17.0 all nvidia-docker CLI wrapper
un nvidia-driver (no description available)
un nvidia-driver-binary (no description available)
un nvidia-legacy-340xx-vd (no description available)
un nvidia-libopencl1 (no description available)
un nvidia-libopencl1-384 (no description available)
un nvidia-libopencl1-dev (no description available)
ii nvidia-modprobe 361.28-1 amd64 utility to load NVIDIA kernel modules and create
ii nvidia-opencl-dev:amd6 7.5.18-0ubuntu1 amd64 NVIDIA OpenCL development files
un nvidia-opencl-icd (no description available)
ii nvidia-opencl-icd-384 384.98-0ubuntu0~ amd64 NVIDIA OpenCL ICD
un nvidia-persistenced (no description available)
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA's Prime
ii nvidia-profiler 7.5.18-0ubuntu1 amd64 NVIDIA Profiler for CUDA and OpenCL
ii nvidia-settings 387.22-0ubuntu0~ amd64 Tool for configuring the NVIDIA graphics driver
un nvidia-settings-binary (no description available)
un nvidia-smi (no description available)
un nvidia-vdpau-driver (no description available)
ii nvidia-visual-profiler 7.5.18-0ubuntu1 amd64 NVIDIA Visual Profiler for CUDA and OpenCL
NVIDIA container library version from nvidia-container-cli -V
nvidia-container-cli -V
version: 1.0.0
build date: 2017-10-30T23:47+00:00
build revision: ec15c7233bd2de821ad5127cb0de6b52d9d2083c
build compiler: gcc-5 5.4.0 20160609
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
NVIDIA container library logs (see troubleshooting)
Docker command, image and tag used
docker run --runtime=nvidia -p 6080 twobombs/cudacluster sh /root/run