This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

Problems in importing an image. #760

Closed
5 of 8 tasks
Xeriou opened this issue Jun 8, 2018 · 1 comment

Comments

@Xeriou

Xeriou commented Jun 8, 2018

1. Issue or feature description

I have the same problem as #174.
It seems to have been resolved in nvidia-docker2, but I can't get it to work.

2. Steps to reproduce the issue

I installed nvidia-docker with:
apt install nvidia-docker2

Pull the image I need:
nvidia-docker pull nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04

Run it:
nvidia-docker run -it <image> /bin/bash
Then install Python from apt inside the container.

Commit the container as a new image:
nvidia-docker commit <container> nvtest

Save the image to a tar archive:
nvidia-docker save -o nvtest.tar nvtest

Remove the nvtest image and the container from the host.

Import the image from nvtest.tar:
nvidia-docker import nvtest.tar nvtest

Run a command:
docker run --runtime=nvidia --rm a81cce78e47c nvidia-smi

The output is:
docker: Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH": unknown.

Changing nvidia-smi to /bin/bash gives the same error.

I then tried another way to move the container.

Export the container:
nvidia-docker export -o nvtest.tgz b93807e16849

Import the archive from nvtest.tgz:
cat nvtest.tgz | nvidia-docker import - nvtest2

Now I can run bash in the container!
However, nvcc is not in $PATH (it is still in /usr/local/cuda/bin),
and when I run ldconfig, the output is:

/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libcuda.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.384.130 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libcuda.so.1 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1 is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libcuda.so is empty, not checked.
/sbin/ldconfig.real: File /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.384.130 is empty, not checked.

/usr/lib/x86_64-linux-gnu/libcuda.so.384.130 is now a 0-byte file.
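
The empty libraries reported by ldconfig can also be listed directly. This is a minimal sketch using standard find; empty_libs is a hypothetical helper name, and the default directory is simply the one that appears in the ldconfig output above.

```shell
# Hypothetical helper: list zero-byte shared libraries in a directory.
# Defaults to the directory seen in the ldconfig output of this report.
empty_libs() {
    dir=${1:-/usr/lib/x86_64-linux-gnu}
    # -type f skips symlinks; -size 0 matches only files with no content.
    find "$dir" -maxdepth 1 -type f -name 'lib*.so*' -size 0 | sort
}
```

Running empty_libs inside the imported container should print the same files that ldconfig complains about.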

3. Information to attach (optional if deemed irrelevant)

  • Kernel version from uname -a

Linux yichiun-ubuntu1604 4.13.0-43-generic #48~16.04.1-Ubuntu SMP Thu May 17 12:56:46 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • Any relevant kernel output lines from dmesg
  • Driver information from nvidia-smi -a

==============NVSMI LOG==============

Timestamp : Fri Jun 8 15:34:54 2018
Driver Version : 384.130

Attached GPUs : 1
GPU 00000000:01:00.0
Product Name : GeForce GTX 1060 6GB
Product Brand : GeForce
Display Mode : Enabled
Display Active : Enabled
Persistence Mode : Disabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 1920
......

  • Docker version from docker version

Client:
Version: 18.03.1-ce
API version: 1.37
Go version: go1.9.5
Git commit: 9ee9f40
Built: Thu Apr 26 07:17:20 2018
OS/Arch: linux/amd64
Experimental: false
Orchestrator: swarm

Server:
Engine:
Version: 18.03.1-ce
API version: 1.37 (minimum version 1.12)
Go version: go1.9.5
Git commit: 9ee9f40
Built: Thu Apr 26 07:15:30 2018
OS/Arch: linux/amd64
Experimental: false

  • NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'

nvidia-container-runtime 2.0.0+docker18.03.1-1 amd64 NVIDIA container runtime
nvidia-container-runtime-hook 1.3.0-1 amd64 NVIDIA container runtime hook
nvidia-docker (no description available)
nvidia-docker2 2.0.3+docker18.03.1-1 all nvidia-docker CLI wrapper

  • NVIDIA container library version from nvidia-container-cli -V

version: 1.0.0
build date: 2018-04-26T22:53+00:00
build revision: 163054a04b21c4455c8cae7e47873d9f2a091f55
build compiler: gcc-5 5.4.0 20160609
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections

  • NVIDIA container library logs (see troubleshooting)
  • Docker command, image and tag used
@flx42
Member

flx42 commented Jun 8, 2018

nvidia-docker save -o nvtest.tar nvtest
nvidia-docker import nvtest.tar nvtest

This is wrong: after a docker save, you need to use docker load, not docker import.
I'm surprised it works at all, but it's clearly not doing what you expect.
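
To tell the two archive formats apart before picking a command, here is a minimal sketch. It assumes that a docker save archive carries a manifest.json entry at its top level, while a docker export archive is a plain root filesystem. archive_kind is a hypothetical helper name, not part of docker or nvidia-docker.

```shell
# Hypothetical helper: print "load" for a `docker save` archive and
# "import" for anything else (for example a `docker export` archive).
archive_kind() {
    # List the archive members; bail out if tar cannot read the file.
    listing=$(tar -tf "$1" 2>/dev/null) || { echo "unreadable"; return 1; }
    nl='
'
    # Assumption: only `docker save` output has a top-level manifest.json.
    case "$nl$listing$nl" in
        *"${nl}manifest.json${nl}"*) echo "load" ;;
        *) echo "import" ;;
    esac
}
```

For the nvtest.tar produced by docker save in the steps above, this should print load, and the matching command would be docker load -i nvtest.tar.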

For your second problem, docker export drops all of the image's environment variables.
But you need those environment variables to trigger GPU support:
https://github.com/nvidia/nvidia-container-runtime#environment-variables-oci-spec
A fix would be:

cat nvtest.tgz | docker import --change "ENV NVIDIA_VISIBLE_DEVICES=all" - nvtest2
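
Since the export loses every environment variable, more than one may need to be restored (the missing nvcc in $PATH points the same way). The sketch below only prints a docker import command with one --change "ENV ..." per variable so it can be reviewed before running. import_cmd is a hypothetical helper, and the PATH value is an assumption based on the nvcc location reported above, not the original image's exact setting. Values containing spaces are not handled.

```shell
# Hypothetical helper: print (not run) a `docker import` command that
# restores the given KEY=VALUE pairs as ENV settings on the new image.
import_cmd() {
    archive=$1
    image=$2
    shift 2
    printf 'docker import'
    for kv in "$@"; do
        printf ' --change "ENV %s"' "$kv"
    done
    printf ' %s %s\n' "$archive" "$image"
}

# Example: the PATH value here is an assumption for illustration.
import_cmd nvtest.tgz nvtest2 \
    NVIDIA_VISIBLE_DEVICES=all \
    PATH=/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
```

The printed command passes the archive as a file argument instead of piping it through cat; docker import accepts either form.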

@flx42 flx42 closed this as completed Jun 8, 2018