This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

Please assist - unable to get update for my distribution #698

Closed
6 tasks
eugenepptech opened this issue Apr 12, 2018 · 4 comments

Comments

@eugenepptech

eugenepptech commented Apr 12, 2018

Hi All,
Can anyone help me? I am unable to run nvidia-docker-compose build on my server.


1. Issue or feature description

Unable to run nvidia-docker-compose build; it fails with "Connection refused" (traceback below).

2. Steps to reproduce the issue

sudo nvidia-docker-compose build

3. Information to attach (optional if deemed irrelevant)

Traceback (most recent call last):
  File "/usr/lib/python3.5/urllib/request.py", line 1254, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "/usr/lib/python3.5/http/client.py", line 1106, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.5/http/client.py", line 1151, in _send_request
    self.endheaders(body)
  File "/usr/lib/python3.5/http/client.py", line 1102, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python3.5/http/client.py", line 934, in _send_output
    self.send(msg)
  File "/usr/lib/python3.5/http/client.py", line 877, in send
    self.connect()
  File "/usr/lib/python3.5/http/client.py", line 849, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/usr/lib/python3.5/socket.py", line 711, in create_connection
    raise err
  File "/usr/lib/python3.5/socket.py", line 702, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/nvidia-docker-compose", line 55, in <module>
    resp = request.urlopen('http://{0}/docker/cli/json'.format(args.nvidia_docker_host)).read().decode()
  File "/usr/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 466, in open
    response = self._open(req, data)
  File "/usr/lib/python3.5/urllib/request.py", line 484, in _open
    '_open', req)
  File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 1282, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/usr/lib/python3.5/urllib/request.py", line 1256, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 111] Connection refused>
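
For readers hitting the same traceback: the second exception shows nvidia-docker-compose requesting http://{host}/docker/cli/json and the connection being refused, which usually means nothing is listening at that address. Below is a minimal sketch for checking that endpoint on its own, assuming Python 3. The function name probe_plugin and the address localhost:3476 are assumptions for illustration (3476 is believed to be the nvidia-docker 1.0 plugin default, so verify it on your system).

```python
from urllib import error, request


def probe_plugin(host, timeout=2.0):
    """Return (ok, detail) for the endpoint that nvidia-docker-compose queries.

    `host` is a "name:port" string; this helper is a hypothetical diagnostic,
    not part of nvidia-docker or nvidia-docker-compose.
    """
    url = 'http://{0}/docker/cli/json'.format(host)
    # Bypass any configured HTTP proxy: the plugin is expected to be local.
    opener = request.build_opener(request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return True, resp.read().decode()
    except error.URLError as exc:
        # urllib wraps the low-level socket error in URLError.reason.
        if isinstance(exc.reason, ConnectionRefusedError):
            return False, 'nothing is listening on {0}'.format(host)
        return False, 'request failed: {0}'.format(exc.reason)
    except OSError as exc:
        # Covers errors raised outside URLError, such as a read timeout.
        return False, 'request failed: {0}'.format(exc)


if __name__ == '__main__':
    # Assumed default address; adjust to wherever the plugin should listen.
    ok, detail = probe_plugin('localhost:3476')
    print('plugin reachable' if ok else 'plugin unreachable: ' + detail)
```

If this reports that nothing is listening, the problem lies with the service behind that address rather than with the compose file.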

  • Kernel version from uname -a --> Linux gpusvr 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:46 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description: Ubuntu 16.04.3 LTS
    Release: 16.04
    Codename: xenial

  • Any relevant kernel output lines from dmesg
    [Thu Apr 12 11:27:38 2018] device veth27f71e4 entered promiscuous mode
    [Thu Apr 12 11:27:38 2018] IPv6: ADDRCONF(NETDEV_UP): veth27f71e4: link is not ready
    [Thu Apr 12 11:27:38 2018] eth0: renamed from vethe4974ab
    [Thu Apr 12 11:27:38 2018] IPv6: ADDRCONF(NETDEV_CHANGE): veth27f71e4: link becomes ready
    [Thu Apr 12 11:27:38 2018] docker0: port 2(veth27f71e4) entered forwarding state
    [Thu Apr 12 11:27:38 2018] docker0: port 2(veth27f71e4) entered forwarding state
    [Thu Apr 12 11:27:53 2018] docker0: port 2(veth27f71e4) entered forwarding state
    [Thu Apr 12 11:50:53 2018] docker0: port 2(veth27f71e4) entered disabled state
    [Thu Apr 12 11:50:53 2018] vethe4974ab: renamed from eth0
    [Thu Apr 12 11:50:53 2018] docker0: port 2(veth27f71e4) entered disabled state
    [Thu Apr 12 11:50:53 2018] device veth27f71e4 left promiscuous mode
    [Thu Apr 12 11:50:53 2018] docker0: port 2(veth27f71e4) entered disabled state

  • Driver information from nvidia-smi -a
    Driver Version : 384.111

  • Docker version from docker version
    Client:
    Version: 17.12.0-ce
    API version: 1.35
    Go version: go1.9.2
    Git commit: c97c6d6
    Built: Wed Dec 27 20:08:17 2017
    OS/Arch: linux/ppc64le

    Server:
    Engine:
    Version: 17.12.0-ce
    API version: 1.35 (minimum version 1.12)
    Go version: go1.9.2
    Git commit: c97c6d6
    Built: Wed Dec 27 20:06:35 2017
    OS/Arch: linux/ppc64le
    Experimental: false

  • NVIDIA packages version from dpkg -l '*nvidia*' or rpm -qa '*nvidia*'
    Desired=Unknown/Install/Remove/Purge/Hold
    | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
    |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
    ||/ Name Version Architecture Description
    +++-====================================-=======================-=======================-=============================================================================
    un bumblebee-nvidia (no description available)
    rc nvidia-361 361.119-0ubuntu1 ppc64el NVIDIA binary driver - version 361.119
    ii nvidia-384 384.111-0ubuntu1 ppc64el NVIDIA binary driver - version 384.111
    un nvidia-common (no description available)
    un nvidia-cuda-dev (no description available)
    un nvidia-cuda-doc (no description available)
    un nvidia-cuda-gdb (no description available)
    rc nvidia-cuda-toolkit 7.5.18-0ubuntu1 ppc64el NVIDIA CUDA development toolkit
    un nvidia-current (no description available)
    ii nvidia-docker 1.0.1-1 ppc64el NVIDIA Docker container tools
    un nvidia-driver-binary (no description available)
    ii nvidia-driver-local-repo-ubuntu1604- 1.0-1 ppc64el nvidia-driver-local repository configuration files
    un nvidia-libopencl1 (no description available)
    un nvidia-libopencl1-384 (no description available)
    un nvidia-libopencl1-dev (no description available)
    ii nvidia-modprobe 390.31-0ubuntu1 ppc64el Load the NVIDIA kernel driver and create device files
    un nvidia-opencl-dev (no description available)
    un nvidia-opencl-icd (no description available)
    un nvidia-opencl-icd-361 (no description available)
    ii nvidia-opencl-icd-384 384.111-0ubuntu1 ppc64el NVIDIA OpenCL ICD
    un nvidia-persistenced (no description available)
    un nvidia-prime (no description available)
    un nvidia-profiler (no description available)
    ii nvidia-settings 396.18-0ubuntu0~gpu16.0 ppc64el Tool for configuring the NVIDIA graphics driver
    un nvidia-settings-binary (no description available)
    un nvidia-visual-profiler (no description available)

  • NVIDIA container library version from nvidia-container-cli -V
    nvidia-container-cli: command not found

@flx42
Member

flx42 commented Apr 12, 2018

Hello @eugenepptech, we do not maintain nvidia-docker-compose.

You should switch to nvidia-docker2 (look at the latest instructions on GitHub), then with a recent version of compose you will be able to set the runtime: docker/compose#5405
For instance: https://github.com/3XX0/prometheus-dcgm/blob/master/docker-compose.yml#L32
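
As a rough illustration of what "set the runtime" could look like, the sketch below builds a minimal compose document in Python and prints it as JSON, which is also valid YAML and can be saved as docker-compose.yml. The service name gpu-app, the image example/gpu-image:latest, and the helper compose_with_runtime are placeholders invented for this example. The file format version 2.3 is an assumption about which compose format accepts the runtime key, so check it against your compose version.

```python
import json


def compose_with_runtime(service, image, runtime='nvidia'):
    """Build a minimal compose document that selects a container runtime.

    Hypothetical helper for illustration; `runtime='nvidia'` assumes the
    runtime name registered by nvidia-docker2.
    """
    return {
        'version': '2.3',  # assumption: a file format that accepts `runtime`
        'services': {
            service: {
                'image': image,
                'runtime': runtime,
            },
        },
    }


if __name__ == '__main__':
    # Placeholder service and image names, not taken from this thread.
    doc = compose_with_runtime('gpu-app', 'example/gpu-image:latest')
    print(json.dumps(doc, indent=2))
```

The only part that matters is the runtime entry under the service; everything else is ordinary compose configuration.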

flx42 closed this as completed on Apr 12, 2018
@eugenepptech
Author

Okay, thank you for the comment.

@eugenepptech
Author

We formatted the server and reinstalled everything.

The issue now: we are unable to update the NVIDIA driver to version 384.66.
The output below shows that apt-get update cannot fetch the libnvidia-container and nvidia-container-runtime package indexes for ubuntu16.04/ppc64el.
I hope you can assist. Thank you very much.

================================================================
root@PPTMINSKYSVR:/home/ppt/software# sudo dpkg -i nvidia-driver-local-repo-ubuntu1604-384*.deb
(Reading database ... 125371 files and directories currently installed.)
Preparing to unpack nvidia-driver-local-repo-ubuntu1604-384.66_1.0-1_ppc64el.deb ...
Unpacking nvidia-driver-local-repo-ubuntu1604-384.66 (1.0-1) over (1.0-1) ...
Setting up nvidia-driver-local-repo-ubuntu1604-384.66 (1.0-1) ...
root@PPTMINSKYSVR:/home/ppt/software# sudo apt-get update
Get:1 file:/var/cuda-repo-8-0-local-ga2v2 InRelease
Ign:1 file:/var/cuda-repo-8-0-local-ga2v2 InRelease
Get:2 file:/var/nvidia-driver-local-repo-384.66 InRelease
Ign:2 file:/var/nvidia-driver-local-repo-384.66 InRelease
Get:3 file:/var/cuda-repo-8-0-local-ga2v2 Release [574 B]
Get:4 file:/var/nvidia-driver-local-repo-384.66 Release [574 B]
Get:3 file:/var/cuda-repo-8-0-local-ga2v2 Release [574 B]
Get:4 file:/var/nvidia-driver-local-repo-384.66 Release [574 B]
Hit:7 https://download.docker.com/linux/ubuntu xenial InRelease
Hit:8 http://public.dhe.ibm.com/software/server/POWER/Linux/mldl/ubuntu xenial InRelease
Hit:9 http://sg.ports.ubuntu.com/ubuntu-ports xenial InRelease
Hit:10 http://ports.ubuntu.com/ubuntu-ports xenial-security InRelease
Hit:11 http://sg.ports.ubuntu.com/ubuntu-ports xenial-updates InRelease
Ign:12 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el InRelease
Hit:13 http://sg.ports.ubuntu.com/ubuntu-ports xenial-backports InRelease
Ign:14 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el InRelease
Hit:15 https://nvidia.github.io/nvidia-docker/ubuntu16.04/ppc64el InRelease
Ign:16 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Release
Ign:17 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Release
Ign:18 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Packages
Ign:19 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en_SG
Ign:20 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en
Ign:21 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Packages
Ign:22 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en_SG
Ign:23 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en
Ign:18 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Packages
Ign:19 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en_SG
Ign:20 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en
Ign:21 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Packages
Ign:22 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en_SG
Ign:23 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en
Ign:18 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Packages
Ign:19 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en_SG
Ign:20 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en
Ign:21 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Packages
Ign:22 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en_SG
Ign:23 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en
Ign:18 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Packages
Ign:19 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en_SG
Ign:20 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en
Ign:21 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Packages
Ign:22 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en_SG
Ign:23 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en
Ign:18 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Packages
Ign:19 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en_SG
Ign:20 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en
Ign:21 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Packages
Ign:22 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en_SG
Ign:23 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en
Err:18 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Packages
404 Not Found
Ign:19 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en_SG
Ign:20 https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Translation-en
Err:21 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Packages
404 Not Found
Ign:22 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en_SG
Ign:23 https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Translation-en
Reading package lists... Done
W: The repository 'https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el Release' does not have a Release file.
N: Data from such a repository can't be authenticated and is therefore potentially dangerous to use.
N: See apt-secure(8) manpage for repository creation and user configuration details.
W: The repository 'https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el Release' does not have a Release file.
N: Data from such a repository can't be authenticated and is therefore potentially dangerous to use.
N: See apt-secure(8) manpage for repository creation and user configuration details.
E: Failed to fetch https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el/Packages 404 Not Found
E: Failed to fetch https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el/Packages 404 Not Found
E: Some index files failed to download. They have been ignored, or old ones used instead.
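
For anyone sifting through a long apt-get update log like the one above, here is a small sketch that pulls out the repositories that actually failed and the architecture directory they were requested for. It assumes the "E: Failed to fetch <url> <reason>" line shape seen in this output. The helper names failed_fetches and architecture_of are made up for this example, and SAMPLE is copied from the last lines of the log.

```python
import re

# Matches lines such as:
# E: Failed to fetch https://.../ppc64el/Packages 404 Not Found
FAILED_FETCH = re.compile(r'^E: Failed to fetch (\S+)\s+(.*)$')

SAMPLE = """\
E: Failed to fetch https://nvidia.github.io/libnvidia-container/ubuntu16.04/ppc64el/Packages 404 Not Found
E: Failed to fetch https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/ppc64el/Packages 404 Not Found
E: Some index files failed to download. They have been ignored, or old ones used instead.
"""


def failed_fetches(apt_output):
    """Return (url, reason) pairs for every 'E: Failed to fetch' line."""
    results = []
    for line in apt_output.splitlines():
        match = FAILED_FETCH.match(line.strip())
        if match:
            results.append((match.group(1), match.group(2).strip()))
    return results


def architecture_of(url):
    """Return the path component just before 'Packages', or None."""
    parts = url.rstrip('/').split('/')
    if len(parts) >= 2 and parts[-1] == 'Packages':
        return parts[-2]
    return None


if __name__ == '__main__':
    for url, reason in failed_fetches(SAMPLE):
        print('{0}: {1} (arch dir: {2})'.format(url, reason, architecture_of(url)))
```

A 404 on the Packages index for a single architecture directory, while other repositories update fine, suggests the index is simply not published at that path.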

eugenepptech changed the title from "Please assist - unable to run nvidia-docker-compose build" to "Please assist - unable to get update for my distribution" on Apr 13, 2018
@flx42
Member

flx42 commented Apr 13, 2018

Oh, you're using a Power machine. We still haven't released the packages for this architecture, sorry.
