Repository Setup: Docker Build Failing #1

Open
fidsusj opened this issue Jul 29, 2022 · 3 comments

fidsusj commented Jul 29, 2022

Dear GCNSplit maintainers,

I was trying to set up the repository as shown in the README.md. When running docker build -t local-torch-geometric . I received the following error:

> [ 4/27] RUN apt-get update && apt-get install -y --no-install-recommends cuda-cudart-10-1=10.1.243-1 cuda-compat-10-1 && ln -s cuda-10.1 /usr/local/cuda && rm -rf /var/lib/apt/lists/*:
#7 0.287 Get:1 http://ports.ubuntu.com/ubuntu-ports bionic InRelease [242 kB]
#7 0.320 Get:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease [1581 B]
#7 0.388 Err:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease
...
#7 3.611 Reading package lists...
#7 4.173 W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
#7 4.173 E: The repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease' is not signed.

There are fixes proposed in NVIDIA/nvidia-container-toolkit#258, but applying them leads to new errors in the following steps: the cuda-cudart-10-1 and cuda-compat-10-1 packages cannot be located.
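
For reference, the key ID that apt reports as missing can be read straight from the GPG error line in the log. The sketch below is only an illustration: `missing_key_id` and `key_file_name` are hypothetical helper names, and the printed `apt-key adv --fetch-keys` command assumes that NVIDIA publishes the key under the repository URL as a `.pub` file named after the lowercase last eight hex digits of the ID. That assumption is not confirmed anywhere in this thread. The script prints the candidate command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: read apt output on stdin and print the first
# 16-digit key ID that follows "NO_PUBKEY".
missing_key_id() {
    sed -n 's/.*NO_PUBKEY \([0-9A-Fa-f]\{16\}\).*/\1/p' | head -n 1
}

# Hypothetical helper: build a key file name from the last 8 hex digits
# of the key ID, lowercased. The naming scheme is an assumption.
key_file_name() {
    printf '%s' "$1" | tail -c 8 | tr 'A-F' 'a-f'
    printf '.pub\n'
}

# GPG error line copied from the build log above.
log_line="W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC"
repo="https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64"

key_id=$(printf '%s\n' "$log_line" | missing_key_id)
echo "missing key: $key_id"
# Only printed, not executed: the URL layout is an assumption.
echo "candidate fix: apt-key adv --fetch-keys $repo/$(key_file_name "$key_id")"
```

In a Dockerfile, a command like the printed one would have to run before the failing `apt-get update` step.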

Could you check the Dockerfiles needed for installation again (after clearing the Docker build cache with docker builder prune) and bring them up to date? Alternatively, could you provide a pre-built Docker image on Docker Hub once the build works?

Thanks a lot and kind regards!

@soniahorchidan
Collaborator

Hello! Thank you for raising this issue and for your interest in the project!
I just updated the Dockerfile, and the installation should hopefully be fixed now. Could you please try again and let us know if it worked?

fidsusj commented Aug 23, 2022

Thanks a lot for including the fix! Unfortunately, as expected, this leads to the following follow-up error:

> [ 9/32] RUN apt-get update && apt-get install -y --no-install-recommends         cuda-cudart-10-1=10.1.243-1         cuda-compat-10-1 &&     ln -s cuda-10.1 /usr/local/cuda &&     rm -rf /var/lib/apt/lists/*:
#12 0.557 Hit:1 http://ports.ubuntu.com/ubuntu-ports bionic InRelease
#12 0.654 Get:2 http://ports.ubuntu.com/ubuntu-ports bionic-updates InRelease [88.7 kB]
#12 0.995 Get:3 http://ports.ubuntu.com/ubuntu-ports bionic-backports InRelease [74.6 kB]
#12 1.191 Get:4 http://ports.ubuntu.com/ubuntu-ports bionic-security InRelease [88.7 kB]
#12 1.370 Get:5 http://ports.ubuntu.com/ubuntu-ports bionic-updates/universe arm64 Packages [2064 kB]
#12 2.375 Get:6 http://ports.ubuntu.com/ubuntu-ports bionic-updates/main arm64 Packages [2018 kB]
#12 3.015 Get:7 http://ports.ubuntu.com/ubuntu-ports bionic-security/main arm64 Packages [1639 kB]
#12 3.478 Get:8 http://ports.ubuntu.com/ubuntu-ports bionic-security/universe arm64 Packages [1367 kB]
#12 10.28 Err:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease
#12 10.28   Temporary failure resolving 'developer.download.nvidia.com'
#12 20.28 Err:10 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  InRelease
#12 20.28   Temporary failure resolving 'developer.download.nvidia.com'
#12 20.30 Fetched 7339 kB in 20s (367 kB/s)
#12 20.30 Reading package lists...
#12 20.87 W: Failed to fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/InRelease  Temporary failure resolving 'developer.download.nvidia.com'
#12 20.87 W: Failed to fetch https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/InRelease  Temporary failure resolving 'developer.download.nvidia.com'
#12 20.87 W: Some index files failed to download. They have been ignored, or old ones used instead.
#12 20.89 Reading package lists...
#12 21.44 Building dependency tree...
#12 21.53 Reading state information...
#12 21.55 E: Unable to locate package cuda-cudart-10-1
#12 21.55 E: Unable to locate package cuda-compat-10-1
------
executor failed running [/bin/sh -c apt-get update && apt-get install -y --no-install-recommends         cuda-cudart-$CUDA_PKG_VERSION         cuda-compat-10-1 &&     ln -s cuda-10.1 /usr/local/cuda &&     rm -rf /var/lib/apt/lists/*]: exit code: 100

I can fetch https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/InRelease manually, but https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/InRelease returns 404. This is also unlikely to be just a temporary failure, since I have tried this over several weeks now.

I'm not a CUDA expert and could not find any related issues online. Do you have an idea what the problem might be?
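
One detail in the log above that may be relevant: the Ubuntu package indexes are fetched for arm64 (from ports.ubuntu.com), while the NVIDIA repository URLs end in x86_64. It is not clear from this thread whether that explains the "Unable to locate package" errors. The sketch below shows one way to surface such a mismatch from a build log. The helper names are made up for illustration, the two sample lines are excerpted from the log with their step prefixes dropped, and mapping x86_64 to Debian's amd64 is the only normalization applied.

```shell
#!/bin/sh
# Hypothetical helper: print the architectures apt fetched package
# indexes for, taken from lines such as "... arm64 Packages [2064 kB]".
apt_index_archs() {
    sed -n 's/.* \([a-z0-9]*\) Packages .*/\1/p' | sort -u
}

# Hypothetical helper: print the architecture directory of NVIDIA repo
# URLs (.../repos/<distro>/<arch>), normalized to Debian naming.
nvidia_repo_archs() {
    sed -n 's#.*developer\.download\.nvidia\.com/[^ ]*/repos/[^/ ]*/\([a-z0-9_]*\).*#\1#p' \
        | sed 's/^x86_64$/amd64/' | sort -u
}

# Two lines excerpted from the build log above.
log='Get:5 http://ports.ubuntu.com/ubuntu-ports bionic-updates/universe arm64 Packages [2064 kB]
Err:9 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease'

apt_arch=$(printf '%s\n' "$log" | apt_index_archs)
repo_arch=$(printf '%s\n' "$log" | nvidia_repo_archs)

if [ "$apt_arch" != "$repo_arch" ]; then
    echo "mismatch: apt indexes are for '$apt_arch', NVIDIA repo is for '$repo_arch'"
fi
```

If the architectures do differ, building for a matching platform, for example with docker build --platform linux/amd64, might be worth trying. This is a guess and has not been tested against this repository.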

@soniahorchidan
Collaborator

Hello!

I am sorry the fix did not work. I will look into this problem again soon. In the meantime, please install the dependencies manually and run the code outside the Docker container. I will get back to you once the problem is solved.

Best regards,
Sonia
