Unable to run the voltaml/volta_diffusion:v0.1 docker image #3

Closed
wywywywy opened this issue Nov 23, 2022 · 5 comments

wywywywy commented Nov 23, 2022

-> % sudo docker run -it --gpus all voltaml/volta_diffusion:v0.1 bash
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/e049fdb3bc56fecdeefb3b950034cbc757eeb166b152330d00ef6e8a2972af06/merged/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: file exists: unknown.
ERRO[0000] error waiting for container: context canceled

This is probably because, when --gpus=all is specified, Docker (via nvidia-container-cli) tries to mount the NVIDIA and CUDA libraries into the container. Some of those files already exist in the image (e.g. /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1), and they appear to be links rather than regular files, so the mount fails with "file exists".
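
One way to check this hypothesis is to look at the path inside a container started without --gpus, so that the NVIDIA hook does not run. The helper below is only a sketch: the function name classify_path is made up for illustration, and it assumes a POSIX shell plus readlink are available inside the image.

```shell
# Hypothetical helper (not from the thread): report what kind of entry a path is.
# -L is tested first because -f follows symlinks, so a link that points at a
# regular file would otherwise be reported as a regular file.
classify_path() {
    if [ -L "$1" ]; then
        echo "symlink -> $(readlink "$1")"
    elif [ -f "$1" ]; then
        echo "regular file"
    elif [ -e "$1" ]; then
        echo "other"
    else
        echo "missing"
    fi
}

# Usage, inside a shell started WITHOUT --gpus, e.g.
#   docker run --rm -it voltaml/volta_diffusion:v0.1 bash
# paste the function and run:
#   classify_path /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
```

Anything other than "missing" means the image already ships an entry at the path that nvidia-container-cli wants to create.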

Could you please open-source the Dockerfile as well?

Pop115 commented Nov 25, 2022

Same issue here. I found a related issue on the nvidia-docker repo: NVIDIA/nvidia-container-toolkit#289

I made a Dockerfile containing this

RUN rm -rf /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1 /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.1 /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1

and executed it with
docker build -t voltaml/volta_diffusion -f Dockerfile .

And it seems to work
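
For anyone reproducing this, the workaround can be sketched as a small script that writes out a complete Dockerfile. The RUN line is the one quoted above, only split across lines. The FROM line is an assumption, since the comment does not show it; the published v0.1 image is the obvious base. The function name is made up for illustration.

```shell
# Hypothetical helper: print a workaround Dockerfile to stdout.
# ASSUMPTION: the FROM line is not in the original comment; the image tag
# from the issue title is used as the base image.
write_workaround_dockerfile() {
    cat <<'EOF'
FROM voltaml/volta_diffusion:v0.1
RUN rm -rf /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 \
    /usr/lib/x86_64-linux-gnu/libcuda.so.1 \
    /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1 \
    /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.1 \
    /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1
EOF
}

# Usage:
#   write_workaround_dockerfile > Dockerfile
#   docker build -t voltaml/volta_diffusion -f Dockerfile .
#   sudo docker run -it --gpus all voltaml/volta_diffusion bash
```

Removing the baked-in copies leaves those paths free, so the driver libraries from the host can be mounted when the container starts.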

VoltaML (Collaborator) commented Nov 25, 2022

Added the Dockerfile. Please check and close the issue if it's working.

Pop115 commented Nov 25, 2022

I tried building your Dockerfile with this command:
docker build -t voltaml/volta_diffusion -f Dockerfile .
but got the following error:

 => ERROR [6/6] RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt 1.6s
------
 > [6/6] RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt:
#9 0.729 ERROR: Could not open requirements file: [Errno 2] No such file or directory: '/code/requirements.txt'
------
executor failed running [/bin/sh -c pip install --no-cache-dir --upgrade -r /code/requirements.txt]: exit code: 1

If the command is not correct, please add build instructions to the README.
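
The error means that /code/requirements.txt did not exist in the image when the pip step ran. One possible cause, which the thread does not confirm, is that requirements.txt was missing from the build context, for example because only the Dockerfile was downloaded. A preflight check along these lines would catch that case before the build starts. The function name and the expected file layout are assumptions.

```shell
# Hypothetical preflight check (not from the thread).
# ASSUMPTION: the Dockerfile expects requirements.txt at the top level of the
# build context and copies it to /code itself.
check_build_context() {
    context_dir="${1:-.}"
    if [ ! -f "$context_dir/Dockerfile" ]; then
        echo "missing: $context_dir/Dockerfile" >&2
        return 1
    fi
    if [ ! -f "$context_dir/requirements.txt" ]; then
        echo "missing: $context_dir/requirements.txt" >&2
        return 1
    fi
    echo "ok"
}

# Usage:
#   check_build_context . && docker build -t voltaml/volta_diffusion -f Dockerfile .
```

If the check passes and the build still fails the same way, the Dockerfile itself is probably not copying the file into /code before the pip step.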

JackCloudman commented

Download this file
https://gist.github.com/JackCloudman/7143c7aeaafa54ed35b3f6cfe8a30c57

docker build -t voltaml/volta_diffusion:v0.1 -f Dockerfile .
docker run -it --gpus=all -p "8888:8888" voltaml/volta_diffusion:v0.1 jupyter lab --port=8888 --no-browser --ip 0.0.0.0 --allow-root

harishprabhala (Collaborator) commented

Updated the Docker image to v0.2. Please test.

XmYx added a commit to XmYx/voltaML-fast-stable-diffusion-low-mem that referenced this issue Dec 9, 2022
Stax124 closed this as completed Jan 28, 2023