added a dockerfile #34

Merged
merged 6 commits into microsoft:main on Apr 6, 2023

HardwayLinka (Contributor) commented Apr 5, 2023

fixed #6

Please note that PyTorch programs inside the container can only use the GPU if the host's NVIDIA driver and GPUs are exposed to the container at launch, which requires the NVIDIA Container Toolkit on the host. For example, you can launch the container with:
docker run --gpus all -it <image_name>
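
To confirm that a container started this way actually sees a GPU, a quick check along the following lines may help. This is a minimal sketch and not part of the PR: the helper name `gpu_visible` is hypothetical, and it assumes `nvidia-smi` is available on the PATH inside the container when GPU support is active.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if nvidia-smi is present and lists at least one GPU.

    Hypothetical helper for illustration; it assumes nvidia-smi is on the
    PATH inside the container when GPU support is active.
    """
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],  # -L lists the detected GPUs, one per line
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU " in result.stdout


if __name__ == "__main__":
    print("GPU visible:", gpu_visible())
```

From a Python session with PyTorch installed, `torch.cuda.is_available()` answers the same question from PyTorch's side.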

ErikDombi (Contributor) commented

I'm getting an exception when installing the pip requirements: the download of the torch wheel from download.pytorch.org times out.

> [6/7] RUN pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/cu111/torch_stable.html:
#0 1.169 Looking in links: https://download.pytorch.org/whl/cu111/torch_stable.html
#0 1.870 Collecting torch==1.9.0+cu111
#0 1.891   Downloading https://download.pytorch.org/whl/cu111/torch-1.9.0%2Bcu111-cp38-cp38-linux_x86_64.whl (2041.3 MB)
#0 111.0 ERROR: Exception:
#0 111.0 Traceback (most recent call last):
#0 111.0   File "/usr/share/python-wheels/urllib3-1.25.8-py2.py3-none-any.whl/urllib3/response.py", line 425, in _error_catcher
#0 111.0     yield
#0 111.0   File "/usr/share/python-wheels/urllib3-1.25.8-py2.py3-none-any.whl/urllib3/response.py", line 507, in read
#0 111.0     data = self._fp.read(amt) if not fp_closed else b""
#0 111.0   File "/usr/share/python-wheels/CacheControl-0.12.6-py2.py3-none-any.whl/cachecontrol/filewrapper.py", line 62, in read
#0 111.0     data = self.__fp.read(amt)
#0 111.0   File "/usr/lib/python3.8/http/client.py", line 459, in read
#0 111.0     n = self.readinto(b)
#0 111.0   File "/usr/lib/python3.8/http/client.py", line 503, in readinto
#0 111.0     n = self.fp.readinto(b)
#0 111.0   File "/usr/lib/python3.8/socket.py", line 669, in readinto
#0 111.0     return self._sock.recv_into(b)
#0 111.0   File "/usr/lib/python3.8/ssl.py", line 1241, in recv_into
#0 111.0     return self.read(nbytes, buffer)
#0 111.0   File "/usr/lib/python3.8/ssl.py", line 1099, in read
#0 111.0     return self._sslobj.read(len, buffer)
#0 111.0 socket.timeout: The read operation timed out
#0 111.0
#0 111.0 During handling of the above exception, another exception occurred:
#0 111.0
#0 111.0 Traceback (most recent call last):
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/cli/base_command.py", line 186, in _main
#0 111.0     status = self.run(options, args)
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/commands/install.py", line 357, in run
#0 111.0     resolver.resolve(requirement_set)
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/legacy_resolve.py", line 177, in resolve
#0 111.0     discovered_reqs.extend(self._resolve_one(requirement_set, req))
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/legacy_resolve.py", line 333, in _resolve_one
#0 111.0     abstract_dist = self._get_abstract_dist_for(req_to_install)
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/legacy_resolve.py", line 282, in _get_abstract_dist_for
#0 111.0     abstract_dist = self.preparer.prepare_linked_requirement(req)
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 480, in prepare_linked_requirement
#0 111.0     local_path = unpack_url(
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 282, in unpack_url
#0 111.0     return unpack_http_url(
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 158, in unpack_http_url
#0 111.0     from_path, content_type = _download_http_url(
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/operations/prepare.py", line 303, in _download_http_url
#0 111.0     for chunk in download.chunks:
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/utils/ui.py", line 160, in iter
#0 111.0     for x in it:
#0 111.0   File "/usr/lib/python3/dist-packages/pip/_internal/network/utils.py", line 15, in response_chunks
#0 111.0     for chunk in response.raw.stream(
#0 111.0   File "/usr/share/python-wheels/urllib3-1.25.8-py2.py3-none-any.whl/urllib3/response.py", line 564, in stream
#0 111.0     data = self.read(amt=amt, decode_content=decode_content)
#0 111.0   File "/usr/share/python-wheels/urllib3-1.25.8-py2.py3-none-any.whl/urllib3/response.py", line 529, in read
#0 111.0     raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
#0 111.0   File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
#0 111.0     self.gen.throw(type, value, traceback)
#0 111.0   File "/usr/share/python-wheels/urllib3-1.25.8-py2.py3-none-any.whl/urllib3/response.py", line 430, in _error_catcher
#0 111.0     raise ReadTimeoutError(self._pool, None, "Read timed out.")
#0 111.0 urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='download.pytorch.org', port=443): Read timed out.
------
failed to solve: executor failed running [/bin/sh -c pip3 install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/cu111/torch_stable.html]: exit code: 2
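
The traceback ends in a `ReadTimeoutError` raised while downloading the 2041.3 MB torch wheel, which suggests a slow or interrupted connection rather than a broken requirement. One common workaround is to raise pip's socket timeout and retry count. The sketch below only builds such a command line; the helper name `pip_install_command` and the values 600 and 10 are illustrative assumptions, not settings taken from this PR. `--timeout` and `--retries` are standard pip options, though whether they are enough depends on the network.

```python
import shlex
from typing import List, Optional, Sequence


def pip_install_command(
    packages: Sequence[str],
    find_links: Optional[str] = None,
    timeout: int = 600,  # assumed value; pip's default socket timeout is much shorter
    retries: int = 10,   # assumed value, chosen for illustration only
) -> List[str]:
    """Build a pip3 install command with a longer timeout and more retries."""
    cmd = ["pip3", "install", "--timeout", str(timeout), "--retries", str(retries)]
    if find_links:
        # -f / --find-links points pip at an extra page of wheel links
        cmd += ["-f", find_links]
    cmd += list(packages)
    return cmd


if __name__ == "__main__":
    cmd = pip_install_command(
        ["torch==1.9.0+cu111", "torchvision==0.10.0+cu111", "torchaudio==0.9.0"],
        find_links="https://download.pytorch.org/whl/cu111/torch_stable.html",
    )
    # Print a shell-ready string, e.g. for a Dockerfile RUN line
    print(shlex.join(cmd))
```

The printed command could replace the failing `RUN pip3 install ...` line in the Dockerfile; rerunning the build on a more stable connection is the other obvious thing to try.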

bladexxx commented Apr 5, 2023

Thanks for this Dockerfile. I tried it locally: the image builds fine, but the container exits with the error below as soon as it starts. @HardwayLinka, any idea what causes this?

=> exporting to image 84.6s
=> => exporting layers 84.5s
=> => writing image sha256:e205dffc7dede48b4c016d16b98086b646e58f70c07061305a09289a2d79f46a 0.0s
=> => naming to docker.io/library/jarvis-jarvis 0.0s
[+] Running 2/2
  • Network jarvis_default Created 0.1s
  • Container jarvis-jarvis-1 Created 0.2s
Attaching to jarvis-jarvis-1
jarvis-jarvis-1 |
jarvis-jarvis-1 | ==========
jarvis-jarvis-1 | == CUDA ==
jarvis-jarvis-1 | ==========
jarvis-jarvis-1 |
jarvis-jarvis-1 | CUDA Version 11.4.2
jarvis-jarvis-1 |
jarvis-jarvis-1 | Container image Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
jarvis-jarvis-1 |
jarvis-jarvis-1 | This container image and its contents are governed by the NVIDIA Deep Learning Container License.
jarvis-jarvis-1 | By pulling and using the container, you accept the terms and conditions of this license:
jarvis-jarvis-1 | https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
jarvis-jarvis-1 |
jarvis-jarvis-1 | A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
jarvis-jarvis-1 |
jarvis-jarvis-1 | WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
jarvis-jarvis-1 | Use the NVIDIA Container Toolkit to start this container with GPU support; see
jarvis-jarvis-1 | https://docs.nvidia.com/datacenter/cloud-native/ .
jarvis-jarvis-1 |
jarvis-jarvis-1 | *************************
jarvis-jarvis-1 | ** DEPRECATION NOTICE! **
jarvis-jarvis-1 | *************************
jarvis-jarvis-1 | THIS IMAGE IS DEPRECATED and is scheduled for DELETION.
jarvis-jarvis-1 | https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/support-policy.md
jarvis-jarvis-1 |
jarvis-jarvis-1 | python3: can't open file 'models_server.py': [Errno 2] No such file or directory
jarvis-jarvis-1 exited with code 2
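
The last log line shows Python failing to find `models_server.py` relative to the container's working directory. A plausible cause, which the log alone cannot confirm, is that the working directory or the copied paths in the image do not match where the script actually lives. A small diagnostic such as the following could be run inside the container to locate the file. It is a sketch only: `find_script` is a hypothetical helper, and the search root passed on the command line is an assumption about where the repository was copied.

```python
import sys
from pathlib import Path
from typing import List


def find_script(name: str, root: str = ".") -> List[Path]:
    """Return every regular file under root whose file name equals name."""
    return sorted(p for p in Path(root).rglob(name) if p.is_file())


if __name__ == "__main__":
    # The search root is an assumption; pass the directory the repo was copied to.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    print("working directory:", Path.cwd())
    matches = find_script("models_server.py", root)
    if not matches:
        print("models_server.py not found under", root)
    for match in matches:
        print("found:", match.resolve())
```

If the script turns up in a subdirectory, the start command or working directory would need to point there.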

tricktreat (Contributor) commented Apr 5, 2023

@HardwayLinka Thanks for the Dockerfile. There have been some minor changes in the code; please update the commit accordingly.

tricktreat (Contributor) left a comment

I hope these two comments can be addressed before merging. Thanks.

Dockerfile (outdated, resolved)
docker-compose.yml (outdated, resolved)

roggrat left a comment

Does this image exist? The closest CUDA image I could find is 11.3.1-cudnn8-runtime-ubuntu16.04.

tricktreat merged commit e7792fd into microsoft:main on Apr 6, 2023
1 check passed

Successfully merging this pull request may close these issues.

Create dockerfile
5 participants