
Use nvidia-container-toolkit instead of nvidia-docker2 to expose GPUs in Cortex local #1223

Closed
vishalbollu opened this issue Jul 16, 2020 · 7 comments · Fixed by #1366

Comments

@vishalbollu
Contributor

Description

Cortex local currently relies on setting up a Docker runtime with nvidia-docker2 to access GPUs. This method is deprecated as of Docker version 19.03. For Docker versions >= 19.03, GPUs should be accessible via the --gpus all flag after installing nvidia-container-toolkit (https://github.com/NVIDIA/nvidia-docker#quickstart).

If possible, support both ways of exposing GPUs to Cortex local.
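
For reference, here is a rough sketch of the two approaches using the docker Python SDK (the image name and SDK usage are illustrative only, not Cortex's actual implementation):

import docker  # the docker Python SDK (pip install docker)

client = docker.from_env()

# Newer approach (Docker >= 19.03 with nvidia-container-toolkit),
# equivalent to `docker run --gpus all ...`:
client.containers.run(
    "nvidia/cuda:10.1-base",
    "nvidia-smi",
    device_requests=[docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])],
)

# Deprecated approach (nvidia-docker2 runtime),
# equivalent to `docker run --runtime=nvidia ...`:
client.containers.run(
    "nvidia/cuda:10.1-base",
    "nvidia-smi",
    runtime="nvidia",
)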

@vishalbollu vishalbollu added the enhancement New feature or request label Jul 16, 2020
@vishalbollu vishalbollu added this to Prioritize in Cortex via automation Jul 16, 2020
@deliahu deliahu moved this from Prioritize to Current sprint in Cortex Aug 27, 2020
@deliahu deliahu added the v0.20 label Aug 27, 2020
@deliahu deliahu self-assigned this Aug 27, 2020
@deliahu deliahu moved this from Current sprint to In progress in Cortex Sep 16, 2020
@deliahu deliahu assigned vishalbollu and unassigned deliahu Sep 21, 2020
Cortex automation moved this from In progress to Done Sep 21, 2020
@dakshvar22

Hi, is nvidia-docker2 still supported? I have Docker version < 19.03 with nvidia-docker2 installed, and I would like to leverage the GPU without having to upgrade Docker. Currently, with v0.20, the GPU doesn't seem to be made available inside the service. Can you please clarify?

@deliahu
Member

deliahu commented Oct 11, 2020

@dakshvar22 yes, it should fall back on nvidia-docker2 if nvidia-container-toolkit is not found. What is the error message that you see when you try?
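
One way to approximate that fallback check is sketched below; the binary names and detection logic here are assumptions for illustration, not necessarily how Cortex detects the toolkit internally:

import json
import shutil
import subprocess

def gpu_run_args():
    """Return `docker run` arguments for exposing GPUs, preferring --gpus all."""
    # assume the toolkit ships one of these binaries on the PATH
    if shutil.which("nvidia-container-toolkit") or shutil.which("nvidia-container-cli"):
        return ["--gpus", "all"]
    # otherwise, fall back to the "nvidia" runtime registered by nvidia-docker2
    runtimes = json.loads(
        subprocess.check_output(["docker", "info", "--format", "{{json .Runtimes}}"])
    )
    if "nvidia" in runtimes:
        return ["--runtime=nvidia"]
    raise RuntimeError("no supported way to expose GPUs was found")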

@dakshvar22

dakshvar22 commented Oct 11, 2020 via email

@deliahu
Member

deliahu commented Oct 11, 2020

@dakshvar22 are you running an example from the cortex repo (if so, which one?), or your own API (if so, which predictor type?)? Also, what is the base image you're using for the API container, or are you using the default?

@dakshvar22

dakshvar22 commented Oct 11, 2020 via email

@deliahu
Member

deliahu commented Oct 11, 2020

@dakshvar22 do you mind sharing your cortex.yaml file, as well as a simple Dockerfile and predictor.py which can be used to reproduce it?

For example, the Dockerfile might just show:

FROM cortexlabs/python-predictor-gpu-slim:0.20.0-cuda10.1
RUN ... # install your dependencies

And your predictor.py might just show:

class PythonPredictor:
    def __init__(self, config):
        print(is_gpu_visible())  # replace is_gpu_visible() with the appropriate function call

    def predict(self, payload):
        return "ok"

@vishalbollu
Contributor Author

vishalbollu commented Oct 13, 2020

@dakshvar22 In addition to the information requested by @deliahu, it would also be helpful if you could share the output of docker info.

@deliahu deliahu added this to the v0.20 milestone Nov 26, 2020