New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for `runtime` option in service definitions #5405

Merged
merged 1 commit into from Jan 4, 2018

Conversation

Projects
None yet
5 participants
@cuckoohello

cuckoohello commented Nov 27, 2017

Works with nvidia-docker2

For docker compose

version: '3'
services:
  nvidia:
    image: nvidia/cuda
    runtime: nvidia
    command: nvidia-smi

docker-compose up

$ docker-compose up
Creating network "nvidia_default" with the default driver
Creating nvidia_nvidia_1 ...
Creating nvidia_nvidia_1 ... done
Attaching to nvidia_nvidia_1
nvidia_1  | Mon Nov 27 02:22:54 2017
nvidia_1  | +-----------------------------------------------------------------------------+
nvidia_1  | | NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
nvidia_1  | |-------------------------------+----------------------+----------------------+
nvidia_1  | | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
nvidia_1  | | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
nvidia_1  | |===============================+======================+======================|
nvidia_1  | |   0  GeForce GTX 1080    Off  | 00000000:01:00.0 Off |                  N/A |
nvidia_1  | | 28%   42C    P0    39W / 180W |      0MiB /  8112MiB |      0%      Default |
nvidia_1  | +-------------------------------+----------------------+----------------------+
nvidia_1  | |   1  GeForce GTX 1080    Off  | 00000000:03:00.0 Off |                  N/A |
nvidia_1  | |  0%   41C    P0    36W / 180W |      0MiB /  8114MiB |      0%      Default |
nvidia_1  | +-------------------------------+----------------------+----------------------+
nvidia_1  |
nvidia_1  | +-----------------------------------------------------------------------------+
nvidia_1  | | Processes:                                                       GPU Memory |
nvidia_1  | |  GPU       PID   Type   Process name                             Usage      |
nvidia_1  | |=============================================================================|
nvidia_1  | |  No running processes found                                                 |
nvidia_1  | +-----------------------------------------------------------------------------+
nvidia_nvidia_1 exited with code 0

close #5360

Signed-off-by: Yafeng Shan cuckoo@kokonur.me

@shin-

Thank you for your contribution!

However, there are 2 major problems with this:

  1. The config schema for v3 can not be changed. Additions to the schema may be suggested on the docker/cli repo and then ported back to Compose. For this PR, support for the option should be added in the 2.3 schema instead
  2. There's no tests.
@cuckoohello

This comment has been minimized.

Show comment
Hide comment
@cuckoohello

cuckoohello Nov 29, 2017

@shin- Thank you for your kind reply.

The config schema for v3 can not be changed. Additions to the schema may be suggested on the docker/cli repo and then ported back to Compose. For this PR, support for the option should be added in the 2.3 schema instead

I have recommitted the code which only adding runtime option in the 2.3 schema, and will have another pull request to docker/cli repo for v3 schema.

There's no tests.

I have tried to add some tests before, but docker only provides docker-runc runtime by default, and nvidia runtime needs a machine with a nvidia gpu and nivida-docker installed. So if I just add some tests for nvidia runtime, it won't pass on a general computer.

Could you give me some suggestions, thank you.

cuckoohello commented Nov 29, 2017

@shin- Thank you for your kind reply.

The config schema for v3 can not be changed. Additions to the schema may be suggested on the docker/cli repo and then ported back to Compose. For this PR, support for the option should be added in the 2.3 schema instead

I have recommitted the code which only adding runtime option in the 2.3 schema, and will have another pull request to docker/cli repo for v3 schema.

There's no tests.

I have tried to add some tests before, but docker only provides docker-runc runtime by default, and nvidia runtime needs a machine with a nvidia gpu and nivida-docker installed. So if I just add some tests for nvidia runtime, it won't pass on a general computer.

Could you give me some suggestions, thank you.

@shin-

This comment has been minimized.

Show comment
Hide comment
@shin-

shin- Nov 30, 2017

Member

Cool, thank you for applying the schema changes!

For the tests, I think there are 3 things we would like to do:

  1. Add unit tests that make sure the option is picked up in the config (look at these tests for example)
  2. Create an integration test that explicitly specifies runc as the runtime and make sure it works as expected
  3. Create a test decorator that checks whether a particular runtime is available and skips the test if it isn't. You can model that one after no_cluster. Then you can write a custom runtime test with a @if_runtime_available('nvidia') decorator that will only run if nvidia is available.

Let me know if any of that is unclear and I'll do my best to assist.

Member

shin- commented Nov 30, 2017

Cool, thank you for applying the schema changes!

For the tests, I think there are 3 things we would like to do:

  1. Add unit tests that make sure the option is picked up in the config (look at these tests for example)
  2. Create an integration test that explicitly specifies runc as the runtime and make sure it works as expected
  3. Create a test decorator that checks whether a particular runtime is available and skips the test if it isn't. You can model that one after no_cluster. Then you can write a custom runtime test with a @if_runtime_available('nvidia') decorator that will only run if nvidia is available.

Let me know if any of that is unclear and I'll do my best to assist.

@cuckoohello

This comment has been minimized.

Show comment
Hide comment
@cuckoohello

cuckoohello Nov 30, 2017

Thank you. Let me have a try first.

cuckoohello commented Nov 30, 2017

Thank you. Let me have a try first.

@cuckoohello

This comment has been minimized.

Show comment
Hide comment
@cuckoohello

cuckoohello Dec 2, 2017

@shin- I have created both unit and integration tests for runtime option, and pushed the code with a force push.

But jenkins always report error due to not from a trusted source.

Please have a look, thank you.

  Getting remote pull request #5405...

    Checking pull request #5405
    (not from a trusted source)
      ‘Jenkinsfile’ found
    Met criteria
Changes detected: PR-5405 (02cfa11ddf88decdfdf4c4b70d719618108e2153 → f700b73964338fb134165e4242859b06cafb2e1e)
Scheduled build for branch: PR-5405

  1 pull requests were processed (query completed)

  1 pull requests were processed

Finished examining docker/compose

cuckoohello commented Dec 2, 2017

@shin- I have created both unit and integration tests for runtime option, and pushed the code with a force push.

But jenkins always report error due to not from a trusted source.

Please have a look, thank you.

  Getting remote pull request #5405...

    Checking pull request #5405
    (not from a trusted source)
      ‘Jenkinsfile’ found
    Met criteria
Changes detected: PR-5405 (02cfa11ddf88decdfdf4c4b70d719618108e2153 → f700b73964338fb134165e4242859b06cafb2e1e)
Scheduled build for branch: PR-5405

  1 pull requests were processed (query completed)

  1 pull requests were processed

Finished examining docker/compose
@anibali

This comment has been minimized.

Show comment
Hide comment
@anibali

anibali Dec 3, 2017

Sorry to randomly interject, but I've been following this PR in eager anticipation of replacing nvidia-docker-compose for good.

@cuckoohello If you look at the build output at https://jenkins.dockerproject.org/job/docker/job/compose/job/PR-5405/9/console you will see that there are in fact errors/warnings that you can address. Namely:

04:21:10 [all_py27] tests/integration/testcases.py:190:1: E302 expected 2 blank lines, found 1

04:21:10 [all_py27] tests/integration/testcases.py:200:106: E501 line too long (127 > 105 characters)

04:21:10 [all_py27] tests/integration/project_test.py:979:21: F821 undefined name 'V2_3'

04:21:10 [all_py27] tests/integration/project_test.py:999:21: F821 undefined name 'V2_3'

04:21:10 [all_py27] tests/integration/project_test.py:1015:6: F821 undefined name 'if_runtime_available'

04:21:10 [all_py27] tests/integration/project_test.py:1019:21: F821 undefined name 'V2_3'

Thanks for working on this enhancement, it will certainly be very useful for CUDA users in the future!

anibali commented Dec 3, 2017

Sorry to randomly interject, but I've been following this PR in eager anticipation of replacing nvidia-docker-compose for good.

@cuckoohello If you look at the build output at https://jenkins.dockerproject.org/job/docker/job/compose/job/PR-5405/9/console you will see that there are in fact errors/warnings that you can address. Namely:

04:21:10 [all_py27] tests/integration/testcases.py:190:1: E302 expected 2 blank lines, found 1

04:21:10 [all_py27] tests/integration/testcases.py:200:106: E501 line too long (127 > 105 characters)

04:21:10 [all_py27] tests/integration/project_test.py:979:21: F821 undefined name 'V2_3'

04:21:10 [all_py27] tests/integration/project_test.py:999:21: F821 undefined name 'V2_3'

04:21:10 [all_py27] tests/integration/project_test.py:1015:6: F821 undefined name 'if_runtime_available'

04:21:10 [all_py27] tests/integration/project_test.py:1019:21: F821 undefined name 'V2_3'

Thanks for working on this enhancement, it will certainly be very useful for CUDA users in the future!

Yafeng Shan
add support for `runtime` option in service definitions
close #5360

Signed-off-by: Yafeng Shan <cuckoo@kokonur.me>
@cuckoohello

This comment has been minimized.

Show comment
Hide comment
@cuckoohello

cuckoohello Dec 4, 2017

@anibali Thank you. I use gitlab-ci before, and don't use jenkins, not find the console.

@shin- Finally, both the test and integration tests work finnally. And provide if_runtime_available test scripts. Please have a look, thank you.

cuckoohello commented Dec 4, 2017

@anibali Thank you. I use gitlab-ci before, and don't use jenkins, not find the console.

@shin- Finally, both the test and integration tests work finnally. And provide if_runtime_available test scripts. Please have a look, thank you.

@shin- shin- added this to the 1.19.0 milestone Dec 4, 2017

@andyneff

This comment has been minimized.

Show comment
Hide comment
@andyneff

andyneff Dec 28, 2017

Contributor

I have recommitted the code which only adding runtime option in the 2.3 schema, and will have another pull request to docker/cli repo for v3 schema.

@cuckoohello, have you made any progress on this?

Contributor

andyneff commented Dec 28, 2017

I have recommitted the code which only adding runtime option in the 2.3 schema, and will have another pull request to docker/cli repo for v3 schema.

@cuckoohello, have you made any progress on this?

@cuckoohello

This comment has been minimized.

Show comment
Hide comment
@cuckoohello

cuckoohello Dec 30, 2017

@andyneff I have tried to add support for docker/cli, but find there're much more work need to be done. At least there projects need to be adapted.

I will try to manage this next week, and give you feedback

cuckoohello commented Dec 30, 2017

@andyneff I have tried to add support for docker/cli, but find there're much more work need to be done. At least there projects need to be adapted.

I will try to manage this next week, and give you feedback

@shin-

This comment has been minimized.

Show comment
Hide comment
@shin-

shin- Jan 4, 2018

Member

@cuckoohello Sorry, it was a busy month. Changes LGTM now for the most part, just a few typos but I'll fix those in a separate commit. Thanks a lot!

Member

shin- commented Jan 4, 2018

@cuckoohello Sorry, it was a busy month. Changes LGTM now for the most part, just a few typos but I'll fix those in a separate commit. Thanks a lot!

@shin-

shin- approved these changes Jan 4, 2018

@shin- shin- merged commit e4e92dd into docker:master Jan 4, 2018

3 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/jenkins/pr-head This commit looks good
Details
dco-signed All commits are signed

@anibali anibali referenced this pull request Jan 16, 2018

Closed

Nvidia-docker2 #23

@fredericgermain

This comment has been minimized.

Show comment
Hide comment
@fredericgermain

fredericgermain Jan 26, 2018

I'm not sure about this, but is it supposed to work on schema v3?

I just tried the exemple above on master, it only works for me if I change the schema version to 2.3.

fredericgermain commented Jan 26, 2018

I'm not sure about this, but is it supposed to work on schema v3?

I just tried the exemple above on master, it only works for me if I change the schema version to 2.3.

@cuckoohello

This comment has been minimized.

Show comment
Hide comment
@cuckoohello

cuckoohello Jan 27, 2018

@fredericgermain You can look at the discussion above first.

If you want to use runtime option for schema v3 and above, the only thing you need to do is just modify the config schema for the corresponding json file.

But currently config schema json files for v3 and above in docker compose bump from upstream docker repo, don't accept any modification. So if you want to make these files changed, you need to pull request to upstream repo docker, and docker depends on repo docker-cli, you need to add runtime support for swarm first.... A lot of work....

I don't have enough time to do this now...

cuckoohello commented Jan 27, 2018

@fredericgermain You can look at the discussion above first.

If you want to use runtime option for schema v3 and above, the only thing you need to do is just modify the config schema for the corresponding json file.

But currently config schema json files for v3 and above in docker compose bump from upstream docker repo, don't accept any modification. So if you want to make these files changed, you need to pull request to upstream repo docker, and docker depends on repo docker-cli, you need to add runtime support for swarm first.... A lot of work....

I don't have enough time to do this now...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment