Implemented Nvidia DevicePlugin GPU Support on AWS #5502

Merged
1 commit merged into kubernetes:master on Nov 19, 2018

dcwangmit01
Contributor

@dcwangmit01 dcwangmit01 commented Jul 23, 2018

* Supports DevicePlugin GPU Mode AND Legacy Accelerators GPU Mode
@k8s-ci-robot k8s-ci-robot added labels Jul 23, 2018:
* cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
* size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files.)
* needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test.)
@mikesplain
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jul 23, 2018
@dcwangmit01
Contributor Author

Tagging those who reviewed/commented on prior GPU PR #4971.

This PR adds kops GPU support for Kubernetes >= 1.11.0 via DevicePlugins.

@115100, @bhack, @chrislovecnm, @erez-rabih, @faheem-cliqz, @flx42, @justinsb, @KashifSaadat, @mikesplain, @mindprince, @richardbrks, @rtluckie
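
For context on the DevicePlugin mode mentioned above: with the NVIDIA device plugin running on a node, GPUs are exposed as the extended resource `nvidia.com/gpu`, which a pod requests in its resource limits. The following is a minimal sketch of such a request, not code from this PR. The helper name `gpu_pod_manifest`, the pod name, and the untagged `nvidia/cuda` image are illustrative assumptions.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a minimal Pod manifest that requests GPUs through the
    device plugin resource name `nvidia.com/gpu`.

    `name` and `image` are placeholders chosen by the caller; nothing
    here is taken from the PR's own code.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resources are requested under limits.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical pod name and image, for illustration only.
    print(json.dumps(gpu_pod_manifest("gpu-smoke-test", "nvidia/cuda"), indent=2))
```

Because `kubectl` accepts JSON manifests, the printed output could be piped to `kubectl apply -f -` on a cluster where the device plugin is already installed.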

@bhack
Contributor

bhack commented Jul 24, 2018

Is CUDA mandatory?

@dcwangmit01
Contributor Author

@bhack CUDA is not mandatory, but it is included here to follow the same practice as the other GPU hook images.

@bhack
Contributor

bhack commented Aug 3, 2018

@dcwangmit01 OK, but I suppose this will duplicate node storage consumption when we run pods with images derived from nvidia/cuda, such as TensorFlow.

@bhack
Contributor

bhack commented Aug 4, 2018

@flx42

flx42 commented Aug 4, 2018

FYI this base image doesn't have any CUDA library (apart from libcudart), so you won't get any layer sharing with other CUDA apps; only if you run multiple instances of this image.
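
The layer-sharing point can be made concrete with a small sketch. A node stores each distinct image layer once, so two images only save space where their layer digests match. The digests and sizes below are made-up placeholders, not measurements of the real `nvidia/cuda` images, and `node_storage` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

Layer = Tuple[str, int]  # (layer digest, size in MB)


def node_storage(images: Dict[str, List[Layer]]) -> int:
    """Return total MB stored on a node, counting each distinct
    layer digest once no matter how many images reference it."""
    seen: Dict[str, int] = {}
    for layers in images.values():
        for digest, size in layers:
            seen[digest] = size
    return sum(seen.values())


# Hypothetical layers: the hook image and the app image have no digest
# in common, so nothing is shared between them.
hook = [("sha256:hook-base", 100), ("sha256:driver", 300)]
app = [("sha256:app-base", 100), ("sha256:cuda-libs", 900)]

# A second copy of the hook image shares every layer with the first.
print(node_storage({"hook": hook, "app": app}))        # 1400
print(node_storage({"hook": hook, "hook-copy": hook}))  # 400
```

Under these assumptions, the app image adds its full size on top of the hook image, while a second instance of the hook image adds nothing.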

@bhack
Contributor

bhack commented Aug 4, 2018

@flx42 Yes, but I meant that in 90% of cases the next command is to apt-get cuda-*

@flx42

flx42 commented Aug 4, 2018

I agree CUDA toolkit libraries shouldn't be part of a driver install.

@dcwangmit01 dcwangmit01 changed the title from "Implemented Nvidia DevicePlugin GPU Support" to "Implemented Nvidia DevicePlugin GPU Support on AWS" Sep 28, 2018
@dcwangmit01 dcwangmit01 mentioned this pull request Sep 28, 2018

@darkyat darkyat left a comment

Tested and works. Can this be merged, @mikesplain?

@faheem-nadeem

Tested and works as expected. Awesome work, thanks.

@justinsb
Member

This is purely additive, so I'm going to merge it in :-)

I've added myself a TODO item to build the Docker image as part of the kops release and repoint it, so we won't be relying on your Docker images, @dcwangmit01!

Thank you so much @dcwangmit01

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 19, 2018
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dcwangmit01, justinsb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 19, 2018
@k8s-ci-robot k8s-ci-robot merged commit a0fcf95 into kubernetes:master Nov 19, 2018
Labels
* approved (Indicates a PR has been approved by an approver from all required OWNERS files.)
* cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
* lgtm ("Looks good to me", indicates that a PR is ready to be merged.)
* size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files.)
8 participants