This repository has been archived by the owner on Oct 24, 2023. It is now read-only.

chore: update nvidia device plugin #3225

Merged
merged 3 commits into from May 12, 2020

Conversation

sozercan
Member

@sozercan sozercan commented May 11, 2020

Reason for Change:

Updates the nvidia-device-plugin version and moves the image to MCR (mcr.microsoft.com).

Note for versioning change from nvidia's side: https://github.com/NVIDIA/k8s-device-plugin#versioning

Issue Fixed:

Requirements:

Notes:

@sozercan
Member Author

/assign @jackfrancis

@codecov

codecov bot commented May 11, 2020

Codecov Report

Merging #3225 into master will decrease coverage by 0.02%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master    #3225      +/-   ##
==========================================
- Coverage   71.45%   71.43%   -0.03%     
==========================================
  Files         147      147              
  Lines       25643    25653      +10     
==========================================
  Hits        18324    18324              
- Misses       6177     6187      +10     
  Partials     1142     1142              
Impacted Files                       Coverage Δ
pkg/api/k8s_versions.go              100.00% <100.00%> (ø)
cmd/get_logs.go                      17.27% <0.00%> (-0.83%) ⬇️
pkg/engine/templates_generated.go    39.64% <0.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 78e3b18...54d36c4.

@acs-bot acs-bot added size/S and removed size/XS labels May 11, 2020
@sozercan sozercan force-pushed the nvidia-device-plugin-update branch from b5e9df9 to 4759fd1 Compare May 11, 2020 21:28
@mboersma mboersma added the gpu GPU-related issues and fixes label May 11, 2020
@mboersma
Member

It's confusing that 1.0.0-beta6 is newer than 1.11, but it's documented, as you pointed out.
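To illustrate why the ordering looks odd: a plain numeric or semver comparison would rank 1.11 above 1.0.0-beta6, so any tooling that compares device plugin tags has to know about the scheme change. Below is a minimal Go sketch of that idea. It assumes, based on the versioning note linked in the description, that older tags are two-part numbers matching a Kubernetes release (such as 1.11) and that every later tag is three-part semver. The helper names `isLegacyTag` and `isUpgrade` are hypothetical and do not come from aks-engine or from NVIDIA.

```go
package main

import (
	"fmt"
	"strings"
)

// isLegacyTag reports whether tag looks like the old two-part scheme, e.g. "1.11".
// Assumption: legacy tags are exactly MAJOR.MINOR with no pre-release suffix.
func isLegacyTag(tag string) bool {
	tag = strings.TrimPrefix(tag, "v")
	if strings.ContainsAny(tag, "-+") {
		return false
	}
	return strings.Count(tag, ".") == 1
}

// isUpgrade reports whether moving from one tag to another crosses the scheme
// change in the forward direction (legacy tag to semver tag). Tags within the
// same scheme are out of scope for this sketch and return an error.
func isUpgrade(from, to string) (bool, error) {
	fromLegacy, toLegacy := isLegacyTag(from), isLegacyTag(to)
	if fromLegacy == toLegacy {
		return false, fmt.Errorf("%q and %q use the same scheme; compare them within that scheme", from, to)
	}
	return fromLegacy && !toLegacy, nil
}

func main() {
	ok, err := isUpgrade("1.11", "1.0.0-beta6")
	fmt.Println(ok, err)
}
```

The sketch deliberately refuses to order two tags from the same scheme; a real comparison there would need a proper semver parser.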

@sozercan
Member Author

@mboersma Yeah, NVIDIA also deprecated nvidia-docker2, but it's still required for Kubernetes 😕 https://github.com/NVIDIA/nvidia-docker#upgrading-with-nvidia-docker2-deprecated

@mboersma
Member

mboersma commented May 12, 2020

I ran the cuda-vector-add test against a new Standard_NC12 1.19.0-alpha.3 cluster off this branch, and it passed:

% k logs -f cuda-vector-add    
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done

The nodes report nvidia.com/gpu: "2" and cpu: "12", and the nvidia-device-plugin pods are running this image:

    Image:          mcr.microsoft.com/oss/nvidia/k8s-device-plugin:1.0.0-beta6

Member

@mboersma mboersma left a comment

/lgtm

@acs-bot acs-bot added the lgtm label May 12, 2020
@mboersma mboersma merged commit 6272a70 into Azure:master May 12, 2020
@acs-bot

acs-bot commented May 12, 2020

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mboersma, sozercan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sozercan sozercan deleted the nvidia-device-plugin-update branch May 12, 2020 16:43
Labels
approved gpu GPU-related issues and fixes lgtm size/S
4 participants