[feature requirements] specify hw devices in container #60748

Open
resouer opened this Issue Mar 3, 2018 · 12 comments

resouer commented Mar 3, 2018

Is this a BUG REPORT or FEATURE REQUEST?:
/kind feature

What happened:
The previous discussion happened in #5607, which is quite old and predates CRI, not to mention the Device Plugin. Also, I believe the original requirement in #5607 has already been addressed by the Device Plugin.

A new requirement is now showing up: how can a user (or a Device Plugin) specify which host devices should be exposed in a container, and how can we make this work with the current DP design?

containers:
  - name: demo2
    image: sfc-dev-plugin:latest
    # I am not proposing this API, just to clarify the use case
    hostDevices:
    - /dev/foo1
    - /dev/foo2

This is actually needed when implementing many Device Plugins. One example is RDMA: https://github.com/hustcat/k8s-rdma-device-plugin, in which case /dev/infiniband/rdma_cm should be passed into every container that uses an RDMA device, so that RDMA applications can run inside the container.

@hustcat Please correct me if I misunderstood something.

Other devices include /dev/dri/renderD128, /dev/infiniband, etc.
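
For context, with the current DP design a pod can only request an opaque count of an extended resource advertised by a plugin; there is no field that names a specific host device path. A minimal sketch using the Go API types (the resource name example.com/rdma is hypothetical, for illustration only):

package main

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// demoContainer shows how a container consumes a device-plugin resource today:
// an opaque count of an extended resource, with no way to name a host device
// such as /dev/infiniband/rdma_cm directly.
func demoContainer() corev1.Container {
	return corev1.Container{
		Name:  "demo2",
		Image: "sfc-dev-plugin:latest",
		Resources: corev1.ResourceRequirements{
			Limits: corev1.ResourceList{
				// hypothetical resource name, for illustration only
				"example.com/rdma": resource.MustParse("1"),
			},
		},
	}
}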

What you expected to happen:

We may need to collect user requirements first, and then revisit the DP design to see how to support this.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

cc @RenaudWasTaken @vikaschoudhary16 @derekwaynecarr @vishh @jiayingz

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

resouer commented Mar 5, 2018

Here's one solution (workaround), which we have used to handle #49964:

Return whatever devices the container needs in the AllocateResponse, so those devices will be attached to the container, while the kubelet is not (and does not need to be) aware of that.

rpc Allocate(AllocateRequest) returns (AllocateResponse) {}

We can do this in a separate dummy DP.
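
For illustration, here is a minimal sketch of what such a dummy DP's Allocate could look like against the v1beta1 device plugin API (the import path, device paths, and the rest of the plugin wiring are assumptions; this is not a complete plugin):

package main

import (
	"context"

	pluginapi "k8s.io/kubernetes/pkg/kubelet/apis/deviceplugin/v1beta1"
)

type dummyPlugin struct{}

// Allocate always injects the extra host devices, regardless of which device
// IDs the kubelet asked for. The kubelet simply forwards these DeviceSpecs to
// the CRI runtime; it is not (and does not need to be) aware of what they mean.
func (p *dummyPlugin) Allocate(ctx context.Context, req *pluginapi.AllocateRequest) (*pluginapi.AllocateResponse, error) {
	resp := &pluginapi.AllocateResponse{}
	for range req.ContainerRequests {
		resp.ContainerResponses = append(resp.ContainerResponses, &pluginapi.ContainerAllocateResponse{
			Devices: []*pluginapi.DeviceSpec{{
				HostPath:      "/dev/infiniband/rdma_cm",
				ContainerPath: "/dev/infiniband/rdma_cm",
				Permissions:   "rw",
			}},
		})
	}
	return resp, nil
}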

vikaschoudhary16 commented Mar 5, 2018

seems related #59380 (comment)

As we discussed offline, this seems to be another example of an unlimited resource like /dev/kvm.

vikaschoudhary16 commented Mar 5, 2018

while the kubelet is not (and does not need to be) aware of that.

In the normal case too, the kubelet is not aware of what is being passed to the CRI. :)

resouer commented Mar 5, 2018

Yes, I believe the /dev/infiniband/rdma_cm case can fall under #59380. We will move that discussion there.

I will still keep this issue open to track other use cases as well.

hustcat commented Mar 5, 2018

@resouer Great!! It's wonderful, I like this simple way of passing host devices into containers :)

hustcat commented Mar 16, 2018

If the device plugin supported resource sharing among pods, then the DP could cover this problem. https://docs.google.com/document/d/1ZgKH_K4SEfdiE_OfxQ836s4yQWxZfSjS288Tq9YIWCA/edit?disco=AAAAB1hQAk8
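
Until such sharing exists, a common workaround (a hedged sketch, not official support) is to have the plugin advertise many virtual device IDs that all map to the same underlying host device, so that many pods can each request one. Continuing the dummyPlugin sketch from the earlier comment (same imports, plus fmt; the count of 100 and the ID prefix are arbitrary assumptions):

// ListAndWatch advertises a fixed pool of virtual device IDs that all refer to
// the same physical device (e.g. /dev/kvm or /dev/infiniband/rdma_cm), which
// lets up to that many pods request the resource concurrently.
func (p *dummyPlugin) ListAndWatch(e *pluginapi.Empty, s pluginapi.DevicePlugin_ListAndWatchServer) error {
	devs := make([]*pluginapi.Device, 0, 100)
	for i := 0; i < 100; i++ {
		devs = append(devs, &pluginapi.Device{
			ID:     fmt.Sprintf("virtual-dev-%d", i), // arbitrary synthetic ID
			Health: pluginapi.Healthy,
		})
	}
	// A real plugin would keep this stream open and resend on health changes;
	// here we send the list once and block.
	if err := s.Send(&pluginapi.ListAndWatchResponse{Devices: devs}); err != nil {
		return err
	}
	select {}
}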

balboah commented May 4, 2018

Another use case is accessing /dev/video0 for processing webcam input and other IoT things, without adding a privileged security context.

fejta-bot commented Aug 2, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

nikhita commented Aug 10, 2018

/remove-lifecycle stale

fejta-bot commented Nov 8, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot commented Dec 8, 2018

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

resouer commented Jan 5, 2019

/remove-lifecycle rotten
