Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add UpdateContainerResources method to CRI #46105

Merged

Conversation

@sjenning
Copy link
Contributor

commented May 19, 2017

This is first step toward support for opinionated cpu pinning for certain guaranteed pods.

In order to do this, the kubelet needs to be able to dynamically update the cpuset at the container level, which is managed by the container runtime. Thus the kubelet needs a method to communicate over the CRI so the runtime can then modify the container cgroup.

This is used in the situation where a core is added or removed from the shared pool to become a exclusive core for a new G pod. The cpuset for all containers in the shared pool will need to be updated to add or remove that core.

Opening this up now so we can start discussion. The need for a change to the CRI might be unexpected.

@derekwaynecarr @vishh @ConnorDoyle

NONE
@ncdc

This comment has been minimized.

Copy link
Member

commented May 19, 2017

/unassign
/assign @derekwaynecarr

@k8s-ci-robot k8s-ci-robot assigned derekwaynecarr and unassigned ncdc May 19, 2017

@ConnorDoyle

This comment has been minimized.

Copy link
Member

commented May 19, 2017

/assign

@sjenning sjenning force-pushed the sjenning:update-conatiner-resource-cri branch 2 times, most recently from f2e0369 to 62e772c May 19, 2017

@derekwaynecarr

This comment has been minimized.

Copy link
Member

commented May 20, 2017

This would be for the next kube version. We should clarify that the container must not be restarted when this is used. We also need to think about downward API signals into the container when changes are made. FYI @mrunalp

@feiskyer

This comment has been minimized.

Copy link
Member

commented Jun 1, 2017

This is used in the situation where a core is added or removed from the shared pool to become a exclusive core for a new G pod. The cpuset for all containers in the shared pool will need to be updated to add or remove that core.

@sjenning Could you clarify the detailed steps to achieve this? Will kubelet monitor this scenario or add a new resource update API in apiserver?

@sjenning

This comment has been minimized.

Copy link
Contributor Author

commented Jun 3, 2017

@feiskyer no, there is no new API and no update to the resource in the apiserver. This is solely a CPU placement decision/optimization made by the kubelet.

A detailed proposal about how this will work is kubernetes/community#654

@ConnorDoyle

This comment has been minimized.

Copy link
Member

commented Jul 20, 2017

/keep-open

@sjenning sjenning referenced this pull request Jul 21, 2017
4 of 6 tasks complete

@sjenning sjenning force-pushed the sjenning:update-conatiner-resource-cri branch from 62e772c to 54b4e20 Jul 27, 2017

@sjenning

This comment has been minimized.

Copy link
Contributor Author

commented Jul 27, 2017

@vishh @dchen1107 PTAL. I'd like to get this reviewed/approved/merged in advance of the main CPU manager PR. This new method on the CRI will also be needed for vertical pod autoscaling as mentioned in the last sig-node meeting

/release-note-none

@sjenning sjenning changed the title [WIP] Add UpdateContainerResources method to CRI Add UpdateContainerResources method to CRI Jul 27, 2017

@@ -362,6 +362,11 @@ func (r *FakeRuntimeService) ContainerStatus(containerID string) (*runtimeapi.Co
return &status, nil
}

func (r *FakeRuntimeService) UpdateContainerResources(string, *runtimeapi.LinuxContainerResources) error {
/* SETH: implement this */

This comment has been minimized.

Copy link
@derekwaynecarr

derekwaynecarr Jul 27, 2017

Member

placeholder?

@@ -625,6 +625,11 @@ func (f *FakeDockerClient) RemoveContainer(id string, opts dockertypes.Container
return fmt.Errorf("container not stopped")
}

func (f *FakeDockerClient) UpdateContainerResources(id string, updateConfig dockercontainer.UpdateConfig) error {
/* SETH: implement this */

This comment has been minimized.

Copy link
@derekwaynecarr

derekwaynecarr Jul 27, 2017

Member

placeholder

@mrunalp

This comment has been minimized.

Copy link
Contributor

commented Jul 27, 2017

LGTM 👍

@@ -61,6 +61,8 @@ service RuntimeService {
// ContainerStatus returns status of the container. If the container is not
// present, returns an error.
rpc ContainerStatus(ContainerStatusRequest) returns (ContainerStatusResponse) {}
// UpdateContainerResources updates ContainerConfig of the container.
rpc UpdateContainerResources(UpdateContainerResourcesRequest) returns (UpdateContainerResourcesResponse) {}

This comment has been minimized.

Copy link
@feiskyer

feiskyer Jul 28, 2017

Member

Whether all resources are supported for updating? If not, we should document the list of supported resources.

This comment has been minimized.

Copy link
@sjenning

sjenning Jul 28, 2017

Author Contributor

where should they be documented?

This comment has been minimized.

Copy link
@feiskyer

feiskyer Jul 28, 2017

Member

Currently, CRI interfaces are documented via their comments.

This comment has been minimized.

Copy link
@derekwaynecarr

derekwaynecarr Jul 28, 2017

Member

rather than enumerate a full list of resources, prefer we comment that a runtime should return an error if a resource is changed that cannot be mutated.

This comment has been minimized.

Copy link
@feiskyer

feiskyer Jul 28, 2017

Member

@derekwaynecarr yep, that's a better way.

@ConnorDoyle

This comment has been minimized.

Copy link
Member

commented Jul 28, 2017

@derekwaynecarr regarding your question about updating devices, those are currently outside of the LinuxContainerResources message. I think if we want to support updating devices in the future then the UpdateContainerResources message can be expanded to include another field for Devices.

@ConnorDoyle

This comment has been minimized.

Copy link
Member

commented Jul 28, 2017

/approve

@sjenning sjenning force-pushed the sjenning:update-conatiner-resource-cri branch from d3068aa to 9fbf8f5 Aug 1, 2017

@sjenning

This comment has been minimized.

Copy link
Contributor Author

commented Aug 9, 2017

@dchen1107 @vishh test have been all green for a week. Can I get approval on this?

@derekwaynecarr derekwaynecarr added this to the v1.8 milestone Aug 10, 2017

@derekwaynecarr

This comment has been minimized.

Copy link
Member

commented Aug 10, 2017

I will shephard in the work related to cpu pinning enhancements.

/approve-no-issue
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Aug 10, 2017

@derekwaynecarr

This comment has been minimized.

Copy link
Member

commented Aug 10, 2017

/approve no-issue

@sjenning

This comment has been minimized.

Copy link
Contributor Author

commented Aug 10, 2017

@mrunalp we'll need to add support for this to cri-o

@@ -43,6 +43,8 @@ type ContainerManager interface {
ListContainers(filter *runtimeapi.ContainerFilter) ([]*runtimeapi.Container, error)
// ContainerStatus returns the status of the container.
ContainerStatus(containerID string) (*runtimeapi.ContainerStatus, error)
// UpdateContainerResources updates the cgroup resources for the container.
UpdateContainerResources(containerID string, resources *runtimeapi.LinuxContainerResources) error

This comment has been minimized.

Copy link
@vishh

vishh Aug 10, 2017

Member

Why have an Update method just for Resources? Why not name it UpdateContainer and have a whitelist of fields that can be updated?

This comment has been minimized.

Copy link
@sjenning

sjenning Aug 10, 2017

Author Contributor

LinuxContainerResources are really the only API object that makes sense here. The function name mirrors that resource. What else about the container would you want to be able to change while the container was running? Knowing that the runtime would need to support any such update as well.

This comment has been minimized.

Copy link
@vishh

vishh Aug 11, 2017

Member

If in the future we end up updating any other properties of containers, we will have to introduce a new top level API. I was wondering if we can make this API generic and control what features can be updated via input options.

This comment has been minimized.

Copy link
@ConnorDoyle

ConnorDoyle Aug 12, 2017

Member

Another option is to mirror the protobuf message and define a new struct like ContainerUpdate which for now has only a LinuxContainerResources member. Or just reuse the generated protobuf type.

Note: this merged already.

@vishh

This comment has been minimized.

Copy link
Member

commented Aug 10, 2017

@derekwaynecarr @sjenning I have a question on API naming.

cc @yujuhong

@fejta-bot

This comment has been minimized.

Copy link

commented Aug 10, 2017

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@dchen1107

This comment has been minimized.

Copy link
Member

commented Aug 10, 2017

As a record, we discussed this at Tuesday's resource management workgroup meeting. I am ok with introducing the ability of updating container resource at CRI level. In general, I do have some concerns we expose at Kubernetes API level for now.

/lgtm

@k8s-github-robot

This comment has been minimized.

Copy link
Contributor

commented Aug 10, 2017

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ConnorDoyle, dchen1107, derekwaynecarr, sjenning

Associated issue requirement bypassed by: derekwaynecarr

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@derekwaynecarr

This comment has been minimized.

Copy link
Member

commented Aug 11, 2017

@dchen1107 - i agree we should not expose updates via kubernetes api objects yet. as discussed, this is just for internal use by the kubelet.

@k8s-github-robot

This comment has been minimized.

Copy link
Contributor

commented Aug 11, 2017

Automatic merge from submit-queue (batch tested with PRs 49488, 50407, 46105, 50456, 50258)

@k8s-github-robot k8s-github-robot merged commit b9b875f into kubernetes:master Aug 11, 2017

10 checks passed

Submit Queue Queued to run github e2e tests a second time.
Details
cla/linuxfoundation sjenning authorized
Details
pull-kubernetes-bazel Job succeeded.
Details
pull-kubernetes-e2e-gce-etcd3 Jenkins job succeeded.
Details
pull-kubernetes-e2e-kops-aws Jenkins job succeeded.
Details
pull-kubernetes-federation-e2e-gce Jenkins job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce Jenkins job succeeded.
Details
pull-kubernetes-node-e2e Jenkins job succeeded.
Details
pull-kubernetes-unit Jenkins job succeeded.
Details
pull-kubernetes-verify Jenkins job succeeded.
Details
@sjenning

This comment has been minimized.

Copy link
Contributor Author

commented Aug 11, 2017

@ConnorDoyle CRI PR merged! Can you rebase the CPU Manager branch?

@ConnorDoyle

This comment has been minimized.

Copy link
Member

commented Aug 11, 2017

@sjenning my pleasure!

@sjenning sjenning deleted the sjenning:update-conatiner-resource-cri branch Aug 16, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.