Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement dynamic volume limits #64154

Merged
merged 4 commits into from Jun 2, 2018

Conversation

@gnufied
Copy link
Member

gnufied commented May 22, 2018

Implement dynamic volume limits depending on node type.

xref kubernetes/community#2051

Add Alpha support for dynamic volume limits based on node type

@k8s-ci-robot k8s-ci-robot requested review from caesarxuchao and davidopp May 22, 2018

@gnufied gnufied force-pushed the gnufied:impelemnt-volume-count branch 4 times, most recently from 3676b0c to 2ec50f1 May 23, 2018

@gnufied gnufied changed the title Impelemnt dynamic volume limits Implement dynamic volume limits May 23, 2018

@gnufied

This comment has been minimized.

Copy link
Member Author

gnufied commented May 24, 2018

/retest

//
// Add support for volume plugins to report node specific
// volume limits
DynamicVolumeThreshold utilfeature.Feature = "DynamicVolumeThreshold"

This comment has been minimized.

@jsafrane

jsafrane May 24, 2018

Member

I am not sure about the feature name. DynamicVolumeLimits? AttachableVolumeLimits (= the predicate name)? There is no "Threshold" used in implementation of the feature.

This comment has been minimized.

@gnufied

gnufied May 24, 2018

Author Member

Limit is basically another word for threshold right? I just copied what was in specs. It should be pretty easy to rename the feature flag but I don't feel very strongly about this either ways.

This comment has been minimized.

@jingxu97

jingxu97 May 24, 2018

Contributor

Since we use "Limit" in other places, I will vote for "limit". Also agree on "AttachableVolume" part


// a map of volume unique name and volume limit key
newVolumes := make(map[string]string)
if err := c.filterAttachableVolumes(pod.Spec.Volumes, pod.Namespace, newVolumes); err != nil {

This comment has been minimized.

@jsafrane

jsafrane May 24, 2018

Member

newVolumes, err = c.filterAttachableVolumes(pod.Spec.Volumes, pod.Namespace) ?

This comment has been minimized.

@gnufied

gnufied May 24, 2018

Author Member

ah, now I know why it is written like that. It is because second execution of this function(line#452) adds volumes in a loop, we will have merge maps and iterate result again to do that. This saves some execution time.


instanceType, err := instances.InstanceType(context.TODO(), plugin.host.GetNodeName())
if err != nil {
glog.V(3).Infof("Failed to get instance type from AWS cloud provider")

This comment has been minimized.

@jsafrane

jsafrane May 24, 2018

Member

log the error

// volumes that can be attached to a node.
type VolumePluginWithAttachLimits interface {
VolumePlugin
// Return number of volumes attachable on a node from the plugin

This comment has been minimized.

@jsafrane

jsafrane May 24, 2018

Member
  • Please add some description what is the return map - what's the key and what's value.
  • I'd emphasize that it's called by kubelet and it returns limit for the node where this function is called.
// - storage-limits-aws-ebs
// - storage-limits-csi-cinder
// The key should respect character limit of ResourceName type
VolumeLimitKey(spec *Spec) string

This comment has been minimized.

@jsafrane

jsafrane May 24, 2018

Member

Add a note that this function is called by scheduler (or can be called by any Kubernetes component, not just kubelet).

@gnufied gnufied force-pushed the gnufied:impelemnt-volume-count branch 2 times, most recently from 6da2ccb to 9a790ce May 24, 2018

@derekwaynecarr

This comment has been minimized.

Copy link
Member

derekwaynecarr commented May 24, 2018

i will help review kubelet changes.

/assign @derekwaynecarr

@gnufied gnufied force-pushed the gnufied:impelemnt-volume-count branch from 9a790ce to 75576df May 24, 2018

@@ -415,6 +415,136 @@ func (c *MaxPDVolumeCountChecker) filterVolumes(volumes []v1.Volume, namespace s
return nil
}

func (c *MaxPDVolumeCountChecker) attachableLimitPredicate(

This comment has been minimized.

@jingxu97

jingxu97 May 24, 2018

Contributor

Add comment for this function?

This comment has been minimized.

@gnufied

gnufied May 24, 2018

Author Member

fixed

@jingxu97

This comment has been minimized.

Copy link
Contributor

jingxu97 commented May 24, 2018

cc @bsalamat Could you please help take a look at the scheduler change? Thanks!

@gnufied gnufied force-pushed the gnufied:impelemnt-volume-count branch 2 times, most recently from 9fb7a39 to a5330e3 May 24, 2018

@gnufied

This comment has been minimized.

Copy link
Member Author

gnufied commented May 24, 2018

For scheduler

/assign @bsalamat @aveshagarwal

for api and other misc. changes

/assign @liggitt

@gnufied

This comment has been minimized.

Copy link
Member Author

gnufied commented May 24, 2018

/test pull-kubernetes-kubemark-e2e-gce-big
/test pull-kubernetes-e2e-kops-aws

@msau42

This comment has been minimized.

Copy link
Member

msau42 commented Jun 1, 2018

/lgtm

@k8s-github-robot

This comment has been minimized.

Copy link
Contributor

k8s-github-robot commented Jun 1, 2018

[MILESTONENOTIFIER] Milestone Pull Request: Up-to-date for process

@aveshagarwal @bsalamat @derekwaynecarr @gnufied @liggitt @msau42

Pull Request Labels
  • sig/scheduling: Pull Request will be escalated to these SIGs if needed.
  • priority/critical-urgent: Never automatically move pull request out of a release milestone; continually escalate to contributor and SIG through all available channels.
  • kind/feature: New functionality.
Help
@gnufied

This comment has been minimized.

Copy link
Member Author

gnufied commented Jun 1, 2018

/test pull-kubernetes-e2e-gce

@bsalamat
Copy link
Member

bsalamat left a comment

Scheduler changes look good to me. Just a minor comment.

@@ -92,10 +92,14 @@ func IsOvercommitAllowed(name v1.ResourceName) bool {
!IsHugePageResourceName(name)
}

func IsStorageLimitResourceName(name v1.ResourceName) bool {

This comment has been minimized.

@bsalamat

bsalamat Jun 1, 2018

Member

I would rename this to IsVolumeLimitResourceName to be consistent with the rest of the code.

This comment has been minimized.

@gnufied

gnufied Jun 1, 2018

Author Member

fixed

gnufied added some commits May 23, 2018

Implement API changes needed for dynamic volume limits
define alpha feature and make api changes

@gnufied gnufied force-pushed the gnufied:impelemnt-volume-count branch from b660bb3 to 1f9404d Jun 1, 2018

@k8s-ci-robot k8s-ci-robot removed the lgtm label Jun 1, 2018

@msau42

This comment has been minimized.

Copy link
Member

msau42 commented Jun 1, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Jun 1, 2018

@bsalamat
Copy link
Member

bsalamat left a comment

/lgtm

for scheduler changes

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

k8s-ci-robot commented Jun 1, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, derekwaynecarr, gnufied, liggitt, msau42, saad-ali

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@fejta-bot

This comment has been minimized.

Copy link

fejta-bot commented Jun 2, 2018

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@k8s-github-robot

This comment has been minimized.

Copy link
Contributor

k8s-github-robot commented Jun 2, 2018

Automatic merge from submit-queue (batch tested with PRs 64613, 64596, 64573, 64154, 64639). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit e5686a3 into kubernetes:master Jun 2, 2018

18 checks passed

Submit Queue Queued to run github e2e tests a second time.
Details
cla/linuxfoundation gnufied authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-cross Skipped
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gke Skipped
pull-kubernetes-e2e-kops-aws Job succeeded.
Details
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped
pull-kubernetes-local-e2e-containerized Job succeeded.
Details
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
return nil, fmt.Errorf("Expected Azure cloudprovider, got %s", cloud.ProviderName())
}

volumeLimits := map[string]int64{

This comment has been minimized.

@andyzhangx

andyzhangx Jun 4, 2018

Member

Question: @gnufied it looks like I need to write code here to query volume limits according to azure node type , and assign new value to volumeLimits in azure_dd.go, right?

This comment has been minimized.

@gnufied

gnufied Jun 4, 2018

Author Member

yes that is correct. I did not know how we can get those limits for Azure Disk and hence I went with existing defaults.

@yujuhong

This comment has been minimized.

Copy link
Member

yujuhong commented Jun 4, 2018

This broke the node e2e alpha suite: https://k8s-testgrid.appspot.com/sig-node-kubelet#kubelet-alpha

Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: E0602 14:36:19.366284    1727 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /usr/local/go/src/runtime/asm_amd64.s:573
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /usr/local/go/src/runtime/panic.go:502
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /usr/local/go/src/runtime/panic.go:63
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /usr/local/go/src/runtime/signal_unix.go:388
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/volume/azure_dd/azure_dd.go:116
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_node_status.go:341
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_node_status.go:779
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_node_status.go:1117
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_node_status.go:1111
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_node_status.go:1099
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_node_status.go:324
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_node_status.go:73
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_node_status.go:362
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet.go:1367
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
Jun 02 14:36:19 tmp-node-e2e-53507b65-cos-stable-63-10032-71-0 kubelet[1727]: /usr/local/go/src/runtime/asm_amd64.s:2361
@gnufied

This comment has been minimized.

Copy link
Member Author

gnufied commented Jun 4, 2018

@yuanying looking. ty

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.