
Fix code to handle delayed binding volumes #73863

Merged
merged 2 commits into kubernetes:master from gnufied:fix-attach-limit-delayed on Apr 1, 2019

Conversation

@gnufied
Member

commented Feb 8, 2019

This fixes counting of volumes that are not immediately bound, so that PVCs using delayed binding are still counted towards the attach limit (a rough sketch of the idea is included below).

fixes #73724

/sig storage

Count PVCs that are unbound towards attach limit
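
For reviewers skimming the change, here is a minimal, hypothetical Go sketch of the idea (not the actual predicate code from this PR, and using simplified stand-in types rather than the real Kubernetes API objects): claims are counted per CSI driver against a node's attach limit, and a claim that is still unbound because of delayed binding is attributed to a driver via its StorageClass provisioner instead of being skipped.

package main

import "fmt"

// Simplified stand-ins for the real Kubernetes objects; the field names here
// are illustrative only, not the actual API types used by the scheduler.
type PVC struct {
	Name             string
	VolumeName       string // non-empty once the claim is bound to a PV
	StorageClassName string
}

type StorageClass struct {
	Name        string
	Provisioner string // CSI driver name for dynamically provisioned volumes
}

type PV struct {
	Name      string
	CSIDriver string
}

// countVolumesByDriver attributes each claim to a CSI driver so a scheduler
// predicate could compare the totals against a node's attach limit. The point
// of the fix discussed in this PR: a claim that is still unbound (delayed
// binding / WaitForFirstConsumer) has no PV yet, so its driver is derived from
// the claim's StorageClass provisioner instead of the claim being skipped.
func countVolumesByDriver(claims []PVC, pvs map[string]PV, classes map[string]StorageClass) map[string]int {
	counts := map[string]int{}
	for _, c := range claims {
		if c.VolumeName != "" {
			// Bound claim: the PV tells us which driver owns the volume.
			if pv, ok := pvs[c.VolumeName]; ok {
				counts[pv.CSIDriver]++
			}
			continue
		}
		// Unbound claim: fall back to the StorageClass provisioner so the
		// volume still counts towards the attach limit.
		if sc, ok := classes[c.StorageClassName]; ok {
			counts[sc.Provisioner]++
		}
	}
	return counts
}

func main() {
	classes := map[string]StorageClass{
		"fast": {Name: "fast", Provisioner: "ebs.csi.aws.com"},
	}
	pvs := map[string]PV{
		"pv-1": {Name: "pv-1", CSIDriver: "ebs.csi.aws.com"},
	}
	claims := []PVC{
		{Name: "bound-claim", VolumeName: "pv-1", StorageClassName: "fast"},
		{Name: "pending-claim", StorageClassName: "fast"}, // delayed binding, not bound yet
	}
	fmt.Println(countVolumesByDriver(claims, pvs, classes))
}

Running the sketch prints map[ebs.csi.aws.com:2], i.e. the pending (delayed-binding) claim is counted alongside the bound one.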
@msau42

Member

commented Feb 8, 2019

/assign

@gnufied

Member Author

commented Feb 11, 2019

/retest

@bsalamat

Member

commented Feb 11, 2019

Please follow instructions here to post scheduler benchmarks before and after this PR.

@gnufied

Member Author

commented Feb 11, 2019

/kind bug

@k8s-ci-robot added kind/bug and removed needs-kind labels Feb 11, 2019

@gnufied

Member Author

commented Feb 11, 2019

Something is wrong with my setup of scheduler integration tests:

I0211 15:18:29.083115   22229 retry.go:149] clientv3/auth-retry: error "context canceled" on pinned endpoint "127.0.0.1:2379"                                                                                       
E0211 15:18:29.083160   22229 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}                                                                     
E0211 15:18:29.083204   22229 writers.go:172] apiserver was unable to write a JSON response: http: Handler timeout                                                                                                  
I0211 15:18:29.083304   22229 wrap.go:47] GET /apis/rbac.authorization.k8s.io/v1/clusterroles: (1m0.000470325s) 504                                                                                                 
goroutine 1560647 [running]:                      
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).recordStatus(0xc01b901c70, 0x1f8)                                                                                                        
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog/httplog.go:204 +0xd2                                                       
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).WriteHeader(0xc01b901c70, 0x1f8)                                                                                                         
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/httplog/httplog.go:183 +0x35                                                       
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*baseTimeoutWriter).timeout(0xc03fab6aa0, 0xc020a2ef30)                                                                                               
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:218 +0xb5                                                       
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.(*timeoutHandler).ServeHTTP(0xc04d746040, 0x5db9ce0, 0xc01b901c70, 0xc01482fa00)                                                                       
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:118 +0x2aa                                                      
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.WithWaitGroup.func1(0x5db9ce0, 0xc01b901c70, 0xc01482f900)                                                                                             
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/waitgroup.go:47 +0xd6                                                      
net/http.HandlerFunc.ServeHTTP(0xc04d774b10, 0x5db9ce0, 0xc01b901c70, 0xc01482f900)                
        /data/go/src/net/http/server.go:1964 +0x44                                                                                  
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters.WithRequestInfo.func1(0x5db9ce0, 0xc01b901c70, 0xc01482f800)                                                                                        
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/endpoints/filters/requestinfo.go:39 +0x217                                                
net/http.HandlerFunc.ServeHTTP(0xc04d774b40, 0x5db9ce0, 0xc01b901c70, 0xc01482f800)  
        /data/go/src/net/http/server.go:1964 +0x44
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters.withPanicRecovery.func1(0x5db9ce0, 0xc01b901c70, 0xc01482f800)                                                                                         
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/filters/wrap.go:46 +0x127                                                          
net/http.HandlerFunc.ServeHTTP(0xc04d746060, 0x5dbada0, 0xc033987420, 0xc01482f800)
        /data/go/src/net/http/server.go:1964 +0x44 
k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server.(*APIServerHandler).ServeHTTP(0xc04d774b70, 0x5dbada0, 0xc033987420, 0xc01482f800)                                                                             
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/vendor/k8s.io/apiserver/pkg/server/handler.go:189 +0x51                                                               
k8s.io/kubernetes/test/integration/util.StartApiserver.func1(0x5dbada0, 0xc033987420, 0xc01482f800)
        /home/shared/go_projects/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/integration/util/util.go:44 +0x72                                                                                
net/http.HandlerFunc.ServeHTTP(0xc01b95c690, 0x5dbada0, 0xc033987420, 0xc01482f800)                                                            
        /data/go/src/net/http/server.go:1964 +0x44                                                                
net/http.serverHandler.ServeHTTP(0xc01d1a8000, 0x5dbada0, 0xc033987420, 0xc01482f800)                              
        /data/go/src/net/http/server.go:2741 +0xab
net/http.(*conn).serve(0xc013255220, 0x5dc4ea0, 0xc04f058980)                                               
        /data/go/src/net/http/server.go:1847 +0x646                                                                                                          
created by net/http.(*Server).Serve                                                                        
        /data/go/src/net/http/server.go:2851 +0x2f5                                                                                                          
                                                                                                                     
logging error output: "{\"metadata\":{},\"status\":\"Failure\",\"message\":\"Timeout: request did not complete within 1m0s\",\"reason\":\"Timeout\",\"details\":{},\"code\":504}\n"                                 
 [scheduler_perf.test/v0.0.0 (linux/amd64) kubernetes/$Format 127.0.0.1:55076]                                                               
E0211 15:18:29.083507   22229 master_utils.go:234] last health content: ""                                                                                    
F0211 15:18:29.083530   22229 master_utils.go:235] timed out waiting for the condition                                 
E0211 15:18:29.440434   22229 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}                                                                
W0211 15:18:29.083562   22229 storage_scheduling.go:95] unable to get PriorityClass system-node-critical: the server was unable to return a response in the time allotted, but may still be processing the request (get priorityclasses.scheduling.k8s.io system-node-critical). Retrying...
exit status 255                                                                                                             
FAIL    k8s.io/kubernetes/test/integration/scheduler_perf       203.972s                         

@bsalamat fwiw - this PR should not introduce any regression in scheduler performance, because all it does is ensure we are counting unbound but scheduled PVCs towards the volume limit.

@gnufied

Member Author

commented Feb 11, 2019

@bsalamat so I ran ../test-performance.sh, which gave me:

Before:

Scheduled 3000 Pods in 3 seconds (1000 per second on average). min QPS was 207

After:

Scheduled 3000 Pods in 3 seconds (1000 per second on average). min QPS was 206

@gnufied

Member Author

commented Feb 15, 2019

@msau42 addressed review comments PTAL.

@gnufied force-pushed the gnufied:fix-attach-limit-delayed branch from ce3e1d0 to 2385eaf Feb 15, 2019

@k8s-ci-robot added the size/L label Feb 15, 2019

@gnufied force-pushed the gnufied:fix-attach-limit-delayed branch from 2385eaf to d1fada4 Feb 22, 2019

@childsb added this to the v1.14 milestone Mar 6, 2019

@nikopen

Member

commented Mar 12, 2019

@childsb @gnufied @msau42
we're in code freeze and the release is two weeks from now. Is this bug urgent and release-blocking?

@spiffxp

Member

commented Mar 13, 2019

/milestone clear
v1.14 release lead here. Given the lack of activity on this PR, I'm removing it from the milestone. Please discuss in #sig-release on Slack if it's urgent that this land as part of v1.14.

@k8s-ci-robot removed this from the v1.14 milestone Mar 13, 2019

@gnufied force-pushed the gnufied:fix-attach-limit-delayed branch from d1fada4 to a3a93c5 Mar 14, 2019

@gnufied

Member Author

commented Mar 14, 2019

I guess we can merge this after code freeze.

@msau42 addressed review comments. PTAL.

@msau42

Member

commented Mar 26, 2019

/lgtm

@k8s-ci-robot added the lgtm label Mar 26, 2019

@spiffxp

Member

commented Mar 26, 2019

/priority important-soon

@bsalamat
Member

left a comment

/approve

@k8s-ci-robot

Contributor

commented Apr 1, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, gnufied

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot merged commit 76b1d83 into kubernetes:master Apr 1, 2019

17 checks passed

cla/linuxfoundation gnufied authorized
pull-kubernetes-bazel-build Job succeeded.
pull-kubernetes-bazel-test Job succeeded.
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-e2e-gce Job succeeded.
pull-kubernetes-e2e-gce-100-performance Job succeeded.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
pull-kubernetes-typecheck Job succeeded.
pull-kubernetes-verify Job succeeded.
pull-publishing-bot-validate Skipped.
tide In merge pool.