
Large kubemark performance tests failing with timeout during ns deletion #53327

Closed
shyamjvs opened this issue Oct 2, 2017 · 81 comments
Assignees: shyamjvs
Labels: kind/bug · kind/failing-test · priority/critical-urgent · sig/node · sig/scalability · sig/scheduling

Comments

@shyamjvs
Member

shyamjvs commented Oct 2, 2017

We've been seeing these failures continuously in kubemark-5000 for quite some time now - https://k8s-testgrid.appspot.com/sig-scalability#kubemark-5000
Even in kubemark-500 we're occasionally seeing flakes - https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-kubemark-500-gce/8806

On first look, I'm seeing pods on hollow-nodes stuck in Pending during test namespace deletion:

I1002 02:45:51.134] Oct  2 02:45:51.134: INFO: POD                     NODE               PHASE    GRACE  CONDITIONS
I1002 02:45:51.135] Oct  2 02:45:51.134: INFO: load-small-14408-4r45s  hollow-node-lmblz  Pending  1s     [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2017-10-01 20:26:46 +0000 UTC  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2017-10-01 20:26:46 +0000 UTC ContainersNotReady containers with unready status: [load-small-14408]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2017-10-01 20:26:46 +0000 UTC  }]
I1002 02:45:51.135] Oct  2 02:45:51.134: INFO: load-small-14408-q8tfd  hollow-node-lm9c4  Pending  1s     [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2017-10-01 20:26:46 +0000 UTC  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2017-10-01 20:26:46 +0000 UTC ContainersNotReady containers with unready status: [load-small-14408]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2017-10-01 20:26:46 +0000 UTC  }]

Digging into it now. Sorry for the delay, I've been busy with release scalability validation.

cc @kubernetes/sig-scalability-bugs @wojtek-t @gmarek

@shyamjvs shyamjvs self-assigned this Oct 2, 2017
@k8s-ci-robot k8s-ci-robot added sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. kind/bug Categorizes issue or PR as related to a bug. labels Oct 2, 2017
@shyamjvs shyamjvs changed the title Large kubemark performance tests failing with timeout during deletion Large kubemark performance tests failing with timeout during ns deletion Oct 2, 2017
@shyamjvs shyamjvs added kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. priority/P0 labels Oct 2, 2017
@shyamjvs
Member Author

shyamjvs commented Oct 2, 2017

I'm seeing multiple such error lines in the hollow-kubelet logs of the currently live cluster:

E1002 13:41:13.582577       7 kubelet.go:1220] Image garbage collection failed multiple times in a row: invalid capacity 0 on image filesystem
E1002 13:41:13.998714       7 kubelet_network.go:305] Failed to ensure that nat chain KUBE-MARK-DROP exists: error creating chain "KUBE-MARK-DROP": executable file not found in $PATH: 
E1002 13:41:14.435637       7 eviction_manager.go:238] eviction manager: unexpected err: failed to get root cgroup stats: failed to get cgroup stats for "/": unexpected number of containers: 0
E1002 13:41:24.435805       7 eviction_manager.go:238] eviction manager: unexpected err: failed to get root cgroup stats: failed to get cgroup stats for "/": unexpected number of containers: 0

@kubernetes/sig-node-bugs @yujuhong @Random-Liu - Any ideas?

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. kind/bug Categorizes issue or PR as related to a bug. labels Oct 2, 2017
@shyamjvs
Member Author

shyamjvs commented Oct 2, 2017

Seems like the kubelet's ImageGCPeriod is 5m, and I'm seeing such logs every 5 minutes - so it's pretty much failing every time.

FYI - I'm also seeing this error logged once at the start:

E1002 04:31:13.470857       7 eviction_manager.go:238] eviction manager: unexpected err: failed to get node info: node 'hollow-node-5n2f6' not found

[Edit]: Ignore this error. It's only seen initially, when node info is not available yet. Later it goes fine and reaches the next stage, where root cgroup stats are fetched, but that too fails (as I think the hollow-kubelet doesn't expose stats). These errors are unrelated to our problem, IIUC.

@shyamjvs
Member Author

shyamjvs commented Oct 3, 2017

OK, so here's what I found out from the live cluster:

  • A few namespaces are stuck in the Terminating state forever during the test's deletion phase, and the job eventually times out.
  • Those namespaces are in that state because each has a density-test pod that is stuck Terminating forever:
e2e-tests-density-30-10-r6h65   density-latency-pod-1309-sjgj8   0/1       Terminating   0          10h
e2e-tests-density-30-35-fmg9s   density-latency-pod-3284-zl2gp   0/1       Terminating   0          9h
e2e-tests-density-30-36-c525x   density-latency-pod-2735-rrdnk   0/1       Terminating   0          9h
e2e-tests-density-30-8-kd2sc    density-latency-pod-1307-4fj7d   0/1       Terminating   0          10h
...
  • Looking at one such pod, I found that its deletion timestamp had been set and its termination grace period was 30s (but the pod was never deleted - even hours after that timestamp):
Name:				density-latency-pod-1309-sjgj8
Namespace:			e2e-tests-density-30-10-r6h65
Node:				hollow-node-zvx46/10.64.69.26
Start Time:			Mon, 02 Oct 2017 06:25:33 +0000
Labels:				name=density-latency-pod-1309
				type=density-latency-pod
Annotations:			kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicationController","namespace":"e2e-tests-density-30-10-r6h65","name":"density-latency-pod-1309","uid"...
Status:				Terminating (expires Mon, 02 Oct 2017 06:26:03 +0000)
Termination Grace Period:	30s
IP:				
Created By:			ReplicationController/density-latency-pod-1309
Controlled By:			ReplicationController/density-latency-pod-1309
  • It seems like the GC didn't ensure the pod's deletion. From the controller-manager logs, here's what I found:
I1002 05:34:50.054625       9 replication_controller.go:450] Too few "e2e-tests-density-30-10-r6h65"/"density-latency-pod-1309" replicas, need 1, creating 1
I1002 05:34:50.056816       9 event.go:218] Event(v1.ObjectReference{Kind:"ReplicationController", Namespace:"e2e-tests-density-30-10-r6h65", Name:"density-latency-pod-1309", UID:"68bcbf97-a733-11e7-a55f-42010a8e0fd3", APIVersion:"v1", ResourceVersion:"2846356", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: density-latency-pod-1309-26pgw
I1002 06:25:29.924919       9 garbagecollector.go:375] processing item [v1/Pod, namespace: e2e-tests-density-30-10-r6h65, name: density-latency-pod-1309-26pgw, uid: 68be2c02-a733-11e7-a55f-42010a8e0fd3]
I1002 06:25:29.924946       9 replication_controller.go:597] Replication Controller has been deleted e2e-tests-density-30-10-r6h65/density-latency-pod-1309
I1002 06:25:29.926630       9 garbagecollector.go:479] delete object [v1/Pod, namespace: e2e-tests-density-30-10-r6h65, name: density-latency-pod-1309-26pgw, uid: 68be2c02-a733-11e7-a55f-42010a8e0fd3] with propagation policy Background
I1002 06:25:33.260880       9 replication_controller.go:450] Too few "e2e-tests-density-30-10-r6h65"/"density-latency-pod-1309" replicas, need 1, creating 1
I1002 06:25:33.267788       9 event.go:218] Event(v1.ObjectReference{Kind:"ReplicationController", Namespace:"e2e-tests-density-30-10-r6h65", Name:"density-latency-pod-1309", UID:"68bcbf97-a733-11e7-a55f-42010a8e0fd3", APIVersion:"v1", ResourceVersion:"2847363", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: density-latency-pod-1309-sjgj8
E1002 06:25:33.270014       9 replication_controller.go:421] replicationcontrollers "density-latency-pod-1309" not found
E1002 06:25:33.505893       9 replication_controller.go:421] replicationcontrollers "density-latency-pod-1309" not found
I1002 06:25:33.753529       9 garbagecollector.go:375] processing item [v1/Pod, namespace: e2e-tests-density-30-10-r6h65, name: density-latency-pod-1309-sjgj8, uid: 7ea33bf3-a73a-11e7-a55f-42010a8e0fd3]
I1002 06:25:33.756248       9 garbagecollector.go:479] delete object [v1/Pod, namespace: e2e-tests-density-30-10-r6h65, name: density-latency-pod-1309-sjgj8, uid: 7ea33bf3-a73a-11e7-a55f-42010a8e0fd3] with propagation policy Background
I1002 06:25:36.664103       9 replication_controller.go:597] Replication Controller has been deleted e2e-tests-density-30-10-r6h65/density-latency-pod-1309
  • Strangely, the replication controller created this pod even after the corresponding RC was deleted (note that the earlier replica density-latency-pod-1309-26pgw was also deleted). The new pod was then almost immediately marked for deletion by the GC.
  • I manually tried deleting the pod multiple times; each time the command reported the pod as successfully deleted, yet the pod remained in the same 'Terminating' state.

Smells like a bug in the GC. Also, it doesn't seem right that a pod can end up in such a 'forever undeletable' state.

@gmarek Any idea about this? Or whom should I be cc'ing here?
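For reference, a rough client-go sketch (modern client-go signatures; the 10-minute threshold and default kubeconfig path are arbitrary choices, not part of the investigation above) that lists pods in the state described here: marked for deletion but still returned by the apiserver long after their grace period.

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the default kubeconfig; an in-cluster config would work too.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	pods, err := client.CoreV1().Pods(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, p := range pods.Items {
		if p.DeletionTimestamp == nil {
			continue // not marked for deletion
		}
		// A pod whose deletion timestamp is long past, yet which the apiserver still
		// returns, is the "forever Terminating" case described in the comment above.
		if time.Since(p.DeletionTimestamp.Time) > 10*time.Minute {
			fmt.Printf("%s/%s terminating since %s\n", p.Namespace, p.Name, p.DeletionTimestamp)
		}
	}
}
```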

@gmarek
Contributor

gmarek commented Oct 3, 2017

I think it's the desired behavior if the kubelet is unresponsive. We stopped issuing 'hard deletes' from the control plane because of StatefulSets (cc @smarterclayton). Now the control plane only marks the Pod for deletion and waits for the kubelet to clean up resources and ack the delete by sending the second delete request.

@shyamjvs
Member Author

shyamjvs commented Oct 3, 2017

FYI - the corresponding hollow-kubelet was alive when I checked.

@gmarek
Contributor

gmarek commented Oct 3, 2017

Then we need logs and involve @dchen1107, @yujuhong or someone else from the Node team.

@shyamjvs
Member Author

shyamjvs commented Oct 3, 2017

Let me turn on hollow-node logs for 5k with --v=2 for kubelet (would that be sufficient @yujuhong?)

@gmarek
Contributor

gmarek commented Oct 3, 2017

My guess is that it's hard to tell up front ;) It certainly won't hurt to have more logs.

@wojtek-t
Member

wojtek-t commented Oct 3, 2017

Are we sure that the nodes are not CPU-starved? I think we've seen something like that in the past in such a scenario.

@shyamjvs
Member Author

shyamjvs commented Oct 3, 2017

I'm seeing the following logs throughout the test from the real kubelet:

I1003 05:00:17.859851    1684 eviction_manager.go:325] eviction manager: no resources are starved

The hollow-kubelet resource usages are not directly available from the logs, but I took a look at the live cluster just now and they seem normal so far (I'll check again during deletion).

@shyamjvs
Member Author

I dug deeper into this today and here's what I found:

  • Earlier, when kubemark-scale had a string of green runs in a row, the runs consistently took less time (~15h); then the time increased (~18h), which is when we also started seeing many failures
  • This increase seems to be due to:
    • If the load test runs first, namespace deletion leads to a timeout (due to pods remaining undeleted)
    • If the density test runs first, the deletion phase takes ~3h longer than before (8m -> 2h for deleting latency pods, and 1h -> 1h40m for deleting saturation pods)
  • This jump in time is seen across runs 614 and 615 (ref: https://k8s-gubernator.appspot.com/builds/kubernetes-jenkins/logs/ci-kubernetes-kubemark-gce-scale?before=624)
  • Here's the diff of commits across those runs

@shyamjvs
Member Author

Will look at the other smaller-scale kubemark jobs too, to see if a similar issue shows up there.

@shyamjvs
Member Author

OK, it seems like there's also a bump in time for the kubemark-500 job (from 47m -> 1h) across runs 8256 and 8257 (https://k8s-gubernator.appspot.com/builds/kubernetes-jenkins/logs/ci-kubernetes-kubemark-500-gce?before=8275). And this was around the same time we saw the bump for kubemark-5000. This is great - we now have a much smaller diff.

@shyamjvs
Member Author

We're also seeing a small difference for kubemark-100 (from 40m -> 45m) across runs 4369 and 4370 (https://k8s-gubernator.appspot.com/builds/kubernetes-jenkins/logs/ci-kubernetes-kubemark-100-gce?before=4385), and the commit diff is the same. From the diff:

@wojtek-t
Member

@shyamjvs - since this is visible in kubemark-500, can you please locally revert the PR you suspect and verify whether that's the cause?

@shyamjvs
Member Author

Yup.. Doing that now.

@shyamjvs
Member Author

Some updates:

  • The increase in deletion-phase time no longer seems to be present in the newer runs of kubemark-scale. Looking at kubemark-500, it seems to have recovered across runs 8844 and 8845 (https://k8s-gubernator.appspot.com/builds/kubernetes-jenkins/logs/ci-kubernetes-kubemark-500-gce?before=8863), and the diff suggests @dashpole 's kubelet-GC fix helped.
  • Interestingly, his PR also fixed this issue with kubemark (besides the thundering-herd issue, which wasn't seen on kubemark). @dashpole - any idea why that happened? Btw, I see that you defined a fake podDeletionProvider used by the fake kubelet runtime manager, but we don't seem to be using it in hollow-kubelet.
  • We're now seeing a second regression (which started slightly before the above was fixed). The time taken by the cluster-saturation phase in the density test went up from 50m -> 7h. It's also observed in kubemark-500 at a smaller scale (2m40s -> 5m20s). I'm now checking the diff locally.

@shyamjvs
Member Author

shyamjvs commented Oct 11, 2017

Unable to reproduce the difference locally. I also can't see any obvious test-infra changes across those runs.
Will try to reproduce the test-infra environment locally as closely as possible.

@wojtek-t
Member

As discussed offline, this is clearly related to the scheduler:

  • before, it was able to schedule 100 pods/s
  • after, it is only scheduling 50 pods/s
    [both consistently]

It seems to me that this is a qps-related issue.

BTW - looking into kubemark-scale, it seems there might be yet another regression in the scheduler.

@kubernetes/sig-scheduling-bugs

@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Oct 11, 2017
@shyamjvs
Member Author

shyamjvs commented Nov 28, 2017

@dashpole From the above, I believe the problem is not that killContainer() failed with an error for that container. Rather, the container was not even seen as part of the pod (because its container ID field was empty), which probably explains why you couldn't see those logs: the for loop you referred to was iterating over an empty list (one which didn't include the container at all).

Does that sound reasonable to you?

@dashpole
Contributor

It's a bug in the fake docker client's filtering. Thanks to @Random-Liu for helping me debug this.

@shyamjvs
Member Author

shyamjvs commented Nov 28, 2017

OK... I think I've found the cause. Before deleting a pod, the kubelet checks whether all of its containers have been deleted (this check was introduced by @dashpole in his PR #50350). However, the way we check that all containers have been deleted is by checking that the pod's containerStatuses field is an empty list. This misses the case where that list is non-empty but all of its entries correspond to non-existent containers (i.e. ones with an empty containerID field, which don't correspond to any real container). We should still allow deletion of the pod in that case.

This is a bug IMO. I'll send out a fix for this. It should most likely also solve this regression.
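To make the condition concrete, here's a hypothetical sketch (not the actual kubelet code) of the difference between the length-only check and one that ignores statuses with no real container behind them:

```go
package sketch

import v1 "k8s.io/api/core/v1"

// naiveCleanedUp mirrors the length-only check described above: it only considers the
// pod's containers gone when containerStatuses is an empty list.
func naiveCleanedUp(status v1.PodStatus) bool {
	return len(status.ContainerStatuses) == 0
}

// cleanedUp also treats entries with an empty ContainerID as already cleaned up, since
// they don't point at any real container.
func cleanedUp(status v1.PodStatus) bool {
	for _, cs := range status.ContainerStatuses {
		if cs.ContainerID != "" {
			return false // a real container is still being tracked
		}
	}
	return true
}
```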

@shyamjvs
Member Author

Sorry - I just saw your comment above after posting mine. What did you find out?
Also, the thing I explained in my comment seems to be a bug. WDYT?

@dashpole
Contributor

where do you see that non-existent containers have an empty containerID?

@dashpole
Contributor

oh, I see

@dashpole
Contributor

So the key is that we are not looking at the status in the status manager; we are looking at the containerStatuses from the status provided by the runtime.
The reason the container status is completely empty is that it was set by the status manager here.
The status from the runtime is based on what ListContainers returns in the fake_client. That should still exist, even though the status in the status manager has been overridden.

@dashpole
Contributor

The other important thing to note here is that the PodResourcesAreReclaimed and HandlePodCleanups functions use different caches. The podCache (used by PodResourcesAreReclaimed) is based on the most recent PLEG events, whereas the runtimeCache (used by HandlePodCleanups) is populated by querying the runtime. Hence the filtering bug means we do not see the container in HandlePodCleanups, but we do see it in PodResourcesAreReclaimed.
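Purely to illustrate the two-cache situation (the types below are hypothetical stand-ins, not the real kubelet caches):

```go
package sketch

// runtimeView models the cache rebuilt by querying the runtime's ListContainers (the
// path HandlePodCleanups uses); the filtering bug dropped the created-but-never-started
// container from this view.
type runtimeView map[string][]string // pod UID -> container IDs

// plegView models the cache fed by PLEG events (the path PodResourcesAreReclaimed uses);
// the same container was still present here, so the pod never looked fully reclaimed.
type plegView map[string][]string // pod UID -> container IDs

// viewsDisagree reports whether the two caches track a different set of containers for a
// pod - the inconsistency that kept the pod stuck.
func viewsDisagree(r runtimeView, p plegView, podUID string) bool {
	return len(r[podUID]) != len(p[podUID])
}
```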

@shyamjvs
Member Author

shyamjvs commented Nov 28, 2017

I discussed this offline with @dashpole, and he updated his PR to fix the bug in the docker client mock. Turns out my guess in #53327 (comment) was right that there are issues on the mock side :) Also, I closed my PR as it's no longer needed.

@dashpole Let's get your change in ASAP, so we can have some results by tomorrow. Hoping for a green run.

k8s-github-robot pushed a commit that referenced this issue Nov 29, 2017
Automatic merge from submit-queue (batch tested with PRs 55893, 55906, 55026). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

[Test Fix] Mock docker network dependencies and fix filtering bug

This PR only affects the mocked docker runtime, and has no impact on the kubelet.

Issue #53327 

When kubernetes creates a pod using the docker shim, it creates a container which contains the pod's network namespace, and then creates containers which specify that namespace.
The current mocked docker does not mock this interaction, and thus allows a container to be created even when the container whose network it is joining does not exist.
This allows the mocked kubelet to end up in a state where the pod does not exist, but a container in the pod does, and this breaks pod deletion.

This fixes the above by only allowing containers to be started if the container whose network it is trying to join is running.

Additionally, this PR fixes a filtering bug where we were incorrectly comparing docker container statuses.

/assign @shyamjvs 
can you test this to see if it fixes the issue?
/assign @Random-Liu 
for approval after @shyamjvs confirms this works.
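A rough sketch of the start-time check that PR describes (hypothetical types, not the actual fake docker client code):

```go
package sketch

import (
	"fmt"
	"strings"
)

type fakeContainer struct {
	ID          string
	Running     bool
	NetworkMode string // "container:<id>" when joining another container's network namespace
}

type fakeDockerClient struct {
	containers map[string]*fakeContainer
}

// StartContainer refuses to start a container that joins the network namespace of a
// container which is not running, mirroring the real daemon's behaviour.
func (f *fakeDockerClient) StartContainer(id string) error {
	c, ok := f.containers[id]
	if !ok {
		return fmt.Errorf("no such container: %s", id)
	}
	if netID := strings.TrimPrefix(c.NetworkMode, "container:"); netID != c.NetworkMode {
		netContainer, ok := f.containers[netID]
		if !ok || !netContainer.Running {
			// The same error shows up in the hollow-kubelet logs later in this thread.
			return fmt.Errorf("cannot join network of a non running container: %s", netID)
		}
	}
	c.Running = true
	return nil
}
```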
@shyamjvs
Member Author

Closing this as we finally have the kubemark-5000 job green - https://k8s-testgrid.appspot.com/sig-scalability-kubemark#kubemark-5000 (run 737)
Thanks everyone for the effort!

/close

@shyamjvs
Member Author

Seems like this issue has appeared again in the following run - https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-kubemark-gce-scale/738
It was a timeout while deleting a namespace, and the build log suggests it's because of undeleted pods. Reopening the bug for further investigation.

cc @dashpole

@shyamjvs shyamjvs reopened this Nov 30, 2017
@shyamjvs
Member Author

shyamjvs commented Nov 30, 2017

My strong feeling is that your fix is working almost as desired, except that occasionally it fails to delete certain containers. I suspect your toDockerContainerStatus() function may not be handling all the cases properly (w.r.t. container states)?

@shyamjvs
Member Author

Sadly, we don't have the hollow-kubelet logs for kubemark-5000 as we're not collecting them due to their explosive volume.

@dashpole
Contributor

The function is based on toDockerContainerStatus and toRuntimeAPIContainerState. It should be correct. It is possible there is another bug somewhere.

@shyamjvs
Member Author

shyamjvs commented Dec 4, 2017

@dashpole I managed to get one of the hollow-kubelet logs from the currently running kubemark-5000 cluster where we're seeing a forever undeleted pod 'load-medium-1010-gh8nw'. Attaching it here -

kubelet-hollow-node-gnc8w.log

And the log lines concerning that particular pod are:

I1204 14:46:20.998627       9 kubelet.go:1836] SyncLoop (ADD, "api"): "load-medium-1010-gh8nw_e2e-tests-load-30-nodepods-11-ksdr9(e428a834-d901-11e7-9448-42010a8e0fd3)"
I1204 14:46:21.102577       9 reconciler.go:217] operationExecutor.VerifyControllerAttachedVolume started for volume "default-token-xchk5" (UniqueName: "kubernetes.io/secret/e428a834-d901-11e7-9448-42010a8e0fd3-default-token-xchk5") pod "load-medium-1010-gh8nw" (UID: "e428a834-d901-11e7-9448-42010a8e0fd3") 
I1204 14:46:21.202808       9 reconciler.go:262] operationExecutor.MountVolume started for volume "default-token-xchk5" (UniqueName: "kubernetes.io/secret/e428a834-d901-11e7-9448-42010a8e0fd3-default-token-xchk5") pod "load-medium-1010-gh8nw" (UID: "e428a834-d901-11e7-9448-42010a8e0fd3") 
I1204 14:46:21.208597       9 operation_generator.go:522] MountVolume.SetUp succeeded for volume "default-token-xchk5" (UniqueName: "kubernetes.io/secret/e428a834-d901-11e7-9448-42010a8e0fd3-default-token-xchk5") pod "load-medium-1010-gh8nw" (UID: "e428a834-d901-11e7-9448-42010a8e0fd3") 
I1204 14:46:21.290338       9 kubelet.go:1852] SyncLoop (DELETE, "api"): "load-medium-1010-gh8nw_e2e-tests-load-30-nodepods-11-ksdr9(e428a834-d901-11e7-9448-42010a8e0fd3)"
I1204 14:46:21.299284       9 kuberuntime_manager.go:385] No sandbox for pod "load-medium-1010-gh8nw_e2e-tests-load-30-nodepods-11-ksdr9(e428a834-d901-11e7-9448-42010a8e0fd3)" can be found. Need to start a new one
I1204 14:46:21.359059       9 kubelet_pods.go:1098] Killing unwanted pod "load-medium-1010-gh8nw"
I1204 14:46:21.795311       9 kubelet.go:1881] SyncLoop (PLEG): "load-medium-1010-gh8nw_e2e-tests-load-30-nodepods-11-ksdr9(e428a834-d901-11e7-9448-42010a8e0fd3)", event: &pleg.PodLifecycleEvent{ID:"e428a834-d901-11e7-9448-42010a8e0fd3", Type:"ContainerDied", Data:"d26c6dc1df8a493f"}
E1204 14:46:21.897191       9 pod_workers.go:186] Error syncing pod e428a834-d901-11e7-9448-42010a8e0fd3 ("load-medium-1010-gh8nw_e2e-tests-load-30-nodepods-11-ksdr9(e428a834-d901-11e7-9448-42010a8e0fd3)"), skipping: failed to "StartContainer" for "load-medium-1010" with RunContainerError: "failed to start container \"4d3a382426ddc7b9\": failed to start container \"4d3a382426ddc7b9\": Error response from daemon: cannot join network of a non running container: d26c6dc1df8a493f"
I1204 14:46:23.332238       9 kubelet_pods.go:1098] Killing unwanted pod "load-medium-1010-gh8nw"

Clearly we're no longer seeing those 'pod is terminated, but some containers have not been cleaned up' messages repeating forever (like we were earlier) - which seems to be thanks to your fix. However, the pod still remains undeleted. My suspicion is something around the network-related hack you added in the hollow-kubelet as part of the fix. Could you PTAL?

@dashpole
Contributor

dashpole commented Dec 4, 2017

Looking...

@dashpole
Contributor

dashpole commented Dec 4, 2017

This looks like the culprit log line:
remote_runtime.go:246] RemoveContainer "4d3a382426ddc7b9" from runtime service failed: rpc error: code = Unknown desc = failed to remove container "4d3a382426ddc7b9": container not stopped

It looks like even though we return an error when starting the container, we have already added that container to the "RunningContainers" list in the fake client during CreateContainer.
I spoke with @Random-Liu, and we think the best solution for now is to change ListContainers so that it filters out containers which have not been started when options.All is not set. I will post a PR.
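A hypothetical sketch of that ListContainers change (not the real fake docker client): when options.All is unset, containers that were created but never started are filtered out.

```go
package sketch

type fakeContainer struct {
	ID      string
	Started bool
}

type listOptions struct {
	All bool // when false, only containers that were actually started should be listed
}

// listContainers hides created-but-never-started containers unless All is requested, so
// they no longer show up as "running" and block pod cleanup.
func listContainers(containers []*fakeContainer, opts listOptions) []*fakeContainer {
	var out []*fakeContainer
	for _, c := range containers {
		if !opts.All && !c.Started {
			continue
		}
		out = append(out, c)
	}
	return out
}
```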

@dashpole
Contributor

dashpole commented Dec 4, 2017

Ref: #56821

k8s-github-robot pushed a commit that referenced this issue Dec 5, 2017
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

[Test Fix] Fake docker client can remove containers which have not been started

**What this PR does / why we need it**:
During kubemark-5000 scalability tests, which use the fake docker client, we encountered a bug where containers could not be deleted if their pod was deleted before the container was started.
This is because we only remove containers from the `ExitedContainers` list. Containers are only added to that list once they have been created, started, and then stopped. However, containers that have only been created, but never started, cannot be deleted. This PR fixes the issue by allowing containers with `State.Running=false` to be deleted.

**Which issue(s) this PR fixes**:
Ref #53327 

**Release note**:
```release-note
NONE
```
/sig node
/kind bug
/priority critical-urgent
/assign @Random-Liu @dchen1107 @shyamjvs
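A small sketch of the fix as described in the PR text above (hypothetical types, not the actual fake docker client): removal succeeds for any container that is not running, whether it exited or was never started at all.

```go
package sketch

import "fmt"

type containerState struct {
	Running bool
}

type fakeContainer struct {
	ID    string
	State containerState
}

type fakeDockerClient struct {
	containers map[string]*fakeContainer
}

// RemoveContainer deletes any container that is not currently running; previously only
// containers that had been started and then stopped could be removed.
func (f *fakeDockerClient) RemoveContainer(id string) error {
	c, ok := f.containers[id]
	if !ok {
		return fmt.Errorf("no such container: %s", id)
	}
	if c.State.Running {
		// Matches the "container not stopped" error quoted earlier in the thread.
		return fmt.Errorf("failed to remove container %q: container not stopped", id)
	}
	delete(f.containers, id)
	return nil
}
```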
@shyamjvs
Member Author

The job is now green - https://k8s-testgrid.appspot.com/sig-scalability-kubemark#kubemark-5000
We've had 3 successful runs (750, 752, 753). Run 751 failed, but that was due to a failure while creating the master (it might help once @porridge moves the job to us-east1-b, if he hasn't already).

Closing this issue. Thanks!

@porridge
Member

porridge commented Dec 11, 2017 via email

@porridge
Member

porridge commented Dec 11, 2017 via email

akhilerm pushed a commit to akhilerm/apimachinery that referenced this issue Sep 20, 2022
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

Optimize random string generator to avoid multiple locks & use bit-masking

Ref kubernetes/kubernetes#53327

We recently started seeing a ~50% decrease in scheduling throughput (e.g. in the kubemark-500 scale job), and it turns out kubernetes/kubernetes#53135 introduced it.
The reason is [this call](https://github.com/jsafrane/kubernetes/blob/2caae38d323720a96be34606589f41488ba82b87/plugin/pkg/scheduler/algorithm/predicates/predicates.go#L272), which creates a random 32-character string.
From the code of the `rand` utility (which is used heavily throughout the system for randomizing object names), I noticed the following performance issues:
- to create an n-character string, we make n calls to `rand.Intn()`, each of which does a lock+unlock operation on the RNG, while a single lock+unlock operation is enough for all of them
- we choose one character (from an alphabet of 27 chars) per random integer, while we can select 10 characters from a single int63 (by masking and bit-shifting), since 1 character uses just 5 bits of randomness
- the character set is defined as a global slice (mutable), so the compiler needs to fetch the slice's length on each invocation of `len()` (and we make n of those), while we can just use a const string (immutable), which makes the length available as a cached constant (yes, Go does that!)

This PR is making the above fixes. I'll try to add some benchmarking to measure the difference (as @wojtek-t suggested).

/cc @kubernetes/sig-scalability-misc @kubernetes/sig-scheduling-bugs @kubernetes/sig-api-machinery-misc @wojtek-t @smarterclayton

Kubernetes-commit: f1d9962fec2ee2d52f41bcbc14b920a3ab66d9a9
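For context, a minimal standalone sketch of the bit-masking idea described in that commit message (using math/rand directly rather than the shared, locked RNG in k8s.io/apimachinery/pkg/util/rand; not the exact upstream code):

```go
package main

import (
	"fmt"
	"math/rand"
)

const (
	alphanums          = "bcdfghjklmnpqrstvwxz2456789" // const string: its length is a compile-time constant
	idxBits            = 5                             // 5 bits are enough to index 27 characters
	idxMask            = 1<<idxBits - 1                // extracts the low 5 bits
	maxAlphanumsPerInt = 63 / idxBits                  // characters obtainable from one Int63
)

// randString builds an n-character string with roughly one RNG call per dozen characters
// instead of one call (and one lock/unlock) per character.
func randString(n int) string {
	b := make([]byte, n)
	random, remaining := rand.Int63(), maxAlphanumsPerInt
	for i := 0; i < n; {
		if remaining == 0 {
			random, remaining = rand.Int63(), maxAlphanumsPerInt
		}
		// Values >= len(alphanums) are rejected to keep the distribution uniform.
		if idx := int(random & idxMask); idx < len(alphanums) {
			b[i] = alphanums[idx]
			i++
		}
		random >>= idxBits
		remaining--
	}
	return string(b)
}

func main() {
	fmt.Println(randString(32)) // e.g. a random 32-character suffix
}
```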