
/stats/summary endpoints failing with "Context Deadline Exceeded" on Windows nodes with very high CPU usage #95735

Closed
marosset opened this issue Oct 20, 2020 · 21 comments · Fixed by #95950 or #96051
Labels
area/kubelet kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/windows Categorizes an issue or PR as relevant to SIG Windows.

@marosset
Contributor

What happened:
Windows nodes running with high (>90%) CPU usage are having calls to /stats/summary time out with "Context Deadline Exceeded".

When this happens, metrics-server cannot gather accurate CPU/memory usage for the node or the pods running on it, which results in:

  • HPA not functioning
  • kubectl top node/pod returning <unknown> instead of values

What you expected to happen:
Windows nodes should be able to run with high CPU utilization without negatively impacting Kubernetes functionality

How to reproduce it (as minimally and precisely as possible):

  1. Create a cluster with Windows Server 2019 nodes
  2. Schedule CPU-intensive workloads into the cluster
  3. Scale the workload to put the node into a high-CPU state (a minimal sketch of steps 2-4 follows this list)
  4. Query node metrics with kubectl top node
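
A minimal sketch of those steps, assuming a generic CPU-burning Windows image (the image name and replica count here are placeholders; the comments below use deployments named iis-2019-burn-cpu and validate-windows-cpu-consumption):

kubectl create deployment cpu-burn --image=<cpu-burn-image>
kubectl scale deployment cpu-burn --replicas=9    # enough replicas to exceed allocatable CPU
kubectl top node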

Anything else we need to know?:
This issue appears to be easier to reproduce on 1.18.x nodes than on 1.16.x or 1.17.x nodes.


metrics-server logs show
E1020 10:58:24.539304 1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:akswin000000: unable to fetch metrics from Kubelet akswin000000 (10.240.0.115): Get https://10.240.0.115:10250/stats/summary?only_cpu_and_memory=true: context deadline exceeded

Kubelet logs show

E1002 00:26:59.894818    4680 remote_runtime.go:495] ListContainerStats with filter &ContainerStatsFilter{Id:,PodSandboxId:,LabelSelector:map[string]string{},} 
from runtime service failed: rpc error: code = Unknown desc = operation timeout: context deadline exceeded
E1002 00:43:44.801208    4680 handler.go:321] HTTP InternalServerError serving /stats/summary: Internal Error: failed to list pod stats: 
failed to list all container stats: rpc error: code = Unknown desc = operation timeout: context deadline exceeded

and

E1002 00:46:31.844828    4680 remote_runtime.go:495] ListContainerStats with filter &ContainerStatsFilter{Id:,PodSandboxId:,LabelSelector:map[string]string{},} 
from runtime service failed: rpc error: code = Unknown desc = operation timeout: context deadline exceeded

Environment:

  • Kubernetes version (use kubectl version): 1.18.8
  • Cloud provider or hardware configuration: AKS / Azure
  • OS (e.g: cat /etc/os-release): Windows Server 2019
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others: Docker EE version 19.3.11
@marosset added the kind/bug label on Oct 20, 2020
@k8s-ci-robot added the needs-sig label on Oct 20, 2020
@k8s-ci-robot
Contributor

@marosset: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-triage label on Oct 20, 2020
@marosset
Contributor Author

/sig windows
/area kubelet

@k8s-ci-robot added the sig/windows and area/kubelet labels and removed the needs-sig label on Oct 20, 2020
@jsturtevant
Contributor

I was able to reproduce the context deadline exceeded error on 1.18.8 during a scale operation (kubectl scale --replicas 9 -n default deployment iis-2019-burn-cpu) that fully subscribes all CPUs in Allocatable:

describe node 3654k8s00000000
Capacity:                                                                                       
  cpu:                            2                                           
  ephemeral-storage:              30945276Ki                                                                                                                 
  memory:                         8388148Ki                                                                                                                  
  pods:                           30                                          
Allocatable:                                                                                             
  cpu:                            2                                           
  ephemeral-storage:              30945276Ki                                                                                                                 
  memory:                         6290996Ki
<clip>
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                       Requests    Limits
  --------                       --------    ------
  cpu                            2 (100%)    2 (100%)
  memory                         2400m (0%)  6400m (0%)
  ephemeral-storage              0 (0%)      0 (0%)
  attachable-volumes-azure-disk  0           0
k top nodes
NAME                                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%     
k8s-agentlinux-36542735-vmss000000   62m          3%     920Mi           12%         
k8s-master-36542735-0                170m         8%     2255Mi          31%         
3654k8s00000000                      <unknown>                           <unknown>               <unknown>               <unknown>

Metrics-server logs:

 23:24:22.986388       1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:3654k8s00000000: [unable to get CPU for container "iis" in pod default/iis-2019-burn-cpu-56b7bcf8b5-2ndl4 on node "3654k8s00000000", discarding data: missing cpu usage metric, unable to get CPU for container "iis" in pod default/iis-2019-burn-cpu-56b7bcf8b5-skqhv on node "3654k8s00000000", discarding data: missing cpu usage metric, unable to get CPU for container "iis" in pod default/iis-2019-burn-cpu-56b7bcf8b5-446lb on node "3654k8s00000000", discarding data: missing cpu usage metric, unable to get CPU for container "iis" in pod default/iis-2019-burn-cpu-56b7bcf8b5-zt2x4 on node "3654k8s00000000", discarding data: missing cpu usage metric]
E1020 23:25:27.264682       1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:3654k8s00000000: unable to fetch metrics from Kubelet 3654k8s00000000 (3654k8s00000000): Get https://3654k8s00000000:10250/stats/summary?only_cpu_and_memory=true: context deadline exceeded

I do not see "context deadline" in the kubelet logs, but I did find some associated errors:

> I1020 23:24:54.460804    5052 kubelet.go:1953] SyncLoop (PLEG): "iis-2019-burn-cpu-56b7bcf8b5-zt2x4_default(10d79830-998d-4664-a95f-3d417d5d4a38)", event: &pleg.PodLifecycleEvent{ID:"10d79830-998d-4664-a95f-3d417d5d4a38", 
Type:"ContainerStarted", Data:"357eb5a243edf86b91f41b7e84765b07e0e3e402c7f58e2f4f4d8f30eb9a6131"}                                                                                                                                 
  E1020 23:25:28.857884    5052 handler.go:309] Error writing response: http2: stream closed                                                                                                                                      
  time="2020-10-20T23:25:40Z" level=error msg="failed to waitForNotification: callbackNumber 12470 does not exist in callbackMap"     

Once the pods are scheduled, the node remains at 100% CPU and I no longer see errors, but kubectl top nodes is not accurate:

 kubectl top nodes
NAME                                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%  
3654k8s00000000                      1200m        60%    1704Mi          27%      

on the node (run after the pod deployment stabilized)

GET-COUNTER -Counter "\Processor(_Total)\% Processor Time" -SampleInterval 2 -MaxSamples 3

Timestamp                 CounterSamples
---------                 --------------
10/21/2020 3:43:10 PM     \\3654k8s00000000\processor(_total)\% processor time :
                          100


10/21/2020 3:43:46 PM     \\3654k8s00000000\processor(_total)\% processor time :
                          99.9556389707573


10/21/2020 3:43:54 PM     \\3654k8s00000000\processor(_total)\% processor time :
                          100

@jsturtevant
Contributor

After adding cpu=500m to the kubelet's --system-reserved= flag, I no longer see the errors. This reserves some CPU for kubelet and other host processes. We call this out in the documentation for memory but do not do the same for CPU.
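
For reference, a minimal sketch of the flag form (the 500m value is just what I tested with and should be scaled to the node's CPU count; the same reservation can also be set via systemReserved in a KubeletConfiguration file):

kubelet --system-reserved=cpu=500m <other existing kubelet flags unchanged>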

@marosset
Contributor Author

Also, without setting system-reserved CPU, it looks like kubectl top node does not accurately reflect CPU usage on the nodes when CPU utilization is very high. When I got some nodes into this state I did not see any error messages in the metrics-server logs, and I also noticed that querying the metrics API directly from http://localhost:8001/apis/metrics.k8s.io/v1beta1/nodes showed recent timestamps.
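
For anyone checking the same thing, the metrics API can be queried directly, assuming kubectl proxy is running on its default port 8001 (which is where the URL above comes from), or via kubectl get --raw:

kubectl proxy &
curl http://localhost:8001/apis/metrics.k8s.io/v1beta1/nodes

kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes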

@jsturtevant
Contributor

I ran into this again during scaling; collecting logs for analysis here:

k scale --replicas=16 deployment validate-windows-cpu-consumption
k get pods
NAME                                                READY   STATUS              RESTARTS   AGE
validate-windows-cpu-consumption-5dd7bb8c89-44nxf   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-5p9zv   0/1     Terminating         0          85s
validate-windows-cpu-consumption-5dd7bb8c89-6h59l   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-6xzbq   0/1     Terminating         0          85s
validate-windows-cpu-consumption-5dd7bb8c89-7m8vq   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-7prvj   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-7vgh9   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-bkqxc   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-c5sj8   0/1     Terminating         0          85s
validate-windows-cpu-consumption-5dd7bb8c89-dqr2k   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-ftgns   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-jcqwm   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-k7djt   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-mxxlk   0/1     Terminating         0          85s
validate-windows-cpu-consumption-5dd7bb8c89-n5fdw   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-nnscp   1/1     Running             0          3m40s
validate-windows-cpu-consumption-5dd7bb8c89-prx2f   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-qk7q5   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-r76rg   0/1     Terminating         0          85s
validate-windows-cpu-consumption-5dd7bb8c89-tcv6x   0/1     Pending             0          20s
validate-windows-cpu-consumption-5dd7bb8c89-wvnzd   0/1     Terminating         0          85s
validate-windows-cpu-consumption-5dd7bb8c89-zx8q5   0/1     ContainerCreating   0          20s
k top nodes  
NAME                                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%     
k8s-agentlinux-30512767-vmss000000   71m          3%     854Mi           11%         
k8s-master-30512767-0                195m         9%     1724Mi          23%         
3051k8s00000000                      <unknown>                           <unknown>               <unknown>               <unknown>    

Metrics-server logs:

k logs -n kube-system metrics-server-b6dbc49fc-tc9wr 
E1022 20:55:01.280170       1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:3051k8s00000000: unable to fetch metrics from Kubelet 3051k8s00000000 (10.240.0.65): request failed - "500 Internal Server Error", response: "Internal Error: failed to list pod stats: failed to list all container stats: rpc error: code = Unknown desc = container 22e29768e0f0df2496a61ac0f59f462bafe61e0148a30b111d3090d930f3c3e8 encountered an error during Properties: failure in a Windows system call: The requested virtual machine or container operation is not valid in the current state. (0xc0370105)"
E1022 20:55:43.002766       1 reststorage.go:135] unable to fetch node metrics for node "3051k8s00000000": no metrics known for node
E1022 20:56:17.838434       1 reststorage.go:135] unable to fetch node metrics for node "3051k8s00000000": no metrics known for node

Kubelet logs:

W1022 20:55:01.271376    5032 empty_dir.go:453] Warning: Failed to clear quota on c:\var\lib\kubelet\pods\a3dffcba-7458-4c11-88f0-dd208413198e\volumes\kubernetes.io~secret\default-token-f89l9: not
implemented
E1022 20:55:01.272353    5032 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/secret/a3dffcba-7458-4c11-88f0-dd208413198e-default-token-f89l9  
podName:a3dffcba-7458-4c11-88f0-dd208413198e nodeName:}" failed. No retries permitted until 2020-10-22 20:55:17.2723536 +0000 GMT m=+8511.317416101 (durationBeforeRetry 16s). Error:
"UnmountVolume.TearDown failed for volume \"default-token-f89l9\" (UniqueName: \"kubernetes.io/secret/a3dffcba-7458-4c11-88f0-dd208413198e-default-token-f89l9\") pod
\"a3dffcba-7458-4c11-88f0-dd208413198e\" (UID: \"a3dffcba-7458-4c11-88f0-dd208413198e\") : remove
c:\\var\\lib\\kubelet\\pods\\a3dffcba-7458-4c11-88f0-dd208413198e\\volumes\\kubernetes.io~secret\\default-token-f89l9: The process cannot access the file because it is being used by another process."      
time="2020-10-22T20:55:01Z" level=error msg="hcsshim::ComputeSystem::Properties - End Operation - Error" cid=22e29768e0f0df2496a61ac0f59f462bafe61e0148a30b111d3090d930f3c3e8 error="Properties
22e29768e0f0df2496a61ac0f59f462bafe61e0148a30b111d3090d930f3c3e8: The requested virtual machine or container operation is not valid in the current state."
E1022 20:55:01.273353    5032 remote_runtime.go:495] ListContainerStats with filter &ContainerStatsFilter{Id:,PodSandboxId:,LabelSelector:map[string]string{},} from runtime service failed: rpc error:      
code = Unknown desc = container 22e29768e0f0df2496a61ac0f59f462bafe61e0148a30b111d3090d930f3c3e8 encountered an error during Properties: failure in a Windows system call: The requested virtual machine or  
container operation is not valid in the current state. (0xc0370105)
E1022 20:55:01.274356    5032 handler.go:321] HTTP InternalServerError serving /stats/summary: Internal Error: failed to list pod stats: failed to list all container stats: rpc error: code = Unknown desc  
= container 22e29768e0f0df2496a61ac0f59f462bafe61e0148a30b111d3090d930f3c3e8 encountered an error during Properties: failure in a Windows system call: The requested virtual machine or container operation  
is not valid in the current state. (0xc0370105)
I1022 20:55:01.344609    5032 kuberuntime_manager.go:422] No sandbox for pod "validate-windows-cpu-consumption-5dd7bb8c89-zx8q5_default(1405d68f-c97e-4232-aafe-b279f30e87d1)" can be found. Need to start   
a new one
W1022 20:55:02.034274    5032 cri_stats_provider_windows.go:89] Failed to get HNS endpoint "" with error 'json: cannot unmarshal array into Go value of type hns.HNSEndpoint', continue to get stats for     
other endpoints
time="2020-10-22T20:55:03Z" level=error msg="hcsshim::ComputeSystem::Properties - End Operation - Error" cid=c1f33c99795ccde9566fcd1f53fbcd9f0f90e98c843db08e855481827b2bf678 error="Properties
c1f33c99795ccde9566fcd1f53fbcd9f0f90e98c843db08e855481827b2bf678: The requested virtual machine or container operation is not valid in the current state."

@jsturtevant
Contributor

@jeremyje @pjh have you experienced this at all?

@jeremyje
Contributor

I believe @jingxu97 has seen high-CPU issues before, though not this one in particular. I've heard there were kernel issues in the past that triggered high-CPU cases.

@marosset
Contributor Author

Some investigation on our end is showing that the DiagTrack service running in the infra/pause container can consume non-trivial amounts of CPU.
I'll create an issue to address that problem, as it might help some people here.
That certainly will not eliminate the problem we are seeing, though.

@marosset
Contributor Author

#95840 is an issue for DiagTrack

@marosset
Contributor Author

marosset commented Oct 26, 2020

The difference in behavior between 1.17.* and 1.18.* clusters looks like it is caused by #86101 (which was NOT backported to any prior releases).

K8s version <= 1.17 behavior
Prior to these changes, CPU weights for containers were inferred from the limit values set in the pod specs.

The observed behavior on Windows systems when CPU weights were used was that, under high CPU load, pods/containers were not guaranteed their scheduling time.

The good news here is that system critical services like kubelet/docker/etc were still able to get enough CPU time for the nodes to function properly.

K8s version >= 1.18 behavior
After these changes, CPU limits were enforced for pods.
The observed behavior on 1.18 clusters is that when a Windows node enters a high-CPU state, containers are guaranteed their scheduling time but host resources start to get starved. Usually this manifests as the node entering the NotReady state, metrics not working correctly (because calls to /stats/summary are timing out), or both.
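
To make the 1.18 behavior concrete: Windows job objects express a hard CPU cap as a portion of total host CPU on a 0-10000 scale (10000 = 100% of all logical processors), so a pod's milli-CPU limit has to be converted to that scale. The snippet below is purely illustrative arithmetic along those lines, not the exact kubelet code; it shows why a 2-CPU node whose pod limits sum to 2 full CPUs leaves no explicit headroom for kubelet/docker/HCS unless CPU is reserved.

package main

import "fmt"

// cpuMaximum is an illustrative conversion (not the kubelet's exact code) from a
// pod-spec milli-CPU limit to the Windows job-object cap scale, where 10000
// means 100% of all logical processors on the host.
func cpuMaximum(milliCPULimit, numCPU int64) int64 {
    m := 10000 * milliCPULimit / (numCPU * 1000)
    // Clamp into the 1..10000 range accepted by the job object.
    if m < 1 {
        m = 1
    }
    if m > 10000 {
        m = 10000
    }
    return m
}

func main() {
    // A 2-CPU node with pod limits summing to 2000m: pods may consume the full
    // 10000 (100% of both cores), leaving nothing set aside for host services.
    fmt.Println(cpuMaximum(2000, 2)) // 10000
    // The same node with 500m reserved for the system: pod limits of 1500m map
    // to 7500, i.e. 75% of total CPU.
    fmt.Println(cpuMaximum(1500, 2)) // 7500
}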

Suggestion / possible fixes

  • Set the Windows process priority of the kubelet to AboveNormal / High to help ensure it gets CPU time
    • Kubelet can do this itself through golang.org/x/sys/windows calls (GetCurrentProcess() and Get/SetPriorityClass()); a minimal sketch follows this list
  • Reserve CPU with --system-reserved=cpu=500m
    • Based on my experimentation, this needs to be scaled with the number of CPUs on the host
    • Current guidance on k8s.io for Windows only mentions reserving system memory
    • It is still possible to over-commit CPU and starve kubelet/docker/HCS if users schedule pods with mismatched requests and limits
  • Update the k8s.io documentation.
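
A minimal sketch of the first suggestion, assuming the golang.org/x/sys/windows package (whether ABOVE_NORMAL_PRIORITY_CLASS or HIGH_PRIORITY_CLASS is the right choice is still open):

//go:build windows

package main

import (
    "log"

    "golang.org/x/sys/windows"
)

func main() {
    // GetCurrentProcess returns a pseudo-handle for the calling process.
    self, err := windows.GetCurrentProcess()
    if err != nil {
        log.Fatalf("GetCurrentProcess: %v", err)
    }
    // Raise the priority class so the process keeps being scheduled when the
    // node is CPU-saturated; HIGH_PRIORITY_CLASS would be the more aggressive choice.
    if err := windows.SetPriorityClass(self, windows.ABOVE_NORMAL_PRIORITY_CLASS); err != nil {
        log.Fatalf("SetPriorityClass: %v", err)
    }
    log.Printf("process priority class raised to ABOVE_NORMAL")
}

The call would only need to happen once at kubelet startup on Windows builds, which is why the sketch is guarded by a windows build tag.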

@ravisantoshgudimetla
Contributor

Set the Windows process priority of the kubelet to AboveNormal / High to help ensure it gets CPU time

Kubelet can do this itself through golang.org/x/sys/windows calls (GetCurrentProcess() and Get/SetPriorityClass())

I can work on this. I am wondering if we should do the same for the CRI runtime (e.g. Docker) and CNI processes.

@marosset
Contributor Author

@ravisantoshgudimetla Sure, if you want to work on this go ahead.
I think we should probably make this toggleable via a kubelet flag.
I believe there are other Windows-specific flags already implemented in the kubelet (like the one to run as a Windows service, IIRC).

@jsturtevant
Contributor

docker/containerd should probably have a flag that allows them to run at high priority as well, but those would likely be separate issues in the respective projects.

@marosset
Contributor Author

marosset commented Sep 2, 2021

/reopen
I don't think this was ever fully addressed

@k8s-ci-robot
Contributor

@marosset: Reopened this issue.

In response to this:

/reopen
I don't think this was ever fully addressed

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Sep 2, 2021
SIG-Windows automation moved this from Done (v1.19) to Backlog Sep 2, 2021
@jsturtevant
Contributor

also related to #104283

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label on Dec 1, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Dec 31, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

SIG-Windows automation moved this from Backlog to Done (v1.23) Jan 31, 2022