
Many ConfigMaps and Pods slow down cluster, until it becomes unavailable (since 1.12) #74412

Closed
qmfrederik opened this issue Feb 22, 2019 · 38 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
sig/cluster-lifecycle: Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.
sig/node: Categorizes an issue or PR as relevant to SIG Node.
sig/scalability: Categorizes an issue or PR as relevant to SIG Scalability.
Milestone
v1.14

Comments

@qmfrederik

What happened:

I schedule multiple jobs in my cluster. Each job uses a different ConfigMap which contains the configuration for that job.

This worked well on version 1.11 of Kubernetes. After upgrading to 1.12 or 1.13, I've noticed that doing this causes the cluster to slow down significantly, up to the point where nodes are marked as NotReady and no new work is scheduled.

For example, consider a scenario in which I schedule 400 jobs on a single-node cluster, each with its own ConfigMap, where each job simply prints "Hello World".

On v1.11, it takes about 10 minutes for the cluster to process all jobs. New jobs can be scheduled.
On v1.12 and v1.13, it takes about 60 minutes for the cluster to process all jobs. After this, no new jobs can be scheduled.

What you expected to happen:

I did not expect this scenario to make my nodes unavailable in Kubernetes 1.12 and 1.13; I would have expected the behavior I observe in 1.11.

How to reproduce it (as minimally and precisely as possible):

The easiest way seems to be to schedule about 300 jobs on a single-node cluster, using the template below with %JOB_ID% replaced by a unique value for each job:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: job-%JOB_ID%
data:
  # Just some sample data
  game.properties: |
    enemies=aliens
---
apiVersion: batch/v1
kind: Job
metadata:
  name: job-%JOB_ID%
spec:
  template:
    spec:
      containers:
      - name: busybox
        image: busybox
        command: [ "/bin/echo" ]
        args: [ "Hello, World!" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/config
      volumes:
        - name: config-volume
          configMap:
            name: job-%JOB_ID%
      restartPolicy: Never
  backoffLimit: 4

I can consistently reproduce this issue in a VM-based environment, which I configure using Vagrant. You can find the full setup here: https://github.com/qmfrederik/k8s-job-repro

Anything else we need to know?:

Happy to provide further information as needed

Environment:

  • Kubernetes version (use kubectl version): v1.12 through v1.13
  • Cloud provider or hardware configuration: bare metal
  • OS (e.g: cat /etc/os-release): Ubuntu 18.04.1 LTS (Bionic Beaver)
  • Kernel (e.g. uname -a): Linux vagrant 4.15.0-29-generic #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools: kubeadm
  • Others:
@qmfrederik qmfrederik added the kind/bug Categorizes issue or PR as related to a bug. label Feb 22, 2019
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Feb 22, 2019
@qmfrederik
Author

@kubernetes/sig-cluster-lifecycle-bugs @wojtek-t

This looks like a regression introduced in v1.12; #64752 may be related.

@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 22, 2019
@k8s-ci-robot
Contributor

@qmfrederik: Reiterating the mentions to trigger a notification:
@kubernetes/sig-cluster-lifecycle-bugs

In response to this:

@kubernetes/sig-cluster-lifecycle-bugs @wojtek-t

This looks like a regression introduced in v1.12; #64752 may be related.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@qmfrederik
Author

On v1.12, the kubelet logs will contain messages like this:

kubelet[877]: E0219 09:06:14.637045     877 reflector.go:134] object-"default"/"job-162273560": Failed to list *v1.ConfigMap: Get https://172.13.13.13:6443/api/v1/namespaces/default/configmaps?fieldSelector=metadata.name%3Djob-162273560&limit=500&resourceVersion=0: http2: no cached connection was available

and

kubelet[877]: E0219 09:32:57.379751     877 reflector.go:134] k8s.io/kubernetes/pkg/kubelet/kubelet.go:451: Failed to list *v1.Node: Get https://172.13.13.13:6443/api/v1/nodes?fieldSelector=metadata.name%3Dxxxxx&limit=500&resourceVersion=0: http2: no cached connection was available

and the kube-controller-manager logs will contain messages like this:

I0219 09:29:46.875397       1 node_lifecycle_controller.go:1015] Controller detected that all Nodes are not-Ready. Entering master disruption mode.
I0219 09:30:16.877715       1 node_lifecycle_controller.go:1042] Controller detected that some Nodes are Ready. Exiting master disruption mode.

@spiffxp
Member

spiffxp commented Feb 25, 2019

/sig scalability

@k8s-ci-robot k8s-ci-robot added the sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. label Feb 25, 2019
@spiffxp
Member

spiffxp commented Feb 25, 2019

@kubernetes/sig-scalability-bugs

@wojtek-t
Member

@qmfrederik - a couple of questions:

  1. Are you using an official k8s build? (I'm asking to confirm the Go version that was used to build it.)

The easiest way seems to be to schedule, on a single-node cluster, about 300 jobs:

You can't start more than 110 pods on a node. So I'm assuming you create them all roughly at once and then wait for them to proceed: we first schedule ~100 of them (there are ~10 system pods running on that node), and then, as those finish, we schedule new pods on that node. Am I right?

@kubernetes/sig-node-bugs @yujuhong - FYI

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Feb 25, 2019
@YueHonghui
Contributor

YueHonghui commented Feb 26, 2019

We have encountered this problem too (#74302). It is related to the maximum number of concurrent HTTP/2 streams and the change to the kubelet's ConfigMap manager. By default, the HTTP/2 server in kube-apiserver allows 250 concurrent streams per connection, and since version 1.13.x every ConfigMap consumes at least one stream for its watch in the kubelet. If too many pods with ConfigMaps are scheduled to a node, the kubelet gets stuck communicating with kube-apiserver and the node becomes NotReady. A workaround is to set the kube-apiserver config http2-max-streams-per-connection to a bigger value.
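
For reference: on a kubeadm-managed control plane (as used here), that flag can be passed through the ClusterConfiguration. This is only a minimal sketch; the value 1000 is illustrative, not a recommendation:

# Sketch: raise the apiserver's HTTP/2 stream limit via kubeadm (illustrative value)
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
apiServer:
  extraArgs:
    http2-max-streams-per-connection: "1000"

The same flag can also be added directly to the kube-apiserver static pod manifest (typically /etc/kubernetes/manifests/kube-apiserver.yaml on kubeadm clusters).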

@qmfrederik
Author

@wojtek-t To answer your questions:

  1. Yes, these are official Kubernetes builds. Running on Ubuntu 18.04, using the packages from the Kubernetes apt repository
  2. Correct, I schedule all jobs at once using a script, and then wait while they are proceeding.

What @YueHonghui says seems consistent with what I'm experiencing.

So, it appears that kubelet is still watching configmaps of pods which have completed, and ultimately you hit the max concurrent HTTP2 stream limit. Would it make sense for kubelet to stop watching configmaps once the pod which consumes that configmap has completed?

@wojtek-t
Member

@YueHonghui - hmm... I thought that once we hit the per-connection limit (250), we would automatically open a new connection...

Would it make sense for kubelet to stop watching configmaps once the pod which consumes that configmap has completed

It should stop watching:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/util/manager/watch_based_manager.go#L143

@qmfrederik
Author

It should stop watching:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/util/manager/watch_based_manager.go#L143

Are there any logs, metrics,... I can capture to see whether kubelet actually stops watching? Like a metric for concurrent HTTP2 connections or something similar?

@wojtek-t
Member

Are there any logs, metrics,... I can capture to see whether kubelet actually stops watching? Like a metric for concurrent HTTP2 connections or something similar?

Unfortunately it's not easy...
What you can do (I admit it's a bit painful) is look into the kube-apiserver logs (you probably need --v=3; that's where we log all API calls) and check whether there are logs saying that the corresponding "watch request" finished around the time the pod was deleted. The logs will look like this:

I0226 13:46:18.330681       1 wrap.go:47] GET /api/v1/namespaces/test-93ne60-1/secrets?fieldSelector=metadata.name%3Ddefault-token-pkdz9&resourceVersion=44330&timeout=9m11s&timeoutSeconds=551&watch=true: (1m55.976728767s) 200 [kubelet/v1.15.0 (linux/amd64) kubernetes/465f7eb 35.237.12.113:35324]
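
If the API server is managed by kubeadm, one way to get that verbosity is to add the flag to the static pod manifest; a minimal sketch of the relevant part of /etc/kubernetes/manifests/kube-apiserver.yaml (all other fields left unchanged):

apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    - --v=3              # log every API call, including the watch requests above
    # ...existing flags unchanged...

The kubelet picks up the change and restarts the static pod automatically.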

@qmfrederik
Author

@wojtek-t Thanks, I'll give that a try (probably tomorrow) and let you know.

@yue9944882
Member

/cc @deads2k @lavalamp @MikeSpreitzer

This issue might be suggesting that policing/prioritizing requests merely by user/group is not sufficient for cluster robustness. We should also classify these requests at a finer granularity within components like the kubelet, according to verbs/resources.

@lavalamp
Member

@yue9944882 This issue is due to the golang bug of limiting HTTP2 connections for no reason, no?

@yue9944882
Member

yue9944882 commented Feb 27, 2019

@yue9944882 This issue is due to the golang bug of limiting HTTP2 connections for no reason, no?

Yes. To clarify, I mean we can probably limit the kubelet's WATCH streams (to under 250) on the server side to "make room" for the PATCH calls from the client side. Will this help the case?

@YueHonghui
Contributor

@yue9944882 @wojtek-t @lavalamp I have posted a goroutine stack trace of the kubelet to #74302 (comment). In the case we encountered, the kubelet doesn't use a new connection to communicate with kube-apiserver when it hits the limit of max concurrent streams. This seems to be due to a golang bug in the http2 connection pool implementation.

@MikeSpreitzer
Member

@yue9944882 : what I am hearing here is that the problem is not the apiservers being overloaded but rather a connection management problem in the kubelet. So this is not begging to be solved by traffic policing in the apiservers.

@wojtek-t
Member

but rather a connection management problem in the kubelet.

yes

@MikeSpreitzer @lavalamp ^^

@qmfrederik
Author

@wojtek-t I took the time to reproduce this (on v1.13.3) with logging enabled (--v=3) on the API server.
Pretty much the same scenario: schedule 400 jobs (each linked to a different ConfigMap), and wait for them to be scheduled and completed.

Even after all jobs have completed (i.e. the only running pods are the kube-system pods), roughly every 10 seconds, ~230 new entries for ConfigMap watches (which seems awfully close to 250) appear in the log.

Here are a couple of cycles, simplified:

I0227 13:24:02.821104       1 wrap.go:47] GET /api/v1/namespaces/default/configmaps?fieldSelector=metadata.name%3Djob-1351&resourceVersion=62916&timeout=6m59s&timeoutSeconds=419&watch=true: (30.346053154s) 200 [kubelet/v1.13.3 (linux/amd64) kubernetes/721bfa7 10.0.2.15:33510]
[...]
I0227 13:24:03.536994       1 get.go:247] Starting watch for /api/v1/namespaces/default/configmaps, rv=62916 labels= fields=metadata.name=job-1244 timeout=7m43s
[...]
I0227 13:24:13.750770       1 wrap.go:47] GET /api/v1/namespaces/default/configmaps?fieldSelector=metadata.name%3Djob-1393&resourceVersion=62916&timeout=9m34s&timeoutSeconds=574&watch=true: (10.111835228s) 200 [kubelet/v1.13.3 (linux/amd64) kubernetes/721bfa7 10.0.2.15:34634]
[...]
I0227 13:24:14.100636       1 get.go:247] Starting watch for /api/v1/namespaces/default/configmaps, rv=62916 labels= fields=metadata.name=job-1125 timeout=5m57s
[...]
I0227 13:24:24.464031       1 wrap.go:47] GET /api/v1/namespaces/default/configmaps?fieldSelector=metadata.name%3Djob-1173&resourceVersion=62916&timeout=6m36s&timeoutSeconds=396&watch=true: (10.021946099s) 200 [kubelet/v1.13.3 (linux/amd64) kubernetes/721bfa7 10.0.2.15:35548]
[...]
I0227 13:24:25.467773       1 get.go:247] Starting watch for /api/v1/namespaces/default/configmaps, rv=62916 labels= fields=metadata.name=job-1113 timeout=5m27s
[...]
I0227 13:24:34.549198       1 wrap.go:47] GET /api/v1/namespaces/default/configmaps?fieldSelector=metadata.name%3Djob-1267&resourceVersion=62916&timeout=5m14s&timeoutSeconds=314&watch=true: (8.750917122s) 200 [kubelet/v1.13.3 (linux/amd64) kubernetes/721bfa7 10.0.2.15:36178]
[...]
I0227 13:24:35.190009       1 get.go:247] Starting watch for /api/v1/namespaces/default/configmaps, rv=62916 labels= fields=metadata.name=job-1339 timeout=9m11s
[...]
I0227 13:25:04.545461       1 wrap.go:47] GET /api/v1/namespaces/default/configmaps?fieldSelector=metadata.name%3Djob-1220&resourceVersion=62916&timeout=8m39s&timeoutSeconds=519&watch=true: (29.249889205s) 200 [kubelet/v1.13.3 (linux/amd64) kubernetes/721bfa7 10.0.2.15:36956]
[...]
I0227 13:25:04.558286       1 get.go:247] Starting watch for /api/v1/namespaces/default/configmaps, rv=62916 labels= fields=metadata.name=job-1157 timeout=8m35s
[...]
I0227 13:25:14.551850       1 wrap.go:47] GET /api/v1/namespaces/default/configmaps?fieldSelector=metadata.name%3Djob-1061&resourceVersion=62916&timeout=8m31s&timeoutSeconds=511&watch=true: (9.631001277s) 200 [kubelet/v1.13.3 (linux/amd64) kubernetes/721bfa7 10.0.2.15:37962]
[...]
I0227 13:25:15.224414       1 get.go:247] Starting watch for /api/v1/namespaces/default/configmaps, rv=62916 labels= fields=metadata.name=job-1285 timeout=5m27s
[...]
I0227 13:25:24.601452       1 wrap.go:47] GET /api/v1/namespaces/default/configmaps?fieldSelector=metadata.name%3Djob-1101&resourceVersion=62916&timeout=6m47s&timeoutSeconds=407&watch=true: (9.084488207s) 200 [kubelet/v1.13.3 (linux/amd64) kubernetes/721bfa7 10.0.2.15:38570]
[...]
I0227 13:25:25.607063       1 get.go:247] Starting watch for /api/v1/namespaces/default/configmaps, rv=62916 labels= fields=metadata.name=job-1376 timeout=5m24s
[...]

Here are the full logs: kube-api-server-logs.txt.gz

So it appears that somewhere, at least one process did not stop watching.

Happy to run further tests/provide additional information if it helps you.

@wojtek-t
Member

Heh... - I think I know where the problem is.
The problem is that we delete the reference to the pod (and this is what stops the watch) when we UnregisterPod:

func (c *cacheBasedManager) UnregisterPod(pod *v1.Pod) {

And this one is triggered only by pod deletion:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/pod/pod_manager.go#L207

The problem is that pods that are owned by Jobs are not deleted (they are eventually garbage-collected).
So what happens is that, eventually, you end up with many more pods effectively on that node (though a lot of them are already in the "Succeeded" state).

So it seems there are two problems here:

  • one is that we should probably unregister a pod that is no longer running (and won't be restarted)
  • the second (and more serious in my opinion) is why a new connection is not created when we approach the limit

Also:

yes, to clarify, i mean we can probably limit the WATCH connections (under 250) for kubelet at server-side to "make room" for the patch calls at client-side. will this help the case?

@yue9944882 - this won't help in general, because it may be valid to have more than 250 watches (if there are more than that many different secrets/configmaps).
Why don't we create a new connection when we approach the stream limit of a single one?

@wojtek-t wojtek-t added this to the v1.14 milestone Feb 27, 2019
@YueHonghui
Contributor

@wojtek-t
Maybe we should upgrade the golang version to 1.12, according to the Go 1.12 release notes?

The Transport no longer handles MAX_CONCURRENT_STREAMS values advertised from HTTP/2 servers as strictly as it did during Go 1.10 and Go 1.11. The default behavior is now back to how it was in Go 1.9: each connection to a server can have up to MAX_CONCURRENT_STREAMS requests active and then new TCP connections are created as needed. In Go 1.10 and Go 1.11 the http2 package would block and wait for requests to finish instead of creating new connections. To get the stricter behavior back, import the golang.org/x/net/http2 package directly and set Transport.StrictMaxConcurrentStreams to true.

@wojtek-t
Member

Yeah - but that doesn't solve the problem of previous releases...

@yujuhong yujuhong added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Feb 27, 2019
@shyamjvs
Member

One thought, besides the actual fixes themselves: we should add scale tests for the per-node secrets/configmaps limit, so we can catch such issues in the future. At least a kubelet integration test.

@liggitt
Member

liggitt commented Mar 1, 2019

Definitely. Where would the best place to do that be?

@shyamjvs
Member

shyamjvs commented Mar 1, 2019

I would think https://github.com/kubernetes/kubernetes/tree/master/test/e2e_node? Actually, our density test in its current form allows setting secretsPerPod - https://github.com/kubernetes/kubernetes/blob/master/test/e2e/scalability/density.go#L550. If we change that to a higher value, we would test this case.

cc @wojtek-t @krzysied - Can we make such change in cluster-loader?

@liggitt
Member

liggitt commented Mar 1, 2019

If those are using replicasets then the secrets will not be unique per pod, right? Unique pods, with unique secrets, would more reliably exercise this

@shyamjvs
Member

shyamjvs commented Mar 1, 2019

You're right. It's probably better to do that (though I think the chances of overlapping secrets on a node will be quite low in the case of our large cluster tests with many RCs).

Also, thinking a bit more, it seems like we'll need to test 2 things:

  • a single kubelet is able to handle that many watches (which can probably even be done with just an integration test)
  • the apiserver is able to handle that many connections/streams in large clusters, with each node using a reasonably big number of configs

Seems like the first one is more important than the second.

@liggitt
Member

liggitt commented Mar 1, 2019

  • the golang http/2 behavior of not opening new connections once the stream limit is reached (fixed in go1.12, unclear yet whether it will be possible to rebuild patch releases of k8s 1.12/1.13 with a new go version)

It looks like, since we vendor golang.org/x/net/http2, we actually have to bump our vendored copy to the go1.12 level to pick up the fix. I was able to write a simple integration test against the cached secret manager. On master, it consistently failed after establishing 236 watches. With our vendored http2 copy bumped to go1.12 levels, it worked all the way up to 10,000 distinct watches (which is where I stopped checking).

rphillips added a commit to rphillips/machine-config-operator that referenced this issue Mar 4, 2019
The watch-based strategy has a couple of bugs: 1) golang http2 blocking
when the max stream limit is reached, and 2) the kubelet not cleaning up
watches for terminated pods.

This patch configures the cache-based strategy. Once golang 1.12 is in
use and the kubelet patch is merged, we can use the watch-based
strategy.

ref: kubernetes/kubernetes#74412
ref: kubernetes/kubernetes#74412 (comment)
@xmudrii
Member

xmudrii commented Mar 7, 2019

@liggitt @qmfrederik Hello 👋

I'd like to remind you that code freeze starts today, PST EOD! ❄️ 🏔️ As far as I can see, #74781 has been punted to 1.15 and the somewhat related PR #71501 doesn't have a milestone. There is still #74809, but that doesn't seem to fix this issue. Is this issue still relevant to 1.14, or should it be moved to another milestone?

@liggitt
Member

liggitt commented Mar 7, 2019

#74755 fixed this issue.

cherrypicks:
1.13: #74841
1.12: #74842

the other PRs are still relevant, but don't block this issue

/close

@k8s-ci-robot
Contributor

@liggitt: Closing this issue.

In response to this:

#74755 fixed this issue.

cherrypicks:
1.13: #74841
1.12: #74842

the other PRs are still relevant, but don't block this issue

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@qmfrederik
Author

@liggitt #74841 hasn't been merged yet, so this is now fixed for 1.12 and master, but not 1.13. Is there any chance of getting #74841 merged?

@liggitt
Member

liggitt commented Mar 8, 2019

Is there any chance of getting #74841 merged

yes, it is in process

@wojtek-t
Member

Sorry - it was unfortunate that I was OOO exactly when this was discovered. Reverting (given that it was a 3-line change) seemed like the most reasonable option.

Regarding testing that, I actually don't think we need a large cluster to test this.
This can be fully tested on a small (even 1-node) cluster, and I think that's the direction we should go.
Two weeks ago or so I added this new job (it was supposed to be experimental): https://testgrid.k8s.io/sig-scalability-node#node-throughput - I think these kinds of jobs are what we should do.
Adding a simple test where we start 500 jobs or so on a 1-node cluster (with a non-negligible amount of secrets and/or configmaps) should be super simple.
[@shyamjvs I wouldn't invest in our old-style tests; I would like to deprecate them very soon]

openstack-gerrit pushed a commit to openstack-archive/stx-config that referenced this issue Apr 2, 2019
The application-apply of the stx-openstack application on
simplex configurations has been failing since the barbican
chart was added to the application. The failure was due
to lost node status messages from the kubelet to the
kube-apiserver, which causes the node to be marked
NotReady and endpoints to be removed.

The root cause is the kubernetes bug here:
kubernetes/kubernetes#74412

In short, the addition of the barbican chart added enough
new secrets/configmaps that the kubelet hit the limit of
http2-max-streams-per-connection. As done upstream, the
fix is to change the following kubelet config:
configMapAndSecretChangeDetectionStrategy (from Watch to
Cache).

Change-Id: Ic816a91984c4fb82546e4f43b5c83061222c7d05
Closes-bug: 1820928
Signed-off-by: Bart Wensley <barton.wensley@windriver.com>
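
For reference, the kubelet-side mitigation mentioned in that commit is a single field in the KubeletConfiguration; a minimal sketch showing only the relevant setting:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Fall back to the cache-based strategy instead of one watch per ConfigMap/Secret
configMapAndSecretChangeDetectionStrategy: Cache

As the later commits note, this can be reverted to Watch once the kubelet is built with a Go/http2 version that contains the fix.
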
slittle1 pushed a commit to starlingx-staging/openstack-armada-app-test that referenced this issue Sep 3, 2019
Upgrading from kubernetes 1.13.5 to 1.15.0 meant the config
needed to be updated to handle whatever was deprecated or dropped
in 1.14 and 1.15.

1) Removed "ConfigMapAndSecretChangeDetectionStrategy = Watch"
reported by kubernetes/kubernetes#74412
because this was a golang deficiency, and is fixed by the newer
version of golang.

2) Enforced the kubernetes 1.15.3 version

3) Updated v1alpha3 to v1beta2, since alpha3 was dropped in 1.14
changed fields for beta1 and beta2 are mentioned in these docs:
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta1
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2

4) cgroup validation checking now includes the pids subfolder.

5) Update ceph-config-helper to be v1.15 kubernetes compatible
This means that the stx-openstack version check needed to be increased

Change-Id: Ibe3d5960c5dee1d217d01fbb56c785581dd1b42c
Story: 2005860
Task: 35841
Depends-On: https://review.opendev.org/#/c/671150
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>
slittle1 pushed a commit to starlingx-staging/platform-armada-app that referenced this issue Sep 4, 2019
slittle1 pushed a commit to starlingx-staging/puppet that referenced this issue Sep 4, 2019
slittle1 pushed a commit to starlingx-staging/puppet that referenced this issue Sep 4, 2019
slittle1 pushed a commit to starlingx-staging/stx-config that referenced this issue Sep 4, 2019