Skip to content
This repository has been archived by the owner on Jun 29, 2022. It is now read-only.

Update kubernetes to 1.19.3 #1030

Merged
merged 12 commits into from
Oct 15, 2020
Merged

Update kubernetes to 1.19.3 #1030

merged 12 commits into from
Oct 15, 2020

Conversation

surajssd
Copy link
Member

@surajssd surajssd commented Sep 28, 2020

  • Remove the code added for HAProxy.
  • Modify the seccomp allowed profiles so that pods that have a default profile are allowed to run.

Fixes #866
Fixes #867

@knrt10
Copy link
Member

knrt10 commented Sep 29, 2020

Added #867 to this PR in description, as this fixes it too. but I don't know why am I am not able to find it in linked issue.

@surajssd
Copy link
Member Author

surajssd commented Sep 30, 2020

Pods fail to create when the linkerd is installed:

cert-manager    0s   Warning   FailedCreate    replicaset/cert-manager-cainjector-fc6c787db    Error creating: Internal error occurred: failed calling webhook "linkerd-proxy-injector.linkerd.io": Post "https://linkerd-proxy-injector.linkerd.svc:443/?timeout=30s": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0
cert-manager    0s   Warning   FailedCreate    replicaset/cert-manager-webhook-845d9df8bf      Error creating: Internal error occurred: failed calling webhook "linkerd-proxy-injector.linkerd.io": Post "https://linkerd-proxy-injector.linkerd.svc:443/?timeout=30s": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0
projectcontour  0s   Warning   FailedCreate    daemonset/envoy                                 Error creating: Internal error occurred: failed calling webhook "linkerd-proxy-injector.linkerd.io": Post "https://linkerd-proxy-injector.linkerd.svc:443/?timeout=30s": x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0

And due to this the CI is timing out right now. This is a weird error.

@invidian
Copy link
Member

And due to this the CI is timing out right now. This is a weird error.

This is perhaps because of:

Kubernetes is now built with golang 1.15.0-rc.1.
The deprecated, legacy behavior of treating the CommonName field on X.509 serving certificates as a host name when no Subject Alternative Names are present is now disabled by default. It can >be temporarily re-enabled by adding the value x509ignoreCN=0 to the GODEBUG environment variable. (kubernetes/kubernetes#93264, @justaugustus) [SIG API Machinery, Auth, CLI, Cloud Provider, Cluster Lifecycle, >Instrumentation, Network, Node, Release, Scalability, Storage and Testing]

This seems to me like linkerd generates bad certificates then?

@surajssd surajssd force-pushed the surajssd/update-k8s-1.19.2 branch 9 times, most recently from bc9d34b to eab5513 Compare October 9, 2020 11:37
@surajssd surajssd force-pushed the surajssd/update-k8s-1.19.2 branch 2 times, most recently from c280165 to 4f6cebf Compare October 12, 2020 10:27
ci/aws/aws-cluster.lokocfg.envsubst Outdated Show resolved Hide resolved
@@ -0,0 +1,29 @@
{{- if gt (int .Values.apiserver.replicas) 1 }}
apiVersion: apps/v1
kind: DaemonSet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one disadvantage of running those components as DaemonSet instead of Deployment, which I recently hit myself.

If one runs 3 node controller pool, but each node is resource constrained (e.g. 2 vCPUs, 4GB of RAM), then one might want to run only 2 replicas of each component to save a bit of resources, as for example each kube-apiserver is very RAM-costly.

I think it should be our conscious decision, that we won't be able to support such scenario.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather run less number of other components than APIServer. If I were to run my prod cluster then I should make sure that control plane is not fighting for resources. On the workers I will try to bin pack my work loads, not on the control plane.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this open so that we know this was decided consciously.

@invidian
Copy link
Member

I forgot to summarize requested changes:

  • DaemonSet over Deployment has some disadvantage I recently hit, so we should agree we accept it.
  • Switch from Deployment to DaemonSet could be done in separate PR, to avoid this PR from overgrowing.
  • Some commit messages are out of date
  • Some commits add code which is removed in the next commits, but they do not follow logical flow, but rather fix failed attempt of resolving issue, so IMO they should be adjusted. Failed attempt can be described in commit message instead, for example.
  • Some of existing users might be affected by GODEBUG=x509ignoreCN=0, do we want to handle that?
  • We do some changes to upstream projects, which IMO should be reported upstream so we don't carry the maintenance cost.

@surajssd surajssd force-pushed the surajssd/update-k8s-1.19.2 branch 3 times, most recently from e59aa43 to 8d1dd96 Compare October 14, 2020 08:22
Copy link
Member

@rata rata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@surajssd can you update calico-hostendpoint-controller to use same kubernetes version? or not more than 2 versions older.

I mean here, just in case: https://github.com/kinvolk/calico-hostendpoint-controller/blob/fcf32d9e11c82a6522f49b860b6c4d3d411ca245/Dockerfile#L13. And update the calico component version here :)

While it probably will work with an older version, it is not guaranteed, so we should avoid it IMHO.

@surajssd
Copy link
Member Author

@rata done kinvolk/calico-hostendpoint-controller#8

Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Largely LGTM, I still found some nits which could be addressed.

The pods that don't specify seccomp profile in their `securityContext`
explicitly get `runtime/default` injected by default. So this profile
has to be allowed by PSPs or else all pods' config should be updated to
have seccomp profile mentioned explicitly.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
invidian
invidian previously approved these changes Oct 15, 2020
Copy link
Member

@invidian invidian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Great work @surajssd 🎉

@invidian invidian changed the title Update kubernetes to 1.19.2 Update kubernetes to 1.19.3 Oct 15, 2020
invidian
invidian previously approved these changes Oct 15, 2020
Pull the image from quay.io instead of docker hub.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
- Sets the flag `--permit-port-sharing` on apiserver, bootstrap and
  self-hosted both, so that now multiple apiserver pods (which use host
  network) can bind on port 6443 simultaneously.

- Makes the self-hosted apiserver "Deployment" based instead of
  Daemonset.

- Remove HAProxy sidecar and the init container needed to create config.

- Remove all appearance of `exposeOnAllInterfaces` parameter.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
For the control plane components if the control plane nodes are more
than 1 then use DaemonSets but if it is just one node then use
Deployment.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
To be able to run on same port as the bootstrap kube-apiserver this
should run as root.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
This commit renames the metrics:

- `kubelet_running_pod_count` to `kubelet_running_pods`.
- `kubelet_running_container_count` to `kubelet_running_containers`.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
These tests were earlier run on deployment but now they should be run on
daemonset, since these components are deployed as ones.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
- With go 1.15 there are strict checks for common name so this commit
  adds a new variable `ignore_x509_cn_check` to Packet and AWS so that
  user can disable such strict checks.

- This commit disables the checks on the CI AWS and Packet, since we
  test linkerd on these two platforms. Once linkerd provides builds with
  go 1.15 then we can revert changes from this commit.

Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
Signed-off-by: Suraj Deshmukh <suraj@kinvolk.io>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove HAProxy from kube-apiserver deployment Upgrade to Kubernetes 1.19.x
4 participants