HA access to kube-apiserver using a host-local proxy #2106

Merged
merged 12 commits into development/2.4 from improvement/GH-2103-apiserver-ha on Dec 16, 2019

Conversation

NicolasT
Contributor

@NicolasT NicolasT commented Dec 6, 2019

Implement the 'local nginx' approach as described in #2103.

Fixes: #2103
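
For readers without the #2103 context: the approach is to run an nginx stream (plain TCP) proxy on every node, listening on a loopback address and balancing across all kube-apiserver instances while preferring the local one. Below is a minimal sketch of the idea, written as a shell snippet dropping a config file; the path, listen port, upstream addresses and weights are illustrative assumptions, not necessarily what this PR ships.

# Hypothetical sketch only: path, port, addresses and weights are illustrative.
cat > /tmp/apiserver-proxy.conf <<'EOF'
stream {
  upstream kube_apiserver {
    # Prefer the kube-apiserver on this node, fall back to the other masters.
    server 10.0.0.1:6443 weight=100;
    server 10.0.0.2:6443 weight=1;
    server 10.0.0.3:6443 weight=1;
  }
  server {
    # Host-local endpoint: plain TCP pass-through, TLS stays end-to-end with
    # kube-apiserver, so its certificate must cover the address clients use.
    listen 127.0.0.1:7443;
    proxy_pass kube_apiserver;
  }
}
EOF

KubeConfig files on the node can then point at the loopback endpoint instead of a single remote apiserver address.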

@NicolasT NicolasT added kind:enhancement New feature or request topic:networking Networking-related issues topic:deployment Bugs in or enhancements to deployment stages moonshot labels Dec 6, 2019
@bert-e
Contributor

bert-e commented Dec 6, 2019

Hello nicolast,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Status report is not available.

@bert-e
Contributor

bert-e commented Dec 6, 2019

Conflict

A conflict has been raised during the creation of
integration branch w/2.5/improvement/GH-2103-apiserver-ha with contents from improvement/GH-2103-apiserver-ha
and development/2.5.

I have not created the integration branch.

Here are the steps to resolve this conflict:

 $ git fetch
 $ git checkout -B w/2.5/improvement/GH-2103-apiserver-ha origin/development/2.5
 $ git merge origin/improvement/GH-2103-apiserver-ha
 $ # <intense conflict resolution>
 $ git commit
 $ git push -u origin w/2.5/improvement/GH-2103-apiserver-ha

@NicolasT NicolasT force-pushed the improvement/GH-2103-apiserver-ha branch from 7c93b7c to 3c4fd71 Compare December 6, 2019 21:37
@NicolasT
Contributor Author

NicolasT commented Dec 6, 2019

There's a leftover reference to the metalk8s:api_server:host Pillar value in salt/metalk8s/orchestrate/solutions/available.sls; however, I'm not sure it's supposed to be there.

Need input @gdemonet.

@NicolasT
Contributor Author

NicolasT commented Dec 7, 2019

CI fails because, in the upgrade/downgrade scenarios, the apiServer config is still expected in BootstrapConfiguration (of course). Will fix.

@NicolasT NicolasT force-pushed the improvement/GH-2103-apiserver-ha branch 3 times, most recently from b270c28 to 49449fc Compare December 7, 2019 11:02
@NicolasT NicolasT marked this pull request as ready for review December 7, 2019 20:58
@NicolasT NicolasT requested a review from a team as a code owner December 7, 2019 20:58
@NicolasT NicolasT force-pushed the improvement/GH-2103-apiserver-ha branch from 49449fc to c93a3f4 Compare December 8, 2019 09:22
@bert-e
Contributor

bert-e commented Dec 8, 2019

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

buildchain/buildchain/image.py (review thread resolved, outdated)
@bert-e
Contributor

bert-e commented Dec 10, 2019

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

The following reviewers are expecting changes from the author, or must review again:

@NicolasT NicolasT force-pushed the improvement/GH-2103-apiserver-ha branch from 29be094 to acab9d7 Compare December 11, 2019 11:10
@bert-e
Contributor

bert-e commented Dec 11, 2019

History mismatch

Merge commit #2d6af2647b352daf7acee3aa6ee6f23d2771ceca on the integration branch
w/2.5/improvement/GH-2103-apiserver-ha is merging a branch which is neither the current
branch improvement/GH-2103-apiserver-ha nor the development branch
development/2.5.

It is likely due to a rebase of the branch improvement/GH-2103-apiserver-ha and the
merge is not possible until all related w/* branches are deleted or updated.

Please use the reset command to have me reinitialize these branches.

@NicolasT NicolasT force-pushed the improvement/GH-2103-apiserver-ha branch from acab9d7 to 706d07e Compare December 11, 2019 12:12
@NicolasT
Contributor Author

This seems to somehow break downgrade, because etcd doesn't get restarted. Further investigation shows this is due to the image not getting pulled?!?
Indeed, there's no /var/lib/metalk8s/repositories/conf.d/99-metalk8s-2.3.0-dev-registry.inc or similar on my test system. No clue what's causing that, though...

@NicolasT NicolasT force-pushed the improvement/GH-2103-apiserver-ha branch 2 times, most recently from 3d4a373 to b6acef7 Compare December 12, 2019 13:58
NicolasT added a commit that referenced this pull request Dec 12, 2019
Previous MetalK8s states used during `iso-manager.sh -a` stage expect
the path to the `KubeConfig` to be used by `salt-master` to be available
in the Pillar (`metalk8s.api_server.kubeconfig`). This is removed in a
subsequent commit, replaced by a hard-coded value. As such, we need to
override this value (using the same hard-coded value) in the script,
otherwise importing e.g. a 2.3 ISO on a 2.4.x platform (containing these
changes) will fail.

Keeping this as a separate commit, mainly for documentation purposes.

See: #2106
@NicolasT NicolasT force-pushed the improvement/GH-2103-apiserver-ha branch from b6acef7 to e4f5c6c Compare December 12, 2019 14:12
@NicolasT NicolasT force-pushed the improvement/GH-2103-apiserver-ha branch from 5a57463 to f203600 Compare December 12, 2019 15:42
Collaborator

@TeddyAndrieux TeddyAndrieux left a comment

Looks good, just a few comments.

salt/metalk8s/kubernetes/apiserver/installed.sls (review thread resolved, outdated)
salt/metalk8s/kubernetes/apiserver/installed.sls (review thread resolved, outdated)
@@ -4,7 +4,7 @@
include:
  - metalk8s.internal.m2crypto

-{%- set apiserver = 'https://' ~ pillar.metalk8s.api_server.host ~ ':6443' %}
+{%- set apiserver = 'https://' ~ grains['metalk8s']['control_plane_ip'] ~ ':6443' %}
Collaborator

Why not use the "proxy" here?
That means if, for whatever reason, the local apiserver doesn't work, the admin.conf won't work either, and since we use this kubeconfig everywhere, that's not really good IMO.

Contributor Author

Because this conf is perfectly valid to be used from outside the node, e.g. an admin copying it to his machine (aren't we even doing that for tests?)

Note how most KubeConfigs are rewritten to use the local proxy, so the services using those shouldn't be impacted. AFAIK only salt-master is using admin.conf (which is a bug by itself...).

In practice, not much changes, because in our current deployments the apiServer.host value is set to the hostname or IP of the bootstrap node anyway...

Contributor Author

Note that any script which uses admin.conf, and is certain it's running on a host on which the proxy is deployed, could pass -s/--server to the kubectl invocation to use the proxy anyway.
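
For example, assuming the proxy listens on a loopback port (the exact address and port here are assumptions, not taken from this change):

 $ kubectl --kubeconfig /etc/kubernetes/admin.conf \
     --server https://127.0.0.1:7443 get nodes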

Collaborator

@TeddyAndrieux TeddyAndrieux Dec 13, 2019

OK, but then I think we need to change something in our current restore script, because if I remember correctly we need to contact the apiserver before configuring the local one: we basically call the kubeconfig Salt state from the apiserver and then use this new admin.conf.

Contributor Author

That wouldn't work today in the first place then, unless the APIserver address was HA...

Contributor Author

No I mean, that's not a reasonable requirement in the first place.

Instead, the recover script should take, as an argument, the address of an API-server, and use that address when connecting to do whatever it needs to do.

Collaborator

With this PR, yes, indeed. But when the restore script was written, the goal was for the apiserver endpoint in our bootstrap config to be an IP able to reach an apiserver, and not one bound only to the bootstrap node, whether via keepalived or any other system, because configuring all servers with the bootstrap IP as apiserver endpoint is really wrong.
But yes, now we need to change this.

Contributor Author

My point is that a VIP or LB for API-server has always been optional, so the restore script relying on that is wrong in the first place.

The bug fixed in this PR is the fact that sometimes we relied on this given address to be HA, given the context of #2108

The restore script should also be updated to support this properly, and I'm happy to open a bug for this, but I'd rather not move this into this PR, since the context is completely different IMHO.

Finally, the lack of CI for the restore operation (#1687) basically allows me to merge this without remorse ;-P

Contributor Author

Ugh, I keep putting comments, but the topic keeps resonating.

What I was thinking, primarily: there should be no admin.conf in the backup, because it's not at all required to restore a cluster. An admin.conf can be generated from the CA data that does need to be backed up.

The whole restore process should be constructive, starting from the minimal set of required data as kept in a backup.
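
A rough sketch of that idea, assuming the backed-up CA certificate and key are at hand (the file names, server address and certificate subject below are illustrative, not the actual restore procedure):

 $ openssl genrsa -out admin.key 2048
 $ openssl req -new -key admin.key \
     -subj "/CN=kubernetes-admin/O=system:masters" -out admin.csr
 $ openssl x509 -req -in admin.csr -CA ca.crt -CAkey ca.key \
     -CAcreateserial -days 365 -out admin.crt
 $ kubectl config set-cluster metalk8s --kubeconfig admin.conf \
     --server https://127.0.0.1:7443 \
     --certificate-authority ca.crt --embed-certs
 $ kubectl config set-credentials admin --kubeconfig admin.conf \
     --client-certificate admin.crt --client-key admin.key --embed-certs
 $ kubectl config set-context admin@metalk8s --kubeconfig admin.conf \
     --cluster metalk8s --user admin
 $ kubectl config use-context admin@metalk8s --kubeconfig admin.conf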

Contributor Author

So, I went to check, and apparently there's not even an admin.conf in the backup archive (which is good!).

However, what is in there has a really strange structure... We really need to spend some time on that, it seems.

salt/metalk8s/orchestrate/upgrade/init.sls (review thread resolved)
salt/metalk8s/orchestrate/upgrade/init.sls (review thread resolved)
salt/metalk8s/orchestrate/upgrade/init.sls (review thread resolved)
salt/metalk8s/salt/master/configured.sls (review thread resolved)
NicolasT and others added 12 commits December 13, 2019 17:38
For some reason `/etc/profile` isn't sourced when `doit` is invoked.
As such, the build is not using the caching HTTP proxy we have in place
in our CI environment, resulting in a lot of unnecessary public traffic
or hitting upstream services.

Fixing by an explicit `source` of the environment file when invoking
`doit.sh`.
Install of this version is broken (missing `requirements/base.txt` file
used by `setup.py` in the package).

See: kragniz/python-etcd3#952
This SLS really doesn't make much sense.
Using two different permissions for the same file doesn't really help
trying to make states idempotent.
Since we'll start using `nginx` as a 'normal' container, we need to
distribute it both as a tar archive (to inject into the `containerd` cache
on the bootstrap node) and in the 'registry' for clients to pull.

See: #2103
We replace the `save_as_tar` boolean with a list of formats in which to save
the image.
This also allows cleaning up the RemoteImage code by moving format-specific
code into dedicated classes instead of branching hither and yon in
RemoteImage.

Refs: #2103
Signed-off-by: Sylvain Laperche <sylvain.laperche@scality.com>
(cherry picked from commit ac15e32)
Signed-off-by: Nicolas Trangez <ikke@nicolast.be>
When using a local(host) `nginx` to proxy to the `kube-apiserver`
instances using a simple *stream* transport (i.e., TCP socket, no TLS
handling by `nginx`), we need to ensure the TLS server exposes a
certificate the client will accept, i.e., has `IP:127.0.0.1` in the
`subjectAltName` field (a quick way to check this is sketched after this
commit list).

See: #2103
This is similar to the poor man's HA functionality found in the
Kubespray project.

See: #2103
Instead of using `BootstrapConfiguration` `apiServer.host`, which will
be removed in a subsequent commit since no longer required, use the
control-plane IP of the host on which the `KubeConfig` file is
generated.

See: #2103
…rver`

Instead of relying on the `BootstrapConfiguration` `apiServer.host`
value, we can use the proxy to `kube-apiserver` running on every
host in the cluster.

See: #2103
Previous MetalK8s states used during `iso-manager.sh -a` stage expect
the path to the `KubeConfig` to be used by `salt-master` to be available
in the Pillar (`metalk8s.api_server.kubeconfig`). This is removed in a
subsequent commit, replaced by a hard-coded value. As such, we need to
override this value (using the same hard-coded value) in the script,
otherwise importing e.g. a 2.3 ISO on a 2.4.x platform (containing these
changes) will fail.

Keeping this as a separate commit, mainly for documentation purposes.

See: #2106
We no longer need this since we provide in-cluster HA for
`kube-apiserver` access. If this is desired for out-of-cluster access,
we can provide this using a `LoadBalancer` `Service` once we have the
infrastructure to support this in place.

This also removes the optional deployment of `keepalived`.

See: #2103
See: #1788
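
Regarding the `subjectAltName` commit above: a quick way to verify the served certificate on a node (the path assumes a kubeadm-style layout and may differ on an actual MetalK8s install):

 $ openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text \
     | grep -A1 'Subject Alternative Name'
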
@bert-e
Contributor

bert-e commented Dec 13, 2019

History mismatch

Merge commit #edb50cc539c1bad95a84d59ec3a8fd77e294bbcf on the integration branch
w/2.5/improvement/GH-2103-apiserver-ha is merging a branch which is neither the current
branch improvement/GH-2103-apiserver-ha nor the development branch
development/2.5.

It is likely due to a rebase of the branch improvement/GH-2103-apiserver-ha and the
merge is not possible until all related w/* branches are deleted or updated.

Please use the reset command to have me reinitialize these branches.

The following options are set: approve

@NicolasT NicolasT force-pushed the improvement/GH-2103-apiserver-ha branch from f203600 to 9d70c9d Compare December 13, 2019 16:46
@NicolasT
Contributor Author

/reset

@bert-e
Contributor

bert-e commented Dec 13, 2019

Reset complete

I have successfully deleted this pull request's integration branches.

The following options are set: approve

@bert-e
Contributor

bert-e commented Dec 13, 2019

Conflict

A conflict has been raised during the creation of
integration branch w/2.5/improvement/GH-2103-apiserver-ha with contents from improvement/GH-2103-apiserver-ha
and development/2.5.

I have not created the integration branch.

Here are the steps to resolve this conflict:

 $ git fetch
 $ git checkout -B w/2.5/improvement/GH-2103-apiserver-ha origin/development/2.5
 $ git merge origin/improvement/GH-2103-apiserver-ha
 $ # <intense conflict resolution>
 $ git commit
 $ git push -u origin w/2.5/improvement/GH-2103-apiserver-ha

The following options are set: approve

@bert-e
Contributor

bert-e commented Dec 13, 2019

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • one peer

Peer approvals must include at least 1 approval from the following list:

The following options are set: approve

Collaborator

@TeddyAndrieux TeddyAndrieux left a comment

LGTM (except the restore script, which is broken now but will be tackled in another PR).

@bert-e
Contributor

bert-e commented Dec 16, 2019

In the queue

The changeset has received all authorizations and has been added to the
relevant queue(s). The queue(s) will be merged in the target development
branch(es) as soon as builds have passed.

The changeset will be merged in:

  • ✔️ development/2.4

  • ✔️ development/2.5

The following branches will NOT be impacted:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3

There is no action required on your side. You will be notified here once
the changeset has been merged. In the unlikely event that the changeset
fails permanently on the queue, a member of the admin team will
contact you to help resolve the matter.

IMPORTANT

Please do not attempt to modify this pull request.

  • Any commit you add on the source branch will trigger a new cycle after the
    current queue is merged.
  • Any commit you add on one of the integration branches will be lost.

If you need this pull request to be removed from the queue, please contact a
member of the admin team now.

The following options are set: approve

Contributor

@gdemonet gdemonet left a comment

Works on my machine, and code looks good to me, except for the apiVersion of the BootstrapConfiguration which should change (as you documented it, it's a breaking change).

name: apiserver-proxy
namespace: kube-system
labels:
  addonmanager.kubernetes.io/mode: Reconcile
Contributor

Oh, I was surprised by this "addon manager", looks like it is a "Bash Operator" :) https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/addon-manager/kube-addons.sh

Comment on lines +30 to +34
{%- if apiserver == grains['metalk8s']['control_plane_ip'] %}
{%- set weight = 100 %}
{%- else %}
{%- set weight = 1 %}
{%- endif %}
Contributor

FWIW, you can use inline conditional expressions with Jinja:

{%-     set weight = 100 if apiserver == grains['metalk8s']['control_plane_ip'] else 1 %}

@@ -2,6 +2,15 @@

## Release 2.4.2 (in development)

### Breaking changes
Contributor

Would probably make sense to bump the version of BootstrapConfiguration

Contributor Author

No. What you can put in the 'object' doesn't really change: there are no backwards-incompatible changes or anything. We now simply ignore information we used before (and, as such, no longer require it to be set). It's a required value that becomes kinda 'optional' (and, in fact, entirely unused).

Basically, any existing v1alpha2 BootstrapConfiguration still works.

@bert-e
Contributor

bert-e commented Dec 16, 2019

I have successfully merged the changeset of this pull request
into the targeted development branches:

  • ✔️ development/2.4

  • ✔️ development/2.5

The following branches have NOT changed:

  • development/1.0
  • development/1.1
  • development/1.2
  • development/1.3
  • development/2.0
  • development/2.1
  • development/2.2
  • development/2.3

Please check the status of the associated issue GH-2103.

Goodbye nicolast.

@bert-e bert-e merged commit 9d70c9d into development/2.4 Dec 16, 2019
@bert-e bert-e deleted the improvement/GH-2103-apiserver-ha branch December 16, 2019 23:18
ChengYanJin pushed a commit that referenced this pull request Dec 17, 2019
Successfully merging this pull request may close these issues.

Control-plane without LB for API-server is a SPOF
5 participants