
Control-plane without LB for API-server is a SPOF #2103

Closed
NicolasT opened this issue Dec 6, 2019 · 2 comments · Fixed by #2106
Assignees
Labels
kind:bug Something isn't working

Comments

@NicolasT
Contributor

NicolasT commented Dec 6, 2019

When deploying without creating an external load-balancer for the API server, and without enabling the keepalived VIP management, we end up configuring all services that require 'external' access to the API server (kube-controller-manager, kube-scheduler, kubelet, kube-proxy, ...) to connect to `apiServer.host` from the `BootstrapConfiguration`, which is then the bootstrap node. As such, the bootstrap node becomes a SPOF for the control-plane, even when multiple control-plane nodes are up and running.

This can be resolved by either using an external LB or using the built-in keepalived functionality. However, this may not be feasible in all environments.

Instead, we could consider adopting the Kubespray approach of deploying an nginx instance on every host in the cluster (using hostNetwork), listening on 127.0.0.1:6443, which then balances/proxies to all known API server instances in the cluster. At that point, we can configure all kube-controller-managers, kube-schedulers, kubelets, kube-proxies and other consumers of the API server to connect to 127.0.0.1:6443.

This does not fix the issue of API server accessibility from outside the cluster, e.g. from an admin's workstation or similar.
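
For illustration, here is a minimal sketch of what such a node-local proxy configuration could look like, wrapped in a ConfigMap purely for readability. In practice the file would be rendered by Salt on each node and consumed by an nginx static Pod running with hostNetwork; all names, paths and IP addresses below are made up, not the actual MetalK8s ones.

```yaml
# Illustration only: node-local proxy configuration. Names, namespace and IPs
# are hypothetical; the real deployment would be Salt-rendered, not a ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: apiserver-proxy-config
  namespace: kube-system
data:
  nginx.conf: |
    error_log /dev/stdout notice;
    events {}
    stream {
      upstream kube-apiserver {
        # One entry per known control-plane node (example IPs).
        server 10.0.0.10:6443;
        server 10.0.0.11:6443;
        server 10.0.0.12:6443;
      }
      server {
        # Bound on loopback only: node-local clients connect to
        # https://127.0.0.1:6443 and nginx forwards the raw TLS stream
        # without terminating it.
        listen 127.0.0.1:6443;
        proxy_pass kube-apiserver;
      }
    }
```

Because nginx only forwards the TCP stream, TLS is still terminated by `kube-apiserver` itself, which is why the API server certificate must also carry `127.0.0.1` as a SAN (see the commits below).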

@NicolasT NicolasT added kind:bug Something isn't working moonshot labels Dec 6, 2019
@NicolasT
Contributor Author

NicolasT commented Dec 6, 2019

The benefit of using an 'internal' HA'ish solution would be that there's no need to special-case the keepalived infrastructure we have right now: if we want to provide a stable IP towards the API server for 'real external' access, we can use whatever HA/LB solution we come up with for #1788 when (also) applied to the control-plane.

@NicolasT
Contributor Author

NicolasT commented Dec 6, 2019

Moving forward with the nginx-based approach, we could then use this same proxy for other purposes, like access to the repositories, the salt-master, ...

@NicolasT NicolasT self-assigned this Dec 6, 2019
NicolasT added a commit that referenced this issue Dec 6, 2019
Since we'll start using `nginx` as a 'normal' container, we need to distribute it both as a tar archive (to inject into the `containerd` cache on the bootstrap node) and in the 'registry' for clients to pull.

See: #2103
NicolasT added a commit that referenced this issue Dec 6, 2019
When using a local(host) `nginx` to proxy to the `kube-apiserver` instances over a simple *stream* transport (i.e., a TCP socket, with no TLS handling by `nginx`), we need to ensure the TLS server exposes a certificate the client will accept, i.e., one that has `IP:127.0.0.1` in its `subjectAltName` field.

See: #2103
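
To illustrate what this certificate change amounts to, here is a kubeadm-style `ClusterConfiguration` snippet; MetalK8s renders its certificates through Salt states, so this is only an analogy, and the second IP is a made-up example.

```yaml
# Analogy only: how the additional SAN would be expressed with kubeadm.
# MetalK8s generates the kube-apiserver certificate via Salt instead.
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
apiServer:
  certSANs:
  - "127.0.0.1"     # accepted by clients going through the local proxy
  - "10.0.0.10"     # example: this node's control-plane IP
```

The resulting certificate should then list `IP Address:127.0.0.1` in its `X509v3 Subject Alternative Name` extension.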
NicolasT added a commit that referenced this issue Dec 6, 2019
This is similar to the poor man's HA functionality found in the Kubespray project.

See: #2103
NicolasT added a commit that referenced this issue Dec 6, 2019
Instead of using the `BootstrapConfiguration` `apiServer.host`, which will be removed in a subsequent commit since it is no longer required, use the control-plane IP of the host on which the `KubeConfig` file is generated.

See: #2103
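
In other words, the relevant part of a generated `KubeConfig` would look roughly like this, assuming a made-up control-plane IP of `10.0.0.10` for the generating host and a standard CA path (neither is necessarily the exact value MetalK8s writes):

```yaml
# Sketch of the generated KubeConfig's cluster entry; IP and CA path are
# illustrative assumptions.
apiVersion: v1
kind: Config
clusters:
- name: kubernetes
  cluster:
    server: https://10.0.0.10:6443            # control-plane IP of this host
    certificate-authority: /etc/kubernetes/pki/ca.crt
```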
NicolasT added a commit that referenced this issue Dec 6, 2019
…rver`

Instead of relying on the `BootstrapConfiguration` `apiServer.host` value, we can use the proxy to `kube-apiserver` running on every host in the cluster.

See: #2103
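
The net effect for node-local components (kubelet, kube-proxy, kube-controller-manager, kube-scheduler) is that their `KubeConfig` cluster entry points at the local proxy instead of a remote address, along these lines (CA path again an assumption):

```yaml
# Sketch only: component KubeConfig cluster entry once traffic goes through
# the node-local proxy.
apiVersion: v1
kind: Config
clusters:
- name: kubernetes
  cluster:
    server: https://127.0.0.1:6443            # the local apiserver proxy
    certificate-authority: /etc/kubernetes/pki/ca.crt
```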
NicolasT added a commit that referenced this issue Dec 6, 2019
We no longer need this since we provide in-cluster HA for `kube-apiserver` access. If this is desired for out-of-cluster access, we can provide it using a `LoadBalancer` `Service` once we have the supporting infrastructure in place.

This also removes the optional deployment of `keepalived`.

See: #2103
See: #1788
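
A hypothetical example of how out-of-cluster access could later be exposed once a `LoadBalancer` implementation exists (see #1788); this is not deployed by these changes, and all names and IPs are placeholders. Since `kube-apiserver` runs on the host network, such a `Service` would have no selector and rely on manually managed `Endpoints`:

```yaml
# Hypothetical only: selector-less Service plus manual Endpoints exposing the
# API servers through a LoadBalancer, once such infrastructure is available.
apiVersion: v1
kind: Service
metadata:
  name: apiserver-external
  namespace: kube-system
spec:
  type: LoadBalancer
  ports:
  - name: https
    port: 6443
---
apiVersion: v1
kind: Endpoints
metadata:
  name: apiserver-external      # must match the Service name
  namespace: kube-system
subsets:
- addresses:
  - ip: 10.0.0.10               # example control-plane node IPs
  - ip: 10.0.0.11
  - ip: 10.0.0.12
  ports:
  - name: https
    port: 6443
```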
NicolasT added a commit that referenced this issue Dec 12, 2019
…' into w/2.5/improvement/GH-2103-apiserver-ha

* origin/improvement/GH-2103-apiserver-ha:
  salt, kubernetes: remove `apiServer` from `BootstrapConfiguration`
  scripts/iso-manager: override `metalk8s.api_server.kubeconfig` in Pillar
  salt, kubernetes: use the `apiserver-proxy` to connect to `kube-apiserver`
  kubernetes: use control-plane IP in generated `KubeConfig`
  salt: deploy a local proxy to `kube-apiserver` on all nodes
  kubernetes: add `127.0.0.1` to SAN of `kube-apiserver` cert
  buildchain/image: allow to save image in multiple formats at once
  buildchain: distribute `nginx` as tar *and* in registry
  salt: use consistent permissions on SA private key
  salt: remove `metalk8s.internal.init`
  images: don't use `etcd3` 0.11.0 in `salt-master`
  ci: explicitly source `/etc/profile` when running build

Conflicts:
	buildchain/buildchain/versions.py
	salt/metalk8s/kubernetes/apiserver/installed.sls
NicolasT pushed a commit that referenced this issue Dec 13, 2019
We replace the `save_as_tar` boolean with a list of formats to use when saving the image.
This also allows cleaning up the RemoteImage code by moving format-specific code into dedicated classes instead of branching hither and yon in RemoteImage.

Refs: #2103
Signed-off-by: Sylvain Laperche <sylvain.laperche@scality.com>
(cherry picked from commit ac15e32)
Signed-off-by: Nicolas Trangez <ikke@nicolast.be>
TeddyAndrieux added a commit that referenced this issue Dec 20, 2019
Since #2103 we no longer use a VIP for the apiserver, so we need the IP of one apiserver to be able to register the new bootstrap node with the current Kubernetes cluster.

Fixes: #2157