stuck on kube-apiserver HA #55677

Closed
Dieken opened this Issue Nov 14, 2017 · 5 comments

Dieken commented Nov 14, 2017

// I didn't get a suggestion on Slack, so I'm opening an issue here.

I'm stuck on kube-apiserver HA. The load balancer on the China Aliyun cloud provider doesn't allow HTTPS backend servers for an L7 load balancer, and a backend server behind an L4 load balancer can't reach itself through that load balancer.

k8s node         ---https--> L7 LB ---[oops, no https]----> k8s apiserver
k8s master nodeA ----------> L4 LB ---[oops, can't reach]-> k8s master nodeA

I planned to place all k8s master nodes behind the L4 LB: all k8s worker nodes would access the apiserver through the LB, while master nodes would use their local apiserver directly. Unfortunately, kube-proxy on every node reads the same "kube-proxy" ConfigMap, which includes the apiserver address, so I can't use different configs for the kube-proxy pods on master and worker nodes.

Currently my plan B is to write a DNS checker: it periodically checks all apiservers and updates a DNS record, or even updates /etc/hosts on all k8s nodes. Is there any better solution?
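To make plan B concrete, here is a minimal sketch of such a checker. It is only an illustration under assumptions, not a tested tool: the master IPs, port 6443, the shared DNS name, and the use of the apiserver's /healthz endpoint are all hypothetical placeholders for my setup, and certificate verification is disabled because this is just a liveness probe.

```python
import ssl
import urllib.request

APISERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical master IPs
APISERVER_NAME = "apiserver.cluster.local"         # hypothetical shared name

def is_healthy(ip, port=6443, timeout=2):
    """Probe an apiserver's /healthz endpoint; HTTP 200 counts as alive."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE  # liveness probe only, not real auth
    try:
        with urllib.request.urlopen(f"https://{ip}:{port}/healthz",
                                    timeout=timeout, context=ctx) as resp:
            return resp.status == 200
    except OSError:
        return False

def hosts_lines(healthy_ips, name=APISERVER_NAME):
    """Render /etc/hosts lines pointing the shared name at healthy masters."""
    return [f"{ip} {name}" for ip in healthy_ips]

if __name__ == "__main__":
    alive = [ip for ip in APISERVERS if is_healthy(ip)]
    # A real checker would atomically rewrite /etc/hosts (or push a DNS
    # record) with these lines, on a timer, on every node.
    print("\n".join(hosts_lines(alive)))
```

The loop would run under cron or a systemd timer on each node; the failure mode (as I found out later in this thread) is that pods don't consult the host's /etc/hosts at all.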

IMHO, I am wondering why kubelet + kube-controller-manager + kube-scheduler can't take a list of apiservers, just as kube-apiserver takes a list of etcd servers; then we wouldn't have to rely on a load balancer or DNS to achieve HA for kube-apiserver.

@kubernetes/sig-api-machinery @kubernetes/sig-cluster-ops @kubernetes/sig-network

Dieken commented Nov 14, 2017

/sig api-machinery
/sig cluster-ops
/sig network

Dieken commented Nov 27, 2017

My plan B to update /etc/hosts on the host OS doesn't work: pods look up DNS through the host's /etc/resolv.conf, not /etc/hosts.

Currently I just let those 3 k8s master nodes tolerate a 1/3 chance of timing out when connecting to the SLB; if one node is down, it becomes a 1/2 chance. Very dirty :-(

fejta-bot commented Feb 25, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot commented Mar 27, 2018

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

fejta-bot commented Apr 26, 2018

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
