Upstream Kubernetes Bug: Controller Manager & Scheduler show as Unhealthy in clusters that have controller-manager and scheduler not on the same host as API server (Azure, Tectonic among others) #11496

Open
tfiduccia opened this issue Feb 20, 2018 · 10 comments

@tfiduccia

commented Feb 20, 2018

Rancher versions: 2.0 master 2/20

Steps to Reproduce:

  1. Create an Azure AKS cluster
  2. Wait for it to activate and go to dashboard

Results: The Controller Manager and Scheduler stay in an unhealthy state.

@nathan-jenan-rancher

Contributor

commented Feb 21, 2018

This is a bug with Azure that we can't fix ourselves. Here is the issue we opened with them:

Azure/AKS#173

@deniseschannon deniseschannon modified the milestones: v2.0 - Beta, v2.0 - GA Feb 22, 2018

@deniseschannon deniseschannon modified the milestones: v2.0 - Beta, v2.0 - GA Mar 6, 2018

@deniseschannon deniseschannon changed the title from "Controller Manager & Scheduler don't Activate in Azure AKS" to "Upstream Azure Bug: Controller Manager & Scheduler don't Activate in Azure AKS" Mar 6, 2018

@deniseschannon deniseschannon modified the milestones: v2.0 - GA, v2.1 Apr 25, 2018

@JasonvanBrackel


commented May 10, 2018

This appears to be an upstream Kubernetes issue that affects AKS and, by proxy, Rancher.

@nathan-jenan-rancher linked to kubernetes/kubernetes#19570 in the conversation on Azure/AKS#173. This is complicated by the fact that componentstatus is broken and will likely be deprecated before it's fixed.

See kubernetes/enhancements#553
and
kubernetes/kubernetes#18610

@codedevote


commented Jul 6, 2018

As far as I understand, there is currently no way to create an AKS cluster using Rancher that will actually work as expected due to this issue. Is there any workaround or advice on how to overcome this limitation?

@JasonvanBrackel


commented Jul 6, 2018

@codedevote These clusters are fully functional; it is the health reporting that is wrong. If you were to create an AKS cluster from the Azure Portal and import it into Rancher, the cluster would be completely functional, but the controller manager and scheduler would show as unhealthy in Rancher. Rancher can still manage workloads on the cluster as designed.

@codedevote


commented Jul 7, 2018

@JasonvanBrackel Ah, ok. Thank you very much for the clarification on this issue.

@superseb superseb changed the title from "Upstream Azure Bug: Controller Manager & Scheduler don't Activate in Azure AKS" to "Upstream Kubernetes Bug: Controller Manager & Scheduler show as Unhealthy in clusters that have controller-manager and scheduler not on the same host as API server (Azure, Tectonic among others)" Jul 7, 2018

@cjellick cjellick modified the milestones: v2.1, v2.2 Aug 22, 2018

@cjellick cjellick modified the milestones: v2.2, Backlog Sep 18, 2018

@loganhz loganhz added the area/ake label Oct 20, 2018

@luisve1427


commented Dec 27, 2018

Do you have a timeline for this? This issue is stopping us from using Rancher with AKS. I understand that the cluster is fully functional, but the dashboard is used as a visualization tool.

@JasonvanBrackel


commented Jan 2, 2019

> Do you have a timeline for this? This issue is stopping us from using Rancher with AKS. I understand that the cluster is fully functional, but the dashboard is used as a visualization tool.

@luisve1427 This is a problem in Kubernetes itself. The underlying APIs involved are in a state of limbo as per my earlier comment in #11496 (comment).

Unfortunately, any tool that relies on the Kubernetes Component Status API will have this problem with clusters built in Azure until those APIs are deprecated in favor of APIs that are not yet released.
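
To make concrete what such a tool sees, here is a minimal Python sketch that reads a component status listing and reports which components are flagged unhealthy. It assumes the JSON has the shape produced by `kubectl get componentstatuses -o json` (a list of items, each with `metadata.name` and a `conditions` array containing a `Healthy` condition). The function name `unhealthy_components` and the sample data are hypothetical and only for illustration.

```python
import json


def unhealthy_components(componentstatus_json):
    """Return {component name: message} for every component whose
    'Healthy' condition is not reported as "True"."""
    doc = json.loads(componentstatus_json)
    result = {}
    for item in doc.get("items", []):
        name = item.get("metadata", {}).get("name", "<unknown>")
        for cond in item.get("conditions", []):
            if cond.get("type") == "Healthy" and cond.get("status") != "True":
                # Prefer the human-readable message, fall back to the error field.
                result[name] = cond.get("message") or cond.get("error", "")
    return result


if __name__ == "__main__":
    # Illustrative sample only (not real cluster output).
    sample = json.dumps({
        "items": [
            {"metadata": {"name": "scheduler"},
             "conditions": [{"type": "Healthy", "status": "False",
                             "message": "connection refused"}]},
            {"metadata": {"name": "etcd-0"},
             "conditions": [{"type": "Healthy", "status": "True",
                             "message": "ok"}]},
        ]
    })
    print(unhealthy_components(sample))
```

A result from this function only reflects what the Component Status API reports. As noted earlier in the thread, a component listed here can still be working normally on the cluster.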

@hadifarnoud


commented Mar 1, 2019

I'm not on Azure and still have the same issue (on Hetzner)

@JasonvanBrackel


commented Mar 4, 2019

It's a Kubernetes problem. It surfaces on any implementation that does not host the scheduler and the kube-apiserver on the same node. The underlying code calls localhost in places where that is no longer a sound assumption. It doesn't surface on most implementations because they use a traditional Kubernetes "master" node, but that layout is not a requirement. Hetzner is likely architected similarly to Azure in this respect, which is why the error surfaces there.
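
The localhost assumption can be illustrated with a short Python sketch. This is not the Kubernetes implementation; it only imitates the idea of a health check that always probes `/healthz` on `127.0.0.1`. The port numbers (10251 for the scheduler, 10252 for the controller manager) are an assumption based on the default health ports of Kubernetes releases from that period, and the names `probe_healthz` and `component_statuses` are hypothetical.

```python
import urllib.error
import urllib.request

# Assumed default health ports of that era; adjust for your cluster.
DEFAULT_PROBES = {
    "scheduler": 10251,
    "controller-manager": 10252,
}

# Bypass any configured HTTP proxy: a local health probe should connect directly.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe_healthz(host, port, timeout=2.0):
    """Return (healthy, message) for GET http://host:port/healthz."""
    url = "http://{}:{}/healthz".format(host, port)
    try:
        with _opener.open(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace").strip()
            return resp.status == 200, body
    except (urllib.error.URLError, OSError) as exc:
        # A refused connection is indistinguishable from "component is down",
        # even when the component is healthy on a different host.
        return False, str(exc)


def component_statuses(host="127.0.0.1", probes=DEFAULT_PROBES):
    """Probe every component on one fixed host, mirroring the localhost assumption."""
    return {name: probe_healthz(host, port) for name, port in probes.items()}


if __name__ == "__main__":
    for name, (healthy, message) in component_statuses().items():
        print(name, "Healthy" if healthy else "Unhealthy", message)
```

Run on a node that hosts the scheduler and controller manager, a probe like this would succeed. Run on a node that hosts only the API server, every probe fails with a connection error, which matches the "Unhealthy" status described in this issue even though the components themselves are running elsewhere.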

@ablekh


commented Mar 4, 2019

@JasonvanBrackel Would you have any thoughts/advice/help on #15986 (comment) by any chance? Thank you in advance!
