Rebuilding a Master #6911

@brettneese

What's the best way to set up OpenShift so that the master can be rebuilt easily while keeping data and settings intact? A few of our masters have blown up on us for reasons we haven't been able to determine, and we're running into a lot of problems getting our cluster back online.

Our plan was to run a separate etcd instance and image that. But when we destroyed the master and attempted to rebuild it using the ansible-install script, everything appears to be frozen. The web console lists the services and deployments, but nothing is building or deploying; it all just hangs, and there's nothing in the logs for any of them either.

The only suspicious thing in our journalctl is:

`Jan 29 18:00:23 ip-172-31-41-10.ec2.internal origin-master[858]: E0129 18:00:23.961762 858 horizontal.go:69] Couldn't reconcile horizontal pod autoscalers: error listing nodes: the server has asked for the client to provide credentials (get horizontalPodAutoscalers)`
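
In case it helps anyone trying to reproduce this, here is a minimal sketch of how error lines like the one above could be pulled out of journal output. It assumes the master logs use the glog-style prefix seen in that line (level letter, `mmdd`, time, process ID, `file:line]`); the function names `parse_glog` and `errors_only` are made up for illustration and are not part of OpenShift.

```python
import re

# Assumed glog-style header, based on the single log line quoted above:
#   E0129 18:00:23.961762 858 horizontal.go:69] message
GLOG_RE = re.compile(
    r"(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<pid>\d+) "
    r"(?P<source>[^\s:]+:\d+)\] (?P<message>.*)"
)


def parse_glog(line):
    """Return the glog fields found in a journal line, or None if there are none."""
    match = GLOG_RE.search(line)
    return match.groupdict() if match else None


def errors_only(lines):
    """Yield parsed records for error (E) and fatal (F) lines only."""
    for line in lines:
        record = parse_glog(line)
        if record and record["level"] in ("E", "F"):
            yield record
```

Feeding it the lines from `journalctl` for the master service (the exact unit name depends on the install) and printing `source` and `message` for each record should make it easier to spot whether the credentials error is the only one being logged.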

We're running on AWS for now but will eventually have our own metal, using this AMI: https://aws.amazon.com/marketplace/pp/B00O7WM7QW

(I'm also on IRC as <hbk> <bneese> - we connected the IRC channel to our Slack. :))
