Move buildmaster from Linode to AWS (tracking issue) #281
Comments
---
We actually don't have instructions for setting up a Salt master yet, so I'll need to write those up and add them to the wiki (e.g. installing the right version of the right package). Also, if you're trying to do this as a multimaster transition (which I recommend), you'll need to first list both masters in the minion configuration file, then, after verifying the new master is correctly set up, remove the old one. Note: if you're going to give both masters […]
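As a sketch of that transition, the minion configuration might look like the following (the hostnames here are placeholders, not the real Servo values):

```yaml
# /etc/salt/minion -- hypothetical sketch of a multimaster transition.
# Phase 1: list both the old and the new master; the minion will
# connect to and accept jobs from both.
master:
  - old-master.example.org
  - new-master.example.org

# Phase 2 (after verifying the new master is correctly set up):
# remove the old entry, leaving only:
# master:
#   - new-master.example.org
```

Restarting `salt-minion` after each change is needed for the minion to pick up the new master list.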
---
The private Google doc should have some possibly out-of-date instructions for setting up a Salt master. I think it was mostly bootstrappable except for some minor things, though these days we probably have things like minion keys that need to be dealt with. When I wrote the instructions I was doing a totally fresh master :)
---
I'd be interested in taking a peek at that doc purely out of curiosity :) As for the DNS, the fact that Homu and nginx are running on the same box as the Salt master is just an artifact of our environment, and we shouldn't bake it into our configuration files. We should add a separate DNS entry for the Salt master right after we set up the new master. After restarting the minions and checking connectivity to the new master, we can update the minion configuration accordingly. This lets us decouple the DNS changes for Salt from the DNS changes for Homu when we switch over.

What's the timezone for the 9am switchover? I may or may not be awake that early.
---
I think it was originally a private etherpad, and those servers have since been moved behind a firewall and then burned down (ether-pocalypse). It's 9AM US Pacific time :-)
Remove longview setup for servo-master (Linode exodus)

Longview is Linode's proprietary monitoring service. We are moving the servo-master machine from Linode to EC2, where we will no longer be able to use Longview, so this commit removes it. Refs #281

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/saltfs/282)
<!-- Reviewable:end -->
---
@aneeshusa Thanks for the feedback! I just checked the secrets doc, and it does not currently contain any setup instructions for the build master. How much of the master setup can we do through Salt itself? Does my amended checklist for doing a dual-master switchover look sane?
---
Master setup: for now, I just plan to create a script to install the correct packages. More complicated setup is IMO not worth automating right now; e.g., keeping the list of accepted minion keys in sync would want some kind of CMDB (even if it's just flat files), and ditto for keeping the pillars synced. (GitFS should keep the file tree synced.)

Checklist: we should explicitly separate setting up a Salt minion on the new machine from setting up a Salt master on the new machine (since we haven't Salted the process of setting up a Salt master yet). It looks like the checklist already does this, but I'd like that section to be more explicit.
We also need to add some additional steps to the switchover. Earlier, I was thinking to reuse the `servo-master` minion ID.

Also, I have a busy week, so any chance of pushing the switchover date back a few days would be appreciated!
---
Salt minion IDs are meant to be unique, so we should follow the convention instead of trying to reuse the `servo-master` ID.

New steps (we should do these before bringing up the new master): […]
At this point, we can proceed with setting up the new machine and assigning it a Salt minion ID of `servo-master1`. Also, I don't know what I was thinking yesterday with the DNS records, but here are my updated recommendations after some sleep:
The goal is that, for a given machine, the Salt minion ID == the hostname for the machine == the DNS A record for that machine, and that this is an immutable identifier for the machine. Separately, when we point at a given hostname for functional reasons (i.e. the Homu webhooks), we should use a CNAME record pointing to the appropriate machine, because the particular machine responsible could change at any time and the application should not need to know. Updated minion config settings:
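A minimal sketch of minion settings consistent with this convention, assuming the `servo-master0`/`servo-master1` names used elsewhere in this thread (the real values may differ):

```yaml
# /etc/salt/minion -- illustrative sketch only.
# The minion ID is the machine's immutable hostname:
id: servo-master1.servo.org
# Masters are also referenced by their immutable A-record names:
master:
  - servo-master0.servo.org
  - servo-master1.servo.org
```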
After the switchover:
Since we'd like the minions to connect to all the masters and not just one, we can use the immutable names directly, and we don't need a CNAME here. We may also want to consider adding […]
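In zone-file terms, the A-record/CNAME split described above might look like this (the IPs are placeholders, and `build.servo.org` is shown pointing at the new machine):

```
; Immutable per-machine names: one A record each.
servo-master0.servo.org.  300  IN  A      203.0.113.10   ; old (Linode)
servo-master1.servo.org.  300  IN  A      203.0.113.20   ; new (EC2)

; Functional name: a CNAME that can be repointed at any time.
build.servo.org.          300  IN  CNAME  servo-master1.servo.org.
```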
Handle multiple (redundant) masters

Update the Salt states to handle multiple minions that host masters. This will allow us to easily enter redundant multimaster mode to handle switching over our master from Linode to EC2, by using separate IDs for each machine instead of trying to reuse the `servo-master` ID. See #281 (comment) for more details.

I haven't updated the `common/map.jinja` file yet; are we still using these hostnames in the `/etc/hosts` file, or is everything happening via DNS lookups?

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="35" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/saltfs/284)
<!-- Reviewable:end -->
---
@aneeshusa The production Salt setup is now dual-master. I believe this covers our bases for a smooth transition on the saltfs side of things, and now the buildmaster+homu+DNS move is all that's left to worry about. I've amended all the minions' `master` settings, and they appear to be connecting to both buildmasters successfully at the same time.
---
Everything appears to be working correctly (http://servo-master1.servo.org/homu/queue/servo == http://build.servo.org/homu/queue/servo, and http://servo-master1.servo.org/buildslaves == http://build.servo.org/buildslaves). servo/servo#10285 shows that webhooks are working.
---
The only surprise that I ran into was in […]. @aneeshusa, I read a bit about managing Salt with Salt, and while I think it's a good idea to manage the Salt config files, it looks like it introduces a variety of complications. If you'd like to invest the time into making it work, I'd happily help get a PR through, but for now I think that making this type of major change to the Salt setup is such an infrequent occurrence that our effort is probably better focused on more commonly used parts of the system.
---
We're shutting down the old master.
---
I did run a […].

Does buildbot properly respect DNS TTLs? Will it retry and reconnect if we change the `build.servo.org` record?

We still list […].

Agreed on the Salt changes for now.
---
I don't know the details of how Buildbot handles TTLs, but kicking the buildslave processes seemed to cause them to repeat the lookup, so Buildbot itself probably isn't caching too hard. Wiki updated. I'll open an issue to discuss migrating away from hardcoded IPs in […].
---
Kicking buildbot should be good enough in that case. Already opened an issue for […].
cc @larsbergstrom
Async steps
- `m3.medium` instance on AWS has 1 core and 4.02GB RAM
- `servo-master0` as `servo-master`, and run a Salt highstate
- `/srv/salt/pillar`, `/srv/pillar/*`, `/etc/salt/master`, `/etc/salt/pki/master` directly to new master with sftp
- Update all Salt minions to say `master: build.servo.org` in `/etc/salt/minion`; amend https://github.com/servo/servo/wiki/Buildbot-administration#linux to reflect this change
- `servo-master0.servo.org` A record pointing to old buildmaster
- `servo-master1.servo.org` A record pointing to new buildmaster
- `build.servo.org` rather than the Linode master's IP

At 9am PST Wednesday 3/30

- `/home/servo/homu/main.db` from old master to new; restart homu
- `build.servo.org` record to point at new master. All webhooks use DNS, so they will not need to be modified.