Skip to content
This repository has been archived by the owner on Mar 2, 2018. It is now read-only.

Question: Tag experiments and avoid the master branch for production #11

Closed
vladikoff opened this issue Jun 18, 2015 · 16 comments
Closed

Comments

@vladikoff
Copy link
Contributor

Main concern:

The master branch should not be the one that is used in production

Some Notes from a discussion with @pdehaan @jrgm:

  1. Should we have a way to QA the new experiments in latest or stage before they go live into production? Which means - should prod use a specific train-XX branch which gets updated manually before deploy.
  2. Using the master branch for production releases feels like an anti-pattern weird and IMHO will be really unexpected to new developers of the project.
  3. It makes it harder to test new experiments:
    • there is no FxA QA cycle via TeamCity or other testing
    • e2e tests affected: i.e some experiment makes Marketplace E2E tests fail and is only caught when the experiment was enabled in production.
  4. How does rollback work for experiments?
  5. Who is in charge of merging the experiments, how can that person tell if production was negatively affected by the changes made to this repository?

Related, we should limit "Merge access" to this repository: #10

@vladikoff
Copy link
Contributor Author

EDIT: removed the word anti-pattern and clarified point 2. ;)

@dannycoates
Copy link
Contributor

Should we have a way to QA the new experiments in latest or stage before they go live into production?

Absolutely...

Using the master branch for production releases feels weird

I agree. We don't follow that pattern for any of the project repos, so it's surprising. I propose we have a branch for each environment we have code running on.

  • master becomes the dev branch
  • stage and prod each have their own branch
  • topic branches as needed for new work

Each env will specify in their config which branch they're following. I don't think we want train branches. We could tag prod with the train on release day for record keeping, but the release cycle of experiments should not be bound to trains.

Experiments should merge from branch to branch only in this order, topic -> master -> stage -> prod. Rolling back is either a git reset or git revert.

Is anyone uncomfortable with that branch strategy?

@rfk
Copy link
Contributor

rfk commented Jun 18, 2015

removed the word anti-pattern

Is the word "anti-pattern" considered an anti-pattern? :-P

@shane-tomlinson
Copy link

master becomes the dev branch
stage and prod each have their own branch
topic branches as needed for new work

I like this strategy, with a modification to line 2. Copying a comment I made in IRC:

If we have to roll back a train, then it'd make it easy to roll back the experiments too. For example, say train 42 was a catastrophe and needs to be rolled back. With just a prod branch, fxa-content-experiments would still be looking for uniqueUserId whereas the content server would be passing uuid.

The background is uuid may be renamed to uniqueUserId during train 42.

@rfk
Copy link
Contributor

rfk commented Jul 9, 2015

We could tag prod with the train on release day for record keeping

This sounds prudent to me. What's the next action on this bug?

@shane-tomlinson
Copy link

We could tag prod with the train on release day for record keeping

This sounds prudent to me. What's the next action on this bug?

After mulling over rollback, I am strongly against using a generic prod branch. Here is the scenario I want to guard against.

We want to change the name uuid to uniqueUserId in train 42, including where used as a subject attribute in fxa-content-experimetns. We update fxa-content-experiment's prod when train 42 is cut. @jrgm and co find that a bug was introduced in train 42 that requires a rollback to train 41. At this point, all code except the fxa-content-experiments is rolled back to train 41. fxa-content-experiments is still using prod, which was cut at train 42, meaning the experiment's subject attributes will still be looking for uniqueUserId instead of uuid. At this point, none of our able based selections work as we expect. We roll the content-experiments back to a known good sha and disable auto-updating on the content-server. It would have just been easier to tag the experiments train-41 and train-42 in the first place.

@rfk
Copy link
Contributor

rfk commented Jul 9, 2015

I thought that's broadly what @dannycoates was suggesting with "we could tag prod with the train on release day" - put a train-X tag on prod branch each time we do a deploy, but retain the flexibility to add additional commits to the prod branch outside of the regular train cycle. @shane-tomlinson are you suggesting more along the lines of a separate train-X experiments branch for each train, do that we don't have to rollback anything in this repo?

@shane-tomlinson
Copy link

put a train-X tag on prod branch each time we do a deploy, but retain the flexibility to add additional commits to the prod branch outside of the regular train cycle.

I suggest the content server should not load directly from the prod branch to avoid dealing with hassles/inadvertent errors during a rollback.

When a new production train is cut, a new branch of fxa-content-experiments can be created as well, named train-xx. We can automate updating the content server configuration every time a train is cut to pull from that train's branch. If we want to modify behavior of the currently running train, we can follow a merge sequence similar to @dannycoates's suggestion: master->stage->train-xx

This sequence is how we already approach hot-patching existing repos, and gives us a way to easily rollback without worrying about keeping track of specific git shas and updating content server config manually.

I want to avoid adding more stress to the rollback procedure than already exists, diverging from the existing process is going to add friction that I do not see the need for.

@vladikoff
Copy link
Contributor Author

I suggest the content server should not load directly from the prod branch to avoid dealing with hassles/inadvertent errors during a rollback.

So we should not need a prod branch at all right?

We can automate updating the content server configuration every time a train is cut to pull from that train's branch.

👍

@dannycoates
Copy link
Contributor

Train branches sound reasonable. One way we could still mess up is if we forget to create the new train branch in this repo before the new train is rolled out.

@vladikoff
Copy link
Contributor Author

if we forget to create the new train branch in this repo

content server should throw an error if the branch does not exist during startup.

Side question: what happens if github is not accessible during the refresh process (that runs every 5-10 minutes) @dannycoates ?

@dannycoates
Copy link
Contributor

We could also skip the stage branch and have it use the train-(X + 1) branch to catch the missing branch early.

what happens if github is not accessible during the refresh process?

It times out and tries again the next time.

@rfk
Copy link
Contributor

rfk commented Jul 9, 2015

SGTM :-)

@vladikoff
Copy link
Contributor Author

Decisions made!

@edwindotcom
Copy link

closing, everyone is aware of the new process.

@rfk
Copy link
Contributor

rfk commented Jul 14, 2015

everyone is aware of the new process.

Let's make sure this new process is documented somewhere for posterity; I'm sure my own awareness will decay quickly with time

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants