Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set a git ref when a deploy is successful. #917

Merged
merged 3 commits into from Jul 5, 2019

Conversation

Projects
None yet
4 participants
@ajshepley
Copy link
Contributor

commented Jul 2, 2019

To support future features, we want to add a per-environment per-project git reference that will allow new tools and features to know which commit is currently deployed by making a Github refs api call.

See Shopify/shipit#743 for more context.

The basic gist of these changes is that when a deploy finishes, its -> success state transition will trigger a call to its owning Stack, which will enqueue a job that will update or create a ref in github pointing to whatever the Stack thinks is its latest deployed commit.

For now, we've assumed that it is sufficient to abide by a ref set at the end of a deploy, and not add things like startup checks by the stacks the ensure the currently set ref is correct (and so on). The idea being that it is fine to let an assumedly-rare edge case of stale or incorrect last deployed refs self-correct via deploys or manual intervention for now.

This should serve to resolve Shopify/shipit#743.

@ajshepley ajshepley requested review from DazWorrall and JackTLi Jul 2, 2019

@DazWorrall
Copy link
Member

left a comment

One minor thing but this LGTM. Were you able to 🎩from your dev machine?

@DazWorrall DazWorrall requested a review from casperisfine Jul 3, 2019

@ajshepley ajshepley changed the title Ajs/743 set ref upon deploy Set a git ref when a deploy is successful. Jul 3, 2019

@ajshepley

This comment has been minimized.

Copy link
Contributor Author

commented Jul 3, 2019

Yeah, I was able to test with Shipit running locally using a stack based on a personal project branch.

This led to me discovering the issue fixed with commit c3b669c. I may try to add a test for that case shortly, actually. (I hadn't discovered this issue previously as I was invoking the job via a console and via tests.)

That said, even when 🎩 -ing I find there's a lot going on with Shipit and so it's easy (so far) for me to miss or not think of important cases. So if you're thinking I might have missed something, do let me know! 😃

@DazWorrall
Copy link
Member

left a comment

Looks good!

@ajshepley

This comment has been minimized.

Copy link
Contributor Author

commented Jul 3, 2019

I realized there's a flaw in what I've written in c3b669c - it could, in theory, retrieve a deploy other than the one that enqueued the job.

That alone isn't a problem and, in fact, ensures better behaviour of only updating with the latest sha. However, it's technically possible that a different, unsuccessful deploy could have occurred between the job being enqueued and the job firing. This would result in an incorrect ref being set.

I'm preparing a change (now: 82b2c23) that has the deploy provide the job with its own until_commit sha. This means that the job could be stale by the time it executes, but in exchange it can never update the ref with the sha of a commit that wasn't successfully deployed.

However, something I'm now wondering about is if we want the ref to be updated for other Completed deploy statuses (such as Faulty, etc.). Those environments will technically be running new changes. But on the other hand, maybe we don't want to signal to other tools/features that they should be using the new commit in those cases? @JackTLi @DazWorrall what do you guys think?

That is, should I change deploys.rb from

after_transition to: :success, do: :update_latest_deployed_ref

to something like

after_transition to: %i(success flapping faulty validating), do: :update_latest_deployed_ref

EDIT: Talked to Jack directly, we explicitly only want this happening on success, so I'll leave it as-is.

@JackTLi

This comment has been minimized.

Copy link
Member

commented Jul 3, 2019

hm I'm a little lost on the change to include the sha as a job parameter - what's wrong with stack_sha = stack.last_deployed_commit.sha in the job? I would prefer this since it'll always give whatever is out right now. Ideally this value should be the one that's in sync with the ref

@ajshepley ajshepley force-pushed the ajs/743_set_ref_upon_deploy branch 3 times, most recently from 9dc43e2 to cd27cd1 Jul 3, 2019

@ajshepley

This comment has been minimized.

Copy link
Contributor Author

commented Jul 3, 2019

For posterity, we discussed Jack's comment above offline:

In practice, stack.last_deployed_commit was somehow returning the information of the deploy prior (to the one that triggered the job) when running shipit locally / 🎩, but manually calling stack.deploys.last from the job correctly returned the most recent deploy.

But this ran the risk of failed deploys' SHAs being picked up, etc. Filtering seemed long-in-the-tooth compared to just passing the sha from the deploy triggering the job.

However, when discussing this it seemed that it was better to just make the filtered call

stack_sha = stack.deploys.where(status: "success").last&.until_commit&.sha

in the job, ensuring that it is correct even if it's a bit ugly, rather than run into potential issues due to staleness or ordering concerns. So I've now gone down that route, and added a test that checks that failed deploy SHAs are not sent to github.

As a sidenote, we also came to the conclusion that, due to rollbacks needing to update the ref, we'll let Octokit force the update_ref call via force: true. Github, by default, does the opposite (and so only allows fast-forwards) but Octokit true as the default.

ajshepley added some commits Jun 27, 2019

When deploys complete, they will now tell the stack to save a git ref…
… pointing at the latest deployed commit sha.
Added tests for update-git-refs job, added test that asserts that the…
… ref update task is enqueued when deploys complete. Added test for deploys triggering the ref job.

Added test for when the raised exception is not the one expected.

Added test that ensures the correct/most recently deployed SHA is chosen by the job.

Added test that verifies that only the most recent successful deploy sha of a stack is used by the ref job.

@ajshepley ajshepley force-pushed the ajs/743_set_ref_upon_deploy branch from cd27cd1 to ba3f19c Jul 3, 2019

DEPLOY_PREFIX = 'shipit-deploy'.freeze

def perform(stack)
stack_sha = stack.deploys.where(status: "success").last.until_commit.sha

This comment has been minimized.

Copy link
@casperisfine

casperisfine Jul 4, 2019

Contributor

You should put that query in a Stack method, so that it both have a meaningful name, and is easy to unit test in stacks_test.rb.

Also there's already a bunch of methods doing something very close.

There is also a scope for success status:

scope :success, -> { where(status: 'success') }

This comment has been minimized.

Copy link
@ajshepley

ajshepley Jul 4, 2019

Author Contributor

EDIT: See the comment after this one.


Yeah, it's definitely worth extracting. I'll do so now.

Though I figure I should explain why I didn't re-use the existing scopes and methods:


I was previously trying to retrieve this information via stack.last_deployed_commit (which turned out, after talking to Jack, wasn't restrictive enough. But that's a separate issue.)

That method is just calling last_completed_deploy, which itself is just deploys_and_rollbacks.last_completed.
last_completed is just a class method returning the last record of the completed scope. That scope includes 'success'.

This worked fine in automated tests. However, I observed some odd behaviour when testing shipit itself locally.

When the Deploy transitioned to Success, it would trigger my method and enqueue the job. However, when the Job later called into the stack (via last_completed_deploy), it would return the deploy prior to the one that triggered the job.

But stack.deploys returns the latest deploy correctly.


I'm new to Rails, so I wasn't sure if this was due to how scopes work under the hood, or some other mechanism that I wasn't understanding.

I haven't gone so far as looking at the queries themselves to determine exactly what's going on here, but that was a reason I avoided using the existing scope here.

I have a bit more bandwidth now to test around and figure out this behaviour (or discover the flaw in my testing that led me to think this was happening).

This comment has been minimized.

Copy link
@ajshepley

ajshepley Jul 4, 2019

Author Contributor

I did some more research and talked to Jack about the stuff I described above.

It seems that state_machine's after_transition executes before the active record commits, and so the job is enqueued before the commit as well.

When running Shipit locally, this can result in the Job itself executing before the Deploy's change to 'success' has been committed, and so the scope filtering to the most recent deploy of 'success' will return the incorrect deploy.

This race condition explains why I was seeing the odd behaviour described above. This is actually an issue anywhere we're using after_transition to enqueue a job that accesses db state.


After talking to Jack, we've decided that this is unlikely to occur in production. So I'm going to defer to adding a Stack method that uses the aforementioned scope.

There are a few different overall approaches and workarounds that could be taken, but Jack and I agreed that they should be held off and done on a later change.

@ajshepley ajshepley requested a review from JackTLi Jul 4, 2019

@casperisfine
Copy link
Contributor

left a comment

Code LGTM.

One thing I'd suggest though would be to make this opt-in. Because many CI system will trigger a build on new branch created (no longer a problem with buildkite), and people might want to have a specific branch name.

@JackTLi

JackTLi approved these changes Jul 5, 2019

@ajshepley

This comment has been minimized.

Copy link
Contributor Author

commented Jul 5, 2019

EDIT: Nevermind that. This repo is not hooked up to that hook.

@ajshepley ajshepley merged commit aa6a1e3 into master Jul 5, 2019

3 checks passed

CLA Contributor License Agreement (CLA) status
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@ajshepley ajshepley deleted the ajs/743_set_ref_upon_deploy branch Jul 8, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.