[Proposal] Release process #667

seanknox · 2017-05-23T20:47:55Z

Background

A constant pain point for acs-engine users and internal teams at Microsoft using acs-engine has been the lack of clearly defined versions that can be tracked for stability and feature completeness. Now that the project has agreed to use SemVer for version numbers, I'm proposing the following approach to defining releases of acs-engine.

For the first release, I propose the following:

Define/flesh out high-level roadmap (included in this PR) to be updated through the course of the project toward a "stable" v1.0.0 release.
Define the first official acs-engine release, v0.1.0, using GitHub Milestones
Define a special CI release job that 1) runs all unit tests, functional/regression tests, and any additional tests pertinent to release verification, 2) uses the GitHub API to tag the repo and create a release, and 3) builds and deploys any artifacts like Docker images (not clear this is needed)
Once we believe all issues/features are accepted for the v0.1.0 release, kick off automated release testing
In addition to automated testing, conduct manual testing according to a regression test matrix (TBD)
If testing produces an issue needing a fix, PR the fix, get it reviewed and merged into master
Use automated CI job above to trigger release

Going forward, the release process would be:

Bi-monthly planning meetings to review project Roadmap and define release milestone
Once all issues/features are accepted for the release milestone, kick off automated release testing as well as manual testing
If testing produces an issue needing a fix, PR the fix, get it reviewed and merged into master
Use automated CI job to trigger release

Fixes #55

seanknox · 2017-05-23T20:48:43Z

docs/roadmap/releases.md

+### Step 3: Tag and Create a Release
+
+TBD
+


@jchauncey do you have a lightweight recommendation for using CI to do this?

So typically you have a foo-release job that watches for tags pushed to a repo. You can then build whatever artifact you want with that given tag.

seanknox · 2017-05-23T20:49:07Z

docs/roadmap/releases.md

+### Step 4: Close GitHub Milestones
+
+TBD
+


Will flesh this out a bit later, before merging this PR.

We have used our "seed-repo" job in the past to close milestones and keep the labels in sync with each other.

Considering it's a single repo (just acs-engine) do you think that'd be helpful here?

seanknox · 2017-05-23T20:52:24Z

First crack at a release process. Please take a gander and comment. Thanks!

cc @anhowe @rgardler @dmitsh @lachie83 @jchauncey @sgoings

seanknox · 2017-05-23T21:00:45Z

@ritazh @wbuchwalter would love your thoughts as well.

wbuchwalter · 2017-05-25T22:06:18Z

Great stuff, two questions:

The dev would happen directly on the master branch like today, but releases would now happen more or less every two months.
What would be the process if an urgent fix needs to be made to the previous release?
Would it be easier to have the development happening on a dev branch and merging in master for each release?
Also, what would a release look like exactly? Would the acs-engine binaries be stored in a GitHub release?

seanknox · 2017-05-25T22:23:24Z

The dev would happen directly on the master branch like today, but releases would now happen more or less every two months.

Also, what would a release look like exactly? Would the acs-engine binaries be stored in a GitHub release?

Master development will largely continue as it already does. Releases will occur monthly, driven by milestones defined in GitHub. Release milestones follow the roadmap toward v1.0.0. The roadmap itself is still TBD, though I've added some examples (GPU support, support for dedicated etcd clusters, etc.).

What would be the process if an urgent fix needs to be made to the previous release?

Patch fixes can happen at any time, based on need. We're using SemVer for version numbers, which works like:

Given a version number MAJOR.MINOR.PATCH, increment the:

MAJOR version when you make incompatible API changes,
MINOR version when you add functionality in a backwards-compatible manner, and
PATCH version when you make backwards-compatible bug fixes.
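As a rough illustration of the increment rules quoted above, here is a minimal Go sketch. The `Part` type and the `Bump` helper are hypothetical names invented for this example and are not part of acs-engine. It handles only plain `MAJOR.MINOR.PATCH` strings with an optional `v` prefix, not pre-release or build metadata.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Part identifies which SemVer component to increment.
type Part int

const (
	Major Part = iota
	Minor
	Patch
)

// Bump parses a "MAJOR.MINOR.PATCH" string (optional "v" prefix) and returns
// the next version with a "v" prefix. Hypothetical helper for illustration.
func Bump(version string, part Part) (string, error) {
	fields := strings.Split(strings.TrimPrefix(version, "v"), ".")
	if len(fields) != 3 {
		return "", fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", version)
	}
	var nums [3]int
	for i, f := range fields {
		n, err := strconv.Atoi(f)
		if err != nil || n < 0 {
			return "", fmt.Errorf("invalid component %q in %q", f, version)
		}
		nums[i] = n
	}
	switch part {
	case Major:
		// Incompatible API change: reset MINOR and PATCH.
		nums[0], nums[1], nums[2] = nums[0]+1, 0, 0
	case Minor:
		// Backwards-compatible functionality: reset PATCH.
		nums[1], nums[2] = nums[1]+1, 0
	case Patch:
		// Backwards-compatible bug fix.
		nums[2]++
	default:
		return "", fmt.Errorf("unknown part %d", part)
	}
	return fmt.Sprintf("v%d.%d.%d", nums[0], nums[1], nums[2]), nil
}

func main() {
	for _, p := range []Part{Patch, Minor, Major} {
		next, err := Bump("v0.1.0", p)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(next)
	}
}
```

Starting from v0.1.0, an urgent fix would therefore ship as v0.1.1 without waiting for the next milestone release.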

Would it be easier to have the development happening on a dev branch and merging in master for each release?

I think this would be harder, actually. Trunk-based development (committing to master) discourages long-lived development branches to avoid merge hell and broken builds.

jchauncey · 2017-05-25T22:29:01Z

Yeah, you really, really want to avoid long-running "feature/dev" branches in this workflow. So we constantly merge into master.

Let's say we find a bug that we need to fix today and release immediately. The workflow would look something like the following:

PR gets submitted for bug fix and is reviewed
Merge PR into master
Checkout last release tag and create local branch
Cherry pick new commit from master onto local branch (advancing that tree by 1 commit)
Cut tag in git and push to github which will kick off the release pipeline in CI
This pipeline validates the bug fix is good
Publish artifact with version of new tag

This process can keep going for subsequent PATCH releases and even works if we need to port fixes onto older releases (for whatever reason that occurs).

wbuchwalter · 2017-05-25T22:29:46Z

Totally missed the SemVer part in your first explanation, makes sense. 👍 Forget what I said about git flow then.
Looking forward to this. I have a project that has acs-engine as a dependency, and this would make life much easier.

seanknox · 2017-05-26T19:59:07Z

Folks, I'm planning to merge this to get us moving. We'll adjust the process as we learn what works best.

JackQuincy · 2017-05-26T20:10:40Z

LGTM. But I don't have much experience with this sort of thing, so I'm not going to approve; someone else needs to look at this as well. @colemickens @anhowe @sgoings can any of you approve/look?

amanohar · 2017-05-26T20:56:03Z

If I have a multi-week feature that I am working on, is the proposal that there are regular merges into master for it? What if a release was scheduled for, say, 6/15 but the feature is not ready (but other features are)? How will a commit be picked to be tagged as a version? Will the partial change for a feature in development still be released?

jchauncey · 2017-05-26T20:59:57Z

It largely depends on how you want to divide up the feature development. Let's say it's a sufficiently large feature; then we might put it behind some type of conditional flag that is off by default. This way we can continue to merge and ship off master without it being live. Then, when the feature is ready for use, we turn it on by default. If a release was scheduled for 6/15 and we aren't ready for people to use the feature, we should still be able to ship.

Nothing should ever block master from being released (in theory ;))

amanohar · 2017-05-26T21:06:41Z

In theory, yes :). I am using a feature flag, so it's hidden even today.

My question/concern is around shipping partial code/features that might not have been fully tested. In theory nothing should even hit the code path of a new feature, but that's not always the case. I am in favor of the one-branch approach, but it's not without risks.

jchauncey · 2017-05-26T21:09:59Z

Sure, there are risks, but we can definitely minimize those. Feature flags help a lot, and for areas where we are refactoring code and removing functionality in favor of something else, we should have adequate test coverage to validate that something is OK to merge.

seanknox · 2017-05-27T02:19:23Z

If I have a multi-week feature that I am working on, is the proposal that there are regular merges into master for it? What if a release was scheduled for, say, 6/15 but the feature is not ready (but other features are)? How will a commit be picked to be tagged as a version? Will the partial change for a feature in development still be released?

These are great questions, so let's unpack this a bit.

If I have a multi-week feature that I am working on, is the proposal that there are regular merges into master for it?

Yes, and feature flags are a great way to enable this. They allow you, as the engineer working on the long-running feature, to stay current with HEAD without fear that the code path can hit your still-in-development work. As a corollary, I want to encourage us to deliver value in chunks. If a feature is large and complex, break the feature down into discrete deliverables and put them behind feature flags. You can see an example of the "smallest thing that adds value" mindset in Lachie's issue about creating a separate etcd cluster:

a user can provision a single etcd member on a separate VM

a user can provision an etcd cluster on separate VMs

a user can provision an etcd instance with SSD disks

The feature as a whole is "dedicated etcd cluster", which is a lot of work before it is complete. We can provide value to users right away and avoid merge conflict hell by merging each portion to master as it is completed. Use feature flags to rapidly iterate without jeopardizing safety.

What if a release was scheduled for, say, 6/15 but the feature is not ready (but other features are)?
How will a commit be picked to be tagged as a version?

Core to our ability to ship software at a regular cadence is planning. Periodically maintainers will meet—every month is what I proposed—to identify goals for the next version milestone. We do our best to estimate what we think we can get done. If we're wrong—say a feature was slated for the v0.2.0 release but isn't ready, then the feature won't be released. This is where a backlog and regular communication about the state of work can help us adjust course as we learn more. I'd like us to start using a backlog to provide transparency about feature and bug fix work (probably GitHub issues).

Will the partial change for a feature in development still be released?

Use feature flags. We want to constantly be delivering value and providing guardrails to ensure safety. This includes tests, linting, CI, and code review. If a commit can pass all of these gates and is behind a feature flip, then we have great confidence that the change set can be committed in a safe fashion.

My question/concern is around shipping partial code/feature that might not have been fully tested. In theory nothing should even hit the code path of a new feature but that's not always the case.

I've heard this fear many times in my career and can assure you there is no magic to making it work: a conditional evaluates to either true or false. If you have code behind a feature flag, it is still compiled like everything else, but it is only executed at runtime if the feature flag conditional is true. It can't be executed unless it passes the truth test.
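A minimal sketch of that idea in Go, assuming a flag that defaults to off. The `Features` struct, the `ACS_FEATURE_DEDICATED_ETCD` environment variable, and the `PlanVMs` function are hypothetical names for this example and do not reflect how acs-engine actually wires its configuration.

```go
package main

import (
	"fmt"
	"os"
)

// Features holds flags for in-progress work. The zero value means "off",
// so a new flag is disabled unless someone explicitly enables it.
type Features struct {
	DedicatedEtcd bool // hypothetical flag name
}

// FeaturesFromEnv builds Features from a lookup function such as os.Getenv.
// The environment variable name is an assumption made for this sketch.
func FeaturesFromEnv(getenv func(string) string) Features {
	return Features{
		DedicatedEtcd: getenv("ACS_FEATURE_DEDICATED_ETCD") == "true",
	}
}

// PlanVMs returns the VM roles to provision. The new code path is reached
// only when the flag is true; otherwise behavior is unchanged.
func PlanVMs(f Features) []string {
	roles := []string{"master", "agent"} // illustrative role names
	if f.DedicatedEtcd {
		roles = append(roles, "etcd")
	}
	return roles
}

func main() {
	fmt.Println(PlanVMs(FeaturesFromEnv(os.Getenv)))
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps both the flag-on and flag-off paths easy to cover in unit tests, which is one way to catch interactions between old and new code.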

We can increase our confidence that this is all true by ensuring we include unit and functional tests at every opportunity. When something breaks (in CI, or a regression that actually gets released), we update our automated tests to keep it from happening again. We create a delivery pipeline that all code must pass through before being committed:

unit tests
lint/style
PMs working side by side with engineers to ensure we all share the same idea about the feature
code review
functional/regression/smoke tests
CI as a gating mechanism to enforce all of this

amanohar · 2017-05-28T05:53:40Z

I've heard this fear many times in my career and can assure you there is no magic to making it work:

Makes sense. Thanks! Having done this before I understand that testing and CI are important. Although no fear involved here :) . I want to make sure I understand the proposal as it relates to my work and finding a good balance between quality and agility.

a conditional evaluates to either true or false. If you have code behind a feature flag, it is still compiled like everything else, but it is only executed at runtime if the feature flag conditional is true. It can't be executed unless it passes the truth test.

As mentioned, the code is behind a condition, so we should be good. And, based on my experience, it is not always possible to write code that can be cleanly put behind a condition and turned off. There is a possibility of some interactions/intersections between old and new code in some code paths.

jdumars · 2017-05-30T14:25:29Z

@seanknox - very well put. Working at Rally Software showed me how powerful Agile principles are when properly applied. It's not necessarily about dogma. It's about a spirit of partnership with the customer where delivery is as rapid as possible so course-correcting feedback can be reconciled with the plan. And, in every action, is the curiosity about how we might do it better next time. These two driving forces coupled with some of the practices you mentioned above can positively transform the entire landscape of development processes.

ritazh · 2017-05-30T15:54:15Z

Great discussion! Completely agree with the approach.

We can increase our confidence that this is all true by ensuring we include unit and functional tests at every opportunity.

We should probably start with this: creating issues where we are currently missing tests. Most of the tests today are around templates.

CI as a gating mechanism to enforce all of this

I'm sure there are already discussions around CI. As a developer, it would be really nice to be able to see which tests caused the CI to fail (especially the E2E tests and logs) and be able to fix them without bugging the maintainers. 😄

seanknox · 2017-05-30T20:21:26Z

@ritazh love your idea to track gaps in test coverage. Would you mind opening an issue around missing coverage you're aware of?

I'm sure there are already discussions around CI. As a developer, it would be really nice to be able to see which tests caused the CI to fail (especially the E2E tests and logs) and be able to fix them without bugging the maintainers. 😄

Great observation. We're seeking to make CI public for this very reason.

anhowe · 2017-05-30T22:44:27Z

docs/roadmap/roadmap.md

+## CI dashboard available to the public
+## Native Azure VNET CNI support
+## Multi-GPU support
+## Support for dedicated etcd cluster VMs


Can we instead replace the above 4 items with a "feature" issue tag?

Per our discussion over using the wiki to track Roadmap, I'm going to delete this roadmap.md file.

anhowe · 2017-05-30T22:46:27Z

docs/roadmap/releases.md

+This section leads a maintainer through creating an acs-engine release.
+
+### Step 1: Assemble Master Changelog
+A change log is a file which contains a curated, chronologically ordered list of changes


Is the changelog generation automated?

It probably should be, though we'd want to adopt semantic commit messages to ensure commits are in a format useful for the changelog. @jchauncey thoughts?

We have a changelog generator script that can make pretty changelogs but you must follow a commit style for it to work.

But it generates them to look like this - https://github.com/deis/workflow/releases/tag/v2.13.0

Roger that, then no need to specify in this PR. FWIW, I have another PR around enforcing go conventions which includes using semantic commits.

anhowe · 2017-05-30T22:49:24Z

docs/roadmap/releases.md

+
+When showstopper-level bugs are found, the process is as follows:
+
+1. Create an issue that describes the bug.


write a test that covers the test gap?

I like it. Will update.

anhowe · 2017-05-30T22:51:25Z

docs/roadmap/planning-process.md

+
+## Release Planning Meetings
+
+Major decisions affecting the Roadmap are discussed during Release Planning Meetings on the first Thursday of each month, aligned with the [Release Schedule][].


These would also be aligned with the monthly objectives.

anhowe · 2017-05-30T22:55:59Z

docs/roadmap/planning-process.md

@@ -0,0 +1,28 @@
+# Planning Process
+
+acs-engine features a lightweight process that emphasizes openness and ensures every community member can be an integral part of planning for the future.


Maybe delete this sentence (further description in slack)

anhowe · 2017-05-30T22:57:10Z

docs/roadmap/planning-process.md

+
+## The Role of Maintainers
+
+[Maintainers][] lead the acs-engine project. Their duties include proposing the Roadmap, reviewing and integrating contributions and maintaining the vision of the project.


We need to include the current roles covered by the "GitHub Issues" Role, and the "GitHub Code Review" role.

anhowe

lgtm, please answer the comments/questions, then it should be good

jackfrancis · 2017-05-30T23:56:48Z

LGTM. My only opinion would be that for the initial v0.1.0 release, we stick to mostly the overhead of this new release process itself: tooling, orchestration, docs, etc. And that we deprioritize any add'l code changes during this "administrative" spike.

In other words, in my view, the purpose of this upcoming release is exclusively to "package" the existing code into a rational, versioned, process-approved publication.

Great work!

seanknox · 2017-05-31T02:18:50Z

My only opinion would be that for the initial v0.1.0 release, we stick to mostly the overhead of this new release process itself: tooling, orchestration, docs, etc. And that we deprioritize any add'l code changes during this "administrative" spike.

In other words, in my view, the purpose of this upcoming release is exclusively to "package" the existing code into a rational, versioned, process-approved publication.

Well said @jackfrancis. I removed the roadmap.md in favor of using the wiki to manage the roadmap: https://github.com/Azure/acs-engine/wiki/Roadmap

FWIW, the "stretch" goals listed will assuredly be part of the release. In fact, 2 of them are already merged and GPU support is close behind.

seanknox · 2017-05-31T03:20:33Z

Thanks for the considerate feedback everyone!

msftclas added the cla-not-required label May 23, 2017

seanknox commented May 23, 2017

seanknox changed the title ~~Release process~~ [Proposal] Release process May 24, 2017

seanknox force-pushed the release-process branch from 064e90e to f42fc8d Compare May 24, 2017 23:21

seanknox force-pushed the release-process branch from f42fc8d to 6cb1ace Compare May 25, 2017 22:24

seanknox mentioned this pull request May 26, 2017

Produce releases exist #55

Closed

seanknox added this to the v0.1.0 milestone May 26, 2017

seanknox added the ready label May 30, 2017

Sean Knox added 2 commits May 30, 2017 13:23

(docs): add release process

76aa095

(docs): add planning process

8ac8ab0

seanknox force-pushed the release-process branch from 0e26a4e to 28779b5 Compare May 30, 2017 20:23

seanknox mentioned this pull request May 30, 2017

A release process using SemVer exists #695

Closed

anhowe reviewed May 30, 2017

anhowe approved these changes May 30, 2017

docs(release): address review feedback

458a2bb

seanknox force-pushed the release-process branch from 28779b5 to 458a2bb Compare May 31, 2017 02:08

seanknox merged commit d6608f3 into master May 31, 2017

seanknox removed the ready label May 31, 2017

seanknox deleted the release-process branch May 31, 2017 02:57

