Improving our git workflow #266
Comments
It is much more difficult than the previous workflow, where we generally had a single minor version that was active and a single minor version that was only given major bug fixes. The introduction of the master branch, and the lack of any releases yet to give a baseline for what makes a change significant, has made it a bit confusing about where things should be targeted. In the case of issue #264 today, I opened its pull request against master because it isn't a critical fix (data loss/stability/etc.) and I didn't want to interrupt the upcoming review of the first 2.0/1.0 releases. Having a clear roadmap for which versions are upcoming, along with expected maintenance dates for major/minor version combinations and the name of the current git branch to start on, would be useful. Then when you want to lock off a branch from features, such as when 2.0.0 gets released, you would change the roadmap to indicate that even small features get put on master. I have also found it a little confusing that the two release branches have had -M1/-M2 releases and are then bumped back to the same snapshot build as previously. Bumping back to 2.0-SNAPSHOT rather than 2.0-M3-SNAPSHOT, for example, makes it difficult to easily visualise which milestone we are currently up to on each branch.
I don't think the workflow is fundamentally different from what we did before. master is the branch for "the next minor/major release". 'releases/2.0.x' is the branch for all fixes to 2.0. The thing that perhaps is confusing is that I created the 2.0.x branch before we released 2.0 - but that is because we needed to allow development of new features that would not be included in the 2.0 release (so that we could be ready in a pinch once we get through review). Hence the split from master. I'm not sure I can come up with a simpler model that still allows us to do this. As an aside I am deliberately leaving the 1.0 branch out of all this, as it's a backport and really should not be used for any sort of fixes or enhancements.
That seems very reasonable to me. But for what it's worth it's entirely acceptable to do bug fixes on the release branch, even during review.
A clearer roadmap is definitely a good idea. What I struggle with here though is everybody's limited availability. I can come up with a basic timeline and fill in some of the issues/features, but I will need input (and more importantly commitment) from other devs to plan the rest. This has traditionally not been a strong point of our little band :)
This was mainly done because milestone builds are not considered "proper" releases, and I sort of had to shoehorn the Milestone naming into our workflow (not to mention that pending legal approval it was unclear when we could do a proper release, so I wanted to keep our options open). But I can understand how it's confusing. I won't do it that way for the next major release (which is when I would first expect us to do milestones again).
Here's an interesting article, proposing an alternative approach to gitflow (which the author claims is needlessly complex):
My impression is that the full Gitflow is most suited to web applications that have a quick release and deploy cycle, which doesn't exactly fit this case. The tool support would be the main reason for going with it, but it may make it difficult for others just using the basic git binary to contribute. The anti-gitflow argument is almost as difficult to work through as the pro-gitflow argument in my view. They advocate rebasing and squashing simply to get a linear history, but there are just as many merge conflicts when rebasing/squashing as there are with branched/merged features. I personally don't mind seeing the full non-linear history as it actually occurred, but that mostly relies on the branch names being useful and not simply issue-key numbers. Once the first legal review is done, could we highly recommend that both features and bug fixes be on "master", and cherry-pick the appropriate bug fixes back onto the releases/2.0.x branch ourselves to relieve users of that responsibility? Is it a legal requirement from Eclipse that we put all of the contribution information into a single file? If not, can we add some direct instructions to README.md to get people started? If it all has to be in the relatively large CONTRIBUTING.md file, can we rearrange or add to the file to give the information about a single recommended branch to start on and the git commands to get them started on it? For example, adding the commands "git checkout master" and "git checkout -B issues/#101-turtle-trig-line-numbers" might stand out to them where the current prose version of that doesn't.
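As a rough sketch of what such a "getting started" snippet might look like, the two commands quoted above can be tried in a throwaway repository. The temporary directory, the placeholder user name and email, and the empty initial commit are assumptions made only so the example runs on its own; in a real clone only the two `git checkout` lines would be needed.

```shell
set -e

# Throwaway repository standing in for a real clone (assumption for this demo).
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Contributor"
git config user.email "contributor@example.org"
git commit -q --allow-empty -m "Initial commit"

# The two commands suggested in the comment above:
# start from master, then create (or reset) the issue branch and switch to it.
git checkout -q master
git checkout -q -B "issues/#101-turtle-trig-line-numbers"

current_branch=$(git rev-parse --abbrev-ref HEAD)
echo "Now on: $current_branch"
```

Quoting the branch name is not strictly required here, since `#` only starts a shell comment at the beginning of a word, but it avoids surprises when the name is copied into other contexts.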
That was my thinking as well. Much as I like adopting a well-documented standard, I feel gitflow is overkill for our situation.
I'm not convinced about their linear history arguments to be honest, but a lot of the rest of it I do like. Especially the notion that only master is permanent, and hotfixes are done by branching off from the relevant version tag (rather than keeping several parallel release branches up in the air all the time). Let me stew on this a bit and then propose an alternative workflow (which also takes your idea about doing both features and hotfixes on master into account).
Oh no, not at all, we have complete freedom in how we organize this.
Another solution, when we come to the stage where we are only maintaining a single release stream, may be to keep master as the next major.minor release, but allow the opening up of short-lived patch release branches for previous versions as necessary, and replace them with just a tag when the version is released (the tag doesn't show up in "git branch -a", so as not to confuse people). E.g., we could have master at 2.4.1-SNAPSHOT, but if we wanted to release a bug fix targeted to 2.3.4 we would check out the 2.3.3 tag as a new branch named "releases/2.3.4", bump the version on it to 2.3.4-SNAPSHOT, and work on that for a short time until its release, at which point the releases/2.3.4 branch is replaced by a tag "2.3.4" and deleted, after merging the relevant changes back into 2.4.1-SNAPSHOT on master. If 2.3.5 is necessary, then we check out the 2.3.4 tag and do a similar short-lived process on it. We are currently maintaining two major release streams, so it doesn't quite fit yet, as the current model simplifies the three-stream situation a little compared to that model. If we are able to regularly release new minor versions, the short-lived branches may reduce confusion, as almost everything would be done against master and there wouldn't be other long-lived branches floating around the main eclipse/rdf4j git repository. It should be possible to start regularly bumping minor versions again now that we are past the large SPARQL-1.1/RDF-1.1/Java-8 changes that hampered that for a little while.
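The branch-off-tag idea described above could be sketched roughly as follows. This is only an illustration in a throwaway repository: the empty commits stand in for real version bumps (which would edit the project's build files), `fix.txt` is a made-up file, and the user name and email are placeholders.

```shell
set -e

# Throwaway repository with a stand-in history (assumption for this demo).
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Committer"
git config user.email "committer@example.org"
git commit -q --allow-empty -m "Release 2.3.3"
git tag 2.3.3
git commit -q --allow-empty -m "Master moves on to 2.4.1-SNAPSHOT"

# 1. Open a short-lived branch from the previous release tag.
git checkout -q -b releases/2.3.4 2.3.3
git commit -q --allow-empty -m "Bump version to 2.3.4-SNAPSHOT"

# 2. Work on the fix, then release and tag.
echo "bug fix" > fix.txt            # hypothetical fix
git add fix.txt
git commit -q -m "Fix targeted at 2.3.4"
git commit -q --allow-empty -m "Release 2.3.4"
git tag 2.3.4

# 3. Merge the changes back into master, then delete the branch.
#    The tag remains as the permanent marker of the release.
git checkout -q master
git merge -q --no-ff --no-edit releases/2.3.4
git branch -q -d releases/2.3.4

git branch --list
git tag --list
```

In a real project the version bump commits touch the same lines on both branches, so the merge back into master would likely need a manual conflict resolution on the version number; the empty commits used here sidestep that.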
That could work quite well actually. The two major releases we are doing could be fitted in by thinking of the 1.0 branch as a "secondary master" branch: we occasionally merge release branches into it to do patch releases, but nothing else. I wasn't really planning on doing a 1.1 release by the way. For me, 1.0 is the end of the line of Java 7 support (barring patch releases in sync with 2.0 patches).
To be fair, the ideas above (branch-off-tag, then delete-after-tag-and-merge) aren't my own. I read them somewhere recently but forgot to comment here at the time, so I have lost the source.
Ok, so if I can summarize this we would have the following workflow:
I like this approach, but given that we will be using cherry-picking a lot more, it does mean that we should think about using merge squashing more. I'd hate to have to cherry-pick a fix with 30 individual commits. GitHub has the option for committers to squash a PR before merging, so we don't necessarily have to force contributors to do the squashing. In all of this the Java 7 backport is an anomaly. It just becomes a separate branch (permanent for as long as we decide to actually bring out backports) to which we regularly merge new fixes and keep things J7-compatible. I think it's certainly easier to explain this workflow. What do you think is the ideal time to implement it? Directly after the 2.0.1 release?
Sounds easier to me.
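To illustrate why squashing makes cherry-picking easier, here is a small sketch in a throwaway repository. The issue number `#999`, the file `fix.txt`, and the user details are made up for the example; the squash is done with plain `git merge --squash`, which is only an approximation of what GitHub's squash option does.

```shell
set -e

# Throwaway repository (assumption for this demo).
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Committer"
git config user.email "committer@example.org"
git commit -q --allow-empty -m "Common ancestor of master and the release branch"
git branch releases/2.0.x

# A fix developed as several small commits on an issue branch.
git checkout -q -b "issues/#999-example-fix"
echo "part 1" > fix.txt
git add fix.txt
git commit -q -m "#999 first part of the fix"
echo "part 2" >> fix.txt
git commit -q -am "#999 second part of the fix"

# Squash the whole branch into a single commit on master.
git checkout -q master
git merge -q --squash "issues/#999-example-fix"
git commit -q -m "#999 example fix (squashed)"
fix_commit=$(git rev-parse HEAD)

# Backport: one cherry-pick instead of one per original commit.
# -x records the original commit id in the new commit message.
git checkout -q releases/2.0.x
git cherry-pick -x "$fix_commit"
```

Without the squash, the backport would need one cherry-pick per commit on the issue branch (or a commit range), in the right order.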
One or two minor tweaks I'd like to suggest to the workflow:
Thoughts? Also keen to hear what you think of using 'squash and merge' instead of standard merge commits.
Modifying/correcting the Pull Request title is also much easier than a commit message.
@ansell pointed out some major downsides to using squash-and-merge (see comments on #325). The most important is probably that we lose sign-off information, which is a big requirement from Eclipse's point of view. This goes hand-in-hand with individual commits having issue numbers: if we don't squash, we should still require that. Of course, we can advise contributors to clean up/squash their commit history themselves before presenting a PR.
So the branching model seems to be acceptable to everyone. The reference to squashing I read as being an optional thing that the user would do locally which is okay by me. Jeen has now proposed using the inbuilt GitHub squash and merge feature that is separate to that and has a few minor side-effects that we would need to work through. Specifically:
And, 3. Should it be allowable for users to do squash and merge locally to preserve GPG signing (-S), and then not have the GitHub squash and merge used?
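For reference, a local squash that keeps the sign-off could look something like the sketch below. It uses `git reset --soft` followed by a single `git commit -s`; this is one possible way for a contributor to do it, not an agreed procedure. The issue number, file names, and user details are placeholders, and GPG signing is only mentioned in a comment because it needs a configured key.

```shell
set -e

# Throwaway repository (assumption for this demo).
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Contributor"
git config user.email "contributor@example.org"
git commit -q -s --allow-empty -m "Initial commit"

# Work-in-progress commits on an issue branch.
git checkout -q -b "issues/#999-example-fix"
echo "a" > a.txt
git add a.txt
git commit -q -s -m "#999 work in progress"
echo "b" > b.txt
git add b.txt
git commit -q -s -m "#999 more work"

# Squash locally: move the branch back to master while keeping all changes staged,
# then record them as one commit with a Signed-off-by line (-s).
# Adding -S as well would GPG-sign the commit, given a configured signing key.
git reset -q --soft master
git commit -q -s -m "#999 example fix"

squashed_message=$(git log -1 --format=%B)
echo "$squashed_message"
```

If the branch was already pushed, the rewritten history would have to be force-pushed to the contributor's fork before opening or updating the PR.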
Ad 1. They are, but only for as long as the third-party repo (from which the PR came) is accessible to us. It's fine if a user deletes a branch, but if they remove their forked repo completely, we lose that bit of the history.
The sign-off thing is a blocker for me. Unless someone has a clever solution to that I suggest we just stick with 'normal' merge commits, after all.
Got rid of the redundant link to the guidelines (GitHub shows it right at the top itself) and the tick boxes for formatting and tests, etc (nobody really uses them anyway it seems).
Issue #266: simplified Pull Request template
Issues/1396 standardise benchmarks
Our current git workflow is outlined in the contributing guidelines.
Summarized, it's this:
issues/#<issue>-<keywords>
releases/2.0.x
I have noticed that several people, in particular new developers, struggle with picking the correct branch to start fixing things and/or with picking the right target branch to merge their fixes back into.
Partly this could be caused by unclear explanation in our docs, or by inattentive reading on the part of the contributor, but I feel that perhaps our workflow itself is not as simple and straightforward as it could be.
In particular, I think that people struggle to remember step 5 (perhaps they simply forgot where their branch originated after a while), and this is not something that is trivially simple to see in most git tools.
So, open question: does anybody have any suggestions on how to improve our workflow and/or our workflow documentation? Any ideas to make it easier, both for new contributors and for existing contributors and full committers, are most welcome.