New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add commit message guideline document #138

Open
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
6 participants
@b-m-f
Copy link
Contributor

b-m-f commented Dec 6, 2018

No description provided.

@nottrobin

This comment has been minimized.

Copy link
Member

nottrobin commented Dec 6, 2018

Thanks @b-m-f, great work

Show resolved Hide resolved workflow/commit-messages.md Outdated

@b-m-f b-m-f force-pushed the b-m-f:master branch from 57f8d83 to 641bdf7 Dec 6, 2018

Show resolved Hide resolved workflow/commit-messages.md Outdated
Show resolved Hide resolved workflow/commit-messages.md Outdated
Show resolved Hide resolved workflow/commit-messages.md Outdated
Show resolved Hide resolved workflow/commit-messages.md Outdated

@b-m-f b-m-f force-pushed the b-m-f:master branch 2 times, most recently from 3e35d53 to c9107c2 Dec 6, 2018


## Breaking up large tickets

If your feature/ticket includes mutliple things that you have to complete in order to submit a pull request, these should all be seperate commit messages, according to the general guideline above. This will make it easy for the reviewer to understand what exactly has happened.

This comment has been minimized.

@squidsoup

squidsoup Dec 6, 2018

Contributor

these should all be seperate commit messages

Do you actually mean seperate commit messages (i.e. seperate commits), or a single commit with a multi-line commit message? The former can be tricky to achieve sometimes.

This comment has been minimized.

@bartaz

bartaz Dec 7, 2018

Contributor

As far as I understand the intention was for it to be separate commits (with proper messages).

I agree that it's usually tricky. We don't often have a perfect plan and errorless solution in mind that nicely maps in to a clean, logically split commit history. We implement a part of solution, commit it, maybe discover an error later, get some review comment. Keeping commit history clean with always valuable "feature" commits would require constant rebases with a sole purpose of having a clean history - history we will rarely look at.

If the PR itself is small enough (fixes one bug, adds one feature) it can theoretically work in history as one commit. Even if developed in couple of steps it's the final result that counts. When reviewing I rarely look at individual commits. It's the whole PR diff that I review.

While I agree that having meaningful commits with good commit messages is valuable, I wouldn't go out of the way cleaning up history, rebasing review comment fixes into past individual commits, etc.

This comment has been minimized.

@nottrobin

nottrobin Dec 7, 2018

Member

I am a bit concerned with this document that it will make people feel like they have to spend a lot of effort tidying commits, and I do agree that this is not worth spending a lot of time on.

Having said that, I do think this is a little blaze:

Keeping commit history clean with always valuable "feature" commits would require constant rebases with a sole purpose of having a clean history - history we will rarely look at.

Clean history is definitely useful. While it's true that the vast majority of history probably won't be inspected, it's also true that people very often do inspect history. I definitely do. And I also link to commits. I tell the community in bug reports e.g. "this feature was added in this commit", or I quickly skim the commits between two dates to see what happened. Or sometimes I cherry-pick commits to quickly grab functionality from other projects. Creating neat, atomic commits that describe individual features that are added is definitely not a waste of time.

So the middle ground I think it would be good to achieve with this document is to outline what would be the ideal way to create commits, but making it clear that if you find it difficult it is not a hard requirement. I never want PRs to be blocked because of the format of commit messages - @b-m-f please could you add something to the document to make this clear?

It would certainly be helpful if all our devs would gradually learn how to quickly and efficiently manage Git commits such that it's second nature and doesn't take them much time, but this upskilling will always be a gradual process.

Tricks for keeping commits tidy

Having said all that, personally, I do my best to rebase commits quite regularly. It's become so second nature that I don't think it slows me down in any significant way. A few simple tricks help.

  1. (I think everyone does this), frequently rebase (don't merge) your work on top of the latest version of the branch you're basing your work off (and ultimately merging into). Do this at every opportunity, and also ideally just before the PR is merged.
  2. Try to avoid using git commit --all (or git commit -a) - you miss an opportunity to consider how your work could be logically encapsulated into separate commits, and you might easily include work in the commit that you didn't intend. Instead do git commit {file1} {file2}, or git add {file1} {file2} followed by git commit.
  3. Try out the git add --patch or git add -p command. This allows you to interactively choose specific parts of the diff to add to the next commit. For example, if you did one thing at the top of the file and another at the bottom, that achieve two different things, you can stage just the top part for the first commit, and then the bottom part for the second.
  4. Make frequent use of git commit --amend --no-edit. If you do new work that could simply fit into the previous commit, just do git add {changed files} followed by git commit --amend --no-edit
  5. Make frequent use of interactive rebasing to reorder commits (@tbille showed me this trick, I didn't know it before). Since this is fairly involved, I'll explain it below:

Interactive rebasing

Let's say:

  1. You do some work in users.js and you commit it with "Improve user logic"
  2. You do some work in README.md and you commit it with "Explain user logic better in README"
  3. Now you realise you need another fix to your users.js work. You want it to be added to the "Improve user logic" commit, so do this:
$ git add users.js
$ git commit -m '... to be rebased'
$ git rebase -i HEAD^^^  # Interactively rebase the last 3 commits

This will open an editor with this contents:

pick 239d4c3 Improve user logic
pick adc7c21 Explain user logic better in README
pick fd4f81b ... to be rebased

Now if I reorder my commits to move line 3 to line 2, and replace "pick" with "squash" (or simply "s"), then it will "squash" the "... to be rebased" commit into the "Improve user logic" one:

pick 239d4c3 Improve user logic
squash fd4f81b ... to be rebased
pick adc7c21 Explain user logic better in README

Save and exit, and then it will open another editor to ask you what you want the new commit message for the new squashed commit to be:

# This is a combination of 2 commits.
# This is the 1st commit message:

Improve user logic

# This is the commit message #2:

... to be rebased

But in this case I probably just want to delete the second message. Save and exit, and voila:

$ git log
commit b87706ed594c983841857b51923e499988760d41 (HEAD -> master)
Author: Robin Winslow <robin@robinwinslow.co.uk>
Date:   Fri Dec 7 11:05:54 2018 +0000

    Explain user logic better in README

commit da1e9d47073cdc3bf86d3658b2e020c7e37292c0
Author: Robin Winslow <robin@robinwinslow.co.uk>
Date:   Fri Dec 7 11:05:40 2018 +0000

    Improve user logic

This process might seem complicated, but once you've done it a couple of times, you'll see that it actually doesn't take long. It probably takes me usually about 15-20 seconds.

There are of course some cases where commits can't be neatly re-ordered (although fewer than you'd think). In these cases, after you perform your interactive rebase, you'll get a conflict which you'll have to resolve in the normal way. You'll quickly get a feel for which commits can be reordered and which can't, and if you think it's going to be too much trouble, just don't bother starting. Or git rebase --abort.

This comment has been minimized.

@bartaz

bartaz Dec 7, 2018

Contributor

git rebase -i is great once you get used to it. I do use it quite regularly on a larger PRs to turn the long list of commits into something more managable (mostly by squash and fixup - as reordering is tricky as @nottrobin pointed out).

@nottrobin That is a great write down here. It would be good to turn it into a document itself or a part of one about git/commit/github process.

This comment has been minimized.

@nottrobin

nottrobin Dec 7, 2018

Member

Thanks @bartaz. I realise that at this point I've basically written a blog post. I should probably publish it =)

This comment has been minimized.

@squidsoup

squidsoup Dec 11, 2018

Contributor

@b-m-f I think the language here is still ambiguous.

This comment has been minimized.

@squidsoup

squidsoup Dec 11, 2018

Contributor

Clean history is definitely useful. While it's true that the vast majority of history probably won't be inspected, it's also true that people very often do inspect history. I definitely do. And I also link to commits. I tell the community in bug reports e.g. "this feature was added in this commit", or I quickly skim the commits between two dates to see what happened. Or sometimes I cherry-pick commits to quickly grab functionality from other projects. Creating neat, atomic commits that describe individual features that are added is definitely not a waste of time.

Absolutely, but I think we're conflating clean history on master with clean history in a PR, which is what @bartaz was talking about.

This comment has been minimized.

This comment has been minimized.

@nottrobin

nottrobin Dec 12, 2018

Member

Absolutely, but I think we're conflating clean history on master with clean history in a PR, which is what @bartaz was talking about.

For all our repositories on GitHub (except maybe build.snapcraft.io, which we should maybe look into), history in a PR becomes history in master, because we "merge" rather than "squash" (In GitHub, even if you "squash", by default each individual commit message is still included in the overall commit message of the squash commit). That is a basic assumption that I believe @b-m-f was working to, that @squidsoup you're right might not always apply.

@b-m-f it may be worth just making it clear that caring about the commit history in your PR is largely irrelevant if the repository you're working on squashes commits when PRs are merged. With a note that, in this scenario, you should try to be as detailed as possible in your squash commit.

This comment has been minimized.

@b-m-f

b-m-f Dec 13, 2018

Contributor

@squidsoup I have added another sentence for the squashing. Do you think this makes it less ambiguous?


## Rebasing

If possible, you should try to rebase changes that are requested during the Code Review phase into your previous commits.

This comment has been minimized.

@squidsoup

squidsoup Dec 6, 2018

Contributor

Both github and our landers on LP (MAAS, RBAC), support commit squashing on merge, what's the rationale for doing it manually?

This comment has been minimized.

@b-m-f

b-m-f Dec 7, 2018

Contributor

I was thinking that this would be the easiest way for the reviewers to recheck the code.

Just adding the commits on top and then rebasing on merge is of course another option.

The only problem that I see is loosing the messages for the single steps that were done for the PR in the overall history. Or is it possible to just squash certain commits together in the UIs?

This comment has been minimized.

@nottrobin

nottrobin Dec 7, 2018

Member

Although I'm not sure if there should be "a team standard" (there may be project standards), personally I much prefer the GitHub default of "Merge commit", which uses a standard "git merge" to merge a feature branch with master.

Using "squash" loses a significant amount of information about how the work had been split up. We are supposed to try to keep PRs as small as possible, but there are definitely mixed results with this, and even if you do this well there are cases where adding one whole piece of functionality requires changes in quite a number of places.

If someone has taken the time to separate out the creation of the "models.User" from "views.login" as 2 separate commits (as I'd hope they might), I don't want to lose that separation by squashing them together.

This does somewhat depend on how good we are at keeping our commits neat. Obviously I'd rather have 1 squashed commit for the whole PR than 30 commits called "fixed typo". But for my PRs, I'd certainly rather the history was preserved as I wrote it.

This comment has been minimized.

@bartaz

bartaz Dec 7, 2018

Contributor

If we are talking about GitHub 'Squash on merge' feature it simply squashes whole PR into one commit and rebases it on top of master (or whatever branch you are merging to). As far as I remember (we use it by default in build.snapcraft.io) it creates a commit message that lists all the messages of commits being squashed.

I believe we had a dev catch up discussion about merging vs rebasing vs squashing PRs and the consensus was to use merge by default not to loose commit history by squashing whole PR. Personally I don't think we loose much by squashing, although it may end up with huge commit diffs in large PRs.

This comment has been minimized.

@nottrobin

nottrobin Dec 7, 2018

Member

If we are talking about GitHub 'Squash on merge' feature it simply squashes whole PR into one commit and rebases it on top of master (or whatever branch you are merging to). As far as I remember (we use it by default in build.snapcraft.io) it creates a commit message that lists all the messages of commits being squashed.

Yes that's right. The lost information I was referring to is the information about which commit message pertains to which part of the diff, which to me seems pretty important (imagine if we disassociated all of git's commit messages from the diffs), although it does depend on whether the person filing the PR cared about keeping an accurate history.

I believe we had a dev catch up discussion about merging vs rebasing vs squashing PRs and the consensus was to use merge by default not to loose commit history by squashing whole PR.

Yes I remember that discussion. I agree my memory is that the preference of the team to use merge, which is also my preference. I'm not sure we made a very strong insistence though. We could consider this again, or we could continue with that standard.

This comment has been minimized.

@nottrobin

nottrobin Dec 10, 2018

Member

@squidsoup since this sentence isn't actually related to merging PRs, AFAICT it's only about keeping the commits in your feature branch tidy, are you happy with its current form? Or would you like to suggest an alternative?

This comment has been minimized.

@squidsoup

squidsoup Dec 11, 2018

Contributor

@squidsoup Or would you like to suggest an alternative?

@nottrobin I personally don't feel there's much to be gained by investing energy in making PRs with commits per feature in large branches, particularly when our workflow for two of our large projects, CRBS and MAAS squash commits on merge. History on master should be clean, but I would rather spend time on the code rather than crafting feature commits when those get lost anyway.

This probably only makes sense for projects that merge commit to trunk.

This comment has been minimized.

@squidsoup

squidsoup Dec 11, 2018

Contributor

If we are talking about GitHub 'Squash on merge' feature it simply squashes whole PR into one commit and rebases it on top of master (or whatever branch you are merging to). As far as I remember (we use it by default in build.snapcraft.io) it creates a commit message that lists all the messages of commits being squashed.

Github may do that, but LP does not. Manually writing a multiline commit message for large branches isn't difficult however.

Yes that's right. The lost information I was referring to is the information about which commit message pertains to which part of the diff,

I would suggest that if you have to break up your code review by commit, the author should be making three PRs and the branch is too large.

I believe we had a dev catch up discussion about merging vs rebasing vs squashing PRs and the consensus was to use merge by default not to loose commit history by squashing whole PR.

Why is intermediary history valuable, bearing in mind our workflow on LP? Personally speaking, most of my intermediary commits are junk, and little more than placemarkers.

This comment has been minimized.

@nottrobin

nottrobin Dec 13, 2018

Member

@squidsoup: I personally don't feel there's much to be gained by investing energy in making PRs with commits per feature in large branches, particularly when our workflow for two of our large projects, CRBS and MAAS squash commits on merge. History on master should be clean, but I would rather spend time on the code rather than crafting feature commits when those get lost anyway.

Why is intermediary history valuable, bearing in mind our workflow on LP? Personally speaking, most of my intermediary commits are junk, and little more than placemarkers.

I completely agree, you should make a call about what makes sense for working on CRBS and MAAS. For people working on our repos, this isn't "intermediary history", it's simply "history". But I can see that for people working on MAAS's repos it's different, and it is definitely a waste of time to worry about history that's going to be thrown away.

Firstly, in my opinion this document should only be considered advisory (as I've mentioned a few times) and so you or anyone else should be absolutely free to ignore any of its advice that doesn't work for you.

I assume MAAS and CRBS work this way because we don't control those codebases? So we can assume that we'd try to follow these guidelines for codebases we do control? @b-m-f it would be worth, I think, explicitly adding a concession for when we're working on codebases that we don't control.

I would suggest that if you have to break up your code review by commit, the author should be making three PRs and the branch is too large.

This I completely disagree with. One could consider that Git history should merely be a history of all the merged PRs, but this offers much less detail than I'm used to working with. PRs should be as small as possible, but at the very least a PR is likely to include several things - e.g. some code, an update to a README, and some tests. I think it's much more useful, if you can, to see PRs as encapsulating whole new pieces of functionality, and commits as describing individual changes.

This inevitably gives you much more detailed history & the commits will have much smaller diffs. If you've ever tried working through history where people have created huge commits I think you'll appreciate how frustrating it can be.

E.g. my PR might have:

102: Document user logic
101: Test user logic
100: Improve user logic

If I "merge" this PR into master, then my master history will look like:

103: Merge PR #56: New user logic (merge of commit 102 & commit 99)
- 102: Document user logic
- 101: Test user logic
- 100: Improve user logic
99: Merge PR #55: Add secret cryptocurrency miner

And in GitHub those commits will even tell you that they came from that PR.

If there's no other choice, simply one commit per PR is acceptable, but we can have more detail, I think that's always better.

This comment has been minimized.

@squidsoup

squidsoup Dec 13, 2018

Contributor

I assume MAAS and CRBS work this way because we don't control those codebases?

Right, and potentially out of our hands, that's just the way landings work for MAAS and CRBS (RBAC) and I think it's probably best if we try to conform to their workflows as much as possible. I think @b-m-f's suggestions are great, but we probably just need a caveat that they only apply to projects where we have ownership of landing and ci.

@barrymcgee

This comment has been minimized.

Copy link
Contributor

barrymcgee commented Dec 7, 2018

I'm of the opinion Git history is very important and should be treated as documentation.

I had this drummed into me by a previous CTO I worked under and have been a convert to this idea ever since. I can't explain it any better than he can himself:

Video: https://vimeo.com/172882423 (9 mins)
Slide breakdown: https://blog.mocoso.co.uk/talks/2015/01/12/telling-stories-through-your-commits/

Show resolved Hide resolved workflow/commit-messages.md Outdated

@b-m-f b-m-f force-pushed the b-m-f:master branch from c9107c2 to 2baa831 Dec 7, 2018

@bartaz

This comment has been minimized.

Copy link
Contributor

bartaz commented Dec 7, 2018

I'm of the opinion Git history is very important and should be treated as documentation.

It is important, and I fully agree that meaningful history is helpful. I use git blame quite often when trying to check when given change was introduced. Good commit message is first thing that gives some context. If commit allows to track down the PR it's even better (which is a GH feature for some time now).
And we should at least try to educate ourselves in this regard (thanks for links @barrymcgee).

I guess what we are discussing here is how much we want to enforce it.

  • Do we want to spend time on reviews actively checking history?
  • Do we want to block merging PRs if history is not 'clean' enough? - if so, what would be the rules for what is acceptable, what not?
@bartaz

This comment has been minimized.

Copy link
Contributor

bartaz commented Dec 7, 2018

One issue I have with unnecessary rebases is that it makes it harder to share given branch and update it locally.

Say I'm reviewing a branch and for that I pull it locally. Later it's rebased. I can't just pull it now as it will end up with duplicated conflicting changes. I need to reset and pull, or recreate the branch from scratch (unless there is some git magic command that does a 'pull after rebase').

@nottrobin

This comment has been minimized.

Copy link
Member

nottrobin commented Dec 7, 2018

@bartaz Yes, those are the salient questions. Personally, I feel quite strongly both that commits are important and people should care about them, and that we really shouldn't enforce them or block anything based on them.

But it seems this discussion is inexorably heading for another dev catch-up.

@bartaz: unless there is some git magic command that does a 'pull after rebase'

I'm having trouble reproducing the problem of not being able to git pull right now, but I think the command you're looking for may be git pull --force.

@bartaz

This comment has been minimized.

Copy link
Contributor

bartaz commented Dec 7, 2018

@nottrobin Ah right, haven't done pull of rebased branch for a while because I just remembered I had issues with that. It doesn't break local branch, git will just refuse to do the pull if the pulled branch cannot be merged.

Didn't know you can --force pull. I need to check how that works. Does this override local branch with pulled one or does it try to merge them?

@b-m-f b-m-f force-pushed the b-m-f:master branch from 2baa831 to 6f9c48a Dec 10, 2018

@b-m-f

This comment has been minimized.

Copy link
Contributor

b-m-f commented Dec 10, 2018

I have updated the doc. I hope it is now clearer that these should not be enforced rules, but rather a document to aid in finding a good way to write commit messages :)
Taken from the desired state we would like to see in our git logs, even if this is certainly not always possible.

Here is also some further reading on the topic.
By Tim Pope
From the git website
Rules in the Spring Framework

@b-m-f b-m-f force-pushed the b-m-f:master branch from 6f9c48a to 26c5ddc Dec 13, 2018


If your feature/ticket includes mutliple things that you have to complete in order to submit a pull request, these should all be seperate commit messages, according to the general guideline above. This will make it easy for the reviewer to understand what exactly has happened.
But make sure that this is a something that is done throughout your project. If another workflow is used, such as squashing all your commits together before you merge your branch into master this is not necessary. Just make sure that your commit message for the squash commit includes a detailed message of the changes.

This comment has been minimized.

@squidsoup

squidsoup Dec 13, 2018

Contributor

Might be good to mention here that it is fine to have multiline commits (in the sqashed commits workflow) ensuring all the changes are enumerated in the form:

Overall commit message.

  • Change 1
  • Change 2

This comment has been minimized.

@nottrobin

nottrobin Dec 13, 2018

Member

Yeah, I think multiline commits should be encouraged anyway, in all cases. Some people say that every commit should have a title and a description (I believe our alumnus @WillMoggridge may have that opinion), but that is probably going too far for our standard practices.

But yes, it might be reasonable to say more explicitly that squash commits should be multiline.

This comment has been minimized.

@b-m-f

b-m-f Dec 17, 2018

Contributor

What do you think of the content I have added @squidsoup @nottrobin . Should it be more explicit to use multiline commit messages?

@b-m-f b-m-f force-pushed the b-m-f:master branch from 26c5ddc to 18ae316 Dec 17, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment