The goal at which we’d like to end up, is a way to import commits send to zsh’s mailing lists with as little manual labour as possible. The workflow for this kind of endavour - using git’s pre-packaged means - would usually involve the use of “git am”. In your Mail-User-Agent (MUA) you would mark and save the mail message that contain commits you want to import to an `mbox’ folder - say “feature-x.mbox” - and run “git am” on that like so:
% git am feature-x.mbox
With zsh’s development style, that presents two issues:
- Zsh uses a traditional ChangeLog file to linearly track changes.
- Zsh also uses numbers from ezmlm’s (the mailing list software) X-Seq: header to make it easy to dig up mailing list discussions that belong to the individual change listed in ChangeLog file.
So in the end, the goal is to ideally have one command that does this:
- Look at the saved mbox file, prefix all Subject: headers with the corresponding X-Seq: number (extra sugar: detect if a commit mail was sent to zsh-users instead of zsh-workers and prefix the X-Seq: number with a “users/” string).
- Commit all messages from the mbox file via “git am” and amend the commits with automatic updates to the ChangeLog file. This should reflect the X-Seq: number, files that were touched by the commit, as well as the title of the commit message (the first line of the commit, that is usually put into the Subject: header of commit mails).
With zsh, all codebase changes (except for trivial ones) go through one of its mailing lists (usually zsh-workers@zsh.org, but sometimes zsh-users, as well). What is more is this: The use of numbers from the X-Seq: header, require developers to amend every commit message. That means that you are doing integration work all the time even with your own changes.
This is a workflow that works pretty well for myself and I think it is simple enough for others to adopt as well. This discusses not only the import of commits from the mailing-lists, but also how to get mails to the mailing-list that are properly formatted to be consumed by the usual git-tools, such as “git am”. That is why this introduction is a little lengthy.
The branch in which zsh’s development is going forward is “master”.
Branches in git are not scary at all. They are cheap to create, have lying around and easy to work with. I know that many people think “Why on earth would I be adding a new branch for this!?”: But let’s assume for the moment, that doing this will help in the end.
Let’s assume, we’ve got the master branch checked out and we’d like to work on “feature-x”. I usually prefix my branches with my initials, so I’d do this:
% git checkout -b ft/feature-x
Now I’d work, commit, work, commit, rework… whatever I need to do and git allows me to do. In the end, there will be one or more commits, that implement “feature-x”.
Git has a number of tools, that help mailing-list based development. In particular that would be “git format-patch” and “git send-email”. The former generates files, that are properly formatted for consumption by the latter as well as git’s other mail-related tools.
Just as a reminder, commit messages with git - by convention - look like this:
The first line should be short and to the point about the change in the commit The second line is to be left _EMPTY_! The rest may go into as much detail about the changes as the author sees fit. Information that could be included is: What changed? Why change it in the first place? Why change it in this way and not in another fashion? Maybe parts of mailing list discussions, if they are relevant.
This helper program generates mail messages from a set of commits.
Say we know, we have exactly three commits on our development branch. In that case we might call the program like this:
% git format-patch -3
“-3” tells it to create mails for the last three commits.
If you do not quite know how many commits you got, you can also tell format-patch to start at the point where you branched off (that would usually be the “master” branch) and tell it to stop whereever you are right now:
% git format-patch master..
That is all. The result will be a number of “*.patch” files, that you can send off to whereever they need to be send to.
If you are preparing larger patch-series, you might want to add a cover-mail, too. But that is beyond the scope of this document.
This section could be very very short. Because you can just feed the files from “format-patch” to “send-email” and be done with it. But there is another worthwhile feature we might as well look at.
Mails generated by “format-patch” always contain a line with three dashes, followed by a few lines of diff-stat information before the actual diff is inserted. This is somewhere in the mail’s body and looks something like this:
--- Src/Zle/zle_main.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
Now, since those “*.patch” files are just plain-text email messages, you might be tempted to edit them using your favourite editor. And you can.
The feature I’d like to highlight is, that any text you enter between the three-dashes-line and the diffstat will be discarded by “git am” (the tool that will ultimately import the commit for us later on). So you can use that space to add comments about the commit, that might be of interest for other people at the time, but does not deserve to be part of the actual commit message.
The actual call to send off the generated mails, looks like this (I told you it would have been a short section without that other feature):
% git send-email --to='zsh-workers@zsh.org' --suppress-cc=all *.patch
You can configure “send-email” so you have to supply less options, but the command line is still pretty short and zsh’s git completion will help you construct it with ease.
Once you can see the messages in your MUA, you can probably mark them in some way and save them to a local folder (preferably “mbox” format, that is what I tested this solution with; although the module I used supports a wide variety of formats).
If your MUA cannot do this: My condolences. ;)
This is actually the only section about the solution that is present with this little software package.
First lets move to the “master” branch again (you could also do the import in another integration branch and merge that into “master” later, but lets not over-do things):
% git checkout master
There are two steps that need to be taken (let’s again assume “feature-x.mbox” as our newly created mbox file). First, amend the Subject: lines of the mails to reflect the X-Seq: number:
% zsh-am-xseq2subject feature-x.mbox
This step needs to be taken exactly once.
And finally, import the changes and amend ChangeLog along the way appropriately:
% zsh-am-and-changelog feature.mbox
And that is it. Unless you get merge conflicts, in which case you need to do some manual labour after all.
Obviously, having to enter more than one command is unacceptable, so here’s a short-hand:
% zsh-am feature-x.mbox
You can pass as many mbox files as you like to the short-hand command.
One obvious step is to remove the mbox file. That is boring.
More interestingly, there’s still our development branch lying around. You can keep it, if you want to. But you can also just remove it, because your changes are now part of “master”, albeit in amended form since the ChangeLog file was changed and the commit title got the X-Seq: number stuck to its front.
So for the version control system, the changes are different. And that is finally, why it makes sense to code on separate branches for anything you send through the mailing lists: The changes are different, as far as git is concerned.
To remove the development branch just do this:
% git branch -D ft/feature-x
The workflow presented earlier uses explicitly created additional branches, because the author believes it helps to separate different changes from each other as well as from on-going development. This section might show reasons why that could indeed be beneficial.
That being said, note that all of this is entirely optional.
If you are new to git, you might be surprised to hear that you already have
your own branch already anyway: The master
branch.
With centralised systems, the initial situation is simpler, simply because the way the network is allowed to look like is a lot more limited. With centralised systems you have got exactly one remote system: The central repository.
With distributed systems like git, there could potentially be any number of remote systems (in git-lingo, such a system is called a “remote”). And that includes none at all. It could also mean 20 or 30 remotes. It does not really matter.
When you clone a git repository, git automatically adds one remote for you.
It calls this default remote origin
, because it points to where you got
the code from in the first place. If you’d like to see a list of remotes,
that are registered with your repository you can call “git remote show”:
% git remote show origin
You can add other remotes if you want to. For example, I have a mirror of zsh’s git repository set up on github, which I added like this:
% git remote add github git@github.com:ft/zsh.git
In a centralised-like workflow (which is used by most projects I am aware of) there is no need for that though. The point is that you could. And in more decoupled scenarios it does make sense.
Note, that you can also remove any remote, including origin
. There is
nothing special about it. If you remove it, you cannot get changes from the
remote repository anymore though. So do not do that. ;)
Since there can be any number of remote systems, git has to have a way to keep track of their changes: It keeps exact copies of the remote branches in the local repository. You might know that calling “git branch” lists all (local) branches in your repository. Its “-a” option will list all branches, including global ones:
% git branch -a * master remotes/github/master remotes/origin/#CVSPS.NO.BRANCH remotes/origin/HEAD -> origin/master remotes/origin/dot-zsh-3.1.5-pws-14 remotes/origin/dot-zsh-3.1.5-pws-17 remotes/origin/dot-zsh-3.1.5-pws-19 remotes/origin/master remotes/origin/zsh remotes/origin/zsh-3.1.5-pws-16-patches remotes/origin/zsh-4.0-patches remotes/origin/zsh-4.2-patches
That is actually a lot and it might confuse you. So, to make it clear: Most
of these were created during the CVS history import. The one that is
interesting (since it is the representation of the remote’s master
branch)
is remotes/origin/master
. There is also remotes/origin/HEAD
which
points to origin/master
: This means that the default branch of the remote
repository is master
.
Your local master
is branched off of the remote branch origin/master
.
It is as if you had done this manually:
% git checkout -b master origin/master
You did not have to do that, because git did it for you when you cloned the
repository. What you have to realise is this: Your local master
branch is
yours! The remote changes are kept in origin/master
.
How to get changes from a remote branch to a local one then? Or the other way round? “Pull! Push!” might be your reaction. Kind of. Let’s take a closer look.
The command used to get changes from a remote is actually “git fetch”. It
gets changes from a remote and updates the branches in its remote/*/*
namespace accordingly. It does not touch your local branches!
“git pull” performs a “git fetch”, but is also does something else: It
merges, too. You can configure what is merged to where. Per default, stuff
from origin/master
gets merged to master
.
If you do not have new changes in your master
branch, the merge is
trivial since history remains linear. If you do, the merge will
commence and also create a merge commit in the end: Your history is not
linear anymore.
If you want linear history, you can use “git pull –rebase”. “rebase”
means: Reset master
to origin/master
, to the trivial linear merge and
finally replay all local changes on top of that new master branch.
The thing with merging and rebasing is this: If you made changes to ChangeLog, you’ll get a guaranteed merge conflict. That is because of the nature of ChangeLog files. Changes are always made at the top, and thus will always trigger a conflict if something else changed this top as well (as changes in the remote repository would).
This is why I prefer to code on separate branches and only use my local
master
branch for integration: If something goes wrong, I can always
just reset my master
branch, pull changes from the remote and re-apply
the changes from my saved mailbox using `zsh-am’.
If there are merge conflicts (they will not be in ChangeLog, because
`zsh-am’ will always produce new entries on top of the current state), I
can always rebase my coding branch on top of master
and resolve any
merge conflicts there. Then I resend a new patch series and `zsh-am’ that
on top of master as soon as the mails return to me.
I believe that workflow to be more robust with the special needs of zsh’s development style with X-Seq: numbers and especially ChangeLog entries.
You can do all of this with just your local master
branch. But I think
it is substantially harder to get everything right in case of conflicts
doing that.
The answer is indeed “git push”. There is one caveat though: If the remote has new changes, it will not let you push. You would have to fetch, and merge or rebase (either explicitly or using “git pull” with or without “–rebase”) first and resolve any conflicts locally. This then involves all subtleties that were mentioned in the previous section.
After that, you can push to the remote indeed.
- One command to do all the work at once.
- Support for X-Seq: numbers.
- Support for mails to zsh-users (they get a “users/” prefix)
- Support for amending commits with ChangeLog entries
- Support reading mbox files from stdin
The `zsh-am’ script requires a POSIX shell as /bin/sh. It also requires an implementation of `mktemp(1)’ to be available for the “read mbox file from stdin” feature to work.
The other scripts are written in Perl. Standard modules such as POSIX are assumed to work. The mailbox handling is done by an extension module called Mail::Box which is available from CPAN and is packaged for popular linux distributions as well. For example on debian, the right package to install would be `libmail-box-perl’.
The package consists of four scripts:
- genchangelog: Generates the changelog entries.
- zsh-am-xseq2subject: Amends Subject lines with “<number>:” and “users/<number>:” prefixes based on the X-Seq: headers.
- zsh-am-and-changelog: Calls git-am and amends the ChangeLog along the way.
- zsh-am: Calls zsh-am-xseq2subject and zsh-am-and-changelog in succession for any number of mbox files.
The installation works like this:
# make install
The default installation prefix is “/usr/local”, so the scripts will end up in “/usr/local/bin”. If you’d prefer them to live in “~/bin”, do this:
% make install PREFIX="$HOME"
Similarly, the package may be uninstalled using:
# make uninstall
After installing, you have to move to your zsh git clone and call zsh-am with its “-init” option:
% cd ~/src/zsh % zsh-am -init
% cd ~/src/zsh % git checkout master % git checkout -b ft/zle-init-hooks [..hack..hack..hack..] % git format-patch master.. % git send-email --to='zsh-workers@zsh.org' --suppress-cc=all *.patch [..In MUA, mark mails and save them to "~/zle-init-hooks.mbox"..] % git checkout master % zsh-am ~/zle-init-hooks.mbox % gitk --all ;: check if everything in master looks right % git push % rm ~/zle-init-hooks.mbox % git branch -D ft/zle-init-hooks