Improve hg-git's author/committer parsing to match what git expects #230

I found out about this problem when converting Mozilla's hg repository to git.

When git parses an author/committer line, it looks for the first less-than character and expects it to start the email address. It then looks for the next greater-than character and expects everything after it to parse as a date.
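To make that rule concrete, here is a minimal Python sketch of the parsing described above. It is not git's actual code: `parse_ident_line` is a hypothetical helper, and the sample name and email address are made up.

```python
def parse_ident_line(line):
    """Split an author/committer line using the rule described above.

    Hypothetical helper for illustration only; git's real parser is
    written in C and handles more cases than this sketch.
    """
    # The first '<' is taken as the start of the email address.
    lt = line.find('<')
    if lt == -1:
        raise ValueError("no '<' found, so there is no email to parse")

    # The next '>' after it is taken as the end of the email address.
    gt = line.find('>', lt + 1)
    if gt == -1:
        raise ValueError("no '>' found after '<'")

    # Everything before '<' is the header keyword followed by the name.
    kind, _, name = line[:lt].partition(' ')
    email = line[lt + 1:gt]

    # Everything after '>' must look like "<timestamp> <timezone>".
    fields = line[gt + 1:].split()
    if len(fields) != 2 or not fields[0].isdigit():
        raise ValueError("expected '<timestamp> <timezone>' after '>'")

    return kind, name.strip(), email, int(fields[0]), fields[1]


# A well-formed line splits cleanly into its parts.
print(parse_ident_line("author Jane Doe <jane@example.com> 123456000 +0000"))
```

Under this rule, a line with no less-than character at all is rejected outright, and so is a line where anything other than a timestamp and a timezone follows the first greater-than character.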

hg-git's output doesn't completely match that expectation. For example, for this revision, the author is "", which makes hg-git generate an author line like this:

author none@none 123456000 +0000

Git fails to parse this line. Another example is this revision, where there is no space between the username and the email address, which produces an author line like this:

author Ms2ger 123456000 +0000

And you can see how that would confuse git!

With the fixes in this pull request, hg-git generates commit objects that git can actually parse. With these patches I managed to convert the entire hg history of mozilla-central to git.


ehsan@04a37b4 also fixes another instance of this problem, for commits like

Can you add some tests for this? I'm hesitant to pull this without corresponding tests. I'm also skeptical of the correctness of get_valid_git_username_email() - couldn't that fail if the username was something like " " or some other broken thing?
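As a starting point for such tests, here is a small sketch that copies the one-line body of get_valid_git_username_email() from the diff into a standalone function and checks a few inputs. The name `strip_outer_brackets` is hypothetical, and the sketch only exercises `str.lstrip`/`str.rstrip` behavior; it is not part of hg-git's test suite.

```python
def strip_outer_brackets(name):
    # Same body as get_valid_git_username_email() in the diff,
    # lifted out of the class so it can run on its own.
    return name.lstrip('<').rstrip('>')


# Brackets at the very ends are removed, however many there are.
assert strip_outer_brackets('<none@none>') == 'none@none'
assert strip_outer_brackets('<<x>>') == 'x'

# Brackets in the middle of the string are left alone.
assert strip_outer_brackets('a<b>c') == 'a<b>c'

# A whitespace-only name passes through unchanged.
assert strip_outer_brackets(' ') == ' '

# Surrounding whitespace shields the brackets from being stripped.
assert strip_outer_brackets(' <x> ') == ' <x> '
```

The last three cases suggest the concern is reasonable: only brackets at the extreme ends of the string are removed, so a value with interior brackets or padding would still carry a less-than or greater-than character into the author line.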

When you're ready, I'd greatly prefer patches mailed to the hg-git Google Group - it's easier for me to review and apply them there than on a pull request here.


Commits on Aug 11, 2011
  1. @ehsan
     Sanitize the author username and email address to make sure that git is not confused by brackets around those names
  2. @ehsan
Commits on Aug 15, 2011
  1. @ehsan
     Make the space between the username and email address in hg username parsing code optional, to handle cases like 'User<>'
Commits on Aug 23, 2011
  1. @ehsan
Showing with 7 additions and 4 deletions.
  1. +7 −4 hggit/
@@ -337,12 +337,15 @@ def export_hg_commit(self, rev):
+    def get_valid_git_username_email(self, name):
+        return name.lstrip('<').rstrip('>')
     def get_git_author(self, ctx):
         # hg authors might not have emails
         author = ctx.user()
         # check for git author pattern compliance
-        regex = re.compile('^(.*?) \<(.*?)\>(.*)$')
+        regex = re.compile('^(.*?) ?\<(.*?)\>?(.*)$')
         a = regex.match(author)
         if a:
@@ -350,11 +353,11 @@ def get_git_author(self, ctx):
             email =
             if len( > 0:
                 name += ' ext:(' + urllib.quote( + ')'
-            author = name + ' <' + email + '>'
+            author = self.get_valid_git_username_email(name) + ' <' + self.get_valid_git_username_email(email) + '>'
         elif '@' in author:
-            author = author + ' <' + author + '>'
+            author = self.get_valid_git_username_email(author) + ' <' + self.get_valid_git_username_email(author) + '>'
-            author = author + ' <none@none>'
+            author = self.get_valid_git_username_email(author) + ' <none@none>'
         if 'author' in ctx.extra():
             author = "".join(apply_delta(author, ctx.extra()['author']))
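
To see what the regex change does in practice, here is a quick check of the two patterns as they appear in the diff. It assumes Python 3's `re` module, uses raw strings so the backslashes reach the regex engine unchanged, and runs on made-up user strings. The helper `groups` is hypothetical.

```python
import re

# Patterns copied from the removed and added lines of the diff.
OLD = re.compile(r'^(.*?) \<(.*?)\>(.*)$')
NEW = re.compile(r'^(.*?) ?\<(.*?)\>?(.*)$')


def groups(pattern, user):
    """Return the captured (name, email, rest) groups, or None if no match."""
    m = pattern.match(user)
    return m.groups() if m else None


# Made-up hg usernames: no space before '<', a conventional
# "Name <email>" form, and a name with no brackets at all.
samples = ['User<>', 'Jane Doe <jane@example.com>', 'Jane Doe']

for user in samples:
    print(repr(user))
    print('  old:', groups(OLD, user))
    print('  new:', groups(NEW, user))
```

With the new pattern, 'User<>' now matches as ('User', '', ''), which is the case the change targets. One thing worth covering in tests: because `\>?` is optional and the second group is lazy, 'Jane Doe <jane@example.com>' matches as ('Jane Doe', '', 'jane@example.com>') under the new pattern, whereas the old pattern gives ('Jane Doe', 'jane@example.com', '').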
