Improve hg-git's author/committer parsing to match what git expects #230

Closed
wants to merge 4 commits into from

2 participants

Ehsan Akhgari Augie Fackler
Ehsan Akhgari

I found out about this problem when converting Mozilla's hg repository to git.

Git parses author/committer lines by looking for the first less-than character, which it expects to start the email address. It then looks for the next greater-than character and expects everything after it to be a date.
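As a rough illustration of the rule described above, here is a minimal Python sketch. It is not git's actual implementation; the function name `parse_ident` and the sample lines are made up for this example.

```python
def parse_ident(line):
    """Split an author/committer line using the first '<' and the next '>'.

    Illustrative only: mirrors the rule described above, not git's real code.
    """
    header, _, rest = line.partition(" ")
    lt = rest.find("<")
    if lt == -1:
        raise ValueError("no '<' found, cannot locate the email address")
    gt = rest.find(">", lt + 1)
    if gt == -1:
        raise ValueError("no '>' found after '<'")
    name = rest[:lt].strip()
    email = rest[lt + 1:gt]
    # Everything after the '>' is expected to be "<timestamp> <timezone>".
    date = rest[gt + 1:].strip()
    return header, name, email, date


good = parse_ident("author Jane Doe <jane@example.com> 123456000 +0000")
# A name that itself contains brackets shifts every field.
bad = parse_ident("author <a@b> <none@none> 123456000 +0000")
print(good)
print(bad)
```

In the second call the "date" field ends up holding `<none@none> 123456000 +0000`, which is the kind of value a parser expecting a timestamp cannot interpret.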

hg-git's output doesn't completely match that expectation. For example, for this revision hg.mozilla.org/mozilla-central/rev/a537a070dbf40081e1d32321924b6589b271574e, the author is "Ms2ger@gmail.com", which makes hg-git generate an author line like this:

author Ms2ger@gmail.com none@none 123456000 +0000

Which git fails to parse. Another example is this revision http://hg.mozilla.org/mozilla-central/rev/e88d2327e25d600ce326615f682db1d79d2bb10e, where there is no space between the username and the email, which creates an author line like this:

author Ms2ger 123456000 +0000

And you can see how that would confuse git!
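To make the missing-space case concrete, here is a small sketch that runs the pre-patch pattern from `get_git_author()` (the removed line in this pull request's diff) against two made-up usernames. The helper name `old_pattern_groups` is hypothetical.

```python
import re

# Pre-patch pattern: note the mandatory space before '<'.
OLD_PATTERN = re.compile(r'^(.*?) \<(.*?)\>(.*)$')


def old_pattern_groups(hg_user):
    """Return (name, email, trailing) if the old pattern matches, else None."""
    m = OLD_PATTERN.match(hg_user)
    return m.groups() if m else None


with_space = old_pattern_groups("User <user@somewhere.org>")
without_space = old_pattern_groups("User<user@somewhere.org>")
print(with_space)
print(without_space)
```

When the pattern does not match, the old code falls through to the `'@' in author` branch and wraps the whole string in a second pair of brackets, so the first less-than character no longer marks the start of the email address.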

With the fixes in this pull request, hg-git generates commit objects that git can actually parse. I managed to convert the entire hg history of mozilla-central to git with these patches.

Ehsan Akhgari

ehsan@04a37b4 also fixes another instance of this problem, for commits like https://hg.mozilla.org/mozilla-central/rev/e751acb410d0

Ehsan Akhgari ehsan closed this
Ehsan Akhgari ehsan reopened this
Augie Fackler
Collaborator

Can you add some tests for this? I'm hesitant to pull this without corresponding tests. I'm also skeptical of the correctness of get_valid_git_username_email() - couldn't that fail if the username was something like " foo@example.com " or some other broken thing?

When you're ready, I'd greatly prefer patches mailed to the hg-git Google Group - it's easier for me to review and apply them there than on a pull request here.

Thanks!
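The concern about `get_valid_git_username_email()` can be checked with a short sketch. The first function copies the expression from the diff as a free function. The second, `strip_angle_brackets`, is a hypothetical stricter variant that is not part of this pull request. The inputs are invented and assume the worry is about whitespace around a bracketed address.

```python
def get_valid_git_username_email(name):
    # Same expression as the method added in this pull request.
    return name.lstrip('<').rstrip('>')


def strip_angle_brackets(name):
    # Hypothetical alternative: drop every '<' and '>', then trim whitespace.
    return name.translate({ord('<'): None, ord('>'): None}).strip()


padded = " <foo@example.com> "

# Leading/trailing whitespace shields the brackets from lstrip/rstrip.
as_in_patch = get_valid_git_username_email(padded)
stricter = strip_angle_brackets(padded)

# Brackets in the middle of the string are also left alone by the patch.
interior = get_valid_git_username_email("a<b>c")

print(repr(as_in_patch))
print(repr(stricter))
print(repr(interior))
```

Cases like these would be natural candidates for the tests requested above.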

Augie Fackler durin42 closed this
Commits on Aug 11, 2011
  1. Ehsan Akhgari

    Sanitize the author username and email address to make sure that git is not confused by brackets around those names

    ehsan authored
  2. Ehsan Akhgari
Commits on Aug 15, 2011
  1. Ehsan Akhgari

    Make the space between the username and email address in hg username parsing code optional, to handle cases like 'User<user@somewhere.org>'

    ehsan authored
Commits on Aug 23, 2011
  1. Ehsan Akhgari
Showing with 7 additions and 4 deletions.
  1. +7 −4 hggit/git_handler.py
11 hggit/git_handler.py
@@ -337,12 +337,15 @@ def export_hg_commit(self, rev):
         self.swap_out_encoding(oldenc)
         return commit.id
+    def get_valid_git_username_email(self, name):
+        return name.lstrip('<').rstrip('>')
+
     def get_git_author(self, ctx):
         # hg authors might not have emails
         author = ctx.user()
         # check for git author pattern compliance
-        regex = re.compile('^(.*?) \<(.*?)\>(.*)$')
+        regex = re.compile('^(.*?) ?\<(.*?)\>?(.*)$')
         a = regex.match(author)
         if a:
@@ -350,11 +353,11 @@ def get_git_author(self, ctx):
             email = a.group(2)
             if len(a.group(3)) > 0:
                 name += ' ext:(' + urllib.quote(a.group(3)) + ')'
-            author = name + ' <' + email + '>'
+            author = self.get_valid_git_username_email(name) + ' <' + self.get_valid_git_username_email(email) + '>'
         elif '@' in author:
-            author = author + ' <' + author + '>'
+            author = self.get_valid_git_username_email(author) + ' <' + self.get_valid_git_username_email(author) + '>'
         else:
-            author = author + ' <none@none>'
+            author = self.get_valid_git_username_email(author) + ' <none@none>'
         if 'author' in ctx.extra():
             author = "".join(apply_delta(author, ctx.extra()['author']))