Improve hg-git's author/committer parsing to match what git expects #230

Closed
wants to merge 4 commits into from

2 participants

@ehsan

I found out about this problem when converting Mozilla's hg repository to git.

Git parses an author/committer line by looking for the first less-than character, which it expects to start the email address. It then looks for the next greater-than character and expects everything after that character to parse as a date.
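The parsing rule described above can be sketched in a few lines of Python. This is only a simplified model of that description, not git's actual implementation; the function name `split_ident` and the sample identities are made up for illustration.

```python
def split_ident(payload):
    """Split an author/committer payload using the rule described above.

    Hypothetical helper: first '<' starts the email, the next '>' ends it,
    and everything after that '>' is treated as the date.
    """
    lt = payload.find("<")
    if lt == -1:
        raise ValueError("no '<' found, cannot locate the email")
    gt = payload.find(">", lt + 1)
    if gt == -1:
        raise ValueError("no '>' found after '<'")
    name = payload[:lt].strip()
    email = payload[lt + 1:gt]
    date = payload[gt + 1:].strip()
    return name, email, date


# A well-formed line splits cleanly.
print(split_ident("A U Thor <author@example.com> 123456000 +0000"))

# Extra brackets leak into the email and the date fields.
print(split_ident("Name <<a@b>> 123456000 +0000"))
```

In the second call the date field comes back as `> 123456000 +0000`, which is the kind of value a date parser would reject.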

hg-git's output doesn't completely match that expectation. For example, for this revision hg.mozilla.org/mozilla-central/rev/a537a070dbf40081e1d32321924b6589b271574e, the author is "Ms2ger@gmail.com", which makes hg-git generate an author line like this:

author Ms2ger@gmail.com none@none 123456000 +0000

Which git fails to parse. Another example is this revision http://hg.mozilla.org/mozilla-central/rev/e88d2327e25d600ce326615f682db1d79d2bb10e, where there is no space between the username and the email, which creates an author line like this:

author Ms2ger 123456000 +0000

And you can see how that would confuse git!

With the fixes in this pull request, hg-git generates commit objects that git can actually parse. I managed to convert the entire hg history of mozilla-central to git with these patches.

@ehsan

ehsan@04a37b4 also fixes another instance of this problem, for commits like https://hg.mozilla.org/mozilla-central/rev/e751acb410d0

@ehsan ehsan closed this
@ehsan ehsan reopened this
@durin42
Collaborator

Can you add some tests for this? I'm hesitant to pull this without corresponding tests. I'm also skeptical of the correctness of get_valid_git_username_email() - couldn't that fail if the username was something like " foo@example.com " or some other broken thing?

When you're ready, I'd greatly prefer patches mailed to the hg-git Google Group - it's easier for me to review and apply them there than on a pull request here.

Thanks!
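The concern about get_valid_git_username_email() can be checked in isolation. The sketch below copies the helper's expression from the diff into a free function; the sample inputs are assumptions chosen for illustration (the reviewer's exact example may have lost characters on this page). `str.lstrip('<')` and `str.rstrip('>')` only remove characters at the very ends of the string, so surrounding whitespace or interior brackets pass through untouched.

```python
def get_valid_git_username_email(name):
    # Same expression as the helper in the diff, written as a free function.
    return name.lstrip('<').rstrip('>')


# Illustrative inputs, not taken from any real repository.
samples = [
    "<foo@example.com>",    # brackets at both ends
    " <foo@example.com> ",  # whitespace outside the brackets
    "foo <bar> baz",        # brackets in the middle
]

for s in samples:
    print(repr(s), "->", repr(get_valid_git_username_email(s)))
```

Only the first sample is cleaned; the other two are returned unchanged, brackets included.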

@durin42 durin42 closed this
Commits on Aug 11, 2011
  1. @ehsan

    Sanitize the author username and email address to make sure that git …

    ehsan authored
    …is not confused by brackets around those names
  2. @ehsan
Commits on Aug 15, 2011
  1. @ehsan

    Make the space between the username and email address in hg username …

    ehsan authored
    …parsing code optional, to handle cases like 'User<user@somewhere.org>'
Commits on Aug 23, 2011
  1. @ehsan
Showing with 7 additions and 4 deletions.
  1. +7 −4 hggit/git_handler.py
11 hggit/git_handler.py
@@ -337,12 +337,15 @@ def export_hg_commit(self, rev):
         self.swap_out_encoding(oldenc)
         return commit.id
+    def get_valid_git_username_email(self, name):
+        return name.lstrip('<').rstrip('>')
+
     def get_git_author(self, ctx):
         # hg authors might not have emails
         author = ctx.user()
         # check for git author pattern compliance
-        regex = re.compile('^(.*?) \<(.*?)\>(.*)$')
+        regex = re.compile('^(.*?) ?\<(.*?)\>?(.*)$')
         a = regex.match(author)
         if a:
@@ -350,11 +353,11 @@ def get_git_author(self, ctx):
             email = a.group(2)
             if len(a.group(3)) > 0:
                 name += ' ext:(' + urllib.quote(a.group(3)) + ')'
-            author = name + ' <' + email + '>'
+            author = self.get_valid_git_username_email(name) + ' <' + self.get_valid_git_username_email(email) + '>'
         elif '@' in author:
-            author = author + ' <' + author + '>'
+            author = self.get_valid_git_username_email(author) + ' <' + self.get_valid_git_username_email(author) + '>'
         else:
-            author = author + ' <none@none>'
+            author = self.get_valid_git_username_email(author) + ' <none@none>'
         if 'author' in ctx.extra():
             author = "".join(apply_delta(author, ctx.extra()['author']))
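To see what the regex change does in practice, the two patterns from the diff can be run side by side. This is a standalone sketch: the names `OLD`, `NEW` and `groups` and the sample usernames are assumptions for illustration, and the patterns are written as raw strings so the snippet runs on Python 3 (the patch itself targets Python 2).

```python
import re

# Pattern text copied from the removed and added lines of the diff.
OLD = re.compile(r'^(.*?) \<(.*?)\>(.*)$')
NEW = re.compile(r'^(.*?) ?\<(.*?)\>?(.*)$')


def groups(pattern, user):
    """Return the captured (name, email, rest) groups, or None if no match."""
    m = pattern.match(user)
    return m.groups() if m else None


# Illustrative hg usernames, not taken from any real repository.
for user in ["Some User <user@somewhere.org>",
             "User<user@somewhere.org>",
             "user@somewhere.org"]:
    print(repr(user))
    print("  old:", groups(OLD, user))
    print("  new:", groups(NEW, user))
```

Run in isolation, the new pattern does match 'User<user@somewhere.org>', which the old one rejects. However, because `(.*?)` is lazy and the closing `\>?` is optional, the email group comes back empty and the address lands in the third group, including for the well-formed 'Some User <user@somewhere.org>'. That behavior seems worth covering in the tests the reviewer asked for.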