I found out about this problem when converting Mozilla's hg repository to git.
Git parses the author/committer lines by looking for the first less-than character, which it expects to start the email address. It then looks for the next greater-than character and expects everything following that character to be a date.
hg-git's output doesn't always match that expectation. For example, for this revision hg.mozilla.org/mozilla-central/rev/a537a070dbf40081e1d32321924b6589b271574e, the author is "Ms2ger@gmail.com", which makes hg-git generate an author line like this:
author Ms2ger@gmail.com none@none 123456000 +0000
Git fails to parse that line. Another example is this revision http://hg.mozilla.org/mozilla-central/rev/e88d2327e25d600ce326615f682db1d79d2bb10e, where there is no space between the username and the email address, which produces an author line like this:
author Ms2ger<Ms2ger@gmail.com <Ms2gerMs2ger@gmail.com 123456000 +0000
And you can see how that would confuse git!
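To make the failure concrete, here is a minimal Python sketch of the parsing rule described above. It is not git's actual code (git is written in C); `parse_ident_line` is a hypothetical name, and the check on the date fields is a simplifying assumption.

```python
def parse_ident_line(line):
    """Split an 'author ...' or 'committer ...' line using the rule described
    above: the first '<' starts the email, the next '>' ends it, and
    everything after that '>' is treated as the date."""
    _header, _, rest = line.partition(" ")

    lt = rest.find("<")
    if lt == -1:
        raise ValueError("no '<' found, so the email address cannot be located")

    gt = rest.find(">", lt + 1)
    if gt == -1:
        raise ValueError("no '>' found after '<', so the date cannot be located")

    name = rest[:lt].strip()
    email = rest[lt + 1:gt]

    # Simplifying assumption: the date is '<unix timestamp> <timezone>'.
    date_fields = rest[gt + 1:].split()
    if len(date_fields) != 2 or not date_fields[0].isdigit():
        raise ValueError("text after '>' is not '<timestamp> <timezone>'")

    return name, email, int(date_fields[0]), date_fields[1]
```

A well-formed line such as `author Ms2ger <Ms2ger@gmail.com> 123456000 +0000` splits cleanly under this rule, while both author lines quoted above, exactly as shown here, raise `ValueError`.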
With the fixes in this pull request, hg-git generates commit objects that git can actually parse. With these patches I managed to convert the entire hg history of mozilla-central to git.
Sanitize the author username and email address to make sure that git …
…is not confused by brackets around those names
Make get_valid_git_username_email a proper method on the GitHandler o…
Make the space between the username and email address in hg username …
…parsing code optional, to handle cases like 'User<firstname.lastname@example.org>'
Treat the trailing bracket after the hg author name as optional.
This causes stuff like
https://hg.mozilla.org/mozilla-central/rev/e751acb410d0 to be parsed
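As a rough illustration of what the commits above aim for, here is a minimal Python sketch that splits an hg username with an optional space before the email and an optional trailing bracket, then strips stray brackets from both parts. This is not the code from the patches: `HG_USER_RE`, `split_hg_user` and `git_author_line` are hypothetical names, and the regular expression is only an assumption about one reasonable way to do it. The `none@none` fallback mirrors the placeholder seen in the author line quoted earlier.

```python
import re

# Name, optional whitespace, '<', email, then an optional closing '>'.
# Hypothetical pattern for illustration only.
HG_USER_RE = re.compile(r"^(?P<name>[^<]*?)\s*<(?P<email>[^<>]*)>?\s*$")


def _strip_brackets(text):
    """Remove every '<' and '>' and trim surrounding whitespace."""
    return text.replace("<", "").replace(">", "").strip()


def split_hg_user(user):
    """Return (name, email) with no angle brackets left in either part."""
    match = HG_USER_RE.match(user)
    if match:
        name, email = match.group("name"), match.group("email")
    else:
        # No recognizable email part: keep the whole string as the name.
        name, email = user, "none@none"
    return _strip_brackets(name), _strip_brackets(email)


def git_author_line(user, timestamp, tz="+0000"):
    """Build an author line with exactly one '<' and one '>'."""
    name, email = split_hg_user(user)
    return "author %s <%s> %d %s" % (name, email, timestamp, tz)
```

With this sketch, `split_hg_user("User<firstname.lastname@example.org>")` and `split_hg_user("User <firstname.lastname@example.org")` both return `("User", "firstname.lastname@example.org")`. An input with no brackets at all, such as `" email@example.com "`, falls back to the trimmed string as the name and `none@none` as the email.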
ehsan@04a37b4 also fixes another instance of this problem, for commits like https://hg.mozilla.org/mozilla-central/rev/e751acb410d0
Can you add some tests for this? I'm hesitant to pull this without corresponding tests. I'm also skeptical of the correctness of get_valid_git_username_email() - couldn't that fail if the username was something like " email@example.com " or some other broken thing?
When you're ready, I'd greatly prefer patches mailed to the hg-git Google Group - it's easier for me to review and apply them there than on a pull request here.