Follow git-check-ref-format rules for tags etc. #7

Closed
purcell opened this Issue Apr 19, 2011 · 6 comments

Comments

Projects
None yet
2 participants
@purcell
Owner

purcell commented Apr 19, 2011

Dots in darcs tag names are translated to underscores by darcs-to-git, but this isn't always necessary. darcs-to-git should use the rules described in "man git-check-ref-format" to decide which characters (if any) to make safe.

@purcell

This comment has been minimized.

Show comment
Hide comment
@purcell

purcell Jul 29, 2011

Owner

From Juliusz Chroboczek:

The syntax of tags in git is somewhat baroque; it is defined in
the manual page for git-check-ref-format. Periods, in particular,
are allowed in some positions but not others.

Replacing all periods with underlines is the safe thing to do, but it
breaks repositories that use tag names such as "babeld-1.1.2" and
derive the tarball's name from the tag name.

This patch simply leaves periods as they are, and hopes everything goes
well. A better solution would be to do something smart with periods
in the forbidden positions.

---
darcs-to-git |    2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/darcs-to-git b/darcs-to-git
index 7831072..fe068bc 100755
--- a/darcs-to-git
+++ b/darcs-to-git
@@ -284,7 +284,7 @@ class DarcsPatch
    self.name = patch_xml.get_elements('name').first.get_text.value.darcs_unescape rescue 'Unnamed patch'
    self.comment = patch_xml.get_elements('comment').first.get_text.value.darcs_unescape rescue nil
    if (self.is_tag = (self.name =~ /^TAG (.*)/))
-      self.git_tag_name = $1.gsub(/[\s:.*]+/, '_')
+      self.git_tag_name = $1.gsub(/[\s:*]+/, '_')
    end
    @git_author_name, @git_author_email = AUTHOR_MAP[author]
  end
-- 
1.7.5.4
Owner

purcell commented Jul 29, 2011

From Juliusz Chroboczek:

The syntax of tags in git is somewhat baroque; it is defined in
the manual page for git-check-ref-format. Periods, in particular,
are allowed in some positions but not others.

Replacing all periods with underlines is the safe thing to do, but it
breaks repositories that use tag names such as "babeld-1.1.2" and
derive the tarball's name from the tag name.

This patch simply leaves periods as they are, and hopes everything goes
well. A better solution would be to do something smart with periods
in the forbidden positions.

---
darcs-to-git |    2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/darcs-to-git b/darcs-to-git
index 7831072..fe068bc 100755
--- a/darcs-to-git
+++ b/darcs-to-git
@@ -284,7 +284,7 @@ class DarcsPatch
    self.name = patch_xml.get_elements('name').first.get_text.value.darcs_unescape rescue 'Unnamed patch'
    self.comment = patch_xml.get_elements('comment').first.get_text.value.darcs_unescape rescue nil
    if (self.is_tag = (self.name =~ /^TAG (.*)/))
-      self.git_tag_name = $1.gsub(/[\s:.*]+/, '_')
+      self.git_tag_name = $1.gsub(/[\s:*]+/, '_')
    end
    @git_author_name, @git_author_email = AUTHOR_MAP[author]
  end
-- 
1.7.5.4
@jech

This comment has been minimized.

Show comment
Hide comment
@jech

jech Jul 29, 2011

Contributor

Better version of this patch sent by e-mail.

-- jch

Contributor

jech commented Jul 29, 2011

Better version of this patch sent by e-mail.

-- jch

@jech

This comment has been minimized.

Show comment
Hide comment
@jech

jech Jul 29, 2011

Contributor

From 5d9d789ebda676673482f038f58264e7cd710baf Mon Sep 17 00:00:00 2001
From: Juliusz Chroboczek jch@pps.jussieu.fr
Date: Sat, 30 Jul 2011 00:31:07 +0200
Subject: [PATCH] Smarter handling of forbidden characters in git tag names.

The syntax of tags in git is somewhat baroque; it is defined in
the manual page for git-check-ref-format. Periods, in particular,
are allowed in some positions but not in others.

Replacing all periods with underlines is the safe thing to do, but it
breaks repositories that use tag names such as "babeld-1.1.2" and
derive the tarball's name from the tag name.

This code attempts to leave periods whenever possible, and only replace
them with underlines when absolutely necessary. It should handle all
of the cases documented in git-check-ref-format, with the exception of
slashes, which are replaced unconditionally.

---
 darcs-to-git |   17 ++++++++++++++++-
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/darcs-to-git b/darcs-to-git
index 7831072..318df57 100755
--- a/darcs-to-git
+++ b/darcs-to-git
@@ -284,7 +284,22 @@ class DarcsPatch
     self.name = patch_xml.get_elements('name').first.get_text.value.darcs_unescape rescue 'Unnamed patch'
     self.comment = patch_xml.get_elements('comment').first.get_text.value.darcs_unescape rescue nil
     if (self.is_tag = (self.name =~ /^TAG (.*)/))
-      self.git_tag_name = $1.gsub(/[\s:.*]+/, '_')
+      self.git_tag_name = $1.gsub(/[\s:*]+/, '_')
+      # git tag names cannot start or end with a period
+      if (self.git_tag_name[0] == ?.)
+        self.git_tag_name[0] = ?_
+      end
+      if (self.git_tag_name[self.git_tag_name.length - 1] == ?.)
+        self.git_tag_name[self.git_tag_name.length - 1] = ?_
+      end
+      # git tag names cannot contain two successive periods
+      while (i = self.git_tag_name.index('..'))
+        self.git_tag_name[i + 1] = ?_
+      end
+      # git tag names cannot contain #{
+      while (i = self.git_tag_name.index('#{'))
+        self.git_tag_name[i + 1] = ?_
+      end
     end
     @git_author_name, @git_author_email = AUTHOR_MAP[author]
   end
-- 
1.7.5.4
Contributor

jech commented Jul 29, 2011

From 5d9d789ebda676673482f038f58264e7cd710baf Mon Sep 17 00:00:00 2001
From: Juliusz Chroboczek jch@pps.jussieu.fr
Date: Sat, 30 Jul 2011 00:31:07 +0200
Subject: [PATCH] Smarter handling of forbidden characters in git tag names.

The syntax of tags in git is somewhat baroque; it is defined in
the manual page for git-check-ref-format. Periods, in particular,
are allowed in some positions but not in others.

Replacing all periods with underlines is the safe thing to do, but it
breaks repositories that use tag names such as "babeld-1.1.2" and
derive the tarball's name from the tag name.

This code attempts to leave periods whenever possible, and only replace
them with underlines when absolutely necessary. It should handle all
of the cases documented in git-check-ref-format, with the exception of
slashes, which are replaced unconditionally.

---
 darcs-to-git |   17 ++++++++++++++++-
 1 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/darcs-to-git b/darcs-to-git
index 7831072..318df57 100755
--- a/darcs-to-git
+++ b/darcs-to-git
@@ -284,7 +284,22 @@ class DarcsPatch
     self.name = patch_xml.get_elements('name').first.get_text.value.darcs_unescape rescue 'Unnamed patch'
     self.comment = patch_xml.get_elements('comment').first.get_text.value.darcs_unescape rescue nil
     if (self.is_tag = (self.name =~ /^TAG (.*)/))
-      self.git_tag_name = $1.gsub(/[\s:.*]+/, '_')
+      self.git_tag_name = $1.gsub(/[\s:*]+/, '_')
+      # git tag names cannot start or end with a period
+      if (self.git_tag_name[0] == ?.)
+        self.git_tag_name[0] = ?_
+      end
+      if (self.git_tag_name[self.git_tag_name.length - 1] == ?.)
+        self.git_tag_name[self.git_tag_name.length - 1] = ?_
+      end
+      # git tag names cannot contain two successive periods
+      while (i = self.git_tag_name.index('..'))
+        self.git_tag_name[i + 1] = ?_
+      end
+      # git tag names cannot contain #{
+      while (i = self.git_tag_name.index('#{'))
+        self.git_tag_name[i + 1] = ?_
+      end
     end
     @git_author_name, @git_author_email = AUTHOR_MAP[author]
   end
-- 
1.7.5.4

purcell added a commit that referenced this issue Jul 30, 2011

Follow git-check-ref-format rules for tags (see #7)
Signed-off-by: Steve Purcell <steve@sanityinc.com>
@purcell

This comment has been minimized.

Show comment
Hide comment
@purcell

purcell Jul 30, 2011

Owner

Hi Juliusz - thanks! I committed that, and then I followed up with a further commit which converted the logic into a single regexp.

-Steve

Owner

purcell commented Jul 30, 2011

Hi Juliusz - thanks! I committed that, and then I followed up with a further commit which converted the logic into a single regexp.

-Steve

@purcell purcell closed this Jul 30, 2011

@jech

This comment has been minimized.

Show comment
Hide comment
@jech

jech Jul 30, 2011

Contributor

Hi Juliusz - thanks! I committed that,

Hmm... I see it on github, but not on sanityinc. Which is the right
repo to be tracking?

and then I followed up with a further commit which converted the logic
into a single regexp.

Completely incomprehensible for me. Ah, those youngsters with their new
shiny toys ;-)

Are you sure it's doing the right thing?

"foo".gsub(/(?:[\s:*]+|.{2,}|#{|^.|.$)/, '_')
=> "_oo"

-- Juliusz

Contributor

jech commented Jul 30, 2011

Hi Juliusz - thanks! I committed that,

Hmm... I see it on github, but not on sanityinc. Which is the right
repo to be tracking?

and then I followed up with a further commit which converted the logic
into a single regexp.

Completely incomprehensible for me. Ah, those youngsters with their new
shiny toys ;-)

Are you sure it's doing the right thing?

"foo".gsub(/(?:[\s:*]+|.{2,}|#{|^.|.$)/, '_')
=> "_oo"

-- Juliusz

@purcell

This comment has been minimized.

Show comment
Hide comment
@purcell

purcell Jul 30, 2011

Owner

On 30 Jul 2011, at 14:30, jech wrote:

Hi Juliusz - thanks! I committed that,

Hmm... I see it on github, but not on sanityinc. Which is the right
repo to be tracking?

Either. I usually push to both simultaneously. Forgot this time, but it's on sanitinc too now.

and then I followed up with a further commit which converted the logic
into a single regexp.

Completely incomprehensible for me. Ah, those youngsters with their new
shiny toys ;-)

:-)

Are you sure it's doing the right thing?

"foo".gsub(/(?:[\s:*]+|.{2,}|#{|^.|.$)/, '_')
=> "_oo"

Good catch! Fixed now.

Cheers,

-Steve

Owner

purcell commented Jul 30, 2011

On 30 Jul 2011, at 14:30, jech wrote:

Hi Juliusz - thanks! I committed that,

Hmm... I see it on github, but not on sanityinc. Which is the right
repo to be tracking?

Either. I usually push to both simultaneously. Forgot this time, but it's on sanitinc too now.

and then I followed up with a further commit which converted the logic
into a single regexp.

Completely incomprehensible for me. Ah, those youngsters with their new
shiny toys ;-)

:-)

Are you sure it's doing the right thing?

"foo".gsub(/(?:[\s:*]+|.{2,}|#{|^.|.$)/, '_')
=> "_oo"

Good catch! Fixed now.

Cheers,

-Steve

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment