Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make git rebase -r's label generation more resilient #327

Closed
wants to merge 2 commits into from

Conversation

dscho
Copy link
Member

@dscho dscho commented Sep 2, 2019

Those labels must be valid ref names, and therefore valid file names. The initial patch came in via Git for Windows.

Change since v1:

  • moved the entire sanitizing logic to label_oid(), as a preparatory step.

Cc: Doan Tran Cong Danh congdanhqx@gmail.com

@dscho
Copy link
Member Author

dscho commented Sep 2, 2019

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Sep 2, 2019

Submitted as pull.327.git.gitgitgadget@gmail.com

@dscho
Copy link
Member Author

dscho commented Sep 2, 2019

(See also git-for-windows#2315)

sequencer.c Show resolved Hide resolved
@dscho dscho added the needs-work These patches have pending issues that need to be resolved before submitting label Sep 11, 2019
dscho and others added 2 commits November 17, 2019 22:53
One of the trickier aspects of the design of `git rebase
--rebase-merges` is the way labels are generated for the initial todo
list: those labels are supposed to be intuitive and first and foremost
unique.

To that end, `label_oid()` appends a unique suffix when necessary.

Those labels not only need to be unique, but they also need to be valid
refs. To make sure of that, `make_script_with_merges()` replaces
whitespace by dashes.

That would appear to be the wrong layer for that sanitizing step,
though: all callers of `label_oid()` should get that same benefit.

Even if it does not make a difference currently (the only called of
`label_oid()` that passes a label that might need to be sanitized _is_
`make_script_with_merges()`), let's move the responsibility for
sanitizing labels into the `label_oid()` function.

This commit is best viewed with `-w` because it unfortunately needs to
change the indentation of a large block of code in `label_oid()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The `label` todo command in interactive rebases creates temporary refs
in the `refs/rewritten/` namespace. These refs are stored as loose refs,
i.e. as files in `.git/refs/rewritten/`, therefore they have to conform
with file name limitations on the current filesystem in addition to the
accepted ref format.

This poses a problem in particular on NTFS/FAT, where e.g. the colon,
double-quote and pipe characters are disallowed as part of a file name.

Let's safeguard against this by replacing not only white-space
characters by dashes, but all non-alpha-numeric ones.

However, we exempt non-ASCII UTF-8 characters from that, as it should be
quite possible to reflect branch names such as `↯↯↯` in refs/file names.

Signed-off-by: Matthew Rogers <mattr94@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho removed the needs-work These patches have pending issues that need to be resolved before submitting label Nov 17, 2019
@dscho
Copy link
Member Author

dscho commented Nov 17, 2019

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 17, 2019

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 18, 2019

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> Those labels must be valid ref names, and therefore valid file names. The
> initial patch came in via Git for Windows.
>
> Change since v1:
>
>  * moved the entire sanitizing logic to label_oid(), as a preparatory step.
>
> Johannes Schindelin (1):
>   rebase-merges: move labels' whitespace mangling into `label_oid()`
>
> Matthew Rogers (1):
>   rebase -r: let `label` generate safer labels
>
>  sequencer.c              | 72 +++++++++++++++++++++++++---------------
>  t/t3430-rebase-merges.sh |  6 ++++
>  2 files changed, 51 insertions(+), 27 deletions(-)

I think Dscho meant to Cc you as these two patches are meant to be a
more complete solution to supersede your [*1*].


[Reference]

*1* <860dee65f49ea7eacf5a0c7c8ffe59095a51b1ce.1573205699.git.congdanhqx@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 18, 2019

This branch is now known as js/rebase-r-safer-label.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 18, 2019

This patch series was integrated into pu via git@9ac4901.

@gitgitgadget gitgitgadget bot added the pu label Nov 18, 2019
@gitgitgadget
Copy link

gitgitgadget bot commented Nov 18, 2019

On the Git mailing list, Danh Doan wrote (reply to this):

On 2019-11-18 12:42:41+0900, Junio C Hamano <gitster@pobox.com> wrote:
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
> 
> > Those labels must be valid ref names, and therefore valid file names. The
> > initial patch came in via Git for Windows.
> >
> > Change since v1:
> >
> >  * moved the entire sanitizing logic to label_oid(), as a preparatory step.
> >
> > Johannes Schindelin (1):
> >   rebase-merges: move labels' whitespace mangling into `label_oid()`
> >
> > Matthew Rogers (1):
> >   rebase -r: let `label` generate safer labels
> >
> >  sequencer.c              | 72 +++++++++++++++++++++++++---------------
> >  t/t3430-rebase-merges.sh |  6 ++++
> >  2 files changed, 51 insertions(+), 27 deletions(-)
> 
> I think Dscho meant to Cc you as these two patches are meant to be a
> more complete solution to supersede your [*1*].

I've applied their patches over my branches,
my introduced test_expect_failure was flipped.
That's nice.

Anyway, when reading their patch, I discovered a problem with
git rebase --rebase-merges when its message is onto.

Here is a patch to fix it.

I applied Dscho's patches over my dd/sequencer-utf8 then writing this
patch, in case you have problem applying it.

The context for the diff is coming from Dscho's patches.

-------8<--------------------
From 48205889b404b82baa4b30c2eedd52243c349e3e Mon Sep 17 00:00:00 2001
From: Doan Tran Cong Danh <congdanhqx@gmail.com>
Date: Mon, 18 Nov 2019 18:02:05 +0700
Subject: [PATCH] sequencer: handle rebase-merge for "onto" message

In order to work correctly, git-rebase --rebase-merges needs to make
initial todo list with unique labels.

Those unique labels is being handled by employing a hashmap and
suffixing an unique number if any duplicate is found.

But we forgat that beside of those labels for side branches,
we also make a special label `onto' for our so-called new-base.

In a special case that any of those labels for side branches named
`onto', git will run into trouble.

Correct it.

Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
---
 sequencer.c              |  5 +++++
 t/t3430-rebase-merges.sh | 21 +++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index 350045b1b4..fc81e43f0f 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -4569,10 +4569,15 @@ static int make_script_with_merges(struct pretty_print_context *pp,
 	strbuf_init(&state.buf, 32);
 
 	if (revs->cmdline.nr && (revs->cmdline.rev[0].flags & BOTTOM)) {
+		struct labels_entry *onto_label_entry;
 		struct object_id *oid = &revs->cmdline.rev[0].item->oid;
 		FLEX_ALLOC_STR(entry, string, "onto");
 		oidcpy(&entry->entry.oid, oid);
 		oidmap_put(&state.commit2label, entry);
+
+		FLEX_ALLOC_STR(onto_label_entry, label, "onto");
+		hashmap_entry_init(&onto_label_entry->entry, strihash("onto"));
+		hashmap_add(&state.labels, &onto_label_entry->entry);
 	}
 
 	/*
diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index f728aba995..4e2c0ede51 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -474,4 +474,25 @@ test_expect_success '--rebase-merges with commit that can generate bad character
 	git rebase --rebase-merges --force-rebase E
 '
 
+test_expect_success '--rebase-merges with message matched with onto label' '
+	git checkout -b onto-label E &&
+	git merge -m onto G &&
+	git rebase --rebase-merges --force-rebase E &&
+	test_cmp_graph <<-\EOF
+	*   onto
+	|\
+	| * G
+	| * F
+	* |   E
+	|\ \
+	| * | B
+	* | | D
+	| |/
+	|/|
+	* | C
+	|/
+	* A
+	EOF
+'
+
 test_done
-- 
2.24.0.10.gc61c3b979f.dirty

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 18, 2019

On the Git mailing list, Doan Tran Cong Danh wrote (reply to this):

In order to work correctly, git-rebase --rebase-merges needs to make
initial todo list with unique labels.

Those unique labels is being handled by employing a hashmap and
appending an unique number if any duplicate is found.

But, we forget that beside those labels for side branches,
we also have a special label `onto' for our so-called new-base.

In a special case that any of those labels for side branches named
`onto', git will run into trouble.

Correct it.

Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
---
Sorry for the noise, I forgot to check spelling for v1

And I forgot to delete From line when append to my MUA.

Range-diff against v1:
1:  48205889b4 ! 1:  9246beacf2 sequencer: handle rebase-merge for "onto" message
    @@ Metadata
     Author: Doan Tran Cong Danh <congdanhqx@gmail.com>
     
      ## Commit message ##
    -    sequencer: handle rebase-merge for "onto" message
    +    sequencer: handle rebase-merges for "onto" message
     
         In order to work correctly, git-rebase --rebase-merges needs to make
         initial todo list with unique labels.
     
         Those unique labels is being handled by employing a hashmap and
    -    suffixing an unique number if any duplicate is found.
    +    appending an unique number if any duplicate is found.
     
    -    But we forgat that beside of those labels for side branches,
    -    we also make a special label `onto' for our so-called new-base.
    +    But, we forget that beside those labels for side branches,
    +    we also have a special label `onto' for our so-called new-base.
     
         In a special case that any of those labels for side branches named
         `onto', git will run into trouble.

 sequencer.c              |  5 +++++
 t/t3430-rebase-merges.sh | 21 +++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index 350045b1b4..fc81e43f0f 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -4569,10 +4569,15 @@ static int make_script_with_merges(struct pretty_print_context *pp,
 	strbuf_init(&state.buf, 32);
 
 	if (revs->cmdline.nr && (revs->cmdline.rev[0].flags & BOTTOM)) {
+		struct labels_entry *onto_label_entry;
 		struct object_id *oid = &revs->cmdline.rev[0].item->oid;
 		FLEX_ALLOC_STR(entry, string, "onto");
 		oidcpy(&entry->entry.oid, oid);
 		oidmap_put(&state.commit2label, entry);
+
+		FLEX_ALLOC_STR(onto_label_entry, label, "onto");
+		hashmap_entry_init(&onto_label_entry->entry, strihash("onto"));
+		hashmap_add(&state.labels, &onto_label_entry->entry);
 	}
 
 	/*
diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
index f728aba995..4e2c0ede51 100755
--- a/t/t3430-rebase-merges.sh
+++ b/t/t3430-rebase-merges.sh
@@ -474,4 +474,25 @@ test_expect_success '--rebase-merges with commit that can generate bad character
 	git rebase --rebase-merges --force-rebase E
 '
 
+test_expect_success '--rebase-merges with message matched with onto label' '
+	git checkout -b onto-label E &&
+	git merge -m onto G &&
+	git rebase --rebase-merges --force-rebase E &&
+	test_cmp_graph <<-\EOF
+	*   onto
+	|\
+	| * G
+	| * F
+	* |   E
+	|\ \
+	| * | B
+	* | | D
+	| |/
+	|/|
+	* | C
+	|/
+	* A
+	EOF
+'
+
 test_done
-- 
2.24.0.10.gc61c3b979f.dirty

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 18, 2019

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi,

On Mon, 18 Nov 2019, Doan Tran Cong Danh wrote:

> In order to work correctly, git-rebase --rebase-merges needs to make
> initial todo list with unique labels.
>
> Those unique labels is being handled by employing a hashmap and
> appending an unique number if any duplicate is found.
>
> But, we forget that beside those labels for side branches,
> we also have a special label `onto' for our so-called new-base.
>
> In a special case that any of those labels for side branches named
> `onto', git will run into trouble.
>
> Correct it.
>
> Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
> ---

Looks obviously correct to me. ACK!

Thank you,
Dscho

> Sorry for the noise, I forgot to check spelling for v1
>
> And I forgot to delete From line when append to my MUA.
>
> Range-diff against v1:
> 1:  48205889b4 ! 1:  9246beacf2 sequencer: handle rebase-merge for "onto=
" message
>     @@ Metadata
>      Author: Doan Tran Cong Danh <congdanhqx@gmail.com>
>
>       ## Commit message ##
>     -    sequencer: handle rebase-merge for "onto" message
>     +    sequencer: handle rebase-merges for "onto" message
>
>          In order to work correctly, git-rebase --rebase-merges needs to=
 make
>          initial todo list with unique labels.
>
>          Those unique labels is being handled by employing a hashmap and
>     -    suffixing an unique number if any duplicate is found.
>     +    appending an unique number if any duplicate is found.
>
>     -    But we forgat that beside of those labels for side branches,
>     -    we also make a special label `onto' for our so-called new-base.
>     +    But, we forget that beside those labels for side branches,
>     +    we also have a special label `onto' for our so-called new-base.
>
>          In a special case that any of those labels for side branches na=
med
>          `onto', git will run into trouble.
>
>  sequencer.c              |  5 +++++
>  t/t3430-rebase-merges.sh | 21 +++++++++++++++++++++
>  2 files changed, 26 insertions(+)
>
> diff --git a/sequencer.c b/sequencer.c
> index 350045b1b4..fc81e43f0f 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4569,10 +4569,15 @@ static int make_script_with_merges(struct pretty=
_print_context *pp,
>  	strbuf_init(&state.buf, 32);
>
>  	if (revs->cmdline.nr && (revs->cmdline.rev[0].flags & BOTTOM)) {
> +		struct labels_entry *onto_label_entry;
>  		struct object_id *oid =3D &revs->cmdline.rev[0].item->oid;
>  		FLEX_ALLOC_STR(entry, string, "onto");
>  		oidcpy(&entry->entry.oid, oid);
>  		oidmap_put(&state.commit2label, entry);
> +
> +		FLEX_ALLOC_STR(onto_label_entry, label, "onto");
> +		hashmap_entry_init(&onto_label_entry->entry, strihash("onto"));
> +		hashmap_add(&state.labels, &onto_label_entry->entry);
>  	}
>
>  	/*
> diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
> index f728aba995..4e2c0ede51 100755
> --- a/t/t3430-rebase-merges.sh
> +++ b/t/t3430-rebase-merges.sh
> @@ -474,4 +474,25 @@ test_expect_success '--rebase-merges with commit th=
at can generate bad character
>  	git rebase --rebase-merges --force-rebase E
>  '
>
> +test_expect_success '--rebase-merges with message matched with onto lab=
el' '
> +	git checkout -b onto-label E &&
> +	git merge -m onto G &&
> +	git rebase --rebase-merges --force-rebase E &&
> +	test_cmp_graph <<-\EOF
> +	*   onto
> +	|\
> +	| * G
> +	| * F
> +	* |   E
> +	|\ \
> +	| * | B
> +	* | | D
> +	| |/
> +	|/|
> +	* | C
> +	|/
> +	* A
> +	EOF
> +'
> +
>  test_done
> --
> 2.24.0.10.gc61c3b979f.dirty
>
>

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 18, 2019

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Junio,

On Mon, 18 Nov 2019, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > Those labels must be valid ref names, and therefore valid file names. =
The
> > initial patch came in via Git for Windows.
> >
> > Change since v1:
> >
> >  * moved the entire sanitizing logic to label_oid(), as a preparatory =
step.
> >
> > Johannes Schindelin (1):
> >   rebase-merges: move labels' whitespace mangling into `label_oid()`
> >
> > Matthew Rogers (1):
> >   rebase -r: let `label` generate safer labels
> >
> >  sequencer.c              | 72 +++++++++++++++++++++++++--------------=
-
> >  t/t3430-rebase-merges.sh |  6 ++++
> >  2 files changed, 51 insertions(+), 27 deletions(-)
>
> I think Dscho meant to Cc you as these two patches are meant to be a
> more complete solution to supersede your [*1*].

Whoops, totally forgot. Thanks!
Dscho

>
>
> [Reference]
>
> *1* <860dee65f49ea7eacf5a0c7c8ffe59095a51b1ce.1573205699.git.congdanhqx@=
gmail.com>
>

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 19, 2019

This patch series was integrated into pu via git@da8e83b.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 19, 2019

This patch series was integrated into pu via git@eff2a14.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 20, 2019

This patch series was integrated into pu via git@3ff83f1.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 20, 2019

This patch series was integrated into pu via git@bb7e150.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 21, 2019

On the Git mailing list, Junio C Hamano wrote (reply to this):

Danh Doan <congdanhqx@gmail.com> writes:

> Anyway, when reading their patch, I discovered a problem with
> git rebase --rebase-merges when its message is onto.
>
> Here is a patch to fix it.
>
> I applied Dscho's patches over my dd/sequencer-utf8 then writing this
> patch, in case you have problem applying it.
>
> The context for the diff is coming from Dscho's patches.

Thanks.  While technically this is independent from the "safer
rebase-merges labels" topic (specifically its preparation step),
in the larger picture, this too is to ensure we do not use a wrong
string as a label ;-), so I'll queue it on top of those two patches,
just like how you developed.

Thanks.

> -------8<--------------------
> From 48205889b404b82baa4b30c2eedd52243c349e3e Mon Sep 17 00:00:00 2001
> From: Doan Tran Cong Danh <congdanhqx@gmail.com>
> Date: Mon, 18 Nov 2019 18:02:05 +0700
> Subject: [PATCH] sequencer: handle rebase-merge for "onto" message
>
> In order to work correctly, git-rebase --rebase-merges needs to make
> initial todo list with unique labels.
>
> Those unique labels is being handled by employing a hashmap and
> suffixing an unique number if any duplicate is found.
>
> But we forgat that beside of those labels for side branches,
> we also make a special label `onto' for our so-called new-base.
>
> In a special case that any of those labels for side branches named
> `onto', git will run into trouble.
>
> Correct it.
>
> Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
> ---
>  sequencer.c              |  5 +++++
>  t/t3430-rebase-merges.sh | 21 +++++++++++++++++++++
>  2 files changed, 26 insertions(+)
>
> diff --git a/sequencer.c b/sequencer.c
> index 350045b1b4..fc81e43f0f 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4569,10 +4569,15 @@ static int make_script_with_merges(struct pretty_print_context *pp,
>  	strbuf_init(&state.buf, 32);
>  
>  	if (revs->cmdline.nr && (revs->cmdline.rev[0].flags & BOTTOM)) {
> +		struct labels_entry *onto_label_entry;
>  		struct object_id *oid = &revs->cmdline.rev[0].item->oid;
>  		FLEX_ALLOC_STR(entry, string, "onto");
>  		oidcpy(&entry->entry.oid, oid);
>  		oidmap_put(&state.commit2label, entry);
> +
> +		FLEX_ALLOC_STR(onto_label_entry, label, "onto");
> +		hashmap_entry_init(&onto_label_entry->entry, strihash("onto"));
> +		hashmap_add(&state.labels, &onto_label_entry->entry);
>  	}
>  
>  	/*
> diff --git a/t/t3430-rebase-merges.sh b/t/t3430-rebase-merges.sh
> index f728aba995..4e2c0ede51 100755
> --- a/t/t3430-rebase-merges.sh
> +++ b/t/t3430-rebase-merges.sh
> @@ -474,4 +474,25 @@ test_expect_success '--rebase-merges with commit that can generate bad character
>  	git rebase --rebase-merges --force-rebase E
>  '
>  
> +test_expect_success '--rebase-merges with message matched with onto label' '
> +	git checkout -b onto-label E &&
> +	git merge -m onto G &&
> +	git rebase --rebase-merges --force-rebase E &&
> +	test_cmp_graph <<-\EOF
> +	*   onto
> +	|\
> +	| * G
> +	| * F
> +	* |   E
> +	|\ \
> +	| * | B
> +	* | | D
> +	| |/
> +	|/|
> +	* | C
> +	|/
> +	* A
> +	EOF
> +'
> +
>  test_done

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 21, 2019

This patch series was integrated into pu via git@107077a.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 21, 2019

This patch series was integrated into pu via git@94b3cc1.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 21, 2019

This patch series was integrated into next via git@791d51b.

@gitgitgadget gitgitgadget bot added the next label Nov 21, 2019
@gitgitgadget
Copy link

gitgitgadget bot commented Nov 22, 2019

This patch series was integrated into pu via git@089c065.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 25, 2019

This patch series was integrated into pu via git@81c0274.

@gitgitgadget
Copy link

gitgitgadget bot commented Nov 27, 2019

This patch series was integrated into pu via git@cee86dc.

@gitgitgadget
Copy link

gitgitgadget bot commented Dec 2, 2019

This patch series was integrated into pu via git@8c396f1.

@gitgitgadget
Copy link

gitgitgadget bot commented Dec 3, 2019

This patch series was integrated into pu via git@f65c70b.

@gitgitgadget
Copy link

gitgitgadget bot commented Dec 5, 2019

This patch series was integrated into pu via git@917d0d6.

@gitgitgadget
Copy link

gitgitgadget bot commented Dec 5, 2019

This patch series was integrated into next via git@917d0d6.

@gitgitgadget
Copy link

gitgitgadget bot commented Dec 5, 2019

This patch series was integrated into master via git@917d0d6.

@gitgitgadget gitgitgadget bot added the master label Dec 5, 2019
@gitgitgadget gitgitgadget bot closed this Dec 5, 2019
@gitgitgadget
Copy link

gitgitgadget bot commented Dec 5, 2019

Closed via 917d0d6.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants