range-diff: show submodule changes irrespective of diff.submodule #1244

Closed

phil-blain

@phil-blain phil-blain commented May 29, 2022

Changes since v1:

  • added a comparison without '--creation-factor' to the test, as suggested by
    Ævar
  • remove separate 'git add sub' invocations in favor of 'git commit -m msg
    sub', as suggested by Dscho
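
As a side note on the second bullet: `git commit -m <msg> <path>` records the current content of an already-tracked path without a separate `git add`. A minimal sketch, assuming a throwaway repository and a regular file named `file` standing in for the submodule path `sub` (the repository, identity and file name are made up for illustration and are not part of the patch):

```shell
# Scratch repository with a throwaway identity (illustrative values).
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

# First commit: the path has to be tracked before 'git commit <path>' works.
echo one >"$repo/file"
git -C "$repo" add file
git -C "$repo" commit -q -m "first"

# Second commit: no separate 'git add'; naming the path commits its
# current working-tree content in a single invocation.
echo two >"$repo/file"
git -C "$repo" commit -q -m "second" file

commit_count=$(git -C "$repo" rev-list --count HEAD)
committed=$(git -C "$repo" show HEAD:file)
echo "commits: $commit_count, committed content: $committed"
```

In the test itself the path is a submodule, in which case the same form records the commit currently checked out in `sub`.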

CC: Johannes Schindelin Johannes.Schindelin@gmx.de
cc: Ævar Arnfjörð Bjarmason avarab@gmail.com

@gitgitgadget

gitgitgadget bot commented May 29, 2022

There are issues in commit b0fbf76:
range-diff: show submodule changes irrespective of diff.submodule
Commit not signed off
Lines in the body of the commit messages should be wrapped between 60 and 76 characters.

@gitgitgadget

gitgitgadget bot commented May 29, 2022

There are issues in commit 8731728:
range-diff: show submodule changes irrespective of diff.submodule
Lines in the body of the commit messages should be wrapped between 60 and 76 characters.

@phil-blain
Author

waiting for gitgitgadget/gitgitgadget#993

@phil-blain
Author

/submit

@gitgitgadget

gitgitgadget bot commented May 30, 2022

Submitted as pull.1244.git.1653916145441.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1244/phil-blain/range-diff-submodule-diff-v1

To fetch this version to local tag pr-1244/phil-blain/range-diff-submodule-diff-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1244/phil-blain/range-diff-submodule-diff-v1

@gitgitgadget

gitgitgadget bot commented May 30, 2022

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):

On Mon, May 30 2022, Philippe Blain via GitGitGadget wrote:

> From: Philippe Blain <levraiphilippeblain@gmail.com>
>
> After generating diffs for each range to be compared using a 'git log'
> invocation, range-diff.c::read_patches looks for the "diff --git" header
> in those diffs to recognize the beginning of a new change.
>
> In a project with submodules, and with 'diff.submodule=log' set in the
> config, this header is missing for the diff of a changed submodule, so
> any submodule changes are quietly ignored in the range-diff.
>
> When 'diff.submodule=diff' is set in the config, the "diff --git" header
> is also missing for the submodule itself, but is shown for submodule
> content changes, which can easily confuse 'git range-diff' and lead to
> errors such as:
>
>     error: git apply: bad git-diff - inconsistent old filename on line 1
>     error: could not parse git header 'diff --git path/to/submodule/and/some/file/within
>     '
>     error: could not parse log for '@{u}..@{1}'
>
> Force the submodule diff format to its default ("short") when invoking
> 'git log' to generate the patches for each range, such that submodule
> changes are always shown.
>
> Note that the test must use '--creation-factor=100' to force the second
> commit in the range not to be considered a complete rewrite.
>
> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
> ---
>     range-diff: show submodule changes irrespective of diff.submodule
>     
>     This fixes a bug that I reported last summer [1].
>     
>     [1]
>     https://lore.kernel.org/git/e469038c-d78c-cd4b-0214-7094746b9281@gmail.com/
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1244%2Fphil-blain%2Frange-diff-submodule-diff-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1244/phil-blain/range-diff-submodule-diff-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1244
>
>  range-diff.c          |  2 +-
>  t/t3206-range-diff.sh | 44 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+), 1 deletion(-)

Thanks for picking this up again, and nice to have a test on this
iteration!

> diff --git a/range-diff.c b/range-diff.c
> index b72eb9fdbee..068bf214544 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -44,7 +44,7 @@ static int read_patches(const char *range, struct string_list *list,
>  
>  	strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
>  		     "--reverse", "--date-order", "--decorate=no",
> -		     "--no-prefix",
> +		     "--no-prefix", "--submodule=short",
>  		     /*
>  		      * Choose indicators that are not used anywhere
>  		      * else in diffs, but still look reasonable
> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index e30bc48a290..ac848c42536 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -772,4 +772,48 @@ test_expect_success '--left-only/--right-only' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'submodule changes are shown irrespective of diff.submodule' '
> +	git init sub-repo &&
> +	test_commit -C sub-repo sub-first &&
> +	sub_oid1=$(git -C sub-repo rev-parse HEAD) &&
> +	test_commit -C sub-repo sub-second &&
> +	sub_oid2=$(git -C sub-repo rev-parse HEAD) &&
> +	test_commit -C sub-repo sub-third &&
> +	sub_oid3=$(git -C sub-repo rev-parse HEAD) &&
> +
> +	git checkout -b main-sub topic &&
> +	git submodule add ./sub-repo sub &&
> +	git -C sub checkout --detach sub-first &&
> +	git add sub &&
> +	git commit -m "add sub" &&
> +	sup_oid1=$(git rev-parse --short HEAD) &&
> +	git checkout -b topic-sub &&
> +	git -C sub checkout sub-second &&
> +	git add sub &&
> +	git commit -m "change sub" &&
> +	sup_oid2=$(git rev-parse --short HEAD) &&
> +	git checkout -b modified-sub main-sub &&
> +	git -C sub checkout sub-third &&
> +	git add sub &&
> +	git commit -m "change sub" &&
> +	sup_oid3=$(git rev-parse --short HEAD) &&
> +
> +	test_config diff.submodule log &&
> +	git range-diff --creation-factor=100 topic topic-sub modified-sub >actual &&
> +	cat >expect <<-EOF &&
> +	1:  $sup_oid1 = 1:  $sup_oid1 add sub
> +	2:  $sup_oid2 ! 2:  $sup_oid3 change sub
> +	    @@ Commit message
> +	      ## sub ##
> +	     @@
> +	     -Subproject commit $sub_oid1
> +	    -+Subproject commit $sub_oid2
> +	    ++Subproject commit $sub_oid3
> +	EOF
> +	test_cmp expect actual &&
> +	test_config diff.submodule diff &&
> +	git range-diff --creation-factor=100 topic topic-sub modified-sub >actual &&
> +	test_cmp expect actual
> +'
> +

I'd find this much easier to follow if this were a two-part where we do
most of this test code in the 1st commit, and assert the current
(failing) behavior with a test_expect_failure.

Then this commit would narrowly be the bugfix itself.

I also see that the --creation-factor=100 isn't necessary and seems
somewhat orthogonal, i.e. we'd like to test this *without* that option
and see how we behave, i.e. we'll emit the "full replacement".

Why not compare the output without --creation-factor=100, and then just
have another --creation-factor=100 test to show what we emit if we "look
into" those commits and diff their contents?
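
The two comparisons suggested above could be sketched roughly as follows. This is only an illustration, not the test from the patch: it assumes a scratch repository whose commits carry a bare gitlink entry at `sub` with made-up commit IDs (no real submodule is cloned), and the helper `commit_gitlink` is hypothetical.

```shell
# Scratch repository with a throwaway identity (illustrative values).
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

# Made-up submodule commit IDs; a gitlink entry does not require the
# referenced commit to exist in the superproject.
oid1=1111111111111111111111111111111111111111
oid2=2222222222222222222222222222222222222222
oid3=3333333333333333333333333333333333333333

commit_gitlink () {
	# usage: commit_gitlink <gitlink-oid> <message> [<parent>]
	# Prints the ID of a new commit whose tree has 'sub' at <gitlink-oid>.
	git -C "$repo" update-index --add --cacheinfo "160000,$1,sub" &&
	tree=$(git -C "$repo" write-tree) &&
	if test -n "$3"
	then
		git -C "$repo" commit-tree -p "$3" -m "$2" "$tree"
	else
		git -C "$repo" commit-tree -m "$2" "$tree"
	fi
}

base=$(commit_gitlink "$oid1" "add sub")
topic=$(commit_gitlink "$oid2" "change sub" "$base")
modified=$(commit_gitlink "$oid3" "change sub" "$base")

# Comparison 1: default creation factor.
default_out=$(git -C "$repo" range-diff "$base" "$topic" "$modified")

# Comparison 2: --creation-factor=100, so the two "change sub" commits
# are paired and their patches are diffed against each other.
forced_out=$(git -C "$repo" range-diff --creation-factor=100 \
	"$base" "$topic" "$modified")

printf 'default:\n%s\n\n' "$default_out"
printf 'creation-factor=100:\n%s\n' "$forced_out"
```

Going by the discussion in this thread, the first invocation is expected to report the commit as a full replacement, while the second pairs the two commits with a `!` marker and shows the changed `Subproject commit` line.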

@gitgitgadget

gitgitgadget bot commented May 30, 2022

User Ævar Arnfjörð Bjarmason <avarab@gmail.com> has been added to the cc: list.

@gitgitgadget

gitgitgadget bot commented May 31, 2022

On the Git mailing list, Philippe Blain wrote (reply to this):

Hi Ævar,

On 2022-05-30 at 09:46, Ævar Arnfjörð Bjarmason wrote:
> 
> On Mon, May 30 2022, Philippe Blain via GitGitGadget wrote:
> 
>> From: Philippe Blain <levraiphilippeblain@gmail.com>
>>
>> After generating diffs for each range to be compared using a 'git log'
>> invocation, range-diff.c::read_patches looks for the "diff --git" header
>> in those diffs to recognize the beginning of a new change.
>>
>> In a project with submodules, and with 'diff.submodule=log' set in the
>> config, this header is missing for the diff of a changed submodule, so
>> any submodule changes are quietly ignored in the range-diff.
>>
>> When 'diff.submodule=diff' is set in the config, the "diff --git" header
>> is also missing for the submodule itself, but is shown for submodule
>> content changes, which can easily confuse 'git range-diff' and lead to
>> errors such as:
>>
>>     error: git apply: bad git-diff - inconsistent old filename on line 1
>>     error: could not parse git header 'diff --git path/to/submodule/and/some/file/within
>>     '
>>     error: could not parse log for '@{u}..@{1}'
>>
>> Force the submodule diff format to its default ("short") when invoking
>> 'git log' to generate the patches for each range, such that submodule
>> changes are always shown.
>>
>> Note that the test must use '--creation-factor=100' to force the second
>> commit in the range not to be considered a complete rewrite.
>>
>> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
>> ---
>>     range-diff: show submodule changes irrespective of diff.submodule
>>     
>>     This fixes a bug that I reported last summer [1].
>>     
>>     [1]
>>     https://lore.kernel.org/git/e469038c-d78c-cd4b-0214-7094746b9281@gmail.com/
>>
>> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1244%2Fphil-blain%2Frange-diff-submodule-diff-v1
>> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1244/phil-blain/range-diff-submodule-diff-v1
>> Pull-Request: https://github.com/gitgitgadget/git/pull/1244
>>
>>  range-diff.c          |  2 +-
>>  t/t3206-range-diff.sh | 44 +++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 45 insertions(+), 1 deletion(-)
> 
> Thanks for picking this up again, and nice to have a test on this
> iteration!

Well, what I sent last summer was really just a bug report with an "I think
this should fix it" diff, in case anybody wanted to pick it up.
It was not sent as a patch; I would not send a patch fixing a bug
without corresponding tests ;)

> 
>> diff --git a/range-diff.c b/range-diff.c
>> index b72eb9fdbee..068bf214544 100644
>> --- a/range-diff.c
>> +++ b/range-diff.c
>> @@ -44,7 +44,7 @@ static int read_patches(const char *range, struct string_list *list,
>>  
>>  	strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
>>  		     "--reverse", "--date-order", "--decorate=no",
>> -		     "--no-prefix",
>> +		     "--no-prefix", "--submodule=short",
>>  		     /*
>>  		      * Choose indicators that are not used anywhere
>>  		      * else in diffs, but still look reasonable
>> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
>> index e30bc48a290..ac848c42536 100755
>> --- a/t/t3206-range-diff.sh
>> +++ b/t/t3206-range-diff.sh
>> @@ -772,4 +772,48 @@ test_expect_success '--left-only/--right-only' '
>>  	test_cmp expect actual
>>  '
>>  
>> +test_expect_success 'submodule changes are shown irrespective of diff.submodule' '
>> +	git init sub-repo &&
>> +	test_commit -C sub-repo sub-first &&
>> +	sub_oid1=$(git -C sub-repo rev-parse HEAD) &&
>> +	test_commit -C sub-repo sub-second &&
>> +	sub_oid2=$(git -C sub-repo rev-parse HEAD) &&
>> +	test_commit -C sub-repo sub-third &&
>> +	sub_oid3=$(git -C sub-repo rev-parse HEAD) &&
>> +
>> +	git checkout -b main-sub topic &&
>> +	git submodule add ./sub-repo sub &&
>> +	git -C sub checkout --detach sub-first &&
>> +	git add sub &&
>> +	git commit -m "add sub" &&
>> +	sup_oid1=$(git rev-parse --short HEAD) &&
>> +	git checkout -b topic-sub &&
>> +	git -C sub checkout sub-second &&
>> +	git add sub &&
>> +	git commit -m "change sub" &&
>> +	sup_oid2=$(git rev-parse --short HEAD) &&
>> +	git checkout -b modified-sub main-sub &&
>> +	git -C sub checkout sub-third &&
>> +	git add sub &&
>> +	git commit -m "change sub" &&
>> +	sup_oid3=$(git rev-parse --short HEAD) &&
>> +
>> +	test_config diff.submodule log &&
>> +	git range-diff --creation-factor=100 topic topic-sub modified-sub >actual &&
>> +	cat >expect <<-EOF &&
>> +	1:  $sup_oid1 = 1:  $sup_oid1 add sub
>> +	2:  $sup_oid2 ! 2:  $sup_oid3 change sub
>> +	    @@ Commit message
>> +	      ## sub ##
>> +	     @@
>> +	     -Subproject commit $sub_oid1
>> +	    -+Subproject commit $sub_oid2
>> +	    ++Subproject commit $sub_oid3
>> +	EOF
>> +	test_cmp expect actual &&
>> +	test_config diff.submodule diff &&
>> +	git range-diff --creation-factor=100 topic topic-sub modified-sub >actual &&
>> +	test_cmp expect actual
>> +'
>> +
> 
> I'd find this much easier to follow if this were a two-part where we do
> most of this test code in the 1st commit, and assert the current
> (failing) behavior with a test_expect_failure.
> 
> Then this commit would narrowly be the bugfix itself.

I agree that I would prefer if it were the norm to do it this way in this 
project. This way you can check that the test you are adding actually fails
without the fix, not only that it passes with the fix. This is what I usually
do locally before sending.

However Junio has expressed his view about that approach before [1] and he prefers
to see regression / bug fixes as a single commit with code and tests. So that's why
I sent this as a single commit.

> 
> I also see that the --creation-factor=100 isn't necessary and seems
> somewhat orthagonal, i.e. we'd like to test this *without* that option
> and see how we behave, i.e. we'll emit the "full replacement".
> 
> Why not compare the output without --creation-factor=100, and then just
> have another --creation-factor=100 test to show what we emit if we "look
> into" those commits and diff their contents?
> 

OK, I think this makes sense. It's true that even without '--creation-factor=100',
with '-c diff.submodule=log' we end up showing the second commit as an exact match
in both ranges, which is wrong. I'll add that to the test.

Thanks for taking a look,

Philippe.


[1] https://lore.kernel.org/git/37DD13D4-FBE4-4DB7-85F5-824E850BA9AE@gmail.com/

@gitgitgadget

gitgitgadget bot commented May 31, 2022

User Philippe Blain <levraiphilippeblain@gmail.com> has been added to the cc: list.

@gitgitgadget

gitgitgadget bot commented Jun 2, 2022

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Philippe,

On Mon, 30 May 2022, Philippe Blain via GitGitGadget wrote:

> From: Philippe Blain <levraiphilippeblain@gmail.com>
>
> After generating diffs for each range to be compared using a 'git log'
> invocation, range-diff.c::read_patches looks for the "diff --git" header
> in those diffs to recognize the beginning of a new change.
>
> In a project with submodules, and with 'diff.submodule=log' set in the
> config, this header is missing for the diff of a changed submodule, so
> any submodule changes are quietly ignored in the range-diff.

This means that we can go two ways here: either we explicitly disable
`diff.submodule` for the invocation that is spawned from `range-diff`, or
we allow it but then handle the diff header as expected.

>
> When 'diff.submodule=diff' is set in the config, the "diff --git" header
> is also missing for the submodule itself, but is shown for submodule
> content changes, which can easily confuse 'git range-diff' and lead to
> errors such as:
>
>     error: git apply: bad git-diff - inconsistent old filename on line 1
>     error: could not parse git header 'diff --git path/to/submodule/and/some/file/within
>     '
>     error: could not parse log for '@{u}..@{1}'
>
> Force the submodule diff format to its default ("short") when invoking
> 'git log' to generate the patches for each range, such that submodule
> changes are always shown.

Full disclosure: I do not see much value in range-diffs in the presence of
submodules. Nothing in the design of range-diffs is prepared for
submodules.

But since `--submodule=short` does not change anything when running
`range-diff` in repositories without submodules, I don't mind this change.

>
> Note that the test must use '--creation-factor=100' to force the second
> commit in the range not to be considered a complete rewrite.

Thank you for this considerate note!

>
> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
> ---
>     range-diff: show submodule changes irrespective of diff.submodule
>
>     This fixes a bug that I reported last summer [1].
>
>     [1]
>     https://lore.kernel.org/git/e469038c-d78c-cd4b-0214-7094746b9281@gmail.com/
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1244%2Fphil-blain%2Frange-diff-submodule-diff-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1244/phil-blain/range-diff-submodule-diff-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1244
>
>  range-diff.c          |  2 +-
>  t/t3206-range-diff.sh | 44 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+), 1 deletion(-)
>
> diff --git a/range-diff.c b/range-diff.c
> index b72eb9fdbee..068bf214544 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -44,7 +44,7 @@ static int read_patches(const char *range, struct string_list *list,
>
>  	strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
>  		     "--reverse", "--date-order", "--decorate=no",
> -		     "--no-prefix",
> +		     "--no-prefix", "--submodule=short",

As I mentioned above, since this does not change anything in the intended
scenarios (i.e. without submodules), I am fine with it.

>  		     /*
>  		      * Choose indicators that are not used anywhere
>  		      * else in diffs, but still look reasonable
> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index e30bc48a290..ac848c42536 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -772,4 +772,48 @@ test_expect_success '--left-only/--right-only' '
>  	test_cmp expect actual
>  '
>
> +test_expect_success 'submodule changes are shown irrespective of diff.submodule' '
> +	git init sub-repo &&
> +	test_commit -C sub-repo sub-first &&
> +	sub_oid1=$(git -C sub-repo rev-parse HEAD) &&
> +	test_commit -C sub-repo sub-second &&
> +	sub_oid2=$(git -C sub-repo rev-parse HEAD) &&
> +	test_commit -C sub-repo sub-third &&
> +	sub_oid3=$(git -C sub-repo rev-parse HEAD) &&
> +
> +	git checkout -b main-sub topic &&
> +	git submodule add ./sub-repo sub &&
> +	git -C sub checkout --detach sub-first &&
> +	git add sub &&
> +	git commit -m "add sub" &&

Just a suggestion: use `git commit -m sub-first sub` instead (one `git`
invocation instead of two).

> +	sup_oid1=$(git rev-parse --short HEAD) &&
> +	git checkout -b topic-sub &&
> +	git -C sub checkout sub-second &&
> +	git add sub &&
> +	git commit -m "change sub" &&
> +	sup_oid2=$(git rev-parse --short HEAD) &&
> +	git checkout -b modified-sub main-sub &&

Another suggestion: instead of naming the branches, use the `sup_oid*`
variables directly.

> +	git -C sub checkout sub-third &&
> +	git add sub &&
> +	git commit -m "change sub" &&
> +	sup_oid3=$(git rev-parse --short HEAD) &&
> +
> +	test_config diff.submodule log &&
> +	git range-diff --creation-factor=100 topic topic-sub modified-sub >actual &&
> +	cat >expect <<-EOF &&
> +	1:  $sup_oid1 = 1:  $sup_oid1 add sub
> +	2:  $sup_oid2 ! 2:  $sup_oid3 change sub
> +	    @@ Commit message
> +	      ## sub ##
> +	     @@
> +	     -Subproject commit $sub_oid1
> +	    -+Subproject commit $sub_oid2
> +	    ++Subproject commit $sub_oid3
> +	EOF
> +	test_cmp expect actual &&
> +	test_config diff.submodule diff &&
> +	git range-diff --creation-factor=100 topic topic-sub modified-sub >actual &&
> +	test_cmp expect actual
> +'

This test case is very clear and concise, even without my suggested
changes. Therefore, if you want to keep the patch as-is, I am fine with
that, too.

Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>

Thank you,
Dscho

> +
>  test_done
>
> base-commit: 7a3eb286977746bc09a5de7682df0e5a7085e17c
> --
> gitgitgadget
>

@gitgitgadget

gitgitgadget bot commented Jun 2, 2022

On the Git mailing list, Junio C Hamano wrote (reply to this):

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> Force the submodule diff format to its default ("short") when invoking
>> 'git log' to generate the patches for each range, such that submodule
>> changes are always shown.
>
> Full disclosure: I do not see much value in range-diffs in the presence of
> submodules. Nothing in the design of range-diffs is prepared for
> submodules.
>
> But since `--submodules=short` does not change anything when running
> `range-diff` in repositories without submodules, I don't mind this change.

IOW, "I wrote it for the purpose of doing X, I do not care about those
who have been using it for doing Y, I am OK with changing behaviour on
them".

Philippe, do you have a good guess on other users and workflows that
may benefit from the current behaviour?  I suspect in the longer term
this might have to become configurable, and I am having a hard time
judging if (1) a temporary regression (to them) is acceptable or (2)
the new feature to also show submodule changes is not urgent enough
that it may be better to make it configurable from day one, instead
of using a different hardcoded and only setting like this patch does.

> This test case is very clear and concise, even without my suggested
> changes. Therefore, if you want to keep the patch as-is, I am fine with
> that, too.
>
> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>

Thanks for a review.

Will queue.

@gitgitgadget

gitgitgadget bot commented Jun 2, 2022

This branch is now known as pb/range-diff-with-submodule.

@gitgitgadget

gitgitgadget bot commented Jun 2, 2022

This patch series was integrated into seen via git@03ac318.

@gitgitgadget gitgitgadget bot added the seen label Jun 2, 2022

@gitgitgadget

gitgitgadget bot commented Jun 3, 2022

This patch series was integrated into seen via git@36d31f2.

@gitgitgadget

gitgitgadget bot commented Jun 3, 2022

This patch series was integrated into seen via git@76f184d.

@gitgitgadget

gitgitgadget bot commented Jun 6, 2022

This patch series was integrated into seen via git@2c8f3eb.

@gitgitgadget

gitgitgadget bot commented Jun 6, 2022

On the Git mailing list, Philippe Blain wrote (reply to this):

Hi Dscho,

On 2022-06-02 at 11:36, Johannes Schindelin wrote:
> Hi Philippe,
> 
> On Mon, 30 May 2022, Philippe Blain via GitGitGadget wrote:
> 
>> From: Philippe Blain <levraiphilippeblain@gmail.com>
>>
>> After generating diffs for each range to be compared using a 'git log'
>> invocation, range-diff.c::read_patches looks for the "diff --git" header
>> in those diffs to recognize the beginning of a new change.
>>
>> In a project with submodules, and with 'diff.submodule=log' set in the
>> config, this header is missing for the diff of a changed submodule, so
>> any submodule changes are quietly ignored in the range-diff.
> 
> This means that we can go two ways here: either we explicitly disable
> `diff.submodule` for the invocation that is spawned from `range-diff`, or
> we allow it but then handle the diff header as expected.
> 
>>
>> When 'diff.submodule=diff' is set in the config, the "diff --git" header
>> is also missing for the submodule itself, but is shown for submodule
>> content changes, which can easily confuse 'git range-diff' and lead to
>> errors such as:
>>
>>     error: git apply: bad git-diff - inconsistent old filename on line 1
>>     error: could not parse git header 'diff --git path/to/submodule/and/some/file/within
>>     '
>>     error: could not parse log for '@{u}..@{1}'
>>
>> Force the submodule diff format to its default ("short") when invoking
>> 'git log' to generate the patches for each range, such that submodule
>> changes are always shown.
> 
> Full disclosure: I do not see much value in range-diffs in the presence of
> submodules. Nothing in the design of range-diffs is prepared for
> submodules.
> 
> But since `--submodules=short` does not change anything when running
> `range-diff` in repositories without submodules, I don't mind this change.
> 
>>
>> Note that the test must use '--creation-factor=100' to force the second
>> commit in the range not to be considered a complete rewrite.
> 
> Thank you for this considerate note!
> 
>>
>> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
>> ---
>>     range-diff: show submodule changes irrespective of diff.submodule
>>
>>     This fixes a bug that I reported last summer [1].
>>
>>     [1]
>>     https://lore.kernel.org/git/e469038c-d78c-cd4b-0214-7094746b9281@gmail.com/
>>
>> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1244%2Fphil-blain%2Frange-diff-submodule-diff-v1
>> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1244/phil-blain/range-diff-submodule-diff-v1
>> Pull-Request: https://github.com/gitgitgadget/git/pull/1244
>>
>>  range-diff.c          |  2 +-
>>  t/t3206-range-diff.sh | 44 +++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 45 insertions(+), 1 deletion(-)
>>
>> diff --git a/range-diff.c b/range-diff.c
>> index b72eb9fdbee..068bf214544 100644
>> --- a/range-diff.c
>> +++ b/range-diff.c
>> @@ -44,7 +44,7 @@ static int read_patches(const char *range, struct string_list *list,
>>
>>  	strvec_pushl(&cp.args, "log", "--no-color", "-p", "--no-merges",
>>  		     "--reverse", "--date-order", "--decorate=no",
>> -		     "--no-prefix",
>> +		     "--no-prefix", "--submodule=short",
> 
> As I mentioned above, since this does not change anything in the intended
> scenarios (i.e. without submodules), I am fine with it.
> 
>>  		     /*
>>  		      * Choose indicators that are not used anywhere
>>  		      * else in diffs, but still look reasonable
>> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
>> index e30bc48a290..ac848c42536 100755
>> --- a/t/t3206-range-diff.sh
>> +++ b/t/t3206-range-diff.sh
>> @@ -772,4 +772,48 @@ test_expect_success '--left-only/--right-only' '
>>  	test_cmp expect actual
>>  '
>>
>> +test_expect_success 'submodule changes are shown irrespective of diff.submodule' '
>> +	git init sub-repo &&
>> +	test_commit -C sub-repo sub-first &&
>> +	sub_oid1=$(git -C sub-repo rev-parse HEAD) &&
>> +	test_commit -C sub-repo sub-second &&
>> +	sub_oid2=$(git -C sub-repo rev-parse HEAD) &&
>> +	test_commit -C sub-repo sub-third &&
>> +	sub_oid3=$(git -C sub-repo rev-parse HEAD) &&
>> +
>> +	git checkout -b main-sub topic &&
>> +	git submodule add ./sub-repo sub &&
>> +	git -C sub checkout --detach sub-first &&
>> +	git add sub &&
>> +	git commit -m "add sub" &&
> 
> Just a suggestion: use `git commit -m sub-first sub` instead (one `git`
> invocation instead of two).

OK, good idea. I'll tweak that.

> 
>> +	sup_oid1=$(git rev-parse --short HEAD) &&
>> +	git checkout -b topic-sub &&
>> +	git -C sub checkout sub-second &&
>> +	git add sub &&
>> +	git commit -m "change sub" &&
>> +	sup_oid2=$(git rev-parse --short HEAD) &&
>> +	git checkout -b modified-sub main-sub &&
> 
> Another suggestion: instead of naming the branches, use the `sup_oid*`
> variables directly.
> 

I think I like the branch names, they make the test closer to a 
"real-life" scenario (in my opinion). So I think I'll keep them,
since you write later in your reply that you do not mind that much.

>> +	git -C sub checkout sub-third &&
>> +	git add sub &&
>> +	git commit -m "change sub" &&
>> +	sup_oid3=$(git rev-parse --short HEAD) &&
>> +
>> +	test_config diff.submodule log &&
>> +	git range-diff --creation-factor=100 topic topic-sub modified-sub >actual &&
>> +	cat >expect <<-EOF &&
>> +	1:  $sup_oid1 = 1:  $sup_oid1 add sub
>> +	2:  $sup_oid2 ! 2:  $sup_oid3 change sub
>> +	    @@ Commit message
>> +	      ## sub ##
>> +	     @@
>> +	     -Subproject commit $sub_oid1
>> +	    -+Subproject commit $sub_oid2
>> +	    ++Subproject commit $sub_oid3
>> +	EOF
>> +	test_cmp expect actual &&
>> +	test_config diff.submodule diff &&
>> +	git range-diff --creation-factor=100 topic topic-sub modified-sub >actual &&
>> +	test_cmp expect actual
>> +'
> 
> This test case is very clear and concise, even without my suggested
> changes. Therefore, if you want to keep the patch as-is, I am fine with
> that, too.
> 
> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> Thank you,

Thanks,

Philippe.	

After generating diffs for each range to be compared using a 'git log'
invocation, range-diff.c::read_patches looks for the "diff --git" header
in those diffs to recognize the beginning of a new change.

In a project with submodules, and with 'diff.submodule=log' set in the
config, this header is missing for the diff of a changed submodule, so
any submodule changes are quietly ignored in the range-diff.

When 'diff.submodule=diff' is set in the config, the "diff --git" header
is also missing for the submodule itself, but is shown for submodule
content changes, which can easily confuse 'git range-diff' and lead to
errors such as:

    error: git apply: bad git-diff - inconsistent old filename on line 1
    error: could not parse git header 'diff --git path/to/submodule/and/some/file/within
    '
    error: could not parse log for '@{u}..@{1}'

Force the submodule diff format to its default ("short") when invoking
'git log' to generate the patches for each range, such that submodule
changes are always detected.

Add a test, including an invocation with '--creation-factor=100' to
force the second commit in the range not to be considered a complete
rewrite, in order to verify we do indeed get the "short" format.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
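
To make the failure mode described in the commit message concrete, here is a small sketch of header-based change detection. It is not the C code from range-diff.c: the two sample outputs are hand-written approximations of what 'git log -p --no-prefix' prints for a submodule bump in the "short" and "log" formats, and `count_change_headers` is a hypothetical helper.

```shell
# Hand-written approximation of the "short" format: a regular diff
# header followed by the 'Subproject commit' lines.
short_sample='diff --git sub sub
index 1111111..2222222 160000
--- sub
+++ sub
@@ -1 +1 @@
-Subproject commit 1111111111111111111111111111111111111111
+Subproject commit 2222222222222222222222222222222222222222'

# Hand-written approximation of the "log" format: a summary line and
# the list of submodule commits, with no "diff --git" header at all.
log_sample='Submodule sub 1111111..2222222:
  > sub-second'

count_change_headers () {
	# Count the lines that would start a new change for a parser that
	# only recognizes the literal "diff --git" prefix.
	printf '%s\n' "$1" | grep -c '^diff --git ' || true
}

short_count=$(count_change_headers "$short_sample")
log_count=$(count_change_headers "$log_sample")

echo "short format: $short_count change header(s)"
echo "log format:   $log_count change header(s)"
```

A parser keyed on that prefix finds one change in the first sample and none in the second, which is how a submodule update can silently drop out of the comparison.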
@gitgitgadget

gitgitgadget bot commented Jun 6, 2022

On the Git mailing list, Philippe Blain wrote (reply to this):

Hi Junio,

On 2022-06-02 at 13:36, Junio C Hamano wrote:
> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
>>> Force the submodule diff format to its default ("short") when invoking
>>> 'git log' to generate the patches for each range, such that submodule
>>> changes are always shown.
>>
>> Full disclosure: I do not see much value in range-diffs in the presence of
>> submodules. Nothing in the design of range-diffs is prepared for
>> submodules.
>>
>> But since `--submodules=short` does not change anything when running
>> `range-diff` in repositories without submodules, I don't mind this change.
> 
> IOW, "I wrote it for the purpose of doing X, I do not care those who
> have been using it for doing Y, I am OK with changing behaviour on
> them".
> 
> Philippe, do you have a good guess on other users and workflows that
> may benefit from the current behaviour?  I suspect in the longer term
> this might have to become configurable, and I am having a hard time
> judging if (1) a temporary regression (to them) is acceptable or (2)
> the new feature to also show submodule changes is not urgent enough
> that it may be better to make it configurable from day one, instead
> of using a different hardcoded and only setting like this patch does.
> 
Just to be clear: the "out of the box" behaviour (i.e. nothing in the config)
is correct, i.e. submodule changes are detected and shown by 'git range-diff'.

It's only if someone has 'diff.submodule={log,diff}' in their config that
submodule changes are quietly ignored (log) or might crash 'git range-diff'
(diff). So no, I cannot think of any user or workflow that benefits from the
current behaviour. If you have diff.submodule={log,diff} set in your config,
it's most probably because you work in projects that involve submodules and
you do care about submodule changes. Having these changes "hidden" by
range-diff (or having range-diff crash) just because the output of
'git -c diff.submodule={log,diff} log' does not use a 'diff --git' header for
submodules is really not expected. So I do not think we need to make that
configurable. I think hardcoding '--submodule=short' is an easy fix and a good
first step in making 'git range-diff' more useful for submodule users.


Thanks,

Philippe.
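
The behaviour described above can be checked with a rough sketch along these lines. It assumes a scratch repository in which `sub` is a bare gitlink entry with made-up commit IDs (no real submodule is checked out), and `count_headers` is a hypothetical helper, so treat it as an illustration rather than as the test from the patch.

```shell
# Scratch repository with a throwaway identity (illustrative values).
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

# Made-up submodule commit IDs recorded as gitlink entries at 'sub'.
oid1=1111111111111111111111111111111111111111
oid2=2222222222222222222222222222222222222222

git -C "$repo" update-index --add --cacheinfo "160000,$oid1,sub"
base=$(git -C "$repo" commit-tree -m "add sub" "$(git -C "$repo" write-tree)")
git -C "$repo" update-index --add --cacheinfo "160000,$oid2,sub"
tip=$(git -C "$repo" commit-tree -p "$base" -m "change sub" \
	"$(git -C "$repo" write-tree)")

count_headers () {
	# $1: value for diff.submodule; remaining arguments: extra 'git log'
	# options. Prints how many "diff --git" headers the tip commit shows.
	fmt=$1
	shift
	git -C "$repo" -c diff.submodule="$fmt" log -p --no-color --no-prefix \
		"$@" -1 "$tip" | grep -c '^diff --git ' || true
}

short_count=$(count_headers short)
log_count=$(count_headers log)
forced_count=$(count_headers log --submodule=short)

echo "diff.submodule=short:                 $short_count"
echo "diff.submodule=log:                   $log_count"
echo "diff.submodule=log --submodule=short: $forced_count"
```

The last line is the case the patch relies on: an explicit `--submodule=short` on the command line takes precedence over the configured `diff.submodule` value, so the header is present again.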

@phil-blain
Author

/submit

@gitgitgadget

gitgitgadget bot commented Jun 6, 2022

Submitted as pull.1244.v2.git.1654549153769.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1244/phil-blain/range-diff-submodule-diff-v2

To fetch this version to local tag pr-1244/phil-blain/range-diff-submodule-diff-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1244/phil-blain/range-diff-submodule-diff-v2

@gitgitgadget

gitgitgadget bot commented Jun 7, 2022

On the Git mailing list, Junio C Hamano wrote (reply to this):

Philippe Blain <levraiphilippeblain@gmail.com> writes:

> Just to be clear: the "out of the box" behaviour (i.e. nothing in the config)
> is correct, i.e. submodule changes are detected and shown by 'git range-diff'.
>
> It's only if someone has 'diff.submodule={log,diff}' in their
> config that submodule changes are quietly ignored (log) or might
> crash 'git range-diff' (diff). So I do not think of any user or
> workflow that benefit from the current behaviour, no. If you have
> diff.submodule={log,diff} set in your config, it's most probably
> because you work in projects that involve submodules and you do
> care about submodule changes. So having these changes "hidden" by
> range-diff (or having range-diff crash) just because the output
> format of 'git -c diff.submodule={log,diff} log' does not use a
> 'diff --git' header for submodules is really not expected. So I do
> not think we need to make that configurable. I think hardcoding
> '--submodule=short' is an easy fix and a good first step in making
> 'git range-diff' more useful for submodule users.

OK.  As "diff.submodule=none" does not exist, hardcoding "short"
would not hurt anybody, I agree.

Thanks.

@gitgitgadget

gitgitgadget bot commented Jun 7, 2022

This patch series was integrated into seen via git@1dfd1d2.

@gitgitgadget

gitgitgadget bot commented Jun 7, 2022

This patch series was integrated into seen via git@02fbb86.

@gitgitgadget

gitgitgadget bot commented Jun 7, 2022

This patch series was integrated into next via git@e5e3159.

@gitgitgadget gitgitgadget bot added the next label Jun 7, 2022
@gitgitgadget

gitgitgadget bot commented Jun 8, 2022

There was a status update in the "New Topics" section about the branch pb/range-diff-with-submodule on the Git mailing list:

"git range-diff" did not show anything for submodules that changed
in the ranges being compared.  Change the behaviour to include the
"--submodule=short" output unconditionally to be compared.

Will merge to 'master'.
source: <pull.1244.v2.git.1654549153769.gitgitgadget@gmail.com>

@gitgitgadget

gitgitgadget bot commented Jun 8, 2022

This patch series was integrated into seen via git@d5cb226.

@gitgitgadget

gitgitgadget bot commented Jun 10, 2022

This patch series was integrated into seen via git@16e4bd6.

@gitgitgadget

gitgitgadget bot commented Jun 10, 2022

This patch series was integrated into seen via git@24c976f.

@gitgitgadget

gitgitgadget bot commented Jun 11, 2022

There was a status update in the "Cooking" section about the branch pb/range-diff-with-submodule on the Git mailing list:

"git range-diff" did not show anything for submodules that changed
in the ranges being compared.  Change the behaviour to include the
"--submodule=short" output unconditionally to be compared.

Will merge to 'master'.
source: <pull.1244.v2.git.1654549153769.gitgitgadget@gmail.com>

@gitgitgadget

gitgitgadget bot commented Jun 13, 2022

This patch series was integrated into seen via git@fc74137.

@gitgitgadget

gitgitgadget bot commented Jun 14, 2022

This patch series was integrated into seen via git@ecbd60a.

@gitgitgadget

gitgitgadget bot commented Jun 14, 2022

This patch series was integrated into master via git@ecbd60a.

@gitgitgadget

gitgitgadget bot commented Jun 14, 2022

This patch series was integrated into next via git@ecbd60a.

@gitgitgadget gitgitgadget bot added the master label Jun 14, 2022
@gitgitgadget gitgitgadget bot closed this Jun 14, 2022
@gitgitgadget

gitgitgadget bot commented Jun 14, 2022

Closed via ecbd60a.

@phil-blain phil-blain deleted the range-diff-submodule-diff branch August 4, 2022 01:15