-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
diff --color-moved[-ws] speedups #981
diff --color-moved[-ws] speedups #981
Conversation
/submit |
There is an issue in commit 27f23b5: |
1 similar comment
There is an issue in commit 27f23b5: |
How about: |
a6a18d9
to
220664d
Compare
Thanks, I've reworded it - do I need to submit it again? |
/submit |
Submitted as pull.981.git.1623675888.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
Yes, it was not submitted automatically. |
@@ -1142,14 +1142,14 @@ static void mark_color_as_moved(struct diff_options *o, | |||
struct moved_block *pmb = NULL; /* potentially moved blocks */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):
"Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
> Subject: Re: [PATCH 01/10] diff --color-moved=zerba: fix alternate coloring
Zerba?
> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> b0a2ba4776 ("diff --color-moved=zebra: be stricter with color
> alternation", 2018-11-23) sought to avoid using the alternate colors
> unless there are two adjacent moved blocks of the same
> sign. Unfortunately it contains two bugs that prevented it from fixing
> the problem properly. Firstly `last_symbol` is reset at the start of
> each iteration of the loop losing the symbol of the last line and
> secondly when deciding whether to use the alternate color it should be
> checking if the current line is the same sign of the last line, not a
> different sign. The combination of the two errors means that we still
> use the alternate color when we should do but we also use it when we
> shouldn't. This is most noticable when using
> --color-moved-ws=allow-indentation-change with hunks like
>
> -this line gets indented
> + this line gets indented
>
> where the post image is colored with newMovedAlternate rather than
> newMoved. While this does not matter much, the next commit will change
> the coloring to be correct in this case, so lets fix the bug here to
> make it clear why the output is changing and add a regression test.
>
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
> diff.c | 4 +--
> t/t4015-diff-whitespace.sh | 72 ++++++++++++++++++++++++++++++++++++++
> 2 files changed, 74 insertions(+), 2 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index 52c791574b71..cb068f8258c0 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -1142,6 +1142,7 @@ static void mark_color_as_moved(struct diff_options *o,
> struct moved_block *pmb = NULL; /* potentially moved blocks */
> int pmb_nr = 0, pmb_alloc = 0;
> int n, flipped_block = 0, block_length = 0;
> + enum diff_symbol last_symbol = 0;
>
>
> for (n = 0; n < o->emitted_symbols->nr; n++) {
> @@ -1149,7 +1150,6 @@ static void mark_color_as_moved(struct diff_options *o,
> struct moved_entry *key;
> struct moved_entry *match = NULL;
> struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n];
> - enum diff_symbol last_symbol = 0;
>
> switch (l->s) {
> case DIFF_SYMBOL_PLUS:
> @@ -1214,7 +1214,7 @@ static void mark_color_as_moved(struct diff_options *o,
> }
>
> if (adjust_last_block(o, n, block_length) &&
> - pmb_nr && last_symbol != l->s)
> + pmb_nr && last_symbol == l->s)
> flipped_block = (flipped_block + 1) % 2;
> else
> flipped_block = 0;
> diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
> index 2c13b62d3c65..920114cd795c 100755
> --- a/t/t4015-diff-whitespace.sh
> +++ b/t/t4015-diff-whitespace.sh
> @@ -1442,6 +1442,78 @@ test_expect_success 'detect permutations inside moved code -- dimmed-zebra' '
> test_cmp expected actual
> '
>
> +test_expect_success 'zebra alternate color is only used when necessary' '
> + cat >old.txt <<-\EOF &&
> + line 1A should be marked as oldMoved newMovedAlternate
> + line 1B should be marked as oldMoved newMovedAlternate
> + unchanged
> + line 2A should be marked as oldMoved newMovedAlternate
> + line 2B should be marked as oldMoved newMovedAlternate
> + line 3A should be marked as oldMovedAlternate newMoved
> + line 3B should be marked as oldMovedAlternate newMoved
> + unchanged
> + line 4A should be marked as oldMoved newMovedAlternate
> + line 4B should be marked as oldMoved newMovedAlternate
> + line 5A should be marked as oldMovedAlternate newMoved
> + line 5B should be marked as oldMovedAlternate newMoved
> + line 6A should be marked as oldMoved newMoved
> + line 6B should be marked as oldMoved newMoved
> + EOF
> + cat >new.txt <<-\EOF &&
> + line 1A should be marked as oldMoved newMovedAlternate
> + line 1B should be marked as oldMoved newMovedAlternate
> + unchanged
> + line 3A should be marked as oldMovedAlternate newMoved
> + line 3B should be marked as oldMovedAlternate newMoved
> + line 2A should be marked as oldMoved newMovedAlternate
> + line 2B should be marked as oldMoved newMovedAlternate
> + unchanged
> + line 6A should be marked as oldMoved newMoved
> + line 6B should be marked as oldMoved newMoved
> + line 4A should be marked as oldMoved newMovedAlternate
> + line 4B should be marked as oldMoved newMovedAlternate
> + line 5A should be marked as oldMovedAlternate newMoved
> + line 5B should be marked as oldMovedAlternate newMoved
> + EOF
> + test_expect_code 1 git diff --no-index --color --color-moved=zebra \
> + --color-moved-ws=allow-indentation-change \
> + old.txt new.txt >output &&
> + grep -v index output | test_decode_color >actual &&
> + cat >expected <<-\EOF &&
> + <BOLD>diff --git a/old.txt b/new.txt<RESET>
> + <BOLD>--- a/old.txt<RESET>
> + <BOLD>+++ b/new.txt<RESET>
> + <CYAN>@@ -1,14 +1,14 @@<RESET>
> + <BOLD;MAGENTA>-line 1A should be marked as oldMoved newMovedAlternate<RESET>
> + <BOLD;MAGENTA>-line 1B should be marked as oldMoved newMovedAlternate<RESET>
> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 1A should be marked as oldMoved newMovedAlternate<RESET>
> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 1B should be marked as oldMoved newMovedAlternate<RESET>
> + unchanged<RESET>
> + <BOLD;MAGENTA>-line 2A should be marked as oldMoved newMovedAlternate<RESET>
> + <BOLD;MAGENTA>-line 2B should be marked as oldMoved newMovedAlternate<RESET>
> + <BOLD;BLUE>-line 3A should be marked as oldMovedAlternate newMoved<RESET>
> + <BOLD;BLUE>-line 3B should be marked as oldMovedAlternate newMoved<RESET>
> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 3A should be marked as oldMovedAlternate newMoved<RESET>
> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 3B should be marked as oldMovedAlternate newMoved<RESET>
> + <BOLD;YELLOW>+<RESET><BOLD;YELLOW> line 2A should be marked as oldMoved newMovedAlternate<RESET>
> + <BOLD;YELLOW>+<RESET><BOLD;YELLOW> line 2B should be marked as oldMoved newMovedAlternate<RESET>
> + unchanged<RESET>
> + <BOLD;MAGENTA>-line 4A should be marked as oldMoved newMovedAlternate<RESET>
> + <BOLD;MAGENTA>-line 4B should be marked as oldMoved newMovedAlternate<RESET>
> + <BOLD;BLUE>-line 5A should be marked as oldMovedAlternate newMoved<RESET>
> + <BOLD;BLUE>-line 5B should be marked as oldMovedAlternate newMoved<RESET>
> + <BOLD;MAGENTA>-line 6A should be marked as oldMoved newMoved<RESET>
> + <BOLD;MAGENTA>-line 6B should be marked as oldMoved newMoved<RESET>
> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 6A should be marked as oldMoved newMoved<RESET>
> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 6B should be marked as oldMoved newMoved<RESET>
> + <BOLD;YELLOW>+<RESET><BOLD;YELLOW> line 4A should be marked as oldMoved newMovedAlternate<RESET>
> + <BOLD;YELLOW>+<RESET><BOLD;YELLOW> line 4B should be marked as oldMoved newMovedAlternate<RESET>
> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 5A should be marked as oldMovedAlternate newMoved<RESET>
> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 5B should be marked as oldMovedAlternate newMoved<RESET>
> + EOF
> + test_cmp expected actual
> +'
> +
> test_expect_success 'cmd option assumes configured colored-moved' '
> test_config color.diff.oldMoved "magenta" &&
> test_config color.diff.newMoved "cyan" &&
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Phillip Wood wrote (reply to this):
On 15/06/2021 04:24, Junio C Hamano wrote:
> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> Subject: Re: [PATCH 01/10] diff --color-moved=zerba: fix alternate coloring
>
> Zerba?
They're quite rare in the wild :-/ Thanks for pointing that out I'll fix
it when I re-roll
Best Wishes
Phillip
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> b0a2ba4776 ("diff --color-moved=zebra: be stricter with color
>> alternation", 2018-11-23) sought to avoid using the alternate colors
>> unless there are two adjacent moved blocks of the same
>> sign. Unfortunately it contains two bugs that prevented it from fixing
>> the problem properly. Firstly `last_symbol` is reset at the start of
>> each iteration of the loop losing the symbol of the last line and
>> secondly when deciding whether to use the alternate color it should be
>> checking if the current line is the same sign of the last line, not a
>> different sign. The combination of the two errors means that we still
>> use the alternate color when we should do but we also use it when we
>> shouldn't. This is most noticable when using
>> --color-moved-ws=allow-indentation-change with hunks like
>>
>> -this line gets indented
>> + this line gets indented
>>
>> where the post image is colored with newMovedAlternate rather than
>> newMoved. While this does not matter much, the next commit will change
>> the coloring to be correct in this case, so lets fix the bug here to
>> make it clear why the output is changing and add a regression test.
>>
>> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>> ---
>> diff.c | 4 +--
>> t/t4015-diff-whitespace.sh | 72 ++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 74 insertions(+), 2 deletions(-)
>>
>> diff --git a/diff.c b/diff.c
>> index 52c791574b71..cb068f8258c0 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -1142,6 +1142,7 @@ static void mark_color_as_moved(struct diff_options *o,
>> struct moved_block *pmb = NULL; /* potentially moved blocks */
>> int pmb_nr = 0, pmb_alloc = 0;
>> int n, flipped_block = 0, block_length = 0;
>> + enum diff_symbol last_symbol = 0;
>>
>>
>> for (n = 0; n < o->emitted_symbols->nr; n++) {
>> @@ -1149,7 +1150,6 @@ static void mark_color_as_moved(struct diff_options *o,
>> struct moved_entry *key;
>> struct moved_entry *match = NULL;
>> struct emitted_diff_symbol *l = &o->emitted_symbols->buf[n];
>> - enum diff_symbol last_symbol = 0;
>>
>> switch (l->s) {
>> case DIFF_SYMBOL_PLUS:
>> @@ -1214,7 +1214,7 @@ static void mark_color_as_moved(struct diff_options *o,
>> }
>>
>> if (adjust_last_block(o, n, block_length) &&
>> - pmb_nr && last_symbol != l->s)
>> + pmb_nr && last_symbol == l->s)
>> flipped_block = (flipped_block + 1) % 2;
>> else
>> flipped_block = 0;
>> diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
>> index 2c13b62d3c65..920114cd795c 100755
>> --- a/t/t4015-diff-whitespace.sh
>> +++ b/t/t4015-diff-whitespace.sh
>> @@ -1442,6 +1442,78 @@ test_expect_success 'detect permutations inside moved code -- dimmed-zebra' '
>> test_cmp expected actual
>> '
>>
>> +test_expect_success 'zebra alternate color is only used when necessary' '
>> + cat >old.txt <<-\EOF &&
>> + line 1A should be marked as oldMoved newMovedAlternate
>> + line 1B should be marked as oldMoved newMovedAlternate
>> + unchanged
>> + line 2A should be marked as oldMoved newMovedAlternate
>> + line 2B should be marked as oldMoved newMovedAlternate
>> + line 3A should be marked as oldMovedAlternate newMoved
>> + line 3B should be marked as oldMovedAlternate newMoved
>> + unchanged
>> + line 4A should be marked as oldMoved newMovedAlternate
>> + line 4B should be marked as oldMoved newMovedAlternate
>> + line 5A should be marked as oldMovedAlternate newMoved
>> + line 5B should be marked as oldMovedAlternate newMoved
>> + line 6A should be marked as oldMoved newMoved
>> + line 6B should be marked as oldMoved newMoved
>> + EOF
>> + cat >new.txt <<-\EOF &&
>> + line 1A should be marked as oldMoved newMovedAlternate
>> + line 1B should be marked as oldMoved newMovedAlternate
>> + unchanged
>> + line 3A should be marked as oldMovedAlternate newMoved
>> + line 3B should be marked as oldMovedAlternate newMoved
>> + line 2A should be marked as oldMoved newMovedAlternate
>> + line 2B should be marked as oldMoved newMovedAlternate
>> + unchanged
>> + line 6A should be marked as oldMoved newMoved
>> + line 6B should be marked as oldMoved newMoved
>> + line 4A should be marked as oldMoved newMovedAlternate
>> + line 4B should be marked as oldMoved newMovedAlternate
>> + line 5A should be marked as oldMovedAlternate newMoved
>> + line 5B should be marked as oldMovedAlternate newMoved
>> + EOF
>> + test_expect_code 1 git diff --no-index --color --color-moved=zebra \
>> + --color-moved-ws=allow-indentation-change \
>> + old.txt new.txt >output &&
>> + grep -v index output | test_decode_color >actual &&
>> + cat >expected <<-\EOF &&
>> + <BOLD>diff --git a/old.txt b/new.txt<RESET>
>> + <BOLD>--- a/old.txt<RESET>
>> + <BOLD>+++ b/new.txt<RESET>
>> + <CYAN>@@ -1,14 +1,14 @@<RESET>
>> + <BOLD;MAGENTA>-line 1A should be marked as oldMoved newMovedAlternate<RESET>
>> + <BOLD;MAGENTA>-line 1B should be marked as oldMoved newMovedAlternate<RESET>
>> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 1A should be marked as oldMoved newMovedAlternate<RESET>
>> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 1B should be marked as oldMoved newMovedAlternate<RESET>
>> + unchanged<RESET>
>> + <BOLD;MAGENTA>-line 2A should be marked as oldMoved newMovedAlternate<RESET>
>> + <BOLD;MAGENTA>-line 2B should be marked as oldMoved newMovedAlternate<RESET>
>> + <BOLD;BLUE>-line 3A should be marked as oldMovedAlternate newMoved<RESET>
>> + <BOLD;BLUE>-line 3B should be marked as oldMovedAlternate newMoved<RESET>
>> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 3A should be marked as oldMovedAlternate newMoved<RESET>
>> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 3B should be marked as oldMovedAlternate newMoved<RESET>
>> + <BOLD;YELLOW>+<RESET><BOLD;YELLOW> line 2A should be marked as oldMoved newMovedAlternate<RESET>
>> + <BOLD;YELLOW>+<RESET><BOLD;YELLOW> line 2B should be marked as oldMoved newMovedAlternate<RESET>
>> + unchanged<RESET>
>> + <BOLD;MAGENTA>-line 4A should be marked as oldMoved newMovedAlternate<RESET>
>> + <BOLD;MAGENTA>-line 4B should be marked as oldMoved newMovedAlternate<RESET>
>> + <BOLD;BLUE>-line 5A should be marked as oldMovedAlternate newMoved<RESET>
>> + <BOLD;BLUE>-line 5B should be marked as oldMovedAlternate newMoved<RESET>
>> + <BOLD;MAGENTA>-line 6A should be marked as oldMoved newMoved<RESET>
>> + <BOLD;MAGENTA>-line 6B should be marked as oldMoved newMoved<RESET>
>> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 6A should be marked as oldMoved newMoved<RESET>
>> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 6B should be marked as oldMoved newMoved<RESET>
>> + <BOLD;YELLOW>+<RESET><BOLD;YELLOW> line 4A should be marked as oldMoved newMovedAlternate<RESET>
>> + <BOLD;YELLOW>+<RESET><BOLD;YELLOW> line 4B should be marked as oldMoved newMovedAlternate<RESET>
>> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 5A should be marked as oldMovedAlternate newMoved<RESET>
>> + <BOLD;CYAN>+<RESET><BOLD;CYAN> line 5B should be marked as oldMovedAlternate newMoved<RESET>
>> + EOF
>> + test_cmp expected actual
>> +'
>> +
>> test_expect_success 'cmd option assumes configured colored-moved' '
>> test_config color.diff.oldMoved "magenta" &&
>> test_config color.diff.newMoved "cyan" &&
This patch series was integrated into seen via git@434d4a9. |
This branch is now known as |
There was a status update in the "New Topics" section about the branch Long-overdue correctness and performance update to "diff --color-moved" feature. Waiting for reviews. |
User |
This patch series was integrated into seen via git@f329af8. |
This patch series was integrated into seen via git@5d625c0. |
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
|
User |
This patch series was integrated into seen via git@4885b7e. |
There was a status update in the "Cooking" section about the branch Long-overdue correctness and performance update to "diff --color-moved" feature. Waiting for reviews. |
This patch series was integrated into seen via git@a92be0a. |
This patch series was integrated into seen via git@7bf7283. |
On the Git mailing list, Phillip Wood wrote (reply to this):
|
This patch series was integrated into seen via git@e2e6481. |
This patch series was integrated into seen via git@7bc5c6e. |
This patch series was integrated into seen via git@3259b7d. |
There was a status update in the "Cooking" section about the branch Long-overdue correctness and performance update to "diff --color-moved" feature. Waiting for reviews. |
This patch series was integrated into seen via git@d842cd6. |
This patch series was integrated into seen via git@0a4f9b4. |
5177f66
to
350fa55
Compare
/submit |
Submitted as pull.981.v5.git.1639045809.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
This patch series was integrated into seen via git@55ef4eb. |
There was a status update in the "Cooking" section about the branch Correctness and performance update to "diff --color-moved" feature. Will merge to 'next'? source: <pull.981.v5.git.1639045809.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@232f609. |
This patch series was integrated into seen via git@7b70897. |
This patch series was integrated into seen via git@92da42b. |
There was a status update in the "Cooking" section about the branch Correctness and performance update to "diff --color-moved" feature. Will merge to 'next'. source: <pull.981.v5.git.1639045809.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@e6541db. |
This patch series was integrated into seen via git@2fec60e. |
This patch series was integrated into seen via git@89bd8e5. |
This patch series was integrated into seen via git@86e5a5e. |
This patch series was integrated into seen via git@756770a. |
There was a status update in the "Cooking" section about the branch Correctness and performance update to "diff --color-moved" feature. Will merge to 'next'. source: <pull.981.v5.git.1639045809.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@3d6c7ae. |
This patch series was integrated into seen via git@e249cf7. |
This patch series was integrated into next via git@443dd9f. |
There was a status update in the "Cooking" section about the branch Correctness and performance update to "diff --color-moved" feature. Will merge to 'master'. source: <pull.981.v5.git.1639045809.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@f161a38. |
This patch series was integrated into seen via git@eec77b9. |
There was a status update in the "Cooking" section about the branch Correctness and performance update to "diff --color-moved" feature. Will merge to 'master'. source: <pull.981.v5.git.1639045809.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@ca694ce. |
This patch series was integrated into seen via git@2b755b3. |
This patch series was integrated into next via git@2b755b3. |
This patch series was integrated into master via git@2b755b3. |
Closed via 2b755b3. |
Thanks to Dscho for his comments on V3.
Changes since V4:
Changes since V3:
Changes since V2:
The bug fix in patch 3 degrades the performance, but by the end of the series the timings are the same as V2 - see the range diff.
V2 Cover Letter:
Thanks to Ævar and Elijah for their comments, I've reworded the commit messages, addressed the enum initialization issue in patch 2 (now 3) and added some perf tests.
There are two new patches in this round. The first patch is new and adds the perf tests suggested by Ævar, the penultimate patch is also new and coverts the existing code to use a designated initializer.
I've converted the benchmark results in the commit messages to use the new tests, the percentage changes are broadly similar to the previous results though I ended up running them on a different computer this time.
V1 cover letter:
The current implementation of
diff --color-moved-ws=allow-indentation-change
is considerably slower that the implementation ofdiff --color-moved
which is in turn slower than a regulardiff
. This patch series starts with a couple of bug fixes and then reworks the implementation ofdiff --color-moved
anddiff --color-moved-ws=allow-indentation-change
to speed them up on large diffs. The time to rungit diff --color-moved --no-color-moved-ws v2.28.0 v2.29.0
is reduced by 33% and the time to rungit diff --color-moved --color-moved-ws=allow-indentation-change v2.28.0 v2.29.0
is reduced by 88%. There is a small slowdown for commit sized diffs with--color-moved
- the time to rungit log -p --color-moved --no-color-moved-ws --no-merges -n1000 v2.29.0
is increased by 2% on recent processors. On older processors these patches reduce the running time in all cases that I've tested. In general the larger the diff the larger the speed up. As an extreme example the time to rundiff --color-moved --color-moved-ws=allow-indentation-change v2.25.0 v2.30.0
goes down from 8 minutes to 6 seconds.cc: Phillip Wood phillip.wood@dunelm.org.uk
cc: Ævar Arnfjörð Bjarmason avarab@gmail.com
cc: Elijah Newren newren@gmail.com
cc: Phillip Wood phillip.wood123@gmail.com
cc: Johannes Schindelin Johannes.Schindelin@gmx.de