Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

builtin/ls-files.c:add git ls-file --dedup option #832

Closed
wants to merge 3 commits into from

Conversation

adlternative
Copy link

@adlternative adlternative commented Jan 3, 2021

I am reading the source code of git ls-files and learned
that git ls-files may have duplicate files name when there
are unmerged path in a branch merge or when different options
are used at the same time. Users may fell confuse when
they see these duplicate file names.

As Junio C Hamano said ,it have odd behaviour.

Therefore, we can provide an additional option to git ls-files to
delete those repeated information.

This fixes #198

Thanks!

cc: Eric Sunshine sunshine@sunshineco.com
cc: 胡哲宁 adlternative@gmail.com
cc: Junio C Hamano gitster@pobox.com
cc: Johannes Schindelin Johannes.Schindelin@gmx.de

@dscho
Copy link
Member

dscho commented Jan 5, 2021

1.When we use git ls-files with both -m -d,
we would find that repeated path,sometimes
it is confusing.
2.When we are performing a branch merge,
the default git ls-files will also output
multiple repeated file names.
Therefore, I added the --dedup option to git ls-files.

1. It can be achieved that only the deleted file name
   is displayed when using -m, -d, and --dedup at the same time.

2. Add --dedup when merging branches to remove duplicate file
   names. (unless -s, -u are used)

Signed-off-by: ZheNing Hu adlternative@gmail.com

Please do not repeat the commit message in the first comment (which will be used as cover letter).

Also: I thought the idea was to deduplicate entries automatically unless --stage was passed?

Finally, if you add the line This fixes https://github.com/gitgitgadget/git/issues/198 in the commit message, that ticket will be closed automatically once the patch is accepted into Git's main branch.

@adlternative
Copy link
Author

1.When we use git ls-files with both -m -d,
we would find that repeated path,sometimes
it is confusing.
2.When we are performing a branch merge,
the default git ls-files will also output
multiple repeated file names.
Therefore, I added the --dedup option to git ls-files.

1. It can be achieved that only the deleted file name
   is displayed when using -m, -d, and --dedup at the same time.

2. Add --dedup when merging branches to remove duplicate file
   names. (unless -s, -u are used)

Signed-off-by: ZheNing Hu adlternative@gmail.com

Please do not repeat the commit message in the first comment (which will be used as cover letter).

Also: I thought the idea was to deduplicate entries automatically unless --stage was passed?

Finally, if you add the line This fixes https://github.com/gitgitgadget/git/issues/198 in the commit message, that ticket will be closed automatically once the patch is accepted into Git's main branch.

@dscho Thanks,your repeat make me feel that i'm not alone😬 .

I have changed my comment message with the issue url (should I rewrite my commit message too?but if I use git commit --amend && git push -f will it generate duplicate record in this push request?).

From a technical point of view,

There may be duplicate entries in a three-way merge. If we don’t use git ls-files -s or git ls-files -u (-u force contains -s),
In the show_files function, there are many possible choices of show_others, show_killed, show_cached, show_stage, show_deleted, show_modified,
Because show_stage is mandatory in show_unmerged, show_others and show_killed have their special processing functions.
show_cached represents the default condition of git ls-files.

The --dedup I write now is to remove duplication in the case of the remaining (show_cached || show_deleted || show_modified) (Is my thinking incomplete?)

@dscho
Copy link
Member

dscho commented Jan 6, 2021

if I use git commit --amend && git push -f will it generate duplicate record in this push request?

It might, but that's a good thing, no?

The --dedup I write now is to remove duplication in the case of the remaining (show_cached || show_deleted || show_modified) (Is my thinking incomplete?)

My thinking is that it is not useful in the remaining cases to confusingly show multiple, identical entries. (When calling with --stage, the entries are not identical because they differ in the stage.)

But you should take this discussion to the mailing list, I am not a gatekeeper. In other words, once you feel confident that this patch is good, feel free to /submit.

@adlternative
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 6, 2021

Submitted as pull.832.git.1609923182451.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-832/adlternative/ls-files-dedup-v1

To fetch this version to local tag pr-832/adlternative/ls-files-dedup-v1:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-832/adlternative/ls-files-dedup-v1

@adlternative
Copy link
Author

But you should take this discussion to the mailing list, I am not a gatekeeper. In other words, once you feel confident that this patch is good, feel free to /submit.

Ok,I understand that now !

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 7, 2021

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Wed, Jan 6, 2021 at 3:54 AM 阿德烈 via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> [...]
> Therefore, I added the --dedup option to git ls-files.
> 1. It can be achieved that only the deleted file name
> is displayed when using -m, -d, and --dedup at the same time.
> 2. Add --dedup when merging branches to remove duplicate file
>  names. (unless -s, -u are used)

I'm just pointing out a few minor style issues below; I'm not properly
reviewing the patch...

> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>  builtin/ls-files.c | 43 ++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 38 insertions(+), 5 deletions(-)

This change adds a new command-line option, so the documentation
(Documentation/git-ls-files.txt) should be updated and at least one
new test should be added (in one of the t/t30??-ls-files-*.sh scripts
probably).

> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> @@ -301,6 +302,7 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
>         struct strbuf fullname = STRBUF_INIT;
> +       const struct cache_entry *last_stage=NULL;

Add spaces around `=` similar to the preceding line:

    const struct cache_entry *last_stage = NULL;

> @@ -315,7 +317,20 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
>                 for (i = 0; i < repo->index->cache_nr; i++) {
>                         const struct cache_entry *ce = repo->index->cache[i];
> -

This patch deletes the blank line but this project usually prefers to
have a blank line after declarations.

> +                       if(show_cached && delete_dup){

Add space after `if` and before `{`:

    if (show_cached && delete_dup) {

> @@ -336,7 +351,20 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
> +                       if(delete_dup){

Style: if (delete_dup) {

> @@ -347,10 +375,14 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
> -                       if (show_deleted && err)
> +                       if(delete_dup && show_deleted && show_modified && err)

Style: if (delete_dup && ...

> -                       if (show_modified && ie_modified(repo->index, ce, &st, 0))
> -                               show_ce(repo, dir, ce, fullname.buf, tag_modified);
> +                       else{

Style: else {

> +                               if (show_deleted && err)/* you can't find it,so it's actually removed at all! */

Add space before `/* comment */`.
Add space in "...it, so...".

> @@ -578,6 +610,7 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
> +               OPT_BOOL(0, "dedup", &delete_dup, N_("delete duplicate entry in index")),

The short help makes it seem like it's modifying the index. Perhaps instead:

    N_("suppress duplicate entries")

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 7, 2021

User Eric Sunshine <sunshine@sunshineco.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 7, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

Eric Sunshine <sunshine@sunshineco.com> writes:

>> +                               if (show_deleted && err)/* you can't find it,so it's actually removed at all! */
>
> Add space before `/* comment */`.
> Add space in "...it, so...".

Avoid overly long lines.

@adlternative
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 8, 2021

Submitted as pull.832.v2.git.1610116600.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-832/adlternative/ls-files-dedup-v2

To fetch this version to local tag pr-832/adlternative/ls-files-dedup-v2:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-832/adlternative/ls-files-dedup-v2

@@ -13,6 +13,7 @@ SYNOPSIS
(--[cached|deleted|others|ignored|stage|unmerged|killed|modified])*
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Fri, Jan 8, 2021 at 9:36 AM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> builtin:ls-files.c:add git ls-file --dedup option

This subject concisely explains the purpose of the patch. That's good.
A more typical way to write it would be:

    ls-files: add --dedup option

> This commit standardizes the code format.

Fixing problems pointed out by reviewers is good. Normally, however,
when you submit a new version of your patch or patch series, you
should apply these fixes directly to the patch(es) which introduced
the problems in the first place rather than adding one or more
additional patches to fix problems introduced in earlier patches. To
do this, you typically would use `git rebase -i` or `git commit
--amend` to squash the fixes into the problematic patches. Thus, when
you re-submit the patches, they will appear to be "perfect".

For this particular two-patch series, patch [2/2] is doing two things:
(1) fixing style problems from patch [1/2], and (2) adding
documentation and tests which logically belong with the feature added
by patch [1/2]. Taking the above advice into account, a better
presentation when you re-submit this series would be to squash these
two patches into a single patch.

> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
> diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> @@ -81,6 +82,9 @@ OPTIONS
> +--dedup::
> +       Suppress duplicates entries when conflicts happen or
> +       specify -d -m at the same time.

For consistency with typesetting elsewhere in this file, use backticks
around the command-line options. It also often is a good idea to spell
the options using long form since it is typically easier to search for
the long form of an option in documentation. So, perhaps the above can
be written like this:

    Suppress duplicate entries when `--deleted` and `--modified` are
    combined.

> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> -       const struct cache_entry *last_stage=NULL;
> +       const struct cache_entry *last_stage = NULL;
> -                       if(show_cached && delete_dup){
> +                       if (show_cached && delete_dup) {
> -                                       last_stage=ce;
> +                                       last_stage = ce;
> -                       if(delete_dup){
> +                       if (delete_dup) {
> -                       if(delete_dup && show_deleted && show_modified && err)
> +                       if (delete_dup && show_deleted && show_modified && err)
> -                       else{
> -                               if (show_deleted && err)/* you can't find it,so it's actually removed at all! */
> +                       else {
> +                               if (show_deleted && err)

As mentioned above, these style fixes should be squashed into the
first patch, rather than being done in a separate patch, so that
reviewers see a nicely polished patch rather than a patch which
requires later fixing up.

> diff --git a/t/t3012-ls-files-dedup.sh b/t/t3012-ls-files-dedup.sh
> @@ -0,0 +1,63 @@
> +test_expect_success 'master branch setup and write expect1 expect2 and commit' '

We usually give this test a simple title such as "setup" so that we
don't have to worry about the title becoming outdated as people make
changes to the test itself.

> +       touch a.txt &&
> +       touch b.txt &&
> +       touch delete.txt &&

On this project, we use `touch` when the timestamp of the empty files
is important to the test. If the timestamp is not important, then we
just use `>`, like this:

    >a.txt &&
    >b.txt &&
    >delete.txt &&

> +       cat <<-EOF >expect1 &&
> +       M a.txt
> +       H b.txt
> +       H delete.txt
> +       H expect1
> +       H expect2
> +       EOF
> +       cat <<-EOF >expect2 &&
> +       C a.txt
> +       R delete.txt
> +       EOF

When no variables are being interpolated in the here-doc content, we
use -\EOF to let readers know that the here-doc body is literal. So:

    cat >expect1 <<-\EOF &&
    ...
    EOF

> +       git add a.txt b.txt delete.txt expect1 expect2 &&
> +       git commit -m master:1
> +'
> +
> +test_expect_success 'main commit again' '
> +       echo a>a.txt &&
> +       echo b>b.txt &&
> +       echo delete>delete.txt &&
> +       git add a.txt b.txt delete.txt &&
> +       git commit -m master:2
> +'
> +
> +test_expect_success 'dev commit' '
> +       git checkout HEAD~ &&
> +       git switch -c dev &&
> +       echo change>a.txt &&
> +       git add a.txt &&
> +       git commit -m dev:1
> +'

These two tests following the "setup" test also seem to be doing setup
tasks rather than testing the new --dedup functionality. If this is
the case, then it probably would make sense to combine all three tests
into a single "setup" test.

> +test_expect_success 'dev merge master' '
> +       test_must_fail git merge master &&
> +       git ls-files -t --dedup >actual1 &&
> +       test_cmp expect1 actual1 &&
> +       rm delete.txt &&
> +       git ls-files -d -m -t --dedup >actual2 &&
> +       test_cmp expect2 actual2
> +'

Do you foresee that people will add more tests to this file which will
use the files and branches set up by the "setup" test(s)? If not, if
those branches and files are only ever going to be used by this one
test, then it probably would be better to combine all the above code
into a single test.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, 胡哲宁 wrote (reply to this):

You can see that the coding and documentation of GIT community are really very
standard, which may be one of the things I lack and need to improve ;)
Thanks for patiently correct my errors.

Eric Sunshine <sunshine@sunshineco.com> 于2021年1月14日周四 下午2:39写道:
>
> On Fri, Jan 8, 2021 at 9:36 AM ZheNing Hu via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > builtin:ls-files.c:add git ls-file --dedup option
>
> This subject concisely explains the purpose of the patch. That's good.
> A more typical way to write it would be:
>
>     ls-files: add --dedup option
>
OK.I will correct it more specification.
> > This commit standardizes the code format.
>
> Fixing problems pointed out by reviewers is good. Normally, however,
> when you submit a new version of your patch or patch series, you
> should apply these fixes directly to the patch(es) which introduced
> the problems in the first place rather than adding one or more
> additional patches to fix problems introduced in earlier patches. To
> do this, you typically would use `git rebase -i` or `git commit
> --amend` to squash the fixes into the problematic patches. Thus, when
> you re-submit the patches, they will appear to be "perfect".
>
> For this particular two-patch series, patch [2/2] is doing two things:
> (1) fixing style problems from patch [1/2], and (2) adding
> documentation and tests which logically belong with the feature added
> by patch [1/2]. Taking the above advice into account, a better
> presentation when you re-submit this series would be to squash these
> two patches into a single patch.
>
I thought before this was gitgitgadget would sent duplicate patch
over and over again. It seems like I really should go straight ahead
and squash my commits , so I know what I should do.
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> > ---
> > diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
> > @@ -81,6 +82,9 @@ OPTIONS
> > +--dedup::
> > +       Suppress duplicates entries when conflicts happen or
> > +       specify -d -m at the same time.
>
> For consistency with typesetting elsewhere in this file, use backticks
> around the command-line options. It also often is a good idea to spell
> the options using long form since it is typically easier to search for
> the long form of an option in documentation. So, perhaps the above can
> be written like this:
>
>     Suppress duplicate entries when `--deleted` and `--modified` are
>     combined.
>
> > diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> > -       const struct cache_entry *last_stage=NULL;
> > +       const struct cache_entry *last_stage = NULL;
> > -                       if(show_cached && delete_dup){
> > +                       if (show_cached && delete_dup) {
> > -                                       last_stage=ce;
> > +                                       last_stage = ce;
> > -                       if(delete_dup){
> > +                       if (delete_dup) {
> > -                       if(delete_dup && show_deleted && show_modified && err)
> > +                       if (delete_dup && show_deleted && show_modified && err)
> > -                       else{
> > -                               if (show_deleted && err)/* you can't find it,so it's actually removed at all! */
> > +                       else {
> > +                               if (show_deleted && err)
>
> As mentioned above, these style fixes should be squashed into the
> first patch, rather than being done in a separate patch, so that
> reviewers see a nicely polished patch rather than a patch which
> requires later fixing up.
>
> > diff --git a/t/t3012-ls-files-dedup.sh b/t/t3012-ls-files-dedup.sh
> > @@ -0,0 +1,63 @@
> > +test_expect_success 'master branch setup and write expect1 expect2 and commit' '
>
> We usually give this test a simple title such as "setup" so that we
> don't have to worry about the title becoming outdated as people make
> changes to the test itself.
>
> > +       touch a.txt &&
> > +       touch b.txt &&
> > +       touch delete.txt &&
>
> On this project, we use `touch` when the timestamp of the empty files
> is important to the test. If the timestamp is not important, then we
> just use `>`, like this:
>
>     >a.txt &&
>     >b.txt &&
>     >delete.txt &&
>
OK,maybe because I always use touch to generate files.
> > +       cat <<-EOF >expect1 &&
> > +       M a.txt
> > +       H b.txt
> > +       H delete.txt
> > +       H expect1
> > +       H expect2
> > +       EOF
> > +       cat <<-EOF >expect2 &&
> > +       C a.txt
> > +       R delete.txt
> > +       EOF
>
> When no variables are being interpolated in the here-doc content, we
> use -\EOF to let readers know that the here-doc body is literal. So:
>
>     cat >expect1 <<-\EOF &&
>     ...
>     EOF
>
> > +       git add a.txt b.txt delete.txt expect1 expect2 &&
> > +       git commit -m master:1
> > +'
> > +
> > +test_expect_success 'main commit again' '
> > +       echo a>a.txt &&
> > +       echo b>b.txt &&
> > +       echo delete>delete.txt &&
> > +       git add a.txt b.txt delete.txt &&
> > +       git commit -m master:2
> > +'
> > +
> > +test_expect_success 'dev commit' '
> > +       git checkout HEAD~ &&
> > +       git switch -c dev &&
> > +       echo change>a.txt &&
> > +       git add a.txt &&
> > +       git commit -m dev:1
> > +'
>
> These two tests following the "setup" test also seem to be doing setup
> tasks rather than testing the new --dedup functionality. If this is
> the case, then it probably would make sense to combine all three tests
> into a single "setup" test.
>
> > +test_expect_success 'dev merge master' '
> > +       test_must_fail git merge master &&
> > +       git ls-files -t --dedup >actual1 &&
> > +       test_cmp expect1 actual1 &&
> > +       rm delete.txt &&
> > +       git ls-files -d -m -t --dedup >actual2 &&
> > +       test_cmp expect2 actual2
> > +'
>
> Do you foresee that people will add more tests to this file which will
> use the files and branches set up by the "setup" test(s)? If not, if
> those branches and files are only ever going to be used by this one
> test, then it probably would be better to combine all the above code
> into a single test.
No,the test file may just need only one.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 14, 2021

User 胡哲宁 <adlternative@gmail.com> has been added to the cc: list.

@adlternative
Copy link
Author

/preview

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 14, 2021

Error: Could not determine public email of adlternative

@adlternative
Copy link
Author

/preview

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 14, 2021

Error: Could not determine public email of adlternative

@adlternative
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 14, 2021

Submitted as pull.832.v3.git.1610626942677.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-832/adlternative/ls-files-dedup-v3

To fetch this version to local tag pr-832/adlternative/ls-files-dedup-v3:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-832/adlternative/ls-files-dedup-v3

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 15, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

"阿德烈 via GitGitGadget"  <gitgitgadget@gmail.com> writes:

> From: ZheNing Hu <adlternative@gmail.com>
>
> In order to provide users a better experience
> when viewing information about files in the index
> and the working tree, the `--dedup` option will suppress
> some duplicate options under some conditions.
>
> In a merge conflict, one item of "git ls-files" output may
> appear multiple times. For example,now the file `a.c` has
> a conflict,`a.c` will appear three times in the output of
> "git ls-files".We can use "git ls-files --dedup" to output
> `a.c` only one time.(unless `--stage` or `--unmerged` is
> used to view all the detailed information in the index)

Unlike these option names we see in the description, "dedup" is not
a full word.  Perhaps spell it fully "--deduplicate" while letting
parse-options machinery to accept unique prefix (including
"--dedup"?

> In addition, if you use both `--delete` and `--modify` in
> the same time, The `--dedup` option can also suppress modified

"at the same time", I think.

> entries output.

[let's call this point "point A"]

> `--dedup` option relevant descriptions in
> `Documentation/git-ls-files.txt`,

I am not sure what this means.

> the test script in `t/t3012-ls-files-dedup.sh`
> prove the correctness of the `--dedup` option.

No amount of tests "proves" any correctness, but that is OK.  I
think you meant to say "a few tests have been added to t3012 to
protect the new feature from future breakage" or something like
that.

In any case, I think everything after "point A" and before your sign
off does not belong to the log message.  The diffstat shows that
documentation and tests have been added already.

> +--dedup::
> +	Suppress duplicate entries when conflict happen

"conflict happen" -> "there are unmerged paths", as the term
"unmerged" is already shown to readers of "ls-files --help".

> +	or `--deleted` and `--modified` are combined.

I somehow thought that you refrained from deduping when you are
showing the stages with "ls-files -u" and "ls-files -s", or you are
showing status with "ls-files -t", because you will otherwise lose
information.  In other words, showing only one cache entry out of
many that share the same name makes sense only when we are showing
name and nothing else.

Has that been changed from the previous rounds?

> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index c8eae899b82..bc4eded19ab 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -316,6 +318,20 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
>  		for (i = 0; i < repo->index->cache_nr; i++) {
>  			const struct cache_entry *ce = repo->index->cache[i];
>  
> +			if (show_cached && delete_dup) {
> +				switch (ce_stage(ce)) {
> +				case 0:
> +				default:
> +					break;

This part looks somewhat strange for two reasons:

 - The code enumerates ALL the possible stage numbers from 0 to 3;
   if we were to have "default", I'd expect it would be a separate
   switch arm from the possible values that calls out an programming
   error, e.g. BUG("at stage #%d???", ce_stage(ce)).  Simply removing
   the "default" arm would be another way out of this strangeness.

 - When we see a stage #0 entry, we know we will not have higher
   stage entries with the same name.  Not clearing last_stage here
   feels wrong, as the primary reason why last_stage variable is
   used is to remember the last ce that was shown, so that other
   entries with the same name can be skipped.

By the way, "last_shown_ce" may be a much better name for the
variable, as you do not really care what stage number the ce you
showed last was at (you care about its name).

Also, I do not see a good reason why the last_shown_ce variable
should have lifetime longer than the block that contains this for()
loop (and the other for loop for deleted/modified codepath we see
later).  Especially since you initialize the variable that you made
visible to the entire function to NULL before entering the first for
loop, but you do not set it back to NULL before entering the second
for loop, it is inviting a subtle bug.  You may have been given
show_cached and show_modified at the same time, so you enter the
first loop and have shown the first stage of the last conflicted
path, whose cache entry is left in the last_stage variable.  Since
the variable has longer lifespan than necessary, when the second
loop is entered, it still points at the cache entry for the highest
stage of the last conflicted path.  That is because the code forgets
to clear it to NULL before entering the second for loop.

Having said all that, I suspect that we may be much better off if we
can somehow merge the two loops into one.  You may be dedup adjacent
entries in each loop separately with the approach taken by this
patch, but I do not think the patch would work to deduplicate across
two loops.  For example, what happens if you do this?

    $ git reset --hard
    $ echo >>builtin/ls-files.c
    $ git ls-files -c -m builtin/ls-files.c
    $ git ls-files -t -c -m builtin/ls-files.c

I think you see the path twice in the output, with or without your
--dedup option (remember what I said about proving, by the way? ;-)).

Once we successfully merged two loops into one, the part that shows
tracked paths in the function would have only one loop, and it would
become a lot cleaner to add the logic to "skip showing the ce if it
has the same name as the previously shown one, only when doing so
won't lose information", by doing something like this:

	static void show_files(....)
	{
		/* show_others || show_killed done here */
		...

		/* leave early if not showing anything */
		if (! (show_cached || show_stage || show_deleted || show_modified))
			return;

		last_shown_ce = NULL;
		for (i = 0; i < repo->index->cache_nr; i++) {
			const struct cache_entry *ce = repo->index->cache[i];

			if (skipping_duplicates && last_shown_ce)
				if (!strcmp(ce->name, last_shown_ce->name))
					continue;

			construct_fullname();

                        /* various reasons to skip the entry tested */
			if (showing ignored directory and ce is excluded)
				continue;
			if (show_unmerged && !ce_stage(ce))
				continue;
			if (ce->ce_flags & CE_UPDATE)
				continue;
			... other reasons may appear here ...

			/* now we are committed to show it */
			last_shown_ce = ce;

			... various different ways to show ce come here ... 
			show_ce(...);
		}
	}

where "skipping_duplicates" would be set when "--deduplicate" is
asked and we are not showing information other than the pathname
via various options e.g. the tags (-t) or stages (-s/-u).

> +			if (delete_dup && show_deleted && show_modified && err)
>  				show_ce(repo, dir, ce, fullname.buf, tag_removed);

I actually think the original code that is still shown here ...

> +			else {
> +				if (show_deleted && err)
> +					show_ce(repo, dir, ce, fullname.buf, tag_removed);
> +				if (show_modified && ie_modified(repo->index, ce, &st, 0))

... about modified file is buggy.  If lstat() failed, then &st has
no useful information, so it is wrong to feed it to ie_modified().

Perhaps a three-patch series that is structured like this may be in
order?

 #1: bugfix for --deleted and --modified.

	err = lstat(fullname.buf, &st);
	if (err) {
		/* deleted? */
		if (errno != E_NOENT)
			error_errno("cannot lstat '%s'", fullname.buf);
		else {
			if (show_deleted)
                        	show_ce(..., tag_removed);
			if (show_modified)
                        	show_ce(..., tag_modified);
		}
	} else if (show_modified && ie_modified(...))
		show_ce(..., tag_modified);
    
     This hopefully should not change the semantics.  If you ask
     --deleted and --modified, a deleted path would be listed twice.

 #2: consolidate two for loops into one.

     The two loops have slightly different condition to skip a ce,
     and different logic on what tag each path is shown with.  When
     --cached and --modified or --deleted are asked for at the same
     time, we'd show them multiple times (this is done inside the
     loop for each ce)

	if (show_cached || show_stage)
		show_ce(... ce_stage(ce) ? tag_unmerged : ...);
	err = lstat(fullname.buf, &st);
	if (err) {
        	/* deleted? */
                ... code that corresponds to the
		... illustration in #1 above come here.
	} else if (...)
		show_ce(..., tag_modified);

     This changes the semantics.  The original iterates the index
     twice, so you may see the same entry from --cached once and
     then again from --modified.  The updated one still will show
     the same entry twice but next to each other.

 #3: optionally deduplicate.

     Once we have a single loop, deduplicationg based on names is
     trivial, as we seen before.


Hmm?

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 16, 2021

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Thu, Jan 14, 2021 at 7:22 AM 阿德烈 via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> In order to provide users a better experience
> when viewing information about files in the index
> and the working tree, the `--dedup` option will suppress
> some duplicate options under some conditions.
> [...]

I have a few very minor comments alongside Junio's review comments...

> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
> diff --git a/t/t3012-ls-files-dedup.sh b/t/t3012-ls-files-dedup.sh
> @@ -0,0 +1,54 @@
> +test_description='git ls-files --dedup test.
> +
> +This test prepares the following in the cache:
> +
> +    a.txt       - a file(base)
> +    a.txt      - a file(master)
> +    a.txt       - a file(dev)
> +    b.txt       - a file
> +    delete.txt  - a file
> +    expect1    - a file
> +    expect2    - a file
> +
> +'

This test script description is outdated now. Perhaps shorten it to:

    test_description='ls-files dedup tests'

Or, it might be suitable to simply add the new test to the existing
t3004-ls-files-basic.sh instead.

> +test_expect_success 'setup' '
> +       > a.txt &&
> +       > b.txt &&
> +       > delete.txt &&
> +       cat >expect1<<-\EOF &&

Style nits: no space after redirection operator and a space before
redirection operator:

    >a.txt &&
    >b.txt &&
    >delete.txt &&
    cat >expect1 <<-\EOF &&

> +       cat >expect2<<-EOF &&

Nit: missing the backslash (and wrong spacing):

    cat >expect2 <<-\EOF &&

> +       echo a>a.txt &&
> +       echo b>b.txt &&

Style:

    echo a >a.txt &&
    echo b >b.txt &&

> +       echo delete >delete.txt &&
> +       git add a.txt b.txt delete.txt &&
> +       git commit -m master:2 &&
> +       git checkout HEAD~ &&
> +       git switch -c dev &&

If someone adds a new test after this test, then that new test will
run in the "dev" branch, which might be unexpected or undesirable. It
often is a good idea to ensure that tests do certain types of cleanup
to avoid breaking subsequent tests. Here, it would be a good idea to
ensure that the test switches back to the original branch when it
finishes (regardless of whether it finishes successfully or
unsuccessfully).

    git switch -c dev &&
    test_when_finished "git switch master" &&

Or you could use `git switch -` if you don't want to hard-code the
name "master" in the test (since there has been effort lately to
remove that name from tests.

> +       echo change >a.txt &&
> +       git add a.txt &&
> +       git commit -m dev:1 &&
> +       test_must_fail git merge master &&
> +       git ls-files -t --dedup >actual1 &&
> +       test_cmp expect1 actual1 &&
> +       rm delete.txt &&
> +       git ls-files -d -m -t --dedup >actual2 &&
> +       test_cmp expect2 actual2

We usually don't bother giving temporary files unique names like
"actual1" and "actual2" unless those files must exist at the same
time. This is because unique names like this may confuse readers into
wondering if there is some hidden interdependency between the files.
In this case, the files don't need to exist at the same time, so it
may be better simply to use the names "actual" and "expect", like
this:

    ...other stuff...
    cat >expect <<-\EOF &&
    ...
    EOF
    git ls-files -t --dedup >actual &&
    test_cmp expect actual &&
    rm delete.txt &&
    cat >expect <<-\EOF &&
    ...
    EOF
    git ls-files -d -m -t --dedup >actual &&
    test_cmp expect actual

(It also has the benefit that the "expect" content is closer to the
place where it is actually used, which may make it a bit easier for a
person reading the test to understand what is supposed to be
produced.)

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 17, 2021

On the Git mailing list, 胡哲宁 wrote (reply to this):

Junio, thank you for your patience to review
my patch and guide me how to modify it.

Junio C Hamano <gitster@pobox.com> 于2021年1月15日周五 上午8:59写道:
>
> "阿德烈 via GitGitGadget"  <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > In order to provide users a better experience
> > when viewing information about files in the index
> > and the working tree, the `--dedup` option will suppress
> > some duplicate options under some conditions.
> >
> > In a merge conflict, one item of "git ls-files" output may
> > appear multiple times. For example,now the file `a.c` has
> > a conflict,`a.c` will appear three times in the output of
> > "git ls-files".We can use "git ls-files --dedup" to output
> > `a.c` only one time.(unless `--stage` or `--unmerged` is
> > used to view all the detailed information in the index)
>
> Unlike these option names we see in the description, "dedup" is not
> a full word.  Perhaps spell it fully "--deduplicate" while letting
> parse-options machinery to accept unique prefix (including
> "--dedup"?
>
Ok i have modified "--dedup" to "--deduplicate".
> > In addition, if you use both `--delete` and `--modify` in
> > the same time, The `--dedup` option can also suppress modified
>
> "at the same time", I think.
>
My poor English grammar :-)
> > entries output.
>
> [let's call this point "point A"]
>
> > `--dedup` option relevant descriptions in
> > `Documentation/git-ls-files.txt`,
>
> I am not sure what this means.
>
> > the test script in `t/t3012-ls-files-dedup.sh`
> > prove the correctness of the `--dedup` option.
>
> No amount of tests "proves" any correctness, but that is OK.  I
> think you meant to say "a few tests have been added to t3012 to
> protect the new feature from future breakage" or something like
> that.
>
Alright, I understand!
> In any case, I think everything after "point A" and before your sign
> off does not belong to the log message.  The diffstat shows that
> documentation and tests have been added already.
>
> > +--dedup::
> > +     Suppress duplicate entries when conflict happen
>
> "conflict happen" -> "there are unmerged paths", as the term
> "unmerged" is already shown to readers of "ls-files --help".
>
Well, maybe I'm not good enough with these proper nouns.
> > +     or `--deleted` and `--modified` are combined.
>
> I somehow thought that you refrained from deduping when you are
> showing the stages with "ls-files -u" and "ls-files -s", or you are
> showing status with "ls-files -t", because you will otherwise lose
> information.  In other words, showing only one cache entry out of
> many that share the same name makes sense only when we are showing
> name and nothing else.
>
You are right, "--deduplicate" should only work on duplicate file names,
so "ls-files -t" also needs to be corrected.
Well,This is true a bug I haven't notice.
> Having said all that, I suspect that we may be much better off if we
> can somehow merge the two loops into one.  You may be dedup adjacent
> entries in each loop separately with the approach taken by this
> patch, but I do not think the patch would work to deduplicate across
> two loops.  For example, what happens if you do this?
>
>     $ git reset --hard
>     $ echo >>builtin/ls-files.c
>     $ git ls-files -c -m builtin/ls-files.c
>     $ git ls-files -t -c -m builtin/ls-files.c
>
> I think you see the path twice in the output, with or without your
> --dedup option (remember what I said about proving, by the way? ;-)).
>
Yeah,This is because I may have missed the -c option with other options
at the same time.
Here I may disagree with your point of view:
                 if (errno != E_NOENT)
                         error_errno("cannot lstat '%s'", fullname.buf);
With this sentence included, the patch will fail the test:
t/t3010-ls-files-killed-modified.sh.
the errno maybe ENOTDIR when you try to lstat a file`r` with `lstat("r/f",&st);`
So I temporarily removed the judgment of errno.
>  #2: consolidate two for loops into one.
>
>      The two loops have slightly different condition to skip a ce,
>      and different logic on what tag each path is shown with.  When
>      --cached and --modified or --deleted are asked for at the same
>      time, we'd show them multiple times (this is done inside the
>      loop for each ce)
>
>         if (show_cached || show_stage)
>                 show_ce(... ce_stage(ce) ? tag_unmerged : ...);
>         err = lstat(fullname.buf, &st);
>         if (err) {
>                 /* deleted? */
>                 ... code that corresponds to the
>                 ... illustration in #1 above come here.
>         } else if (...)
>                 show_ce(..., tag_modified);
>
>      This changes the semantics.  The original iterates the index
>      twice, so you may see the same entry from --cached once and
>      then again from --modified.  The updated one still will show
>      the same entry twice but next to each other.
>
Well,This does change the semantics. I think people who used two
for loops before may want to separate different outputs.
Now, if you don’t use "--deduplicate", You may see six consecutive
items under a combination of multiple options.
>  #3: optionally deduplicate.
>
>      Once we have a single loop, deduplicationg based on names is
>      trivial, as we seen before.
>
>
Indeed so.
> Hmm?
THANKS.

Junio C Hamano <gitster@pobox.com> 于2021年1月15日周五 上午8:59写道:
>
> "阿德烈 via GitGitGadget"  <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > In order to provide users a better experience
> > when viewing information about files in the index
> > and the working tree, the `--dedup` option will suppress
> > some duplicate options under some conditions.
> >
> > In a merge conflict, one item of "git ls-files" output may
> > appear multiple times. For example,now the file `a.c` has
> > a conflict,`a.c` will appear three times in the output of
> > "git ls-files".We can use "git ls-files --dedup" to output
> > `a.c` only one time.(unless `--stage` or `--unmerged` is
> > used to view all the detailed information in the index)
>
> Unlike these option names we see in the description, "dedup" is not
> a full word.  Perhaps spell it fully "--deduplicate" while letting
> parse-options machinery to accept unique prefix (including
> "--dedup"?
>
> > In addition, if you use both `--delete` and `--modify` in
> > the same time, The `--dedup` option can also suppress modified
>
> "at the same time", I think.
>
> > entries output.
>
> [let's call this point "point A"]
>
> > `--dedup` option relevant descriptions in
> > `Documentation/git-ls-files.txt`,
>
> I am not sure what this means.
>
> > the test script in `t/t3012-ls-files-dedup.sh`
> > prove the correctness of the `--dedup` option.
>
> No amount of tests "proves" any correctness, but that is OK.  I
> think you meant to say "a few tests have been added to t3012 to
> protect the new feature from future breakage" or something like
> that.
>
> In any case, I think everything after "point A" and before your sign
> off does not belong to the log message.  The diffstat shows that
> documentation and tests have been added already.
>
> > +--dedup::
> > +     Suppress duplicate entries when conflict happen
>
> "conflict happen" -> "there are unmerged paths", as the term
> "unmerged" is already shown to readers of "ls-files --help".
>
> > +     or `--deleted` and `--modified` are combined.
>
> I somehow thought that you refrained from deduping when you are
> showing the stages with "ls-files -u" and "ls-files -s", or you are
> showing status with "ls-files -t", because you will otherwise lose
> information.  In other words, showing only one cache entry out of
> many that share the same name makes sense only when we are showing
> name and nothing else.
>
> Has that been changed from the previous rounds?
>
> > diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> > index c8eae899b82..bc4eded19ab 100644
> > --- a/builtin/ls-files.c
> > +++ b/builtin/ls-files.c
> > @@ -316,6 +318,20 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
> >               for (i = 0; i < repo->index->cache_nr; i++) {
> >                       const struct cache_entry *ce = repo->index->cache[i];
> >
> > +                     if (show_cached && delete_dup) {
> > +                             switch (ce_stage(ce)) {
> > +                             case 0:
> > +                             default:
> > +                                     break;
>
> This part looks somewhat strange for two reasons:
>
>  - The code enumerates ALL the possible stage numbers from 0 to 3;
>    if we were to have "default", I'd expect it would be a separate
>    switch arm from the possible values that calls out an programming
>    error, e.g. BUG("at stage #%d???", ce_stage(ce)).  Simply removing
>    the "default" arm would be another way out of this strangeness.
>
>  - When we see a stage #0 entry, we know we will not have higher
>    stage entries with the same name.  Not clearing last_stage here
>    feels wrong, as the primary reason why last_stage variable is
>    used is to remember the last ce that was shown, so that other
>    entries with the same name can be skipped.
>
> By the way, "last_shown_ce" may be a much better name for the
> variable, as you do not really care what stage number the ce you
> showed last was at (you care about its name).
>
> Also, I do not see a good reason why the last_shown_ce variable
> should have lifetime longer than the block that contains this for()
> loop (and the other for loop for deleted/modified codepath we see
> later).  Especially since you initialize the variable that you made
> visible to the entire function to NULL before entering the first for
> loop, but you do not set it back to NULL before entering the second
> for loop, it is inviting a subtle bug.  You may have been given
> show_cached and show_modified at the same time, so you enter the
> first loop and have shown the first stage of the last conflicted
> path, whose cache entry is left in the last_stage variable.  Since
> the variable has longer lifespan than necessary, when the second
> loop is entered, it still points at the cache entry for the highest
> stage of the last conflicted path.  That is because the code forgets
> to clear it to NULL before entering the second for loop.
>
> Having said all that, I suspect that we may be much better off if we
> can somehow merge the two loops into one.  You may be dedup adjacent
> entries in each loop separately with the approach taken by this
> patch, but I do not think the patch would work to deduplicate across
> two loops.  For example, what happens if you do this?
>
>     $ git reset --hard
>     $ echo >>builtin/ls-files.c
>     $ git ls-files -c -m builtin/ls-files.c
>     $ git ls-files -t -c -m builtin/ls-files.c
>
> I think you see the path twice in the output, with or without your
> --dedup option (remember what I said about proving, by the way? ;-)).
>
> Once we successfully merged two loops into one, the part that shows
> tracked paths in the function would have only one loop, and it would
> become a lot cleaner to add the logic to "skip showing the ce if it
> has the same name as the previously shown one, only when doing so
> won't lose information", by doing something like this:
>
>         static void show_files(....)
>         {
>                 /* show_others || show_killed done here */
>                 ...
>
>                 /* leave early if not showing anything */
>                 if (! (show_cached || show_stage || show_deleted || show_modified))
>                         return;
>
>                 last_shown_ce = NULL;
>                 for (i = 0; i < repo->index->cache_nr; i++) {
>                         const struct cache_entry *ce = repo->index->cache[i];
>
>                         if (skipping_duplicates && last_shown_ce)
>                                 if (!strcmp(ce->name, last_shown_ce->name))
>                                         continue;
>
>                         construct_fullname();
>
>                         /* various reasons to skip the entry tested */
>                         if (showing ignored directory and ce is excluded)
>                                 continue;
>                         if (show_unmerged && !ce_stage(ce))
>                                 continue;
>                         if (ce->ce_flags & CE_UPDATE)
>                                 continue;
>                         ... other reasons may appear here ...
>
>                         /* now we are committed to show it */
>                         last_shown_ce = ce;
>
>                         ... various different ways to show ce come here ...
>                         show_ce(...);
>                 }
>         }
>
> where "skipping_duplicates" would be set when "--deduplicate" is
> asked and we are not showing information other than the pathname
> via various options e.g. the tags (-t) or stages (-s/-u).
>
> > +                     if (delete_dup && show_deleted && show_modified && err)
> >                               show_ce(repo, dir, ce, fullname.buf, tag_removed);
>
> I actually think the original code that is still shown here ...
>
> > +                     else {
> > +                             if (show_deleted && err)
> > +                                     show_ce(repo, dir, ce, fullname.buf, tag_removed);
> > +                             if (show_modified && ie_modified(repo->index, ce, &st, 0))
>
> ... about modified file is buggy.  If lstat() failed, then &st has
> no useful information, so it is wrong to feed it to ie_modified().
>
> Perhaps a three-patch series that is structured like this may be in
> order?
>
>  #1: bugfix for --deleted and --modified.
>
>         err = lstat(fullname.buf, &st);
>         if (err) {
>                 /* deleted? */
>                 if (errno != E_NOENT)
>                         error_errno("cannot lstat '%s'", fullname.buf);
>                 else {
>                         if (show_deleted)
>                                 show_ce(..., tag_removed);
>                         if (show_modified)
>                                 show_ce(..., tag_modified);
>                 }
>         } else if (show_modified && ie_modified(...))
>                 show_ce(..., tag_modified);
>
>      This hopefully should not change the semantics.  If you ask
>      --deleted and --modified, a deleted path would be listed twice.
>
>  #2: consolidate two for loops into one.
>
>      The two loops have slightly different condition to skip a ce,
>      and different logic on what tag each path is shown with.  When
>      --cached and --modified or --deleted are asked for at the same
>      time, we'd show them multiple times (this is done inside the
>      loop for each ce)
>
>         if (show_cached || show_stage)
>                 show_ce(... ce_stage(ce) ? tag_unmerged : ...);
>         err = lstat(fullname.buf, &st);
>         if (err) {
>                 /* deleted? */
>                 ... code that corresponds to the
>                 ... illustration in #1 above come here.
>         } else if (...)
>                 show_ce(..., tag_modified);
>
>      This changes the semantics.  The original iterates the index
>      twice, so you may see the same entry from --cached once and
>      then again from --modified.  The updated one still will show
>      the same entry twice but next to each other.
>
>  #3: optionally deduplicate.
>
>      Once we have a single loop, deduplicationg based on names is
>      trivial, as we seen before.
>
>
> Hmm?

@adlternative
Copy link
Author

/preview

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 17, 2021

On the Git mailing list, 胡哲宁 wrote (reply to this):

Eric,Thanks!
I have little confuse about I can use` test_when_finished "git switch master" `,
but I can't use` test_when_finished "git switch -" `,
why?

Eric Sunshine <sunshine@sunshineco.com> 于2021年1月16日周六 下午3:13写道:
>
> On Thu, Jan 14, 2021 at 7:22 AM 阿德烈 via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > In order to provide users a better experience
> > when viewing information about files in the index
> > and the working tree, the `--dedup` option will suppress
> > some duplicate options under some conditions.
> > [...]
>
> I have a few very minor comments alongside Junio's review comments...
>
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> > ---
> > diff --git a/t/t3012-ls-files-dedup.sh b/t/t3012-ls-files-dedup.sh
> > @@ -0,0 +1,54 @@
> > +test_description='git ls-files --dedup test.
> > +
> > +This test prepares the following in the cache:
> > +
> > +    a.txt       - a file(base)
> > +    a.txt      - a file(master)
> > +    a.txt       - a file(dev)
> > +    b.txt       - a file
> > +    delete.txt  - a file
> > +    expect1    - a file
> > +    expect2    - a file
> > +
> > +'
>
> This test script description is outdated now. Perhaps shorten it to:
>
>     test_description='ls-files dedup tests'
>
> Or, it might be suitable to simply add the new test to the existing
> t3004-ls-files-basic.sh instead.
>
> > +test_expect_success 'setup' '
> > +       > a.txt &&
> > +       > b.txt &&
> > +       > delete.txt &&
> > +       cat >expect1<<-\EOF &&
>
> Style nits: no space after redirection operator and a space before
> redirection operator:
>
>     >a.txt &&
>     >b.txt &&
>     >delete.txt &&
>     cat >expect1 <<-\EOF &&
>
> > +       cat >expect2<<-EOF &&
>
> Nit: missing the backslash (and wrong spacing):
>
>     cat >expect2 <<-\EOF &&
>
> > +       echo a>a.txt &&
> > +       echo b>b.txt &&
>
> Style:
>
>     echo a >a.txt &&
>     echo b >b.txt &&
>
> > +       echo delete >delete.txt &&
> > +       git add a.txt b.txt delete.txt &&
> > +       git commit -m master:2 &&
> > +       git checkout HEAD~ &&
> > +       git switch -c dev &&
>
> If someone adds a new test after this test, then that new test will
> run in the "dev" branch, which might be unexpected or undesirable. It
> often is a good idea to ensure that tests do certain types of cleanup
> to avoid breaking subsequent tests. Here, it would be a good idea to
> ensure that the test switches back to the original branch when it
> finishes (regardless of whether it finishes successfully or
> unsuccessfully).
>
>     git switch -c dev &&
>     test_when_finished "git switch master" &&
>
> Or you could use `git switch -` if you don't want to hard-code the
> name "master" in the test (since there has been effort lately to
> remove that name from tests.
>
> > +       echo change >a.txt &&
> > +       git add a.txt &&
> > +       git commit -m dev:1 &&
> > +       test_must_fail git merge master &&
> > +       git ls-files -t --dedup >actual1 &&
> > +       test_cmp expect1 actual1 &&
> > +       rm delete.txt &&
> > +       git ls-files -d -m -t --dedup >actual2 &&
> > +       test_cmp expect2 actual2
>
> We usually don't bother giving temporary files unique names like
> "actual1" and "actual2" unless those files must exist at the same
> time. This is because unique names like this may confuse readers into
> wondering if there is some hidden interdependency between the files.
> In this case, the files don't need to exist at the same time, so it
> may be better simply to use the names "actual" and "expect", like
> this:
>
>     ...other stuff...
>     cat >expect <<-\EOF &&
>     ...
>     EOF
>     git ls-files -t --dedup >actual &&
>     test_cmp expect actual &&
>     rm delete.txt &&
>     cat >expect <<-\EOF &&
>     ...
>     EOF
>     git ls-files -d -m -t --dedup >actual &&
>     test_cmp expect actual
>
> (It also has the benefit that the "expect" content is closer to the
> place where it is actually used, which may make it a bit easier for a
> person reading the test to understand what is supposed to be
> produced.)

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 17, 2021

Error: Could not determine public email of adlternative

@adlternative
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 17, 2021

Submitted as pull.832.v4.git.1610856136.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-832/adlternative/ls-files-dedup-v4

To fetch this version to local tag pr-832/adlternative/ls-files-dedup-v4:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-832/adlternative/ls-files-dedup-v4

@@ -13,6 +13,7 @@ SYNOPSIS
(--[cached|deleted|others|ignored|stage|unmerged|killed|modified])*
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Additional instructions:
> In order to display entries information,`deduplicate` suppresses
> the output of duplicate file names, not the output of duplicate
> entries information, so under the option of `-t`, `--stage`, `--unmerge`,
> `--deduplicate` will have no effect.

That information belongs to the end-user documentation.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 23, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

From: ZheNing Hu <adlternative@gmail.com>

This situation may occur in the original code: lstat() failed
but we use `&st` to feed ie_modified() later.

Therefore, we can directly execute show_ce without the judgment of
ie_modified() when lstat() has failed.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: fixed misindented code]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/ls-files.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index c8eae899b8..ce6f6ad00e 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -335,7 +335,7 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 		for (i = 0; i < repo->index->cache_nr; i++) {
 			const struct cache_entry *ce = repo->index->cache[i];
 			struct stat st;
-			int err;
+			int stat_err;
 
 			construct_fullname(&fullname, repo, ce);
 
@@ -346,10 +346,13 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 				continue;
 			if (ce_skip_worktree(ce))
 				continue;
-			err = lstat(fullname.buf, &st);
-			if (show_deleted && err)
+			stat_err = lstat(fullname.buf, &st);
+			if (stat_err && (errno != ENOENT && errno != ENOTDIR))
+				error_errno("cannot lstat '%s'", fullname.buf);
+			if (stat_err && show_deleted)
 				show_ce(repo, dir, ce, fullname.buf, tag_removed);
-			if (show_modified && ie_modified(repo->index, ce, &st, 0))
+			if (show_modified &&
+			    (stat_err || ie_modified(repo->index, ce, &st, 0)))
 				show_ce(repo, dir, ce, fullname.buf, tag_modified);
 		}
 	}
-- 
2.30.0-491-g302c625a7b

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 23, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

From: ZheNing Hu <adlternative@gmail.com>

This will make it easier to show only one entry per filename in the
next step.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: corrected the log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/ls-files.c | 63 ++++++++++++++++++++--------------------------
 1 file changed, 27 insertions(+), 36 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index ce6f6ad00e..e94d724aff 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -312,49 +312,40 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 		if (show_killed)
 			show_killed_files(repo->index, dir);
 	}
-	if (show_cached || show_stage) {
-		for (i = 0; i < repo->index->cache_nr; i++) {
-			const struct cache_entry *ce = repo->index->cache[i];
 
-			construct_fullname(&fullname, repo, ce);
+	if (!(show_cached || show_stage || show_deleted || show_modified))
+		return;
+	for (i = 0; i < repo->index->cache_nr; i++) {
+		const struct cache_entry *ce = repo->index->cache[i];
+		struct stat st;
+		int stat_err;
 
-			if ((dir->flags & DIR_SHOW_IGNORED) &&
-			    !ce_excluded(dir, repo->index, fullname.buf, ce))
-				continue;
-			if (show_unmerged && !ce_stage(ce))
-				continue;
-			if (ce->ce_flags & CE_UPDATE)
-				continue;
+		construct_fullname(&fullname, repo, ce);
+
+		if ((dir->flags & DIR_SHOW_IGNORED) &&
+			!ce_excluded(dir, repo->index, fullname.buf, ce))
+			continue;
+		if (ce->ce_flags & CE_UPDATE)
+			continue;
+		if ((show_cached || show_stage) &&
+		    (!show_unmerged || ce_stage(ce)))
 			show_ce(repo, dir, ce, fullname.buf,
 				ce_stage(ce) ? tag_unmerged :
 				(ce_skip_worktree(ce) ? tag_skip_worktree :
 				 tag_cached));
-		}
-	}
-	if (show_deleted || show_modified) {
-		for (i = 0; i < repo->index->cache_nr; i++) {
-			const struct cache_entry *ce = repo->index->cache[i];
-			struct stat st;
-			int stat_err;
-
-			construct_fullname(&fullname, repo, ce);
 
-			if ((dir->flags & DIR_SHOW_IGNORED) &&
-			    !ce_excluded(dir, repo->index, fullname.buf, ce))
-				continue;
-			if (ce->ce_flags & CE_UPDATE)
-				continue;
-			if (ce_skip_worktree(ce))
-				continue;
-			stat_err = lstat(fullname.buf, &st);
-			if (stat_err && (errno != ENOENT && errno != ENOTDIR))
-				error_errno("cannot lstat '%s'", fullname.buf);
-			if (stat_err && show_deleted)
-				show_ce(repo, dir, ce, fullname.buf, tag_removed);
-			if (show_modified &&
-			    (stat_err || ie_modified(repo->index, ce, &st, 0)))
-				show_ce(repo, dir, ce, fullname.buf, tag_modified);
-		}
+		if (!(show_deleted || show_modified))
+			continue;
+		if (ce_skip_worktree(ce))
+			continue;
+		stat_err = lstat(fullname.buf, &st);
+		if (stat_err && (errno != ENOENT && errno != ENOTDIR))
+			error_errno("cannot lstat '%s'", fullname.buf);
+		if (stat_err && show_deleted)
+			show_ce(repo, dir, ce, fullname.buf, tag_removed);
+		if (show_modified &&
+		    (stat_err || ie_modified(repo->index, ce, &st, 0)))
+			show_ce(repo, dir, ce, fullname.buf, tag_modified);
 	}
 
 	strbuf_release(&fullname);
-- 
2.30.0-491-g302c625a7b

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 23, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

From: ZheNing Hu <adlternative@gmail.com>

During a merge conflict, the name of a file may appear multiple
times in "git ls-files" output, once for each stage.  If you use
both `--delete` and `--modify` at the same time, the output may
mention a deleted file twice.

When none of the '-t', '-u', or '-s' options is in use, these
duplicate entries do not add much value to the output.

Introduce a new '--deduplicate' option to suppress them.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: extended doc and rewritten commit log]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/git-ls-files.txt |  8 +++++
 builtin/ls-files.c             | 31 ++++++++++++++--
 t/t3012-ls-files-dedup.sh      | 66 ++++++++++++++++++++++++++++++++++
 3 files changed, 102 insertions(+), 3 deletions(-)
 create mode 100755 t/t3012-ls-files-dedup.sh

diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt
index 0a3b5265b3..6d11ab506b 100644
--- a/Documentation/git-ls-files.txt
+++ b/Documentation/git-ls-files.txt
@@ -13,6 +13,7 @@ SYNOPSIS
 		(--[cached|deleted|others|ignored|stage|unmerged|killed|modified])*
 		(-[c|d|o|i|s|u|k|m])*
 		[--eol]
+		[--deduplicate]
 		[-x <pattern>|--exclude=<pattern>]
 		[-X <file>|--exclude-from=<file>]
 		[--exclude-per-directory=<file>]
@@ -80,6 +81,13 @@ OPTIONS
 	\0 line termination on output and do not quote filenames.
 	See OUTPUT below for more information.
 
+--deduplicate::
+	When only filenames are shown, suppress duplicates that may
+	come from having multiple stages during a merge, or giving
+	`--deleted` and `--modified` option at the same time.
+	When any of the `-t`, `--unmerged`, or `--stage` option is
+	in use, this option has no effect.
+
 -x <pattern>::
 --exclude=<pattern>::
 	Skip untracked files matching pattern.
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index e94d724aff..f6f9e483b2 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -35,6 +35,7 @@ static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
 static int recurse_submodules;
+static int skipping_duplicates;
 
 static const char *prefix;
 static int max_prefix_len;
@@ -328,11 +329,14 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 		if (ce->ce_flags & CE_UPDATE)
 			continue;
 		if ((show_cached || show_stage) &&
-		    (!show_unmerged || ce_stage(ce)))
+		    (!show_unmerged || ce_stage(ce))) {
 			show_ce(repo, dir, ce, fullname.buf,
 				ce_stage(ce) ? tag_unmerged :
 				(ce_skip_worktree(ce) ? tag_skip_worktree :
 				 tag_cached));
+			if (skipping_duplicates)
+				goto skip_to_next_name;
+		}
 
 		if (!(show_deleted || show_modified))
 			continue;
@@ -341,11 +345,28 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 		stat_err = lstat(fullname.buf, &st);
 		if (stat_err && (errno != ENOENT && errno != ENOTDIR))
 			error_errno("cannot lstat '%s'", fullname.buf);
-		if (stat_err && show_deleted)
+		if (stat_err && show_deleted) {
 			show_ce(repo, dir, ce, fullname.buf, tag_removed);
+			if (skipping_duplicates)
+				goto skip_to_next_name;
+		}
 		if (show_modified &&
-		    (stat_err || ie_modified(repo->index, ce, &st, 0)))
+		    (stat_err || ie_modified(repo->index, ce, &st, 0))) {
 			show_ce(repo, dir, ce, fullname.buf, tag_modified);
+			if (skipping_duplicates)
+				goto skip_to_next_name;
+		}
+		continue;
+
+skip_to_next_name:
+		{
+			int j;
+			struct cache_entry **cache = repo->index->cache;
+			for (j = i + 1; j < repo->index->cache_nr; j++)
+				if (strcmp(ce->name, cache[j]->name))
+					break;
+			i = j - 1; /* compensate for the for loop */
+		}
 	}
 
 	strbuf_release(&fullname);
@@ -572,6 +593,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			N_("pretend that paths removed since <tree-ish> are still present")),
 		OPT__ABBREV(&abbrev),
 		OPT_BOOL(0, "debug", &debug_mode, N_("show debugging data")),
+		OPT_BOOL(0, "deduplicate", &skipping_duplicates,
+			 N_("suppress duplicate entries")),
 		OPT_END()
 	};
 
@@ -611,6 +634,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 		 * you also show the stage information.
 		 */
 		show_stage = 1;
+	if (show_tag || show_stage)
+		skipping_duplicates = 0;
 	if (dir.exclude_per_dir)
 		exc_given = 1;
 
diff --git a/t/t3012-ls-files-dedup.sh b/t/t3012-ls-files-dedup.sh
new file mode 100755
index 0000000000..2682b1f43a
--- /dev/null
+++ b/t/t3012-ls-files-dedup.sh
@@ -0,0 +1,66 @@
+#!/bin/sh
+
+test_description='git ls-files --deduplicate test'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	>a.txt &&
+	>b.txt &&
+	>delete.txt &&
+	git add a.txt b.txt delete.txt &&
+	git commit -m base &&
+	echo a >a.txt &&
+	echo b >b.txt &&
+	echo delete >delete.txt &&
+	git add a.txt b.txt delete.txt &&
+	git commit -m tip &&
+	git tag tip &&
+	git reset --hard HEAD^ &&
+	echo change >a.txt &&
+	git commit -a -m side &&
+	git tag side
+'
+
+test_expect_success 'git ls-files --deduplicate to show unique unmerged path' '
+	test_must_fail git merge tip &&
+	git ls-files --deduplicate >actual &&
+	cat >expect <<-\EOF &&
+	a.txt
+	b.txt
+	delete.txt
+	EOF
+	test_cmp expect actual &&
+	git merge --abort
+'
+
+test_expect_success 'git ls-files -d -m --deduplicate with different display options' '
+	git reset --hard side &&
+	test_must_fail git merge tip &&
+	rm delete.txt &&
+	git ls-files -d -m --deduplicate >actual &&
+	cat >expect <<-\EOF &&
+	a.txt
+	delete.txt
+	EOF
+	test_cmp expect actual &&
+	git ls-files -d -m -t --deduplicate >actual &&
+	cat >expect <<-\EOF &&
+	C a.txt
+	C a.txt
+	C a.txt
+	R delete.txt
+	C delete.txt
+	EOF
+	test_cmp expect actual &&
+	git ls-files -d -m -c --deduplicate >actual &&
+	cat >expect <<-\EOF &&
+	a.txt
+	b.txt
+	delete.txt
+	EOF
+	test_cmp expect actual &&
+	git merge --abort
+'
+
+test_done
-- 
2.30.0-491-g302c625a7b

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 24, 2021

This patch series was integrated into seen via git@357dc61.

This situation may occur in the original code: lstat() failed
but we use `&st` to feed ie_modified() later.

Therefore, we can directly execute show_ce without the judgment of
ie_modified() when lstat() has failed.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: fixed misindented code]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This will make it easier to show only one entry per filename in the
next step.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: corrected the log message]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
During a merge conflict, the name of a file may appear multiple
times in "git ls-files" output, once for each stage.  If you use
both `--delete` and `--modify` at the same time, the output may
mention a deleted file twice.

When none of the '-t', '-u', or '-s' options is in use, these
duplicate entries do not add much value to the output.

Introduce a new '--deduplicate' option to suppress them.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
[jc: extended doc and rewritten commit log]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
@adlternative
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 24, 2021

Submitted as pull.832.v7.git.1611485667.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-832/adlternative/ls-files-dedup-v7

To fetch this version to local tag pr-832/adlternative/ls-files-dedup-v7:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-832/adlternative/ls-files-dedup-v7

@@ -335,7 +335,7 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
for (i = 0; i < repo->index->cache_nr; i++) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: ZheNing Hu <adlternative@gmail.com>
>
> This situation may occur in the original code: lstat() failed
> but we use `&st` to feed ie_modified() later.
>
> Therefore, we can directly execute show_ce without the judgment of
> ie_modified() when lstat() has failed.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> [jc: fixed misindented code]

I noticed that you reverted my fix in this version, when this is
compared with the one I sent last night.

Comparing the result of applying all three with what I sent last
night, this v7 looks worse (see below).  Let's discard this round
and declare victory with what is already on 'seen'.

Thanks.


---

comparison between what these three patches would produce (preimage)
and what is on 'seen' (postimage)is shown here.

diff --git w/builtin/ls-files.c c/builtin/ls-files.c
index fb9cf50d76..f6f9e483b2 100644
--- w/builtin/ls-files.c
+++ c/builtin/ls-files.c
@@ -313,7 +313,8 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 		if (show_killed)
 			show_killed_files(repo->index, dir);
 	}
-	if (! (show_cached || show_stage || show_deleted || show_modified))
+
+	if (!(show_cached || show_stage || show_deleted || show_modified))
 		return;
 	for (i = 0; i < repo->index->cache_nr; i++) {
 		const struct cache_entry *ce = repo->index->cache[i];
@@ -328,15 +329,16 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 		if (ce->ce_flags & CE_UPDATE)
 			continue;
 		if ((show_cached || show_stage) &&
-			(!show_unmerged || ce_stage(ce))) {
-				show_ce(repo, dir, ce, fullname.buf,
-					ce_stage(ce) ? tag_unmerged :
-					(ce_skip_worktree(ce) ? tag_skip_worktree :
-						tag_cached));
+		    (!show_unmerged || ce_stage(ce))) {
+			show_ce(repo, dir, ce, fullname.buf,
+				ce_stage(ce) ? tag_unmerged :
+				(ce_skip_worktree(ce) ? tag_skip_worktree :
+				 tag_cached));
 			if (skipping_duplicates)
 				goto skip_to_next_name;
 		}
-		if (!show_deleted && !show_modified)
+
+		if (!(show_deleted || show_modified))
 			continue;
 		if (ce_skip_worktree(ce))
 			continue;
@@ -349,12 +351,13 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 				goto skip_to_next_name;
 		}
 		if (show_modified &&
-			(stat_err || ie_modified(repo->index, ce, &st, 0))) {
-				show_ce(repo, dir, ce, fullname.buf, tag_modified);
+		    (stat_err || ie_modified(repo->index, ce, &st, 0))) {
+			show_ce(repo, dir, ce, fullname.buf, tag_modified);
 			if (skipping_duplicates)
 				goto skip_to_next_name;
 		}
 		continue;
+
 skip_to_next_name:
 		{
 			int j;
@@ -362,7 +365,7 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
 			for (j = i + 1; j < repo->index->cache_nr; j++)
 				if (strcmp(ce->name, cache[j]->name))
 					break;
-			i = j - 1; /* compensate for outer for loop */
+			i = j - 1; /* compensate for the for loop */
 		}
 	}
 
@@ -590,7 +593,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
 			N_("pretend that paths removed since <tree-ish> are still present")),
 		OPT__ABBREV(&abbrev),
 		OPT_BOOL(0, "debug", &debug_mode, N_("show debugging data")),
-		OPT_BOOL(0,"deduplicate",&skipping_duplicates,N_("suppress duplicate entries")),
+		OPT_BOOL(0, "deduplicate", &skipping_duplicates,
+			 N_("suppress duplicate entries")),
 		OPT_END()
 	};
 

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, 胡哲宁 wrote (reply to this):

OK,I didn’t notice any formatting changes before.

Am I free from this patch now?I should probably
look for other issues.

Junio, thank you for all your patient help.
I may often make some low-level mistakes.
I am grateful.

Cheers.

Junio C Hamano <gitster@pobox.com> 于2021年1月25日周一 上午6:04写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > This situation may occur in the original code: lstat() failed
> > but we use `&st` to feed ie_modified() later.
> >
> > Therefore, we can directly execute show_ce without the judgment of
> > ie_modified() when lstat() has failed.
> >
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> > [jc: fixed misindented code]
>
> I noticed that you reverted my fix in this version, when this is
> compared with the one I sent last night.
>
> Comparing the result of applying all three with what I sent last
> night, this v7 looks worse (see below).  Let's discard this round
> and declare victory with what is already on 'seen'.
>
> Thanks.
>
>
> ---
>
> comparison between what these three patches would produce (preimage)
> and what is on 'seen' (postimage)is shown here.
>
> diff --git w/builtin/ls-files.c c/builtin/ls-files.c
> index fb9cf50d76..f6f9e483b2 100644
> --- w/builtin/ls-files.c
> +++ c/builtin/ls-files.c
> @@ -313,7 +313,8 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
>                 if (show_killed)
>                         show_killed_files(repo->index, dir);
>         }
> -       if (! (show_cached || show_stage || show_deleted || show_modified))
> +
> +       if (!(show_cached || show_stage || show_deleted || show_modified))
>                 return;
>         for (i = 0; i < repo->index->cache_nr; i++) {
>                 const struct cache_entry *ce = repo->index->cache[i];
> @@ -328,15 +329,16 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
>                 if (ce->ce_flags & CE_UPDATE)
>                         continue;
>                 if ((show_cached || show_stage) &&
> -                       (!show_unmerged || ce_stage(ce))) {
> -                               show_ce(repo, dir, ce, fullname.buf,
> -                                       ce_stage(ce) ? tag_unmerged :
> -                                       (ce_skip_worktree(ce) ? tag_skip_worktree :
> -                                               tag_cached));
> +                   (!show_unmerged || ce_stage(ce))) {
> +                       show_ce(repo, dir, ce, fullname.buf,
> +                               ce_stage(ce) ? tag_unmerged :
> +                               (ce_skip_worktree(ce) ? tag_skip_worktree :
> +                                tag_cached));
>                         if (skipping_duplicates)
>                                 goto skip_to_next_name;
>                 }
> -               if (!show_deleted && !show_modified)
> +
> +               if (!(show_deleted || show_modified))
>                         continue;
>                 if (ce_skip_worktree(ce))
>                         continue;
> @@ -349,12 +351,13 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
>                                 goto skip_to_next_name;
>                 }
>                 if (show_modified &&
> -                       (stat_err || ie_modified(repo->index, ce, &st, 0))) {
> -                               show_ce(repo, dir, ce, fullname.buf, tag_modified);
> +                   (stat_err || ie_modified(repo->index, ce, &st, 0))) {
> +                       show_ce(repo, dir, ce, fullname.buf, tag_modified);
>                         if (skipping_duplicates)
>                                 goto skip_to_next_name;
>                 }
>                 continue;
> +
>  skip_to_next_name:
>                 {
>                         int j;
> @@ -362,7 +365,7 @@ static void show_files(struct repository *repo, struct dir_struct *dir)
>                         for (j = i + 1; j < repo->index->cache_nr; j++)
>                                 if (strcmp(ce->name, cache[j]->name))
>                                         break;
> -                       i = j - 1; /* compensate for outer for loop */
> +                       i = j - 1; /* compensate for the for loop */
>                 }
>         }
>
> @@ -590,7 +593,8 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>                         N_("pretend that paths removed since <tree-ish> are still present")),
>                 OPT__ABBREV(&abbrev),
>                 OPT_BOOL(0, "debug", &debug_mode, N_("show debugging data")),
> -               OPT_BOOL(0,"deduplicate",&skipping_duplicates,N_("suppress duplicate entries")),
> +               OPT_BOOL(0, "deduplicate", &skipping_duplicates,
> +                        N_("suppress duplicate entries")),
>                 OPT_END()
>         };
>

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

胡哲宁 <adlternative@gmail.com> writes:

> OK,I didn’t notice any formatting changes before.
>
> Am I free from this patch now?I should probably
> look for other issues.

I think we are pretty much done with it.  Thanks for working on the
topic so patiently.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 25, 2021

This patch series was integrated into seen via git@4583798.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 25, 2021

This patch series was integrated into seen via git@f0a190e.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 26, 2021

This patch series was integrated into seen via git@9b23fd9.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 26, 2021

This patch series was integrated into seen via git@7df4964.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 27, 2021

This patch series was integrated into seen via git@bcff65d.

@gitgitgadget
Copy link

gitgitgadget bot commented Jan 27, 2021

This patch series was integrated into next via git@af7522d.

@gitgitgadget gitgitgadget bot added the next label Jan 27, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Jan 31, 2021

This patch series was integrated into seen via git@ce0049b.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 1, 2021

This patch series was integrated into seen via git@da779c9.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 2, 2021

This patch series was integrated into seen via git@af7522d.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 2, 2021

This patch series was integrated into seen via git@73de2fa.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 3, 2021

This patch series was integrated into seen via git@efd12f4.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 6, 2021

This patch series was integrated into seen via git@5198426.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 6, 2021

This patch series was integrated into next via git@5198426.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 6, 2021

This patch series was integrated into master via git@5198426.

@gitgitgadget gitgitgadget bot added the master label Feb 6, 2021
@gitgitgadget gitgitgadget bot closed this Feb 6, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Feb 6, 2021

Closed via 5198426.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid duplicate entries in git ls-files for unmerged entries
2 participants