Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Range diff with ranges lacking dotdot #841

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
11 changes: 11 additions & 0 deletions Documentation/git-range-diff.txt
Expand Up @@ -28,6 +28,17 @@ Finally, the list of matching commits is shown in the order of the
second commit range, with unmatched commits being inserted just after
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> There are three forms, depending whether the user specifies one, two or
> three non-option arguments. We've never actually explained how this
> works in the manual, so let's explain it.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  Documentation/git-range-diff.txt | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> index 9701c1e5fdd..76359baf26d 100644
> --- a/Documentation/git-range-diff.txt
> +++ b/Documentation/git-range-diff.txt
> @@ -28,6 +28,19 @@ Finally, the list of matching commits is shown in the order of the
>  second commit range, with unmatched commits being inserted just after
>  all of their ancestors have been shown.
>  
> +There are three ways to specify the commit ranges:
> +
> +- `<range1> <range2>`: Either commit range can be of the form
> +  `<base>..<rev>`, `<rev>^!` or `<rev>^-<n>`. See `SPECIFYING RANGES`
> +  in linkgit:gitrevisions[7] for more details.

Good.

> +- `<rev1>...<rev2>`. This resembles the symmetric ranges mentioned in
> +  the `SPECIFYING RANGES` section of linkgit:gitrevisions[7], and is
> +  equivalent to `<base>..<rev1> <base>..<rev2>` where `<base>` is the
> +  merge base as obtained via `git merge-base <rev1> <rev2>`.

Does this merely resemble?  Isn't it exactly what a symmetric range is?

> +- `<base> <rev1> <rev2>`: This is equivalent to `<base>..<rev1>
> +  <base>..<rev2>`.

Nice to see this documented.

Thanks.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Junio,

On Thu, 21 Jan 2021, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > There are three forms, depending whether the user specifies one, two or
> > three non-option arguments. We've never actually explained how this
> > works in the manual, so let's explain it.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  Documentation/git-range-diff.txt | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> > index 9701c1e5fdd..76359baf26d 100644
> > --- a/Documentation/git-range-diff.txt
> > +++ b/Documentation/git-range-diff.txt
> > @@ -28,6 +28,19 @@ Finally, the list of matching commits is shown in the order of the
> >  second commit range, with unmatched commits being inserted just after
> >  all of their ancestors have been shown.
> >
> > +There are three ways to specify the commit ranges:
> > +
> > +- `<range1> <range2>`: Either commit range can be of the form
> > +  `<base>..<rev>`, `<rev>^!` or `<rev>^-<n>`. See `SPECIFYING RANGES`
> > +  in linkgit:gitrevisions[7] for more details.
>
> Good.
>
> > +- `<rev1>...<rev2>`. This resembles the symmetric ranges mentioned in
> > +  the `SPECIFYING RANGES` section of linkgit:gitrevisions[7], and is
> > +  equivalent to `<base>..<rev1> <base>..<rev2>` where `<base>` is the
> > +  merge base as obtained via `git merge-base <rev1> <rev2>`.
>
> Does this merely resemble?  Isn't it exactly what a symmetric range is?

No, it is not exactly what a symmetric range is because `range-diff`
treats both arms of the symmetric range independently, as two distinct
non-symmetric ranges.

> > +- `<base> <rev1> <rev2>`: This is equivalent to `<base>..<rev1>
> > +  <base>..<rev2>`.
>
> Nice to see this documented.

Yes. I was rather surprised that I had overlooked that in my original
submission of the `range-diff` command.

Ciao,
Dscho

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> > +- `<rev1>...<rev2>`. This resembles the symmetric ranges mentioned in
>> > +  the `SPECIFYING RANGES` section of linkgit:gitrevisions[7], and is
>> > +  equivalent to `<base>..<rev1> <base>..<rev2>` where `<base>` is the
>> > +  merge base as obtained via `git merge-base <rev1> <rev2>`.

The above paragraph says A...B is turned into $(git merge-base A
B)..A and $(git merge-base A B)..B, but I wonder if we should be
rewriting it into A..B and B..A instead; that would make it
unnecessary to explain what should happen when there are more than
one merge bases.

>> Does this merely resemble?  Isn't it exactly what a symmetric range is?
>
> No, it is not exactly what a symmetric range is because `range-diff`
> treats both arms of the symmetric range independently, as two distinct
> non-symmetric ranges.

This however is an end-user documentation, isn't it?

The fact that range-diff internally translates A...B into something
else is an implementation detail of achieving something, and from
the end-users' point of view, isn't it "take A...B from the
end-user, and show the difference between the left and right
branches of the symmetric range" that range-diff achieves to do?

So it processes exactly the same two sets of commits as what is
given by "--left-right A...B", no?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Junio,

On Fri, 22 Jan 2021, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> >> > +- `<rev1>...<rev2>`. This resembles the symmetric ranges mentioned in
> >> > +  the `SPECIFYING RANGES` section of linkgit:gitrevisions[7], and is
> >> > +  equivalent to `<base>..<rev1> <base>..<rev2>` where `<base>` is the
> >> > +  merge base as obtained via `git merge-base <rev1> <rev2>`.
>
> The above paragraph says A...B is turned into $(git merge-base A
> B)..A and $(git merge-base A B)..B, but I wonder if we should be
> rewriting it into A..B and B..A instead; that would make it
> unnecessary to explain what should happen when there are more than
> one merge bases.

You know what? I lied. The code already does what you suggested. Look
[here](https://github.com/git/git/blob/v2.30.0/builtin/range-diff.c#L59-L77):

	[...]
        } else if (argc == 1) {
                const char *b = strstr(argv[0], "..."), *a = argv[0];
                int a_len;

                if (!b) {
                        error(_("single arg format must be symmetric range"));
                        usage_with_options(builtin_range_diff_usage, options);
                }

                a_len = (int)(b - a);
                if (!a_len) {
                        a = "HEAD";
                        a_len = strlen(a);
                }
                b += 3;
                if (!*b)
                        b = "HEAD";
                strbuf_addf(&range1, "%s..%.*s", b, a_len, a);
                strbuf_addf(&range2, "%.*s..%s", a_len, a, b);
	[...]

> >> Does this merely resemble?  Isn't it exactly what a symmetric range is?
> >
> > No, it is not exactly what a symmetric range is because `range-diff`
> > treats both arms of the symmetric range independently, as two distinct
> > non-symmetric ranges.
>
> This however is an end-user documentation, isn't it?

Yes, and the end user is talking about _two_ commit ranges in the context
of `git range-diff`, and about _one_ commit range in the context of `git
log`.

So I still think that it really just only resembles the symmetric range.
It is fundamentally as different from it as the number 2 is different from
the number 1.

Ciao,
Dscho

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> The above paragraph says A...B is turned into $(git merge-base A
>> B)..A and $(git merge-base A B)..B, but I wonder if we should be
>> rewriting it into A..B and B..A instead; that would make it
>> unnecessary to explain what should happen when there are more than
>> one merge bases.
>
> You know what? I lied.

I knew.  So we need an update to the patch.

>> >> Does this merely resemble?  Isn't it exactly what a symmetric range is?
>> >
>> > No, it is not exactly what a symmetric range is because `range-diff`
>> > treats both arms of the symmetric range independently, as two distinct
>> > non-symmetric ranges.
>>
>> This however is an end-user documentation, isn't it?
>
> Yes, and the end user is talking about _two_ commit ranges in the context
> of `git range-diff`, and about _one_ commit range in the context of `git
> log`.

You are forgetting that "log A...B" shows only one half of what a
symmetric range means, and hides which side each commit belongs to.

"git log --left-right A...B" shows what a symmetric range really is;
there are two sides, left and right, and "git range-diff A...B" is
a natural way to compare these two ranges.

I've been quite happy with the way how "git range-diff @{-1}..."
shows the comparison of two sides of the symmetric range, given by
"git log --oneline --left-right @{-1}..."


Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> There are three forms, depending whether the user specifies one, two or
> three non-option arguments. We've never actually explained how this
> works in the manual, so let's explain it.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  Documentation/git-range-diff.txt | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> index 9701c1e5fdd5..14bffb272a06 100644
> --- a/Documentation/git-range-diff.txt
> +++ b/Documentation/git-range-diff.txt
> @@ -28,6 +28,18 @@ Finally, the list of matching commits is shown in the order of the
>  second commit range, with unmatched commits being inserted just after
>  all of their ancestors have been shown.
>  
> +There are three ways to specify the commit ranges:
> +
> +- `<range1> <range2>`: Either commit range can be of the form
> +  `<base>..<rev>`, `<rev>^!` or `<rev>^-<n>`. See `SPECIFYING RANGES`
> +  in linkgit:gitrevisions[7] for more details.
> +
> +- `<rev1>...<rev2>`. This resembles the symmetric ranges mentioned in
> +  the `SPECIFYING RANGES` section of linkgit:gitrevisions[7], and is
> +  equivalent to `<rev2>..<rev1> <rev1>..<rev2>`.

As I said before, this _is_ a symmetric range that has commits
reachable from rev1 but not from rev2 on the left hand side, and
commits reachable from rev2 but not from rev1 on the right hand
side, not just something else that resembles a symmetric range.

> +- `<base> <rev1> <rev2>`: This is equivalent to `<base>..<rev1>
> +  <base>..<rev2>`.
>  
>  OPTIONS
>  -------

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Junio,

On Thu, 4 Feb 2021, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > There are three forms, depending whether the user specifies one, two or
> > three non-option arguments. We've never actually explained how this
> > works in the manual, so let's explain it.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> >  Documentation/git-range-diff.txt | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt
> > index 9701c1e5fdd5..14bffb272a06 100644
> > --- a/Documentation/git-range-diff.txt
> > +++ b/Documentation/git-range-diff.txt
> > @@ -28,6 +28,18 @@ Finally, the list of matching commits is shown in the order of the
> >  second commit range, with unmatched commits being inserted just after
> >  all of their ancestors have been shown.
> >
> > +There are three ways to specify the commit ranges:
> > +
> > +- `<range1> <range2>`: Either commit range can be of the form
> > +  `<base>..<rev>`, `<rev>^!` or `<rev>^-<n>`. See `SPECIFYING RANGES`
> > +  in linkgit:gitrevisions[7] for more details.
> > +
> > +- `<rev1>...<rev2>`. This resembles the symmetric ranges mentioned in
> > +  the `SPECIFYING RANGES` section of linkgit:gitrevisions[7], and is
> > +  equivalent to `<rev2>..<rev1> <rev1>..<rev2>`.
>
> As I said before, this _is_ a symmetric range that has commits
> reachable from rev1 but not from rev2 on the left hand side, and
> commits reachable from rev2 but not from rev1 on the right hand
> side, not just something else that resembles a symmetric range.

If we were talking about one commit range instead of two, I would agree
with you.

To put an end to this pointless discussion, I struck the sentence from the
documentation. It might even make it easier to read for users. So that's a
win.

Ciao,
Dscho

> > +- `<base> <rev1> <rev2>`: This is equivalent to `<base>..<rev1>
> > +  <base>..<rev2>`.
> >
> >  OPTIONS
> >  -------
>

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> As I said before, this _is_ a symmetric range that has commits
>> reachable from rev1 but not from rev2 on the left hand side, and
>> commits reachable from rev2 but not from rev1 on the right hand
>> side, not just something else that resembles a symmetric range.
>
> If we were talking about one commit range instead of two, I would agree
> with you.

It is one symmetric commit range A...B that has two sides, i.e. B..A
on the left side and A..B on the right side.

all of their ancestors have been shown.

There are three ways to specify the commit ranges:

- `<range1> <range2>`: Either commit range can be of the form
`<base>..<rev>`, `<rev>^!` or `<rev>^-<n>`. See `SPECIFYING RANGES`
in linkgit:gitrevisions[7] for more details.

- `<rev1>...<rev2>`. This is equivalent to
`<rev2>..<rev1> <rev1>..<rev2>`.

- `<base> <rev1> <rev2>`: This is equivalent to `<base>..<rev1>
<base>..<rev2>`.

OPTIONS
-------
Expand Down
2 changes: 1 addition & 1 deletion builtin/log.c
Expand Up @@ -1680,7 +1680,7 @@ static void infer_range_diff_ranges(struct strbuf *r1,
struct commit *head)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> Currently, when called with exactly two arguments, `git range-diff`
> tests for a literal `..` in each of the two. Likewise, the argument
> provided via `--range-diff` to `git format-patch` is checked in the same
> manner.
>
> However, `<commit>^!` is a perfectly valid commit range, equivalent to
> `<commit>^..<commit>` according to the `SPECIFYING RANGES` section of
> gitrevisions[7].
>
> In preparation for allowing more sophisticated ways to specify commit
> ranges, let's refactor the check into its own function.

I think the sharing between the two makes sense, but the helper
function should make it clear in its name that this is "the kind of
commit range range-diff wants to take".  Among the commit range "git
log" and friends can take, range-diff can take only a subset of it,
and only a subset of it is meaningful to range-diff (e.g. HEAD^@ is
still a commit range you can give to "git log", but it would not
make much sense to give it to range-diff).

Perhaps

    s/specifies_commit_range/is_range_diff_range/

or something.

> diff --git a/revision.h b/revision.h

Move this to range-diff.h, not revision.h which is about true commit
ranges.

> index 086ff10280d..66777c8e60f 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -457,4 +457,11 @@ int rewrite_parents(struct rev_info *revs,
>   */
>  struct commit_list *get_saved_parents(struct rev_info *revs, const struct commit *commit);
>  
> +/*
> + * Determine whether the given argument defines a commit range, e.g. A..B.
> + * Note that this only validates the format but does _not_ parse it, i.e.
> + * it does _not_ look up the specified commits in the local repository.
> + */

And s/defines a commit range/is usable as a range to range-diff/

Thanks.

> +int specifies_commit_range(const char *range);
> +
>  #endif

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Uwe Kleine-König wrote (reply to this):


--aftbhbozzha65n2k
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Jan 22, 2021 at 12:27:10PM -0800, Junio C Hamano wrote:
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>=20
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > Currently, when called with exactly two arguments, `git range-diff`
> > tests for a literal `..` in each of the two. Likewise, the argument
> > provided via `--range-diff` to `git format-patch` is checked in the same
> > manner.
> >
> > However, `<commit>^!` is a perfectly valid commit range, equivalent to
> > `<commit>^..<commit>` according to the `SPECIFYING RANGES` section of
> > gitrevisions[7].
> >
> > In preparation for allowing more sophisticated ways to specify commit
> > ranges, let's refactor the check into its own function.
>=20
> I think the sharing between the two makes sense, but the helper
> function should make it clear in its name that this is "the kind of
> commit range range-diff wants to take".  Among the commit range "git
> log" and friends can take, range-diff can take only a subset of it,
> and only a subset of it is meaningful to range-diff (e.g. HEAD^@ is
> still a commit range you can give to "git log", but it would not
> make much sense to give it to range-diff).

Does it make so little sense to forbid passing HEAD^@ as a range to
range-diff? I can imagine situations where is would make sense, e.g. I
often create customer patch stacks from a set of topic branches using
octopus merge. To compare two of these ^@ might be handy.

My POV is that if it's easy to use the same function (and so the same
set of range descriptors) for git log and git range-diff then do so.
This yields a consistent behaviour which is IMHO better than preventing
people to do things that are considered strange today.

Best regards
Uwe

--=20
Pengutronix e.K.                           | Uwe Kleine-K=F6nig            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

--aftbhbozzha65n2k
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQEzBAABCgAdFiEEfnIqFpAYrP8+dKQLwfwUeK3K7AkFAmAOdKkACgkQwfwUeK3K
7AlxZwf/SLEM299R6ngrm/9LBrEqqypGPZp9sysvqH2WJBLYnn4TtJW2eyVHqoAY
22T8N3cVzRgWgx+cuQO0stTtlqKKu31Ruw5anR2KGJwF88t2bZoYSw9ePoKmgXLc
HR8XJKPhGCl17InySausPGBJiXqFLjMzH7wW3bF0jlwaOeBtxBCp0tsFbe0xKccU
VVHnLQkaxNd6R81rEMBxgDRzt+SSI4vgkd/6BZKbIPM9X58BdsuMr2MgC0OWmCA8
QZqmKGT5GviAbw96jT2rrO/IBJwJyoCXc/VFPIkhG2ODXcQMAF3uTmrFZjpuocFQ
o7OJW7rZb1MD2uu0nsTo0o3WGEYa9w==
=zI9f
-----END PGP SIGNATURE-----

--aftbhbozzha65n2k--

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Uwe Kleine-König wrote (reply to this):


--oxlh2ygc52upvp5w
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hello Junio,

On Mon, Jan 25, 2021 at 11:24:39AM -0800, Junio C Hamano wrote:
> Uwe Kleine-K=F6nig <u.kleine-koenig@pengutronix.de> writes:
>=20
> >> > In preparation for allowing more sophisticated ways to specify commit
> >> > ranges, let's refactor the check into its own function.
> >>=20
> >> I think the sharing between the two makes sense, but the helper
> >> function should make it clear in its name that this is "the kind of
> >> commit range range-diff wants to take".  Among the commit range "git
> >> log" and friends can take, range-diff can take only a subset of it,
> >> and only a subset of it is meaningful to range-diff (e.g. HEAD^@ is
> >> still a commit range you can give to "git log", but it would not
> >> make much sense to give it to range-diff).
> >
> > Does it make so little sense to forbid passing HEAD^@ as a range to
> > range-diff? I can imagine situations where is would make sense, e.g. I
> > often create customer patch stacks from a set of topic branches using
> > octopus merge. To compare two of these ^@ might be handy.
>=20
> You can discuss for each individual syntax of a single-token range
> and decide which ones are and are not suitable for range-diff, but I
> suspect the reason behind this business started with dot-dot is to
> perform a superficial "sanity check" at the command line parser
> level before passing them to the revision machinery, and having to
> deal with potential errors and/or having to compare unreasonably
> large segments of history that the user did not intend.
>=20
> Also I first thought that the command changes the behaviour, given
> two tokens, depending on the shape of these two tokens (i.e. when
> they satisfy the "is-range?" we are discussing, they are taken as
> two ranges to be compared, and otherwise does something else), but
> after revisiting the code and "git help range-diff", it always does
> one thing when given=20
>=20
>  (1) one arg: gives a symmetric range and what is to be compared
>      is its left and right half,
>=20
>  (2) two args: each is meant to name a set of commits and these two
>      are to be compared) or
>=20
>  (3) three args: each is meant to name a commit, and the arg1..arg2
>      and arg1..arg3 are the ranges to be compared.
>=20
> so ...
>=20
> > My POV is that if it's easy to use the same function (and so the same
> > set of range descriptors) for git log and git range-diff then do so.
> > This yields a consistent behaviour which is IMHO better than preventing
> > people to do things that are considered strange today.
>=20
> ... I am OK with that point of view.  It certainly is simpler to
> explain to end users.

It seems you understood my argument :-)

> Having said that, it would make it much harder to implement
> efficiently, though.  For example, when your user says
>=20
> 	git range-diff A B
>=20
> to compare "git log A" (all the way down to the root) and "git log
> B" (likewise), you'd certainly optimize the older common part of the
> history out, essentially turning it into
>=20
> 	git range-diff A..B B..A
>=20
> or its moral equivalent
>=20
> 	git range-diff A...B
>=20
> But you cannot apply such an optimization blindly.  When the user
> gives A..B and B..A as two args, you somehow need to notice that=20
> you shouldn't rewrite it to "A..B...B..A", and for that, you'd still
> need some "parsing" of these args.

I agree that for a long history

	git range-diff A B

is an expensive request and I wouldn't invest too many cycles optimizing
it. (And if I'd optimize it, it wouldn't be done using textual
combination of the two strings but by checking if the two ranges
intersect. This way something like

	git range-diff v4.0..v4.6-rc1 v4.0..v4.5.6

and maybe even

	git range-diff v4.0..v4.6-rc1 v4.0-rc1..v4.5.6

would benefit, too. But note I'm not (anymore) familiar with the git
source code, so I don't know if this is easy/sensible to do and I'm just
looking at the problem from an architectural and theoretical POV.)

> So, I dunno.  Limiting the second form to only forms that the
> implementation does not have to do such optimization would certainly
> make it simpler for Dscho to implement ;-)

I don't want to make it more complicated for Dscho, I'm happy if I can
in the near future use range-diff with $rev1^! $ref2^! . So feel free to
ignore me.

Best regards
Uwe

--=20
Pengutronix e.K.                           | Uwe Kleine-K=F6nig            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

--oxlh2ygc52upvp5w
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQEzBAABCgAdFiEEfnIqFpAYrP8+dKQLwfwUeK3K7AkFAmAPN0IACgkQwfwUeK3K
7AkT2Qf/U0hjauGpubVi6j2+BMClfZDlVRMYx+9TV+KWKoSTvqbNC1BngzO/fBmS
eBIIpS8u4jCMKKfxRRW1vl5OW+b/tYRAME3RYO2shUroXVeUWrY4KaIwwBxgcHOR
4Rw0e2Pi7vkK8Zrd6FWZCYVTzs9YOXfgBRJJf9OBzSGsjV66qkonVKY1SCwfIur1
Za8+Abw8VTg4syBWwcuKvnbeyYyJuvCKE4vi17I6gqEppNfxZ9CJEe0Vo3QM7zKy
mEnznudTd2fF2NyIuz5EdwUlStket/1PsT9m8dMPyfqnpTQUjTJ6x+/k4R0t3Sgk
U5+1znHBEsUP1/dfAcA+reDHHyfPgw==
=urxk
-----END PGP SIGNATURE-----

--oxlh2ygc52upvp5w--

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Uwe Kleine-König <u.kleine-koenig@pengutronix.de> writes:

>> > In preparation for allowing more sophisticated ways to specify commit
>> > ranges, let's refactor the check into its own function.
>> 
>> I think the sharing between the two makes sense, but the helper
>> function should make it clear in its name that this is "the kind of
>> commit range range-diff wants to take".  Among the commit range "git
>> log" and friends can take, range-diff can take only a subset of it,
>> and only a subset of it is meaningful to range-diff (e.g. HEAD^@ is
>> still a commit range you can give to "git log", but it would not
>> make much sense to give it to range-diff).
>
> Does it make so little sense to forbid passing HEAD^@ as a range to
> range-diff? I can imagine situations where is would make sense, e.g. I
> often create customer patch stacks from a set of topic branches using
> octopus merge. To compare two of these ^@ might be handy.

You can discuss for each individual syntax of a single-token range
and decide which ones are and are not suitable for range-diff, but I
suspect the reason behind this business started with dot-dot is to
perform a superficial "sanity check" at the command line parser
level before passing them to the revision machinery, and having to
deal with potential errors and/or having to compare unreasonably
large segments of history that the user did not intend.

Also I first thought that the command changes the behaviour, given
two tokens, depending on the shape of these two tokens (i.e. when
they satisfy the "is-range?" we are discussing, they are taken as
two ranges to be compared, and otherwise does something else), but
after revisiting the code and "git help range-diff", it always does
one thing when given 

 (1) one arg: gives a symmetric range and what is to be compared
     is its left and right half,

 (2) two args: each is meant to name a set of commits and these two
     are to be compared) or

 (3) three args: each is meant to name a commit, and the arg1..arg2
     and arg1..arg3 are the ranges to be compared.

so ...

> My POV is that if it's easy to use the same function (and so the same
> set of range descriptors) for git log and git range-diff then do so.
> This yields a consistent behaviour which is IMHO better than preventing
> people to do things that are considered strange today.

... I am OK with that point of view.  It certainly is simpler to
explain to end users.

Having said that, it would make it much harder to implement
efficiently, though.  For example, when your user says

	git range-diff A B

to compare "git log A" (all the way down to the root) and "git log
B" (likewise), you'd certainly optimize the older common part of the
history out, essentially turning it into

	git range-diff A..B B..A

or its moral equivalent

	git range-diff A...B

But you cannot apply such an optimization blindly.  When the user
gives A..B and B..A as two args, you somehow need to notice that 
you shouldn't rewrite it to "A..B...B..A", and for that, you'd still
need some "parsing" of these args.

So, I dunno.  Limiting the second form to only forms that the
implementation does not have to do such optimization would certainly
make it simpler for Dscho to implement ;-)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Uwe Kleine-König <u.kleine-koenig@pengutronix.de> writes:

>> > My POV is that if it's easy to use the same function (and so the same
>> > set of range descriptors) for git log and git range-diff then do so.
>> > This yields a consistent behaviour which is IMHO better than preventing
>> > people to do things that are considered strange today.
>> 
>> ... I am OK with that point of view.  It certainly is simpler to
>> explain to end users.
>
> It seems you understood my argument :-)

So it seems ;-).

>> Having said that, it would make it much harder to implement
>> efficiently, though.  For example, when your user says
>> 
>> 	git range-diff A B
>> 
>> to compare "git log A" (all the way down to the root) and "git log
>> B" (likewise), you'd certainly optimize the older common part of the
>> history out, essentially turning it into
>> 
>> 	git range-diff A..B B..A
>> 
>> or its moral equivalent
>> 
>> 	git range-diff A...B
>> 
>> But you cannot apply such an optimization blindly.  When the user
>> gives A..B and B..A as two args, you somehow need to notice that 
>> you shouldn't rewrite it to "A..B...B..A", and for that, you'd still
>> need some "parsing" of these args.
>
> I agree that for a long history
>
> 	git range-diff A B
>
> is an expensive request and I wouldn't invest too many cycles optimizing
> it.

Well, your devil's advocate can argue that accepting an input that
can easily make the tool unusable would be irresponsible, though.

And there are two possible ways out:

 (1) declare that optimizing "A B" into "A...B" is way too difficult
     to do in general, and find a good enough way to see if A and B
     individually gives a "range" that should be manageable; or

 (2) invest cycles to optimize, so your users do not have to care.

I think the series takes the former approach, and I find it
understandable.

It is a different matter if the way found and implemented in the
patch is "good enough" to tell if a given argument names a
manageable range.  You said something about "rev^{/^here are
two..dots}" that can be mistaken as a "good enough" range, but it
actually names a revision and all the way down to the root.

> (And if I'd optimize it, it wouldn't be done using textual
> combination of the two strings but by checking if the two ranges
> intersect.

Yup, that is in line with what I mumbled earlier about
setup_revisions() and inspecting the rev_info.pending, I think.

>> So, I dunno.  Limiting the second form to only forms that the
>> implementation does not have to do such optimization would certainly
>> make it simpler for Dscho to implement ;-)
>
> I don't want to make it more complicated for Dscho, I'm happy if I can
> in the near future use range-diff with $rev1^! $ref2^! . So feel free to
> ignore me.
>
> Best regards
> Uwe

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> diff --git a/range-diff.h b/range-diff.h
> index 583ced2e8e74..c17dbc2e75a8 100644
> --- a/range-diff.h
> +++ b/range-diff.h
> @@ -16,4 +16,12 @@ int show_range_diff(const char *range1, const char *range2,
>  		    const struct diff_options *diffopt,
>  		    const struct strvec *other_arg);
>  
> +/*
> + * Determine whether the given argument is usable as a range argument of `git
> + * range-diff`, e.g. A..B. Note that this only validates the format but does
> + * _not_ parse it, i.e. it does _not_ look up the specified commits in the
> + * local repository.
> + */
> +int is_range_diff_range(const char *arg);

If we were to use [v4 2/3], then we do parse it, even though we do
use the parse result to reject some valid ranges (like "a history
all the way down to root" in the implementation).  I think just
dropping everything after "Note that" is a sufficient fix.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Junio,

On Thu, 4 Feb 2021, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > diff --git a/range-diff.h b/range-diff.h
> > index 583ced2e8e74..c17dbc2e75a8 100644
> > --- a/range-diff.h
> > +++ b/range-diff.h
> > @@ -16,4 +16,12 @@ int show_range_diff(const char *range1, const char *range2,
> >  		    const struct diff_options *diffopt,
> >  		    const struct strvec *other_arg);
> >
> > +/*
> > + * Determine whether the given argument is usable as a range argument of `git
> > + * range-diff`, e.g. A..B. Note that this only validates the format but does
> > + * _not_ parse it, i.e. it does _not_ look up the specified commits in the
> > + * local repository.
> > + */
> > +int is_range_diff_range(const char *arg);
>
> If we were to use [v4 2/3], then we do parse it, even though we do
> use the parse result to reject some valid ranges (like "a history
> all the way down to root" in the implementation).  I think just
> dropping everything after "Note that" is a sufficient fix.

Fair. I will add the sentence in 1/3 (because it is still correct there)
and remove it as part of 2/3.

Ciao,
Dscho

{
const char *head_oid = oid_to_hex(&head->object.oid);
int prev_is_range = !!strstr(prev, "..");
int prev_is_range = is_range_diff_range(prev);

if (prev_is_range)
strbuf_addstr(r1, prev);
Expand Down
9 changes: 5 additions & 4 deletions builtin/range-diff.c
Expand Up @@ -3,6 +3,7 @@
#include "parse-options.h"
#include "range-diff.h"
#include "config.h"
#include "revision.h"

static const char * const builtin_range_diff_usage[] = {
N_("git range-diff [<options>] <old-base>..<old-tip> <new-base>..<new-tip>"),
Expand Down Expand Up @@ -46,12 +47,12 @@ int cmd_range_diff(int argc, const char **argv, const char *prefix)
diffopt.use_color = 1;

if (argc == 2) {
if (!strstr(argv[0], ".."))
die(_("no .. in range: '%s'"), argv[0]);
if (!is_range_diff_range(argv[0]))
die(_("not a commit range: '%s'"), argv[0]);
strbuf_addstr(&range1, argv[0]);

if (!strstr(argv[1], ".."))
die(_("no .. in range: '%s'"), argv[1]);
if (!is_range_diff_range(argv[1]))
die(_("not a commit range: '%s'"), argv[1]);
strbuf_addstr(&range2, argv[1]);
} else if (argc == 3) {
strbuf_addf(&range1, "%s..%s", argv[0], argv[1]);
Expand Down
29 changes: 29 additions & 0 deletions range-diff.c
Expand Up @@ -11,6 +11,7 @@
#include "pretty.h"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

>  int is_range_diff_range(const char *arg)
>  {
> -	return !!strstr(arg, "..");
> +	char *copy = xstrdup(arg); /* setup_revisions() modifies it */
> +	const char *argv[] = { "", copy, "--", NULL };
> +	int i, positive = 0, negative = 0;
> +	struct rev_info revs;
> +
> +	init_revisions(&revs, NULL);
> +	if (setup_revisions(3, argv, &revs, 0) == 1) {
> +		for (i = 0; i < revs.pending.nr; i++)
> +			if (revs.pending.objects[i].item->flags & UNINTERESTING)
> +				negative++;
> +			else
> +				positive++;
> +	}
> +
> +	free(copy);
> +	object_array_clear(&revs.pending);
> +	return negative > 0 && positive > 0;
>  }

One thing that worries me with this code is that I do not see
anybody that clears UNINTERESTING bit in the flags.  In-core objects
are singletons, so if a user fed the command two ranges,

	git range-diff A..B C..A

and this code first handled "A..B", smudging the in-core instance of
the commit object A with UNINTERESTING bit, that in-core instance
will be reused when the second range argument "C..A" is given to
this function again.

At that point, has anybody cleared the UNINTERESTING bit in the
flags word for the in-core commit A?  I do not see it done in this
function, but perhaps I am missing it done in the init/setup
functions (I somehow doubt it, though)?

Shoudn't we be calling clear_commit_marks(ALL_REF_FLAGS) on the
commits in revs.pending[] array before we clear it?  Depending on
the shape of "arg" that is end-user supplied, we may have walked the
history in handle_dotdot_1() to parse it (e.g. "A...B").

Also we'd want to see what needs to be cleared in revs.cmdline
that would have been populated by calls to add_rev_cmdline().

Other than that, I quite like the way the actual code turned out to
be.

> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index 6eb344be0312..e217cecac9ed 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -150,6 +150,14 @@ test_expect_success 'simple A B C (unmodified)' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'A^! and A^-<n> (unmodified)' '
> +	git range-diff --no-color topic^! unmodified^-1 >actual &&
> +	cat >expect <<-EOF &&
> +	1:  $(test_oid t4) = 1:  $(test_oid u4) s/12/B/
> +	EOF
> +	test_cmp expect actual
> +'
> +
>  test_expect_success 'trivial reordering' '
>  	git range-diff --no-color master topic reordered >actual &&
>  	cat >expect <<-EOF &&

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Junio,

On Thu, 4 Feb 2021, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> >  int is_range_diff_range(const char *arg)
> >  {
> > -	return !!strstr(arg, "..");
> > +	char *copy = xstrdup(arg); /* setup_revisions() modifies it */
> > +	const char *argv[] = { "", copy, "--", NULL };
> > +	int i, positive = 0, negative = 0;
> > +	struct rev_info revs;
> > +
> > +	init_revisions(&revs, NULL);
> > +	if (setup_revisions(3, argv, &revs, 0) == 1) {
> > +		for (i = 0; i < revs.pending.nr; i++)
> > +			if (revs.pending.objects[i].item->flags & UNINTERESTING)
> > +				negative++;
> > +			else
> > +				positive++;
> > +	}
> > +
> > +	free(copy);
> > +	object_array_clear(&revs.pending);
> > +	return negative > 0 && positive > 0;
> >  }
>
> One thing that worries me with this code is that I do not see
> anybody that clears UNINTERESTING bit in the flags.  In-core objects
> are singletons, so if a user fed the command two ranges,
>
> 	git range-diff A..B C..A
>
> and this code first handled "A..B", smudging the in-core instance of
> the commit object A with UNINTERESTING bit, that in-core instance
> will be reused when the second range argument "C..A" is given to
> this function again.
>
> At that point, has anybody cleared the UNINTERESTING bit in the
> flags word for the in-core commit A?  I do not see it done in this
> function, but perhaps I am missing it done in the init/setup
> functions (I somehow doubt it, though)?
>
> Shoudn't we be calling clear_commit_marks(ALL_REF_FLAGS) on the
> commits in revs.pending[] array before we clear it?

Yes, we should ;-)

> Depending on the shape of "arg" that is end-user supplied, we may have
> walked the history in handle_dotdot_1() to parse it (e.g. "A...B").

The code in `handle_dotdot_1()` that handles `...` calls
`get_merge_bases()`, which eventually calls `get_merge_bases_many_0()`
with the `cleanup` parameter set to `1`, i.e. it _does_ clean up the
flags.

Otherwise, `git log A...B` couldn't work, could it? ;-)

> Also we'd want to see what needs to be cleared in revs.cmdline
> that would have been populated by calls to add_rev_cmdline().

Is this cleared, like, ever? I don't see any code that handles `cmdline`
that way. Seems as we're willfully leaking this.

Ciao,
Dscho

> Other than that, I quite like the way the actual code turned out to
> be.
>
> > diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> > index 6eb344be0312..e217cecac9ed 100755
> > --- a/t/t3206-range-diff.sh
> > +++ b/t/t3206-range-diff.sh
> > @@ -150,6 +150,14 @@ test_expect_success 'simple A B C (unmodified)' '
> >  	test_cmp expect actual
> >  '
> >
> > +test_expect_success 'A^! and A^-<n> (unmodified)' '
> > +	git range-diff --no-color topic^! unmodified^-1 >actual &&
> > +	cat >expect <<-EOF &&
> > +	1:  $(test_oid t4) = 1:  $(test_oid u4) s/12/B/
> > +	EOF
> > +	test_cmp expect actual
> > +'
> > +
> >  test_expect_success 'trivial reordering' '
> >  	git range-diff --no-color master topic reordered >actual &&
> >  	cat >expect <<-EOF &&
>

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> +test_expect_success 'A^! and A^-<n> (unmodified)' '
> +	git range-diff --no-color topic^! unmodified^-1 >actual &&
> +	cat >expect <<-EOF &&
> +	1:  $(test_oid t4) = 1:  $(test_oid u4) s/12/B/
> +	EOF
> +	test_cmp expect actual
> +'

Now we actually parse the single-token range, instead of relying on
"does it have dot-dot" heuristics, we can make sure that we reject

    "HEAD^{/^string with .. in it}"

as "not a range with negative and positive ends".

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Junio,

On Thu, 4 Feb 2021, Junio C Hamano wrote:

> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > +test_expect_success 'A^! and A^-<n> (unmodified)' '
> > +	git range-diff --no-color topic^! unmodified^-1 >actual &&
> > +	cat >expect <<-EOF &&
> > +	1:  $(test_oid t4) = 1:  $(test_oid u4) s/12/B/
> > +	EOF
> > +	test_cmp expect actual
> > +'
>
> Now we actually parse the single-token range, instead of relying on
> "does it have dot-dot" heuristics, we can make sure that we reject
>
>     "HEAD^{/^string with .. in it}"
>
> as "not a range with negative and positive ends".

Sure, why not.

Ciao,
Dscho

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> +	init_revisions(&revs, NULL);
> +	if (setup_revisions(3, argv, &revs, 0) == 1)
> +		for (i = 0; i < revs.pending.nr; i++) {
> +			struct object *obj = revs.pending.objects[i].item;
> +
> +			if (obj->flags & UNINTERESTING)
> +				negative++;
> +			else
> +				positive++;
> +			if (obj->type == OBJ_COMMIT)
> +				clear_commit_marks((struct commit *)obj,
> +						   ALL_REV_FLAGS);

Ah, I didn't think of adding this inside the same loop.  As long as
the pending objects are independent, this should work, but it is
unsafe, I think.  Imagine what happens if revs.pending[0].item can
reach the object revs.pending[1].item?  By the time the loop wants
to inspect the latter, its flags may have been cleared already
because we clear flags in objects that are reachable from the
former.

Let's do this as a post-processing for safety (or correctness).

	int negpos = 0;

	if (setup_revisions(3, argv, &revs, 0) != 1)
		goto cleanup;
	for (i = 0; i < revs.pending.nr && negpos != 3; i++) {
		struct object *obj = revs.pending.objects[i].item;
		negpos |= (obj->flags & UNINTERESTING) ? 01 : 02;
	}

	for (i = 0; i < revs.pending.nr; i++) {
		struct object *obj = revs.pending.objects[i].item;
                if (obj->type == OBJ_COMMIT)
			clear_commit_marks((struct commit *)obj, ALL_REV_FLAGS);
	}

    cleanup:
> +	free(copy);
> +	object_array_clear(&revs.pending);
> +	return negative > 0 && positive > 0;
>  }

or something like that.

> +test_expect_success 'A^{/..} is not mistaken for a range' '
> +	test_must_fail git range-diff topic^.. topic^{/..} 2>error &&
> +	test_i18ngrep "not a commit rang" error
> +'

s/rang/range/, I would think.

Thanks.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Junio C Hamano <gitster@pobox.com> writes:

> Ah, I didn't think of adding this inside the same loop.  As long as
> the pending objects are independent, this should work, but it is
> unsafe, I think.

Here is what I added on top of the 3-patch series for tonight's
pushout.

 * No need to count positive/negative; just seeing both exists is
   sufficient and there is no need to examine any later elements, if
   any.

 * Clearing commit marks needs to be done after we have seen enough,
   or it can clear the stuff we would want to inspect before we
   have a chance to.  Do it in a separate post-cleaning loop.

 * Dedent by judicious use of 'goto'.

 * The last parameter to setup_revisions() is a pointer, so spell it
   NULL not 0.

 * "rang" -> "range" typofix (it might be even better to look for
   "range:")

----- >8 ---------- >8 ---------- >8 ---------- >8 ---------- >8 ---------- >8 -----
Subject: [PATCH] fixup! range-diff/format-patch: handle commit ranges other than A..B

 range-diff.c          | 30 +++++++++++++++++-------------
 t/t3206-range-diff.sh |  2 +-
 2 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/range-diff.c b/range-diff.c
index c307bca9de..7a38dc8715 100644
--- a/range-diff.c
+++ b/range-diff.c
@@ -570,24 +570,28 @@ int is_range_diff_range(const char *arg)
 {
 	char *copy = xstrdup(arg); /* setup_revisions() modifies it */
 	const char *argv[] = { "", copy, "--", NULL };
-	int i, positive = 0, negative = 0;
+	int i;
 	struct rev_info revs;
+	unsigned npmask = 0;
 
 	init_revisions(&revs, NULL);
-	if (setup_revisions(3, argv, &revs, 0) == 1)
-		for (i = 0; i < revs.pending.nr; i++) {
-			struct object *obj = revs.pending.objects[i].item;
+	if (setup_revisions(3, argv, &revs, NULL) != 1)
+		goto cleanup;
 
-			if (obj->flags & UNINTERESTING)
-				negative++;
-			else
-				positive++;
-			if (obj->type == OBJ_COMMIT)
-				clear_commit_marks((struct commit *)obj,
-						   ALL_REV_FLAGS);
-		}
+	for (i = 0; i < revs.pending.nr && npmask != 3; i++) {
+		struct object *obj = revs.pending.objects[i].item;
+
+		npmask |= (obj->flags & UNINTERESTING) ? 01 : 02;
+	}
+
+	for (i = 0; i < revs.pending.nr; i++) {
+		struct object *obj = revs.pending.objects[i].item;
+		if (obj->type == OBJ_COMMIT)
+			clear_commit_marks((struct commit *)obj, ALL_REV_FLAGS);
+	}
 
+cleanup:
 	free(copy);
 	object_array_clear(&revs.pending);
-	return negative > 0 && positive > 0;
+	return npmask == 3;
 }
diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
index 45f21ee215..2b518378d4 100755
--- a/t/t3206-range-diff.sh
+++ b/t/t3206-range-diff.sh
@@ -160,7 +160,7 @@ test_expect_success 'A^! and A^-<n> (unmodified)' '
 
 test_expect_success 'A^{/..} is not mistaken for a range' '
 	test_must_fail git range-diff topic^.. topic^{/..} 2>error &&
-	test_i18ngrep "not a commit rang" error
+	test_i18ngrep "not a commit range" error
 '
 
 test_expect_success 'trivial reordering' '
-- 
2.30.0-601-g9b6178ed87

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Junio,

On Thu, 4 Feb 2021, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
>
> > Ah, I didn't think of adding this inside the same loop.  As long as
> > the pending objects are independent, this should work, but it is
> > unsafe, I think.
>
> Here is what I added on top of the 3-patch series for tonight's
> pushout.
>
>  * No need to count positive/negative; just seeing both exists is
>    sufficient and there is no need to examine any later elements, if
>    any.

Your code is substantially harder to read. I really dislike that `npmask`
idea.

>  * Clearing commit marks needs to be done after we have seen enough,
>    or it can clear the stuff we would want to inspect before we
>    have a chance to.  Do it in a separate post-cleaning loop.

I implemented that.

>  * Dedent by judicious use of 'goto'.

Not necessary: the code is short and narrow enough.

>  * The last parameter to setup_revisions() is a pointer, so spell it
>    NULL not 0.

I had missed that in your review. Fixed, too.

>  * "rang" -> "range" typofix (it might be even better to look for
>    "range:")

Technically, it is not necessary, as we're only verifying that the
intended error message is shown, and that "e" does not make any
difference.

But it _was_ a typo, so I fixed it, too.

Next iteration is coming soon,
Dscho

>
> ----- >8 ---------- >8 ---------- >8 ---------- >8 ---------- >8 ---------- >8 -----
> Subject: [PATCH] fixup! range-diff/format-patch: handle commit ranges other than A..B
>
>  range-diff.c          | 30 +++++++++++++++++-------------
>  t/t3206-range-diff.sh |  2 +-
>  2 files changed, 18 insertions(+), 14 deletions(-)
>
> diff --git a/range-diff.c b/range-diff.c
> index c307bca9de..7a38dc8715 100644
> --- a/range-diff.c
> +++ b/range-diff.c
> @@ -570,24 +570,28 @@ int is_range_diff_range(const char *arg)
>  {
>  	char *copy = xstrdup(arg); /* setup_revisions() modifies it */
>  	const char *argv[] = { "", copy, "--", NULL };
> -	int i, positive = 0, negative = 0;
> +	int i;
>  	struct rev_info revs;
> +	unsigned npmask = 0;
>
>  	init_revisions(&revs, NULL);
> -	if (setup_revisions(3, argv, &revs, 0) == 1)
> -		for (i = 0; i < revs.pending.nr; i++) {
> -			struct object *obj = revs.pending.objects[i].item;
> +	if (setup_revisions(3, argv, &revs, NULL) != 1)
> +		goto cleanup;
>
> -			if (obj->flags & UNINTERESTING)
> -				negative++;
> -			else
> -				positive++;
> -			if (obj->type == OBJ_COMMIT)
> -				clear_commit_marks((struct commit *)obj,
> -						   ALL_REV_FLAGS);
> -		}
> +	for (i = 0; i < revs.pending.nr && npmask != 3; i++) {
> +		struct object *obj = revs.pending.objects[i].item;
> +
> +		npmask |= (obj->flags & UNINTERESTING) ? 01 : 02;
> +	}
> +
> +	for (i = 0; i < revs.pending.nr; i++) {
> +		struct object *obj = revs.pending.objects[i].item;
> +		if (obj->type == OBJ_COMMIT)
> +			clear_commit_marks((struct commit *)obj, ALL_REV_FLAGS);
> +	}
>
> +cleanup:
>  	free(copy);
>  	object_array_clear(&revs.pending);
> -	return negative > 0 && positive > 0;
> +	return npmask == 3;
>  }
> diff --git a/t/t3206-range-diff.sh b/t/t3206-range-diff.sh
> index 45f21ee215..2b518378d4 100755
> --- a/t/t3206-range-diff.sh
> +++ b/t/t3206-range-diff.sh
> @@ -160,7 +160,7 @@ test_expect_success 'A^! and A^-<n> (unmodified)' '
>
>  test_expect_success 'A^{/..} is not mistaken for a range' '
>  	test_must_fail git range-diff topic^.. topic^{/..} 2>error &&
> -	test_i18ngrep "not a commit rang" error
> +	test_i18ngrep "not a commit range" error
>  '
>
>  test_expect_success 'trivial reordering' '
> --
> 2.30.0-601-g9b6178ed87
>
>

#include "userdiff.h"
#include "apply.h"
#include "revision.h"

struct patch_util {
/* For the search for an exact match */
Expand Down Expand Up @@ -564,3 +565,31 @@ int show_range_diff(const char *range1, const char *range2,

return res;
}

int is_range_diff_range(const char *arg)
{
char *copy = xstrdup(arg); /* setup_revisions() modifies it */
const char *argv[] = { "", copy, "--", NULL };
int i, positive = 0, negative = 0;
struct rev_info revs;

init_revisions(&revs, NULL);
if (setup_revisions(3, argv, &revs, NULL) == 1) {
for (i = 0; i < revs.pending.nr; i++)
if (revs.pending.objects[i].item->flags & UNINTERESTING)
negative++;
else
positive++;
for (i = 0; i < revs.pending.nr; i++) {
struct object *obj = revs.pending.objects[i].item;

if (obj->type == OBJ_COMMIT)
clear_commit_marks((struct commit *)obj,
ALL_REV_FLAGS);
}
}

free(copy);
object_array_clear(&revs.pending);
return negative > 0 && positive > 0;
}
6 changes: 6 additions & 0 deletions range-diff.h
Expand Up @@ -16,4 +16,10 @@ int show_range_diff(const char *range1, const char *range2,
const struct diff_options *diffopt,
const struct strvec *other_arg);

/*
* Determine whether the given argument is usable as a range argument of `git
* range-diff`, e.g. A..B.
*/
int is_range_diff_range(const char *arg);

#endif
13 changes: 13 additions & 0 deletions t/t3206-range-diff.sh
Expand Up @@ -150,6 +150,19 @@ test_expect_success 'simple A B C (unmodified)' '
test_cmp expect actual
'

test_expect_success 'A^! and A^-<n> (unmodified)' '
git range-diff --no-color topic^! unmodified^-1 >actual &&
cat >expect <<-EOF &&
1: $(test_oid t4) = 1: $(test_oid u4) s/12/B/
EOF
test_cmp expect actual
'

test_expect_success 'A^{/..} is not mistaken for a range' '
test_must_fail git range-diff topic^.. topic^{/..} 2>error &&
test_i18ngrep "not a commit range" error
'

test_expect_success 'trivial reordering' '
git range-diff --no-color master topic reordered >actual &&
cat >expect <<-EOF &&
Expand Down