Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sequencer: remove use of hardcoded comment char #1603

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ttung
Copy link

@ttung ttung commented Oct 27, 2023

Instead of using the hardcoded # , use the
user-defined comment_line_char. Adds a test
to prevent regressions.

cc: Elijah Newren newren@gmail.com
cc: Tony Tung tonytung@merly.org
cc: Phillip Wood phillip.wood123@gmail.com

@ttung ttung changed the title sequencer: Remove use of comment character sequencer: Remove use of hardcoded comment char Oct 27, 2023
@gitgitgadget
Copy link

gitgitgadget bot commented Oct 27, 2023

Welcome to GitGitGadget

Hi @ttung, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that your Pull Request has a good description, as it will be used as cover letter. You can CC potential reviewers by adding a footer to the PR description with the following syntax:

CC: Revi Ewer <revi.ewer@example.com>, Ill Takalook <ill.takalook@example.net>

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "revisions:" to state which subsystem the change is about, and
  • the commit messages' body should be describing the "why?" of the change.
  • Finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed:

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the Libera Chat IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

If you want to see what email(s) would be sent for a /submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this. Note that any reviewers CC'd via the list in the PR description will not actually be sent emails.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail).

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the (raw) link), then import it into your mail program. If you use GMail, you can do this via:

curl -g --user "<EMailAddress>:<Password>" \
    --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):

Changes since v1:
- Fixed a typo in the commit message (found by ...)
- Added a code comment to ... as suggested by ...
...

To send a new iteration, just add another PR comment with the contents: /submit.

Need help?

New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join.

You may also be able to find help in real time in the developer IRC channel, #git-devel on Libera Chat. Remember that IRC does not support offline messaging, so if you send someone a private message and log out, they cannot respond to you. The scrollback of #git-devel is archived, though.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 27, 2023

There are issues in commit 7a51b73:
sequencer: Remove use of comment character
Prefixed commit message must be in lower case

@ttung ttung changed the title sequencer: Remove use of hardcoded comment char sequencer: remove use of hardcoded comment char Oct 27, 2023
@dscho
Copy link
Member

dscho commented Oct 29, 2023

/allow

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 29, 2023

User ttung is now allowed to use GitGitGadget.

WARNING: ttung has no public email address set on GitHub;
GitGitGadget needs an email address to Cc: you on your contribution, so that you receive any feedback on the Git mailing list. Go to https://github.com/settings/profile to make your preferred email public to let GitGitGadget know which email address to use.

@ttung
Copy link
Author

ttung commented Oct 30, 2023

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 30, 2023

Submitted as pull.1603.git.1698635292629.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1603/ttung/ttung/commentchar-v1

To fetch this version to local tag pr-1603/ttung/ttung/commentchar-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1603/ttung/ttung/commentchar-v1

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 30, 2023

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Tony Tung via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Tony Tung <tonytung@merly.org>
>
> Instead of using the hardcoded `# `, use the
> user-defined comment_line_char.  Adds a test
> to prevent regressions.

Good spotting.

Two observations.

 (1) There are a few more places that need similar treatment in the
     same file; you may want to fix them all while at it.

 (2) The second argument to strbuf_commented_addf() is always the
     comment_line_char global variable, not just inside this file
     but all callers across the codebase.  We probably should drop
     it and have the strbuf_commented_addf() helper itself refer to
     the global.  That way, if we ever want to change the global
     variable reference to something else (e.g. function call), we
     only have to touch a single place.

The latter is meant as #leftoverbits and will be a lot wider
clean-up that we may want to do long after this patch hits out
codebase.  The "other places" I spotted for the former are the
following, but needs to be taken with a huge grain of salt, as it
has not even been compile tested.

Thanks.

 sequencer.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git c/sequencer.c w/sequencer.c
index d584cac8ed..33208b1660 100644
--- c/sequencer.c
+++ w/sequencer.c
@@ -1893,8 +1893,8 @@ static void update_squash_message_for_fixup(struct strbuf *msg)
 	size_t orig_msg_len;
 	int i = 1;
 
-	strbuf_addf(&buf1, "# %s\n", _(first_commit_msg_str));
-	strbuf_addf(&buf2, "# %s\n", _(skip_first_commit_msg_str));
+	strbuf_addf(&buf1, comment_line_char, "%s\n", _(first_commit_msg_str));
+	strbuf_addf(&buf2, comment_line_char, "%s\n", _(skip_first_commit_msg_str));
 	s = start = orig_msg = strbuf_detach(msg, &orig_msg_len);
 	while (s) {
 		const char *next;
@@ -2269,8 +2269,8 @@ static int do_pick_commit(struct repository *r,
 		next = parent;
 		next_label = msg.parent_label;
 		if (opts->commit_use_reference) {
-			strbuf_addstr(&msgbuf,
-				"# *** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***");
+			strbuf_commented_addf(&msgbuf, comment_line_char, "%s",
+				"*** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***");
 		} else if (skip_prefix(msg.subject, "Revert \"", &orig_subject) &&
 			   /*
 			    * We don't touch pre-existing repeated reverts, because

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 30, 2023

On the Git mailing list, Elijah Newren wrote (reply to this):

On Sun, Oct 29, 2023 at 9:01 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Tony Tung via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: Tony Tung <tonytung@merly.org>
> >
> > Instead of using the hardcoded `# `, use the
> > user-defined comment_line_char.  Adds a test
> > to prevent regressions.
>
> Good spotting.
>
> Two observations.
>
>  (1) There are a few more places that need similar treatment in the
>      same file; you may want to fix them all while at it.
>
>  (2) The second argument to strbuf_commented_addf() is always the
>      comment_line_char global variable, not just inside this file
>      but all callers across the codebase.  We probably should drop
>      it and have the strbuf_commented_addf() helper itself refer to
>      the global.  That way, if we ever want to change the global
>      variable reference to something else (e.g. function call), we
>      only have to touch a single place.
>
> The latter is meant as #leftoverbits and will be a lot wider
> clean-up that we may want to do long after this patch hits out
> codebase.  The "other places" I spotted for the former are the
> following, but needs to be taken with a huge grain of salt, as it
> has not even been compile tested.
>
> Thanks.
>
>  sequencer.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git c/sequencer.c w/sequencer.c
> index d584cac8ed..33208b1660 100644
> --- c/sequencer.c
> +++ w/sequencer.c
> @@ -1893,8 +1893,8 @@ static void update_squash_message_for_fixup(struct strbuf *msg)
>         size_t orig_msg_len;
>         int i = 1;
>
> -       strbuf_addf(&buf1, "# %s\n", _(first_commit_msg_str));
> -       strbuf_addf(&buf2, "# %s\n", _(skip_first_commit_msg_str));
> +       strbuf_addf(&buf1, comment_line_char, "%s\n", _(first_commit_msg_str));
> +       strbuf_addf(&buf2, comment_line_char, "%s\n", _(skip_first_commit_msg_str));
>         s = start = orig_msg = strbuf_detach(msg, &orig_msg_len);
>         while (s) {
>                 const char *next;
> @@ -2269,8 +2269,8 @@ static int do_pick_commit(struct repository *r,
>                 next = parent;
>                 next_label = msg.parent_label;
>                 if (opts->commit_use_reference) {
> -                       strbuf_addstr(&msgbuf,
> -                               "# *** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***");
> +                       strbuf_commented_addf(&msgbuf, comment_line_char, "%s",
> +                               "*** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***");
>                 } else if (skip_prefix(msg.subject, "Revert \"", &orig_subject) &&
>                            /*
>                             * We don't touch pre-existing repeated reverts, because
>

I thought the point of the comment_line_char was so that commit
messages could have lines starting with '#'.  That rationale doesn't
apply to the TODO list generation or parsing, and I'm not sure if we
want to add the same complexity there.  If we do want to add the same
complexity there, I'm worried that making these changes are
insufficent; there are some other hardcoded '#' references in the code
(as a quick greps for '".*#' and "'#'" will turn up).  Since those
other references include parsing as well as generation, I think we
might actually be introducing bugs in the TODO list handling if we
only partially convert it, but someone would need to double check.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 30, 2023

User Elijah Newren <newren@gmail.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 30, 2023

On the Git mailing list, Junio C Hamano wrote (reply to this):

Elijah Newren <newren@gmail.com> writes:

> I thought the point of the comment_line_char was so that commit
> messages could have lines starting with '#'.  That rationale doesn't
> apply to the TODO list generation or parsing, and I'm not sure if we
> want to add the same complexity there.

Thanks for a healthy dose of sanity.  I noticed existing use of
comment_line_char everywhere in sequencer.c and assumed we would
want to be consistent, but you are right to point out that they are
all about the COMMIT_EDITMSG kind of thing, and not about what
appears in "sequencer/todo".

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 31, 2023

On the Git mailing list, Tony Tung wrote (reply to this):

> On Oct 30, 2023, at 4:35 PM, Junio C Hamano <gitster@pobox.com> wrote:
> 
> Elijah Newren <newren@gmail.com> writes:
> 
>> I thought the point of the comment_line_char was so that commit
>> messages could have lines starting with '#'.  That rationale doesn't
>> apply to the TODO list generation or parsing, and I'm not sure if we
>> want to add the same complexity there.
> 
> Thanks for a healthy dose of sanity.  I noticed existing use of
> comment_line_char everywhere in sequencer.c and assumed we would
> want to be consistent, but you are right to point out that they are
> all about the COMMIT_EDITMSG kind of thing, and not about what
> appears in "sequencer/todo”.

I believe comment_line_char is being applied when the sequencer reads back the instructions, which is why I ran into this problem to begin with.

If the intent is not to apply it to the sequencer, then the bugfix is in the wrong place.

Thanks

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 31, 2023

User Tony Tung <tonytung@merly.org> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 31, 2023

On the Git mailing list, Tony Tung wrote (reply to this):

> On Oct 30, 2023, at 4:35 PM, Junio C Hamano <gitster@pobox.com> wrote:
> 
> Elijah Newren <newren@gmail.com> writes:
> 
>> I thought the point of the comment_line_char was so that commit
>> messages could have lines starting with '#'.  That rationale doesn't
>> apply to the TODO list generation or parsing, and I'm not sure if we
>> want to add the same complexity there.
> 
> Thanks for a healthy dose of sanity.  I noticed existing use of
> comment_line_char everywhere in sequencer.c and assumed we would
> want to be consistent, but you are right to point out that they are
> all about the COMMIT_EDITMSG kind of thing, and not about what
> appears in "sequencer/todo”.

Actually, I withdraw my previous comment.  comment_line_char is all over sequencer.c, and there are tests that say, "rebase -i respects core.commentchar”, which strongly implies that the *intent* is that rebase -i respects comment_line_char.

I would propose that we move forward with this, except perhaps removing more instances of comment_line_char.

Thanks.


@ttung
Copy link
Author

ttung commented Oct 31, 2023

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 31, 2023

Submitted as pull.1603.v2.git.1698728952.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1603/ttung/ttung/commentchar-v2

To fetch this version to local tag pr-1603/ttung/ttung/commentchar-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1603/ttung/ttung/commentchar-v2

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 31, 2023

On the Git mailing list, Junio C Hamano wrote (reply to this):

Junio C Hamano <gitster@pobox.com> writes:

> Elijah Newren <newren@gmail.com> writes:
>
>> I thought the point of the comment_line_char was so that commit
>> messages could have lines starting with '#'.  That rationale doesn't
>> apply to the TODO list generation or parsing, and I'm not sure if we
>> want to add the same complexity there.

Earlier I said

> Thanks for a healthy dose of sanity.  I noticed existing use of
> comment_line_char everywhere in sequencer.c and assumed we would
> want to be consistent, but you are right to point out that they are
> all about the COMMIT_EDITMSG kind of thing, and not about what
> appears in "sequencer/todo".

but with something as simple as

    $ git -c core.commentchar='@' rebase -i master seen^2

I can see that the references to comment_line_char in sequencer.c
are about the commented lines after the list of insn in the
generated sequencer/todo file, so even though the rationale does not
apply, isn't this already "broken" in the current code anyway?

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 31, 2023

On the Git mailing list, Elijah Newren wrote (reply to this):

On Mon, Oct 30, 2023 at 10:33 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Elijah Newren <newren@gmail.com> writes:
> >
> >> I thought the point of the comment_line_char was so that commit
> >> messages could have lines starting with '#'.  That rationale doesn't
> >> apply to the TODO list generation or parsing, and I'm not sure if we
> >> want to add the same complexity there.
>
> Earlier I said
>
> > Thanks for a healthy dose of sanity.  I noticed existing use of
> > comment_line_char everywhere in sequencer.c and assumed we would
> > want to be consistent, but you are right to point out that they are
> > all about the COMMIT_EDITMSG kind of thing, and not about what
> > appears in "sequencer/todo".
>
> but with something as simple as
>
>     $ git -c core.commentchar='@' rebase -i master seen^2
>
> I can see that the references to comment_line_char in sequencer.c
> are about the commented lines after the list of insn in the
> generated sequencer/todo file, so even though the rationale does not
> apply, isn't this already "broken" in the current code anyway?

Yes, I believe it is.  However, I remember specifically looking at
cases with --rebase-merges about a year and a half ago, and noted that
there was a mixture of hardcoded '#' references along with
comment_line_char.  I noted at the time that changing
comment_line_char looked like it had a bug, and that the parsing in
particular would be fooled and do wrong things if it changed.
Unfortunately, I can't find any notes from the time with the details,
so I don't remember exactly what or how it was triggered.

However, I do suspect that the references to comment_line_char in the
`rebase -i` codepaths was not for any actual intended purpose, but
just noting that they were used elsewhere in the file (for
COMMIT_EDITMSG, where it made sense) and just mimicking that code
without realizing the lack of rationale.  That would have been mere
wasted effort had the comment_line_char been consistently supported in
the TODO file editing and parsing, but it wasn't, which left TODO
editing & parsing somewhat broken.

I think supporting comment_line_char for the TODO file provides no
value, and I think the easier fix would be undoing the uses of
comment_line_char relative to the TODO file (perhaps also leaving
in-code comments to the effect that comment_line_char just doesn't
apply to the TODO file).

However, if someone prefers to make the TODO file also respect
comment_line_char, despite its dubious value, then I expect any patch
should
  1) audit *every* reference found via git grep -e '".*#' -e "'#'" sequencer.c
  2) add a test case (or cases) involving --rebase-merges -i that
trigger the relevant code paths
If they don't do that, then I fear we might make the bug more likely
to be triggered rather than less.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 31, 2023

On the Git mailing list, Elijah Newren wrote (reply to this):

Hi,

On Mon, Oct 30, 2023 at 10:09 PM Tony Tung via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> Instead of using the hardcoded # , use the user-defined comment_line_char.
> Adds a test to prevent regressions.
>
> Tony Tung (2):
>   sequencer: remove use of comment character
>   sequencer: fix remaining hardcoded comment char

The second commit message seems to suggest that the two commits should
just be squashed; there's no explicit or even implicit reason provided
for why the two small patches are logically independent.  After
reading them carefully, and digging through the particular changes
being made and what part of the code they touch, I think I can guess
at a potential reason, but I feel like I'm crossing into the territory
of mind reading trying to articulate that reason.  (Besides, my
rationale would argue that the two patches should be split
differently.)  Perhaps a comment could be added, to either the second
commit message or the cover letter, to explain that better?

More importantly, though, I think the second commit message is
actually wrong.  Before and after applying this series:

$ git grep -c -e '".*#' -e "'#'" -- sequencer.c
sequencer.c:16

$ b4 am c9f4ff34dbdb7ba221e4203bb6551b80948dc71d.1698728953.git.gitgitgadget@gmail.com
$ git am ./v2_20231031_gitgitgadget_sequencer_remove_use_of_hardcoded_comment_char.mbx

$ git grep -c -e '".*#' -e "'#'" -- sequencer.c
sequencer.c:12

Granted, four of those lines are code comments, but that still leaves
8 hard coded references to '#' in the code at the end (i.e. the
majority are still left), meaning your second patch doesn't do what
its subject line claims.

And, most important of all is still the first patch.  As I stated
elsewhere in this thread (at
CABPp-BFY7m_g+sT131_Ubxqo5FsHGKOPMng7=90_0-+xCS9NEQ@mail.gmail.com):

"""
I think supporting comment_line_char for the TODO file provides no
value, and I think the easier fix would be undoing the uses of
comment_line_char relative to the TODO file (perhaps also leaving
in-code comments to the effect that comment_line_char just doesn't
apply to the TODO file).

However, if someone prefers to make the TODO file also respect
comment_line_char, despite its dubious value, then I expect any patch
should
  1) audit *every* reference found via git grep -e '".*#' -e "'#'" sequencer.c
  2) add a test case (or cases) involving --rebase-merges -i that
trigger the relevant code paths
If they don't do that, then I fear we might make the bug more likely
to be triggered rather than less.
"""

Personally, I would rather not accept patches changing the handling of
the TODO script relative to comment_line_char until the above is done,
and I worry that half measures _might_ end up being more hurtful than
helpful.

I feel quite differently about patches that make COMMIT_EDITMSG
handling use comment_line_char more consistently since that code
simply writes the file without re-parsing it; although fixing
everything would be best, even fixing some of them to use
comment_line_char would be welcome.  I think the first two hunks of
your second patch happen to fall into this category, so if those were
split out, then I'd say those are good partial solutions.

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 31, 2023

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Elijah

On 31/10/2023 06:55, Elijah Newren wrote:
> Hi,
> > On Mon, Oct 30, 2023 at 10:09 PM Tony Tung via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> Instead of using the hardcoded # , use the user-defined comment_line_char.
>> Adds a test to prevent regressions.
>>
>> Tony Tung (2):
>>    sequencer: remove use of comment character
>>    sequencer: fix remaining hardcoded comment char
> > The second commit message seems to suggest that the two commits should
> just be squashed; there's no explicit or even implicit reason provided
> for why the two small patches are logically independent.  After
> reading them carefully, and digging through the particular changes
> being made and what part of the code they touch, I think I can guess
> at a potential reason, but I feel like I'm crossing into the territory
> of mind reading trying to articulate that reason.  (Besides, my
> rationale would argue that the two patches should be split
> differently.)  Perhaps a comment could be added, to either the second
> commit message or the cover letter, to explain that better?
> > More importantly, though, I think the second commit message is
> actually wrong.  Before and after applying this series:
> > $ git grep -c -e '".*#' -e "'#'" -- sequencer.c
> sequencer.c:16
> > $ b4 am c9f4ff34dbdb7ba221e4203bb6551b80948dc71d.1698728953.git.gitgitgadget@gmail.com
> $ git am ./v2_20231031_gitgitgadget_sequencer_remove_use_of_hardcoded_comment_char.mbx
> > $ git grep -c -e '".*#' -e "'#'" -- sequencer.c
> sequencer.c:12

As far as I can see those remaining instances are all to do with the '#' that separates a merge subject line from its parents. I don't think we need to complicate things anymore by respecting core.commentchar there as the '#' is not denoting a commented line, it is being used as an intra-line separator instead.

> Granted, four of those lines are code comments, but that still leaves
> 8 hard coded references to '#' in the code at the end (i.e. the
> majority are still left), meaning your second patch doesn't do what
> its subject line claims.
> > And, most important of all is still the first patch.  As I stated
> elsewhere in this thread (at
> CABPp-BFY7m_g+sT131_Ubxqo5FsHGKOPMng7=90_0-+xCS9NEQ@mail.gmail.com):
> > """
> I think supporting comment_line_char for the TODO file provides no
> value, and I think the easier fix would be undoing the uses of
> comment_line_char relative to the TODO file (perhaps also leaving
> in-code comments to the effect that comment_line_char just doesn't
> apply to the TODO file).

I agree that I don't see much point in respecting core.commentchar in the TODO file as unlike a commit message a legitimate non-commented line will never begin with '#'. Unfortunately I think we're committed to respecting it - see 180bad3d10f (rebase -i: respect core.commentchar, 2013-02-11)

> [...] > I feel quite differently about patches that make COMMIT_EDITMSG
> handling use comment_line_char more consistently since that code
> simply writes the file without re-parsing it; although fixing
> everything would be best, even fixing some of them to use
> comment_line_char would be welcome.  I think the first two hunks of
> your second patch happen to fall into this category, so if those were
> split out, then I'd say those are good partial solutions.

I think splitting the changes so that we have one patch that fixes the TODO file generation and another that fixes the commit message generation for fixup commands would be best.

Best Wishes

Phillip

@gitgitgadget
Copy link

gitgitgadget bot commented Oct 31, 2023

User Phillip Wood <phillip.wood123@gmail.com> has been added to the cc: list.

@@ -1893,8 +1893,14 @@
size_t orig_msg_len;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Tony

Thanks for working on this. When you're submitting patches please try testing them locally and check the CI before doing '/submit'. In this case all the jobs that try to compile git are failing - see https://github.com/gitgitgadget/git/actions/runs/6702301267/job/18211090139?pr=1603#step:4:260 for example.

On 31/10/2023 05:09, Tony Tung via GitGitGadget wrote:
> From: Tony Tung <tonytung@merly.org>
> > Signed-off-by: Tony Tung <tonytung@merly.org>
> ---
>   sequencer.c | 18 +++++++++++++-----
>   1 file changed, 13 insertions(+), 5 deletions(-)
> > diff --git a/sequencer.c b/sequencer.c
> index 8c6666d5e43..bbadc3fb710 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1893,8 +1893,14 @@ static void update_squash_message_for_fixup(struct strbuf *msg)
>   	size_t orig_msg_len;
>   	int i = 1;
>   > -	strbuf_addf(&buf1, "# %s\n", _(first_commit_msg_str));
> -	strbuf_addf(&buf2, "# %s\n", _(skip_first_commit_msg_str));
> +	strbuf_commented_addf(&buf1,
> +			      comment_line_char,
> +			      "%s\n",
> +			      _(first_commit_msg_str));

This is what is causing the compilation the fail. It should be

	strbuf_commented_addf(&buf1, "%c $s\n", comment_char_line,
			      _(first_commit_msg_str));

> +	strbuf_commented_addf(&buf2,
> +			      comment_line_char,
> +			      "%s\n",
> +			      _(skip_first_commit_msg_str));

This needs changing in the same way.

It would be nice to add a test for this. I think we can do that by adding

	test_config core.commentchar :

To the beginning of the test '--skip after failed fixup cleans commit message' in t3418-rebase-continue.sh and changing the expected message so that it uses ':' instead of '#' for the comments.

>   	s = start = orig_msg = strbuf_detach(msg, &orig_msg_len);
>   	while (s) {
>   		const char *next;
> @@ -2269,8 +2275,9 @@ static int do_pick_commit(struct repository *r,
>   		next = parent;
>   		next_label = msg.parent_label;
>   		if (opts->commit_use_reference) {
> -			strbuf_addstr(&msgbuf,
> -				"# *** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***");
> +			strbuf_commented_addf(&msgbuf,
> +					      "%s",
> +					      "*** SAY WHY WE ARE REVERTING ON THE TITLE LINE ***");
>   		} else if (skip_prefix(msg.subject, "Revert \"", &orig_subject) &&
>   			   /*
>   			    * We don't touch pre-existing repeated reverts, because
> @@ -6082,7 +6089,8 @@ static int add_decorations_to_list(const struct commit *commit,
>   		/* If the branch is checked out, then leave a comment instead. */
>   		if ((path = branch_checked_out(decoration->name))) {
>   			item->command = TODO_COMMENT;
> -			strbuf_commented_addf(ctx->buf, comment_line_char,
> +			strbuf_commented_addf(ctx->buf,
> +					      comment_line_char,
>   					      "Ref %s checked out at '%s'\n",
>   					      decoration->name, path);

This hunk is unnecessary - it is just changing the code you added in the first patch.

Best Wishes

Phillip

@@ -6082,8 +6082,9 @@ static int add_decorations_to_list(const struct commit *commit,
/* If the branch is checked out, then leave a comment instead. */
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Tony

On 31/10/2023 05:09, Tony Tung via GitGitGadget wrote:
> From: Tony Tung <tonytung@merly.org>
> > Instead of using the hardcoded `# `, use the
> user-defined comment_line_char.  Adds a test
> to prevent regressions.

Well spotted and thanks for fixing this. Normally we wrap the commit message at ~72 chars.

> Signed-off-by: Tony Tung <tonytung@merly.org>
> ---
>   sequencer.c                   |  5 +++--
>   t/t3404-rebase-interactive.sh | 39 +++++++++++++++++++++++++++++++++++
>   2 files changed, 42 insertions(+), 2 deletions(-)
> > diff --git a/sequencer.c b/sequencer.c
> index d584cac8ed9..8c6666d5e43 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -6082,8 +6082,9 @@ static int add_decorations_to_list(const struct commit *commit,
>   		/* If the branch is checked out, then leave a comment instead. */
>   		if ((path = branch_checked_out(decoration->name))) {
>   			item->command = TODO_COMMENT;
> -			strbuf_addf(ctx->buf, "# Ref %s checked out at '%s'\n",
> -				    decoration->name, path);
> +			strbuf_commented_addf(ctx->buf, comment_line_char,
> +					      "Ref %s checked out at '%s'\n",
> +					      decoration->name, path);
>   		} else {
>   			struct string_list_item *sti;
>   			item->command = TODO_UPDATE_REF;
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index 8ea2bf13026..076dca87871 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -1839,6 +1839,45 @@ test_expect_success '--update-refs adds label and update-ref commands' '
>   	)
>   '

Thank you for taking the time to add a test. I think it could be simplified though as all we really need to do is check that the expected comment is present in the todo list. Something like (untested)

test_expect_success '--update-refs works with core.commentChar' '
	git worktree add new-branch &&
	test_when_finished "git worktree remove new-branch" &&
	test_config core.commentchar : &&
	write_script fake-editor.sh <<-\EOF &&
	grep "^: Ref refs/heads/new-branch checked out at .*new-branch" "$1" &&
	# no need to rebase
	>"$1"
	EOF
	(
		test_set_editor "$(pwd)/fake-editor.sh" &&
		git rebase -i --update-refs HEAD^
	)
'

Best Wishes

Phillip

> +test_expect_success '--update-refs works with core.commentChar' '
> +	git checkout -b update-refs-with-commentchar no-conflict-branch &&
> +	test_config core.commentChar : &&
> +	git branch -f base HEAD~4 &&
> +	git branch -f first HEAD~3 &&
> +	git branch -f second HEAD~3 &&
> +	git branch -f third HEAD~1 &&
> +	git commit --allow-empty --fixup=third &&
> +	git branch -f is-not-reordered &&
> +	git commit --allow-empty --fixup=HEAD~4 &&
> +	git branch -f shared-tip &&
> +	git checkout update-refs &&
> +	(
> +		write_script fake-editor.sh <<-\EOF &&
> +		grep "^[^:]" "$1"
> +		exit 1
> +		EOF
> +		test_set_editor "$(pwd)/fake-editor.sh" &&
> +
> +		cat >expect <<-EOF &&
> +		pick $(git log -1 --format=%h J) J
> +		fixup $(git log -1 --format=%h update-refs) fixup! J : empty
> +		update-ref refs/heads/second
> +		update-ref refs/heads/first
> +		pick $(git log -1 --format=%h K) K
> +		pick $(git log -1 --format=%h L) L
> +		fixup $(git log -1 --format=%h is-not-reordered) fixup! L : empty
> +		update-ref refs/heads/third
> +		pick $(git log -1 --format=%h M) M
> +		update-ref refs/heads/no-conflict-branch
> +		update-ref refs/heads/is-not-reordered
> +		update-ref refs/heads/update-refs-with-commentchar
> +		EOF
> +
> +		test_must_fail git rebase -i --autosquash --update-refs primary shared-tip >todo &&
> +		test_cmp expect todo
> +	)
> +'
> +
>   test_expect_success '--update-refs adds commands with --rebase-merges' '
>   	git checkout -b update-refs-with-merge no-conflict-branch &&
>   	git branch -f base HEAD~4 &&

Copy link

gitgitgadget bot commented Nov 1, 2023

On the Git mailing list, Junio C Hamano wrote (reply to this):

Phillip Wood <phillip.wood123@gmail.com> writes:

> As far as I can see those remaining instances are all to do with the
> '#' that separates a merge subject line from its parents. I don't
> think we need to complicate things anymore by respecting
> core.commentchar there as the '#' is not denoting a commented line, it
> is being used as an intra-line separator instead.

It is unfortunate that the format of the file needs an intra-line
separator in the first place, but I tend to agree with you that the
comment-line-char would be a terrible fit there.  '#' or any
replacement character at the beginning of a line is easy to spot as
a signal that the line in its entirety is commented out, but it is
much harder to eyeball-spot a single punctuation character in the
middle of a line.  If we do not have to look for a different
character depending on the comment-line-char setting, it would make
the system (slightly) easier to use.

> I agree that I don't see much point in respecting core.commentchar in
> the TODO file as unlike a commit message a legitimate non-commented
> line will never begin with '#'. Unfortunately I think we're committed
> to respecting it - see 180bad3d10f (rebase -i: respect
> core.commentchar, 2013-02-11)

Yeah, the ship has long sailed.

> I think splitting the changes so that we have one patch that fixes the
> TODO file generation and another that fixes the commit message
> generation for fixup commands would be best.

Yes, it would be great.

Thanks.

Copy link

gitgitgadget bot commented Nov 1, 2023

On the Git mailing list, Elijah Newren wrote (reply to this):

On Tue, Oct 31, 2023 at 4:18 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> Hi Elijah
>
> On 31/10/2023 06:55, Elijah Newren wrote:
> > Hi,
> >
> > On Mon, Oct 30, 2023 at 10:09 PM Tony Tung via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> Instead of using the hardcoded # , use the user-defined comment_line_char.
> >> Adds a test to prevent regressions.
> >>
> >> Tony Tung (2):
> >>    sequencer: remove use of comment character
> >>    sequencer: fix remaining hardcoded comment char
> >
> > The second commit message seems to suggest that the two commits should
> > just be squashed; there's no explicit or even implicit reason provided
> > for why the two small patches are logically independent.  After
> > reading them carefully, and digging through the particular changes
> > being made and what part of the code they touch, I think I can guess
> > at a potential reason, but I feel like I'm crossing into the territory
> > of mind reading trying to articulate that reason.  (Besides, my
> > rationale would argue that the two patches should be split
> > differently.)  Perhaps a comment could be added, to either the second
> > commit message or the cover letter, to explain that better?
> >
> > More importantly, though, I think the second commit message is
> > actually wrong.  Before and after applying this series:
> >
> > $ git grep -c -e '".*#' -e "'#'" -- sequencer.c
> > sequencer.c:16
> >
> > $ b4 am c9f4ff34dbdb7ba221e4203bb6551b80948dc71d.1698728953.git.gitgitgadget@gmail.com
> > $ git am ./v2_20231031_gitgitgadget_sequencer_remove_use_of_hardcoded_comment_char.mbx
> >
> > $ git grep -c -e '".*#' -e "'#'" -- sequencer.c
> > sequencer.c:12
>
> As far as I can see those remaining instances are all to do with the '#'
> that separates a merge subject line from its parents. I don't think we
> need to complicate things anymore by respecting core.commentchar there
> as the '#' is not denoting a commented line, it is being used as an
> intra-line separator instead.

Ah, that might be jogging my memory slightly.  I had a patch to put a
comment before the one-line commit summaries in the TODO list
(https://github.com/git/git/commit/f1ae608477e010b96557d6fc87eed9f3f39b905e).
I think I at some point noticed comment_line_char, and went to switch
to it, probably also switching the mid-line comment char for merges,
and then noticed the potential for breakage due to the manual parsing
of those.

Anyway, I trust your analysis, but I believe some of that analysis
belongs in the relevant commit messages if we push forward with these
changes.

> > Granted, four of those lines are code comments, but that still leaves
> > 8 hard coded references to '#' in the code at the end (i.e. the
> > majority are still left), meaning your second patch doesn't do what
> > its subject line claims.
> >
> > And, most important of all is still the first patch.  As I stated
> > elsewhere in this thread (at
> > CABPp-BFY7m_g+sT131_Ubxqo5FsHGKOPMng7=90_0-+xCS9NEQ@mail.gmail.com):
> >
> > """
> > I think supporting comment_line_char for the TODO file provides no
> > value, and I think the easier fix would be undoing the uses of
> > comment_line_char relative to the TODO file (perhaps also leaving
> > in-code comments to the effect that comment_line_char just doesn't
> > apply to the TODO file).
>
> I agree that I don't see much point in respecting core.commentchar in
> the TODO file as unlike a commit message a legitimate non-commented line
> will never begin with '#'. Unfortunately I think we're committed to
> respecting it - see 180bad3d10f (rebase -i: respect core.commentchar,
> 2013-02-11)

Thanks for digging up the old commit and the explicit mention of the
TODO file.  Kind of disappointing.  While I can't imagine anything
that would actually break by reverting this, it's not worth it at this
point.

> > [...]
> > I feel quite differently about patches that make COMMIT_EDITMSG
> > handling use comment_line_char more consistently since that code
> > simply writes the file without re-parsing it; although fixing
> > everything would be best, even fixing some of them to use
> > comment_line_char would be welcome.  I think the first two hunks of
> > your second patch happen to fall into this category, so if those were
> > split out, then I'd say those are good partial solutions.
>
> I think splitting the changes so that we have one patch that fixes the
> TODO file generation and another that fixes the commit message
> generation for fixup commands would be best.

That would seem more logical to me.

@@ -6082,8 +6082,9 @@ static int add_decorations_to_list(const struct commit *commit,
/* If the branch is checked out, then leave a comment instead. */
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Tony Tung via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Subject: Re: [PATCH v2 1/2] sequencer: remove use of comment character

The patch does not seem to be doing that, though.  It may have
removed '#' in "# Ref", but still uses comment_line_char, so it does
not remove use at all (and we do not want to, of course).

"use the core.commentchar consistently"

> From: Tony Tung <tonytung@merly.org>
>
> Instead of using the hardcoded `# `, use the
> user-defined comment_line_char.  Adds a test
> to prevent regressions.

Overly short lines.

The readers cannot tell where in the output the hardcoded # appears
with the above description. I am guessing that it is in comments in
the sequencer/todo file that mark commits that are at the tip of
branches that are checked out, but there may be more specific
circumstances in which the comment is used, like "when rebase -i is
used with the --update-refs option", if so that also need to be told
to the readers.

Describe the condition well enough so that readers can easily see
the defect the patch attempts to fix.

> -			strbuf_addf(ctx->buf, "# Ref %s checked out at '%s'\n",
> -				    decoration->name, path);
> +			strbuf_commented_addf(ctx->buf, comment_line_char,
> +					      "Ref %s checked out at '%s'\n",
> +					      decoration->name, path);

OK.

Instead of using the hardcoded `# `, use the
user-defined comment_line_char.  Adds a test
to prevent regressions.

Signed-off-by: Tony Tung <tonytung@merly.org>
Signed-off-by: Tony Tung <tonytung@merly.org>
@dscho
Copy link
Member

dscho commented Dec 2, 2023

@ttung ttung force-pushed the ttung/commentchar branch from 6a43527 to e1b59b3

@ttung did you mean to /submit a new iteration?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants