commit-extra-checker and commits new to branch but not repository #20

jsm28 · 2021-08-03T17:04:02Z

In GCC, we use the commit-extra-checker to make several checks on commits pushed to the GCC repository. Some of those commits apply to all branches, and some only to master and release branches.

We've observed the following at least once: a commit gets pushed to a branch that is not master or a release branch, so subject to weaker changes, then the same commit gets pushed to master, so bypassing the stronger checks that the commit-extra-checker script applies to commits going on master.

Thus, we'd like that hook, or something equivalent, to apply to all commits new to the ref being updated, whether or not they are new to the repository. But at the same time, we don't want to recheck 187113 existing commits, many of which would fail the checks, when a new branch is created based on master; this is only about updates to existing branches. Maybe we need a way to distinguish "checks for commits new to the repository" (which shouldn't depend on the ref being updated) from "checks for commits new to an existing ref" (which should apply to existing commits, not just commits new to the repository).

brobecke · 2021-08-17T20:57:21Z

Hi @jsm28 ,

Indeed, the commit-extra-checker only gets called when the commit is entirely new to the repository. In other words, if assumes that if the change is good for one branch, then it's good for all branches.

At the moment, I can't think of a good way to really express in general ways a feature which would allow the hooks to call your checker for all the commits you want to call it for, and only those commits. You nailed it that not wanting to re-check the 187113 pre-existing commits already on master is a must: I have a similar problem in another context when using Gerrit, and it's causing a lot of problems.

What I might recommend is that you start experimenting with a hooks.update-hook script that implements the logic you're looking for. That way, you can embed in that script some project-specific knowledge about which branches have weaker requirements, and therefore when a pre-existing commit needs to be re-checked for a official branches.

If this helps us design a clean and easy to understand set of knobs, I can put that on my list of things to look at after I've finished the transition to Python 3.x (I am getting pretty close, now, with being able to run most testscases with Python 3.x and reaching 99% code coverage -- something I will resume working on in Sep/Oct).

Other than that, one question I asked myself is whether it's a good idea or not to have different levels of requirement levels depending on the type of branch. Is that something that the GCC project might want to reconsider? I'm guessing not, and you do not have to explain the reasons to me, but I thought I'd mention it here, in case it was an option. Failing that, perhaps another cheap option might be to change a bit how those extra checks are being performed, and instead of weakening the checks by checking the branch name, maybe you could check the commit message instead, and only turn some checks off based on some keywords, similar to the "no-precommit-check" keyword, for instance. That way, when a user tries to push those changes to e.g. master, we can use the hooks.update-hook to check the commit messages of all the new commits for that branch, and reject the update if we find one of those keywords (whose presence would indicate that someone weakened the extra checks for his purposes and forgot to handle it before pushing to master).

brobecke · 2021-12-19T06:44:43Z

Hi @jsm28 ,

I have been thinking about this request, and I am a bit torn about it.

As I mentioned, for me the toughest part is to find a way to express your needs in the config file in a way which is easy for users to understand and won't cause too much complexity in terms of how each configuration knobs interact with each other.

With that in mind, I think the first decision is to say that, given a commit, the hooks will either decide check the commit, and therefore run all checks (modulo the project's configuration), or not check the commit at all.

One option is to have a config telling the hooks that, for a given set of references, all new commits on that references are always checked, regardless of whether they are preexisting or not. But as you correctly mention, we should only apply this to the case where we update a branch, and ignore it when creating a new branch. I'll slight rephrase this to mention a scenario where we need to be even more careful: Pushing a branch creation which introduces some commits which are entirely new to the repository. In that case, we want to ignore the pre-existing commits, and still check the new ones.

Another option which has been in my mind for a while is to provide a new config to be called at the start of the update phase, which would be responsible for telling the git-hooks in some form which commits for which to force the checking. It's till very nebulous in my mind, because I worry about the balance between the information to send the script in order to facilitate the script's writing, and the potential quantity of data it represents in some cases (e.g. creation of new branch with nearly 200,000 commits already pre-existing). I also worry about the config getting the git-hooks into scenarii that I haven't previously envisioned with the hooks either crashing or behaving unexpectedly. So I would prefer to avoid that option if possible.

With that said, the solution I proposed in my previous message...

instead of weakening the checks by checking the branch name, maybe you could check the commit message instead, and only turn some checks off based on some keywords, similar to the "no-precommit-check" keyword, for instance. That way, when a user tries to push those changes to e.g. master, we can use the hooks.update-hook to check the commit messages of all the new commits for that branch, and reject the update if we find one of those keywords (whose presence would indicate that someone weakened the extra checks for his purposes and forgot to handle it before pushing to master).

... does require a bit of implementation work for GCC and then is a bit more work from the users in terms of workflow. On the other hand, I can see some good advantages in return: It allows people seeing the commit to know exactly which checks have been skipped or weakened. And for users ready to request official inclusion of their patches, not having any of the keywords requesting weakened checks is a guaranty that they won't be hitting an issue at the very last moment (the moment when they push).

What do you think? Would you guys be happy with the solution I proposed, or would you prefer a different solution (In which case, would the first option I presented above fit your needs)?

jsm28 · 2021-12-20T17:55:49Z

I don't think anything involving users needing to add special keywords to their commits on development branches to avoid checks such as that for a ChangeLog entry is suitable. Users are meant to be able to do what they want on such branches (especially user and vendor ones); even the minimal check that a commit message doesn't have a single-word subject line (one of those that applies on all branches) has caused some concerns there.

brobecke · 2021-12-22T01:26:03Z

OK. I will put the following option..

have a config telling the hooks that, for a given set of references, all new commits on that references are always checked, regardless of whether they are preexisting or not. But as you correctly mention, we should only apply this to the case where we update a branch, and ignore it when creating a new branch. I'll slight rephrase this to mention a scenario where we need to be even more careful: Pushing a branch creation which introduces some commits which are entirely new to the repository. In that case, we want to ignore the pre-existing commits, and still check the new ones.

.. on my TODO.

brobecke · 2022-01-04T08:04:18Z

Fixed by commit 0e7b6ff, which introduces the hooks.force-precommit-checks option, with some documentation in README.md on how to use it.

Enjoy :-)

brobecke closed this as completed Jan 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

commit-extra-checker and commits new to branch but not repository #20

commit-extra-checker and commits new to branch but not repository #20

jsm28 commented Aug 3, 2021

brobecke commented Aug 17, 2021

brobecke commented Dec 19, 2021

jsm28 commented Dec 20, 2021

brobecke commented Dec 22, 2021

brobecke commented Jan 4, 2022

commit-extra-checker and commits new to branch but not repository #20

commit-extra-checker and commits new to branch but not repository #20

Comments

jsm28 commented Aug 3, 2021

brobecke commented Aug 17, 2021

brobecke commented Dec 19, 2021

jsm28 commented Dec 20, 2021

brobecke commented Dec 22, 2021

brobecke commented Jan 4, 2022