ocp-indent pre-commit hook #236

sim642 · 2021-05-20T07:32:39Z

make dev sets up a Git pre-commit hook, which prevents committing any files with indentation that the tool considers "wrong":

analyzer/make.sh

Lines 102 to 110 in f99164f

    
           ;; dev) 
        
             echo "Installing opam packages for development..." 
        
             opam install utop ocaml-lsp-server ocp-indent ocamlformat ounit2 earlybird 
        
             # ocaml-lsp-server is needed for https://github.com/ocamllabs/vscode-ocaml-platform 
        
             echo "Be sure to adjust your vim/emacs config!" 
        
             echo "Installing Pre-commit hook..." 
        
             cd .git/hooks; ln -s ../../scripts/hooks/pre-commit; cd - 
        
             echo "Installing gem parallel (not needed for ./scripts/update_suite.rb -s)" 
        
             sudo gem install parallel

I deleted this hook locally already years ago and I guess many of you have done the same (or never even set it up). It was impossible to make tiny correctly indented changes because it also forced to fix the indentation of the rest of the file, which hadn't been touched for a long time (I guess ocp-indent cannot be told to just complain about the modified changes). This is clearly still an issue: vandah@4e8840c.

It forces the awful practice of doing logic changes together with completely unrelated formatting changes in the same commit. This obfuscates the Git history and makes Git blame more difficult to use, as one cannot just ignore formatting commits if they also contain logic changes: https://github.com/goblint/analyzer/blob/master/.git-blame-ignore-revs. Our suggested make dev setup should not be encouraging this practice!

I'm wondering what everyone's thoughts about this are, because it doesn't seem great that we recommend a pre-commit hook, which will immediately hinder development and just push everyone to remove the hook, which in itself leads again to more misindented code being committed.

One thought I have is to remove the hook, so locally you can make all the changes without being forced to combine unrelated formatting changes. Instead we could have a GitHub Actions job that runs ocp-indent. That would of course require fixing all the indentation problems that already are around and also agreeing that ocp-indent is the gold standard for our code style.

The text was updated successfully, but these errors were encountered:

michael-schwarz · 2021-05-20T07:37:26Z

I'm wondering what everyone's thoughts about this are

Honestly, I would just go with removing the hook. I think indenting as it currently is in the code base is not too bad, and there might be specific instances where breaking conventions as enforced by ocp-indent actually makes the code easier to understand.

jerhard · 2021-05-20T08:56:11Z

I think having a common, automatable indentation style would be a good thing. From what I saw, the changes that ocp-indent makes to indentation are mostly minor, but it makes the indentation more consistent. It does not seem to insert line breaks depending on the syntax, which would limit flexibility.

That would of course require fixing all the indentation problems that already are around and also agreeing that ocp-indent is the gold standard for our code style.

Are there other indentation tools for OCaml? Not that I know of.

vogler · 2021-05-20T09:21:35Z

It forces the awful practice of doing logic changes together with completely unrelated formatting changes in the same commit. This obfuscates the Git history and makes Git blame more difficult to use ...

Yes, true, also deleted it. Was too lazy to make it only apply to the diff.
Changing the xargs arg from

analyzer/scripts/hooks/pre-commit

Line 52 in f99164f

    
           git diff --cached --name-only | grep -E ".*\.ml(i|l|y)?$" | xargs -I% bash -c 'diff <(git show :%) <(git show :% | ocp-indent) || (echo; echo "ocp-indent says you suck at indentation in % (diff shown above)."; false)' || (echo "You shall not commit!"; exit 1)

to

diff <(diff <(git show HEAD:%) <(git show :%)) <(diff <(git show HEAD:% | ocp-indent) <(git show :% | ocp-indent))

would work, but it seems pretty convoluted and maybe too expensive. It shows the diff between the to-be-committed changes and the same run through ocp-indent.
ocp-indent has an option --lines=RANGE, it's probably better to use that.
Sadly, it doesn't support multiple ranges (ocp-indent -l10-15 -l110-130 src/solvers/td3.ml), so you'd have to call it for all hunks of a file.

One thought I have is to remove the hook, so locally you can make all the changes without being forced to combine unrelated formatting changes. Instead we could have a GitHub Actions job that runs ocp-indent. That would of course require fixing all the indentation problems that already are around and also agreeing that ocp-indent is the gold standard for our code style.

I think it's good to have it run on the changes you want to commit.
Idk how well it would work to have it run as an action. I could imagine that it also becomes tedious since you might commit and push something, then it creates a commit that fixes indention, you continue working, want to push again, but can't, pull and get merge conflicts. Also, when you want to revert a commit, you then have to revert the indent-commit etc.

Probably best to run ocp-indent on everything when we feel like it's a good time, and in-between rely on the hook to at least keep the changes well-indented.
I think last time I ran it on everything, there also were some changes that I didn't want to stage since they made the code less readable or split it up too much - don't remember what it was.

vogler · 2021-05-20T11:47:53Z

Are there other indentation tools for OCaml? Not that I know of.

There's also https://github.com/ocaml-ppx/ocamlformat, but that's probably too disruptive since it just pretty-prints the parse tree and adds the comments.

sim642 · 2021-05-20T13:08:52Z

I think last time I ran it on everything, there also were some changes that I didn't want to stage since they made the code less readable or split it up too much - don't remember what it was.

This is exactly why I was wondering if ocp-indent should be our gold standard. Because if there are places where we don't agree, it's very inconvenient. You cannot then just blindly reindent the whole project then. It's also a major problem for using the pre-commit hook: it's almost impossible to make changes to a file which contains such indentation that ocp-indent doesn't agree with. To do that you'd have to remove the hook, commit and then re-add the hook.

Not sure if ocp-indent has any means to disable it for some section of code (e.g. with comments). If it does, then that would help. If it doesn't, then enforcing the style is problematic.

Idk how well it would work to have it run as an action. I could imagine that it also becomes tedious since you might commit and push something, then it creates a commit that fixes indention

With Actions I just meant it would complain in a CI job, but wouldn't actually make new commits automatically. You'd see the job having failed because of indentation problems and then you can fix it yourself. Due to the aforementioned indentation disagreements, maybe it would be better if the CI job always succeeded (doesn't cause failure) but just outputs warnings. Of course that could mean nobody looks at them, although it would at least be better than all regular developers removing the hook and nobody seeing the output.

Or if we make the ocp-indent CI job do the slightly more expensive thing of only looking at different modified ranges, it could also cause failures to prevent new misindentation being introduced. Although it would add an extra level of complication since you could push many commits at once while the CI job would only run on the last one.

Honestly, I would just go with removing the hook. I think indenting as it currently is in the code base is not too bad

I agree with this. It hasn't been a big issue without using the hook, so we'd probably be fine without. And if anyone notices definitely misindented code in PRs or just browsing the code, it's not hard to manually make those fixes as they're found.
ocp-indent is often more pedantic than what would also be completely acceptable and readable, and exists in the codebase already.

vogler · 2021-05-20T15:33:20Z

It's also a major problem for using the pre-commit hook: it's almost impossible to make changes to a file which contains such indentation that ocp-indent doesn't agree with.

If it only diffs the staged changes as in my comment above, it shouldn't be a major problem. You just can't commit any changes that don't comply with ocp-indent.

To do that you'd have to remove the hook, commit and then re-add the hook.

There's git commit -n.

Not sold on CI since you only find out after push which is too late and causes more overhead than pre-commit.
Adding it as some further CI check doesn't hurt, but IMO it'd better if it were avoided in the first place.
We can also configure stuff we don't agree with: http://manpages.ubuntu.com/manpages/bionic/man1/ocp-indent.1.html#configuration
AFAIR it was fine mostly - we just have some code like

if x then y else
z

where z is many lines that would be indented.
Maybe this kind of code should just be indented or refactored by pulling z out.

sim642 · 2021-05-21T06:44:23Z

If it only diffs the staged changes as in my comment above, it shouldn't be a major problem. You just can't commit any changes that don't comply with ocp-indent.

I'm for updating the pre-commit hook to do that, then we could hopefully keep things more or less as is. It might motivate me to also add it back if it doesn't cause extra inconvenience.

There's git commit -n.

TIL. Although if the updated hook works, then this hopefully wouldn't really be needed. Although maybe it's a good idea to mention this command somewhere, maybe in docs or just the hook (or even its output).

Not sold on CI since you only find out after push which is too late and causes more overhead than pre-commit.
Adding it as some further CI check doesn't hurt, but IMO it'd better if it were avoided in the first place.

The reason I was thinking of CI was because as many people might have the hook already removed, they could still keep pushing misindented code. But unless we reindent everything to match ocp-indent (which I think we don't have to do), the CI would just keep complaining about the already existing misindented code.

#236

vogler · 2021-05-21T13:35:12Z

Used the above. A bit of a hack since it shows the diff of diffs, but should be good enough.
Maybe a good time to restore the symlink and report if it misbehaves.
If no, we can also add it as a GitHub action.

michael-schwarz · 2021-05-26T05:42:31Z

Now that we have the hook back, we can close this, right?

sim642 · 2021-05-26T06:01:21Z

I will document something about it and I think then I can close the issue for now.

If any complications arise again, we can discuss them further in this issue and reopen when it seems necessary to try some other solution.

sim642 · 2021-05-26T08:07:58Z

Well, this didn't take long... The hook doesn't behave well with merge commits:

The changes of a merge commit are all the changes being merged, so if any of the commits being merged themselves had indentation problems, then the hook will also prevent merging.
When that happens or there are conflicts, then after resolving those the natural thing to do is git merge --continue, which makes the merge commit. But that command doesn't allow --no-verify! Only the default (without --continue) variant allows --no-verify.

So I still ended up temporarily removing the hook. Not sure how often this would be an issue though.

Right now I was merging master into #196 to fix the conflicts. The misindentation things coming from master are related to all the cleanup changes I've been making, where I didn't fix existing indentation problems. But since those changes show up in the diff, ocp-indent starts complaining about a hundred places since the cleanup changes were very spread around, touching almost all printables and lattices. So the hook might annoy others' attempts at merging master into their branches as well, just so you know.

sim642 · 2021-05-26T08:31:16Z

Another fun bug of the hook: it complains about misindented code that's being deleted, because that's also in the diff.

sim642 · 2021-05-26T13:36:21Z

Another fun bug of the hook: it complains about misindented code that's being deleted, because that's also in the diff.

I'm sorry, but this is worse than it first seemed. Not only does the new hook complain about misindented code that's now deleted, it also complains about misindented code that's now changed to be properly indented because that's also in the diff!

For example, if you change

let x =
y

into

let x =
  y

The new hook first computes the diff without ocp-indented files, which is:

< y
---
>   y

Second it computes the diff with ocp-indented files, which are both equal to the new properly indented one, so the diff is empty.
Finally it computes the diff between the two diffs, which is:

< < y
< ---
< >   y

And because this is non-empty, it complains. I hope I'm understanding this correctly. (Not to mention it's essentially impossible to read this diff of diffs.)

The old hook, which checks the entire file (new without ocp-indent vs new with ocp-indent) sees no difference, because the new file is properly indented and everything is fine. So the new hook is complaining about things that the old one didn't.

vogler · 2021-05-26T15:08:50Z

Didn't test anything but adding misindented code 🙈
What you actually want is some set minus instead of diff, but I guess that can be achieved by piping it through sed and ignoring all the lines starting with < (and lstrip > to make it look like a normal diff).

Concerning merge commits: you wrote it checks for each commit - I thought that would only be the case for rebase, and for a merge commit it should only check that commit.
My git-fu is not that strong. There's probably a better way to get the to-be-committed changes that also plays well with merges than the git show approach above.
When there are conflicts then you'd likely also have to take MERGE_HEAD into account.
In case of an octopus merge, do we need to diff all parents?
Maybe easier to ignore merge commits.

sim642 · 2021-05-26T15:16:47Z

Didn't test anything but adding misindented code see_no_evil
What you actually want is some set minus instead of diff, but I guess that can be achieved by piping it through sed and ignoring all the lines starting with < (and lstrip > to make it look like a normal diff).

At this point I'm wondering if multiple calls using --lines argument extracted from diff lines would be more straightforward.

Concerning merge commits: you wrote it checks for each commit - I thought that would only be the case for rebase, and for a merge commit it should only check that commit.

Well, it doesn't re-check for each commit, but if you git merge --no-commit (or it stays in MERGING state due to conflicts), then all changes being merged are staged I think, so that's why it would complain again.

Maybe easier to ignore merge commits.

This is probably the most reasonable approach, we don't need it to be that advanced. (Also I don't think anyone in Goblint's history has ever performed an octopus merge, we aren't the linux kernel.)

Technically a merge commit could also introduce new changes whose indentation may be off, but given our imperfect indentation state, such rare occurrence wouldn't be significant.

command string gets too long for xargs to replace: ~~~ fatal: ambiguous argument ':%': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]' ~~~

vogler · 2021-06-08T13:44:12Z

Changed from xargs to for-loop in a381b47, but noticed that a simple sed replacement is not enough.

At this point I'm wondering if multiple calls using --lines argument extracted from diff lines would be more straightforward.

This is probably the way to go.

vogler · 2021-06-09T09:12:46Z

I changed it to be based on --lines. Should be better.
No longer uses HEAD, so maybe it's also fine for merges now.

michael-schwarz · 2021-06-23T07:19:51Z

I think this seems to be working for almost everyone now. We can reopen if we encounter any new problems.

vogler · 2021-07-27T14:23:13Z

Stumbled upon this: https://pre-commit.ci
Not sure if we want automatic commits that fix indentation in PRs.
Also, https://pre-commit.com does not list OCaml, so we might as well just add another GitHub workflow for it.

sim642 added the cleanup label May 20, 2021

vogler added a commit that referenced this issue May 21, 2021

make pre-commit hook only complain about ocp-indent on changes

d2adce6

#236

sim642 assigned sim642 and vogler May 26, 2021

sim642 added a commit that referenced this issue May 26, 2021

Document pre-commit hook and its bypass (issue #236)

7187234

sim642 closed this as completed May 26, 2021

sim642 reopened this May 26, 2021

vogler added a commit that referenced this issue Jun 9, 2021

pre-commit: use ocp-indent with --lines instead of diff of diffs, #236

a06a611

michael-schwarz closed this as completed Jun 23, 2021

vogler mentioned this issue Jul 27, 2021

check indentation of changes on GitHub #308

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ocp-indent pre-commit hook #236

ocp-indent pre-commit hook #236

sim642 commented May 20, 2021

michael-schwarz commented May 20, 2021

jerhard commented May 20, 2021 •

edited

vogler commented May 20, 2021

vogler commented May 20, 2021

sim642 commented May 20, 2021

vogler commented May 20, 2021

sim642 commented May 21, 2021

vogler commented May 21, 2021

michael-schwarz commented May 26, 2021

sim642 commented May 26, 2021

sim642 commented May 26, 2021

sim642 commented May 26, 2021

sim642 commented May 26, 2021

vogler commented May 26, 2021

sim642 commented May 26, 2021

vogler commented Jun 8, 2021 •

edited

vogler commented Jun 9, 2021

michael-schwarz commented Jun 23, 2021

vogler commented Jul 27, 2021

ocp-indent pre-commit hook #236

ocp-indent pre-commit hook #236

Comments

sim642 commented May 20, 2021

michael-schwarz commented May 20, 2021

jerhard commented May 20, 2021 • edited

vogler commented May 20, 2021

vogler commented May 20, 2021

sim642 commented May 20, 2021

vogler commented May 20, 2021

sim642 commented May 21, 2021

vogler commented May 21, 2021

michael-schwarz commented May 26, 2021

sim642 commented May 26, 2021

sim642 commented May 26, 2021

sim642 commented May 26, 2021

sim642 commented May 26, 2021

vogler commented May 26, 2021

sim642 commented May 26, 2021

vogler commented Jun 8, 2021 • edited

vogler commented Jun 9, 2021

michael-schwarz commented Jun 23, 2021

vogler commented Jul 27, 2021

jerhard commented May 20, 2021 •

edited

vogler commented Jun 8, 2021 •

edited