Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Binary file change by hook not detected due to Git textconv #2742

Closed
adamchainz opened this issue Feb 2, 2023 · 0 comments · Fixed by #2743
Closed

Binary file change by hook not detected due to Git textconv #2742

adamchainz opened this issue Feb 2, 2023 · 0 comments · Fixed by #2743
Labels

Comments

@adamchainz
Copy link
Contributor

search you tried in the issue tracker

textconv

describe your issue

I’ve set up Git to use exiftool as a text conversion filter for JPEGs, with this in ~/.config/git/config:

[diff "exiftool"]
    textconv = exiftool --composite -x 'Exiftool:*' -x 'File:*' -g0
    cachetextconv = true
    xfuncname = "^-.*$"

…and this in ~/.config/git/attributes:

*.jpeg diff=exiftool

In my repo, I’m using jpegoptim to optimize JPEG's with the below .pre-commit-config.yaml.

Take an unoptimized JPEG:

$ git status
On branch main
Changes to be committed:
        new file:   donut.jpeg

Running pre-commit, the hook passes:

$ pre-commit run jpegoptim
Optimize JPEGs...........................................................Passed

(There’s no option to make jpegoptim exit with a non-zero status code when it changes files.)

However, the tool did actually modify the image:

$ git status
On branch main
Changes to be committed:
        new file:   donut.jpeg

Changes not staged for commit:
        modified:   donut.jpeg

This behaviour means commits don't include the optimized version of a flie, instead they’re left around unstaged.

pre-commit doesn't detect the change because the text conversions of the binary file before/after are the same, leading to an empty diff:

$ git diff donut.jpeg

But using --no-textconv shows the tool actually made a change:

$ git diff --no-textconv donut.jpeg |cat
diff --git a/donut.jpeg b/donut.jpeg
index 3bd0743..d66a2f4 100644
Binary files a/donut.jpeg and b/donut.jpeg differ


The `git diff` call that pre-commit uses to check for changes, in [`_git_diff()`](https://github.com/pre-commit/pre-commit/blob/e846829992a84ce8066e6513a72a357709eec56c/pre_commit/commands/run.py#L273C6-L277) already uses `--no-ext-diff` to make diffs reproducible
I think it should also use `--no-textconv`.
This would also make it a little faster.

### pre-commit --version

3.0.3 / main

### .pre-commit-config.yaml

```yaml
repos:
-   repo: local
    hooks:
    -   id: jpegoptim
        name: Optimize JPEGs
        entry: jpegoptim
        language: system
        types: [jpeg]

~/.cache/pre-commit/pre-commit.log (if present)

No response

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging a pull request may close this issue.

2 participants