Black does not honor exclude regex when files explicitly listed on the command line #438

Closed
adamehirsch opened this issue Aug 7, 2018 · 48 comments · Fixed by #1032

Comments

@adamehirsch

Operating system: OSX
Python version: 3.6.2
Black version: black, version 18.6b4

The problem: certain directories in our repo contain generated Python code that we don't want black to change. We've configured our repo to run black via pre-commit. Pre-commit invokes black with a list of changed files on the command line, and black's exclude regex is not applied to those files and paths.

i.e.

black --exclude "/migrations/" content/migrations/0049_publicationstore_is_test.py
reformatted content/migrations/0049_publicationstore_is_test.py
All done! ✨ 🍰 ✨
1 file reformatted.

This makes us sad, since we've carefully put exclusion regexes into our pyproject.toml and black doesn't honor them when pre-commit calls it. Instead, we're having to work around it by configuring pre-commit to skip that path:

repos:
-   repo: https://github.com/ambv/black
    rev: stable
    hooks:
    - id: black
      language_version: python3.6
      exclude: migrations

The behavior we'd like to see is that black's exclude regex would apply even when full file paths are listed on the command line. I'd be happy to try for a PR if this seems like desirable behavior to anyone else...

@asottile
Contributor

From a prior art perspective, flake8 has this same issue (marked: WONTFIX) -- that said, I personally think the decision in flake8 is incorrect/inconsistent and that implementing this is a good idea :)

@adamehirsch
Author

I'll wait for the black maintainers to indicate whether they think it's a good idea; no sense writing a PR for something that'll be rejected on concept.

@ambv
Collaborator

ambv commented Aug 17, 2018

--include= and --exclude= are only consulted for recursive search, not for files passed on the command line.

How is the decision of flake8 inconsistent here? The rationale is rather simple: by default, we exclude some paths and only include some file extensions in recursive search. But if you specifically give us a file path on the command line which doesn't match the file extension or would otherwise belong to the exclusion list, you probably know what you're doing.
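To make the distinction concrete, here is a minimal sketch of the behaviour described above. It is not Black's actual implementation: the helper names `walk` and `collect_sources` are hypothetical, and `root` is assumed to be the already-resolved project root.

```python
import re
from pathlib import Path
from typing import Iterable, Iterator, Pattern, Set


def walk(
    directory: Path, root: Path, include: Pattern[str], exclude: Pattern[str]
) -> Iterator[Path]:
    """Recursive search: the only place include/exclude are consulted."""
    for child in sorted(directory.iterdir()):
        # Match against a root-relative POSIX path with a leading slash,
        # plus a trailing slash for directories (assumed convention).
        normalized = "/" + child.resolve().relative_to(root).as_posix()
        if child.is_dir():
            normalized += "/"
        if exclude.search(normalized):
            continue
        if child.is_dir():
            yield from walk(child, root, include, exclude)
        elif child.is_file() and include.search(normalized):
            yield child


def collect_sources(
    args: Iterable[str], root: Path, include: Pattern[str], exclude: Pattern[str]
) -> Set[Path]:
    """Gather files to format from command-line arguments."""
    sources: Set[Path] = set()
    for arg in args:
        path = Path(arg)
        if path.is_dir():
            sources.update(walk(path, root, include, exclude))
        elif path.is_file():
            # Explicitly listed file: taken as-is, no include/exclude check.
            sources.add(path)
    return sources
```

With an exclude pattern of `/migrations/`, passing the `content` directory skips the migration files, while passing a migration file by name still selects it, which is what the original report observed.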

I do agree that interfacing with CI and editors is surprising in this case so I'm not downright rejecting changing this. But I need to carefully consider whether there are backwards compatibility disasters hiding in changing this. And even if we agree to do this, what does it mean for defaults in --include=? Should I reject non-.py/.pyi files from now on unless somebody clears --include= on their call?

This is not going to be straight-forward to change but let's try to figure something out. Anthony, what would you suggest?

@ambv added the "T: style" label (What do we want Blackened code to look like?) on Aug 17, 2018
@asottile
Contributor

ah let me clarify -- black and flake8 currently have the same behaviour here (they are consistent).

Maybe it's a bit snowflakey but intuitively it makes a lot of sense for me to apply --exclude even if passed on the commandline. This simplifies a lot of editor configurations, pre-commit, and even just black foo/*.py. I don't think --include should be applied except in the recursive case however.

That said, there's certainly arguments in both directions -- it may very well be simpler to only apply it during the recursive routines.

the one thing I usually point at here is "pre-commit is better at running your linter than your linter is" because it can take advantage of a few things:

  • "recursive" doesn't really matter, pre-commit knows which files are part of your version control
  • "exclusion" is configurable in multiple ways (global exclusion, per-linter exclusion)
  • pre-commit knows how to find files by shebangs thanks to identify

Though for tools that often means you have to configure both pre-commit exclusion and tool exclusion and keep them in sync (or a superset / subset of each other). It would be nice to only configure this in one place and for most that usually means "configure it using the tool's configuration". But then you run into OP's issue :)

@bisby

bisby commented Aug 17, 2018

My confusion stemmed from the following line from black's own readme on pre-commit:

Avoid using args in the hook. Instead, store necessary configuration in pyproject.toml so that editors and command-line usage of Black all behave consistently for your project.

I know this specifically says "args", but I took this to mean "pyproject.toml is preferable to pre-commit's yaml when both have a similar config." It made sense at the moment (without having fully grasped the concept of how everything worked), that since black . worked with the exclude in black's config, that it should work with pre-commit run black as well.

If the answer is "exclude will only ever exclude on recursive searches" that's fine, but the readme should note that in the pre-commit section ("pre-commit passes specific files, not a recursive search, so you should exclude files in pre-commit as well"). But I would fully support some sort of "--hard-exclude" that is a definitive "black will not run on these files, no matter what"

@ambv
Collaborator

ambv commented Sep 26, 2018

We'll definitely hard-exclude things that are .gitignore'd (see #475). Other than that, I don't think there's anything actionable here. If I pass a file explicitly to the tool, I expect it to be acted on.

What I think we could do instead is to convince @asottile to let pre-commit run black on the entire Git repo on every commit. The main reason it's not doing so is performance as far as I can tell. But Black does cache its executions so it won't even touch files that are already well-formed. Plus, then it will correctly follow exclusion and inclusion lists. This would minimize configuration for users, too.

@bisby

bisby commented Sep 26, 2018

Would a PR with an update to the readme be warranted then?

Avoid using args in the hook. Instead, store necessary configuration in pyproject.toml so that editors and command-line usage of Black all behave consistently for your project. See Black's own pyproject.toml for an example.

This heavily implies that pre-commit should not have any config (although it admittedly just calls out args specifically, as a first time user this was confusing). Some clarification that pre-commit will call files explicitly, thus will ignore black's exclude list would be nice.

@ambv
Collaborator

ambv commented Sep 26, 2018

I'd like to wait for what Anthony thinks about it. If he agrees that we should change the hook setup so that Black is run on the entire repo every commit, then we will do just that and the README then remains fine.

Otherwise, yeah, we will have to explicitly call out in the README that --exclude= in pyproject.toml is not used by pre-commit. I would like to avoid this.

@asottile
Contributor

first, it's possible to enable this behaviour but I don't think it's desirable (as it sidesteps the benefits of the framework):

    -   id: black
        args: [.]
        pass_filenames: false
        always_run: true

pre-commit is (generally) better at running linters than linters themselves are, here's a couple of reasons why:

  • pre-commit knows about what files are in version control and will never run against files which aren't. There's no need to parse / worry about .gitignore / exclude .git|.tox|venv|..., etc. / recurse / etc.
  • pre-commit knows how to lint extensionless executables with a conventional shebang (#!)
  • pre-commit knows when or when not to run the linter (only passing filenames which change, not even executing at all when there are zero files which change)

An example of a case where always running black on all the files is (very) wrong is during a merge conflict resolution. pre-commit will only execute a linter on files which conflict or are manually changed, avoiding the headache of waiting for black (or other linters) to run across every file that changed in the upstream (and potentially dealing with other people's mistakes as a punishment for merging).

(gotta run, hope this succinct reply is enough, if not I can elaborate / link some more prose on this -- hope it helps!)

@alvinlindstam

When using an editor integration that calls black with a changed file's path (which it could do automatically, such as on a post save action), this behaviour also means that it would reformat the file even if it's excluded by the config.

So I would be in favor of either changing the default to always consider the ignore/exclude rules, or to include an option to do so even when a full path is provided.

@amitbeka

Hi everyone,

I also run into this problem with multiple editors and this behavior was surprising to me, as a user.

IMHO you don't want to have multiple configurations stating the same files to exclude: pyproject.toml, pre-commit (not everyone uses the tool named pre-commit), VIM, PyDev, IntelliJ etc.

We have many tools in the team so it's essential one configuration will be used by everyone.

I think that in terms of usability, surprising the user is never a good idea.
We can add two flags (tentative naming) to help black behave nicely when an excluded file is given to black as an argument, plus a default for when neither flag is passed:

  1. -f/--force to force formatting an excluded file, returning 0 (or whatever we return from the formatting)
  2. --ignore-excluded, which means black silently ignores the excluded files, formats the other files (if any) and returns 0 (assuming the rest was fine)
  3. Without flags, return an error (e.g. -1) and warn that excluded files were given explicitly
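The three modes in this list could be sketched roughly as follows. This is only an illustration of the proposal: the function name `select_explicit_files` is hypothetical, the paths are assumed to be root-relative, the nonzero exit code of 1 stands in for the "e.g. -1" above, and formatting nothing in the error case is an assumption the proposal does not spell out.

```python
import re
from typing import List, Pattern, Sequence, Tuple

# Assumed nonzero status for "excluded files were given explicitly".
EXIT_EXCLUDED_GIVEN = 1


def select_explicit_files(
    paths: Sequence[str],
    exclude: Pattern[str],
    force: bool = False,
    ignore_excluded: bool = False,
) -> Tuple[List[str], int]:
    """Return (files to format, exit code) for explicitly listed files."""
    if force:
        # Mode 1: format everything, excluded or not.
        return list(paths), 0
    excluded = [p for p in paths if exclude.search("/" + p)]
    kept = [p for p in paths if p not in excluded]
    if ignore_excluded:
        # Mode 2: silently drop excluded files and format the rest.
        return kept, 0
    if excluded:
        # Mode 3 (no flags): refuse and report an error.
        return [], EXIT_EXCLUDED_GIVEN
    return kept, 0
```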

If this change of behavior is agreed upon, I can try and create the PR.

@chebee7i

Any progress/decisions on this?

@zsol
Collaborator

zsol commented Aug 16, 2019

I agree with Anthony that pre-commit is in a better position to figure out when to run Black on what, so if we could somehow get away with only ever running Black through pre-commit, that would provide a way to solve this problem.

I don't think it's OK to assume Black is always going to be called through pre-commit though, so we do need a separate include/exclude mechanism (and unfortunately we do need to worry about parsing .gitignore et al) for when it's called directly from the command line or through editors.

As for the performance concerns around running black . on the entire repo every time you commit: after the first run, it should be O(changed files) instead of O(repo) because of Black's cache.
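As a rough illustration of why a cache makes repeat runs cheaper, here is a simplified sketch that remembers each file's modification time and size and skips files whose values are unchanged. It is not Black's actual cache code; the names `cache_info`, `split_by_cache` and `remember` are hypothetical, and the cache is kept in memory rather than persisted.

```python
from pathlib import Path
from typing import Dict, Iterable, Set, Tuple

CacheInfo = Tuple[float, int]  # (modification time, size in bytes)
Cache = Dict[str, CacheInfo]


def cache_info(path: Path) -> CacheInfo:
    """Cheap fingerprint of a file taken from its stat result."""
    st = path.stat()
    return st.st_mtime, st.st_size


def split_by_cache(cache: Cache, sources: Iterable[Path]) -> Tuple[Set[Path], Set[Path]]:
    """Split sources into (needs formatting, unchanged since last run)."""
    todo: Set[Path] = set()
    done: Set[Path] = set()
    for src in sources:
        if cache.get(str(src.resolve())) == cache_info(src):
            done.add(src)
        else:
            todo.add(src)
    return todo, done


def remember(cache: Cache, sources: Iterable[Path]) -> None:
    """Record the current fingerprint of files that were just processed."""
    for src in sources:
        cache[str(src.resolve())] = cache_info(src)
```

In this sketch every candidate file is still found and stat'ed on each run, so only the formatting work, not the traversal, shrinks to the changed files.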

So I think all in all we should encourage people to configure Black with the following in their .pre-commit-config.yaml

    -   id: black
        args: [.]
        pass_filenames: false
        always_run: true

@asottile
Contributor

please don't suggest that per above, thanks

even with black's cache it's going to be the wrong thing during merge conflicts and you're back to linting files that aren't checked in and you have to do filesystem traversals which themselves are pretty slow

given how frequently this comes up I'm considering changing flake8's behaviour to honor flake8 foo.py --exclude foo.py to be a noop (as silly as such a command is) and perhaps if that goes well black should do the same

@asottile
Contributor

plus, even a trivial invocation of black (with no files) takes ~250ms which would be 0ms in the case when there's no files to run

@chebee7i

chebee7i commented Aug 16, 2019

#438 (comment) is the closest to a proposal here, but I'm not sure that we need extra parameters. I think all we need to do is change the behavior of --exclude:

Desired Behavior:

  • --include is used when recursively searching a path (EXISTING)
  • --exclude is always used on each filepath, no matter whether the filepath was explicitly passed in or discovered (NEW)

Use cases:

  • Force inclusion of an excluded file: black --exclude '' excluded_file.py
  • Tools like pre-commit can run black with every changed file, and black will exclude what should be excluded: black modified_included_file.py modified_excluded_file.py.

The latter use case is important since during merge conflicts, we will only run on the files that the user edited, rather than on every file that anyone has edited.
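A minimal sketch of the proposed rule, assuming paths are already relative to the project root; `is_excluded` and `files_to_format` are hypothetical names. One detail the sketch makes explicit is an assumption about how `--exclude ''` would force inclusion: an empty pattern matches every string with an empty match, so an empty match is treated as "not excluded".

```python
import re
from pathlib import PurePosixPath
from typing import List, Pattern, Sequence


def is_excluded(path: str, exclude: Pattern[str]) -> bool:
    """Apply the exclude regex to one path, explicit or discovered."""
    normalized = "/" + PurePosixPath(path).as_posix()
    match = exclude.search(normalized)
    # An empty regex yields an empty match; treat that as "nothing excluded".
    return bool(match and match.group(0))


def files_to_format(explicit_paths: Sequence[str], exclude: Pattern[str]) -> List[str]:
    """Keep only the explicitly passed files that the regex does not exclude."""
    return [p for p in explicit_paths if not is_excluded(p, exclude)]
```

Called with the changed files that a tool like pre-commit would pass, this drops the excluded ones, while an empty pattern lets every file through.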

@asottile
Contributor

weird I thought I mentioned this but I guess not, if you're only invoking through pre-commit it's usually better to use pre-commit's exclude: ... pattern:

repos:
-   repo: ...
    rev: ...
    hooks:
    -   id: black
        exclude: ^testing/test_data/

or if you're globally excluding

exclude: ^vendor/
repos:
-   repo: ...
    rev: ...
    hooks:
    -   id: black
@chebee7i

This works for me (in my limited use cases):

diff --git a/black.py b/black.py
index 05edf1a..05ba5af 100644
--- a/black.py
+++ b/black.py
@@ -441,7 +441,23 @@ def main(
             )
         elif p.is_file() or s == "-":
             # if a file was explicitly given, we don't care about its extension
-            sources.add(p)
+            try:
+                normalized_path = "/" + p.resolve().relative_to(root).as_posix()
+            except ValueError:
+                if p.is_symlink():
+                    report.path_ignored(
+                        p, f"is a symbolic link that points outside {root}"
+                    )
+                    continue
+
+                raise
+
+            exclude_match = exclude_regex.search(normalized_path)
+            if exclude_match and exclude_match.group(0):
+                report.path_ignored(p, f"matches the --exclude regular expression")
+                continue
+            else:
+                sources.add(p)
         else:
             err(f"invalid path: {s}")
     if len(sources) == 0:

itajaja added a commit to itajaja/black that referenced this issue Sep 24, 2019
fixes psf#438

I have not added or amended tests yet, but if we agree on the approach, I can work on them
@itajaja
Contributor

itajaja commented Sep 24, 2019

I have added a PR, #1032, following @chebee7i's proposal, which I think is the most sensible. I think that black needs to work well when integrating with other tools (editors, pre-commit hooks) for larger adoption. These concerns were probably less of a thing some years ago when flake8 was developed, but if we look at other modern tools and ecosystems (node: prettier, eslint, husky) this is the expected behavior.

@itajaja
Contributor

itajaja commented Oct 22, 2019

this is the 4th most-commented issue in the repo, it would help a lot of people if some attention could be spared on this. There is an open PR, so my question is why is this not moving forward? a) simply no time to look into this? b) won't fix? c) something else?

They are all acceptable answers :) but I'd like to set some expectations for myself

@asottile
Contributor

@itajaja well for one your open PR has no tests and has a merge conflict -- I also don't think there's been any concrete proposal / agreement on the way to move forward yet

@ichard26
Collaborator

ichard26 commented Aug 12, 2020

We'll definitely hard-exclude things that are .gitignore'd

#438 (comment)

Due to a bug (#1572), this used to work as described 🙃: even if you explicitly passed a file to Black, it would still ignore the file if it was gitignore'd. Unfortunately, that bug also broke the behaviour of --exclude, causing it to act like --force-exclude, so it had to be fixed. This is no longer the case as #1591 fixes this bug, but the rest of this comment still applies.

@ambv are you still in favor of having Black always refuse to format files that are gitignore'd, or should we go back to the stable behaviour where .gitignore only applies to files found recursively? That idea of yours goes back almost two years, so you may no longer be in favor of it.

dhvcc added a commit to dhvcc/vkbottle that referenced this issue May 31, 2021
As per psf/black#438, if you don't want your excluded files to be formatted when filenames are passed explicitly (e.g. by pre-commit), then use "force-exclude". Available in black>=20.8b1
ngnpope added a commit to ngnpope/django that referenced this issue Mar 1, 2022
When using `pre-commit run --all-files`, because the filename is passed
explicitly, the file referred to in `extend-exclude` is not properly
excluded. Use `force-exclude` instead to say we really mean it.

See psf/black#438.
ngnpope added a commit to ngnpope/django that referenced this issue Mar 3, 2022
When using `pre-commit run --all-files`, because the filename is passed
explicitly, the file referred to in `extend-exclude` is not properly
excluded. Use `force-exclude` instead to say we really mean it.

See psf/black#438.
carltongibson pushed a commit to ngnpope/django that referenced this issue Mar 9, 2022
When using `pre-commit run --all-files`, because the filename is passed
explicitly, the file referred to in `extend-exclude` is not properly
excluded. Use `force-exclude` instead to say we really mean it.

See psf/black#438.
carltongibson pushed a commit to django/django that referenced this issue Mar 9, 2022
When using `pre-commit run --all-files`, because the filename is passed
explicitly, the file referred to in `extend-exclude` is not properly
excluded. Use `force-exclude` instead to say we really mean it.

See psf/black#438.
chassing added a commit to chassing/qontract-reconcile that referenced this issue Jun 20, 2022
When using VS code black plugin, because the filename is passed
explicitly, the file referred to in `extend-exclude` is not properly
excluded. Use `force-exclude` instead to say we really mean it.

see psf/black#438
chassing added a commit to app-sre/qontract-reconcile that referenced this issue Jun 20, 2022
When using VS code black plugin, because the filename is passed
explicitly, the file referred to in `extend-exclude` is not properly
excluded. Use `force-exclude` instead to say we really mean it.

see psf/black#438
kdestin added a commit to kdestin/azure-sdk-for-python that referenced this issue Apr 4, 2023
    Black does not honor patterns in --exclude when files are explicitly
    provided on the cli. This is problematic when used with
    pre-commit since pre-commit _always_ provides a list of files
    to act on.

    `--force-exclude` seems to act identically to `--extend-exclude`
    but also applies to cli arguments.

    See psf/black#438 for more background
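The difference between the two options that these commit messages rely on can be summarised in a small decision function. This is a sketch of the documented semantics rather than Black's code; `is_formatted` is a hypothetical name and the path is assumed to be already normalized to a root-relative form with a leading slash.

```python
import re
from typing import Optional, Pattern


def is_formatted(
    normalized_path: str,
    passed_explicitly: bool,
    exclude: Pattern[str],
    force_exclude: Optional[Pattern[str]] = None,
) -> bool:
    """Decide whether a file is formatted under exclude / force-exclude."""
    # force-exclude applies to every file, including explicit arguments.
    if force_exclude is not None and force_exclude.search(normalized_path):
        return False
    # exclude only applies to files found by recursive search.
    if not passed_explicitly and exclude.search(normalized_path):
        return False
    return True
```

Under these rules a migration file passed by name is still formatted when the pattern is only in `exclude`, and is skipped once the same pattern is moved to `force-exclude`.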