Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Since version 1.24.0 finding files recursively is broken #334

Open
dafyddj opened this issue Oct 8, 2020 · 6 comments
Open

Since version 1.24.0 finding files recursively is broken #334

dafyddj opened this issue Oct 8, 2020 · 6 comments

Comments

@dafyddj
Copy link

dafyddj commented Oct 8, 2020

I believe PR #283 introduced a flaw in finding files recursively when specifying a directory search on the command line.
Specifically you can no longer check files only within a specific subdirectory.

The code in the PR strips out the directory information so after that you can no longer make any decisions based on full directory path.

The set of command below demonstrates the issue.

:~/tmp% mkdir yaml-test
:~/tmp% cd yaml-test 
:~/tmp/yaml-test% cat >test.yaml
testing: [
:~/tmp/yaml-test% mkdir sub 
:~/tmp/yaml-test% cp test.yaml sub 
:~/tmp/yaml-test% yamllint .
./test.yaml
  1:1       warning  missing document start "---"  (document-start)
  2:1       error    syntax error: expected the node content, but found '<stream end>' (syntax)

./sub/test.yaml
  1:1       warning  missing document start "---"  (document-start)
  2:1       error    syntax error: expected the node content, but found '<stream end>' (syntax)

:~/tmp/yaml-test% cat >.yamllint
extend: default
yaml-files:
  - sub/*.yaml
:~/tmp/yaml-test% yamllint .
:~/tmp/yaml-test% yamllint -v
yamllint 1.25.0
@adrienverge
Copy link
Owner

Thanks for the report, I can confirm that it worked with 1.23.0, but not with 1.24.0 and versions above.

My understanding of the problem is that #283 was incorrect, and we should revert it. @dafyddj @jsok what do you think?

The problem comes from the fact that python-pathspec configured with gitwildmatch, ['*.yaml'] matches files like dir.yaml/file.sql:

>>> import pathspec
>>> p = pathspec.PathSpec.from_lines('gitwildmatch', ['*.yaml'])
>>> p.match_file('file.yaml')
True
>>> p.match_file('dir/file.sql')
False
>>> p.match_file('dir.yaml/file.sql')
True

I don't know whether it's the expected behavior, so I asked at cpburnz/python-pathspec#41.

@adrienverge
Copy link
Owner

This problem was reported again (in #390). I still think that #283 introduced a bigger problem than the one it solved, and we should revert it. @jsok @dafyddj @andrewimeson what do you think?

@andrewimeson
Copy link
Contributor

It's hard for me to estimate which has more impact, but based on the reports for each issue, I think that the one that #283 fixes is more rare.

It would be ideal if we could figure out how to solve #279 without breaking the directory matching, but in the meantime reverting #283 makes sense to me.

@adrienverge
Copy link
Owner

@andrewimeson thanks for your feedback. Solving #279 without breaking the directory matching seems possible, but complex (see discussion at cpburnz/python-pathspec#41).

We can wait for other users' feedback, but if someone wants to solve #334 by reverting commit a221898, I'm +1 to merge it.

@adrienverge
Copy link
Owner

This problem was reported again (in #546).

Now that pathspec has a new GitIgnoreSpec class to fix the original problem, the solution could be to cancel #283 and use the new pathspec class. For example :
- remove addition of os.path.basename() from commit a221898,
- switch from pathspec.PathSpec to pathspec.GitIgnoreSpec,
- require pathspec ≥ 0.10.0 in setup.cfg,
- add new cases in tests/test_cli.py, including examples for real-life issues #334, #390 and #546.

Unfortunately the author of the problematic PR @jsok doesn't answer. To anyone who wants to tackle the problem: contribution welcome!

@jsok
Copy link
Contributor

jsok commented Mar 8, 2023

If #283 is broken then revert it 😃
I did my best to add test cases but it's clearly caused a regression that no test caught. It would be wise to add the scenarios outlined in this issue as test cases to avoid future regressions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants