Skip to content
This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

--match should consider only basename when given a path argument #549

Closed
oczkoisse opened this issue Aug 13, 2021 · 2 comments · Fixed by #550
Closed

--match should consider only basename when given a path argument #549

oczkoisse opened this issue Aug 13, 2021 · 2 comments · Fixed by #550

Comments

@oczkoisse
Copy link
Contributor

oczkoisse commented Aug 13, 2021

As an example,

pydocstyle --match '(?!setup).*\.py' setup.py

correctly ignores setup.py. However, both

pydocstyle --match '(?!setup).*\.py' .\setup.py

and

pydocstyle --match '(?!setup).*\.py' ./setup.py

don't. This behavior is confusing for a user. Even worse, IDEs like VSCode automatially invoke pydocstyle with full path of opened Python file. Even if I include a match pattern in my configuration, it doesn't work because full paths are passed to pydocstyle. I wasted some time trying to figure out if I had configured pydocstyle correctly because VSCode was still flagging problems in ignored files.

I think the issue can be resolved by only considering basename of file arguments to match against --match pattern.

@oczkoisse oczkoisse changed the title --match should consider only basename when given a path argument --match should consider only basename when given a path argument Aug 13, 2021
@sigmavirus24
Copy link
Member

@check_initialized
def get_files_to_check(self):
"""Generate files and error codes to check on each one.
Walk dir trees under `self._arguments` and yield file names
that `match` under each directory that `match_dir`.
The method locates the configuration for each file name and yields a
tuple of (filename, [error_codes]).
With every discovery of a new configuration file `IllegalConfiguration`
might be raised.
"""
def _get_matches(conf):
"""Return the `match` and `match_dir` functions for `config`."""
match_func = re(conf.match + '$').match
match_dir_func = re(conf.match_dir + '$').match
return match_func, match_dir_func
def _get_ignore_decorators(conf):
"""Return the `ignore_decorators` as None or regex."""
return (
re(conf.ignore_decorators) if conf.ignore_decorators else None
)
for name in self._arguments:
if os.path.isdir(name):
for root, dirs, filenames in os.walk(name):
config = self._get_config(os.path.abspath(root))
match, match_dir = _get_matches(config)
ignore_decorators = _get_ignore_decorators(config)
# Skip any dirs that do not match match_dir
dirs[:] = [d for d in dirs if match_dir(d)]
for filename in filenames:
if match(filename):
full_path = os.path.join(root, filename)
yield (
full_path,
list(config.checked_codes),
ignore_decorators,
)
else:
config = self._get_config(os.path.abspath(name))
match, _ = _get_matches(config)
ignore_decorators = _get_ignore_decorators(config)
if match(name):
yield (name, list(config.checked_codes), ignore_decorators)

I think the solution can be handled in a couple ways:

  1. We could switch from re.compile(conf.match + '$').match to .search at the end - this works because we are already adding a $ to force the match to happen at the end.
  2. We could leave that code as it is today and use os.path.basename on filenames (since match-dir should be handling the directory from the full-path)
  3. On the odd chance someone is relying on this behaviour, add in or match(os.path.basename(filename)) on line 280

https://docs.python.org/3/library/re.html#search-vs-match

@oczkoisse
Copy link
Contributor Author

@sigmavirus24 Thanks for replying. About the suggested approaches:

  1. It seems like changing to .search will break negative lookahead patterns because:
    # The following evaluates to True with .search but is False with .match
    bool(re.compile('(?!test_).+\.py' + '$').search('test_a.py'))
    My understanding is that pydocstyle doesn't have an exclude option for files so people may be relying on negative lookahead patterns to get similar functionality.
  2. I like this approach better because I think using basename is more explicit about --match being meant to match against filenames alone.
  3. Personally, I think it might be okay to break the old behavior in this case. However, I leave that decision to project maintainers.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants