@adrien-berchet
Member

Description

In some cases the reference directory can be 'dirty', i.e. it can contain files that should not be compared (e.g. *.pyc files or other tmp files). This PR adds a new parameter to compare_trees to ignore these files.

Fixes: #84

Checklist

This pull request is:

  • A documentation / typographical error fix
    • Good to go, no issue or tests are needed
  • A short code fix
    • Please include: Fixes: #<issue number> in the description if it solves an existing issue
      (which must include a complete example of the issue).
    • Please include tests that fail with the main branch and pass with the provided fix.
  • A new feature implementation or an update to an existing feature
    • Please include: Fixes: #<issue number> in the description if it solves an existing issue
      (which must include a complete example of the feature).
    • Please include tests that cover every line of the new/updated feature.
    • Please update the documentation to describe the new/updated feature.

@codecov

codecov bot commented Sep 8, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (04159b2) to head (d1be2a7).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff            @@
##              main       #86   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            8         8           
  Lines          721       800   +79     
  Branches       119       133   +14     
=========================================
+ Hits           721       800   +79     
Flag Coverage Δ
pytest 100.00% <100.00%> (ø)

Files with missing lines Coverage Δ
dir_content_diff/__init__.py 100.00% <100.00%> (ø)

Copilot AI left a comment

Pull Request Overview

This PR adds functionality to ignore specified files in the reference directory during tree comparison by introducing an ignore_patterns parameter to the compare_trees function. This addresses cases where reference directories contain temporary or unwanted files that should be excluded from comparison.

  • Added ignore_patterns parameter to compare_trees function for filtering files using regex patterns
  • Implemented pattern matching logic to skip files that match any of the ignore patterns
  • Added comprehensive test coverage for the new ignore functionality
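
The filtering step summarized above can be sketched roughly as follows. This is only an illustration and not the code merged in this PR: the helper name `list_reference_files` is hypothetical, and it assumes that each pattern is a regular expression applied with `re.match` to the file path relative to the reference directory (the library's actual matching rules may differ).

```python
import re
from pathlib import Path


def list_reference_files(ref_dir, ignore_patterns=()):
    """Return the relative paths of the files under ref_dir that are not ignored.

    Hypothetical helper: it only illustrates the idea of skipping files
    whose relative path matches at least one ignore pattern.
    """
    compiled = [re.compile(pattern) for pattern in ignore_patterns]
    kept = []
    for path in sorted(Path(ref_dir).rglob("*")):
        if not path.is_file():
            continue
        # Use POSIX separators so the same patterns work on every platform.
        rel = path.relative_to(ref_dir).as_posix()
        # Assumption: a file is skipped as soon as one pattern matches
        # from the start of its relative path (re.match semantics).
        if any(regex.match(rel) for regex in compiled):
            continue
        kept.append(rel)
    return kept
```

With a pattern such as `r".*\.pyc$"`, compiled files anywhere in the reference tree would be left out of the comparison while all other files are kept.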

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
dir_content_diff/__init__.py Added ignore_patterns parameter and filtering logic to compare_trees function
tests/test_base.py Added test case to verify ignore patterns functionality works correctly
README.md Updated documentation with example usage of the new ignore_patterns feature

@adrien-berchet
Member Author

@liborjelinek could you test that this branch fixes your issue please?

@adrien-berchet
Member Author

Hi @liborjelinek!
I came back to this PR to improve a few things. You can now filter the files in two ways:

  1. use a set of patterns to select the files that are kept.
  2. use a set of patterns to ignore some of the files that were selected.

I think this should cover most file-selection cases. Again, it would be nice if you had some time to test this branch, but if you can't it's no big deal and I will merge in a few days.

@liborjelinek

I have now tried exclude_patterns and it works like a charm 👌

But how does it work if both params are passed? Honestly, I can't find a use case for include_patterns. E.g., if a reference (left) directory has files A, B, and C and I pass include_patterns=["a"], exclude_patterns=["b"] to compare_trees()/assert_equal_trees(), what happens? Include includes from what? Just asking.

@adrien-berchet
Member Author

Great, thanks for your feedback!
When both include_patterns and exclude_patterns are given, we keep the files that match at least one of the include_patterns AND that do NOT match any of the exclude_patterns. I think it can be convenient in some complex cases, for example you could write something like this:

config = ComparisonConfig(
    include_patterns=[r".*\.(py|json|toml|yaml)$"],
    exclude_patterns=[
        r".*__pycache__.*",
        r".*/\.pytest_cache/.*",
    ]
)

to compare Python projects (keeping files with specific extensions but excluding the ones located in specific subdirectories). It would be possible to express this case with only exclude patterns, but I think it would be more complicated because it would require using a NOT operator in the patterns, which I find a bit ugly.
Anyway, having both was not hard to implement, so users can choose what they prefer :-)
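
To make the selection rule above concrete, here is a minimal sketch of it in plain Python. It is not the library's implementation: `select_files` is a hypothetical helper, the sample paths are made up, and it assumes that patterns are applied with `re.match` and that an empty include list keeps every file. The second call shows the exclude-only variant mentioned above, where the include pattern is rewritten as a negative lookahead (the "NOT operator").

```python
import re


def select_files(paths, include_patterns=None, exclude_patterns=None):
    """Keep paths matching at least one include pattern and no exclude pattern.

    Hypothetical helper; assumes re.match semantics and that an empty
    include list means "keep everything".
    """
    include = [re.compile(p) for p in (include_patterns or [])]
    exclude = [re.compile(p) for p in (exclude_patterns or [])]
    selected = []
    for path in paths:
        # Step 1: the path must match at least one include pattern (if any).
        if include and not any(regex.match(path) for regex in include):
            continue
        # Step 2: the path must not match any exclude pattern.
        if any(regex.match(path) for regex in exclude):
            continue
        selected.append(path)
    return selected


# Made-up relative paths, for illustration only.
paths = [
    "src/app.py",
    "src/__pycache__/app.cpython-312.pyc",
    "src/__pycache__/settings.json",
    "pyproject.toml",
    "notes.txt",
    "tests/.pytest_cache/state.json",
]

excluded_dirs = [r".*__pycache__.*", r".*/\.pytest_cache/.*"]

# Include + exclude, as in the configuration shown above.
both = select_files(
    paths,
    include_patterns=[r".*\.(py|json|toml|yaml)$"],
    exclude_patterns=excluded_dirs,
)

# Exclude-only variant: the include pattern becomes a negative lookahead.
exclude_only = select_files(
    paths,
    exclude_patterns=[r"(?!.*\.(py|json|toml|yaml)$).*"] + excluded_dirs,
)

print(both)          # ['src/app.py', 'pyproject.toml']
print(exclude_only)  # same selection, but with a harder-to-read pattern
```

Both calls select the same files, but the exclude-only version needs the lookahead pattern, which is the readability cost described above.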

@adrien-berchet adrien-berchet merged commit bf5a9c2 into main Sep 12, 2025
9 checks passed
@adrien-berchet adrien-berchet deleted the ignore_patterns branch September 12, 2025 08:23
@adrien-berchet
Member Author

This feature is included in the new release 1.13.0.

Development

Successfully merging this pull request may close these issues.

[How to use] ignore (exclude) files from comparing

3 participants