Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use start/end anchors in matchStrings() regex #1

Merged
merged 3 commits into from
Jan 23, 2023

Conversation

jmikola
Copy link
Contributor

@jmikola jmikola commented Jan 20, 2023

I came across this when creating two similarly named collections in a Sculpin deployment ("posts" and "posts2" used source directories "_posts" and "_posts2", respectively). I found that the data collector for "posts" contained sources from both directories and ultimately traced it back to logic in AntPathMatcherTest.

I wasn't familiar with the Ant pattern syntax before this PR, but I think the regex anchoring is correct per the syntax definition in AntPathMatcher in the Spring API docs. I didn't come across any documentation that suggested path components should not be matched in their entirety. That said, matchStrings() is called from a few places so it'd be prudent to confirm this doesn't inadvertently break other use cases.

In order to test this, I had to make some additional changes to allow a newer version of PHPUnit. I attempted to do so without breaking support for older PHP versions.

As a side note, if you're looking to improve the test coverage for the AntPathMatcher class, it may be worth pulling in additional test cases from AntPathMatcherTests.java.

This prevents PHPUnit from executing the data provider as a test and warning that it performs no assertions.
composer.json Show resolved Hide resolved
Copy link
Member

@simensen simensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for looking into this. I have a few questions and comments. If you have time to do a bit more digging (like, if you already have some environment set up where you can look at this a bit more) that would be awesome.

Rooting the regular expression might be the right call anyway, but it seems like that should actually be required. It seems more likely based on what I recall that maybe Sculpin isn't adding the /** suffix to the path. This could still be an issue with this library, but in that case it might be in isPattern.

This adds a new test case demonstrating the issue where "com/**" would incorrectly match "com2/foo" or "2com/foo".

The change also necessitated revising an existing test case, since "t?st" should not match a string with a file extension. The original test case was changed to add an extension to the pattern, and the extension-less pattern was preserved for a new regression test.
@simensen simensen merged commit c8406d2 into dflydev:master Jan 23, 2023
@jmikola jmikola deleted the fix-matchStrings branch January 24, 2023 15:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants