Skip to content

TracePathMatcher should match pattern "**" with paths end by "/" (#7875)#72

Closed
newboy2004 wants to merge 8 commits intoapache:mainfrom
newboy2004:main
Closed

TracePathMatcher should match pattern "**" with paths end by "/" (#7875)#72
newboy2004 wants to merge 8 commits intoapache:mainfrom
newboy2004:main

Conversation

@newboy2004
Copy link
Copy Markdown

Fix <TracePathMatcher should match pattern "**" with paths end by "/">

  • Add a unit test to verify that the fix works.
  • Use "Spring AntPathMatcher' instead of ”FastPathMatcher“

Comment thread apm-sniffer/optional-plugins/trace-ignore-plugin/pom.xml Outdated
@wu-sheng
Copy link
Copy Markdown
Member

If you want to add this, you need to enhance the existing one, rather than use Spring to replace.
We knew Spring can do from beginning, but we determined this is not suitable, due to package size and License risk.

@wu-sheng wu-sheng added the invalid This doesn't seem right label Nov 18, 2021
@wu-sheng
Copy link
Copy Markdown
Member

@devkanro Could you recheck this? I think you wrote this originally.

@wu-sheng wu-sheng added enhancement New feature or request plugin and removed invalid This doesn't seem right labels Nov 18, 2021
@devkanro
Copy link
Copy Markdown
Contributor

ok

Comment on lines 125 to 128
// End of pattern, just check the end of string is '/' quickly.
if (p >= pat.length() && s < str.length()) {
return str.charAt(str.length() - 1) != '/';
return true;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the comment need be changed.

        // End of pattern, make matching success quickly.
        if (p >= pat.length() && s < str.length()) {
            return true;
        }

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my originally design, the /eureka/** not match the /eureka/client/ is a feature, but I have no preference for this, I can accept it regardless of whether it matches or does not match.

Could you show your use cases and benefits for it?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think because some users think all things under /eureka/ belong to this match rule.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devkanro Could you share a little more about why /eureka/client/ doesn't belong to /eureka/**? I hope we could have a fully evaluation considering this was being asked more than once.(This is the first fix)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ummm...I can't remember what I thought at the time...

Maybe it's because /eureka/ doesn't match /eureka, I want to make it same as /eureka/** case?

I have changed my mind, the /eureka/client/ should belong to /eureka/**.

LGTM

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And maybe the rule of wildcardMatch also should be changed?
Should /eureka/* match with /eureka/client/ ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/eureka/ doesn't match /eureka

I think they should be same, do you know any different in some use cases?
I feel they are same.

And maybe the rule of wildcardMatch also should be changed?
Should /eureka/* match with /eureka/client/ ?

Yes, it should be supported, too.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is final dicussion?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current resolution is, what your proposal is correct. Just follow the polish requirements.

Copy link
Copy Markdown
Contributor

@devkanro devkanro Nov 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wu-sheng

/eureka/ doesn't match /eureka

Ummm, The reason I do this is to simplify the state machine. There are no other special considerations.

@wu-sheng
Copy link
Copy Markdown
Member

@newboy2004 Please follow the @devkanro 's comments to polish/enhance this fix and update the changes.md in the root.

@newboy2004
Copy link
Copy Markdown
Author

@newboy2004 Please follow the @devkanro 's comments to polish/enhance this fix and update the changes.md in the root.

ok,i continue repair it.

By the way, accessing githubcom is slow. Is there any good solution?

@devkanro
Copy link
Copy Markdown
Contributor

The simple resolution is remove the last '/' for the pattern and string.

If this change is done during the matching process, many states will be introduced, which will complicate the state machine of the matcher.

I think it is a good choice to normalize(remove the last '/') the pattern and value before entering the matcher.

@wu-sheng
Copy link
Copy Markdown
Member

Agree, this kind of change seems better and easier to understand.

@newboy2004
Copy link
Copy Markdown
Author

Agree, this kind of change seems better and easier to understand.

Continue to repair * matching, and remove the last / of pattern and string before?

@wu-sheng
Copy link
Copy Markdown
Member

Remove / before the match, I think. Then no need to change matching core, fromy understanding.
Please correct me if I am wrong.

@newboy2004
Copy link
Copy Markdown
Author

Remove / before the match, I think. Then no need to change matching core, fromy understanding. Please correct me if I am wrong.

please see blow origin code:
String patten = "/eureka/*";
path = "/eureka/apps/";
match = pathMatcher.match(patten, path);
Assert.assertFalse(match);

i think remove / after,Assert should return true.
please check my opinion,is it right?

@wu-sheng
Copy link
Copy Markdown
Member

i think remove / after,Assert should return true.
please check my opinion,is it right?

Yes, /eureka/apps/ should be captured by /eureka/* expression. That is why @devkanro proposed a change about removing / in /eureka/apps/. Then the current FastPathMatcher should return true, too. Right?

@newboy2004
Copy link
Copy Markdown
Author

i think remove / after,Assert should return true.
please check my opinion,is it right?

Yes, /eureka/apps/ should be captured by /eureka/* expression. That is why @devkanro proposed a change about removing / in /eureka/apps/. Then the current FastPathMatcher should return true, too. Right?

i wait for final opinion? my local code repo have already finish what * rule match and unit test according before discussion

waiting for you final opionion

@wu-sheng
Copy link
Copy Markdown
Member

I think the conclusion is very clear. The recommended way to fix is removing the last / in the operation name, before running match. Such as in TraceIgnoreExtendService#trySampling. What do you think? @newboy2004

@newboy2004
Copy link
Copy Markdown
Author

conclusion

ok,i understand.

char pc = safeCharAt(pat, p);
//if pat already arrival end,and str in position of s is not '/',then return true
if (pc == '\u0000' && safeCharAt(str, s) != '/') {
return true;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change will cause a bug that 'abc/*' matching 'abc/foo/bar'

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i review it

// End of pattern, just check the end of string is '/' quickly.
if (p >= pat.length() && s < str.length()) {
return str.charAt(str.length() - 1) != '/';
return true;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this check after we remove the last '/' both of pattern and value,

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@wu-sheng
Copy link
Copy Markdown
Member

I think the conclusion is very clear. The recommended way to fix is removing the last / in the operation name, before running match. Such as in TraceIgnoreExtendService#trySampling. What do you think? @newboy2004

I think the current doesn't follow this conclusion. We just need to slightly change the input.

@newboy2004
Copy link
Copy Markdown
Author

I think the conclusion is very clear. The recommended way to fix is removing the last / in the operation name, before running match. Such as in TraceIgnoreExtendService#trySampling. What do you think? @newboy2004

I think the current doesn't follow this conclusion. We just need to slightly change the input.

ok,but if i remove the last '/' in the operation name,please confirm blow code,return true or false:
String patten = "/eureka/*";
path = "/eureka/";
match = pathMatcher.match(patten, path);
Assert.assertTrue(match);

i think '' match zero or one char,but remove the last '/' after,pattern='/eureka/' path='/eureka',accroding Ant Rule,result should return false,do you think?

@wu-sheng
Copy link
Copy Markdown
Member

OK, I can see this becomes a tricky point. From this point of view, removing / and changing matching rules are both not a very good idea, as these kinds of edge conditions.

I think we should consider whether changing plugin's operation rules fits more cases.

@newboy2004
Copy link
Copy Markdown
Author

edge

We can consider not supporting this so-called bug,so I'll consider claiming other tasks?haha

@wu-sheng
Copy link
Copy Markdown
Member

All operation names, such as typically /eureka/ is collected by a HTTP client plugin. If that plugin says removing a / as suffix, I think it is generally easy.

@newboy2004
Copy link
Copy Markdown
Author

All operation names, such as typically /eureka/ is collected by a HTTP client plugin. If that plugin says removing a / as suffix, I think it is generally easy.

The code test mentioned above should return false,please @devkanro check

@wu-sheng
Copy link
Copy Markdown
Member

All operation names, such as typically /eureka/ is collected by a HTTP client plugin. If that plugin says removing a / as suffix, I think it is generally easy.

The code test mentioned above should return false,please @devkanro check

The point here is, returning false breaks the expectation when this PR raised.

@newboy2004
Copy link
Copy Markdown
Author

All operation names, such as typically /eureka/ is collected by a HTTP client plugin. If that plugin says removing a / as suffix, I think it is generally easy.

The code test mentioned above should return false,please @devkanro check

The point here is, returning false breaks the expectation when this PR raised.

so remvoing the last '/' suffix, it's not a good idea, it breaks the expectation.
or is there a logical problem with the modification of the code after the removal / modification

@wu-sheng
Copy link
Copy Markdown
Member

Removing suffix is not a good idea. And changing logic makes me concerns about new bugs.
After all, this is just an enhancement for this kind of case.

@devkanro
Copy link
Copy Markdown
Contributor

devkanro commented Nov 22, 2021

We should rearrange the rules so that we can analyze the edge conditions.

Basic (from spring ant matcher):

  1. ? matches any one char.
  2. * matches zero or more chars except '/'.
  3. ** matches zero or more path parts.

Edge(for *):

  1. /eureka/* matches /eureka/?
  2. /eureka/* matches /eureka/test/?
  3. /eureka/*/ matches /eureka/?
  4. /eureka/*/ matches /eureka/test?
  5. Is /eureka/* same as /eureka/*/?

Edge(for **):

  1. /eureka/** matches /eureka/?
  2. /eureka/** matches /eureka/test/?
  3. /eureka/**/ matches /eureka/?
  4. /eureka/**/ matches /eureka/test?
  5. Is /eureka/** same as /eureka/**/?

Edge(for normal):

  1. /eureka/ matches /eureka?
  2. /eureka matches /eureka/?

@devkanro
Copy link
Copy Markdown
Contributor

devkanro commented Nov 22, 2021

My answer(all false for current behavior):

Edge(for *):

  1. /eureka/* matches /eureka/?
  2. /eureka/* matches /eureka/test/?
  3. /eureka/*/ matches /eureka/?
  4. /eureka/*/ matches /eureka/test?
  5. Is /eureka/* same as /eureka/*/?

Edge(for **):

  1. /eureka/** matches /eureka/?
  2. /eureka/** matches /eureka/test/?
  3. /eureka/**/ matches /eureka/?
  4. /eureka/**/ matches /eureka/test?
  5. Is /eureka/** same as /eureka/**/?

Edge(for normal):

  1. /eureka/ matches /eureka?
  2. /eureka matches /eureka/?

@wu-sheng
Copy link
Copy Markdown
Member

wu-sheng commented Dec 6, 2021

No update in 2 weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request plugin

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants