Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(proxy) prioritize longer regex path matches #5667

Open
wants to merge 1 commit into
base: master
from

Conversation

@chrisaljoudi
Copy link

chrisaljoudi commented Mar 11, 2020

Summary

Let’s say we have two Routes, each with a single path and otherwise identically-configured:

  • /api/users/(?<userId>\w+)
  • /api/users/(?<userId>\w+)/profile

In the current behavior, unless regex_priority is set, a request coming in to /api/users/30/profile will match the former route (the one without the profile suffix) if it was created first—so the current behavior is to fallback to prioritizing based on created-time.

With non-regex paths, Kong currently (very helpfully) defaults to giving precedence/priority to longer paths. This is a great default.

This change simply makes the same true for regex paths, which is a more reasonable/predictable default (compared to created-time based priority, which is implicit and undesirable).

Full changelog

  • Update router.lua to include regex paths in calculating max_uri_length for routes
`regex_priority` obviously still applies and takes precedence, but just like we prioritize longer matches for non-regex paths, prioritizing longer paths in the regex case, as well, is a more reasonable/predictable default (otherwise we fall back to created-time based priority, which is implicit and undesirable).
@CLAassistant

This comment has been minimized.

Copy link

CLAassistant commented Mar 11, 2020

CLA assistant check
All committers have signed the CLA.

@bungle

This comment has been minimized.

Copy link
Member

bungle commented Mar 11, 2020

with regex you only know which regex is longer but that doesn't answer which matches the longest uri. Shorter regex can match longer prefix.

@chrisaljoudi

This comment has been minimized.

Copy link
Author

chrisaljoudi commented Mar 11, 2020

@bungle Good point. Isn’t that significantly more 'intrusive' to implement, though?

To be clear, I don’t at all disagree that the semantic considering real match lengths is preferable—but it seems like basing this on regex length, while not ideal, is still preferable to the current behavior.

Thoughts?

(Also, just noticed a couple of the tests need to be updated. Will do that if the approach seems reasonable to you).

@bungle

This comment has been minimized.

Copy link
Member

bungle commented Mar 13, 2020

@hishamhm any opinions about this?

@hishamhm

This comment has been minimized.

Copy link
Member

hishamhm commented Mar 13, 2020

@chrisaljoudi while I agree that created-time is not a great default criteria (and that's of course why the regex_priority field exists to override that), but I think applying length of the regex as criteria could be very misleading to users because it would in many cases appear to filter by match length (especially since non-regex matches already do that), but, as @bungle correctly pointed out, not all.

Running all regexes and applying the longest match would IMO make more sense, I think, and yes, the implementation would look very different, etc. (Though regex_priority would still need to apply because you may still want to match shorter regexes). But doing that would be a big behavior compatibility breakage (just like this PR is). It would have to be done in a major version, because for better or worse, sorting by created-at is the currently documented behavior.

@chrisaljoudi

This comment has been minimized.

Copy link
Author

chrisaljoudi commented Mar 13, 2020

Thanks for the feedback @hishamhm and @bungle. Makes sense that this behavior change should be in a major version, probably.

With that in mind, what do you guys think if the logic was to default regex_priority to the number of capture groups in the regex?

Again, this would still be in a major version, just wondering what you semantically think of that vs. length (trying to support the use-case exemplified above for more ‘specific’/drilled-down parameterized routes being matched first).

Number of capture groups is one idea, and another is the number of ‘path segments’ (though that’s less deterministic as you point out, because a regex may match multiple segments).

Thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.