Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(router) do not force an extra segment in the path #3780

Merged
merged 5 commits into from Oct 1, 2018

Conversation

Projects
None yet
3 participants
@Tieske
Copy link
Member

Tieske commented Sep 17, 2018

Issues resolved

Fixes the issue/todo identified in #3749

Read carefully

The way this is handled is by changing the way the router constructs the outgoing path. The path is constructed from these elements in order:

  1. the upstream prefix (the path property in the service entity)
  2. the path match (based on the paths property in the route entity)
    • if matching on path
    • if not stripping the path
  3. the request postfix (the part after the matched path, or if not matched by path, the full request path)

In all cases, the initial / of the incoming requested path will be ignored/removed. This allows the user to control the presence of the / in the outgoing path.

The reason for this change is because in the existing code, different inputs lead to the same output, always injecting the extra /. See these lines for example. The first set of 5 tests, yield the exact same upstream path as the second set of 5 lines, despite the upstream_path being different. The updated tests (now a block of 12) start here.

This PR also adds quite a number of test cases that were not previously covered.

The bad news is that this is a breaking change, since it changes the way the upstream path is constructed. The good news is that in the entire Kong test suite there was only 1 occurrence of a test failing because of this.

@Tieske Tieske self-assigned this Sep 17, 2018

@Tieske Tieske force-pushed the fix/path-handling branch 6 times, most recently from b520358 to 8d549a9 Sep 17, 2018

Show resolved Hide resolved kong/router.lua Outdated
@@ -371,7 +371,7 @@ do

if m then
if m.stripped_uri then
ctx.matches.stripped_uri = "/" .. m.stripped_uri
ctx.matches.stripped_uri = m.stripped_uri

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 21, 2018

Member

Throughout this function, we changed the behavior around ctx.matches.stripped_uri in such a way that nothing is done anymore to ensure that it is prefixed by /, correct?

Unfortunately, I do not recall that we ever implemented tests around the ctx.matches values.

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 21, 2018

Author Member

yes, its behaviour is now that is will be everything AFTER the match, so the / will only be there if the matched path does not include it. The variable should probably be renamed, but I postponed that for now.

And yes, we should add those tests.

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 21, 2018

Member

The variable should probably be renamed

The name is fine as is, it is still the stripped part of the URL. I was mostly worried about the breaking change; but it might be a necessary evil here.

@@ -388,15 +388,7 @@ do
end

-- plain or prefix match from the index
if route_t.strip_uri then

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 21, 2018

Member

This branch was an optimization around the two sub and the concatenation operations that it enclosed. Depending on the answer to the comment right above this one (about ctx.matches.stripped_uri), we may or may not need bring it back.

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 21, 2018

Author Member

since the behavior changed, this doesn't need to come back imo

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 24, 2018

Member

Well, I would beg to differ. Now that this condition is removed, ctx.matches.stripped_uri would be set unconditionally when the Route is a plain prefix match, wouldn't it? Even if a Route does not have strip_path enabled? That would not be desirable behavior, as it would lead some existing implementations or developers into believing that a request got its URL stripped, while it was not.
I understand that this is needed as per the new way of building the upstream URL, but it should not be exposed in the user-facing router matches if not necessary, or bear a different name.

-- uri trailing slash logic
-- if we do not have a path-match, then the postfix is simply the
-- incoming path, without the initial slash
local request_postfix = matches.stripped_uri or sub(req_uri, 2, -1)

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 21, 2018

Member

This variable should only be needed within the below if matched_route.strip_uri branch. Let's move it within it then. Doing so would also avoid executing sub(req_uri, 2, -1) twice.

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 21, 2018

Author Member

see comment below, fix it in after-access

upstream_uri = upstream_url_file .. upstream_uri
if matched_route.strip_uri then
-- we drop the matched part, replacing it with the upstream path
if sub(upstream_base, -1, -1) == "/" and sub(request_postfix, 1, 1) == "/" then

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 21, 2018

Member

The first statement of this condition (if sub(upstream_base, -1, -1) == "/") could be avoided if we relied on upstream_url_t.file, which we sanitize only once, here, to ensure that it does not contain a trailing slash.

And when upstream_base == "/" (initialized to file instead of path), we should then do: upstream_uri = request_postfix. That would be cheaper.

Besides, it seems like the third argument in the first sub() call is unnecessary, but that's irrelevant if superseded by the above.

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 21, 2018

Member

style: mind keeping the line length below 80 characters please?

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 21, 2018

Author Member

Considering this code will be moved out of this pre-access code, into post-access, I'd rather not as it seems a premature optimization doing that now. Once a follow up PR puts this in the proper place, we can optimize it there imo.

That said, afaik, the file thing is only used here in the router, should it be removed? Once the path construction moves to after-access, we cannot rely on it any more.

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 21, 2018

Member

It wasn't clear that there would be a subsequent PR. In this case, I'd rather see the final result of these changes at once, in the same PR (separate commits is fine). I'd argue that aiming towards commit atomicity does mean that each commit should be written optimally: say we end up reverting commit number 2, now commit number 1 does not include the optimizations that were only added in 2.

That said, afaik, the file thing is only used here in the router, should it be removed?

When a variable becomes stale (and only then), please, by all means, remove it!

@@ -1681,64 +1681,89 @@ describe("Router", function()

describe("slash handling", function()
local checks = {
-- upstream url paths request path expected path strip uri
-- upstream url paths request path expected path strip uri

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 21, 2018

Member

Unrelated to this PR, but I truly wish the order of those were: request path, route path, upstream url, expected url... Even despite the fact that the upstream URL is built from upstream+path+request. If anybody with some time on their hands is reading this...

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 21, 2018

Author Member

yes, I had that itch as well 😄

Show resolved Hide resolved spec/02-integration/05-proxy/01-router_spec.lua Outdated
@thibaultcha

This comment has been minimized.

Copy link
Member

thibaultcha commented Sep 21, 2018

@Tieske Why does this PR have a "do not merge" label? This label signifies that a PR is rejected. If you are worried that this PR gets merged without proper review/testing, that should not happen as per in the first place as per our process.

@Tieske
Copy link
Member Author

Tieske left a comment

I'll drop the "do not merge" label, I just wanted to make very sure it wouldn't get merged

@@ -371,7 +371,7 @@ do

if m then
if m.stripped_uri then
ctx.matches.stripped_uri = "/" .. m.stripped_uri
ctx.matches.stripped_uri = m.stripped_uri

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 21, 2018

Author Member

yes, its behaviour is now that is will be everything AFTER the match, so the / will only be there if the matched path does not include it. The variable should probably be renamed, but I postponed that for now.

And yes, we should add those tests.

@@ -388,15 +388,7 @@ do
end

-- plain or prefix match from the index
if route_t.strip_uri then

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 21, 2018

Author Member

since the behavior changed, this doesn't need to come back imo

upstream_uri = upstream_url_file .. upstream_uri
if matched_route.strip_uri then
-- we drop the matched part, replacing it with the upstream path
if sub(upstream_base, -1, -1) == "/" and sub(request_postfix, 1, 1) == "/" then

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 21, 2018

Author Member

Considering this code will be moved out of this pre-access code, into post-access, I'd rather not as it seems a premature optimization doing that now. Once a follow up PR puts this in the proper place, we can optimize it there imo.

That said, afaik, the file thing is only used here in the router, should it be removed? Once the path construction moves to after-access, we cannot rely on it any more.

@@ -1681,64 +1681,89 @@ describe("Router", function()

describe("slash handling", function()
local checks = {
-- upstream url paths request path expected path strip uri
-- upstream url paths request path expected path strip uri

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 21, 2018

Author Member

yes, I had that itch as well 😄

-- uri trailing slash logic
-- if we do not have a path-match, then the postfix is simply the
-- incoming path, without the initial slash
local request_postfix = matches.stripped_uri or sub(req_uri, 2, -1)

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 21, 2018

Author Member

see comment below, fix it in after-access

@Tieske Tieske removed the pr/do not merge label Sep 21, 2018

@Tieske Tieske force-pushed the fix/path-handling branch 4 times, most recently from 46e5780 to 0eee48d Sep 24, 2018


else
if path == null then

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 24, 2018

Member

Maybe:

local path = service.path or "/"

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 25, 2018

Author Member

my assumption here was that the TODO comment above indicated that in the future this value comes in as null and not nil, hence I retained the existing logic with the extra check. And it seemed a bad combination to also fix that here?

This comment has been minimized.

Copy link
@bungle

bungle Sep 25, 2018

Member

There was once a time when new dao returned null instead of nil, but that was changed to configurable. Today it returns nil by default. Only the admin API uses null.

This comment has been minimized.

Copy link
@Tieske

Tieske Sep 25, 2018

Author Member

Fixed after consulting @bungle, see d1cad14

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 25, 2018

Member

Yes, just like @bungle said. Thank you!

@@ -388,15 +388,7 @@ do
end

-- plain or prefix match from the index
if route_t.strip_uri then

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Sep 24, 2018

Member

Well, I would beg to differ. Now that this condition is removed, ctx.matches.stripped_uri would be set unconditionally when the Route is a plain prefix match, wouldn't it? Even if a Route does not have strip_path enabled? That would not be desirable behavior, as it would lead some existing implementations or developers into believing that a request got its URL stripped, while it was not.
I understand that this is needed as per the new way of building the upstream URL, but it should not be exposed in the user-facing router matches if not necessary, or bear a different name.

@Tieske Tieske force-pushed the fix/path-handling branch 4 times, most recently from 279176f to ed8712d Sep 27, 2018

Tieske added some commits Sep 17, 2018

fix(router) do not force an extra segment in the path
Fixes path concatenation to be more logical. See the
updated tests for changed behaviour.

Fixes the 'todo' of #3749

Tieske added some commits Sep 18, 2018

feat(route) allow trailing slash
Allow a trailing slash in a route for path matching, this
provides the user with the option to potentially strip the
final slash from the upstream path.
tests(router) duplicate test matrix to also run regex based
Duplicates the test matrix for slash-handling, modified to
do the path mathcing based on a regex.

@Tieske Tieske force-pushed the fix/path-handling branch from ed8712d to 8da38aa Sep 28, 2018

@thibaultcha
Copy link
Member

thibaultcha left a comment

LGTM, but I thought we agreed on the pressing need to test the path building with regexes as well?

{ "/fee/bor", nil, "/foo/bar/", "/fee/borfoo/bar/", true },
{ "/fee/bor/", nil, "/", "/fee/bor/", true },
{ "/fee/bor/", nil, "/foo/bar", "/fee/bor/foo/bar", true },
{ "/fee/bor/", nil, "/foo/bar/", "/fee/bor/foo/bar/", true },

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Oct 1, 2018

Member

Are we not testing with regex paths as we discussed? By simply copy-pasting a prefix match block, and making "dumb" regexes?

This comment has been minimized.

Copy link
@Tieske

Tieske Oct 1, 2018

Author Member

@thibaultcha look closer 👀

here: https://github.com/Kong/kong/pull/3780/files#diff-1770b4b763fc513f92257840b039392eR1883

and here: https://github.com/Kong/kong/pull/3780/files#diff-e1afd87867200fd0cc7977cf95e039a6R1093

(injecting the prefix had to be slightly smarter, since the route must start with a / as per the schema)

This comment has been minimized.

Copy link
@thibaultcha

thibaultcha Oct 1, 2018

Member

Great then

@thibaultcha

This comment has been minimized.

Copy link
Member

thibaultcha commented Oct 1, 2018

If good to merge on your end too, let's proceed with a merge commit

@Tieske

This comment has been minimized.

Copy link
Member Author

Tieske commented Oct 1, 2018

yay 🎉 @thibaultcha please go ahead and merge as you see fit

@thibaultcha thibaultcha merged commit 62422bf into next Oct 1, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@thibaultcha thibaultcha deleted the fix/path-handling branch Oct 1, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.