-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(proxy): host header including port #2045
Conversation
We shouldn't touch it, and preserve the original |
kong/templates/nginx_kong.lua
Outdated
@@ -103,7 +103,7 @@ server { | |||
proxy_set_header X-Real-IP $remote_addr; | |||
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; | |||
proxy_set_header X-Forwarded-Proto $scheme; | |||
proxy_set_header Host $upstream_host; | |||
proxy_set_header Host $upstream_host:$upstream_port; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the HTTP specs, the port really is part of the Host value if it is not the default port being requested for a service. This port could very well be included in the $upstream_host
variable on the Lua land, to avoid dispersing that logic. Besides and to highlight that fact, this new variable is currently undeclared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Especially since it is not required for the port to be present if it is the default one for the service being requested, and we would have more control over that from the Lua land if we wish to perform such changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Yes. |
There seems to be a bit more detail to it; The router matches on 'host-header', not on hostname. So to the router |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main issues I have with this is how it makes the proxy retrieve the request 's headers on every call instead of doing it only when necessary, and the weirdness of the router's API now that it lost some responsibilities (see comment).
kong/core/handler.lua
Outdated
@@ -79,6 +79,8 @@ return { | |||
|
|||
ctx.KONG_ACCESS_START = get_now() | |||
|
|||
local original_host_header = ngx.req.get_headers().host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want to avoid retrieving the headers if not necessary. This change will cancel that behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all the logic requires the hostname only, so an extra field would have to be returned, the original host-header (incl. any port info). I see what you mean, but don't see a clean solution.
kong/core/handler.lua
Outdated
@@ -117,14 +120,13 @@ return { | |||
"' with: "..tostring(err)) | |||
end | |||
|
|||
if balancer_address.hostname and not api.preserve_host then | |||
var.upstream_host = balancer_address.hostname | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line-jump was intended.
kong/core/router.lua
Outdated
end | ||
|
||
upstream_host = headers["host"] | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a test in the unit tests that currently ensures this behavior is taken care of by exec
. It it now failing.
@@ -309,6 +310,7 @@ local function execute(target) | |||
|
|||
target.ip = ip | |||
target.port = port | |||
target.hostname = target.host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not using host
or rename host
to hostname
? They seem to ultimately be identical even though the comments in the balancer_address
structure seems to say otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
host
is the hostname extracted from the upstream url (and might be an upstream-name)
hostname
is the name that matches the final ip (never an upstream-name)
The original host
field must be retained in case of a retry, whereas hostname
might get a different value for each try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whereas hostname might get a different value for each try.
Where is that bit happening then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On retries, the balancer is called again: https://github.com/Mashape/kong/blob/189628b28c672886b1d74de9bf1857c1c7a80719/kong/kong.lua#L213-L222
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha
kong/core/router.lua
Outdated
@@ -493,6 +493,9 @@ function _M.new(apis) | |||
|
|||
|
|||
local host = headers["Host"] or headers["host"] | |||
if host then | |||
host = host:match("^([^:]+)") -- strip port number if given |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better using match(host, pattern)
in such a hot code path. Even better is to use ngx_pcre, to allow this bit to be JIT-compiled.
kong/core/router.lua
Outdated
if ngx.var.http_kong_debug then | ||
ngx.header["Kong-Api-Name"] = api_t.api.name | ||
end | ||
|
||
|
||
return api_t.api, api_t.upstream_scheme, upstream_host, api_t.upstream_port | ||
return api_t.api, api_t.upstream_scheme, api_t.upstream_host, api_t.upstream_port |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the time of writing, the goal of this API was to return everything needed in order for Kong to forward upstream. Now with the balancing bit in between and the removal of the preserve_host
logic from this module, it seems like this is un-necessarily bloated.
We should either continue encapsulating such logic in this module (might not be possible since we now need to the port from the DNS resolution to build the definitive upstream Host header), or, simply return the Kong API (api_t.api
) and api_t
itself (which has all the necessary properties, but exposes some underlying router implementation which we don't really want).
We should return a table containing the final api
+ eventually another table containing scheme
, host
and port
, or include those properties in the 1st table being returned. Those are minor modifications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
considering this is all internal implementation I'd opt for returning api_t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at least over returning yet another table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I kept the list and added the host_header
, seemed cleanest to me. All header fetching, including checks for preserve_host
are back in the router.
@thibaultcha imo good to go now. |
fixes #1347 |
if balancer_address.hostname and not api.preserve_host then | ||
var.upstream_host = balancer_address.hostname | ||
-- if set `host_header` is the original header to be preserved | ||
var.upstream_host = host_header or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I more and more find this syntax to be harder to read. In the future we should stick to:
if host_header then
var.upstream_host = host_header
else
var.upstream_host = balancer_address.hostname..":"..balancer_address.port
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick not using else is a more defensive programming practice
var.upstream_host = balancer_address.hostname..":"..balancer_address.port
if host_header then
var.upstream_host = host_header
end
reference: https://github.com/TheLadders/object-calisthenics#rule-2-dont-use-the-else-keyword
assert.equal("preserved.com:80", json.headers.Host) | ||
end) | ||
|
||
pending("preserves downstream host+non_default_port if enabled", function() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are those pending?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as per comment: httpbin does not return the host header, but a parsed version with the port stripped, hence cannot test it...
I'm sorry but I don't think this even applies here. We are light years away from the examples and proposals given by the author, and the `or` pattern is still a branch, but more confusing to read. The way I understand it, this is just an exercice, and certainly not a "current best practices". We also do not have an OOP code base.
Plus the fact that the author is actually *cheering* for OOP... Hum, sorry.
… On Feb 8, 2017, at 6:41 AM, Colin Hutchinson ***@***.***> wrote:
@hutchic commented on this pull request.
In kong/core/handler.lua:
> @@ -117,14 +119,12 @@ return {
"' with: "..tostring(err))
end
- if balancer_address.hostname and not api.preserve_host then
- var.upstream_host = balancer_address.hostname
+ -- if set `host_header` is the original header to be preserved
+ var.upstream_host = host_header or
nitpick not using else is a more defensive programming practice
var.upstream_host = balancer_address.hostname..":"..balancer_address.port
if host_header then
var.upstream_host = host_header
end
reference: https://github.com/TheLadders/object-calisthenics#rule-2-dont-use-the-else-keyword
—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub, or mute the thread.
|
We want to avoid a double assignment, and take advantage of branch prediction here. Hence, else. Better make it readable than not, at least, is my point.
… On Feb 8, 2017, at 6:41 AM, Colin Hutchinson ***@***.***> wrote:
@hutchic commented on this pull request.
In kong/core/handler.lua:
> @@ -117,14 +119,12 @@ return {
"' with: "..tostring(err))
end
- if balancer_address.hostname and not api.preserve_host then
- var.upstream_host = balancer_address.hostname
+ -- if set `host_header` is the original header to be preserved
+ var.upstream_host = host_header or
nitpick not using else is a more defensive programming practice
var.upstream_host = balancer_address.hostname..":"..balancer_address.port
if host_header then
var.upstream_host = host_header
end
reference: https://github.com/TheLadders/object-calisthenics#rule-2-dont-use-the-else-keyword
—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub, or mute the thread.
|
That said, the case for not using else is made. Not that it wasn't clear before though.
… On Feb 8, 2017, at 9:31 AM, Thibault Charbonnier ***@***.***> wrote:
We want to avoid a double assignment, and take advantage of branch prediction here. Hence, else. Better make it readable than not, at least, is my point.
> On Feb 8, 2017, at 6:41 AM, Colin Hutchinson ***@***.***> wrote:
>
> @hutchic commented on this pull request.
>
> In kong/core/handler.lua:
>
> > @@ -117,14 +119,12 @@ return {
> "' with: "..tostring(err))
> end
>
> - if balancer_address.hostname and not api.preserve_host then
> - var.upstream_host = balancer_address.hostname
> + -- if set `host_header` is the original header to be preserved
> + var.upstream_host = host_header or
> nitpick not using else is a more defensive programming practice
>
> var.upstream_host = balancer_address.hostname..":"..balancer_address.port
> if host_header then
> var.upstream_host = host_header
> end
> reference: https://github.com/TheLadders/object-calisthenics#rule-2-dont-use-the-else-keyword
>
> —
> You are receiving this because you modified the open/close state.
> Reply to this email directly, view it on GitHub, or mute the thread.
>
|
yeah I wasn't sure how applicable OOP calisthenics were. I figure lua has polymorphism so majority of them should still work. but like I mentioned it's a nitpick (or bike shed if you prefer) :) |
If the host header is updated to include the port, then what is the exact desired behaviour?
So if
preserve_host
is set on the api definition, we do not touch the header at all? or do we still inject a default port if it was not in the received host header?If we don't preserve the header, we could simply insert the upstream host and port.