Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Use Forwarded/X-Forwarded-{For,Host,By,Port,Proto} to fixup WSGI environ #209

Merged
merged 18 commits into from Dec 3, 2018

Conversation

digitalresistor
Copy link
Member

@digitalresistor digitalresistor commented Sep 7, 2018

If we are in a trusted proxy situation whereby waitress is running behind nginx/haproxy/any other reverse proxy (whose IP matches what is configured, or if the socket we receive the connection on is a unix socket) then we want to pull a variety of information from the X-Forwarded-* headers or Forwarded which has superseded it.

The algorithm will pull X-Forwarded-* headers, unless there is a Forwarded header present and trusted_proxy_headers is set to just forwarded, in which case it takes priority.

By default we will use the proxy Forwarded header to set the following:

  • wsgi.url_scheme (Proto)
  • SERVER_NAME (Using Host param)
  • SERVER_PORT (Using Proto param to set a default, overriden by Host param)
  • HTTP_HOST (Using Host param, adding the SERVER_PORT as necessary)
  • REMOTE_ADDR (Using For param, first For is used as it is client IP address)
  • REMOTE_PORT (Using For param, as above, if a port is available)

This off course also means that Waitress now supports https://tools.ietf.org/html/rfc7239 's Forwarded header! Yay for being standards compliant.

Waitress also now knows how to clean up the headers and thus the WSGI environment, so that untrusted headers are no longer passed through to downstream WSGI applications using clear_untrusted_proxy_headers.

Closes #207

@digitalresistor digitalresistor force-pushed the bugfix/socket-server-name-port branch 3 times, most recently from cd161dc to a5f205e Compare September 23, 2018 23:33
waitress/task.py Outdated Show resolved Hide resolved
@mmerickel
Copy link
Member

mmerickel commented Sep 24, 2018

What is the thought process behind the trusted_proxy_forwarded setting? I don't think it's clearly addressed anywhere in the diff so I want to ask. It looks like you're using this flag (false by default) as an opportunity to strip all X-Forwarded-* headers which is good. It also looks like when it's false, the headers other than Forwarded are always preserved even when the trusted proxy does not match. I know you talked in the past about stripping these headers using a clear_untrusted_proxy_headers setting which would clear them if the trusted proxy did not match. Did you decide not to do this, or are you saving it for later? Right now, it seems like the only way to trust those headers is if we use trusted_proxy_forwarded = yes but that requires our upstream proxy to support the forwarded spec instead of the more common X-Forwarded headers headers. To me, the ideal here, would be to always strip any proxy headers if the trusted_proxy setting is set and doesn't match but we need a migration path to get to that point and clear_untrusted_proxy_headers defaulting to false would get us there, allowing a switch to true as the default in the future but overridable to true now, unrelated to which proxy headers our proxy is using.

tl;dr; I think there should be a clear_untrusted_proxy_headers setting defaulting to false, with a possible warning if the default value is used and in a future version, the default should be set to true and the warning removed.

Second point of note - you decided to always trust proxy headers from a unix socket, I'm not sure this is a good idea as it's still not really clear where those headers came from. You'd want to at least document this very clearly.

@digitalresistor
Copy link
Member Author

digitalresistor commented Sep 24, 2018

What is the thought process behind the trusted_proxy_forwarded setting?

I had originally planned to say that if the Forwarded header exists, then we automatically use it over the X-Forwarded-{blah} however I realised that almost no proxies at the moment natively support Forwarded, and most don't strip the header either. This means that if you were behind nginx, and a client sent Forwarded: something and nginx only sets X-Forwarded-{For,Proto} then the client would dictate what waitress would see.

I wanted some option that would allow you to turn off processing of Forwarded when it is not expected.

It also looks like when it's false, the headers other than Forwarded are always preserved even when the trusted proxy does not match.

I haven't added the security/safety feature yet. trusted_proxy has already been verified at this point (before calling this function), so yes, we preserve the old style headers and use them... I am not sure what you were mentioning here.

Did you decide not to do this, or are you saving it for later?

I haven't added it to this PR.

Right now, it seems like the only way to trust those headers is if we use trusted_proxy_forwarded = yes but that requires our upstream proxy to support the forwarded spec instead of the more common X-Forwarded headers headers.

This is false. If you set trusted_proxy it will use the X-Forwarded-* headers. If you set trusted_proxy with trusted_proxy_forwarded then it will use the Forwarded header.

To me, the ideal here, would be to always strip any proxy headers if the trusted_proxy setting is set and doesn't match but we need a migration path to get to that point and clear_untrusted_proxy_headers defaulting to false would get us there, allowing a switch to true as the default in the future but overridable to true now, unrelated to which proxy headers our proxy is using.

You are talking about stuff I haven't implemented in this PR. Currently it only strips the old style headers if you are explicitly trusting Forwarded. You can't have both in the environment. Forwarded contains richer and better information. Having the old style headers live alongside the new style means you potentially have issues downstream if downstream tries to use the X-Forwarded-For for example and waitress made decisions based upon Forwarded.

tl;dr; I think there should be a clear_untrusted_proxy_headers setting defaulting to false, with a possible warning if the default value is used and in a future version, the default should be set to true and the warning removed.

Once again, this is not yet implemented...

Second point of note - you decided to always trust proxy headers from a unix socket, I'm not sure this is a good idea as it's still not really clear where those headers came from. You'd want to at least document this very clearly.

I figured there would be some contention on that... BUT the reason why this was done is because with the unix domain socket we don't get valid REMOTE_ADDR/REMOTE_PORT information by default, nor do we have a valid SERVER_NAME/SERVER_PORT (which is what prompted me to re-visit the proxy header stuff in the first place, because Tres and I disagreed on the defaults for SERVER_NAME in that case).

This technically means that when using a unix domain socket, we can not provide the requisite information required for a valid WSGI environ. Unfortunately PEP3333 provides 0 guidance in this case.

@mmerickel
Copy link
Member

This is false. If you set trusted_proxy it will use the X-Forwarded-* headers. If you set trusted_proxy with trusted_proxy_forwarded then it will use the Forwarded header.

Once again, this is not yet implemented...

Ok, I wanted to bring it up even if it isn't a goal to do it in this PR. I think my comment is not wrong, which is why it would be good to add the feature, even if it's not done in this PR. To rephrase slightly - as an app you can't trust the headers unless trusted_proxy_forwarded is true because waitress will not strip them from the request (currently) when trusted_proxy doesn't match or isn't set. Other middleware like pastedeploy's prefix middleware will use those headers right now and so ideally if they are untrusted they will be stripped so that the middleware doesn't see them and use them. I'm pretty sure we agree - and it doesn't need to be done in this PR.

This technically means that when using a unix domain socket, we can not provide the requisite information required for a valid WSGI environ. Unfortunately PEP3333 provides 0 guidance in this case.

Does this mean that some sort of extra support should be added to the trusted_proxy setting? Something like trusted_proxy = unix://... on top of trusted_proxy = ip and trusted_proxy = *? If that were done then it'd given the user more control over which interfaces are trusted.

@mmerickel
Copy link
Member

I think my comment is not wrong, which is why it would be good to add the feature, even if it's not done in this PR. To rephrase slightly - as an app you can't trust the headers unless trusted_proxy_forwarded is true because waitress will not strip them from the request (currently) when trusted_proxy doesn't match or isn't set.

Ok, this was wrong. Basically you just can't ever trust the headers as an app right now because there is no indication anywhere whether trusted_proxy matched and if it didn't match the headers are preserved. This, we both agree, should be fixed in a separate PR via some sort of clear_untrusted_proxy_headers setting.

@digitalresistor
Copy link
Member Author

Does this mean that some sort of extra support should be added to the trusted_proxy setting?

Due to the way that waitress is architected (and before this PR), currently the trusted_proxy setting for trusting the unix socket is "localhost". Whereas for TCP/IP it is always an IP address.

@digitalresistor
Copy link
Member Author

Also, waitress does not support both unix socket AND tcp/ip in the same process.

@mmerickel
Copy link
Member

Also, waitress does not support both unix socket AND tcp/ip in the same process.

If there are plans to keep it that way then fine. Otherwise it'd probably be smart to think about the concern further. My understanding of the listen = ... feature was that maybe they could be done in the same process, but I guess the way unix sockets are handled right now is at a higher level (between server impls, versus inside the tcp server). In that case, it could be worth raising if those settings are used with the unix server or making the adjustments depend on which server impl is being used.

@digitalresistor
Copy link
Member Author

While technically possible, I don't think it is a good idea to mix unix socket/tcp in the same process. We'd want to have different rules depending on where we received a connection, and things would get complicated quickly.

If you are behind a proxy with tcp/ip you tend to have a single host/port combination you bind to, and only allow the proxy to connect to waitress (generally by binding to 127.0.0.1), with a unix socket implementation that extends to binding to the local file system (so you can't have your proxy on a separate host from your waitress instance).

Either way the configuration applies to all instances, there is no way to say "this port should accept proxy headers" and "this one shouldn't".

I don't think adding that is worthwhile either. Having either tcp or unix socket makes sense to me, and is the way it has been for a long time.

@mmerickel
Copy link
Member

Ok. My primary complaint right now is that there's not a way to opt-out of trusting headers when using a unix socket. It really feels like they should default to untrusted and just handle trusted_proxy = yes or * or something to enable it instead of the way it's being done right now.

@tseaver
Copy link
Member

tseaver commented Sep 25, 2018

@mmerickel The thing about waitress listening a Unix domain socket is that it is always behind a proxy, and one running on the same host -- one can't "expose" it inadvertently. In the typical setup, it is even going to be accessible only to a single user account on that host -- the same one as is running waitress. Hard to think of a case where it shouldn't be trusted.

@mmerickel
Copy link
Member

The thing about waitress listening a Unix domain socket is that it is always behind a proxy, and one running on the same host -- one can't "expose" it inadvertently. In the typical setup, it is even going to be accessible only to a single user account on that host -- the same one as is running waitress. Hard to think of a case where it shouldn't be trusted.

@tseaver trusted here is about certain http headers coming in from the proxy. In my experience basically every proxy doesn’t properly strip / validate these headers and thus should count as untrusted until the user tells us via trusted_proxy that they have configured that proxy to validate / control those headers.

@mmerickel
Copy link
Member

mmerickel commented Sep 25, 2018

If you want a precedent, nginx requires you to opt-in to this type of stuff even for unix sockets. Note that set_real_ip_from supports a unix: option, which is analogous to our trusted_proxy setting. https://nginx.org/en/docs/http/ngx_http_realip_module.html

@digitalresistor
Copy link
Member Author

digitalresistor commented Sep 25, 2018

Not sure that using an optional module that a user has to explicitly compile into NGINX in a server that has defaults for anything is really precedent. Starting waitress without any config will start a server on a certain port, starting nginx with an empty config file will do nothing (it may even error). Also, nginx doesn't require certain variables to exist in an "environ" before passing the request off to something else.

That being said, I understand the concern. User education is going to be key. I just don't know how I can make sure that waitress creates a WSGI environ that isn't one big giant lie when receiving a connection over a unix socket.

  • REMOTE_ADDR will be wrong (localhost)
  • wsgi.url_scheme will be http/https (depends on config)
  • SERVER_NAME will be localhost
  • SERVER_PORT will be None
  • HTTP_HOST will be whatever the proxy forwarded on, a changed value, or unset

Using the url reconstruction specified in PEP 3333 you cannot generate a valid URL from this to re-create the original URL the client sent.

@mmerickel
Copy link
Member

I'm really only concerned here about removing the trusted_proxy key from the server object and requiring the user to set trusted_proxy = yes or unix or something in their config if they want waitress to start using http headers to fill in parts of the environ. The realip example is a precedent for not trusting, by default, the http headers coming over the wire from a unix socket.

@mmerickel
Copy link
Member

mmerickel commented Sep 26, 2018

Another precedent if you want me to keep compiling a list for something that seems fairly uncontroversial to me is express.js where they require you to set app.set('trust proxy', true) for it to use the headers [1]. I understand the concern that if you don't trust the upstream proxy you have trouble constructing a valid wsgi environ but I just can't justify in my head that it's ok to trust http headers without explicitly opting into it.

[1] https://expressjs.com/en/guide/behind-proxies.html

@digitalresistor
Copy link
Member Author

ExpressJS doesn't support unix sockets... I didn't set the default on non-unix sockets to trust it either... but whatever. I really don't care.

You can set trusted_proxy to localhost when using a unix socket and it'll have the right behavior.

@mmerickel
Copy link
Member

Express does support Unix sockets. A value of localhost sounds ok to me.

@digitalresistor digitalresistor force-pushed the bugfix/socket-server-name-port branch 3 times, most recently from b556fca to a3c3bef Compare December 2, 2018 12:01
@digitalresistor
Copy link
Member Author

@mmerickel this is ready for another review of the code. I need to still fix/write up the documentation for the changes.

When facing the possibility of a proxy that is sending wacky values
docs/reverse-proxy.rst Outdated Show resolved Hide resolved
docs/reverse-proxy.rst Outdated Show resolved Hide resolved
docs/reverse-proxy.rst Outdated Show resolved Hide resolved
CHANGES.txt Outdated Show resolved Hide resolved
docs/arguments.rst Outdated Show resolved Hide resolved
docs/arguments.rst Outdated Show resolved Hide resolved
docs/arguments.rst Outdated Show resolved Hide resolved
docs/arguments.rst Outdated Show resolved Hide resolved
docs/arguments.rst Outdated Show resolved Hide resolved
docs/arguments.rst Outdated Show resolved Hide resolved
docs/arguments.rst Outdated Show resolved Hide resolved
docs/arguments.rst Outdated Show resolved Hide resolved
@digitalresistor digitalresistor merged commit 8cfacc1 into master Dec 3, 2018
@digitalresistor digitalresistor deleted the bugfix/socket-server-name-port branch December 3, 2018 05:50
- log_untrusted_proxy_headers (useful for debugging)

Be aware that the defaults for these are currently backwards compatible with
older versions of Waitress, this will change in a future release of waitress.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waitress. This ... Waitress.


Be aware that the defaults for these are currently backwards compatible with
older versions of Waitress, this will change in a future release of waitress.
If you expect to need this behaviour please explicitly set these variables in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need this behavior, please...

docs/arguments.rst Outdated Show resolved Hide resolved
default ``None``

trusted_proxy_count
How many proxies we trust when chained
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of trusted proxies when chained.

docs/arguments.rst Outdated Show resolved Hide resolved
``Forwarded`` supports similar functionality as the different individual
headers, and is mutually exclusive to using the ``X-Forwarded-*`` headers.

To configure waitress to use the ``Forwarded`` header, set::
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/::/: and please specify lexer for correct syntax highlighting.

raise ValueError(
"The values trusted_proxy_headers and clear_untrusted_proxy_headers "
"have no meaning without setting trusted_proxy. Cowardly refusing to "
"continue."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest something like this:

trusted_proxy must be set when trusted_proxy_headers and clear_untrusted_proxy_headers are set.

raise ValueError(
"The Forwarded proxy header and the "
"X-Forwarded-{By,Host,Proto,Port,For} headers are mutually "
"exclusive. Can't trust both!"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Can't trust both!/You must set zero or one of these headers, and not both.

warnings.warn(
'No proxy headers were marked as trusted, but trusted_proxy was set. '
'Implicitly trusting X-Forwarded-Proto for backwards compatibility. '
'This will be removed in future versions of waitress.',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waitress


def warn_unspecified_behavior(header):
self.logger.warning(
"Found multiple values in %s, this has unspecified behaviour. "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%s. This has unspecified behavior.

@stevepiercy
Copy link
Member

Aw, crap! Merged while I was reviewing.

@digitalresistor
Copy link
Member Author

@stevepiercy feel free to open a new PR. The waitress documentation is meh all around though, so there's a lot to do as you already noticed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants