Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gunicorn 19.0: empty remote addr on unix domain socket? #797

Open
jeroenp opened this issue Jun 18, 2014 · 29 comments
Open

gunicorn 19.0: empty remote addr on unix domain socket? #797

jeroenp opened this issue Jun 18, 2014 · 29 comments

Comments

@jeroenp
Copy link

jeroenp commented Jun 18, 2014

Hi,

I've just upgraded gunicorn 18.0 to 19.0. Now REMOTE_ADDR is an empty string, where before it listed the remote ip ("real ip").

Is that an intentional change?

My setup snippets:

gunicorn:

...
bind = 'unix:/var/run/webapp/xyz.sock'

reverse proxy, nginx (1.4.1):

    upstream xyz_server {
        server unix:/var/run/webapp/xyz.sock;
    }
...
    proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header   Host $server_name;
    proxy_redirect off;
    proxy_pass http://xyz_server;

My Django says that request.META['REMOTE_ADDR'] == ""

@benoitc
Copy link
Owner

benoitc commented Jun 20, 2014

Yes. an unix socket has no remote address any more to comply with the CGI/WSGI spec. You will need to check the X-Forwarded headerin your application if django doesn't provide a way.

The relevant commit is this one: c487368

It has been discussed in #633. Let me know if you want to know more.

@benoitc
Copy link
Owner

benoitc commented Jun 27, 2014

@jeroenp ping :)

@junfenglx
Copy link

If not use X-Fowarded-For to check client real IP, how to log client real IP?
As the docs say: access_log_format %(h)s log remote address.
But if gunicorn under Nginx, this "remote address" always is the proxy server address.
add %({x-forwarded-for}i)s to access_log_format for recording client real IP?

I think the description of the %(h)s is a little ambiguous.

my access_log_format: '''%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s" "%({x-forwarded-for}i)s"'''

example output:
127.0.0.1 - - [28/Jun/2014:15:54:59 +0800] "POST /recordpage HTTP/1.0" 200 20 "http://www.xstogether.com/score.php?city=xian&" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1" "61.150.43.97"

@benoitc
Copy link
Owner

benoitc commented Jun 28, 2014

we can probably put the "forwarded ip" in the logs. @davisp @tilgovi thoughts?

@tilgovi
Copy link
Collaborator

tilgovi commented Jun 28, 2014

Sounds fine
On Jun 28, 2014 2:20 AM, "Benoit Chesneau" notifications@github.com wrote:

we can probably put the "forwarded ip" in the logs. @davisp
https://github.com/davisp @tilgovi https://github.com/tilgovi
thoughts?


Reply to this email directly or view it on GitHub
#797 (comment).

@junfenglx
Copy link

Then on logs, We may have two IPs.

@tilgovi
Copy link
Collaborator

tilgovi commented Jun 29, 2014

I took that to mean that we'd add a log substitution variable, and people
could configure the access log format however they like for their use case.
On Jun 28, 2014 6:46 PM, "junfeng" notifications@github.com wrote:

Then on logs, We may have two IPs.


Reply to this email directly or view it on GitHub
#797 (comment).

@junfenglx
Copy link

May be %({x-forwarded-for}i)s -> %(x)s ?
What's the default?
%(h)s or %({x-forwarded-for}i)s ?

@benoitc
Copy link
Owner

benoitc commented Jun 29, 2014

h is already used for the REMOTE_ADDR which now follows the rfc.

Anyway I am not sure we need to add anything. You can already log any header in the request if you want. If you add {HTTP_X_FORWARDED_FOR}i in the log line, it should work.

@jeroenp
Copy link
Author

jeroenp commented Jun 30, 2014

I'm inclined to think that clearing the REMOTE_ADDR field is frivolous and breaks a lot of applications. I don't see any specification for non-TCP connections. My interpretation is that for unix domain sockets you'd still want to pretend that it has a complete set of CGI headers: Everyone trusts that there is a REMOTE_ADDR. An application can only guess that any other unspecified field (XFF, a list no less, or X-REAL-IP) is set.

Django uses REMOTE_ADDR for testing for "Internal IPs" (https://docs.djangoproject.com/en/dev/ref/settings/#internal-ips). Most of my applications trust that the REMOTE_ADDR header is set in some way; Leaving the field empty effectively blocks 19.0 usage for me: I have to chose between auditing/editing various pieces of code (maybe not my own) to use XFF or switch from unix domain sockets to TCP/IP.

@benoitc
Copy link
Owner

benoitc commented Jul 1, 2014

frivolous? No not really. It just happen that django is built from the start to handle tcp connections only.

Anyway the original problem described in #633 is that the proxy address was not found due to the way we set the remote address handling the FORWARDED headers. Also that we didn't respect the spec. And to be honest it would be easier if django would comply with the spec and use the HTTP headers to offer the feature you link. (Maybe opening a ticket there?)

Now, in fact without this change, the application can still discover the remote_address by using the http headers. we could let the forward header pass the raw remote host differently and revert the change in #633.

Any feedback is welcome.

@GrahamDumpleton
Copy link
Contributor

Perhaps wise to start moving away from X-Forwarded-For nomenclature in down stream usages as there is now an RFC for a proper replacement.

http://tools.ietf.org/html/rfc7239

@public
Copy link

public commented Jul 1, 2014

From the looks of the Django source it would be feasible to have a middleware that patched REMOTE_ADDR for you in this situation. No editing of 3rd party code required. Sticking to the WSGI spec seems fine to me.

@benoitc
Copy link
Owner

benoitc commented Jul 1, 2014

@GrahamDumpleton You mean introduces a FORWARDED property?

@benoitc
Copy link
Owner

benoitc commented Jul 5, 2014

cc @davisp @tilgovi @sirkonst @kennethreitz @matrixise @berkerpeksag @fafhrd91 @asvetlov any feedback is welcome ;)

@matrixise
Copy link
Collaborator

I have no ideas on this question :/ sorry, Doesn't really help us, sorry

if we are using unix socket, as explained in #633 we remove REMOTE_PORT, we have to make the same thing with REMOTE_ADDR.

@tilgovi
Copy link
Collaborator

tilgovi commented Jul 8, 2014

Django include django.middleware.http.SetRemoteAddrFromForwardedFor.

I don't think we should be doing anything differently. If you're deploying with a UNIX socket you should be using this middleware because you have a trusted proxy and it should be setting the X-Forwarded-For header (or something else of your choosing).

It's a common enough need that we could consider a flag in gunicorn as an enhancement proposal, but it's also something handled by most frameworks. Django has the middleware, above. Pyramid has request.client_addr. I'm sure other major frameworks have their own common solutions.

@benoitc
Copy link
Owner

benoitc commented Jul 9, 2014

@tilgovi i am also thinking the current behaviour is the good one. If everyone is OK I will close this ticket and release the 19.1.

@matrixise
Copy link
Collaborator

I think we need to keep REMOTE_ADDR to "" and Django has to use a middleware for this behaviour.
With Flask, there is a middleware when you are using a proxy http://werkzeug.pocoo.org/docs/contrib/fixers/#werkzeug.contrib.fixers.ProxyFix

@brentc
Copy link

brentc commented Sep 4, 2014

@tilgovi Just an FYI, as best as I can tell SetRemoteAddrFromForwardedFor is long-gone from Django's middleware module. Since at least 1.4. A quick search doesn't tell me when it was removed.

@collinanderson
Copy link
Contributor

It seems to to me it's the role of the wsgi server, not the application, to account for the situation of being behind a reverse proxy. The application should just be able to use REMOTE_ADDR directly, regardless of how exactly it's being deployed, and not need to implement logic to handle this case.

The 2004 cgi spec says it identifies "the client for the immediate request to the server". Could we interpret server to mean "machine" (or even website) instead of actual "socket"?

@benoitc
Copy link
Owner

benoitc commented Oct 9, 2014

We need to take a decision on this.

To summarise the context, the remote address is right now empty by default when using the unix socket. For the good reason that no tcp address is given, Now most of the time, the application want to fetch the remote address of the client. With the new behaviour it's only possible by parsing the X-Forward* headers at the application level. In short the questions we have to answer are:

  • What should we do when the connection is accepted on a unix socket when gunicorn is behind a proxy like NGINX?
  • What should we do when the connection is accepted on a unix socket coming from a client that connect on this socket directly?
  • In case we are parsing ourself the X-Forward* headers, how do we tell to the application that the connection was accepted using a unix socket? (Which was the reason of that change). Do we need it?

Hopefully we can solve that issue this week. Let me know.

@rbtcollins
Copy link

(coming here via the web-sig discussion). Seems like there are two related issues. For logs that gunicorn itself writes, some mechanism to select the 'real' client address would be useful for server admins. For apps they may need some way to determine the 'real' address too - and if gunicorn has determined that passing it on may be useful. I'm not sure it makes sense to require that all servers have a way to determine the 'real' client address though, so it seems like either a gunicorn extension or we'd have to make it optional.. and optional features are often problematic in terms of interop.

@tilgovi
Copy link
Collaborator

tilgovi commented Oct 27, 2014

Still not sure what to do about this one.

What are the current proposals? Maybe we can chat soon in real time on IRC
and make a decision.
On Oct 13, 2014 9:50 PM, "rbtcollins" notifications@github.com wrote:

(coming here via the web-sig discussion). Seems like there are two related
issues. For logs that gunicorn itself writes, some mechanism to select the
'real' client address would be useful for server admins. For apps they may
need some way to determine the 'real' address too - and if gunicorn has
determined that passing it on may be useful. I'm not sure it makes sense to
require that all servers have a way to determine the 'real' client address
though, so it seems like either a gunicorn extension or we'd have to make
it optional.. and optional features are often problematic in terms of
interop.


Reply to this email directly or view it on GitHub
#797 (comment).

@GrahamDumpleton
Copy link
Contributor

See also discussion at:

Without a reliable way of dealing with X-Fowarded and Forwarded headers which can be configured to say what front end clients you trust to pass the information and so be able to set REMOTE_ADDR properly, you are pretty much screwed.

@tilgovi
Copy link
Collaborator

tilgovi commented Oct 28, 2014

After reading the discussion at python-web-sig/wsgi-ng#11 I think my feeling is that the current behavior is not incorrect or bad. There is clear prior art for frameworks and middleware tolerating a None value for REMOTE_ADDR. There is clear danger in providing 127.0.0.1.

On the other hand, we have a forwarded-allow-ips setting. It is 127.0.0.1 by default. At this point we only use it for wsgi.url_scheme, but I would support a proposal to rewrite REMOTE_ADDR to X-Forward-For if and only if forwarded-allow-ips is * and the REMOTE_ADDR is None.

@tilgovi
Copy link
Collaborator

tilgovi commented Oct 28, 2014

My reasoning is that if a user knows enough to set the forwarded-allow-ips setting the documentation can make clear to them the implications and they should be able to understand why they're doing it and what the risks are.

@benoitc
Copy link
Owner

benoitc commented Nov 23, 2014

I will do the change proposal from @tilgovi asap.

@benoitc
Copy link
Owner

benoitc commented Jan 30, 2015

It appeared that we also removed the code handling the X-Forwarded-Header so I am postponing to the next release. Also see my comment in python-web-sig/wsgi-ng#11 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

No branches or pull requests

10 participants