Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wsgi: add peer UNIX credential to a non-standard environment variable #37

Merged
merged 20 commits into from Apr 10, 2018
Merged

wsgi: add peer UNIX credential to a non-standard environment variable #37

merged 20 commits into from Apr 10, 2018

Conversation

bazsi
Copy link
Contributor

@bazsi bazsi commented Jun 25, 2017

Signed-off-by: Balazs Scheidler balazs.scheidler@balabit.com

  • What kind of change does this PR introduce? (Bug fix, feature, docs update, ...)

It adds the UNIX UID of a unix domain socket that connects to our HTTP service over socket.

  • What is the related issue number (starting with #)

none

  • What is the current behavior? (You can also link to an open issue here)

peer credentials are not available to apps even if they bind to a unix domain socket

  • What is the new behavior (if this is a feature change)?

I've added a non-standard _REMOTE_UID value to environ, that is available to apps. I've originally did this patch to an older version of cherrypy and this is a forward port.

  • Other information:

@bazsi
Copy link
Contributor Author

bazsi commented Jun 25, 2017

I've attempted to fix automatic tests, but the ones in circle ci seems to be an infrastructure problem. The rest of the tests were green.

I'd be really grateful about the feature, especially the naming of the environment variable. I'd like to deploy something like this in production, and I'd like to use a name that will match the 'official' one.

thanks.

@bazsi
Copy link
Contributor Author

bazsi commented Jun 27, 2017

Any comments?

@webknjaz
Copy link
Member

Oh, sorry @bazsi.. I was busy and didn't check notifications during past few days.

Nevermind about CircleCI, it's not configured yet.
Regarding your change, it's on my list to check it. I'll do this ASAP once I have some free time.

cheroot/wsgi.py Outdated
@@ -276,6 +278,11 @@ def get_environ(self):

return env

def _get_peer_uid(self):
creds = self.req.server.socket.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize('3i'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 831ec2d

cheroot/wsgi.py Outdated
@@ -276,6 +278,11 @@ def get_environ(self):

return env

def _get_peer_uid(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't obvious what this function purpose is, so it's a good practice to leave comments + docstring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also done in 831ec2d

I would be squashing that into the base commit as review is complete.

cheroot/wsgi.py Outdated
@@ -254,6 +255,7 @@ def get_environ(self):
# AF_UNIX. This isn't really allowed by WSGI, which doesn't
# address unix domain sockets. But it's better than nothing.
env['SERVER_PORT'] = ''
env['_REMOTE_UID'] = self._get_peer_uid()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this as an environment variable?

cheroot/wsgi.py Outdated
@@ -276,6 +278,11 @@ def get_environ(self):

return env

def _get_peer_uid(self):
creds = self.req.server.socket.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize('3i'))
Copy link
Member

@webknjaz webknjaz Jun 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be moved to the request or connection entity?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I felt this was a lot more connected to REMOTE_ADDR and the likes, as I've felt it is a property of the connection rather than the request itself.

In our internal HTTP based service, we are using the environment to populate our internal data structures and by that time the connection itself is not available.

If you meant that _REMOTE_UID stays in the environment, but the implementation itself moves to HTTPConnection, that could make sense. Is that what you are thinking?

Thanks for the feedback.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please share your snippet where you cannot access HTTPConnection and still can access env vars? I'm really not sure whether all other users need this as it adds extra syscall overhead.

@bazsi
Copy link
Contributor Author

bazsi commented Jun 28, 2017

I have now pushed a couple of fixups to address your notes. If you are ok with them, I would rebase the branch and squash them to the same patch.

cheroot/wsgi.py Outdated
@@ -26,8 +26,9 @@ def my_crazy_app(environ, start_response):
"""

import sys

import struct
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unused in this module

cheroot/wsgi.py Outdated
@@ -254,6 +255,9 @@ def get_environ(self):
# AF_UNIX. This isn't really allowed by WSGI, which doesn't
# address unix domain sockets. But it's better than nothing.
env['SERVER_PORT'] = ''
uid = self.req.conn.get_peer_uid()
if uid is not None:
env['_REMOTE_UID'] = str(uid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any other WSGI implementations providing similar env var?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've found a number of WSGI/HTTP servers/libs that support unix domain sockets:

With that said, the only HTTP over UNIX domain socket implementation, that I've found to also support SO_PEERCRED based authentication is cups, which has its embedded http implementation (in C, brr) and does not use CGI like environment variables.

I've also looked up rfc3875, the original specification of CGI, which seems to be the origin for these environment variables: http://www.ietf.org/rfc/rfc3875. In the RFC, there's a standard REMOTE_USER environment variable, which seems to be what we aim for, but would require username lookup instead of just passing on the UID. username resolution could be expensive, especially as it would need to be done on every request.

That same RFC, also specifies that non-standard extensions should be prefixed with an X_.

With all this said, I think:

  • unix domain based authentication makes sense, there is at least one implementation, and a use-case that matches mine.
  • environment wise, we have three options:
    1. use X_REMOTE_UID and pass the uid only
    2. resolve the uid to username (with caching?), and reuse the standard REMOTE_USER attribute
    3. disregard the CGI origin, and use something different.

I've also found this document about WSGI: http://wsgi.readthedocs.io/en/latest/specifications/simple_authentication.html. This says, that the username needs to go to REMOTE_USER, thus

My opinion and the above spec point to option number 2) above.

If you agree, I can change my patches this way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This shouldn't be a default behavior to avoid overhead for other users who may not want such logic to be triggered, so I would prefer that the calling logic go to CherryPy tool, which would be easy to turn on or switch off easily via config option.
  2. It would be enough to provide @properties, which evaluate creads lookup lazily, once per request as I commented earlier.
  3. It would be ok to have some function/method populating all of REMOTE_USER, X_REMOTE_UID, X_REMOTE_GID, X_REMOTE_PID or just some of them as configured.

Copy link
Contributor Author

@bazsi bazsi Jul 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm.. cherrypy does not really have access to HTTPConnection objects, so I don't see how that would go back to the socket to do SO_PEERCRED. This is how it looks like (obviously I don't have intimate knowledge of these parts, I just did a quick browse of the related source code):

  • cherrypy instantiates a cheroot.wsgi.Server instance (in ServerAdapter). wsgi.Server is derived from the "native" HTTP server, cheroot.server.HTTPServer
  • as a connection arrives, a HTTPConnection is instantiated to encapsulate the connection specific socket
  • we parse the incoming request, and a cheroot specific HTTPRequest is created, to encapsulate the low-level HTTP details. Here we still have a reference to the connection/server objects.
  • then a cheroot.wsgi.Gateway is created, one that takes the low-level HTTPRequest, and populates a WSGI environment, and passes it to the the WSGI application via a __call__ method.
  • The environment is the only input to the WSGI app, even headers are translated into and from environment name-value pairs. (with ugly translations like Accept-Language becoming HTTP_ACCEPT_LANGUAGE)

There seems to be an intention to decouple the HTTP specific pieces and the app side, and the means to exchange information is the environment itself. It is thus not possible to go back from the app to HTTPConnection & HTTPRequest where the information is available.

The Native interface is one abstraction short (e.g. environment is not created and HTTPRequest is directly translated to cherrypy request objects), but that is not the default how cherrypy operates. That interface does not support authentication (e.g. request.login is not populated at all).

I could inject the socket into the environment (with a key like "wsgi.socket" to match other attributes), but my opinion is that this would be ugly and would derail the entire idea to decouple apps from HTTP specifics.

Short of adding this ugly hack to environment, we need to do the authentication on the cheroot side and pass the information via environment variables.

Still, REMOTE_USER looks to be the best solution, as that would work out of the box with cherrypy (e.g. request.login is filled automatically).

I could make the username resolution stuff optional by adding an additional constructor parameter to cheroot.wsgi.Server that could be driven by a cherrypy config variable (e.g. server.socket_file_auth True/False).

I can also amortize the performance impact by caching the username lookup, because in most cases, it would only be a very limited number of users that would use the interface.

I can see these courses of action at this point if we are to use REMOTE_USER:

  1. additional parameter w/o caching
  2. additional parameter w/ caching
  3. no additional parameter w/ caching

OR we can use X_REMOTE_UID, GID and PID as you suggested, in which case username resolution is not needed at all, so performance impact should be small. But in that case cherrypy.request.login will not be set automatically.

What do you think?

cheroot/wsgi.py Outdated
import six
import socket
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused as well

"""

if self.socket.family != socket.AF_UNIX:
return None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer raising exception rather than returning "magic values"

pid, uid, gid = struct.unpack('3i', creds)
return uid
except socket.error:
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, don't swallow exception. Log it and re-raise appropriate one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually wanted to ignore errors and in those cases not add this to the environment. It seems to be a Linux specific feature. I wouldn't want UNIX domain sockets not to work if this is not available.

perhaps a one-time warning would suffice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, warning would be ok (mabe even NOTICE). But I still prefer raising exception instead of returning None.

@@ -1155,6 +1156,28 @@ def close(self):
# Apache does, but not today.
pass

def get_peer_uid(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer a method returning the whole tuple of creds and 3 @property methods returning each of 'em.

# NOTE: the value for SO_PEERCRED can be architecture specific, in
# which case the getsockopt() will hopefully fail. The arch
# specific value could be derived from platform.processor()
SO_PEERCRED = getattr(socket, 'SO_PEERCRED', 17)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add this as a monkey-patch next to similar win-specific code @ line 80 to be consistent

@webknjaz
Copy link
Member

webknjaz commented Jun 28, 2017

@bazsi it would be very helpful to have tests and documentation for this feature as well + fix linter errors failing in Travis CI

@bazsi
Copy link
Contributor Author

bazsi commented Jul 1, 2017

I'll do another iteration and push the updated patches.

@bazsi
Copy link
Contributor Author

bazsi commented Jul 15, 2017

I still have not progressed with this. coming back once I have a timeslice.

@webknjaz
Copy link
Member

No worries. I'll be waiting for updates from your side and maybe even push a few commits to your branch if I have some free time for this.

@stale stale bot added the stale This thing has been ignored for too long label Sep 13, 2017
@stale
Copy link

stale bot commented Sep 13, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot removed the stale This thing has been ignored for too long label Sep 14, 2017
@webknjaz
Copy link
Member

@bazsi I've rebased your branch on top of current master

@webknjaz
Copy link
Member

@bazsi I've added bits of refactoring, but it's not considered as complete yet.

I'd like to address:

@webknjaz
Copy link
Member

This pull request introduces 1 alert - view on lgtm.com

new alerts:

  • 1 for Unnecessary pass

Comment posted by lgtm.com

@codecov
Copy link

codecov bot commented Sep 14, 2017

Codecov Report

Merging #37 into master will decrease coverage by 0.44%.
The diff coverage is 41.58%.

@@            Coverage Diff             @@
##           master      #37      +/-   ##
==========================================
- Coverage   66.42%   65.97%   -0.45%     
==========================================
  Files          17       17              
  Lines        2847     2942      +95     
==========================================
+ Hits         1891     1941      +50     
- Misses        956     1001      +45

@stale
Copy link

stale bot commented Nov 13, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale This thing has been ignored for too long label Nov 13, 2017
@stale stale bot closed this Nov 20, 2017
@webknjaz
Copy link
Member

I still hope to find some time for this eventionally.

@stale stale bot removed the stale This thing has been ignored for too long label Mar 26, 2018
@webknjaz
Copy link
Member

Hi @bazsi!

Sorry for being silent for so long. I think I came up with a proper solution now.
I've reconsidered feature switching problem a bit: now I think it should be server-wide. This way it's easy to make it switchable in CherryPy as well (I've tested it). Also, it doesn't make sense on per-request basis or per-URI, because it operates on the lower level and has nothing to do with those.

It's bypassed through an adapter in CherryPy to Server instance in cheroot and then it pre-populates it into connections. WSGI env tries to get this data from connection and it succeeds when enabled and supported by the runtime.

Option in CherryPy would be server.peercreds_enabled in global section. I'm also thinking about additional option to enable resolving PID/GID from this auth data into string names.

Do you mind taking another round of testing and finish this with me?

@webknjaz
Copy link
Member

webknjaz commented Apr 8, 2018

@bazsi I think, PoC is ready.

Additionally to #37 (review) I'm thinking that we could also looking at REMOTE_IDENT (RFC 3875 section 4.1.10., RFC 1413)

I'm almost ready to merge this, but it would be great to see some tests around it.

balabit-sync pushed a commit to balabit-deps/balabit-os-6-python3-cherrypy8 that referenced this pull request Apr 9, 2018
This patch enables SO_PEERCRED based authentication into the NNX REST
server, by adding the _REMOTE_UID to the environment, which then
can be used to base authentication on.

A similar patch was submitted to CherryPy upstream, here:

cherrypy/cheroot#37

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
@webknjaz
Copy link
Member

@bazsi
Our test client doesn't support making HTTP requests over UNIX socket, so I'll have to postpone covering this with tests and doing any other modifications in the code base withing this PR.
I'm going to merge and release it now and expect that we could polish/work it out with bugfix releases if necessary.
Feel free to send new PRs with adding tests/fixing test client to support unix sockets etc.
I'm going to wire up CherryPy config options to this feature a bit earlier.

@webknjaz webknjaz merged commit b2f1921 into cherrypy:master Apr 10, 2018
webknjaz added a commit that referenced this pull request Apr 10, 2018
PR #37 by @bazsi:

Implement PEERCRED lookup over UNIX-socket HTTP connection.

  * Discover connected process' PID/UID/GID

  * Respect server switches: ``peercreds_enabled`` and
    ``peercreds_resolve_enabled``

  * ``get_peer_creds`` and ``resolve_peer_creds``  methods on connection

  * ``peer_pid``, ``peer_uid``, ``peer_gid``, ``peer_user`` and ``peer_group``
    properties on connection

  * ``X_REMOTE_PID``, ``X_REMOTE_UID``, ``X_REMOTE_GID``, ``X_REMOTE_USER``
    (``REMOTE_USER``) and ``X_REMOTE_GROUP`` WSGI environment variables when
    enabled and supported

  * Per-connection caching to reduce lookup cost
@webknjaz
Copy link
Member

New minor version v6.2.0 is being released: https://travis-ci.org/cherrypy/cheroot/builds/364381847 (ETA: 30 min)

balabit-sync pushed a commit to balabit-deps/balabit-os-7-cherrypy3 that referenced this pull request Dec 1, 2020
This patch enables SO_PEERCRED based authentication into the NNX REST
server, by adding the X_REMOTE_UID to the environment, which then
can be used to base authentication on.

A similar patch was submitted to CherryPy upstream, here:

cherrypy/cheroot#37

The original version of this patch came from our
balabit-os-7.0/python3-cherrypy8 package.
The corresponding commit ids are:
* c8e7fb9d998391e39778f39707e0e3f0819cec02,
* cd022ff48d2abcb999d890eef4347b13d943eabd
balabit-sync pushed a commit to balabit-deps/balabit-os-8-cherrypy3 that referenced this pull request Nov 14, 2022
This patch enables SO_PEERCRED based authentication into the NNX REST
server, by adding the X_REMOTE_UID to the environment, which then
can be used to base authentication on.

A similar patch was submitted to CherryPy upstream, here:

cherrypy/cheroot#37

The original version of this patch came from our
balabit-os-7.0/python3-cherrypy8 package.
The corresponding commit ids are:
* c8e7fb9d998391e39778f39707e0e3f0819cec02,
* cd022ff48d2abcb999d890eef4347b13d943eabd
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants