Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add create user permission and update POST /api/users authz #5288

Merged
merged 5 commits into from
Sep 27, 2018

Conversation

lyzadanger
Copy link
Contributor

This PR removes the view-level auth* logic for authClients on the POST /api/users endpoint and replaces it with a permission view config. To accomplish this, a new permission ('create') has been added to h.traversal.roots.UserRoot and UserRoot has been set as the route factory for that endpoint. The permission is assigned to authenticated AuthClients.

Also in this PR, the route whitelisting for AuthClients is updated and, I think, better. This is in its own commit; if it's too much for this PR I can break it out, though it'd be convenient to ship it together.

Note: After this is merged, POST /api/users will temporarily return a 404 instead of a 403 on auth fail. This is because of our squashing of all 403s to 404s in our view exceptions. 403s happened before because of the custom/legacy view-level logic that was doing the auth before it was normalized. Now that we're using "real" Pyramid authn and authz, the error views are getting used and re-coding all 403s to 404s. Fixing this is very much on my list

AUTH_TOKEN_PATH_PATTERN = '^\/api\/groups'
#: List of route name-method combinations that should
#: allow AuthClient authentication
AUTH_CLIENT_API_WHITELIST = [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope this is both easier to understand as a human and less error-prone than the earlier approach.

@@ -352,14 +353,19 @@ def _is_client_request(request):
"""
Is client_auth authentication valid for the given request?

For initial rollout, authentication should be performed by
Uuthentication should be performed by
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, yeah, I see the typo. I'll fix it in my next round of fixups for this branch.

config.add_route('api.users', '/api/users')
config.add_route('api.users',
'/api/users',
factory='h.traversal.UserRoot')
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the first time we've used a non-default factory on a route without using traversal. Seems to work fine. Any red flags?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. Technically we are actually using traversal here (traversal is always on in Pyramid). It's just that the traversal tree is more of a traversal stump.

What happens is that a UserRoot object is instantiated. Or more specifically in Pyramid's terminology the traversal factory function, which is UserRoot.__init__(), is called and it returns the root resource of the traversal tree (a UserRoot instance). And since there's no traversal path to follow, that's the end of traversal: UserRoot.__getitem__(self, username) is never called, and the UserRoot instance itself is used as the context. This contradicts the naming scheme where roots.py contains *Root classes that're used as arguments to factory= (plus the default root), and contexts.py contains *Context classes that're returned by roots and used as contexts. You've now got a *Root class being used as a context as well in some cases but not others. But you just have to remember that a root can also be a context (but a context can't be a root) so it's probably fine. This does lead me to think our traversal classes organization may need some evolving later though, see below.

The one thing to look out for is that UserRoot is also used by the activity.user_search route (and in that case it does have a traversal path of "/{username}", __getitem__(self, username) is called and returns a models.User as the context). So I think the create permission that you've added may also apply to the user activity page. Since auth clients can't view that page though, it won't ever be applied. So I think it's fine.

And I guess you've changed the semantics of the UserRoot class: previously it was only ever a factory/root resource and always returned a User as the context, was never used as a context itself. Now it is sometimes used as the context resource itself. But I don't see a problem there necessarily.


As an aside, I'm starting to doubt the idea of separating things into traversal/roots.py and traversal/contexts.py. A "root resource" in Pyramid is a root node of a traversal tree, and a "context resource" is a leaf node of the tree. Normally you'd have a traversal path such as "/{username}" so Pyramid would call the root resource's __getitem__({username}) which would return the context resource. But in the case of using factory without traverse, as in this PR, the root resource object ends up doubling as the context resource object as well. And UserRoot is also the "factory" too, or at least the UserRoot class (its __init__() method) is the factory.

So trying to separate things out into factories, roots and contexts may not make much sense, as a single class can be all three at once, or can be different things at different times. The UserRoot class could just as well be called UserContext or UserFactory. Calling it any one of those three names is a bit misleading. It's actually a UserFactoryRootContext. Or just a UserResource.

This all leads me to ask whether we should just refer to them all as "resources":

h/traversal/
  user.py  <-- Traversal resources to do with users
  annotation.py <-- Resources to do with annotations
  api/
    # Maybe separate resources for the API
  ...

And instead of UserRoot and UserContext, user.py would just contain a Users class and a User class.

In the case of /api/users the Users class is used as factory, root and context (add_route(factory=traversal.Users, ...) with no traverse= argument).

In the case of /api/users/<USER_ID> the Users class is used as the factory and root, but its __getitem__ is called and it returns a traversal.User as context (add_route(factory=traversal.Users, traverse="/{username}", ...)).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all leads me to ask whether we should just refer to them all as "resources":

As I was doing this work, that started feeling like the intuitive path, for sure. It felt also like if we keep on with our current model, the roots and contexts modules are going to get enormous.

@@ -143,14 +139,16 @@ def test_authenticated_userid_proxies_to_user_policy_first(self,
assert client_policy.authenticated_userid.call_count == 0
assert userid == user_policy.authenticated_userid.return_value

@pytest.mark.parametrize('path', AUTH_CLIENT_PATHS)
@pytest.mark.parametrize('route_name,route_method', AUTH_CLIENT_API_WHITELIST)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these test changes are to account for the different way that auth-client routes are whitelisted and validated (no other functional changes).

route_name,
route_method):
pyramid_request.method = route_method
pyramid_request.matched_route.name = route_name
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's where I'm just happily messing around with the matched_route mock :)

@seanh seanh self-assigned this Sep 21, 2018
@codecov
Copy link

codecov bot commented Sep 21, 2018

Codecov Report

Merging #5288 into master will decrease coverage by <.01%.
The diff coverage is 97.05%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5288      +/-   ##
==========================================
- Coverage   95.95%   95.95%   -0.01%     
==========================================
  Files         467      467              
  Lines       25089    25123      +34     
  Branches     1375     1377       +2     
==========================================
+ Hits        24075    24106      +31     
- Misses        889      891       +2     
- Partials      125      126       +1
Impacted Files Coverage Δ
tests/h/routes_test.py 100% <ø> (ø) ⬆️
tests/h/traversal/roots_test.py 100% <100%> (ø) ⬆️
h/views/api/users.py 100% <100%> (ø) ⬆️
h/routes.py 100% <100%> (ø) ⬆️
tests/h/auth/policy_test.py 99.74% <100%> (ø) ⬆️
tests/h/views/api/users_test.py 99.36% <100%> (-0.64%) ⬇️
h/traversal/roots.py 100% <100%> (ø) ⬆️
h/auth/policy.py 98.58% <50%> (-1.42%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2c85928...065f787. Read the comment docs.

@seanh
Copy link
Contributor

seanh commented Sep 21, 2018

Rebased

Copy link
Contributor

@seanh seanh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing the user create API:

  1. If I just post to it without any auth or data, it 404s: ✔️

    $ requests.post("http://localhost:5000/api/users")
    <Response [404]>
  2. If I post to it with an invalid auth it also 404s: ✔️

    $ requests.post("http://localhost:5000/api/users", auth=("foo", "bar"))
    <Response [404]>
  3. If I post with a valid client_id and client_secret, but no data, I get a 404 "Expected a valid JSON payload, but none was found!" ✔️

    $ requests.post("http://localhost:5000/api/users", auth=(client_id, client_secret))
    <Response [400]>
  4. Lets start adding data. If I just add a username or just a username or email I get a validation error. The error message is unhelpful but that's a known issue. ✔️

    $ requests.post("http://localhost:5000/api/users", auth=(client_id, client_secret), data=json.dumps({"username": "user63", "email": "user63@users.com"}))
    <Response [400]>
  5. As soon as I add a valid username, a valid email address, and an authority I start getting 200s back. But I get a 200 OK response and a user created whether the authority matches the auth client's authority or not. The user is created with the auth client's authority, and the different authority string passed in the request data is ignored. On master this would result in a {"status": "failure", "reason": "\'authority\' does not match authenticated client"} but on this branch it's ok: ❌

    $ requests.post("http://localhost:5000/api/users", auth=(client_id, client_secret), data=json.dumps({"username": "user65", "email""user65@users.com", "authority": "wrong_authority"}))
    <Response [200]>

h/auth/policy.py Outdated

This is intended to be temporary.
:rtype: Boolean
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is bool in Python

"""
return (re.match(AUTH_TOKEN_PATH_PATTERN, request.path) and
request.method == 'POST')
if request.matched_route:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for there to not be a matched_route?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know! This is up in the auth part of Pyramid's request cycle so I'm not 100% convinced I can rely on there being a matched_route so I figured better safe than sorry?

h/auth/policy.py Outdated
if request.matched_route:
for (route_name, allowed_method) in AUTH_CLIENT_API_WHITELIST:
if (request.method == allowed_method and request.matched_route.name == route_name):
return True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be:

return (request.method, request.matched_route.name) in AUTH_CLIENT_API_WHITELIST

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that is a nicer syntax.

config.add_route('api.users', '/api/users')
config.add_route('api.users',
'/api/users',
factory='h.traversal.UserRoot')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. Technically we are actually using traversal here (traversal is always on in Pyramid). It's just that the traversal tree is more of a traversal stump.

What happens is that a UserRoot object is instantiated. Or more specifically in Pyramid's terminology the traversal factory function, which is UserRoot.__init__(), is called and it returns the root resource of the traversal tree (a UserRoot instance). And since there's no traversal path to follow, that's the end of traversal: UserRoot.__getitem__(self, username) is never called, and the UserRoot instance itself is used as the context. This contradicts the naming scheme where roots.py contains *Root classes that're used as arguments to factory= (plus the default root), and contexts.py contains *Context classes that're returned by roots and used as contexts. You've now got a *Root class being used as a context as well in some cases but not others. But you just have to remember that a root can also be a context (but a context can't be a root) so it's probably fine. This does lead me to think our traversal classes organization may need some evolving later though, see below.

The one thing to look out for is that UserRoot is also used by the activity.user_search route (and in that case it does have a traversal path of "/{username}", __getitem__(self, username) is called and returns a models.User as the context). So I think the create permission that you've added may also apply to the user activity page. Since auth clients can't view that page though, it won't ever be applied. So I think it's fine.

And I guess you've changed the semantics of the UserRoot class: previously it was only ever a factory/root resource and always returned a User as the context, was never used as a context itself. Now it is sometimes used as the context resource itself. But I don't see a problem there necessarily.


As an aside, I'm starting to doubt the idea of separating things into traversal/roots.py and traversal/contexts.py. A "root resource" in Pyramid is a root node of a traversal tree, and a "context resource" is a leaf node of the tree. Normally you'd have a traversal path such as "/{username}" so Pyramid would call the root resource's __getitem__({username}) which would return the context resource. But in the case of using factory without traverse, as in this PR, the root resource object ends up doubling as the context resource object as well. And UserRoot is also the "factory" too, or at least the UserRoot class (its __init__() method) is the factory.

So trying to separate things out into factories, roots and contexts may not make much sense, as a single class can be all three at once, or can be different things at different times. The UserRoot class could just as well be called UserContext or UserFactory. Calling it any one of those three names is a bit misleading. It's actually a UserFactoryRootContext. Or just a UserResource.

This all leads me to ask whether we should just refer to them all as "resources":

h/traversal/
  user.py  <-- Traversal resources to do with users
  annotation.py <-- Resources to do with annotations
  api/
    # Maybe separate resources for the API
  ...

And instead of UserRoot and UserContext, user.py would just contain a Users class and a User class.

In the case of /api/users the Users class is used as factory, root and context (add_route(factory=traversal.Users, ...) with no traverse= argument).

In the case of /api/users/<USER_ID> the Users class is used as the factory and root, but its __getitem__ is called and it returns a traversal.User as context (add_route(factory=traversal.Users, traverse="/{username}", ...)).

@pytest.fixture
def client_authority(self, patch):
patched = patch('h.views.api.users.client_authority')
patched.return_value = 'weylandindustries.com'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also return patched in case any tests want to take the mock client_authority as an argument and use it (e.g. to access client_authority.return_value). (Even though no tests do actually do that yet, our patch fixtures always return the thing)

The usual style in our code is to name the local variable the same as the fixture too, so:

@pytest.fixture
def client_authority(self, patch):
    client_authority = patch("h.views.api.users.client_authority")
    client_authority.return_value = "weylandindustries.com"
    return client_authority

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As soon as I add a valid username, a valid email address, and an authority I start getting 200s back. But I get a 200 OK response and a user created whether the authority matches the auth client's authority or not. The user is created with the auth client's authority, and the different authority string passed in the request data is ignored. On master this would result in a {"status": "failure", "reason": "'authority' does not match authenticated client"} but on this branch it's ok: ❌

I agree I should reconcile this and make it match previous behavior (thanks for the catch). As an aside, I believe that the underlying problem is that the API takes an authority param at all—it doesn't really make sense if you think about it, right? There is no case in which it would be respected if it weren't the request's active authority (and it's unnecessarily redundant when it does match). But we've got a published API, so, I'll make it behave like it used to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an aside, I believe that the underlying problem is that the API takes an authority param at all—it doesn't really make sense if you think about it, right?

I agree - I have no idea what that param is doing there! Unless we had envisioned possible multi- or cross-authority auth clients in the future. But afaik we have no such plans

@lyzadanger
Copy link
Contributor Author

(rebased) ^^

@lyzadanger
Copy link
Contributor Author

@seanh This is ready for re-review! I've addressed your feedback in four commits on this branch (one one full commit; three fixups), which I hope will make your re-review faster and easier.


:raises ValidationError: if ``authority`` param does not match client
authority
:raises ConflictError: if user already exists
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recently discovered that you can shorten these to :raise instead of :raises btw

Copy link
Contributor

@seanh seanh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Posting to the API seems to work as expected now, but I had a few comments about the new authority-handling code

authority
:raises ConflictError: if user already exists
"""
applied_authority = client_authority(request)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would client_authority_ be a better name for this? I'm not sure what applied means in this context

validate_auth_client_authority(client, appstruct['authority'])
appstruct['authority'] = client.authority
# Enforce authority match
appstruct['authority'] = appstruct['authority'] or applied_authority
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering what the or here is for? appstruct["authority"] can't be missing or schema validation would already have failed before getting here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

INDEED. Why ever did I think that was an optional parameter?

h/views/api/users.py Show resolved Hide resolved
The former method of pattern-matching on request.path was kludgy and
resulted in at least one bug from oversight. The new whitelist here
is easier to read as a human and is matched against the
request’s `matched_route` object.
Remove view-level auth* checking on auth-client as authn now happens
in AuthClientPolicy and authz via a permission on the `UserRoot`
traversal root.

This endpoint will temporarily return a 404 instead of a 403 on
authz fail because of the way that API view exceptions intercept
403s. To be fixed soon
Raise a `ValidationError` if supplied `authority` param
does not match the verified auth client authority
@lyzadanger lyzadanger merged commit 2936c6e into master Sep 27, 2018
@lyzadanger lyzadanger deleted the add-create-user-permission branch September 27, 2018 12:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants