Authentication (and permissions) as a core concept #699

Closed
simonw opened this issue Mar 16, 2020 · 40 comments

Comments

@simonw
Owner

simonw commented Mar 16, 2020

Right now Datasette authentication is provided exclusively by plugins:

This is an all-or-nothing approach: either your Datasette instance requires authentication at the top level or it does not.

But... as I build new plugins like https://github.com/simonw/datasette-configure-fts and https://github.com/simonw/datasette-edit-tables I increasingly have individual features which should be reserved for logged-in users while still wanting other parts of Datasette to be open to all.

This is too much for plugins to own independently of Datasette core. Datasette needs to ship a single "user is authenticated" concept (independent of how users actually sign in) so that different plugins can integrate with it.

@simonw simonw added the feature label Mar 16, 2020
@simonw simonw changed the title from "Authentication as a core concept in Datasette" to "Authentication (and permissions) as a core concept in Datasette" Mar 16, 2020
@simonw simonw added the plugins label Mar 16, 2020
@simonw
Owner Author

simonw commented Apr 10, 2020

Another problem this would solve: if you want multiple authentication mechanisms - GitHub auth for users, Authorization: bearer xxx auth for API keys - the order in which they run might end up mattering.

I dealt with this a bit in simonw/datasette-auth-github#59

But having an authentication plugin hook - where plugins get to decide if a user should be authenticated based on the incoming ASGI scope - would be neater.

@simonw
Owner Author

simonw commented Apr 10, 2020

So maybe this is all handled by plugin hooks?

auth_from_scope(datasette, scope) would be the main one - for deciding if a user should be authenticated based on data from the scope.

How would a permissions hook work though?

@zeluspudding

zeluspudding commented May 11, 2020

Authorization: bearer xxx auth for API keys is a plus plus for me. Looked into just adding this into your Flask logic but learned this project doesn't use flask. Interesting 🤔

@simonw
Owner Author

simonw commented May 11, 2020

I implemented bearer tokens in a private project of mine as a one-off plugin. I'm going to extract that out into an installable plugin soon. For the moment, my plugins/token_auth.py file looks like this:

from datasette import hookimpl
import secrets


class TokenAuth:
    """ASGI middleware that sets scope["auth"] when the bearer token matches."""

    def __init__(self, app, secret, auth):
        self.app = app
        self.secret = secret
        self.auth = auth

    async def __call__(self, scope, receive, send):
        # Only HTTP requests carry an Authorization header
        if scope.get("type") != "http":
            return await self.app(scope, receive, send)

        # ASGI headers are a list of (name, value) byte pairs
        authorization = dict(scope.get("headers") or []).get(b"authorization") or b""
        expected = "Bearer {}".format(self.secret).encode("utf8")

        # With no secret configured, never authenticate: otherwise the literal
        # header "Bearer None" would match. compare_digest avoids timing leaks.
        if self.secret and secrets.compare_digest(authorization, expected):
            scope = dict(scope, auth=self.auth)

        return await self.app(scope, receive, send)


@hookimpl(trylast=True)
def asgi_wrapper(datasette):
    # Read the secret and the actor details from the plugin configuration
    config = datasette.plugin_config("token-auth") or {}
    secret = config.get("secret")
    auth = config.get("auth")

    def wrap_with_asgi_auth(app):
        return TokenAuth(app, secret=secret, auth=auth)

    return wrap_with_asgi_auth

Then I have the following in metadata.json:

{
    "plugins": {
        "token-auth": {
            "auth": {
                "name": "token-bot"
            },
            "secret": {
                "$env": "TOKEN_SECRET"
            }
        }
    }
}

And a TOKEN_SECRET environment variable.

@simonw
Owner Author

simonw commented May 11, 2020

I did have some trouble getting this one-off plugin to load in the correct order, since I need authentication to work if EITHER the one-off plugin spots a token or my datasette-auth-github plugin authenticates the user.

That's why I want authentication as a core Datasette concept - so plugins like these can easily play together in a predictable manner.

@zeluspudding
Copy link

zeluspudding commented May 11, 2020

Very nice! Thank you for sharing that 👍 :) Will try it out!

@simonw simonw added the large label May 30, 2020
@simonw simonw added this to the Datasette 1.0 milestone May 30, 2020
@simonw
Owner Author

simonw commented May 30, 2020

I think there are two hooks here:

actor_from_request(datasette, request) - returns None or a dictionary.

  • datasette is a Datasette instance - useful for things like reading plugin configuration or executing queries
  • request is a Request object - which means ASGI scope can be accessed as request.scope

A non-None value means the request is authenticated in some way. The shape of that dictionary is entirely undefined.

The second hook is for checking permissions. It can look something like this:

permission_allowed(actor, action, resource_type, resource_identifier)

  • actor = the dictionary that was returned by actor_from_request
  • action = a string representing the action to be performed, e.g. edit-schema
  • resource_type = a string representing the type of resource being acted on, e.g. table
  • resource_identifier = a string (or maybe tuple?) representing the specific resource, e.g. the table name
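As a rough illustration of the two hook shapes described above, here is a minimal sketch written as plain functions so it runs without Datasette installed. In a real plugin they would presumably be registered with @hookimpl. The token value, the actor dictionary and the rule inside permission_allowed are all made up for the example, and the request is only assumed to expose the ASGI scope as request.scope.

```python
import secrets
from types import SimpleNamespace


def actor_from_request(datasette, request):
    """Return an actor dictionary, or None if this plugin does not recognise the request."""
    # ASGI headers are a list of (name, value) byte pairs
    headers = dict(request.scope.get("headers") or [])
    authorization = headers.get(b"authorization", b"")
    # "example-token" is a hypothetical hard-coded secret, for illustration only
    if secrets.compare_digest(authorization, b"Bearer example-token"):
        return {"id": "token-bot"}
    return None


def permission_allowed(actor, action, resource_type=None, resource_identifier=None):
    """Return True/False to express an opinion, or None to leave it to others."""
    # Hypothetical rule: only authenticated actors may edit table schemas
    if action == "edit-schema" and resource_type == "table":
        return actor is not None
    return None


# A stand-in request object exposing only the .scope attribute
fake_request = SimpleNamespace(
    scope={"headers": [(b"authorization", b"Bearer example-token")]}
)
actor = actor_from_request(None, fake_request)
can_edit = permission_allowed(actor, "edit-schema", "table", "dogs")
```

Returning None from permission_allowed rather than False leaves room for several plugins to each handle only the actions they care about.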

I don't know if Datasette should provide default implementations of these hooks. It may be that leaving them completely up to plugins is the way to go.

I think I need to prototype this quickly to get a feel for how well it might work.

@simonw
Owner Author

simonw commented May 30, 2020

auth_from_scope(datasette, scope) needs to be able to return an awaitable which is then awaited - so it can execute database queries.
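One way the "return an awaitable which is then awaited" idea could work is sketched below: the caller invokes each hook, awaits the result only if it is awaitable, and stops at the first non-None answer. The function names and the first-answer-wins rule are assumptions for illustration, not Datasette's actual dispatch code.

```python
import asyncio
import inspect


def header_hook(datasette, request):
    # Synchronous hook with no opinion about this request
    return None


async def database_hook(datasette, request):
    # Stand-in for awaiting a real database query
    await asyncio.sleep(0)
    return {"id": "from-database"}


async def resolve_actor(hooks, datasette, request):
    """Call each hook in order, awaiting results where needed."""
    for hook in hooks:
        result = hook(datasette, request)
        if inspect.isawaitable(result):
            result = await result
        if result is not None:
            return result
    return None


resolved = asyncio.run(resolve_actor([header_hook, database_hook], None, None))
```

Because the caller checks inspect.isawaitable, plugin authors can write either a regular function or a coroutine function without the core needing two separate hooks.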

@simonw
Owner Author

simonw commented May 30, 2020

Maybe call that actor_from_request(datasette, request) instead.

@simonw
Owner Author

simonw commented May 30, 2020

I'm changing auth to actor and updating the above design comment.

@simonw
Owner Author

simonw commented May 30, 2020

My usage of the term subject here to mean "the thing I am checking I have permission to interact with, e.g. a database table" may be misleading. https://stackoverflow.com/questions/4989063/what-is-the-meaning-and-difference-between-subject-user-and-principal for example shows that JAAS (Java Authentication and Authorization Service) defines subject as "The purpose of the Subject is to represent the authenticated user".

@simonw
Owner Author

simonw commented May 30, 2020

In AWS IAM world the following terminology is used: https://aws.amazon.com/iam/features/manage-permissions/

Permissions are granted to IAM entities (users, groups, and roles) [...]

To assign permissions to a user, group, role, or resource, you create a policy that lets you specify:

  • Actions – Which AWS service actions you allow. For example, you might allow a user to call the Amazon S3 ListBucket action. Any actions that you don't explicitly allow are denied.
  • Resources – Which AWS resources you allow the action on. For example, what Amazon S3 buckets will you allow the user to perform the ListBucket action on? Users cannot access any resources that you do not explicitly grant permissions to.
  • Effect – Whether to allow or deny access. Because access is denied by default, you typically write policies where the effect is to allow.
  • Conditions – Which conditions must be present for the policy to take effect. For example, you might allow access only to the specific S3 buckets if the user is connecting from a specific IP range or has used multi-factor authentication at login.

@simonw
Owner Author

simonw commented May 30, 2020

I like "actor" better than "entity" to mean "the user or API key that is authenticated for this request".

I'm going to use "resource" instead of "subject" - updating the design comment again.

@simonw
Owner Author

simonw commented May 30, 2020

I could bake some permission checks into default Datasette, which are all treated as allow by default but can then be locked down by plugins. Maybe the following:

permission_allowed(request.actor, "execute-sql", "database", "name-of-database")

Checks that current user can execute arbitrary SQL queries against a specific database (or use the ?_where= feature). Equivalent to current allow_sql setting.

permission_allowed(request.actor, "download-database", "database", "name-of-database")

Can the user download the database file? Like allow_download.

Maybe one for allow_csv_stream too.

Having a permission check (defaulting to True) on every single "view" would be useful:

  • view_index
  • view_database
  • view_table
  • view_row
  • view_query
  • view_special (for /-/versions and so on)
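The "allow by default, lock down with plugins" behaviour could be sketched like this. It assumes each plugin returns True, False or None (no opinion) and that the first non-None opinion wins. That resolution rule, and the helper names, are illustrative assumptions rather than anything decided in this thread.

```python
def check_permission(hooks, actor, action, resource_type=None,
                     resource_identifier=None, default=True):
    """Ask each plugin in turn; fall back to the default if none has an opinion."""
    for hook in hooks:
        opinion = hook(actor, action, resource_type, resource_identifier)
        if opinion is not None:
            return opinion
    return default


def lock_down_sql(actor, action, resource_type, resource_identifier):
    # Hypothetical plugin: arbitrary SQL is reserved for authenticated actors
    if action == "execute-sql":
        return actor is not None
    return None


# With no plugins installed every check passes
open_by_default = check_permission([], None, "execute-sql", "database", "fixtures")

# With the plugin installed, anonymous SQL is denied but views stay open
anonymous_sql = check_permission(
    [lock_down_sql], None, "execute-sql", "database", "fixtures"
)
anonymous_view = check_permission(
    [lock_down_sql], None, "view_table", "table", "facetable"
)
```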

@simonw
Owner Author

simonw commented May 30, 2020

I started sketching this out in the authentication branch. Here's the documentation so far: https://github.com/simonw/datasette/blob/8871c20/docs/plugins.rst#actor_from_requestdatasette-request

@simonw
Owner Author

simonw commented May 30, 2020

Debugging permissions is going to be important. Optional tooling that supports the following would be useful:

  • Log every check to permission_allowed to the console - optionally with tracebacks showing where in the code the check was made
  • Log every check to the https://latest.datasette.io/?_trace=1 output
  • A tool that shows you exactly what permissions the current authenticated user/entity has
  • A tool showing all available permissions

That last one is tricky if permissions are just strings that might be passed to permission_allowed - so maybe there needs to be a plugin hook that lets plugins register their permissions, such that they can be introspected later on? A register_permission_actions() hook that returns a list of permission action strings (or objects of some sort) perhaps.
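A registration hook along those lines might look something like the sketch below, where each plugin contributes a list of action strings and the core collects them for introspection. The two implementations and the collect_actions helper are hypothetical. Only the register_permission_actions name comes from the comment above.

```python
def register_permission_actions_core():
    # Hypothetical actions that Datasette core might register
    return ["execute-sql", "download-database"]


def register_permission_actions_edit_tables():
    # Hypothetical action registered by a plugin
    return ["edit-schema"]


def collect_actions(hook_implementations):
    """Gather every registered action into a sorted, de-duplicated list."""
    actions = set()
    for implementation in hook_implementations:
        actions.update(implementation())
    return sorted(actions)


available_actions = collect_actions(
    [register_permission_actions_core, register_permission_actions_edit_tables]
)
```

A "show all available permissions" tool could then render available_actions directly, without needing to observe calls to permission_allowed.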

simonw added a commit that referenced this issue May 30, 2020
Also added datasette argument to permission_allowed hook
@simonw
Owner Author

simonw commented May 30, 2020

I'm going to add an awaitable utility method to the Datasette class for checking permissions:

await datasette.permission_allowed(actor, action, resource_type, resource_identifier)

The last two arguments will be optional.
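From the calling side, the utility method could be used roughly as follows. StubDatasette is a hypothetical stand-in with hard-coded rules, not the real Datasette class. The view functions and the "permissions-debug" action string are likewise invented to show the four-argument and two-argument forms.

```python
import asyncio


class StubDatasette:
    """Hypothetical stand-in exposing only the awaitable permission check."""

    async def permission_allowed(self, actor, action,
                                 resource_type=None, resource_identifier=None):
        # Made-up rules: the "private" database needs a logged-in actor,
        # and the debug action is reserved for the root actor
        if action == "execute-sql" and resource_identifier == "private":
            return actor is not None
        if action == "permissions-debug":
            return bool(actor) and actor.get("id") == "root"
        return True


async def sql_view(datasette, actor, database):
    # Four-argument form: the check is scoped to one specific database
    allowed = await datasette.permission_allowed(
        actor, "execute-sql", "database", database
    )
    return 200 if allowed else 403


async def debug_view(datasette, actor):
    # Two-argument form: an instance-wide check with no resource
    allowed = await datasette.permission_allowed(actor, "permissions-debug")
    return 200 if allowed else 403


stub = StubDatasette()
anonymous_status = asyncio.run(sql_view(stub, None, "private"))
root_status = asyncio.run(debug_view(stub, {"id": "root"}))
```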

@simonw
Owner Author

simonw commented May 30, 2020

The branch is now usable! Next step: write some experimental plugins that exercise some real authentication use-cases with it.

@simonw
Owner Author

simonw commented Jun 1, 2020

OK, the implementation in PR #783 is in a good state now - it implements the new plugin hooks with tests and documentation, plus it implements this:

$ datasette . --root
http://127.0.0.1:8001/-/auth-token?token=3ca9ee460a6451142389351d19b147bce27d2a785dfb6b5a74f82211be1ede49
...

That URL, when clicked, will set a cookie for the {"id": "root"} user. The cookie is respected and used to populate scope["actor"].
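The token-in-URL flow could be approximated as below. This is a sketch under assumptions: the RootTokenIssuer class is invented, the token is treated as single-use, and the cookie signing step is left out entirely. Only the URL shape and the {"id": "root"} actor come from the comment above.

```python
import secrets


class RootTokenIssuer:
    """Hypothetical helper: issue a random token and exchange it for the root actor."""

    def __init__(self):
        # 32 random bytes rendered as 64 hexadecimal characters
        self._token = secrets.token_hex(32)

    def url(self, base="http://127.0.0.1:8001"):
        return "{}/-/auth-token?token={}".format(base, self._token)

    def redeem(self, presented):
        """Return the root actor for a valid token, otherwise None."""
        if self._token is None:
            return None  # already used (single use is an assumption here)
        # Compare as bytes so non-ASCII input cannot raise a TypeError
        if not secrets.compare_digest(
            presented.encode("utf-8"), self._token.encode("utf-8")
        ):
            return None
        self._token = None
        return {"id": "root"}


issuer = RootTokenIssuer()
token = issuer.url().split("token=")[1]
rejected = issuer.redeem("not-the-token")
root_actor = issuer.redeem(token)
second_attempt = issuer.redeem(token)
```

In the real feature the returned actor would then be stored in a signed cookie. That step needs the instance secret and is not shown here.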

I'm going to merge that pull request and continue working on this stuff on master.

@simonw
Owner Author

simonw commented Jun 1, 2020

I should add an entire page to the documentation describing Datasette authentication.

simonw added a commit that referenced this issue Jun 1, 2020
Also added datasette argument to permission_allowed hook
simonw added a commit that referenced this issue Jun 1, 2020
Also added JSON highlighting to introspection documentation.
@simonw
Owner Author

simonw commented Jun 1, 2020

I rebased in #783 so all of this is on master now.

@simonw
Owner Author

simonw commented Jun 1, 2020

Some next steps:

  • Try out a branch of datasette-auth-github that builds on these new plugin hooks
  • Build a datasette-api-tokens plugin which implements Authorization: bearer xxx token support for API access
  • Maybe prototype up a datasette-user-accounts plugin which supports username/password accounts and allows an admin user to create/delete them
  • Do more work on writable canned queries in #698 (Ability for a canned query to write to the database) and see what they look like if they take advantage of the permissions hook (to restrict some to authenticated users only)

@simonw
Owner Author

simonw commented Jun 1, 2020

https://latest.datasette.io/-/actor is now live (it returns null because there's no current way to sign into the latest.datasette.io site - not even with a fake ds_actor cookie because there's no way to know what that site's random secret is).

@simonw simonw pinned this issue Jun 1, 2020
@simonw
Owner Author

simonw commented Jun 1, 2020

Plugin idea: datasette-allow-all - really simple plugin which just says "yes" to every permission check.

@simonw
Owner Author

simonw commented Jun 1, 2020

Debugging tool idea: /-/permissions page which shows you the actor and lets you type in the strings for action, resource_type and resource_identifier - then shows you EVERY plugin hook that would have executed and what it would have said, plus when the chain would have terminated.

Bonus: if you're logged in as the root user (or a user that matches some kind of permission check, maybe a check for permissions_debug) you get to see a rolling log of the last 30 permission checks and what the results were across the whole of Datasette. This should make figuring out permissions policies a whole lot easier.
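The rolling log of the last 30 checks maps naturally onto a bounded deque. The sketch below assumes a hypothetical PermissionLog class and record format. Nothing here reflects how Datasette actually stores its checks.

```python
from collections import deque


class PermissionLog:
    """Hypothetical rolling log that keeps only the most recent checks."""

    def __init__(self, size=30):
        # A deque with maxlen silently discards the oldest entry when full
        self._checks = deque(maxlen=size)

    def record(self, actor, action, resource_type, resource_identifier, result):
        self._checks.append(
            {
                "actor": actor,
                "action": action,
                "resource_type": resource_type,
                "resource_identifier": resource_identifier,
                "result": result,
            }
        )

    def recent(self):
        """Return the logged checks, newest first."""
        return list(reversed(self._checks))


log = PermissionLog()
for number in range(35):
    log.record({"id": "root"}, "action-{}".format(number), None, None, True)
latest = log.recent()
```

After 35 recorded checks only the newest 30 remain, so a debug page can render latest without any manual trimming.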

simonw added a commit that referenced this issue Jun 1, 2020
Also started the authentication.rst docs page, refs #786.

Part of authentication work, refs #699.
@simonw
Owner Author

simonw commented Jun 2, 2020

I can close this issue once I've expanded out this page of documentation https://datasette.readthedocs.io/en/latest/authentication.html - and published at least one plugin and/or feature that takes advantage of this new mechanism.

@simonw simonw changed the title from "Authentication (and permissions) as a core concept in Datasette" to "Authentication (and permissions) as a core concept" Jun 2, 2020
@simonw simonw changed the milestone from Datasette 1.0 to Datasette 0.44 Jun 6, 2020
@simonw
Owner Author

simonw commented Jun 6, 2020

The canned queries feature is gaining permissions support in #800.

@simonw
Owner Author

simonw commented Jun 6, 2020

I landed canned query writes. This feature can now be considered complete: https://datasette.readthedocs.io/en/latest/authentication.html
