Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API Tokens #5146

Merged
merged 39 commits into from
Jul 6, 2020
Merged

API Tokens #5146

merged 39 commits into from
Jul 6, 2020

Conversation

smotornyuk
Copy link
Member

Python 3 migration should be finished in the near future and I'll probably have some time to continue with "React.or.somethin.similar" frontend for CKAN. Back then, when I was actively working on it, I thought that it would be cool to have some kind of API authentication endpoint. I don't want to invest a lot of time into something unreasonably complex - we need to discuss the future of repoze.who and flask-auth first. Instead here just some simple solution. This endpoint is forbidden for users, except for cases, when config file contains explicit ckan.auth.get_apikey_via_api option, so there is no security issues. But when this action is enabled, it provides really simple way to obtain your api_key(as long as you know your credentials:) )

@wardi
Copy link
Contributor

wardi commented Jan 9, 2020

-1 from me. For a JS front-end we should be using user name+password to get a session cookie or equivalent, just like the normal front-end.

API keys should only be used from scripts or for server-server communication. Giving API keys to client side JS makes XSS attacks much worse, because the credentials don't time out and don't let us check the referrer (which is all we can use until we get CSRF tokens on all our forms).

@smotornyuk
Copy link
Member Author

From my point of view, it this way has almost the same set of problems:

  1. HTTP_REFFERER would be added to any ajax request by browser, so there should be no difference
  2. Sessions have an expiration time, but it's unlikely that anyone would ever call logout - won't it create an enormously huge amount of session files(or whatever we use for storing session data) if someone implements request to login handler on every page_load?

I also think that it is not the best approach, rather some placeholder that may be useful while we don't have other options. Using session cookies in the modern web doesn't sound very cool and having any popular authorization system for AJAX would show CKAN from a nice way to the public and may bring us more users.
BTW, having a discussion on this topic and maybe some agreement on implementation is much better, than merging this raw change into the core, so I mark it for discussion.

@smotornyuk smotornyuk added WIP and removed To Discuss labels Jan 13, 2020
@smotornyuk smotornyuk removed the WIP label Jan 14, 2020
@smotornyuk
Copy link
Member Author

After some debates, we agreed on the implementation of token system(in addition to API keys).
New features:

  1. API:
  • api_token_create - requires user and name and creates API Token for this user with the specified label
  • api_token_list - requires user and lists all API Tokens for this user
  • api_token_revoke - requires token and removes API Token by specified id
  1. UI:
  • Under user profile add new tab API Token
  • Contains input field(required) for token's label and button Generate new token
  • Shows a table with the label, date of last access for token and "Revoke" button(with confirmation dialogue)
  1. Interfaces:
  • IApiToken - contains two methods:
    • preprocess_api_token(token, original) - do anything with the token, before it used for used identification
    • postprocess_api_token(token, original) - do anything with the token, after it created and before it is shown to the user
  1. DB:
  • new table

API tokens can be used instead of API Keys. Actually, if CKAN fails to find a user by API-key, it assumes, that API-token was provided and makes an attempt to find the user by API token. In contrary to API-keys, a token is supposed to be used by frontend-clients. It stores the date of last use so that users can notice if a token was used by someone else. I don't store the IP for the last use of token, because it's likely a violation of privacy rules. And token can be removed(revoked) at any moment if misuse is discovered.

Developers are likely will miss IP recording, use-rate-limits, permission tags, etc. Because of this, I've added interface that allows modifying (thus, the addition of some info to the token) before it is shown to the user and pre-parsing after it received from used but before it was used to query the database. Token itself is used as id in the relevant DB table. It generated using the secrets module(built-in in py3 and installed as a polyfill for py2) and stored as-is. If extension modifies token before it is shown to the user, we don't know this modification and token will be lost if not copied right after creation(took this idea from GitHub). Theoretically, I can just re-run all post-processing from extensions, in order to show API key multiple times, but it will add idempotence as a restriction to post-processing and side-effects may become buggy. Instead, I clearly state that API token should be copied immediately after creation.

@ThrawnCA
Copy link
Contributor

ThrawnCA commented Feb 11, 2020

Giving API keys to client side JS makes XSS attacks much worse, because the credentials don't time out and don't let us check the referrer (which is all we can use until we get CSRF tokens on all our forms).

If XSS is possible, there are already multiple ways to retrieve the API key, notably the user_show API or scraping the user profile page. And XSS attacks bypass referrer and origin checks anyway, since they appear to come from the correct origin.

@rufuspollock
Copy link
Member

@smotornyuk let's sync on this as we've been doing work on this at Datopian - maybe we can some up with a solution together.

@smotornyuk
Copy link
Member Author

@rufuspollock , sounds really great - i'm in:)

@rufuspollock
Copy link
Member

@SergK great - let's catch up 😄

@amercader
Copy link
Member

@rufuspollock can you briefly summarize here your followed approach, constraints, etc? Would be really good to have detail from a working implementation on this.

@smotornyuk smotornyuk changed the title Implement apikey_show API Tokens Mar 26, 2020
@amercader amercader removed their assignment Jul 2, 2020
ckan/views/user.py Outdated Show resolved Hide resolved
changes/5146.feature Outdated Show resolved Hide resolved
@wardi
Copy link
Contributor

wardi commented Jul 2, 2020

just a couple small text changes. Let's aim to disable generation of API keys in default installs of ckan 3.0


You must provide your API key in the Authorization header.

:returns: a representation of the 'api_token'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to document the input parameters in the data dict, user (user name or id?) and name.

Can you describe the returned key here? eg: Returns a dict with the key token containing the encoded token value.
Is it worth describing the internal decoded structure (ie jti and iat keys), I guess not as this is an internal implementation detail

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.
Yes, jti and iat are implementation-specific and generally, they won't be used by end-user directly. You are just copying encoded token and using it as is. Probably, the majority of users won't even know, that there are some data inside the token

Comment on lines 231 to 235
try:
logic.check_access(u'api_token_revoke', context, {u'token': jti})
except logic.NotAuthorized:
base.abort(403, _(u'Unauthorized to revoke API tokens.'))
model.ApiToken.revoke(jti)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we wrap this into a api_token_revoke action? Even if it's a one line action. This will allow extensions to override it or chain it, otherwise it breaks our pattern of views > action/auth > models

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, updated

@@ -0,0 +1,59 @@
# -*- coding: utf-8 -*-
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a really useful feature, any reason to not include it in the core implementation and use a core extension instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm obsessed with extendibility in this PR because I have no ideas what others may expect/want to have. And, if you want to use a calendar widget for selecting the expiration date of token instead of specifying its lifetime, you can just create new extension(of even inherit from expire_api_token) with desired fields/validation/widget and use this extension instead of expire_api_token.
If it becomes the part of default implementation, you will have to use existing format (units * duration) and that leads us to a number of hacks, like using JS and hidden fields for converting date from calendar into the required format

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ok as a core extension because it provides a non-toy example of extending API Tokens.

Copy link
Member

@amercader amercader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is excellent work @smotornyuk! I added some comments and questions.
I think we need a big notice on the changelog saying that API keys will no longer be generated on the next version, ie they are deprecated on CKAN 2.9 and will be removed on CKAN 3.0

Comment on lines 257 to 258
web interface and visit your user profile page. To generate new API
token visit **API Tokens** tab available from your puser profile's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
web interface and visit your user profile page. To generate new API
token visit **API Tokens** tab available from your puser profile's
web interface and visit your user profile page. To generate a new API
token visit the *API Tokens* tab available on your user profile.

Also can you rename this section to "Authentication, API keys and API tokens" and replace "(Replacing XXX with your API key.)" with "(Replacing XXX with your API key or token.)"

If we want to encourage people to use them and deprecate the API key we should make them more prominent in this version docs.
Is it worth putting a ..note:: saying that API keys will get deprecated/removed in the next version?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I've added a note about deprecation of api keys. I don't think that we'll really remove them before 3.1/3.2, so the note is quite vague:)

doc/maintaining/configuration.rst Outdated Show resolved Hide resolved
smotornyuk and others added 5 commits July 3, 2020 15:01
Co-authored-by: Ian Ward <ian@excess.org>
Co-authored-by: Ian Ward <ian@excess.org>
Co-authored-by: Adrià Mercader <amercadero@gmail.com>
Co-authored-by: Adrià Mercader <amercadero@gmail.com>
@wardi
Copy link
Contributor

wardi commented Jul 3, 2020

This is excellent work @smotornyuk! I added some comments and questions.
I think we need a big notice on the changelog saying that API keys will no longer be generated on the next version, ie they are deprecated on CKAN 2.9 and will be removed on CKAN 3.0

We can add a configuration option to the next version that enables automatic creation of API keys default to false if not provided (or even this release if there's still time, default true for 2.9) API keys are fine for sites that are internal and tightly controlled and sites that want to maintain compatibility with existing third-party integrations so I wouldn't want to see the feature removed completely

@amercader amercader merged commit c7f0023 into ckan:master Jul 6, 2020
amercader added a commit that referenced this pull request Jul 6, 2020
@amercader
Copy link
Member

@wardi @smotornyuk I actually think that creation of API keys on new users should default to False on 2.9 but let's discuss on #5488

@wardi wardi modified the milestones: CKAN 3.0, CKAN 2.9 Jul 6, 2020
amercader added a commit that referenced this pull request Jul 7, 2020
amercader added a commit that referenced this pull request Jul 8, 2020
wardi added a commit that referenced this pull request Jul 9, 2020
[#5146] Make creation of API keys configurable
amercader added a commit that referenced this pull request Jul 9, 2020
amercader added a commit that referenced this pull request Jul 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants