Add journalist interface API #3619

redshiftzero · 2018-06-29T20:45:17Z

Status

Ready for review

Description of Changes

Fixes #1761

Changes proposed in this pull request:

Add an initial journalist interface API. Note that the API has changed somewhat from what was initially proposed in Journalist API #1761; the canonical reference point is the docs in bdcd4ba.

Endpoints that are not represented here are imho “nice to haves” and I propose we slowly add them in followup issues as needed. My rationale here is: 1. they are not needed for an initial client program and 2. the functionality currently available via this API already unblocks developers at news orgs to implement custom functionality. Adding endpoints without modifying existing functionality also means we do not need to increment the API version.

Testing

Follow the docs in this branch and try to use the API. From the perspective of a user consuming this API, is anything unintuitive, or could anything be clearer (in the docs or in the API responses themselves)?
Are there endpoints that are important for an initial API that are not here?
Are there error cases that are not gracefully handled? Apologies for the lack of a detailed test plan here, but exploratory testing / attempts to break this are really what this needs.
Are there security improvements that should be made?

Test database migrations

Provision staging VMs on develop:

git checkout develop
make build-debs
vagrant up /staging/

Now add a source by submitting a document via the source interface.
Now upgrade:

git checkout journalist-api-0.9.0
make build-debs
vagrant provision /staging/

The database migrations should occur without issue and you should be able to use the API (or direct database access) to verify that the UUIDs on the source and submission now exist.
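One way to do the direct-database check is to ask SQLite which columns each table has. This is a minimal sketch and not part of the PR: the `uuid` column name follows the commit messages here, while the `submissions` table name and the in-memory demo database are assumptions. On a real instance, point `sqlite3.connect` at the application's database file instead.

```python
import sqlite3


def has_column(conn, table, column):
    """Return True if `table` has a column named `column`."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    # Table names cannot be bound as parameters, so only pass trusted names.
    rows = conn.execute('PRAGMA table_info("{}")'.format(table)).fetchall()
    return any(row[1] == column for row in rows)


# Demo against an in-memory database standing in for the migrated one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sources (id INTEGER PRIMARY KEY, uuid VARCHAR(36))")
conn.execute("CREATE TABLE submissions (id INTEGER PRIMARY KEY, uuid VARCHAR(36))")

for table in ("sources", "submissions"):
    print(table, "has uuid column:", has_column(conn, table, "uuid"))
```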

Deployment

Will be deployed in the securedrop-app-code package

Checklist

If you made non-trivial code changes:

I have written a test plan and validated it for this PR

If you made changes to documentation:

Doc linting (make docs-lint) passed locally

We have logic in __init__.py here to ensure that a developer does not accidentally forget to protect a route with a @login_required decorator. Instead of the decorator, we have a list of insecure views. We rework this to allow us to use a decorator for the API routes.

This is primarily for the API, but it turns out that we actually make use of abort(403) to prevent an admin from deleting themselves, so instead of seeing a Flask error page, they will see a nice error page in the style of the rest of the journalist interface.

This is a workaround for some unimplemented features (ALTER) in SQLite, e.g. the ability to modify constraints on a column after it has been created, see:

* https://stackoverflow.com/questions/30378233/sqlite-lack-of-alter-support-alembic-migration-failing-because-of-this-solutio
* http://alembic.zzzcomputing.com/en/latest/batch.html#batch-mode-with-autogenerate
* miguelgrinberg/Flask-Migrate#61
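For context, the batch-mode approach linked above works around the missing ALTER support by recreating the table: create a new table with the desired constraints, copy the rows across, drop the old table, and rename the new one. The following is a rough sketch of that idea using only the standard library. The `demo` table and its columns are made up for illustration, and this is not the migration in this PR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, uuid VARCHAR(36))")
conn.executemany("INSERT INTO demo (uuid) VALUES (?)", [("a",), ("b",)])
conn.commit()

# SQLite cannot add a constraint to an existing column, so build a
# replacement table, copy the data, and swap the tables.
with conn:
    conn.execute(
        "CREATE TABLE demo_new ("
        "id INTEGER PRIMARY KEY, "
        "uuid VARCHAR(36) NOT NULL, "
        "UNIQUE (uuid))"
    )
    conn.execute("INSERT INTO demo_new (id, uuid) SELECT id, uuid FROM demo")
    conn.execute("DROP TABLE demo")
    conn.execute("ALTER TABLE demo_new RENAME TO demo")

# The existing rows survive, and the new constraint is now enforced.
rows = conn.execute("SELECT uuid FROM demo ORDER BY id").fetchall()
try:
    conn.execute("INSERT INTO demo (uuid) VALUES ('a')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```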

During testing, I ran into an issue where there was a failure in the following case:

reverted_schema was:

    CREATE TABLE "sources" (
        id INTEGER NOT NULL,
        filesystem_id VARCHAR(96),
        journalist_designation VARCHAR(255) NOT NULL,
        flagged BOOLEAN,
        last_updated DATETIME,
        pending BOOLEAN,
        interaction_count INTEGER NOT NULL,
        PRIMARY KEY (id),
        CHECK (flagged IN (0, 1)),
        CHECK (pending IN (0, 1)),
        UNIQUE (filesystem_id)
    )

and original_schema was:

    CREATE TABLE sources (
        id INTEGER NOT NULL,
        filesystem_id VARCHAR(96),
        journalist_designation VARCHAR(255) NOT NULL,
        flagged BOOLEAN,
        last_updated DATETIME,
        pending BOOLEAN,
        interaction_count INTEGER NOT NULL,
        PRIMARY KEY (id),
        UNIQUE (filesystem_id),
        CHECK (flagged IN (0, 1)),
        CHECK (pending IN (0, 1))
    )

which fails for two reasons:

* The unique constraint on filesystem_id is not at the same line in the CREATE TABLE statement.
* The table name is quoted in one CREATE TABLE statement, but not in the other.

In order to make our tests a little more lenient in this case (and not produce spurious test failures), we should:

* Compare sorted lists consisting of the lines in each CREATE TABLE statement
* Strip commas and double quotes for each element in the aforementioned lists
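The two normalization steps described above can be sketched as follows. This is an illustrative version only: the helper names `normalize_schema` and `schemas_match` are hypothetical, and the schemas are shortened from the ones quoted in the commit message.

```python
def normalize_schema(create_statement):
    """Return the sorted, cleaned lines of a CREATE TABLE statement."""
    cleaned = []
    for line in create_statement.splitlines():
        # Strip commas and double quotes, then surrounding whitespace.
        line = line.replace(",", "").replace('"', "").strip()
        if line:
            cleaned.append(line)
    # Sorting makes the comparison independent of constraint order.
    return sorted(cleaned)


def schemas_match(first, second):
    return normalize_schema(first) == normalize_schema(second)


reverted_schema = '''CREATE TABLE "sources" (
    id INTEGER NOT NULL,
    filesystem_id VARCHAR(96),
    flagged BOOLEAN,
    PRIMARY KEY (id),
    CHECK (flagged IN (0, 1)),
    UNIQUE (filesystem_id)
)'''

original_schema = '''CREATE TABLE sources (
    id INTEGER NOT NULL,
    filesystem_id VARCHAR(96),
    flagged BOOLEAN,
    PRIMARY KEY (id),
    UNIQUE (filesystem_id),
    CHECK (flagged IN (0, 1))
)'''

print(schemas_match(reverted_schema, original_schema))
```

A plain string comparison of these two statements would fail, while the normalized comparison still detects genuine differences such as a changed column type.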

Note that Flask's send_file does include ETags by default, but they are not hashes, so less useful for verifying downloads were not corrupted after fetching over Tor. The ETag in Flask is:

```
rv.set_etag('%s-%s-%s' % (
    os.path.getmtime(filename),
    os.path.getsize(filename),
    adler32(
        filename.encode('utf-8') if isinstance(filename, text_type)
        else filename
    ) & 0xffffffff
))
```

https://github.com/pallets/flask/blob/161c43649d8c362c8359e0b79aeca40c754c5b51/flask/helpers.py#L616
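By contrast, a content-hash ETag lets a client recompute the digest after download and compare. The sketch below shows the general idea with SHA-256. The `sha256:` prefix and the helper name `sha256_etag` are assumptions for illustration, not necessarily what this PR implements.

```python
import hashlib
import os
import tempfile


def sha256_etag(path, chunk_size=65536):
    """Return an ETag-style string derived from the file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large submissions are not loaded into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


# Demo with a temporary file standing in for a stored submission.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example submission bytes")
    path = tmp.name

etag = sha256_etag(path)
os.remove(path)
print(etag)
```

A client that receives the same bytes can run the same hash locally and compare it with the header value to detect corruption in transit.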

redshiftzero · 2018-07-18T00:38:15Z

fixed, squashed (some of them), re-pushed 🚂

kushaldas · 2018-07-18T13:12:02Z

securedrop/journalist_app/api.py

+                seconds=TOKEN_EXPIRATION_MINS * 60)
+            response = jsonify({'token': journalist.generate_api_token(
+                 expiration=TOKEN_EXPIRATION_MINS * 60),
+                 'expiration': token_expiry.isoformat() + 'Z'})


As @redshiftzero mentioned, the timezone is always UTC, so the expiration value will look something like '2018-07-18T13:39:34.072044Z'.
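To illustrate the format, here is a small sketch of how such a timestamp can be produced and how a client might parse it back. The value of `TOKEN_EXPIRATION_MINS` and both helper names are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

TOKEN_EXPIRATION_MINS = 60  # hypothetical value, for illustration only


def expiration_timestamp(now=None):
    """Return the expiry time as a naive-UTC ISO 8601 string ending in 'Z'."""
    if now is None:
        now = datetime.now(timezone.utc).replace(tzinfo=None)
    token_expiry = now + timedelta(seconds=TOKEN_EXPIRATION_MINS * 60)
    return token_expiry.isoformat() + 'Z'


def parse_expiration(value):
    """Parse the server's timestamp into a naive datetime in UTC."""
    # isoformat() omits the fractional part when microseconds are zero,
    # so accept both shapes.
    for fmt in ('%Y-%m-%dT%H:%M:%S.%fZ', '%Y-%m-%dT%H:%M:%SZ'):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError('unrecognized expiration timestamp: %r' % value)


issued = datetime(2018, 7, 18, 12, 39, 34, 72044)
print(expiration_timestamp(issued))
```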

kushaldas · 2018-07-18T14:16:13Z

securedrop/journalist_app/api.py

+    @wraps(f)
+    def decorated_function(*args, **kwargs):
+        try:
+            auth_header = request.headers['Authorization']


It would be nice to have a one line comment on what that header looks like after this line.

kushaldas · 2018-07-18T14:17:24Z

securedrop/journalist_app/api.py

+            return abort(403, 'API token not found in Authorization header.')
+
+        if auth_header:
+            auth_token = auth_header.split(" ")[1]


What are all the valid values for the first part, before the space?
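One way to make the answer explicit is to validate the scheme rather than index into the split result, which raises IndexError when the header has no space. This is only a sketch: the thread does not state the expected scheme, so the `Token` default below and the helper name `extract_token` are assumptions.

```python
def extract_token(auth_header, scheme="Token"):
    """Return the token from an Authorization header, or None if malformed.

    The expected header shape is '<scheme> <token>', e.g. 'Token abc123'.
    """
    if not auth_header:
        return None
    parts = auth_header.split()
    # Require exactly two parts and the expected scheme.
    if len(parts) != 2 or parts[0] != scheme:
        return None
    return parts[1]


print(extract_token("Token abc123"))
print(extract_token("abc123"))
```

Callers can then respond with a 403 whenever the helper returns None, instead of relying on an exception from the split.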

kushaldas · 2018-07-18T14:33:27Z

securedrop/journalist_app/api.py

+        source = get_or_404(Source, source_uuid, column=Source.uuid)
+        utils.make_star_false(source.filesystem_id)
+        db.session.commit()
+        return jsonify({'message': 'Star removed'}), 200


In my test, I removed a star from a source and got back this reply: {'message': 'Star removed'}. After this, when I try to get all the sources or that particular source again, I can still see 'is_starred': True for that source.

very good catch! filing followup ticket to address

heartsucker

I didn't test manually on this one, but the code and associated tests look good. Very happy to give this the 💯

heartsucker · 2018-07-18T18:28:13Z

securedrop/tests/test_db.py

+    with journalist_app.app_context():
+        source = Source.query.first()
+        with pytest.raises(NotImplementedError):
+            source.public_key = 'a curious developer tries to set a pubkey!'
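The behavior this test exercises can be sketched with a plain Python property whose setter and deleter raise NotImplementedError. The `DemoSource` class below is a stand-in for illustration and is not the actual model code.

```python
class DemoSource(object):
    """Stand-in object exposing a read-only public_key attribute."""

    def __init__(self, key):
        self._key = key

    @property
    def public_key(self):
        return self._key

    @public_key.setter
    def public_key(self, value):
        # Assignment is deliberately unsupported.
        raise NotImplementedError

    @public_key.deleter
    def public_key(self):
        # Deletion is deliberately unsupported.
        raise NotImplementedError


source = DemoSource("example key material")
try:
    source.public_key = "a curious developer tries to set a pubkey!"
    set_rejected = False
except NotImplementedError:
    set_rejected = True
print(set_rejected)
```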


emkll

Overall this looks fantastic @redshiftzero , I did not encounter any major issues. I separated my testing in 2 parts, the app/api testing in a development VM, and upgrade testing in staging VMs.

For the application portion, I tested the authentication, rate limiting, and went through all the various methods. Everything works as expected. I've observed the following behavior:

POST with parameters to an API endpoint generates an error: 500: no JSON object could be decoded
Request body format: When making API calls with raw/text, I get the following error only with the message reply endpoint: "please send texts in valid json". Switching to the preferred application/json works fine, but it is curious that raw/text works for other methods.
There is no logout functionality. How complex would it be to implement some sort of logout functionality? For example, attaching the token to a user's session.
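For the first point, one possible approach is to treat an undecodable body as a client error rather than letting the exception surface as a 500. The sketch below is framework-agnostic and only illustrates the idea. The helper name, the 400 status code, and the error messages are assumptions, not what the follow-up actually implements.

```python
import json


def parse_json_body(raw_body):
    """Return (data, error), where error is None or (status_code, message)."""
    try:
        if isinstance(raw_body, bytes):
            raw_body = raw_body.decode("utf-8")
        data = json.loads(raw_body)
    except ValueError:
        # Covers both invalid UTF-8 and invalid or empty JSON.
        return None, (400, "request body must be valid JSON")
    if not isinstance(data, dict):
        return None, (400, "request body must be a JSON object")
    return data, None


print(parse_json_body(b'{"reply": "hello"}'))
print(parse_json_body(b"reply=hello"))
```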

For the upgrade part of the review, I provisioned staging on develop, then built debs on this branch and ran vagrant provision on this API branch. The code was updated in /var/www/securedrop, the migrations were successful, I observed the UUID column in the database, and the API is accessible over the authenticated Tor hidden service.

emkll · 2018-07-18T14:00:07Z

securedrop/journalist_app/api.py

+    return user
+
+
+def token_required(f):


Not sure if this is a good idea, but instead of calling the token_required decorator, we could use the before_request decorator (http://flask.pocoo.org/docs/1.0/api/#flask.Flask.before_request) on the login function and explicitly mark the public endpoints.

+1-ing that
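The shape of that suggestion is roughly "protected by default, public by explicit opt-in". The sketch below shows the idea without Flask so that it runs standalone; in Flask the check would be registered as a before_request hook on the app or blueprint. Every name here (`public`, `check_request`, the two views) is hypothetical.

```python
def public(view):
    """Mark a view as not requiring an API token."""
    view.is_public = True
    return view


def check_request(view, token, validate_token):
    """Return None to let the request proceed, or (status, message) to reject it."""
    if getattr(view, "is_public", False):
        return None
    # Anything not explicitly marked public requires a valid token.
    if token is None or not validate_token(token):
        return 403, "API token is invalid or expired."
    return None


@public
def get_token():
    return "token issued"


def get_all_sources():
    return "sources"


def is_good(token):
    return token == "good"


print(check_request(get_token, None, is_good))
print(check_request(get_all_sources, None, is_good))
print(check_request(get_all_sources, "good", is_good))
```

With this arrangement a developer who forgets to annotate a new view gets a protected endpoint rather than an exposed one.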

emkll · 2018-07-18T14:01:32Z

securedrop/models.py

+        s = TimedJSONWebSignatureSerializer(current_app.config['SECRET_KEY'])
+        try:
+            data = s.loads(token)
+        except BadData:


👍 , BadData catches all errors, including BadSignature

heartsucker · 2018-07-18T20:11:45Z

RE: @emkll

attaching the token to a user's session

A Flask session is a cookie, so we'd have to have API consumers use both the Authorization header and the correct Cookie. The latter would be sufficient on its own, which would make the former unnecessary.

emkll

I did another round of testing and everything looks good to me 👍. I will address the comment from my previous review (invalid/empty JSON in POST returns 500 from #3619 (review))

Thanks @redshiftzero and thanks @heartsucker for the thorough review.

redshiftzero added 11 commits June 22, 2018 21:23

Journalist API: Add initial Blueprint with root endpoint

4e90665

And an initial (pytest-based) unit test

Remove deprecated autoversion

8068543

Journalist API: Add endpoint to do API token auth

0a7fcbb

Journalist API: Use decorator for protected API endpoints

3402155

Journalist API: Add pytest fixture for journalist API token

efad309

Journalist API /sources/: placeholders and get_all_sources()

98d43d2

Test source pytest fixture: Add submissions for convenience

efe2726

Journalist API: Add security test cases for HTTP GETs

dac442c

Journalist API: Security test cases for HTTP DELETEs

79e5b60

redshiftzero requested review from kushaldas, a user, heartsucker and emkll June 29, 2018 20:45

redshiftzero requested review from conorsch and msheiny as code owners June 29, 2018 20:45

redshiftzero force-pushed the journalist-api-0.9.0 branch 2 times, most recently from 51cd067 to 6a2b8aa Compare June 29, 2018 21:03

redshiftzero added 11 commits June 29, 2018 14:08

Journalist API: Add security test cases for HTTP POSTs

f6a42c5

Journalist interface: Add 404 error handler

35d032e

Again, mostly for the journalist API, but users who are logged into the webapp will now see a custom error page with the regular SecureDrop styling instead of the default Flask page

Journalist API: Add single source endpoint (/sources/<int:id>)

d32da8d

Journalist API: Add star endpoint: /sources/<int:source_id>/star/

73e0aa7

Covers cases where:

* Source does not exist (404 response)
* Star is successfully added (201 response)

Journalist API: Remove star endpoint: /sources/<int:source_id>/star/

122664e

HTTP DELETE will remove the star

Journalist API: Catch itsdangerous.BadData only

8b425db

This was a bare except, but I think instead we want to handle only itsdangerous.BadData exceptions, which is a general exception that includes a bad signature and an expired one.

Journalist interface: Add 405 error handler

8ae05b1

We need to also have a nice error handler for method not allowed, that should return JSON for the API and an HTML page for users of the regular web application.

Journalist API: Get all submissions [/submissions/]

3504688

Journalist API: GET a source's submissions [/source/:id/submissions]

1324687

Journalist API: GET a single submission [/sources/:id/submissions/:id]

b4e9bb2

Journalist API: DELETE individual source submission

d2c1ce3

redshiftzero and others added 11 commits July 17, 2018 15:52

Journalist API: Create Source.uuid column

daecdd7

Bugfix: Upgrade test should be on old database, then load data

8f385a9

Just an off by one error

API docs: Update for UUID change

664be6e

Database migration: Add migration and tests for source UUID column

ebe1553

updated schema check to handle whitespace differences

b5447d7

(cherry picked from commit 1ce5455)

Journalist API: Also use UUID for submissions

900f3b3

To prevent confusion, we also rename `uuid` to `source_uuid`.

Remove unnecessary _insecure_api_views

5ab60f7

Journalist API tests: Ensure root endpoint exposes all endpoints

dc21b3c

redshiftzero force-pushed the journalist-api-0.9.0 branch from 207b4fa to dc21b3c Compare July 17, 2018 22:56

redshiftzero added 2 commits July 17, 2018 17:16

Indicate source.public_key setter and deleter are not implemented

1b852b0

Journalist API tests: Add submissions fixture

57dd181

I'm intentionally not using test_source in the test_submissions fixture as it is a bit messy/spaghetti for saving 1 LOC

kushaldas requested changes Jul 18, 2018

heartsucker previously approved these changes Jul 18, 2018

emkll reviewed Jul 18, 2018

Journalist API: Only show pending=False sources

e42c7f1

Sources should only be exposed to journalists when they have submitted something

redshiftzero dismissed heartsucker’s stale review via e42c7f1 July 18, 2018 23:36

heartsucker approved these changes Jul 21, 2018

emkll approved these changes Jul 24, 2018

emkll merged commit 31ddec8 into develop Jul 24, 2018

emkll mentioned this pull request Jul 24, 2018

CI / test failures on develop branch #3653

Closed

redshiftzero deleted the journalist-api-0.9.0 branch July 27, 2018 15:13

This was referenced Jul 27, 2018

Added passlib for more flexible password hashing #3506

Merged

journalist API: incorrect is_starred reported when starring and unstarring a source #3666

Closed

Fix source is_starred in journalist API #3667

Merged

rmol mentioned this pull request Oct 20, 2020

Add type annotations to secure_tempfile.py and models.py #5534

Merged
