Cache CryptoUtil.getkey (redshiftzero's idea) #5100

Merged
merged 2 commits into develop from speedy-getkey on Jan 24, 2020

Conversation

@rmol rmol commented Jan 22, 2020

Status

Work in progress

Description of Changes

Adds caching to CryptoUtil.getkey to reduce the number of expensive GPG key lookup operations. It uses CryptoUtil.keycache, an OrderedDict, so we can push out the oldest items once we reach the cache size limit. Using functools.lru_cache would have taken care of that, but it would also have cached lookups for sources without keys, so a delay in key generation would have left that source's key unusable until the server was restarted.

The cache is primed in securedrop/journalist.py to avoid cold starts.
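
For illustration, here is a minimal sketch of the caching approach described above: a bounded OrderedDict that evicts its oldest entry when full and never caches a failed lookup. The names below (KeyCacheSketch, KEY_CACHE_SIZE, _lookup_fingerprint) are assumptions made for this sketch, not the actual crypto_util API.

from collections import OrderedDict

KEY_CACHE_SIZE = 1000  # assumed to match the limit mentioned under Deployment


class KeyCacheSketch:
    """Bounded fingerprint cache; evicts the oldest entry when full.

    Unlike functools.lru_cache, a lookup that finds no key is not
    cached, so a source whose key is still being generated will be
    retried on the next request instead of being stuck until restart.
    """

    def __init__(self, gpg):
        self.gpg = gpg
        self.keycache = OrderedDict()

    def getkey(self, name):
        if name in self.keycache:
            return self.keycache[name]

        fingerprint = self._lookup_fingerprint(name)  # the expensive GPG keyring scan
        if fingerprint is not None:
            self.keycache[name] = fingerprint
            if len(self.keycache) > KEY_CACHE_SIZE:
                self.keycache.popitem(last=False)  # drop the oldest entry
        return fingerprint

    def _lookup_fingerprint(self, name):
        # Stand-in for the real lookup: scan the keyring for a UID
        # containing the source's identifier.
        for key in self.gpg.list_keys():
            if any(name in uid for uid in key["uids"]):
                return key["fingerprint"]
        return None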

Credit to @redshiftzero for the solution.

Testing

  • Check out this branch.

  • Save this script as securedrop/securedrop/getkeytest.py:

#!/usr/bin/env python3

import time

import pyotp
import requests


def api_url(path):
    return "http://localhost:8081/api/v1{}".format(path)


def get_all_sources(headers):
    # Time one GET /api/v1/sources request; each source's GPG key is
    # looked up while the response is built.
    start = time.perf_counter()
    get_all_sources_response = requests.get(api_url("/sources"), headers=headers)
    elapsed = time.perf_counter() - start
    sources = get_all_sources_response.json()["sources"]
    print("get_all_sources with {:d} sources took {:.2f} seconds".format(len(sources), elapsed))


if __name__ == "__main__":
    # Log in with the default development credentials, then time two
    # identical requests for the source list.
    token_data = {
        "username": "journalist",
        "passphrase": "correct horse battery staple profanity oil chewy",
        "one_time_code": pyotp.TOTP("JHCOGO7VCER3EJ4L").now(),
    }
    token_response = requests.post(api_url("/token"), json=token_data).json()
    headers = {
        "Authorization": "Token {}".format(token_response["token"])
    }

    # Call twice so the difference between a cold (unprimed) cache and a
    # warm cache shows up in the timings.
    get_all_sources(headers)
    get_all_sources(headers)
  • export NUM_SOURCES=50
  • make dev

The dev server will take considerably longer to start than usual, as it creates the extra sources.

In another shell, once the server has finished populating the database and is fully ready:

  • docker exec -it securedrop-dev bash
  • /opt/venvs/securedrop-app-code/bin/python3 getkeytest.py # this is run in the container

You should see output like this:

get_all_sources with 50 sources took 0.47 seconds
get_all_sources with 50 sources took 0.47 seconds

Now edit securedrop/journalist.py to comment out the call to prime_keycache() on line 22, and back in the container shell, run getkeytest.py again. If you get a KeyError about "token", you re-ran the script too quickly and your login attempt has been throttled. Wait 10-15 seconds and try again.

You should see output like this:

get_all_sources with 50 sources took 1.59 seconds
get_all_sources with 50 sources took 0.50 seconds

Note that this time the first call to get_all_sources is slower, as the cache hasn't been primed.
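
For reference, the priming step at startup could look roughly like the following. The real prime_keycache() lives in securedrop/journalist.py; the signature and body here are assumptions based on the description above, not the actual code.

def prime_keycache(crypto_util, source_filesystem_ids):
    """Warm the key cache before the first journalist request arrives."""
    for filesystem_id in source_filesystem_ids:
        # getkey() caches the fingerprint on success and skips caching
        # misses, so sources whose keys aren't generated yet will simply
        # be looked up again on a later request.
        crypto_util.getkey(filesystem_id)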

Deployment

The cache will increase memory consumption of both the source and journalist interfaces, but it is limited to 1000 keys, so it should not be a problem.
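
As a rough sanity check of that claim, assuming each cached value is a 40-character hex fingerprint (actual sizes depend on the Python version and on whether fingerprints or larger key objects are cached):

import sys
from collections import OrderedDict

cache = OrderedDict(("source-{}".format(i), "A" * 40) for i in range(1000))

per_entry = sys.getsizeof("source-0") + sys.getsizeof("A" * 40)
print("container overhead: ~{} KiB".format(sys.getsizeof(cache) // 1024))
print("keys and values:    ~{} KiB".format(per_entry * len(cache) // 1024))
# Well under a megabyte in total for 1000 entries.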

Checklist

If you made changes to the server application code:

  • Linting (make lint) and tests (make test) pass in the development container

If you made non-trivial code changes:

  • I have written a test plan and validated it for this PR

@rmol rmol changed the title Cache crypto_util.getkey (redshiftzero's idea) Cache CryptoUtil.getkey (redshiftzero's idea) Jan 22, 2020
@rmol rmol self-assigned this Jan 22, 2020
@redshiftzero

this all looks great! way faster. one thought: check out d04cdd4 which just encapsulates the cache so we can add a test (mostly for documentation purposes so it's super clear to future maintainers what's going on). if you like that you can cherry pick onto this branch or i can push directly
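
If it helps to picture it, a test made possible by that encapsulation might look roughly like the one below, reusing the KeyCacheSketch from the sketch above together with a minimal fake keyring. All names here are assumptions; the actual test in d04cdd4 may differ.

class FakeGPG:
    """Minimal stand-in for the GPG wrapper used by the sketch above."""

    def __init__(self, keys):
        self._keys = keys  # maps source identifier -> fingerprint

    def list_keys(self):
        return [{"uids": [name], "fingerprint": fpr} for name, fpr in self._keys.items()]


def test_keycache_skips_missing_keys():
    cache = KeyCacheSketch(gpg=FakeGPG(keys={}))
    assert cache.getkey("codename") is None
    assert "codename" not in cache.keycache  # a miss is not cached

    cache.gpg = FakeGPG(keys={"codename": "ABCD" * 10})
    assert cache.getkey("codename") == "ABCD" * 10
    assert "codename" in cache.keycache  # a hit is cached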

@rmol rmol commented Jan 24, 2020

Yeah, much nicer.

@redshiftzero redshiftzero left a comment

thanks! gonna approve from my side, i'll leave open in case anyone has any thoughts on the below and merge sometime tomorrow otherwise

@redshiftzero redshiftzero mentioned this pull request Jan 24, 2020
@emkll emkll commented Jan 24, 2020

LGTM based on visual review: CryptoUtil.getkey is only exposed to the Source Interface in one place, the "flag for reply" flow [1]. This logic should rarely (if ever) be hit, and will be removed soon, per [2]. Therefore, the caching introduced here should only affect the Journalist Interface, and should not have any effect on the distinguishability of sources (re)visiting the Source Interface through timing.

[1] : https://github.com/freedomofpress/securedrop/blob/speedy-getkey/securedrop/source_app/main.py#L133
[2] : #1584 (comment)

@redshiftzero redshiftzero merged commit 08ab115 into develop Jan 24, 2020
@redshiftzero redshiftzero deleted the speedy-getkey branch January 24, 2020 16:58
@eloquence eloquence added this to the 1.2.1 milestone Jan 28, 2020
@kushaldas kushaldas mentioned this pull request Feb 18, 2020
@rmol rmol mentioned this pull request Apr 1, 2020