Cache CryptoUtil.getkey (redshiftzero's idea) #5100

Merged
merged 2 commits into develop from speedy-getkey on Jan 24, 2020

Conversation

@rmol rmol commented Jan 22, 2020

Status

Work in progress

Description of Changes

Adds caching to CryptoUtil.getkey to reduce the number of expensive GPG key lookup operations. It uses CryptoUtil.keycache, an OrderedDict, so we can push out the oldest items once we reach the cache size limit. Using functools.lru_cache would have taken care of that, but it would also have cached lookups for sources without keys, so a delay in key generation would have left that source's key unusable until the server was restarted.

The cache is primed in securedrop/journalist.py to avoid cold starts.
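
For illustration, here is a minimal sketch of the caching approach described above: a bounded OrderedDict that evicts its oldest entry when full and never caches a failed lookup. The names below (KeyCacheSketch, KEY_CACHE_SIZE, _lookup_fingerprint) are assumptions made for this sketch, not the actual crypto_util API.

from collections import OrderedDict

KEY_CACHE_SIZE = 1000  # assumed to match the limit mentioned under Deployment


class KeyCacheSketch:
    """Bounded fingerprint cache; evicts the oldest entry when full.

    Unlike functools.lru_cache, a lookup that finds no key is not
    cached, so a source whose key is still being generated will be
    retried on the next request instead of being stuck until restart.
    """

    def __init__(self, gpg):
        self.gpg = gpg
        self.keycache = OrderedDict()

    def getkey(self, name):
        if name in self.keycache:
            return self.keycache[name]

        fingerprint = self._lookup_fingerprint(name)  # the expensive GPG keyring scan
        if fingerprint is not None:
            self.keycache[name] = fingerprint
            if len(self.keycache) > KEY_CACHE_SIZE:
                self.keycache.popitem(last=False)  # drop the oldest entry
        return fingerprint

    def _lookup_fingerprint(self, name):
        # Stand-in for the real lookup: scan the keyring for a UID
        # containing the source's identifier.
        for key in self.gpg.list_keys():
            if any(name in uid for uid in key["uids"]):
                return key["fingerprint"]
        return None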

Credit to @redshiftzero for the solution.

Testing

  • Check out this branch.

  • Save this script as securedrop/securedrop/getkeytest.py:

#!/usr/bin/env python3

import time

import pyotp
import requests


def api_url(path):
    return "http://localhost:8081/api/v1{}".format(path)


def get_all_sources(headers):
    # Time one GET /api/v1/sources request; each source's GPG key is
    # looked up while the response is built.
    start = time.perf_counter()
    get_all_sources_response = requests.get(api_url("/sources"), headers=headers)
    elapsed = time.perf_counter() - start
    sources = get_all_sources_response.json()["sources"]
    print("get_all_sources with {:d} sources took {:.2f} seconds".format(len(sources), elapsed))


if __name__ == "__main__":
    # Log in with the default development credentials, then time two
    # identical requests for the source list.
    token_data = {
        "username": "journalist",
        "passphrase": "correct horse battery staple profanity oil chewy",
        "one_time_code": pyotp.TOTP("JHCOGO7VCER3EJ4L").now(),
    }
    token_response = requests.post(api_url("/token"), json=token_data).json()
    headers = {
        "Authorization": "Token {}".format(token_response["token"])
    }

    # Call twice so the difference between a cold (unprimed) cache and a
    # warm cache shows up in the timings.
    get_all_sources(headers)
    get_all_sources(headers)
  • export NUM_SOURCES=50
  • make dev

The dev server will take considerably longer to start than usual, as it creates the extra sources.

In another shell, once the server has finished populating the database and is fully ready:

  • docker exec -it securedrop-dev bash
  • /opt/venvs/securedrop-app-code/bin/python3 getkeytest.py # this is run in the container

You should see output like this:

get_all_sources with 50 sources took 0.47 seconds
get_all_sources with 50 sources took 0.47 seconds

Now edit securedrop/journalist.py to comment out the call to prime_keycache() on line 22, and back in the container shell, run getkeytest.py again. If you get a KeyError about "token", you re-ran the script too quickly and your login attempt has been throttled. Wait 10-15 seconds and try again.

You should see output like this:

get_all_sources with 50 sources took 1.59 seconds
get_all_sources with 50 sources took 0.50 seconds

Note that this time the first call to get_all_sources is slower, as the cache hasn't been primed.
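
For reference, the priming step at startup could look roughly like the following. The real prime_keycache() lives in securedrop/journalist.py; the signature and body here are assumptions based on the description above, not the actual code.

def prime_keycache(crypto_util, source_filesystem_ids):
    """Warm the key cache before the first journalist request arrives."""
    for filesystem_id in source_filesystem_ids:
        # getkey() caches the fingerprint on success and skips caching
        # misses, so sources whose keys aren't generated yet will simply
        # be looked up again on a later request.
        crypto_util.getkey(filesystem_id)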

Deployment

The cache will increase memory consumption of both the source and journalist interfaces, but it is limited to 1000 keys, so it should not be a problem.
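
As a rough sanity check of that claim, assuming each cached value is a 40-character hex fingerprint (actual sizes depend on the Python version and on whether fingerprints or larger key objects are cached):

import sys
from collections import OrderedDict

cache = OrderedDict(("source-{}".format(i), "A" * 40) for i in range(1000))

per_entry = sys.getsizeof("source-0") + sys.getsizeof("A" * 40)
print("container overhead: ~{} KiB".format(sys.getsizeof(cache) // 1024))
print("keys and values:    ~{} KiB".format(per_entry * len(cache) // 1024))
# Well under a megabyte in total for 1000 entries.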

Checklist

If you made changes to the server application code:

  • Linting (make lint) and tests (make test) pass in the development container

If you made non-trivial code changes:

  • I have written a test plan and validated it for this PR

@rmol rmol changed the title Cache crypto_util.getkey (redshiftzero's idea) Cache CryptoUtil.getkey (redshiftzero's idea) Jan 22, 2020
@rmol rmol self-assigned this Jan 22, 2020
@redshiftzero

this all looks great! way faster. one thought: check out d04cdd4 which just encapsulates the cache so we can add a test (mostly for documentation purposes so it's super clear to future maintainers what's going on). if you like that you can cherry pick onto this branch or i can push directly
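
If it helps to picture it, a test made possible by that encapsulation might look roughly like the one below, reusing the KeyCacheSketch from the sketch above together with a minimal fake keyring. All names here are assumptions; the actual test in d04cdd4 may differ.

class FakeGPG:
    """Minimal stand-in for the GPG wrapper used by the sketch above."""

    def __init__(self, keys):
        self._keys = keys  # maps source identifier -> fingerprint

    def list_keys(self):
        return [{"uids": [name], "fingerprint": fpr} for name, fpr in self._keys.items()]


def test_keycache_skips_missing_keys():
    cache = KeyCacheSketch(gpg=FakeGPG(keys={}))
    assert cache.getkey("codename") is None
    assert "codename" not in cache.keycache  # a miss is not cached

    cache.gpg = FakeGPG(keys={"codename": "ABCD" * 10})
    assert cache.getkey("codename") == "ABCD" * 10
    assert "codename" in cache.keycache  # a hit is cached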

@rmol rmol commented Jan 24, 2020

Yeah, much nicer.

@redshiftzero redshiftzero left a comment

thanks! gonna approve from my side, i'll leave open in case anyone has any thoughts on the below and merge sometime tomorrow otherwise

@redshiftzero redshiftzero mentioned this pull request Jan 24, 2020
@emkll emkll commented Jan 24, 2020

LGTM based on visual review: CryptoUtil.getkey is only exposed to the Source Interface in one place, the "flag for reply" flow [1]. This logic should rarely (if ever) be hit, and will be removed soon, per [2]. Therefore, the caching introduced here should only affect the Journalist Interface, and should not have any effect on the distinguishability of sources (re)visiting the Source Interface through timing.

[1] : https://github.com/freedomofpress/securedrop/blob/speedy-getkey/securedrop/source_app/main.py#L133
[2] : #1584 (comment)

@redshiftzero redshiftzero merged commit 08ab115 into develop Jan 24, 2020
@redshiftzero redshiftzero deleted the speedy-getkey branch January 24, 2020 16:58
@eloquence eloquence added this to the 1.2.1 milestone Jan 28, 2020
@kushaldas kushaldas mentioned this pull request Feb 18, 2020
@rmol rmol mentioned this pull request Apr 1, 2020