A super simple in-memory caching proxy for a single backing Redis instance.
- client: the client talks to the Proxy web service over HTTP -- this is left to the user.
- proxy: the Proxy web service talks to the Redis service over HTTP. It is a containerized Flask application using Gunicorn as the WSGI server to handle concurrency. For the purposes of this assignment we run a single worker and cap the number of threads at `2 * num_cores` on the proxy container. Spawning only one worker process keeps us consistent with our RedisCache; otherwise each worker process would hold its own RedisCache instance.
- RedisCache: the Proxy web service offloads most of the heavy lifting onto this RedisCache implementation. It is an LRU cache with a per-item time-to-live (TTL), built on the cachetools package. We override the `__missing__` method to perform a round trip to our backing Redis instance when a key is not currently in the cache (see the sketch after this list).
- e2e: a standalone end-to-end testing container, as called out in the requirements. It returns a non-zero exit code if tests fail. With more time I would have liked to write some unit tests for my RedisCache subclass of TTLCache; the original author has decent test coverage, so we should be OK for now.
- redis: vanilla containerized Redis -- `redis:latest`.
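For illustration, a minimal sketch of what the `__missing__` override could look like, assuming the `redis` Python client and the settings documented below (not the repository's exact code):

```python
import os

import redis
from cachetools import TTLCache


class RedisCache(TTLCache):
    """An LRU + TTL cache that falls back to a backing Redis instance on a miss."""

    def __init__(self, maxsize, ttl, redis_client):
        super().__init__(maxsize=maxsize, ttl=ttl)
        self._redis = redis_client

    def __missing__(self, key):
        # Called on lookup when the key is absent or expired locally:
        # round-trip to Redis, cache the result, and hand it back.
        value = self._redis.get(key)
        if value is None:
            raise KeyError(key)
        self[key] = value
        return value


# Hypothetical wiring using the settings documented below.
client = redis.Redis(host="redis", port=6379)
cache = RedisCache(
    maxsize=int(os.getenv("CACHE_MAXSIZE", 1024)),
    ttl=int(os.getenv("CACHE_TTL", 3600)),
    redis_client=client,
)
```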
Our configuration is set in `.env`.
- REDIS_PORT (integer) [required] [default: 6379]: a valid port number, `1 <= REDIS_PORT <= 65535`, for mapping the host port to the Redis container.
- PROXY_PORT (integer) [required] [default: 5000]: a valid port number, `1 <= PROXY_PORT <= 65535`, for mapping the host port to the Proxy web service container.
- CACHE_TTL (integer) [optional] [default: 3600]: per-item TTL setting for our RedisCache.
- CACHE_MAXSIZE (integer) [optional] [default: 1024]: maximum number of keys held in our RedisCache.
- MAX_PARALLEL (integer) [optional] [default: 1]: set to 1 for sequential processing of concurrent requests, or > 1 for parallel processing. For safety we cap this setting at max_threads.
- MAX_CONNECTIONS (integer) [optional] [default: 2048]: maximum number of concurrent connections to our web service at any given time.
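Putting those settings together, an illustrative `.env` populated with the documented defaults would look like:

```
REDIS_PORT=6379
PROXY_PORT=5000
CACHE_TTL=3600
CACHE_MAXSIZE=1024
MAX_PARALLEL=1
MAX_CONNECTIONS=2048
```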
- `./proxy/app.py`: defines the Flask application, instantiates a RedisCache, and sets the main route -- `/get/<key>` (see the sketch after this list).
- `./proxy/Dockerfile`: basic Dockerfile for the Gunicorn-backed Flask application.
- `./proxy/gunicorn.py`: defines the Gunicorn WSGI configuration.
- `./proxy/redis_cache.py`: extends the cachetools implementation of TTLCache, specifying custom behaviour when a key is not in the cache.
- `./.env`: single file for all configuration.
- `./docker-compose.yml`: defines our Proxy and Redis components. No need to set up any special networking, as the default network and communication between containers is sufficient.
- `Makefile`: defines `test` and `run`.
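As a rough sketch of how `app.py` and the `/get/<key>` route might fit together (module names and error handling here are assumptions, not the exact repository code):

```python
import os

import redis
from flask import Flask, abort

from redis_cache import RedisCache  # the TTLCache subclass sketched above

app = Flask(__name__)

cache = RedisCache(
    maxsize=int(os.getenv("CACHE_MAXSIZE", 1024)),
    ttl=int(os.getenv("CACHE_TTL", 3600)),
    redis_client=redis.Redis(host="redis", port=6379),
)


@app.route("/get/<key>")
def get_key(key):
    try:
        return cache[key]  # served from the in-memory cache, or Redis on a miss
    except KeyError:
        abort(404)         # key is neither cached nor present in Redis
```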
LRU
Under the hood, cachetools uses a dictionary mapping keys into a doubly linked list. Because insertion into and deletion from both a doubly linked list and a dictionary are O(1), the complexity of the LRU implementation is as follows:
- set: time -- O(1), space -- O(1)
- get: time -- O(1), space -- O(1)
- evict: time -- O(1), space -- O(1)
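A tiny illustration of the eviction behaviour, using cachetools' `LRUCache` directly:

```python
from cachetools import LRUCache

cache = LRUCache(maxsize=2)
cache["a"] = 1
cache["b"] = 2
_ = cache["a"]   # touching "a" marks it as most recently used
cache["c"] = 3   # exceeds maxsize, so "b" (least recently used) is evicted
assert "a" in cache and "b" not in cache
```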
TTL
This implementation of the TTL cache will attempt to remove all expired keys on every state mutation. Because in the worst case we may need to expire every key on any given mutation, the complexity of the TTL implementation is as follows:
- set: time -- O(n), space -- O(1)
- get: time -- O(n), space -- O(1)
- expire (single key): time -- O(1), space -- O(1)
In practice, when we take TTL's `set` and `get` and amortize them over time, we get an effective O(1) for each operation.
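A small sketch of the per-item expiry behaviour, using cachetools' `TTLCache` with an injected timer so the clock can be advanced manually:

```python
from cachetools import TTLCache

clock = 0
cache = TTLCache(maxsize=1024, ttl=3600, timer=lambda: clock)

cache["key"] = "value"
assert "key" in cache       # still within its ttl

clock = 3601                # advance the fake clock past the ttl
assert "key" not in cache   # the expired entry is no longer served
```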
Run tests
`make test`
Run proxy (and redis)
`make`
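As a rough, hedged sketch of what those targets might wrap (the repository's actual Makefile may differ):

```makefile
run:
	docker-compose up --build proxy redis

test:
	docker-compose up --build --abort-on-container-exit --exit-code-from e2e
```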
- documentation: 1hr
- proxy: 1.5hrs
- e2e: 1hr
- redis: n/a
- misc: 3hrs -- spent reading about best practices (e.g. Flask, Docker, max_threads) and certain trade-offs, e.g. treating e2e testing as its own service for extensibility, rather than setting an environment variable and running tests inside the application container.
- Some unit tests may have been helpful (inside the proxy service).
- RESP protocol support for the proxy