Reduce latency between EC2-hosted database and Django containers #50

MikeTheCanuck · 2017-03-31T00:15:38Z

Summary

Backend Django API containers deployed to ECS are routinely and rapidly deemed "unhealthy" by the ALB and replaced with a new container, which fails the same way, ad infinitum.

Details

Generally speaking, the backend Django-hosting containers are not a healthy lot. While some will respond to HTTP requests (either to the Swagger root or to the API endpoints themselves), nearly all of them are in some state of disrepair and cannot service client requests consistently.

Potential Issue: database latency

Requests to the Budget database are incredibly slow for non-trivial endpoints, even when running via a local container and talking to the EC2-hosted PostgreSQL:

502/504 errors for the /ocrb/ and /history/ endpoints when no parameters are submitted
5-15 second response time to the /code/ and /kpm/ endpoints

Oddly, parameterized (i.e. filtered) requests to these endpoints receive super-quick responses.

In the ECS environment, the containers aren't faring any better. There, however, the database is "across the Internet": the container app is configured to look for the DB on its external IP address, which loses all the benefits of hosting both app and DB in the same AWS region.

Hell, submitting this request (/budget/history/?fiscal_year=2015-16) via the ECS container still 502'd, but when submitted through a local container, it responded after ~10 seconds.
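
To put numbers on the filtered vs. unfiltered difference from both environments, a small timing helper might be enough. This is a minimal sketch using only the standard library; the function names are made up for illustration, and the base URL you pass in is whatever host you're testing against (local container or the ALB).

```python
import time
import urllib.error
import urllib.request


def time_request(url, timeout=30.0, opener=urllib.request.urlopen):
    """Return (status, elapsed_seconds) for a single GET to `url`.

    `status` is the HTTP status code when the server answered (including
    502/504), or the exception class name when the request failed outright.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            response.read()  # include body transfer in the measurement
            status = response.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with an error status
    except Exception as exc:  # broad on purpose: we only want to record it
        status = type(exc).__name__
    return status, time.monotonic() - start


def compare(base_url, paths, **kwargs):
    """Time each path under base_url and return {path: (status, seconds)}."""
    return {path: time_request(base_url + path, **kwargs) for path in paths}
```

For example, `compare(base, ["/budget/history/", "/budget/history/?fiscal_year=2015-16"])` run once against a local container and once against ECS would show whether the gap comes from the query itself or from the network path to the database.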

Possible fixes (discussed in #49)

1. Move to RDS
2. Route from app to DB via private IP addresses in a single VPC
3. Host the PostgreSQL database in an adjacent container

If we had any experience with it to date, the "right" (though likely more costly) answer would be to start with (1) for as many projects as can tolerate it. Since we have no experience with an RDS deployment, we're in danger of sinking days or weeks into figuring that deployment model out, when we have so many other critical tasks between now and Demo Day.

In the absence of (1), (2) sounds like the next-best option (though it adds more complexity to the branching setup we already have), and (3) seems least-good but might be our last resort.
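
For the private-IP route, the Django side could be as small as reading the DB host from the container's environment, so that ECS gets the VPC-internal address while local containers keep whatever they use today. A minimal sketch, assuming hypothetical `DB_*` variable names and a placeholder database name (neither comes from our actual config):

```python
import os


def build_databases(env=os.environ):
    """Build Django's DATABASES setting from environment variables.

    The DB_* names and the "budget" default are placeholders; use whatever
    the deployment already injects into the container.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env.get("DB_NAME", "budget"),
            "USER": env.get("DB_USER", ""),
            "PASSWORD": env.get("DB_PASSWORD", ""),
            # In ECS, set this to the instance's private (VPC-internal) address.
            "HOST": env.get("DB_HOST", "localhost"),
            "PORT": env.get("DB_PORT", "5432"),
            # 0 is Django's default (new connection per request); a positive
            # value keeps connections open for that many seconds.
            "CONN_MAX_AGE": int(env.get("DB_CONN_MAX_AGE", "0")),
        }
    }


# In settings.py:
DATABASES = build_databases()
```

The private address only helps if the ECS tasks and the EC2 instance share a VPC and the instance's security group allows the PostgreSQL port from the tasks, so that part still has to be sorted out on the AWS side.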

MikeTheCanuck · 2017-03-31T01:14:31Z

Idea: dig into psycopg2, thread safety, "library-friendly lock"

Interesting information: in this gunicorn bug report I spotted the following about the psycopg2 adapter, and I wonder whether it's related:

Following your pointer, I had a look at the psycopg2 adapter - which we use to connect our Django app to Postgres - and discovered this section of the documentation which states:

Warning: Psycopg connections are not green thread safe and can’t be used concurrently by different green threads. Trying to execute more than one command at time using one cursor per thread will result in an error (or a deadlock on versions before 2.4.2).

Therefore, programmers are advised to either avoid sharing connections between coroutines or to use a library-friendly lock to synchronize shared connections, e.g. for pooling.

In other words - psycopg2 doesn't like green threads. Based on the behaviour we encountered, I would guess that this is the source of the error. The suggested way to deal with this issue, according to the psycopg docs, is to use a library which enables psycopg support for coroutines.

The recommended library is psycogreen.

I don't know squat about "green threads" so I'm hoping one of you fine folks recognizes whether this is relevant.
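
For reference, wiring psycogreen into gunicorn might look roughly like the sketch below. It only matters if gunicorn is running green-thread workers (gevent here); whether ours is isn't stated anywhere in this thread, so treat the `worker_class` line and the config file name as assumptions.

```python
# gunicorn.conf.py (hypothetical file name and location)

# Only relevant with green-thread workers; with gunicorn's default "sync"
# worker class there are no green threads sharing a connection.
worker_class = "gevent"


def post_fork(server, worker):
    """gunicorn server hook: runs in each worker process right after fork."""
    # Imported here so the patch is applied inside the worker process and
    # this config file still loads where psycogreen isn't installed.
    from psycogreen.gevent import patch_psycopg

    patch_psycopg()  # registers a gevent-aware wait callback with psycopg2
    worker.log.info("psycopg2 patched for gevent")
```

If we're on sync workers, none of this applies and the green-thread warning is probably a red herring for our case.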

MikeTheCanuck · 2017-03-31T01:34:11Z

Idea: reduce the ALB Health Check timeout

This comment about a conceptually-similar timeout in Heroku makes this approach seem very promising.
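
If we want to script the change rather than click through the console, boto3's `elbv2` client has `modify_target_group`, which accepts the health check settings. A rough sketch, with the helper names made up for illustration and the target group ARN left for the caller to supply; the timeout-shorter-than-interval check reflects my understanding of the ALB rules and should be verified against the AWS docs.

```python
def health_check_params(target_group_arn, timeout_seconds, interval_seconds):
    """Build keyword arguments for elbv2 modify_target_group."""
    # Assumption: ALB wants the timeout to be shorter than the interval.
    if timeout_seconds >= interval_seconds:
        raise ValueError("health check timeout must be shorter than the interval")
    return {
        "TargetGroupArn": target_group_arn,
        "HealthCheckTimeoutSeconds": timeout_seconds,
        "HealthCheckIntervalSeconds": interval_seconds,
    }


def apply_health_check(params):
    """Send the change to AWS; needs boto3 and configured credentials."""
    import boto3  # imported lazily so the helper above works without it

    return boto3.client("elbv2").modify_target_group(**params)
```

Which values to use is a separate question; the sketch doesn't pick any, it just sends whatever is passed in.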

MikeTheCanuck · 2017-03-31T01:36:05Z

Idea: investigate the use of uWSGI

This comment is one anecdote to give us hope?

MikeTheCanuck changed the title ~~Database latency leading to unhealthy containers?~~ Reduce latency between database and Django containers Apr 1, 2017

MikeTheCanuck changed the title ~~Reduce latency between database and Django containers~~ Reduce latency between EC2-hosted database and Django containers Apr 1, 2017
