Reduce latency between EC2-hosted database and Django containers #50
Comments
Idea: dig into psycopg2, thread safety, "library-friendly lock"
Interesting information: in this gunicorn bug report I spotted this info about the psycopg2 adapter, and I wonder if it is related:
I don't know squat about "green threads", so I'm hoping one of you fine folks recognizes whether this is relevant.
Idea: reduce the ALB Health Check timeout
This comment about a conceptually-similar timeout in Heroku makes this approach seem very promising.
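For anyone who wants to try the health check idea from a script rather than the console, here is a minimal sketch. It only builds the keyword arguments for the boto3 `elbv2` call `modify_target_group`; the helper name `health_check_kwargs`, the ARN, and the numbers are placeholders I made up, not values from our deployment.

```python
def health_check_kwargs(target_group_arn, timeout_seconds, interval_seconds):
    """Build keyword arguments for the elbv2 modify_target_group call.

    The ARN and the numeric values are supplied by the caller; nothing
    here is read from the real deployment.
    """
    # Sanity check: a timeout that is not shorter than the interval
    # between checks makes little sense, so refuse it up front.
    if timeout_seconds >= interval_seconds:
        raise ValueError("timeout should be shorter than the check interval")
    return {
        "TargetGroupArn": target_group_arn,
        "HealthCheckTimeoutSeconds": timeout_seconds,
        "HealthCheckIntervalSeconds": interval_seconds,
    }
```

With boto3 installed and credentials configured, the result could be passed along as `boto3.client("elbv2").modify_target_group(**kwargs)`. Which timeout value actually helps is something we would have to find by experiment.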
Idea: investigate the use of uWSGI
This comment is one anecdote to give us hope?
Summary
Backend Django API containers deployed to ECS are routinely and rapidly deemed "unhealthy" by the ALB and replaced with a new container, which fails the same way, ad infinitum.
Details
Generally speaking, the backend Django-hosting containers are not a healthy lot. While some will respond to HTTP requests (either to the Swagger root or to the API endpoints themselves), nearly all of them are in some state of disrepair/inability to service client requests consistently.
Potential Issue: database latency
Requests to the Budget database are incredibly slow for non-trivial endpoints, even when running via a local container and talking to the EC2-hosted PostgreSQL:
Oddly, parameterized (i.e. filtered) requests to these endpoints receive super-quick responses.
In the ECS environment, the containers aren't faring any better. In ECS, however, the database is effectively "across the Internet": the container app is configured to look for the DB on its external IP address, losing all the benefits of app and DB being hosted in the same AWS region.
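One way to make the DB address switchable per environment is to read it from environment variables in the Django settings. This is only a sketch: the variable names (`DB_HOST`, `DB_NAME`, `DB_PORT`, `DB_CONN_MAX_AGE`), the defaults, and the helper `database_settings` are assumptions, not what our settings file currently contains.

```python
import os


def database_settings(env=None):
    """Return a Django DATABASES dict built from environment variables.

    All variable names and defaults below are hypothetical.
    """
    if env is None:
        env = os.environ
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env.get("DB_NAME", "budget"),
            # In ECS this could be set to the database's private address
            # instead of its external IP.
            "HOST": env.get("DB_HOST", "localhost"),
            "PORT": env.get("DB_PORT", "5432"),
            # Reuse connections for a while instead of reconnecting
            # on every request.
            "CONN_MAX_AGE": int(env.get("DB_CONN_MAX_AGE", "60")),
        }
    }


# In settings.py this would be assigned at module level.
DATABASES = database_settings()
```

Whether the private address is reachable from the containers depends on the VPC and security group setup, which this sketch does not cover.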
Hell, submitting this request (/budget/history/?fiscal_year=2015-16) via the ECS container still 502'd, but when submitted through a local container, it responded after ~10 seconds.
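To compare the local and ECS numbers in a repeatable way, a small timing helper could look like the following. It is a sketch using only the standard library; the function name `time_request` and the `opener` parameter (there so the function can be exercised without a network) are my own additions.

```python
import time
import urllib.error
import urllib.request


def time_request(url, timeout=30.0, opener=urllib.request.urlopen):
    """Return (HTTP status, elapsed seconds) for a single GET request."""
    start = time.perf_counter()
    try:
        with opener(url, timeout=timeout) as response:
            response.read()  # include the time spent transferring the body
            status = response.status
    except urllib.error.HTTPError as err:
        # Error responses such as 502 arrive as exceptions; record the code.
        status = err.code
    return status, time.perf_counter() - start
```

For example, `time_request(base + "/budget/history/?fiscal_year=2015-16")` could be run once with `base` pointing at a local container and once at the ECS endpoint. The base URLs are left to the caller since they differ per setup, and connection failures or timeouts are deliberately not caught here.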
Possible fixes (discussed in #49)
If we had any experience with it to date, the "right" (though likely more costly) answer would be to start with (1) for as many projects as can tolerate it. That we have no experience with an RDS deployment means we're in danger of sinking days or weeks into figuring that deployment model out, when we have so many other critical tasks between now and Demo Day.
In the absence of (1), (2) sounds like the next-best option (though it adds more complexity to the branching setup we already have), and (3) seems least good but might be our last resort.