Added caching to the LinkRedirectRepository #20036

@cmraible cmraible commented Apr 18, 2024

ref https://linear.app/tryghost/issue/ENG-851/implement-a-minimal-but-complete-version-of-redirect-caching-to
ref https://app.incident.io/ghost/incidents/55

Often immediately after sending an email, sites receive a large volume of requests to redirect endpoints from members clicking on the links in the email.

We currently don't cache any of these requests in our CDN because Ghost also records click events, updates the member's `last_seen_at` timestamp, and sends webhooks in response to these clicks, so Ghost needs to handle each request itself. As a result, every redirect request hits Ghost, and every one currently queries the database to look up where to redirect the member.

Each of these requests can make up to 11 database queries, which can quickly exhaust Ghost's database connection pool. Even though the redirect lookup query is cheap and quick, it isn't prioritized over the "record" queries Ghost needs to handle, so it can get stuck behind other queries in the queue and eventually time out.

The result is that members either can't reach the destination of the link they clicked, receiving a 500 error from Ghost instead, or wait a long time (60s+) for the redirect to happen.

This PR uses our existing `adapterManager` to cache the redirect lookups either in-memory or in Redis (if configured). This only removes 1 out of 11 queries per redirect request, so it won't reduce the load on the DB drastically, but it at least decouples the serving of the redirect from the DB so the member can be redirected even if the DB is under heavy load.
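To illustrate the idea, here is a minimal cache-aside sketch. It is not the code in this PR: `RedirectCache`, `InMemoryRedirectCache`, `DbLookup`, and `CachedRedirectLookup` are hypothetical names, and the real change goes through Ghost's adapter layer rather than a hand-rolled `Map`.

```typescript
// Hypothetical interface: any cache backend (in-memory, Redis, ...) could implement it.
interface RedirectCache {
    get(key: string): Promise<string | undefined>;
    set(key: string, value: string): Promise<void>;
}

// Simple in-memory backend, used here only for illustration.
class InMemoryRedirectCache implements RedirectCache {
    private store = new Map<string, string>();

    async get(key: string): Promise<string | undefined> {
        return this.store.get(key);
    }

    async set(key: string, value: string): Promise<void> {
        this.store.set(key, value);
    }
}

// Stand-in for the database query that resolves a redirect path to its destination.
type DbLookup = (fromPath: string) => Promise<string | undefined>;

class CachedRedirectLookup {
    private cache: RedirectCache;
    private dbLookup: DbLookup;

    constructor(cache: RedirectCache, dbLookup: DbLookup) {
        this.cache = cache;
        this.dbLookup = dbLookup;
    }

    async getDestination(fromPath: string): Promise<string | undefined> {
        // Cache hit: answer without touching the database.
        const cached = await this.cache.get(fromPath);
        if (cached !== undefined) {
            return cached;
        }

        // Cache miss: fall back to the database and remember the answer.
        const destination = await this.dbLookup(fromPath);
        if (destination !== undefined) {
            await this.cache.set(fromPath, destination);
        }
        return destination;
    }
}
```

Once a link has been looked up, later clicks on it are served from the cache, so the redirect itself no longer waits behind the "record" queries in the connection pool.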

In local load testing at 500 requests per second, response times for the redirect requests dropped from 60 seconds to ~50ms, and the 500 error rate fell to 0.
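As a rough back-of-the-envelope check on why the DB load barely changes, the figures from this description can be combined as below. This assumes the worst case, where every request issues all 11 queries and every redirect lookup is a cache hit; real traffic will differ.

```typescript
// Figures from the description: up to 11 queries per redirect request,
// and 500 requests per second in the local load test.
const requestsPerSecond = 500;
const maxQueriesPerRequest = 11;
const cachedQueriesPerRequest = 1; // the redirect lookup, now served from cache

// Worst-case query rate before and after caching the lookup.
const peakQueriesBefore = requestsPerSecond * maxQueriesPerRequest;
const peakQueriesAfter = requestsPerSecond * (maxQueriesPerRequest - cachedQueriesPerRequest);
const reductionPercent = (1 - peakQueriesAfter / peakQueriesBefore) * 100;

console.log({
    peakQueriesBefore, // 5500
    peakQueriesAfter, // 5000
    reductionPercent: reductionPercent.toFixed(1) // "9.1"
});
```

Under those assumptions the query volume falls by only about 9%, so the improvement comes from taking the redirect lookup off the contended connection pool rather than from lower overall load.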

@cmraible cmraible force-pushed the chris-eng-851-implement-a-minimal-but-complete-version-of-redirect-caching branch from b6f2123 to b0884c1 Compare April 19, 2024 04:52
@cmraible cmraible force-pushed the chris-eng-851-implement-a-minimal-but-complete-version-of-redirect-caching branch 3 times, most recently from 860ff20 to 17a987d Compare April 25, 2024 23:52
@github-actions github-actions bot added affects:admin Anything relating to Ghost Admin affects:portal labels Apr 25, 2024
@cmraible cmraible force-pushed the chris-eng-851-implement-a-minimal-but-complete-version-of-redirect-caching branch from 17a987d to b790c53 Compare April 26, 2024 00:04
@github-actions github-actions bot removed affects:admin Anything relating to Ghost Admin affects:portal labels Apr 26, 2024
@cmraible cmraible merged commit dcd65bf into TryGhost:main Apr 26, 2024
22 checks passed