Essentially, this is the code in our server that prevents "junk requests": scripted HTTP requests to endpoints that are not made by regular browser users.
For example, there's middleware code that checks whether a GET request comes in with a bunch of random-looking query string keys. Such a request would cause a PASS on the CDN even though the extra keys make no difference to the rendering. In this case, we spot it early and return a redirect response to the same URL without the unrecognized query string keys. If the client follows redirects, the eventual 200 is then served from a common, normalized URL, so the CDN can serve a HIT.
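As a rough illustration of the redirect described above, here is a minimal sketch. It is not the actual middleware: the allow-list contents, the function names, the minimal request/response shapes, and the 302 status are all assumptions made for the example.

```typescript
// Hypothetical allow-list. The real set of recognized keys is not listed in this document.
const RECOGNIZED_KEYS: ReadonlySet<string> = new Set(["apiVersion", "tool", "query", "page"]);

// Returns the normalized URL to redirect to, or null when no redirect is needed.
function normalizedRedirectTarget(
  method: string,
  originalUrl: string,
  recognized: ReadonlySet<string> = RECOGNIZED_KEYS,
): string | null {
  if (method !== "GET") return null;
  const queryStart = originalUrl.indexOf("?");
  if (queryStart === -1) return null;

  const pathname = originalUrl.slice(0, queryStart);
  const incoming = new URLSearchParams(originalUrl.slice(queryStart + 1));
  const kept = new URLSearchParams();
  const dropped: string[] = [];
  incoming.forEach((value, key) => {
    if (recognized.has(key)) kept.append(key, value);
    else dropped.push(key);
  });

  // Only redirect when at least one key was dropped, otherwise we would redirect forever.
  if (dropped.length === 0) return null;
  const query = kept.toString();
  return query ? `${pathname}?${query}` : pathname;
}

// Minimal stand-ins for the parts of the Express request/response used here.
interface MinimalRequest {
  method: string;
  originalUrl: string;
}
interface MinimalResponse {
  redirect(status: number, url: string): void;
}

// Express-style handler: redirect junk query strings, pass everything else through.
function handleUnknownQueryKeys(req: MinimalRequest, res: MinimalResponse, next: () => void): void {
  const target = normalizedRedirectTarget(req.method, req.originalUrl);
  if (target === null) {
    next();
    return;
  }
  res.redirect(302, target); // 302 is an assumption; the text only says "a redirect response"
}
```

Keeping the decision in a pure function makes it easy to unit test without starting a server, and the handler stays a thin wrapper around it.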
Here's a point-in-time discussion post that summarizes the need for this work and much of what we've recently done to fortify our backend servers against unnecessary workloads:
How we have fortified Docs for better resiliency and availability (June 2023)
At its root, src/shielding/frame/middleware/index.ts is injected into our Express server. From there, it loads all its individual middleware handlers.
Each middleware is one file that focuses on a single use case. The use cases are born from studying log files to spot patterns of request abuse.
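To make the "one file per use case" layout concrete, here is a dependency-free sketch of chaining small handlers into a single one. In the real server this is done with Express; the `combine()` helper, the `Req`/`Res` shapes, and both example handlers below are hypothetical and exist only for illustration.

```typescript
// Minimal stand-ins for the Express request/response types.
interface Req {
  method: string;
  path: string;
}
interface Res {
  status(code: number): Res;
  send(body: string): void;
}
type Handler = (req: Req, res: Res, next: () => void) => void;

// Hypothetical use case 1: absurdly long paths are never legitimate page requests.
const rejectOverlongPaths: Handler = (req, res, next) => {
  if (req.path.length > 500) {
    res.status(400).send("Bad request");
    return;
  }
  next();
};

// Hypothetical use case 2: this site serves no PHP, so answer such probes cheaply.
const rejectPhpProbes: Handler = (req, res, next) => {
  if (req.path.endsWith(".php")) {
    res.status(404).send("Not found");
    return;
  }
  next();
};

// Runs the handlers in order. A handler that responds without calling next()
// short-circuits the rest, which is the same contract Express middleware follows.
function combine(handlers: Handler[]): Handler {
  return (req, res, next) => {
    const run = (index: number): void => {
      if (index === handlers.length) {
        next();
        return;
      }
      handlers[index](req, res, () => run(index + 1));
    };
    run(0);
  };
}

// In the real layout each handler would live in its own file and be imported here.
const shielding: Handler = combine([rejectOverlongPaths, rejectPhpProbes]);
```

Because each handler either responds or calls `next()`, a new abuse pattern spotted in the logs can be covered by adding one more small file to the list.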
Note
Some shielding "tricks" appear in other places throughout the codebase, such as controlling the 404 response for /assets/* URLs.
We rate limit at multiple levels:
- CDN (Fastly)
- All routes via src/shielding/frame/index.ts and the createRateLimiter() middleware.
  - These routes are only rate limited if they are deemed suspicious based on parameters we check.
- API routes via their declaration in src/frame/middleware/api.ts using the createRateLimiter() middleware.
  - These routes are limited to a certain number of requests per minute, regardless of what the request looks like.
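
The difference between the two server-side modes above (always limited versus limited only when suspicious) can be sketched with a small in-memory fixed-window counter. This is not the `createRateLimiter()` implementation, whose signature is not shown here; `makeLimiter()`, its options, the limits, and the "suspicious" check are all assumptions for the example.

```typescript
interface LimiterOptions {
  maxPerWindow: number; // hypothetical limit
  windowMs?: number; // defaults to one minute
  onlyWhen?: (url: string) => boolean; // when set, only matching requests are counted
  now?: () => number; // injectable clock, useful in tests
}

// Returns a function that reports whether a request from clientKey is allowed.
function makeLimiter(options: LimiterOptions): (clientKey: string, url: string) => boolean {
  const { maxPerWindow, windowMs = 60000, onlyWhen, now = Date.now } = options;
  const windows = new Map<string, { start: number; count: number }>();

  return (clientKey, url) => {
    // Requests that do not look suspicious are never counted or limited.
    if (onlyWhen && !onlyWhen(url)) return true;

    const time = now();
    const current = windows.get(clientKey);
    if (!current || time - current.start >= windowMs) {
      // First request from this client, or the previous window has expired.
      windows.set(clientKey, { start: time, count: 1 });
      return true;
    }
    current.count += 1;
    return current.count <= maxPerWindow;
  };
}

// API routes: every request counts, regardless of what it looks like.
const allowApiRequest = makeLimiter({ maxPerWindow: 100 });

// All other routes: only requests deemed suspicious count. The check is a made-up example.
const allowPageRequest = makeLimiter({
  maxPerWindow: 20,
  onlyWhen: (url) => url.includes("?"),
});
```

A middleware built on top of this would typically respond with a 429 when the function returns `false` and call `next()` otherwise. A production limiter would also need to evict old entries and, with multiple server instances, share state between them.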