Design client backoff protocol #100
Comments
Here's what Sync 2.0 does: http://docs.services.mozilla.com/storage/apis-2.0.html#response-headers |
Based on an IRC conversation with warner, I strawman-propose that we do a blend of the Sync 2.0-style backoff and a proof-of-work scheme. We can start with a polite request for the client to back off:
If things get really hairy, we send out a 503:
The client at this point has two options. It can just wait and try again later, or it can do a hashcash-style proof-of-work thing and re-submit its request:
The client is expected to submit a fresh proof-of-work with each new request, until the retry-after time has expired. @warner does this adequately capture the gist of our conversation? Thoughts? |
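To make the "hashcash-style proof-of-work thing" concrete, here is a minimal sketch of what the client and server halves might look like. It is only an illustration, not the protocol agreed on in this thread: the challenge string, the `challenge:nonce` encoding, the use of SHA-256, and the function names are all assumptions.

```python
import hashlib
import itertools


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # bit_length() of a non-zero byte tells us how many bits are used.
        bits += 8 - byte.bit_length()
        break
    return bits


def solve(challenge: str, difficulty: int) -> int:
    """Client side (hypothetical): find a nonce whose hash has enough zero bits."""
    for nonce in itertools.count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side (hypothetical): a single hash checks the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    # The challenge value here is made up; a real one would come from the server.
    nonce = solve("example-challenge", 12)
    print(nonce, verify("example-challenge", nonce, 12))
```

The asymmetry is the point: the client needs roughly 2^difficulty hashes on average, while the server verifies with one.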
Details on PoW protocol here: On Wed, Jul 31, 2013 at 3:59 PM, Ryan Kelly notifications@github.com wrote:
|
Client side support for PoW needs to be baked in from the start. |
Yeah, that mostly matches what I remember. One thing to clarify for the docs: the client's "options" (retry-after and PoW) aren't really equivalent. We can't distinguish one client from another, so there's no way for us to tell that a client has been politely/patiently waiting (and then accept their request without the PoW). If the DoS attack has stopped by the time they retry (and we're no longer requiring PoWs), then the retry-after might happen to work. But that state might last for hours or days. So only a really lazy client should just do retry-after without the proof-of-work, and they should be prepared to not connect for long periods of time. How exactly would 503+
(The time between the fetch of the PoW parameters and the submission of the completed PoW should be as short as possible) So I guess I'm wondering if we should report 503+ |
Good points. One small nit: clients might arrive in the middle of a DoS and never have seen a Backoff header before being hit with a 503. What I was going for with Retry-After was basically "we estimate it will be at least this long until we switch off PoW", which might let the client make a more intelligent choice between waiting versus doing the work. It's not a promise that your request will succeed if you wait that long - more a guideline than an actual rule. Happy to make these two headers exclusive if it will simplify things for the client. |
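A rough sketch of how a client might use Retry-After as a guideline when choosing between waiting and doing the work. The function name, the `patience` parameter, and the return values are hypothetical; the only thing taken from the discussion is that Retry-After is an estimate of how long PoW will stay switched on, not a promise.

```python
from typing import Optional


def choose_strategy(retry_after: Optional[float], patience: float) -> str:
    """Return "wait" or "pow" for a client that has just been refused.

    retry_after: seconds from the Retry-After header, or None if absent.
    patience:    how long this client is willing to stay disconnected
                 (a hypothetical client-side setting).
    """
    if retry_after is None:
        # No estimate at all, so waiting is a blind guess; do the work.
        return "pow"
    if retry_after > patience:
        # The server expects PoW to stay on longer than we can tolerate.
        return "pow"
    # Waiting is only a guideline: the retry may still be refused,
    # in which case the client ends up back here.
    return "wait"
```

A client that picks "wait" should still be prepared to be refused again when it retries.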
/cc @telliott for perspective on proof-of-work idea |
I like the general idea of proof-of-work for clients hitting us too often, but 503 isn't really a good match, since it's a server-side-problem status code, and this is a client problem. 403 is probably the appropriate status here and is explicit that this is a client-fixable issue. |
RFC 6585 also defines a "429 Too Many Requests" status, which is appropriate here. |
PoW might be useful for both kinds of load, but I'm not sure I like penalizing clients (computationally) in the 503 high-server-load case. It seems nicer to return a Retry-After header and trust the client to respect it. For the 429 case, where individual clients are too chatty, I like PoW. |
We should distinguish between these two cases. |
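One way to picture the split between the two cases, as a sketch only. It assumes the server already has some way to decide which case applies, and the `X-PoW-Challenge` header name is made up for illustration; nothing in this thread settles on header names.

```python
from typing import Dict, Optional, Tuple


def pick_response(
    server_overloaded: bool,
    client_too_chatty: bool,
    retry_after: int = 60,
    challenge: Optional[str] = None,
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for the two backoff cases discussed above."""
    if client_too_chatty:
        # Client-side problem: 429 (RFC 6585), optionally with a PoW challenge.
        headers = {"Retry-After": str(retry_after)}
        if challenge is not None:
            headers["X-PoW-Challenge"] = challenge  # hypothetical header name
        return 429, headers
    if server_overloaded:
        # Server-side problem: 503 with Retry-After, no computational penalty.
        return 503, {"Retry-After": str(retry_after)}
    # Neither case applies: serve the request normally.
    return 200, {}
```

Checking the chatty-client case first is a design choice in this sketch, so that a misbehaving client still gets the 429 while the server is also under load.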
Basic backoff design in #323. |
Handle misbehaving clients or periods of high load. A 20x response with a header or a 503 with or without a header would be appropriate.