Request hedging #127

Closed
ecordell opened this issue Sep 27, 2021 · 2 comments
Labels
area/perf Affects performance or scalability priority/3 low This would be nice to have

Comments

@ecordell
Contributor

From the Zanzibar paper:

Zanzibar’s distributed processing requires measures to accommodate slow tasks. For calls to Spanner and to the Leopard index we rely on request hedging (i.e. we send the same request to multiple servers, use whichever response comes back first, and cancel the other requests). To reduce round-trip times, we try to place at least two replicas of these backend services in every geographical region where we have Zanzibar servers. To avoid unnecessarily multiplying load, we first send one request and defer sending hedged requests until the initial request is known to be slow.
To determine the appropriate hedging delay threshold, each server maintains a delay estimator that dynamically computes an Nth percentile latency based on recent measurements. This mechanism allows us to limit the additional traffic incurred by hedging to a small fraction of total traffic.
Effective hedging requires the requests to have similar costs. In the case of Zanzibar’s authorization checks, some checks are inherently more time-consuming than others because they require more work. Hedging check requests would result in duplicating the most expensive workloads and, ironically, worsening latency. Therefore we do not hedge requests between Zanzibar servers, but rely on the previously discussed sharding among multiple replicas and on monitoring mechanisms to detect and avoid slow servers.
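
For discussion, here is a rough Go sketch of the mechanism the quote describes: send one request, wait for a percentile-based delay, then send a hedged copy and take whichever response arrives first. This is not taken from the paper or from any existing code. `DelayEstimator`, `Call`, `Hedge`, and `runDemo` are hypothetical names, and the sliding window, the 95th percentile, and the 10ms fallback delay are assumptions chosen only for illustration.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
	"time"
)

// DelayEstimator (hypothetical) keeps a sliding window of recent latencies
// and reports a chosen quantile of them as the hedging delay.
type DelayEstimator struct {
	mu       sync.Mutex
	samples  []time.Duration // ring buffer of recent measurements
	next     int             // slot to overwrite once the window is full
	quantile float64         // in [0, 1], e.g. 0.95
	fallback time.Duration   // returned until a sample has been observed
}

func NewDelayEstimator(window int, quantile float64, fallback time.Duration) *DelayEstimator {
	if window < 1 {
		window = 1
	}
	return &DelayEstimator{samples: make([]time.Duration, 0, window), quantile: quantile, fallback: fallback}
}

// Observe records one latency, evicting the oldest once the window is full.
func (e *DelayEstimator) Observe(d time.Duration) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if len(e.samples) < cap(e.samples) {
		e.samples = append(e.samples, d)
		return
	}
	e.samples[e.next] = d
	e.next = (e.next + 1) % len(e.samples)
}

// Threshold returns the configured quantile of the current window.
func (e *DelayEstimator) Threshold() time.Duration {
	e.mu.Lock()
	defer e.mu.Unlock()
	if len(e.samples) == 0 {
		return e.fallback
	}
	sorted := append([]time.Duration(nil), e.samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	return sorted[int(e.quantile*float64(len(sorted)-1))]
}

// Call is one attempt against a backend replica.
type Call func(ctx context.Context) (string, error)

// Hedge starts primary, starts backup only if no response has arrived within
// the estimated delay, returns the first response, and cancels the rest.
func Hedge(ctx context.Context, est *DelayEstimator, primary, backup Call) (string, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // cancels whichever attempt is still in flight

	type result struct {
		val string
		err error
	}
	results := make(chan result, 2) // buffered so the losing attempt never blocks
	launch := func(c Call) {
		go func() {
			v, err := c(ctx)
			results <- result{v, err}
		}()
	}

	start := time.Now()
	launch(primary)
	timer := time.NewTimer(est.Threshold())
	defer timer.Stop()

	for {
		select {
		case r := <-results:
			if r.err == nil {
				est.Observe(time.Since(start)) // feed the estimator
			}
			return r.val, r.err
		case <-timer.C: // fires at most once, so at most one hedged request
			launch(backup)
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// runDemo hedges a deliberately slow replica against a fast one.
func runDemo() (string, error) {
	est := NewDelayEstimator(100, 0.95, 10*time.Millisecond)
	slow := func(ctx context.Context) (string, error) {
		select {
		case <-time.After(2 * time.Second):
			return "slow replica", nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
	fast := func(ctx context.Context) (string, error) { return "fast replica", nil }
	return Hedge(context.Background(), est, slow, fast)
}

func main() {
	val, err := runDemo()
	fmt.Println(val, err)
}
```

Two things the sketch glosses over: the recorded latency includes the hedging delay whenever the backup wins, and, per the last quoted paragraph, this only helps when the requests have similar costs, so it would wrap datastore calls rather than check requests between servers.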

@ecordell ecordell added priority/3 low This would be nice to have area/perf Affects performance or scalability labels Sep 27, 2021
@josephschorr
Member

See also #19

@jzelinskie
Member

Duplicate of #19

@jzelinskie jzelinskie marked this as a duplicate of #19 Sep 27, 2021

3 participants