
Cluster-level hash policy for sticky routing #23060

Open
agrawroh opened this issue Sep 10, 2022 · 5 comments
Labels
area/cluster, area/load balancing, enhancement, help wanted

Comments

@agrawroh
Contributor

agrawroh commented Sep 10, 2022

Description

Currently, the hashing policy is defined on a per-route basis [Link] using hash_policy.
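For reference, a minimal sketch of what that looks like today (the cluster and header names are just placeholders):

```yaml
# Existing per-route mechanism; "some_backend" and "x-session-id" are placeholders.
route_config:
  virtual_hosts:
  - name: backend
    domains: ["*"]
    routes:
    - match: { prefix: "/" }
      route:
        cluster: some_backend
        hash_policy:            # only configurable here, per route
        - header:
            header_name: x-session-id
```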

We have a use case where we want to do sticky routing for all incoming traffic to the external ExtAuthZ and RateLimit services, but there is no good way to achieve this today.

We could benefit a lot from consistent-hash load balancing such as Ring Hash or Maglev: hashing on one of the HTTP headers would let us leverage the per-replica in-memory cache in these upstream services.
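To make the gap concrete, here is a rough sketch of the setup (names, addresses, and ports are placeholders): the ext_authz cluster can already use a consistent-hash LB, but there is nowhere to declare which header to hash on.

```yaml
# Sketch only; cluster name, address, and port are placeholders.
clusters:
- name: ext_authz_service
  type: STRICT_DNS
  lb_policy: RING_HASH          # or MAGLEV
  load_assignment:
    cluster_name: ext_authz_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: ext-authz.internal, port_value: 9000 }

http_filters:
- name: envoy.filters.http.ext_authz
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
    grpc_service:
      envoy_grpc:
        cluster_name: ext_authz_service   # no hash_policy equivalent available here
```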

Would it be possible to support an LB hash policy at the cluster level (i.e. for all routes)?

@agrawroh agrawroh added the enhancement and triage labels on Sep 10, 2022
@snowp snowp added the help wanted label and removed the triage label on Sep 12, 2022
@htuch
Member

htuch commented Sep 12, 2022

Yeah, there is support in the AsyncClient interface (via RequestOptions), but it isn't configurable in a uniform way for things like ext_authz. One option would be to add this to the GrpcService.EnvoyGrpc config. That would avoid any major changes such as having to mix routing logic into the ClusterManager. Does this work?
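To make that concrete, a hypothetical sketch of what such a knob might look like if it were added (this field does not exist today; it is only an illustration of the proposal):

```yaml
# Hypothetical -- "hash_policy" under envoy_grpc does not exist in the current API.
grpc_service:
  envoy_grpc:
    cluster_name: ext_authz_service
    hash_policy:                 # proposed, mirroring RouteAction.HashPolicy
    - header:
        header_name: x-session-id
```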

If not, I think the idea of having some cluster-wide control might have merit, but is a deeper discussion that would require @envoyproxy/api-shepherds and @mattklein123 to weigh in.

@agrawroh
Contributor Author

@htuch Thanks for chiming in. There was a similar request for mirroring traffic.
Would it be possible to identify some of the things that we currently only have on routes that would also make sense at the cluster level, and then think more about where they would best live?

@htuch
Member

htuch commented Sep 13, 2022

Yeah, there are others, e.g. fault injection. One thing I can offer here is a workaround - you can loop the ext_authz traffic back through a listener bound to localhost and have that apply a standard route table before it hits the real backend cluster. This is a total kludge, but if it helps your use case it might be worth considering.
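Roughly, the kludge looks like this (all names and ports are placeholders, and details such as the HTTP/2 settings needed for gRPC are omitted):

```yaml
# ext_authz points at a local loopback cluster ...
clusters:
- name: ext_authz_loopback
  type: STATIC
  load_assignment:
    cluster_name: ext_authz_loopback
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address: { address: 127.0.0.1, port_value: 10000 }

# ... which is served by a local listener that applies a normal route table,
# including hash_policy, before forwarding to the real Ring Hash / Maglev cluster.
listeners:
- name: ext_authz_loopback_listener
  address:
    socket_address: { address: 127.0.0.1, port_value: 10000 }
  filter_chains:
  - filters:
    - name: envoy.filters.network.http_connection_manager
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
        stat_prefix: ext_authz_loopback
        http_filters:
        - name: envoy.filters.http.router
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
        route_config:
          virtual_hosts:
          - name: ext_authz_loopback_vhost
            domains: ["*"]
            routes:
            - match: { prefix: "/" }
              route:
                cluster: real_ext_authz_cluster   # lb_policy: RING_HASH or MAGLEV
                hash_policy:
                - header:
                    header_name: x-session-id
```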

I think we should leave this issue open to gauge wider interest. The API design here would require some careful thought.

@agrawroh
Contributor Author

Thanks, @htuch! That's exactly what we are doing right now for mirroring & splitting the traffic :)

One more question, if you know the answer off the top of your head: if we have hash_policy defined on the routes, and the clusters to which the traffic is being mirrored/split use a RING_HASH or MAGLEV LB policy, would that give us sticky routing? Or would the hash_policy be ignored and the split be effectively random?

@htuch
Member

htuch commented Sep 13, 2022

In both cases Envoy is using an independent per-cluster HTTP async client with its own pseudo-config, so I strongly suspect the answer is that it will ignore the original route hash policy.
