Performance: high control plane cost to handle endpoint churn

Pods scaling up/down is a pretty common task so ought to be handled efficiently. However due to the snapshotPerClient we build a full `XdsSnapWrapper` on each change per each connected client.

If we have 10 XDS connections, and 100 services with 10 endpoints each, with say 1 pod churn/s this can get expensive. Each pod change will recompute 100 clusters and 100 CLAs with 10 endpoints each, for each connection, each second. So we end up with querying 20,000 entries from krt per second.

On pprof this looks like (rest is all GC): 

![Image](https://github.com/user-attachments/assets/eb8867fa-18dd-42d2-87b1-f9fdeca30a32)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Performance: high control plane cost to handle endpoint churn #11293

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Performance: high control plane cost to handle endpoint churn #11293

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions