[stats, xds] disable per-resource config update stats tracking, add per-API update_duration tracking (if necessary). #23979

stevenzzzz · 2022-11-14T14:27:18Z

Title: disable per-resource update_duration tracking, add per-API update_duration tracking (if necessary).

Description: right now there is a histogram tracking per-resource update duration, i.e. how long it takes Envoy to consume and update a resource's config.

This stat is of less value when there are tons of resources in Envoy; what matters more is how long Envoy takes to load each DiscoveryResponse.

OTOH, histograms are known to consume lots of RAM and CPU.

We should allow users to disable per-resource update-duration tracking (with a runtime flag) and maybe add per-API-level update-duration tracking instead.
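To make the proposal concrete, here is a minimal sketch of the two tracking modes. It is not Envoy's stats API: `UpdateDurationTracker`, `per_resource_enabled` (standing in for the proposed runtime flag), and the stat names are all hypothetical, and a plain list of samples stands in for a histogram.

```python
from collections import defaultdict
from typing import Dict


class UpdateDurationTracker:
    """Hypothetical tracker used only for illustration."""

    def __init__(self, per_resource_enabled: bool):
        # Stand-in for the proposed runtime flag.
        self.per_resource_enabled = per_resource_enabled
        # One list of samples per histogram name.
        self.histograms = defaultdict(list)

    def record_response(self, api: str, resource_durations_ms: Dict[str, float]) -> None:
        # Per-API level: one sample per DiscoveryResponse (total load time).
        total_ms = sum(resource_durations_ms.values())
        self.histograms[f"{api}.update_duration"].append(total_ms)
        # Per-resource level: one histogram per resource, only when enabled.
        if self.per_resource_enabled:
            for name, ms in resource_durations_ms.items():
                self.histograms[f"{api}.{name}.update_duration"].append(ms)


response = {f"cluster_{i}": 1.0 for i in range(1000)}

per_api_only = UpdateDurationTracker(per_resource_enabled=False)
per_api_only.record_response("eds", response)

with_per_resource = UpdateDurationTracker(per_resource_enabled=True)
with_per_resource.record_response("eds", response)
```

With the flag off, the number of histograms stays constant no matter how many resources a response carries; with it on, the count grows by one per resource.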

stevenzzzz · 2022-11-14T14:27:39Z

@jmarantz @pradeepcrao

stevenzzzz · 2022-11-14T15:18:16Z

@yanavlasov

stevenzzzz · 2022-12-13T22:58:15Z

@adisuissa could you no-stalebot this issue?

pradeepcrao · 2022-12-14T15:42:50Z

Hi @stevenzzzz are you talking about this histogram? @adisuissa, is my understanding correct that we create only one Subscription object for every type of resource config and not for every instance of a resource? Are we expecting to see very many of these created in an Envoy process?

adisuissa · 2022-12-14T16:41:47Z

> Hi @stevenzzzz are you talking about this histogram? @adisuissa, is my understanding correct that we create only one Subscription object for every type of resource config and not for every instance of a resource? Are we expecting to see very many of these created in an Envoy process?

It depends on the resource type. For CDS and LDS, for example, there will be a single subscription.
For EDS, for example, there will be multiple subscription objects (one for each cluster).
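As a rough illustration of how the number of subscription objects scales, the sketch below encodes only the cases named above. The function `subscription_count` is hypothetical and does not correspond to any Envoy API; other resource types are deliberately left unhandled.

```python
from typing import List


def subscription_count(resource_type: str, cluster_names: List[str]) -> int:
    """Hypothetical helper: subscription objects for the cases described above."""
    if resource_type in ("CDS", "LDS"):
        # A single subscription covers the whole resource type.
        return 1
    if resource_type == "EDS":
        # One subscription object per cluster.
        return len(cluster_names)
    # Other types are not covered by this sketch.
    raise ValueError(f"not covered by this sketch: {resource_type}")


clusters = [f"cluster_{i}" for i in range(500)]
cds_subscriptions = subscription_count("CDS", clusters)
eds_subscriptions = subscription_count("EDS", clusters)
```

Any stat that is created per subscription therefore stays at one instance for CDS or LDS but grows with the cluster count for EDS.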

pradeepcrao · 2022-12-14T17:39:55Z

Thanks Adi, makes sense.

stevenzzzz · 2022-12-14T21:18:14Z

Yes.
For RDS/EDS/VHDS or any "leaf xDS", it's per-resource, which is not ideal and probably less meaningful.

Instead, I think we should trace the performance at "per-XDS response level".

adisuissa · 2022-12-15T14:08:25Z

> Instead, I think we should trace the performance at "per-XDS response level".

I'm not sure this claim is correct, if I understand "per-xDS response level" correctly. For example, if Envoy has 3 routes and only 2 are being constantly updated, then if we only record when the last RDS update arrived, we lose the information on when the other route was last updated.
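The three-route scenario can be sketched as follows. This is only an illustration of the concern: `LastUpdateTracker`, the route names, and the integer timestamps are made up and are not Envoy code.

```python
class LastUpdateTracker:
    """Hypothetical tracker comparing the two granularities."""

    def __init__(self):
        # Per-response level: a single timestamp shared by all of RDS.
        self.last_rds_update = None
        # Per-resource level: one timestamp per route.
        self.last_update_by_route = {}

    def on_route_update(self, route: str, timestamp: int) -> None:
        self.last_rds_update = timestamp
        self.last_update_by_route[route] = timestamp


tracker = LastUpdateTracker()

# All three routes are loaded once at time 0.
for route in ("route_a", "route_b", "route_c"):
    tracker.on_route_update(route, 0)

# Only two of them keep receiving updates afterwards.
for t in (10, 20, 30):
    tracker.on_route_update("route_a", t)
    tracker.on_route_update("route_b", t)
```

After this run the single RDS-wide timestamp reads 30, while only the per-route map still shows that `route_c` was last updated at time 0.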

stevenzzzz · 2022-12-15T14:15:45Z

I am talking about update_duration etc., which I think is very expensive and not very useful to a cloud proxy. I think we can have an option that lets users choose which to track: per-resource level or "per-response" level.

adisuissa · 2022-12-15T14:59:52Z

> I am talking about update_duration etc., which I think is very expensive and not very useful to a cloud proxy. I think we can have an option that lets users choose which to track: per-resource level or "per-response" level.

Ah, thanks for the clarification. I read the title as "disable per-resource config update stats tracking", which I understood as more generic than just update_duration.

stevenzzzz added the enhancement and triage labels Nov 14, 2022

stevenzzzz changed the title ~~[stats, xds] disable per-resource update_duration tracking, add per-API update_duration tracking (if necessary).~~ [stats, xds] disable per-resource config update stats tracking, add per-API update_duration tracking (if necessary). Nov 14, 2022

zuercher added the area/perf and area/stats labels and removed the triage label Nov 14, 2022

adisuissa added the no stalebot label Dec 14, 2022
