`service_name` label should determine series order #751

kolesnikovae · 2023-06-06T15:04:06Z

TL;DR: Let's rename `service_name` to `__service_name__`, because it improves series locality in the block and, with it, query performance.

Currently, we have a number of reserved labels (surrounded with `__`) that mainly determine the placement order of series in the block:

const (
	LabelNameProfileType = "__profile_type__"
	LabelNameType        = "__type__"
	LabelNameUnit        = "__unit__"
	LabelNamePeriodType  = "__period_type__"
	LabelNamePeriodUnit  = "__period_unit__"
	LabelNameDelta       = "__delta__"
	LabelNameProfileName = "__name__"
)

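The cardinality of these label values is very low: there are only a handful of unique combinations. Within each combination, the order of series is effectively decided by whichever user-defined labels happen to sort first. As a result, series locality may be very poor, which leads to performance penalties due to read amplification and the redundant seek operations that we need to perform to fetch the data.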

Consider an example: two programs, app-1 and app-2, are being profiled, and each of them is deployed to two pods. Their CPU profile series may then form the following sequence (I omitted the reserved labels, because they are the same for each series and do not affect the order):

0: [a_label: bvc, pod_name: app-1-bvc, service_name: app-1]
1: [a_label: cxz, pod_name: app-2-cxz, service_name: app-2]
2: [a_label: vcx, pod_name: app-1-vcx, service_name: app-1]
3: [a_label: xcv, pod_name: app-2-xcv, service_name: app-2]

You can see that the series are interleaved: app-1, then app-2, app-1 again, and app-2 at the end. If we, say, want to query profiles of app-1 (rows 0 and 2), we either need to fetch more data (rows 0, 1, 2) or skip row 1 and seek to row 2.

I assume that most queries cover a limited number of profiled programs (services/applications), usually just one. Therefore, from a performance standpoint, it would be beneficial to place profile series belonging to the same program close to each other. This would allow us to read the data sequentially and eliminate unnecessary "seeks", which is vital for queries against the object store, where I/O latency reaches hundreds of milliseconds.

If we use the `__service_name__` label key instead of `service_name`, the order of series changes:

0: [__service_name__: app-1, a_label: bvc, pod_name: app-1-bvc]
1: [__service_name__: app-1, a_label: vcx, pod_name: app-1-vcx]
2: [__service_name__: app-2, a_label: cxz, pod_name: app-2-cxz]
3: [__service_name__: app-2, a_label: xcv, pod_name: app-2-xcv]

The order of the series is now perfect: profiles of the same service are stored sequentially in the block and can be fetched in a single pass.

cyriltovena · 2023-06-19T19:23:03Z

I'll take a stab at this.

korniltsev mentioned this issue Jun 7, 2023

chore(pyroscope): Ensure presence of service_name label grafana/agent#4035

Merged

cyriltovena self-assigned this Jun 19, 2023

cyriltovena mentioned this issue Jun 20, 2023

Adds __service_name__ labels to improve data locality #782

Merged

cyriltovena closed this as completed in #782 Jun 21, 2023
