Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Info PromQL function prototype #598

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from
Draft

Conversation

aknuds1
Copy link
Contributor

@aknuds1 aknuds1 commented Feb 13, 2024

Prototype of info PromQL function. The implementation looks up info metrics, with the right identifying labels and time range, in the TSDB.

NB: I use ValueTypeVector for the "label-selector" argument to info, but that's just a quick/imperfect hack, and we should probably have a dedicated ValueTypeLabelSelector that doesn't indicate the argument being for selecting an instant vector. Also see Beorn's comment below on the context of this prototype.

Screenshot of example usage:

info-automatic

TODO:

  • Fix info metric TSDB chunks (currently buggy)
  • Try to write benchmark to compare against corresponding join queries

@aknuds1 aknuds1 force-pushed the feat/info-function branch 3 times, most recently from f313af0 to bb08324 Compare February 13, 2024 13:36
@beorn7
Copy link
Contributor

beorn7 commented Feb 14, 2024

Just to get us all on the same page: This is just a proof of concept. Should we decide to implement this approach for real, what we want as the final result is what I have described in the design outline.

The PoC makes the following concessions to simplify the implementation:

  1. It identifies info metrics by searching all metrics for those with a name ending on _info. This is all of (a) expensive, (b) prone to false positives (a metric with a name ending on _info that is not an info metric), and (c) prone to false negatives (it will not find an info metric that does not have a name ending on _info).
  2. It uses all labels shared by the input metric and each candidate info metric as identifying labels. This can lead to underselecting identifying labels (e.g. only job if the input metric misses an instance label), leading to a too broad selection of info metrics. It could also lead to overselecting identifying labels if there is a label that shares a name just by coincidence, which could exclude info metrics we actually would like to use.

The real implementation needed to maintain additional data records to tracks all known info metrics at a given timestamp together with the information which of their labels are identifying labels and which are data labels.

Note that this additional information isn't readily available from exposition formats and the remote write protocol right now. See details in the design outline. In short:

  • OpenMetrics protobuf is (almost) perfect here because it marks the labels of info metrics as identifying labels and data labels. The only problem is that OpenMetrics enforces a magic _info suffix on all info metrics, which prevents marking info-like metrics like kube_pod_labels as info metrics without changing their name. (And of course the other problem is that nobody actually uses OpenMetrics protobuf right now.)
  • OpenMetrics text has the additional problem that it doesn't mark the labels of info metrics as identifying labels and data labels. They all look the same.
  • When directly ingesting OTLP into a Prometheus compatible backend, resource attributes can be converted into fully fledged info metrics, but already existing info metrics (including things like kube_pod_labels) would still have the same problems as above.
  • Classic Prometheus exposition formats don't even have a notion of an info metric.

Therefore, more work than just implementing the info function in Prometheus is needed. Protocols need to be extended for full support. However, ingestion into Prometheus could be configured to mark specific metrics and specific subset of their labels as info metrics and their identifying labels. That would be a somewhat laborious process, as it had to be customized to specific classes of targets. (For example, you could configure the scrape of kube-state-metrics to identify kube_pod_labels as an info metric with pod and namespace as identifying labels.)

@beorn7
Copy link
Contributor

beorn7 commented Feb 14, 2024

/cc @SuperQ as the comment above extends on our discussion on Slack.

@aknuds1 aknuds1 force-pushed the feat/info-function branch 2 times, most recently from 48d7701 to 0a08808 Compare April 11, 2024 07:04
@aknuds1 aknuds1 force-pushed the feat/info-function branch 7 times, most recently from c0629fe to a8958ae Compare May 2, 2024 19:29
@aknuds1 aknuds1 force-pushed the feat/info-function branch 8 times, most recently from 23afb28 to a7fa7bf Compare May 9, 2024 16:05
Comment on lines +387 to +360
p.mtx.Lock()
defer p.mtx.Unlock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to hold the mutex for the first part where you're separating labels info identifying/data.

@aknuds1 aknuds1 force-pushed the feat/info-function branch 3 times, most recently from 44dd937 to 957dbb6 Compare June 19, 2024 13:52
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants