promql: add sort_by_label and sort_by_label_desc functions #11299

Merged
merged 2 commits into from
Nov 28, 2023

galexrt
Contributor

@galexrt galexrt commented Sep 12, 2022

This adds functions to sort a vector by its label value.

Based on #1533

Signed-off-by: Alexander Trost <galexrt@googlemail.com>


There was some discussion about adding this functionality to PromQL/Prometheus in the original PR #1533. Based on the various comments from users there, and having stumbled upon this myself while trying to visualize certain metrics sorted by label value, I would gladly see this topic revisited.

@roidelapluie
Member

How difficult would it be to make this a vararg function to be able to sort by multiple labels?

sort_by_label(up, "job","instance")

@galexrt
Contributor Author

galexrt commented Sep 12, 2022

@roidelapluie I can look into making it into a vararg function to be able to sort by multiple labels. Is there an existing function that already accepts a list of strings as an argument that I can look at?

@roidelapluie
Member

label_join()

@galexrt galexrt force-pushed the sort_by_labels branch 2 times, most recently from 1a1be49 to 84d2e6d Compare September 12, 2022 14:57
@galexrt
Contributor Author

galexrt commented Sep 12, 2022

@roidelapluie The sort_by_label and sort_by_label_desc functions now accept a list of labels and sort the vector accordingly (see functions.test).

@@ -401,6 +401,14 @@ in ascending order.

Same as `sort`, but sorts in descending order.

## `sort_by_label()`

`sort_by_label(v instant-vector, label string, ...)` returns vector elements sorted by their label values and sample value in case of label values being equal, in ascending order.
Member

Hmm, I wonder if we shouldn't keep this completely independent from the sample value and instead use other label values to break the sorting tie? That way, sort results would be stable over time, given the same set of time series.

Member

I agree with Julius. We could also decide not to sort at all in that case, so you can do sort_by_label(sort(up),"job")

Member

Yeah, that sounds even better.
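The composition suggested in this thread, sort_by_label(sort(up), "job"), only keeps the value order within each group if the label sort is stable. A minimal Go sketch of that idea follows; the `sample` type and both helper functions are hypothetical stand-ins for illustration, not code from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// sample is a hypothetical stand-in for one element of an instant vector.
type sample struct {
	labels map[string]string
	value  float64
}

// sortByValue orders samples by ascending value, similar in spirit to sort().
func sortByValue(v []sample) {
	sort.SliceStable(v, func(i, j int) bool { return v[i].value < v[j].value })
}

// sortByLabelStable orders samples by one label value and, because the sort
// is stable, leaves samples with equal label values in their existing order.
func sortByLabelStable(v []sample, name string) {
	sort.SliceStable(v, func(i, j int) bool { return v[i].labels[name] < v[j].labels[name] })
}

func main() {
	v := []sample{
		{map[string]string{"job": "node", "instance": "b"}, 3},
		{map[string]string{"job": "api", "instance": "a"}, 2},
		{map[string]string{"job": "node", "instance": "c"}, 1},
	}
	sortByValue(v)              // inner sort: by value
	sortByLabelStable(v, "job") // outer sort: by label, ties keep value order
	for _, s := range v {
		fmt.Println(s.labels["job"], s.labels["instance"], s.value)
	}
}
```

With an unstable label sort, the order of the two "node" samples would not be guaranteed, which is the trade-off this thread is weighing.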

Contributor Author

So I should remove the "fallback" to comparing the value in the sort_by_label functions? This might make the test unstable, though I have to re-check if that happens without this "fallback" compare.

Member

The tests should be stable even without this

Member

we should make it clear that labels are sorted lexicographically

Contributor Author

@roidelapluie Without the value comparison the test fails, no matter in which order the expected results in the test(s) are placed.

Member

That is strange. The order should be predictable.

Contributor Author

It seems that it is not predictable: every X runs the test fails and complains about the order of the result.

How should we approach this so that we can get this functionality integrated?

Contributor Author

From my perspective, the fix here is to sort the values using the sort/sort_desc functions and then do the sort by label(s).

promql/functions.go Outdated
promql/functions.go Outdated
continue
case 1:
return false
}
Member

we need a return at the end
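For context on the missing return: Go rejects a function whose body ends in a `for` loop with a condition or range clause and no terminating statement after it. The sketch below shows one way a multi-label `Less` method could end; the `series` and `byLabels` types and the `sortByLabels` helper are hypothetical and only loosely modeled on the fragment under review.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// series is a hypothetical stand-in for a vector element.
type series struct {
	labels map[string]string
}

// byLabels implements sort.Interface over the given label names, in order.
type byLabels struct {
	vector []series
	names  []string
}

func (s byLabels) Len() int      { return len(s.vector) }
func (s byLabels) Swap(i, j int) { s.vector[i], s.vector[j] = s.vector[j], s.vector[i] }

func (s byLabels) Less(i, j int) bool {
	for _, name := range s.names {
		switch strings.Compare(s.vector[i].labels[name], s.vector[j].labels[name]) {
		case -1:
			return true
		case 0:
			continue // tie on this label, move on to the next one
		case 1:
			return false
		}
	}
	// Every compared label is equal: neither element sorts before the other.
	return false
}

// sortByLabels sorts v in place by the listed label names.
func sortByLabels(v []series, names ...string) {
	sort.Sort(byLabels{vector: v, names: names})
}

func main() {
	v := []series{
		{map[string]string{"job": "node", "instance": "1"}},
		{map[string]string{"job": "api", "instance": "1"}},
		{map[string]string{"job": "api", "instance": "0"}},
	}
	sortByLabels(v, "job", "instance")
	for _, s := range v {
		fmt.Println(s.labels["job"], s.labels["instance"])
	}
}
```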

@bboreham
Member

Note that range queries are always sorted by label sets, right at the end:

sort.Sort(mat)

Perhaps you could mention in the documentation this is intended for instant queries.

@juliusv
Member

juliusv commented Sep 13, 2022

> Note that range queries are always sorted by label sets, right at the end:
>
> sort.Sort(mat)
>
> Perhaps you could mention in the documentation this is intended for instant queries.

That same comment would apply to sort() and sort_desc() as well. In general, sorting is a no-op for range queries, as output points are always sorted by the X and Y axis there (unless someone cares about a specific legend item sorting, which I haven't heard yet).

I think we can either omit that comment on the new functions as well, or add a comment to the existing sort functions too. Something like:

Please note that sorting functions only affect the results of instant queries, as range query results always have a fixed output ordering.

@bboreham
Member

I think it would be easy to work out that sort (by value) is not a good idea over a range, where different series can be the highest value at different points in time.

Whereas sort_by_label is something you might reasonably want for a range query, e.g. to change the order in which lines are arranged in a dashboard. Hence it should be documented that it won't do what you asked.

@juliusv
Member

juliusv commented Sep 13, 2022

Ok yeah, that makes sense. Or we could actually make it work for range queries as well, but I doubt that's worth it. So then let's just add a comment for the new functions.

Member

@bboreham bboreham left a comment

Hi, we discussed this PR at the Prometheus dev summit today; we would like to move forward, but would like to ask you to put these functions behind a feature flag.

I had a couple of comments about the implementation too.

@@ -1218,6 +1238,78 @@ func (s *vectorByReverseValueHeap) Pop() interface{} {
return el
}

type vectorByLabelHeap struct {
Member

What does the word "Heap" denote here?

// In case the labels are the same, NaN should sort to the bottom, so take
// ascending sort with NaN first and reverse it.
byLabelSorter := vectorByLabelHeap{vector: vals[0].(Vector), labels: stringSliceFromArgs(args[1:])}
sort.Sort(sort.Reverse(byLabelSorter))
Member

slices.SortFunc is probably neater now.
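As a rough illustration of this suggestion, the sketch below uses `slices.SortFunc` and `cmp.Compare` from the Go 1.21+ standard library. The `series` type and both function names are hypothetical, and the code in the PR may use a different `slices` package or comparison signature.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// series is a hypothetical stand-in for a vector element.
type series struct {
	labels map[string]string
}

// compareByLabels compares a and b on each listed label in turn and
// returns the first non-zero result, or 0 if all listed labels are equal.
func compareByLabels(a, b series, names []string) int {
	for _, name := range names {
		if c := cmp.Compare(a.labels[name], b.labels[name]); c != 0 {
			return c
		}
	}
	return 0
}

// sortByLabels sorts v ascending by the listed labels.
func sortByLabels(v []series, names ...string) {
	slices.SortFunc(v, func(a, b series) int { return compareByLabels(a, b, names) })
}

// sortByLabelsDesc sorts v descending by swapping the comparison arguments,
// so no sort.Reverse wrapper type is needed.
func sortByLabelsDesc(v []series, names ...string) {
	slices.SortFunc(v, func(a, b series) int { return compareByLabels(b, a, names) })
}

func main() {
	v := []series{
		{map[string]string{"job": "node"}},
		{map[string]string{"job": "api"}},
		{map[string]string{"job": "db"}},
	}
	sortByLabels(v, "job")
	fmt.Println(v[0].labels["job"], v[1].labels["job"], v[2].labels["job"])
	sortByLabelsDesc(v, "job")
	fmt.Println(v[0].labels["job"], v[1].labels["job"], v[2].labels["job"])
}
```

Compared with a dedicated type implementing `sort.Interface`, this keeps the comparison logic in one function and makes the descending variant a one-line change.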

@roidelapluie
Member

cc @galexrt are you still willing to work on this, rebase this pull request and put the functions behind a feature flag? Thanks!

@galexrt
Contributor Author

galexrt commented Jul 18, 2023

> cc @galexrt are you still willing to work on this, rebase this pull request and put the functions behind a feature flag? Thanks!

Oh, that totally fell off my radar. Yes, I am willing to work on that. Do you have a link to a feature flag example somewhere that I can use as a reference?

@galexrt galexrt force-pushed the sort_by_labels branch 2 times, most recently from 03b3ecb to b14c014 Compare July 18, 2023 11:37
@roidelapluie
Member

You can look at the engine opts EnableNegativeOffset which used to be a feature flag

@galexrt
Contributor Author

galexrt commented Jul 18, 2023

> You can look at the engine opts EnableNegativeOffset which used to be a feature flag

So should I dynamically add the sort_by_label and sort_by_label_desc funcs to the FunctionsList map on start when the engine is created in promql/engine.go?

@roidelapluie
Member

I think a proper error saying that the feature needs to be enabled if the feature flag is not set would be better. However, I don't know if we can properly error from promql functions so we might need to be creative.
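One way to picture the "proper error" approach is a check that runs before a function call is evaluated. The Go sketch below is purely illustrative: `engineOpts`, `experimentalEnabled`, `experimentalFuncs`, and `checkCall` are hypothetical names, and this is not how the PR or the Prometheus engine wires the flag.

```go
package main

import (
	"errors"
	"fmt"
)

// engineOpts is a hypothetical options struct; the field name is an assumption.
type engineOpts struct {
	experimentalEnabled bool
}

// experimentalFuncs lists the function names gated behind the flag.
var experimentalFuncs = map[string]bool{
	"sort_by_label":      true,
	"sort_by_label_desc": true,
}

var errExperimentalDisabled = errors.New("experimental function is not enabled")

// checkCall returns an error if fn is experimental and the flag is off,
// so the user sees why the query was rejected instead of "unknown function".
func checkCall(opts engineOpts, fn string) error {
	if experimentalFuncs[fn] && !opts.experimentalEnabled {
		return fmt.Errorf("%q: %w; enable the corresponding feature flag", fn, errExperimentalDisabled)
	}
	return nil
}

func main() {
	fmt.Println(checkCall(engineOpts{}, "sort_by_label"))
	fmt.Println(checkCall(engineOpts{experimentalEnabled: true}, "sort_by_label"))
	fmt.Println(checkCall(engineOpts{}, "sort"))
}
```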

@roidelapluie roidelapluie self-assigned this Aug 8, 2023
@galexrt galexrt force-pushed the sort_by_labels branch 2 times, most recently from a054cc4 to 66ff46b Compare August 14, 2023 19:19
@galexrt
Contributor Author

galexrt commented Aug 14, 2023

@roidelapluie I think I have found a way to check the feature flag in the evaluator. Please take another look at the latest code changes, thanks!

@galexrt galexrt force-pushed the sort_by_labels branch 2 times, most recently from bff4451 to ae874fd Compare October 18, 2023 16:05
@galexrt
Contributor Author

galexrt commented Oct 18, 2023

@roidelapluie I have updated the PR to resolve the conflicts. Any news on this?

Member

@juliusv juliusv left a comment

Thank you, looks pretty good to me now!

It seems like the main outstanding question is around sort stability and whether we should resort to:

  • Sorting by sample value
  • Sorting by the remaining label set
  • Doing nothing

I would be for sorting by the remaining label set if that is easily doable (hopefully just a small extension of the SortFunc for the 0 string comparison case?). Sorting by sample value seems too unrelated to the label sorting and is also not necessarily always stable, as two series can and often do have the same sample value.
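The tie-break described here can be sketched as an extra step in the zero-comparison case. In the Go sketch below, `label`, `series`, `get`, and `compareLabelSets` are hypothetical stand-ins rather than the Prometheus types. It compares the full label set on a tie, which is equivalent to comparing the remaining labels because the requested ones are already equal at that point.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// label and series are hypothetical stand-ins, not the Prometheus types.
type label struct{ name, value string }

// series holds its labels sorted by name.
type series struct{ labels []label }

// get returns the value of the named label, or "" if it is absent.
func get(ls []label, name string) string {
	for _, l := range ls {
		if l.name == name {
			return l.value
		}
	}
	return ""
}

// compareLabelSets orders two name-sorted label sets pair by pair.
func compareLabelSets(a, b []label) int {
	for i := 0; i < len(a) && i < len(b); i++ {
		if c := cmp.Compare(a[i].name, b[i].name); c != 0 {
			return c
		}
		if c := cmp.Compare(a[i].value, b[i].value); c != 0 {
			return c
		}
	}
	return cmp.Compare(len(a), len(b))
}

// sortByLabels sorts by the requested labels first and falls back to the
// whole label set on a tie, so the result does not depend on sample values
// or on the input order.
func sortByLabels(v []series, names ...string) {
	slices.SortFunc(v, func(a, b series) int {
		for _, n := range names {
			if c := cmp.Compare(get(a.labels, n), get(b.labels, n)); c != 0 {
				return c
			}
		}
		return compareLabelSets(a.labels, b.labels)
	})
}

func main() {
	v := []series{
		{[]label{{"instance", "1"}, {"job", "api"}}},
		{[]label{{"instance", "0"}, {"job", "api"}}},
		{[]label{{"instance", "0"}, {"job", "app"}}},
	}
	sortByLabels(v, "job")
	for _, s := range v {
		fmt.Println(s.labels)
	}
}
```

Since the series in a vector have distinct label sets, the fallback never reports a tie, so the output order is fully determined by the labels.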

docs/feature_flags.md Outdated
promql/engine.go Outdated

# metrics_path defaults to '/metrics'
# scheme defaults to 'http'.

static_configs:
- targets: ["localhost:9090"]
- targets: ["localhost:9138"]
Member

Looks like some other unrelated changes crept into the PR here now?

Contributor Author

I have removed this unrelated change, PTAL

@galexrt galexrt force-pushed the sort_by_labels branch 2 times, most recently from a9160ce to 224e642 Compare November 22, 2023 11:22
@juliusv
Member

juliusv commented Nov 22, 2023

Thanks! I see the tests failing with this error currently:

--- FAIL: TestEvaluations (5.87s)
    --- FAIL: TestEvaluations/testdata/functions.test (1.20s)
        test.go:105: 
            	Error Trace:	/__w/prometheus/prometheus/promql/test.go:105
            	            				/__w/prometheus/prometheus/promql/test.go:83
            	Error:      	Received unexpected error:
            	            	error in eval sort_by_label(http_requests, "instance", "group") (line 498): expected metric {__name__="http_requests", group="production", instance="0", job="api-server"} with [100.000000] at position 3 but was at 1
            	Test:       	TestEvaluations/testdata/functions.test

@@ -20,7 +20,7 @@ rule_files:
# Here it's Prometheus itself.
scrape_configs:
# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: "prometheus"
- job_name: "extended-ceph-exporter"
Member

I still see this unrelated change in here?

Contributor Author

Fixed, all unrelated changes should be gone now

@galexrt
Contributor Author

galexrt commented Nov 23, 2023

> Thanks! I see the tests failing with this error currently:
>
> --- FAIL: TestEvaluations (5.87s)
>     --- FAIL: TestEvaluations/testdata/functions.test (1.20s)
>         test.go:105:
>             	Error Trace:	/__w/prometheus/prometheus/promql/test.go:105
>             	            				/__w/prometheus/prometheus/promql/test.go:83
>             	Error:      	Received unexpected error:
>             	            	error in eval sort_by_label(http_requests, "instance", "group") (line 498): expected metric {__name__="http_requests", group="production", instance="0", job="api-server"} with [100.000000] at position 3 but was at 1
>             	Test:       	TestEvaluations/testdata/functions.test

I have changed the sorting logic slightly, and it should now work better when more than 2 labels are involved.
Note that sorting with 3 or more labels can look "weird": if the values of the first label don't match, the next label(s) are not compared. I added some test cases for that as well.

PTAL

galexrt and others added 2 commits November 28, 2023 14:40
This adds functions to sort a vector by its label value.

Based on prometheus#1533

Signed-off-by: Alexander Trost <galexrt@googlemail.com>
Signed-off-by: Julien Pivotto <roidelapluie@o11y.eu>
@roidelapluie roidelapluie merged commit 4293de9 into prometheus:main Nov 28, 2023
24 checks passed
@roidelapluie
Member

Thanks!

@pstibrany
Contributor

pstibrany commented Nov 28, 2023

It's amazing to see this merged, 9 ehm 7 years after first #1533 :) Thanks @galexrt and all the reviewers! 🎉 (I can't count)

@pixelrebel

Thank you @galexrt ! This is a great achievement!

@bobrik
Contributor

bobrik commented Jan 16, 2024

Straightforward string sorting yields very questionable results for humans:

image

I urge you to consider using natural sorting instead.

Computers can sort by label however they want even without using sort_by_label, but humans need a little help.
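To make the difference concrete, here is a minimal Go sketch of a natural comparison that treats runs of ASCII digits as numbers. `naturalCompare` and `sortNatural` are hypothetical helpers written for illustration. They are not taken from the follow-up PR or from Grafana, and they ignore signs, decimals, and non-ASCII digits.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
	"strings"
)

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// naturalCompare compares a and b byte by byte, except that runs of ASCII
// digits are compared by numeric value. Leading zeros are ignored, so "01"
// and "1" compare as equal in this sketch.
func naturalCompare(a, b string) int {
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		if isDigit(a[i]) && isDigit(b[j]) {
			si, sj := i, j
			for i < len(a) && isDigit(a[i]) {
				i++
			}
			for j < len(b) && isDigit(b[j]) {
				j++
			}
			na := strings.TrimLeft(a[si:i], "0")
			nb := strings.TrimLeft(b[sj:j], "0")
			// A longer digit run (without leading zeros) is the larger number.
			if len(na) != len(nb) {
				return cmp.Compare(len(na), len(nb))
			}
			if c := cmp.Compare(na, nb); c != 0 {
				return c
			}
			continue
		}
		if a[i] != b[j] {
			return cmp.Compare(a[i], b[j])
		}
		i++
		j++
	}
	// Whichever string still has characters left sorts last.
	return cmp.Compare(len(a)-i, len(b)-j)
}

// sortNatural sorts label values in natural order.
func sortNatural(values []string) {
	slices.SortFunc(values, naturalCompare)
}

func main() {
	values := []string{"node10", "node2", "node1"}
	sortNatural(values)
	fmt.Println(values) // plain string sorting would place "node10" before "node2"
}
```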

See these PRs that added natural sorting to Grafana:

@juliusv
Member

juliusv commented Jan 16, 2024

@bobrik Agreed, good point! It's not yet too late to change things, since the new sorting functions are added under an experimental feature flag. Do you want to send a PR?

@bobrik
Contributor

bobrik commented Jan 16, 2024

Here's my attempt: #13411.

7 participants