fix: align semantics of metric and log query label extraction #11587

trevorwhitney · 2024-01-04T22:55:56Z

What this PR does / why we need it:

As a result of the query optimization work, we accidentally introduced a discrepancy between the semantics of logs and metrics queries. Metric queries can benefit from a short-circuit during label extraction, where we only need the labels needed for grouping and filtering. Log queries need to always extract all labels, as a user may want to inspect the key=value pairs of all detected fields, not just those filtered on.

However, given a query with nested labels of the same name (i.e. `{"message": {"message": "foo"}}`), this short circuit introduces a problem: the metric query uses the value of the first message (since it stops parsing the message key after finding it once), but the log query uses the value of the second message (since it continues to extract all labels, even those it has already seen). This PR changes the semantics so that both types of queries only use the first value.
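To make the two behaviours concrete, here is a minimal sketch that contrasts "first value wins" with "last value wins" over a sequence of key/value pairs. The `pair` type and both functions are hypothetical names for illustration only; this is not Loki's parser, and the flattened input is an assumption about what a parser might emit for a duplicated key.

```go
package main

import "fmt"

// pair is a hypothetical key/value emitted by a parser; it is not a Loki type.
type pair struct{ key, value string }

// extractFirst keeps the first value seen for each key and ignores later
// duplicates (the semantics this PR describes for both query types).
func extractFirst(pairs []pair) map[string]string {
	out := make(map[string]string, len(pairs))
	for _, p := range pairs {
		if _, seen := out[p.key]; seen {
			continue
		}
		out[p.key] = p.value
	}
	return out
}

// extractLast overwrites on every occurrence, so the last value wins
// (what a parser that keeps extracting already-seen keys would produce).
func extractLast(pairs []pair) map[string]string {
	out := make(map[string]string, len(pairs))
	for _, p := range pairs {
		out[p.key] = p.value
	}
	return out
}

func main() {
	// Assumed input: the same key appears twice with different values.
	pairs := []pair{{"message", "outer"}, {"message", "foo"}}
	fmt.Println("first wins:", extractFirst(pairs)["message"])
	fmt.Println("last wins: ", extractLast(pairs)["message"])
}
```

With the two functions side by side, the discrepancy is simply that the same input yields different values for `message` depending on which strategy a query type happens to use.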

The benchmark below shows a slight improvement in the hot path of label extraction. My interpretation is that removing the `len(requiredLabels) == 0` short circuit means we do a few more operations, but those operations are quick, so more of them complete in the same runtime.

goos: linux
goarch: amd64
pkg: github.com/grafana/loki/pkg/logql/log
cpu: AMD Ryzen 5 3600X 6-Core Processor             
                                                    │  before.txt   │               after.txt               │
                                                    │    sec/op     │    sec/op     vs base                 │
_Parser/json/no_labels_hints-12                        4.433µ ± ∞ ¹   4.502µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          3015.0n ± ∞ ¹   177.4n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/inline_stages-12                          1.066µ ± ∞ ¹   1.037µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12    77.77n ± ∞ ¹   76.95n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12       93.73n ± ∞ ¹   80.21n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12     105.70n ± ∞ ¹   80.12n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                      714.5n ± ∞ ¹   718.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                         657.7n ± ∞ ¹   780.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                        656.3n ± ∞ ¹   721.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12        14.58n ± ∞ ¹   14.08n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12           14.13n ± ∞ ¹   14.86n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12          14.65n ± ∞ ¹   13.19n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                      2.747µ ± ∞ ¹   3.383µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        3007.0n ± ∞ ¹   546.5n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/inline_stages-12                        904.8n ± ∞ ¹   905.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12                4.460µ ± ∞ ¹   4.545µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                   4.566µ ± ∞ ¹   4.453µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                  4.474µ ± ∞ ¹   4.529µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12         406.0n ± ∞ ¹   455.2n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12            457.2n ± ∞ ¹   401.3n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12           431.8n ± ∞ ¹   418.7n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                     258.4n ± ∞ ¹   281.7n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                        223.8n ± ∞ ¹   258.3n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                       165.1n ± ∞ ¹   166.4n ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                452.1n         376.3n        -16.77%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                                    │  before.txt   │              after.txt               │
                                                    │     B/op      │    B/op      vs base                 │
_Parser/json/no_labels_hints-12                         280.0 ± ∞ ¹   280.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          176.000 ± ∞ ¹   8.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/json/inline_stages-12                           64.00 ± ∞ ¹   64.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12     0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12        0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12       0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                       80.00 ± ∞ ¹   80.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                          80.00 ± ∞ ¹   80.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                         80.00 ± ∞ ¹   80.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12         0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12            0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12           0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                       336.0 ± ∞ ¹   336.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                         336.00 ± ∞ ¹   52.00 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/logfmt/inline_stages-12                         74.00 ± ∞ ¹   74.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12                 192.0 ± ∞ ¹   192.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                    192.0 ± ∞ ¹   192.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                   192.0 ± ∞ ¹   192.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12          51.00 ± ∞ ¹   51.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12             51.00 ± ∞ ¹   51.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12            51.00 ± ∞ ¹   51.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                      35.00 ± ∞ ¹   35.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                         32.00 ± ∞ ¹   32.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                        3.000 ± ∞ ¹   3.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                           ⁴                -18.66%               ⁴
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ need >= 4 samples to detect a difference at alpha level 0.05
⁴ summaries must be >0 to compute geomean

                                                    │  before.txt  │              after.txt               │
                                                    │  allocs/op   │  allocs/op   vs base                 │
_Parser/json/no_labels_hints-12                        18.00 ± ∞ ¹   18.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          12.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/json/inline_stages-12                          4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12    0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12       0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12      0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                      4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                         4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                        4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12        0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12           0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12          0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                      16.00 ± ∞ ¹   16.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        16.000 ± ∞ ¹   3.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/logfmt/inline_stages-12                        6.000 ± ∞ ¹   6.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12                2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                   2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                  2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12         2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12            2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12           2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                     2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                        1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                       1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                          ⁴                -15.91%               ⁴
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ need >= 4 samples to detect a difference at alpha level 0.05
⁴ summaries must be >0 to compute geomean

Which issue(s) this PR fixes:
Fixes #11647

github-actions · 2024-01-04T22:59:11Z

Trivy scan found the following vulnerabilities:

HIGH, Target: docker.io/grafana/loki:main-efdea22 (alpine 3.18.4), Type: alpine openssl: Incorrect cipher key and IV length processing in libcrypto3 v3.1.3-r0. Fixed in v3.1.4-r0
HIGH, Target: docker.io/grafana/loki:main-efdea22 (alpine 3.18.4), Type: alpine openssl: Incorrect cipher key and IV length processing in libssl3 v3.1.3-r0. Fixed in v3.1.4-r0
To see more details on these vulnerabilities, and how/where to fix them, please run `docker build -t grafana/loki:main-efdea22 -f cmd/loki/Dockerfile .` followed by `trivy i grafana/loki:main-efdea22` on your branch. If these were not introduced by your PR, please consider fixing them in a subsequent PR. Thanks!

chaudum

LGTM

Please add a changelog entry and add the appropriate labels for backporting.

trevorwhitney · 2024-01-11T19:21:51Z

pkg/logql/log/parser_hints.go

+	found := map[string]interface{}{}
+	for _, e := range p.extracted {
+		for _, l := range p.requiredLabels {
+			if e == l {
+				found[l] = nil
+				break
+			}
+		}
+	}
+
+	return len(p.requiredLabels) == len(found)


This is needed because `RecordExtracted` previously recorded only the required fields that were extracted, so it was acceptable to just compare the lengths of the two slices. Now that we record all extracted labels, we have to actually compare extracted to required. I'm using a map so that duplicate extractions don't cause an incorrect result.
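The difference between the two checks can be sketched in isolation. The `hints` struct below is a hypothetical stand-in that models only the two slices named in the comment; it omits any guard that precedes this logic in the real function, and it assumes the required labels are distinct.

```go
package main

import "fmt"

// hints is a hypothetical stand-in for the parser hints; only the two
// slices discussed in the review comment are modelled.
type hints struct {
	requiredLabels []string
	extracted      []string
}

// byLength is the old check. It is only valid while extracted contains
// required labels and nothing else.
func (h hints) byLength() bool {
	return len(h.extracted) == len(h.requiredLabels)
}

// bySet counts the distinct required labels that appear in extracted,
// so extra or duplicate extractions do not skew the result.
func (h hints) bySet() bool {
	found := map[string]struct{}{}
	for _, e := range h.extracted {
		for _, l := range h.requiredLabels {
			if e == l {
				found[l] = struct{}{}
				break
			}
		}
	}
	return len(found) == len(h.requiredLabels)
}

func main() {
	// An unrelated label makes the lengths match even though "b" is missing.
	missing := hints{requiredLabels: []string{"a", "b"}, extracted: []string{"a", "x"}}
	fmt.Println(missing.byLength(), missing.bySet())

	// A duplicate extraction makes the lengths differ even though both are present.
	dup := hints{requiredLabels: []string{"a", "b"}, extracted: []string{"a", "a", "b"}}
	fmt.Println(dup.byLength(), dup.bySet())
}
```

Both cases show the length comparison giving the wrong answer once `extracted` can hold more than the required labels.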

trevorwhitney · 2024-01-11T19:23:13Z

Ok, I think I've figured out why the benchmark was off. Here's a new benchstat after pushing 147dbab:

goos: linux
goarch: amd64
pkg: github.com/grafana/loki/pkg/logql/log
cpu: AMD Ryzen 5 3600X 6-Core Processor             
                                                    │  before.txt  │               after.txt               │
                                                    │    sec/op    │    sec/op      vs base                │
_Parser/json/no_labels_hints-12                       4.479µ ± ∞ ¹    4.419µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          3.060µ ± ∞ ¹    3.483µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/inline_stages-12                         1.006µ ± ∞ ¹    1.050µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12   76.89n ± ∞ ¹    78.11n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12      77.43n ± ∞ ¹    79.86n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12     77.66n ± ∞ ¹    76.44n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                     798.5n ± ∞ ¹    726.9n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                        839.7n ± ∞ ¹    845.7n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                       817.1n ± ∞ ¹    817.1n ± ∞ ¹       ~ (p=1.000 n=1) ³
_Parser/unpack-not_json_line/no_labels_hints-12       13.37n ± ∞ ¹    13.33n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12          12.55n ± ∞ ¹    14.73n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12         13.90n ± ∞ ¹    13.79n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                     3.324µ ± ∞ ¹    3.287µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        3.586µ ± ∞ ¹    4.311µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/inline_stages-12                       981.9n ± ∞ ¹   1067.0n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12               4.477µ ± ∞ ¹    4.609µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                  4.530µ ± ∞ ¹    4.550µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                 4.504µ ± ∞ ¹    4.767µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12        484.7n ± ∞ ¹    470.3n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12           446.1n ± ∞ ¹    457.4n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12          489.2n ± ∞ ¹    473.7n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                    294.0n ± ∞ ¹    313.6n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                       315.0n ± ∞ ¹    291.7n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                      179.0n ± ∞ ¹    187.5n ± ∞ ¹       ~ (p=1.000 n=1) ²
geomean                                               472.4n          484.3n        +2.51%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
³ all samples are equal

                                                    │ before.txt  │              after.txt              │
                                                    │    B/op     │    B/op      vs base                │
_Parser/json/no_labels_hints-12                       280.0 ± ∞ ¹   280.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          176.0 ± ∞ ¹   176.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/inline_stages-12                         64.00 ± ∞ ¹   64.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12   0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12      0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12     0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                     80.00 ± ∞ ¹   80.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                        80.00 ± ∞ ¹   80.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                       80.00 ± ∞ ¹   80.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12       0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12          0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12         0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                     336.0 ± ∞ ¹   336.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        336.0 ± ∞ ¹   336.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/inline_stages-12                       74.00 ± ∞ ¹   74.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12               192.0 ± ∞ ¹   192.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                  192.0 ± ∞ ¹   192.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                 192.0 ± ∞ ¹   192.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12        51.00 ± ∞ ¹   51.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12           51.00 ± ∞ ¹   51.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12          51.00 ± ∞ ¹   51.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                    35.00 ± ∞ ¹   35.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                       32.00 ± ∞ ¹   32.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                      3.000 ± ∞ ¹   3.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
geomean                                                         ³                +0.00%               ³
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ summaries must be >0 to compute geomean

                                                    │ before.txt  │              after.txt              │
                                                    │  allocs/op  │  allocs/op   vs base                │
_Parser/json/no_labels_hints-12                       18.00 ± ∞ ¹   18.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          12.00 ± ∞ ¹   12.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/inline_stages-12                         4.000 ± ∞ ¹   4.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12   0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12      0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12     0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                     4.000 ± ∞ ¹   4.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                        4.000 ± ∞ ¹   4.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                       4.000 ± ∞ ¹   4.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12       0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12          0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12         0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                     16.00 ± ∞ ¹   16.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        16.00 ± ∞ ¹   16.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/inline_stages-12                       6.000 ± ∞ ¹   6.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12               2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                  2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                 2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12        2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12           2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12          2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                    2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                       1.000 ± ∞ ¹   1.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                      1.000 ± ∞ ¹   1.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
geomean                                                         ³                +0.00%               ³
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ summaries must be >0 to compute geomean

The change needed was in AllRequiredExtracted to account for the fact that we're recording all extracted labels now, and not just required ones.

@MasslessParticle kudos to your diligence here in making sure we understood why this benchmark was off!

MasslessParticle · 2024-01-11T19:57:27Z

pkg/logql/log/parser_hints.go

 		return false
 	}
-	return len(p.extracted) == len(p.requiredLabels)
+
+	found := map[string]interface{}{}


We probably want to say `found := make(map[string]struct{}, len(p.requiredLabels))` here. `struct{}` is smaller than `interface{}`, and initializing with the max length we'll need ensures only one alloc.
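As a small illustration of the suggestion, the sketch below compares the size of the two candidate value types and builds a set with its capacity given up front. The helper names `valueSizes` and `newSet` are made up for this example; the exact size of `interface{}` depends on the platform, so the sketch prints it rather than assuming a number.

```go
package main

import (
	"fmt"
	"unsafe"
)

// valueSizes reports the in-memory size of the two candidate map value types.
func valueSizes() (emptyStruct, emptyInterface uintptr) {
	var s struct{}
	var i interface{}
	return unsafe.Sizeof(s), unsafe.Sizeof(i)
}

// newSet builds a string set whose capacity hint is the number of keys,
// so the map does not need to grow while it is being filled.
func newSet(keys []string) map[string]struct{} {
	set := make(map[string]struct{}, len(keys))
	for _, k := range keys {
		set[k] = struct{}{}
	}
	return set
}

func main() {
	s, i := valueSizes()
	fmt.Println("struct{} bytes:", s, "interface{} bytes:", i)
	fmt.Println("distinct keys:", len(newSet([]string{"app", "level", "app"})))
}
```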

MasslessParticle · 2024-01-11T20:00:22Z

pkg/logql/log/parser_hints.go

-	return len(p.extracted) == len(p.requiredLabels)
+
+	found := map[string]interface{}{}
+	for _, e := range p.extracted {


It might also be worth seeing if it's faster to just compare the list in two loops. It's an n^2 algorithm but it might be faster because it requires no allocs. (This function is called a lot)

It's the same reason extractedLabels and requiredLabels are []string here rather than map[string]. It's actually faster to iterate a small slice than index all these things from a map!
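A map-free version of the comparison might look like the sketch below. This is only one way to write the nested-loop idea and is not necessarily what the follow-up commit did; the function name and signature are assumptions, and returning `true` for an empty required list is a choice made for this sketch.

```go
package main

import "fmt"

// allRequiredExtracted reports whether every required label appears at
// least once in extracted. It uses two nested loops and no map, so it
// performs no heap allocations of its own; the cost is O(n*m), which is
// acceptable when both slices are small.
func allRequiredExtracted(required, extracted []string) bool {
	for _, r := range required {
		found := false
		for _, e := range extracted {
			if e == r {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

func main() {
	required := []string{"app", "level"}
	fmt.Println(allRequiredExtracted(required, []string{"app", "msg", "level", "app"}))
	fmt.Println(allRequiredExtracted(required, []string{"app", "msg"}))
}
```

Because duplicates in `extracted` only cause an early `break`, this form handles them without needing a set to deduplicate.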

both metric and log queries use the first extracted label when multiple values are requested for the same label Fixes #11647 (cherry picked from commit 9759c13)

**What this PR does / why we need it**: A data race introduced in #11587 was caught in the [backport to k184](#11668). This removes the shared state of a single global `noParserHints` in favor of creating an empty `Hint` object for each label builder, since the `Hints` is keeping state of `extracted` and `requiredLabels`.
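The shape of that fix can be sketched as follows. All names here (`hints`, `labelBuilder`, `newLabelBuilder`) are hypothetical and chosen for illustration; the point is only that each builder owns its own mutable hints value instead of pointing at one package-level instance that several goroutines would append to.

```go
package main

import (
	"fmt"
	"sync"
)

// hints is a hypothetical stand-in that keeps mutable state, like the
// extracted labels mentioned in the description.
type hints struct {
	extracted []string
}

// RecordExtracted mutates the hints, which is why sharing one instance
// across goroutines without synchronization would be a data race.
func (h *hints) RecordExtracted(label string) {
	h.extracted = append(h.extracted, label)
}

// labelBuilder is a hypothetical builder that owns its hints.
type labelBuilder struct {
	hints *hints
}

// newLabelBuilder gives every builder a fresh, empty hints value rather
// than a pointer to a shared global.
func newLabelBuilder() *labelBuilder {
	return &labelBuilder{hints: &hints{}}
}

func main() {
	const workers = 4
	builders := make([]*labelBuilder, workers)
	var wg sync.WaitGroup
	for i := range builders {
		builders[i] = newLabelBuilder()
		wg.Add(1)
		go func(b *labelBuilder) {
			defer wg.Done()
			// Each goroutine writes only to its own builder's hints.
			b.hints.RecordExtracted("app")
			b.hints.RecordExtracted("level")
		}(builders[i])
	}
	wg.Wait()
	for i, b := range builders {
		fmt.Println(i, len(b.hints.extracted))
	}
}
```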

…11668) Backport 9759c13 from #11587 --- **What this PR does / why we need it**: Align the label parsing logic of metric and log queries to both only extract the first instance of a label when the same label is requested multiple times. **Which issue(s) this PR fixes**: Fixes #11647 --------- Co-authored-by: Trevor Whitney <trevorjwhitney@gmail.com>

…traction (#11667) Backport 9759c13 from #11587 --- **What this PR does / why we need it**: Fix label parsing logic so metric and log queries both only extract the first instance of a label that is requested multiple times. **Which issue(s) this PR fixes**: Fixes #11647 --------- Co-authored-by: Trevor Whitney <trevorjwhitney@gmail.com>

fix: align semantics of metrics and logs queries

14f2ead

trevorwhitney requested a review from a team as a code owner January 4, 2024 22:55

pull-request-size bot added the size/M label Jan 4, 2024

trevorwhitney changed the title ~~fix: align semantics of metrics and logs queries~~ fix: align semantics of metric and log query label extraction Jan 4, 2024

trevorwhitney added 4 commits January 5, 2024 14:14

not sure why we're setting a label we haven't parsed?

3eacd02

preserve empty value behavior for missing label

877d0a1

better way of handling expressions without results

aa99dfd

Merge branch 'main' into debug-query-bug

ed0d366

chaudum reviewed Jan 11, 2024

fix logic in AllRequiredExtracted

147dbab

trevorwhitney commented Jan 11, 2024

Merge branch 'main' into debug-query-bug

0cfd58c

MasslessParticle reviewed Jan 11, 2024

remove map overhead

ed74485

MasslessParticle approved these changes Jan 11, 2024

trevorwhitney added the type/bug (Something is not working as expected), backport k184, and backport release-2.9.x labels Jan 11, 2024

update changelog

c08fbe1

trevorwhitney merged commit 9759c13 into main Jan 11, 2024
8 checks passed

trevorwhitney deleted the debug-query-bug branch January 11, 2024 21:08

grafanabot mentioned this pull request Jan 11, 2024

[release-2.9.x] fix: align semantics of metric and log query label extraction #11667

Merged

grafanabot mentioned this pull request Jan 11, 2024

[k184] fix: align semantics of metric and log query label extraction #11668

Merged

trevorwhitney mentioned this pull request Jan 16, 2024

fix: remove shared state noParserHints to avoid data race #11685

Merged

loki-gh-app bot mentioned this pull request Mar 27, 2024

chore(add-major-release-workflow): release 3.0.0-rc.1 #12380

Closed

jameshartig mentioned this pull request Apr 30, 2024

label semantics with parsers changed in v3.0.0 #12839

Open
