[k184] fix: align semantics of metric and log query label extraction #11668

grafanabot · 2024-01-11T21:09:34Z

Backport 9759c13 from #11587

What this PR does / why we need it:

As a result of the query optimization work, we accidentally introduced a discrepancy between the semantics of logs and metrics queries. Metric queries can benefit from a short-circuit during label extraction, where we only need the labels needed for grouping and filtering. Log queries need to always extract all labels, as a user may want to inspect the key=value pairs of all detected fields, not just those filtered on.

However, given a query with nested labels of the same name (i.e. {"message": {"message": "foo"}}), this short circuit introduces a problem: the metric query uses the value of the first message (since it stops parsing the message key after finding it once), but the log query uses the value of the second message (since it continues to extract all labels, even those it has already seen). This PR changes the semantics so that both types of queries use only the first value.
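To make the aligned semantics concrete, here is a minimal Go sketch of "first value wins" extraction with an optional short circuit. It is not the Loki implementation: the `pair` type, the `extractFirstWins` function, and the flat list of key/value pairs are assumptions made purely for illustration.

```go
package main

import "fmt"

// pair is a hypothetical extracted key/value pair; it is not a Loki type.
type pair struct{ key, value string }

// extractFirstWins keeps the first value seen for each label name and ignores
// later duplicates. If required is non-empty, only those labels are kept and
// the scan stops as soon as every required label has been found.
func extractFirstWins(pairs []pair, required map[string]struct{}) map[string]string {
	out := make(map[string]string)
	for _, p := range pairs {
		if len(required) > 0 {
			if _, needed := required[p.key]; !needed {
				continue // not needed for grouping or filtering
			}
		}
		if _, seen := out[p.key]; seen {
			continue // first value wins
		}
		out[p.key] = p.value
		if len(required) > 0 && len(out) == len(required) {
			break // short circuit: all required labels extracted
		}
	}
	return out
}

func main() {
	pairs := []pair{{"message", "outer"}, {"level", "info"}, {"message", "inner"}}

	// Metric-style query: only "message" is required.
	metric := extractFirstWins(pairs, map[string]struct{}{"message": {}})
	// Log-style query: no required labels, so everything is extracted.
	logs := extractFirstWins(pairs, nil)

	fmt.Println(metric["message"], logs["message"], len(logs))
}
```

In this sketch both calls return the first `message` value, and the log-style call still returns every distinct label.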

The change yields a slight improvement in the hot path of label extraction. My interpretation is that removing the len(requiredLabels) == 0 short circuit means we do a few more operations, but those operations are quick, so more of them complete in the same runtime.

goos: linux
goarch: amd64
pkg: github.com/grafana/loki/pkg/logql/log
cpu: AMD Ryzen 5 3600X 6-Core Processor             
                                                    │  before.txt   │               after.txt               │
                                                    │    sec/op     │    sec/op     vs base                 │
_Parser/json/no_labels_hints-12                        4.433µ ± ∞ ¹   4.502µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          3015.0n ± ∞ ¹   177.4n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/inline_stages-12                          1.066µ ± ∞ ¹   1.037µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12    77.77n ± ∞ ¹   76.95n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12       93.73n ± ∞ ¹   80.21n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12     105.70n ± ∞ ¹   80.12n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                      714.5n ± ∞ ¹   718.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                         657.7n ± ∞ ¹   780.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                        656.3n ± ∞ ¹   721.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12        14.58n ± ∞ ¹   14.08n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12           14.13n ± ∞ ¹   14.86n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12          14.65n ± ∞ ¹   13.19n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                      2.747µ ± ∞ ¹   3.383µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        3007.0n ± ∞ ¹   546.5n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/inline_stages-12                        904.8n ± ∞ ¹   905.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12                4.460µ ± ∞ ¹   4.545µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                   4.566µ ± ∞ ¹   4.453µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                  4.474µ ± ∞ ¹   4.529µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12         406.0n ± ∞ ¹   455.2n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12            457.2n ± ∞ ¹   401.3n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12           431.8n ± ∞ ¹   418.7n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                     258.4n ± ∞ ¹   281.7n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                        223.8n ± ∞ ¹   258.3n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                       165.1n ± ∞ ¹   166.4n ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                452.1n         376.3n        -16.77%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                                    │  before.txt   │              after.txt               │
                                                    │     B/op      │    B/op      vs base                 │
_Parser/json/no_labels_hints-12                         280.0 ± ∞ ¹   280.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          176.000 ± ∞ ¹   8.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/json/inline_stages-12                           64.00 ± ∞ ¹   64.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12     0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12        0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12       0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                       80.00 ± ∞ ¹   80.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                          80.00 ± ∞ ¹   80.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                         80.00 ± ∞ ¹   80.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12         0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12            0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12           0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                       336.0 ± ∞ ¹   336.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                         336.00 ± ∞ ¹   52.00 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/logfmt/inline_stages-12                         74.00 ± ∞ ¹   74.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12                 192.0 ± ∞ ¹   192.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                    192.0 ± ∞ ¹   192.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                   192.0 ± ∞ ¹   192.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12          51.00 ± ∞ ¹   51.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12             51.00 ± ∞ ¹   51.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12            51.00 ± ∞ ¹   51.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                      35.00 ± ∞ ¹   35.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                         32.00 ± ∞ ¹   32.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                        3.000 ± ∞ ¹   3.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                           ⁴                -18.66%               ⁴
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ need >= 4 samples to detect a difference at alpha level 0.05
⁴ summaries must be >0 to compute geomean

                                                    │  before.txt  │              after.txt               │
                                                    │  allocs/op   │  allocs/op   vs base                 │
_Parser/json/no_labels_hints-12                        18.00 ± ∞ ¹   18.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          12.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/json/inline_stages-12                          4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12    0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12       0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12      0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                      4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                         4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                        4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12        0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12           0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12          0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                      16.00 ± ∞ ¹   16.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        16.000 ± ∞ ¹   3.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/logfmt/inline_stages-12                        6.000 ± ∞ ¹   6.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12                2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                   2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                  2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12         2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12            2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12           2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                     2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                        1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                       1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                          ⁴                -15.91%               ⁴
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ need >= 4 samples to detect a difference at alpha level 0.05
⁴ summaries must be >0 to compute geomean
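
As a quick arithmetic check, the sec/op geomean delta in the first table follows directly from the two geomean values (452.1n before, 376.3n after). A minimal Go sketch; the `percentChange` helper is illustrative and not part of benchstat:

```go
package main

import "fmt"

// percentChange returns the relative change from before to after, in percent.
func percentChange(before, after float64) float64 {
	return (after - before) / before * 100
}

func main() {
	// Geomean sec/op values copied from the first table above.
	fmt.Printf("%.2f%%\n", percentChange(452.1, 376.3)) // prints "-16.77%"
}
```

Note that every row was run with n=1, so, as the footnotes state, benchstat cannot attach a confidence interval or detect a significant difference for any individual benchmark.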

Which issue(s) this PR fixes:
Fixes #11647

Both metric and log queries use the first extracted label when multiple values are requested for the same label. Fixes #11647. (cherry picked from commit 9759c13)

github-actions · 2024-01-11T21:12:50Z

Trivy scan found the following vulnerabilities:

HIGH, Target: docker.io/grafana/loki:k184-cbf4ffa (alpine 3.18.4), Type: alpine openssl: Incorrect cipher key and IV length processing in libcrypto3 v3.1.3-r0. Fixed in v3.1.4-r0
HIGH, Target: docker.io/grafana/loki:k184-cbf4ffa (alpine 3.18.4), Type: alpine openssl: Incorrect cipher key and IV length processing in libssl3 v3.1.3-r0. Fixed in v3.1.4-r0
To see more details on these vulnerabilities, and how/where to fix them, please run `docker build -t grafana/loki:k184-cbf4ffa -f cmd/loki/Dockerfile .` and then `trivy i grafana/loki:k184-cbf4ffa` on your branch. If these were not introduced by your PR, please consider fixing them via a subsequent PR. Thanks!

dannykopping · 2024-01-12T08:40:45Z

@trevorwhitney I'm surprised to see the trivy message since you merged #11608.
Do you know why this is the case?

trevorwhitney · 2024-01-16T21:30:16Z

@dannykopping looks like we need to backport #11608 to k184

**What this PR does / why we need it**: A data race introduced in #11587 was caught in the [backport to k184](#11668). This removes the shared state of a single global `noParserHints` in favor of creating an empty `Hint` object for each label builder, since the `Hints` is keeping state of `extracted` and `requiredLabels`.
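
The idea described above, giving each label builder its own empty hints value instead of sharing one global, can be sketched as follows. This is a simplified illustration, not the code from #11685: the `hints` and `labelBuilder` types, their fields, and `newLabelBuilder` are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sync"
)

// hints is a hypothetical stand-in for parser hints that keep mutable state.
type hints struct {
	extracted []string
}

// recordExtracted mutates the hints, which is why sharing one instance
// across goroutines would be a data race.
func (h *hints) recordExtracted(name string) {
	h.extracted = append(h.extracted, name)
}

// labelBuilder is a hypothetical builder that owns its hints.
type labelBuilder struct {
	hints *hints
}

// newLabelBuilder gives every builder a fresh, empty hints value rather than
// pointing all builders at a single package-level instance.
func newLabelBuilder() *labelBuilder {
	return &labelBuilder{hints: &hints{}}
}

func main() {
	var wg sync.WaitGroup
	builders := make([]*labelBuilder, 4)
	for i := range builders {
		builders[i] = newLabelBuilder()
		wg.Add(1)
		go func(b *labelBuilder) {
			defer wg.Done()
			b.hints.recordExtracted("message") // touches only this builder's state
		}(builders[i])
	}
	wg.Wait()
	for _, b := range builders {
		fmt.Println(len(b.hints.extracted))
	}
}
```

Because no two goroutines write to the same `hints` value, this sketch should pass `go run -race`; pointing every builder at one shared instance would reintroduce concurrent writes to `extracted`.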

dannykopping · 2024-01-17T12:35:50Z

@trevorwhitney it's already in k184 it seems.

loki [k184] [14:35:15] $ git cherry-pick 8b48a18
On branch k184
Your branch is up to date with 'upstream/k184'.

You are currently cherry-picking commit 8b48a18d7.
  (all conflicts fixed: run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

nothing to commit, working tree clean
The previous cherry-pick is now empty, possibly due to conflict resolution.
If you wish to commit it anyway, use:

    git commit --allow-empty

Otherwise, please use 'git cherry-pick --skip'


fix: align semantics of metric and log query label extraction (#11587)

5283e07


grafanabot requested a review from a team as a code owner January 11, 2024 21:09

grafanabot added the backport, size/M, and type/bug labels Jan 11, 2024

grafanabot requested a review from trevorwhitney January 11, 2024 21:09

trevorwhitney approved these changes Jan 11, 2024


remove shared state noParserHints to avoid data race

553355f

pull-request-size bot added size/L and removed size/M labels Jan 16, 2024

trevorwhitney mentioned this pull request Jan 16, 2024

fix: remove shared state noParserHints to avoid data race #11685

Merged

trevorwhitney merged commit 5991c3d into k184 Jan 16, 2024
10 checks passed

trevorwhitney deleted the backport-11587-to-k184 branch January 16, 2024 22:27
