fix: align semantics of metric and log query label extraction #11587

Merged
trevorwhitney merged 9 commits into main from debug-query-bug on Jan 11, 2024

@trevorwhitney trevorwhitney commented Jan 4, 2024

What this PR does / why we need it:

As a result of the query optimization work, we accidentally introduced a discrepancy between the semantics of logs and metrics queries. Metric queries can benefit from a short-circuit during label extraction, where we only need the labels needed for grouping and filtering. Log queries need to always extract all labels, as a user may want to inspect the key=value pairs of all detected fields, not just those filtered on.

However, given a query with nested labels of the same name (e.g. {"message": {"message": "foo"}}), this short circuit introduces a problem: the metric query uses the value of the first message (since it stops parsing the message key after finding it once), but the log query uses the value of the second message (since it continues to extract all labels, even those it has already seen). This PR changes the semantics so that both types of queries use only the first value.
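
To make the "first value wins" rule concrete, here is a minimal sketch. It is not the Loki implementation: the `pair` type, the `firstWins` helper, and the sample values are hypothetical, and the only assumption is that a parser emits key/value pairs in the order it encounters them.

```go
package main

import "fmt"

// pair is a hypothetical extracted key/value, in the order a parser emitted it.
type pair struct{ name, value string }

// firstWins keeps the first value seen for each label name and ignores
// any later value for a name that has already been set.
func firstWins(pairs []pair) map[string]string {
	out := make(map[string]string, len(pairs))
	for _, p := range pairs {
		if _, seen := out[p.name]; seen {
			continue // label already extracted once; skip the repeat
		}
		out[p.name] = p.value
	}
	return out
}

func main() {
	pairs := []pair{{"message", "outer"}, {"message", "foo"}, {"level", "info"}}
	fmt.Println(firstWins(pairs)["message"]) // outer
}
```

Applying the same rule on both the log and the metric path is what keeps the two query types in agreement.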

The benchmarks below show a slight improvement in the hot path of label extraction. My interpretation is that we now do a few more operations because the len(requiredLabels) == 0 short circuit was removed, but those operations are quick, so more of them complete in the same runtime.

goos: linux
goarch: amd64
pkg: github.com/grafana/loki/pkg/logql/log
cpu: AMD Ryzen 5 3600X 6-Core Processor             
                                                    │  before.txt   │               after.txt               │
                                                    │    sec/op     │    sec/op     vs base                 │
_Parser/json/no_labels_hints-12                        4.433µ ± ∞ ¹   4.502µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          3015.0n ± ∞ ¹   177.4n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/inline_stages-12                          1.066µ ± ∞ ¹   1.037µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12    77.77n ± ∞ ¹   76.95n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12       93.73n ± ∞ ¹   80.21n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12     105.70n ± ∞ ¹   80.12n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                      714.5n ± ∞ ¹   718.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                         657.7n ± ∞ ¹   780.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                        656.3n ± ∞ ¹   721.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12        14.58n ± ∞ ¹   14.08n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12           14.13n ± ∞ ¹   14.86n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12          14.65n ± ∞ ¹   13.19n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                      2.747µ ± ∞ ¹   3.383µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        3007.0n ± ∞ ¹   546.5n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/inline_stages-12                        904.8n ± ∞ ¹   905.8n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12                4.460µ ± ∞ ¹   4.545µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                   4.566µ ± ∞ ¹   4.453µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                  4.474µ ± ∞ ¹   4.529µ ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12         406.0n ± ∞ ¹   455.2n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12            457.2n ± ∞ ¹   401.3n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12           431.8n ± ∞ ¹   418.7n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                     258.4n ± ∞ ¹   281.7n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                        223.8n ± ∞ ¹   258.3n ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                       165.1n ± ∞ ¹   166.4n ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                452.1n         376.3n        -16.77%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                                                    │  before.txt   │              after.txt               │
                                                    │     B/op      │    B/op      vs base                 │
_Parser/json/no_labels_hints-12                         280.0 ± ∞ ¹   280.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          176.000 ± ∞ ¹   8.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/json/inline_stages-12                           64.00 ± ∞ ¹   64.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12     0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12        0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12       0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                       80.00 ± ∞ ¹   80.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                          80.00 ± ∞ ¹   80.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                         80.00 ± ∞ ¹   80.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12         0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12            0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12           0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                       336.0 ± ∞ ¹   336.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                         336.00 ± ∞ ¹   52.00 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/logfmt/inline_stages-12                         74.00 ± ∞ ¹   74.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12                 192.0 ± ∞ ¹   192.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                    192.0 ± ∞ ¹   192.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                   192.0 ± ∞ ¹   192.0 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12          51.00 ± ∞ ¹   51.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12             51.00 ± ∞ ¹   51.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12            51.00 ± ∞ ¹   51.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                      35.00 ± ∞ ¹   35.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                         32.00 ± ∞ ¹   32.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                        3.000 ± ∞ ¹   3.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                           ⁴                -18.66%               ⁴
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ need >= 4 samples to detect a difference at alpha level 0.05
⁴ summaries must be >0 to compute geomean

                                                    │  before.txt  │              after.txt               │
                                                    │  allocs/op   │  allocs/op   vs base                 │
_Parser/json/no_labels_hints-12                        18.00 ± ∞ ¹   18.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          12.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/json/inline_stages-12                          4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12    0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12       0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12      0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                      4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                         4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                        4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12        0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12           0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12          0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                      16.00 ± ∞ ¹   16.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        16.000 ± ∞ ¹   3.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
_Parser/logfmt/inline_stages-12                        6.000 ± ∞ ¹   6.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12                2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                   2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                  2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12         2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12            2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12           2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                     2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                        1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                       1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                                          ⁴                -15.91%               ⁴
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ need >= 4 samples to detect a difference at alpha level 0.05
⁴ summaries must be >0 to compute geomean

Which issue(s) this PR fixes:
Fixes #11647

@trevorwhitney trevorwhitney requested a review from a team as a code owner January 4, 2024 22:55

github-actions bot commented Jan 4, 2024

Trivy scan found the following vulnerabilities:

  • HIGH, Target: docker.io/grafana/loki:main-efdea22 (alpine 3.18.4), Type: alpine openssl: Incorrect cipher key and IV length processing in libcrypto3 v3.1.3-r0. Fixed in v3.1.4-r0
  • HIGH, Target: docker.io/grafana/loki:main-efdea22 (alpine 3.18.4), Type: alpine openssl: Incorrect cipher key and IV length processing in libssl3 v3.1.3-r0. Fixed in v3.1.4-r0

To see more details on these vulnerabilities, and how/where to fix them, please run docker build -t grafana/loki:main-efdea22 -f cmd/loki/Dockerfile . followed by trivy i grafana/loki:main-efdea22 on your branch. If these were not introduced by your PR, please consider fixing them in a subsequent PR. Thanks!

@trevorwhitney trevorwhitney changed the title from "fix: align semantics of metrics and logs queries" to "fix: align semantics of metric and log query label extraction" Jan 4, 2024

@chaudum chaudum left a comment

LGTM

Please add a changelog entry and add the appropriate labels for backporting.

Comment on lines 102 to 112
    found := map[string]interface{}{}
    for _, e := range p.extracted {
        for _, l := range p.requiredLabels {
            if e == l {
                found[l] = nil
                break
            }
        }
    }

    return len(p.requiredLabels) == len(found)
Collaborator Author

this is needed because previously, RecordExtracted was only recording required fields that were extracted. As a result, it was previously acceptable to just test the length of the 2 slices against each other. However, now that we're recording all extracted labels, we have to actually compare extracted to required. I'm using a map to prevent duplicate extractions from causing an incorrect result here.
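
A small sketch of why the length comparison stops working once every extracted label is recorded. The `hints` struct and the two method names are hypothetical stand-ins, not the actual Loki types; the membership check mirrors the snippet under review.

```go
package main

import "fmt"

// hints is a hypothetical stand-in for the parser hints discussed above.
type hints struct {
	requiredLabels []string // labels the query needs
	extracted      []string // every label recorded so far, duplicates included
}

// byLength is the old check. It is only valid while extracted contains
// nothing but required labels, each recorded at most once.
func (p *hints) byLength() bool {
	return len(p.extracted) == len(p.requiredLabels)
}

// byMembership collects the required labels that were actually extracted
// into a set, so duplicates and non-required labels cannot skew the result.
func (p *hints) byMembership() bool {
	found := map[string]interface{}{}
	for _, e := range p.extracted {
		for _, l := range p.requiredLabels {
			if e == l {
				found[l] = nil
				break
			}
		}
	}
	return len(p.requiredLabels) == len(found)
}

func main() {
	p := &hints{
		requiredLabels: []string{"level", "status"},
		extracted:      []string{"message", "message"},
	}
	fmt.Println(p.byLength(), p.byMembership()) // true false
}
```

Here neither required label was extracted, yet the length check reports success because the two slices happen to have the same length.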

@trevorwhitney
Collaborator Author

Ok, I think I've figured out why the benchmark was off. Here's a new benchstat after pushing 147dbab:

goos: linux
goarch: amd64
pkg: github.com/grafana/loki/pkg/logql/log
cpu: AMD Ryzen 5 3600X 6-Core Processor             
                                                    │  before.txt  │               after.txt               │
                                                    │    sec/op    │    sec/op      vs base                │
_Parser/json/no_labels_hints-12                       4.479µ ± ∞ ¹    4.419µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          3.060µ ± ∞ ¹    3.483µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/inline_stages-12                         1.006µ ± ∞ ¹    1.050µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12   76.89n ± ∞ ¹    78.11n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12      77.43n ± ∞ ¹    79.86n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12     77.66n ± ∞ ¹    76.44n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                     798.5n ± ∞ ¹    726.9n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                        839.7n ± ∞ ¹    845.7n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                       817.1n ± ∞ ¹    817.1n ± ∞ ¹       ~ (p=1.000 n=1) ³
_Parser/unpack-not_json_line/no_labels_hints-12       13.37n ± ∞ ¹    13.33n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12          12.55n ± ∞ ¹    14.73n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12         13.90n ± ∞ ¹    13.79n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                     3.324µ ± ∞ ¹    3.287µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        3.586µ ± ∞ ¹    4.311µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/inline_stages-12                       981.9n ± ∞ ¹   1067.0n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12               4.477µ ± ∞ ¹    4.609µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                  4.530µ ± ∞ ¹    4.550µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                 4.504µ ± ∞ ¹    4.767µ ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12        484.7n ± ∞ ¹    470.3n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12           446.1n ± ∞ ¹    457.4n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12          489.2n ± ∞ ¹    473.7n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                    294.0n ± ∞ ¹    313.6n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                       315.0n ± ∞ ¹    291.7n ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                      179.0n ± ∞ ¹    187.5n ± ∞ ¹       ~ (p=1.000 n=1) ²
geomean                                               472.4n          484.3n        +2.51%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
³ all samples are equal

                                                    │ before.txt  │              after.txt              │
                                                    │    B/op     │    B/op      vs base                │
_Parser/json/no_labels_hints-12                       280.0 ± ∞ ¹   280.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          176.0 ± ∞ ¹   176.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/inline_stages-12                         64.00 ± ∞ ¹   64.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12   0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12      0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12     0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                     80.00 ± ∞ ¹   80.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                        80.00 ± ∞ ¹   80.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                       80.00 ± ∞ ¹   80.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12       0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12          0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12         0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                     336.0 ± ∞ ¹   336.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        336.0 ± ∞ ¹   336.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/inline_stages-12                       74.00 ± ∞ ¹   74.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12               192.0 ± ∞ ¹   192.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                  192.0 ± ∞ ¹   192.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                 192.0 ± ∞ ¹   192.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12        51.00 ± ∞ ¹   51.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12           51.00 ± ∞ ¹   51.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12          51.00 ± ∞ ¹   51.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                    35.00 ± ∞ ¹   35.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                       32.00 ± ∞ ¹   32.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                      3.000 ± ∞ ¹   3.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
geomean                                                         ³                +0.00%               ³
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ summaries must be >0 to compute geomean

                                                    │ before.txt  │              after.txt              │
                                                    │  allocs/op  │  allocs/op   vs base                │
_Parser/json/no_labels_hints-12                       18.00 ± ∞ ¹   18.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/labels_hints-12                          12.00 ± ∞ ¹   12.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/json/inline_stages-12                         4.000 ± ∞ ¹   4.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/no_labels_hints-12   0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/labels_hints-12      0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/jsonParser-not_json_line/inline_stages-12     0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/no_labels_hints-12                     4.000 ± ∞ ¹   4.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/labels_hints-12                        4.000 ± ∞ ¹   4.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack/inline_stages-12                       4.000 ± ∞ ¹   4.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/no_labels_hints-12       0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/labels_hints-12          0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/unpack-not_json_line/inline_stages-12         0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/no_labels_hints-12                     16.00 ± ∞ ¹   16.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/labels_hints-12                        16.00 ± ∞ ¹   16.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/logfmt/inline_stages-12                       6.000 ± ∞ ¹   6.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/no_labels_hints-12               2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/labels_hints-12                  2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_greedy/inline_stages-12                 2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/no_labels_hints-12        2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/labels_hints-12           2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/regex_status_digits/inline_stages-12          2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/no_labels_hints-12                    2.000 ± ∞ ¹   2.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/labels_hints-12                       1.000 ± ∞ ¹   1.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
_Parser/pattern/inline_stages-12                      1.000 ± ∞ ¹   1.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
geomean                                                         ³                +0.00%               ³
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ summaries must be >0 to compute geomean

The change needed was in AllRequiredExtracted to account for the fact that we're recording all extracted labels now, and not just required ones.

@MasslessParticle kudos to your diligence here in making sure we understood why this benchmark was off!

return false
}
return len(p.extracted) == len(p.requiredLabels)

found := map[string]interface{}{}
Contributor

We probably want to say found := make(map[string]struct{}, len(p.requiredLabels)) here. struct{} is smaller than interface{} and initializing with the max len we'll need ensures only one alloc
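
A sketch of the suggestion above, under the assumption that the set is only used for membership. `newFoundSet` is a hypothetical helper name. The size passed to `make` is a capacity hint, so whether it results in exactly one allocation depends on the runtime and on escape analysis.

```go
package main

import (
	"fmt"
	"unsafe"
)

// newFoundSet returns a set keyed by label name, presized for the most
// entries it can ever hold (one per required label).
func newFoundSet(required []string) map[string]struct{} {
	return make(map[string]struct{}, len(required))
}

func main() {
	var empty struct{}
	var boxed interface{}
	// struct{} occupies zero bytes; an interface value is non-empty.
	fmt.Println(unsafe.Sizeof(empty), unsafe.Sizeof(boxed) > 0) // 0 true

	found := newFoundSet([]string{"level", "status"})
	found["level"] = struct{}{}
	found["level"] = struct{}{} // a duplicate extraction does not grow the set
	fmt.Println(len(found))     // 1
}
```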

return len(p.extracted) == len(p.requiredLabels)

found := map[string]interface{}{}
for _, e := range p.extracted {
Contributor

It might also be worth seeing if it's faster to just compare the list in two loops. It's an n^2 algorithm but it might be faster because it requires no allocs. (This function is called a lot)

It's the same reason extractedLabels and requiredLabels are []string here rather than map[string]. It's actually faster to iterate a small slice than index all these things from a map!
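
The two-loop alternative could look like the sketch below. The function names are hypothetical, and returning false for an empty required list is an assumption (no required labels means there is nothing to short-circuit on). `testing.AllocsPerRun` is used only to measure heap allocations per call; whether this variant is actually faster would still need a benchmark.

```go
package main

import (
	"fmt"
	"testing"
)

// contains does a linear scan over a small slice.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// allRequiredExtracted reports whether every required label appears in
// extracted. It is O(len(required) * len(extracted)) and uses no map.
func allRequiredExtracted(required, extracted []string) bool {
	if len(required) == 0 {
		return false // assumption: nothing required, nothing to short-circuit
	}
	for _, l := range required {
		if !contains(extracted, l) {
			return false
		}
	}
	return true
}

func main() {
	required := []string{"level", "status"}
	extracted := []string{"message", "level", "level", "status"}
	fmt.Println(allRequiredExtracted(required, extracted)) // true

	// AllocsPerRun reports the average number of heap allocations per call.
	allocs := testing.AllocsPerRun(1000, func() {
		_ = allRequiredExtracted(required, extracted)
	})
	fmt.Println("allocs per call:", allocs)
}
```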

@trevorwhitney trevorwhitney merged commit 9759c13 into main Jan 11, 2024
8 checks passed
@trevorwhitney trevorwhitney deleted the debug-query-bug branch January 11, 2024 21:08
grafanabot pushed a commit that referenced this pull request Jan 11, 2024
both metric and log queries use the first extracted label when multiple values are requested for the same label

Fixes #11647

(cherry picked from commit 9759c13)
trevorwhitney added a commit that referenced this pull request Jan 16, 2024
**What this PR does / why we need it**:

A data race introduced in #11587 was
caught in the [backport to
k184](#11668). This removes the
shared state of a single global `noParserHints` in favor of creating an
empty `Hints` object for each label builder, since `Hints` keeps
state in `extracted` and `requiredLabels`.
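
The shape of that fix can be sketched as follows. All names here (`hints`, `labelsBuilder`, `newLabelsBuilder`) are hypothetical and simplified; the point is only that each builder owns its own mutable hints value, so concurrent builders never write to the same slice.

```go
package main

import (
	"fmt"
	"sync"
)

// hints holds mutable per-query state, so it must not be shared.
type hints struct{ extracted []string }

func (h *hints) recordExtracted(name string) {
	h.extracted = append(h.extracted, name)
}

type labelsBuilder struct{ hints *hints }

// newLabelsBuilder gives every builder a fresh, empty hints value instead
// of pointing all builders at one package-level instance.
func newLabelsBuilder() *labelsBuilder {
	return &labelsBuilder{hints: &hints{}}
}

func main() {
	builders := make([]*labelsBuilder, 4)
	var wg sync.WaitGroup
	for i := range builders {
		builders[i] = newLabelsBuilder()
		wg.Add(1)
		go func(b *labelsBuilder) {
			defer wg.Done()
			b.hints.recordExtracted("message") // touches only this builder's state
		}(builders[i])
	}
	wg.Wait()
	for _, b := range builders {
		fmt.Println(len(b.hints.extracted)) // 1 for each builder
	}
}
```

If every builder pointed at one shared `hints` value instead, the concurrent `append` calls would be a data race of the kind `go test -race` reports.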
trevorwhitney added a commit that referenced this pull request Jan 16, 2024
…11668)

Backport 9759c13 from #11587

---

**What this PR does / why we need it**:
Align the label parsing logic of metric and log queries to both only extract the first instance of a label when the same label is requested multiple times.

**Which issue(s) this PR fixes**:
Fixes #11647

---------

Co-authored-by: Trevor Whitney <trevorjwhitney@gmail.com>
trevorwhitney added a commit that referenced this pull request Jan 16, 2024
…traction (#11667)

Backport 9759c13 from #11587

---

**What this PR does / why we need it**:
Fix label parsing logic so metric and log queries both only extract the first instance of a label that is requested multiple times.

**Which issue(s) this PR fixes**:
Fixes #11647

---------

Co-authored-by: Trevor Whitney <trevorjwhitney@gmail.com>
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
…a#11587)

both metric and log queries use the first extracted label when multiple values are requested for the same label

Fixes grafana#11647
rhnasc pushed a commit to inloco/loki that referenced this pull request Apr 12, 2024
)

**What this PR does / why we need it**:

A data race introduced in grafana#11587 was
caught in the [backport to
k184](grafana#11668). This removes the
shared state of a single global `noParserHints` in favor of creating an
empty `Hints` object for each label builder, since `Hints` keeps
state in `extracted` and `requiredLabels`.
Successfully merging this pull request may close these issues.

discrepancy between logs and metrics query when extracting nested values to the same label