Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(logql): Make LabelSampleExtractor work as expected when label doesn't exist in logline #6766

Merged
merged 10 commits into from
Aug 3, 2022

Conversation

kavirajk
Copy link
Collaborator

@kavirajk kavirajk commented Jul 25, 2022

What this PR does / why we need it:
Consider Log lines

msg=hello subqueries=5
msg=hello subqueries=0
msg=hello

If you run following logql

{foo="bar"}|logfmt|subqueries!=0

it returns

msg=hello subqueries=5

But if you run following logql

{foo="bar"}|logfmt|subqueries!="0" # NOTE: "0" as string.

it returns

msg=hello subqueries=5
msg=hello

NOTE: it also returns log lines that doesn't contain subqueries label in the first
place.

This cause subtle error if used in metric queries using that label in unwrap.

e.g: following logql on above log lines returns

sum_over_time({foo="bar"}|logfmt|subqueries!="0"|unwrap subqueries[1m])

Returns pipeline error: 'SampleExtractionErr'. Because, the lines without subqueries
labels are not skipped during ProcessLine and during metric extraction, our extractor
sets this SampleExtractionerror.

EDIT: After some discussions with @cyriltovena and @owen-d. We have decided fix only the sample extractor without changing any behavior of label filter.

Which issue(s) this PR fixes:
Fixes # #6713

Special notes for your reviewer:

Checklist

  • Documentation added
  • Tests updated
  • Is this an important fix or new feature? Add an entry in the CHANGELOG.md.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md

@@ -342,7 +342,10 @@ func (s *StringLabelFilter) Process(_ int64, line []byte, lbs *LabelsBuilder) ([
if s.Name == logqlmodel.ErrorLabel {
return line, s.Matches(lbs.GetErr())
}
v, _ := lbs.Get(s.Name)
v, ok := lbs.Get(s.Name)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the real fix.

Consider Log lines

```
msg=hello subqueries=5
msg=hello subqueries=0
msg=hello
```

If you run following logql
```
{foo="bar"}|logfmt|subqueries!=0
```
it returns
```
msg=hello subqueries=5
```

But if you run following logql
```
{foo="bar"}|logfmt|subqueries!="0" # NOTE: "0" as string.
```
it returns
```
msg=hello subqueries=5
msg=hello
```

NOTE: it also returns log lines that doesn't contain `subqueries` label in the first
place.

This cause subtle error if used in metric queries using that `label` in unwrap.

e.g: following logql on above log lines returns
```
sum_over_time({foo="bar"}|logfmt|subqueries!="0"|unwrap subqueries[1m])
```

Returns `pipeline error: 'SampleExtractionErr'`. Because, the lines without `subqueries`
labels are not skipped during `ProcessLine` and during metric extraction, our extractor
sets this `SampleExtractionerror`.

Fixes: #6713
@kavirajk kavirajk force-pushed the kavirajk/logql-label-check-bug branch from 7b0d82a to d0fbf3c Compare July 25, 2022 22:55
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
@pull-request-size pull-request-size bot added size/L and removed size/M labels Jul 25, 2022
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
@kavirajk kavirajk force-pushed the kavirajk/logql-label-check-bug branch from 70531ec to 8120941 Compare July 25, 2022 23:05
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@kavirajk kavirajk marked this pull request as ready for review July 25, 2022 23:19
@kavirajk kavirajk requested a review from a team as a code owner July 25, 2022 23:19
@kavirajk kavirajk requested a review from owen-d July 25, 2022 23:19
Copy link
Contributor

@ssncferreira ssncferreira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused here because this seems related to the value type that we use in the query, i.e. either string != "0" or integer != 0.
If a log line does not contain a label, e.g.

msg=hello

I would always assume the value of the label subqueries would be the empty string "", and, therefore, it would be different than both string "0" and integer 0.
With the changes from this PR it seems this assumption is wrong, so it should be clearly documented

pkg/logql/log/label_filter_test.go Show resolved Hide resolved
pkg/logql/log/metrics_extraction_test.go Show resolved Hide resolved
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
-               loki	-0.6%

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Copy link
Contributor

@ssncferreira ssncferreira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks 🙏
Do you think it would also make sense to update the documentation with some examples? (can be part of a separate PR)

// 4.msg="hello" subqueries=""
//```

// Now when quering like `{...}|logfmt|subqueries=""` should consider lines 3 and 4.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this, it doesn't really make it easier to understand for the end user as this make it more obscure:

  • when using equality we consider non existent label empty
  • but for inequality we don't.

Honestly in #6713 I think the expectations were wrong.

sum(sum_over_time({job="loki-ops/query-frontend"}
  |= "metrics.go"
  | logfmt
  | subqueries!="0"
  | unwrap subqueries
  [5m]
))

When subqueries doesn't exist then subqueries!="0" is true.

What is weird was that empty labels does transform into a 0 integer.

Copy link
Collaborator Author

@kavirajk kavirajk Jul 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When subqueries doesn't exist then subqueries!="0" is true.

So you saying subqueries!=0 and subqueries!="0" ok to return different log lines?. But that's unexpected behaviour to end-user IMHO. Both should return exactly same results IMO.

What is weird was that empty labels does transform into a 0 integer.

That's because, we don't mutate the log line if label is not present. So treat it as if label="" during label filter, But still log line doesn't contain actual label, which breaks when doing unwrap on same label during metrics extraction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is weird was that empty labels does transform into a 0 integer.

I agree with @cyriltovena. This is the part I think we need to address.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be we discard empty values ? If that is possible.

Yes that's possible. We can skip empty values during metrics extraction when using unwrap without changing any of the existing label filter behavior.

Also the more I think about it, subqueries!="0" and subqueries!=0 are two different label filters and should be treated as such. And former can indeed match log lines that doesn't have subqueries label at all (we are assuming it has default empty string substring=" when doesn't exist).

I have change the PR like it's mentioned here.

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
@kavirajk kavirajk changed the title fix(logql): Make StringLabelfilter work as expected fix(logql): Make LabelSampleExtractor work as expected when label doesn't exist in logline Aug 1, 2022
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I think the changelog needs a better documentation of the change.

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0.1%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

@kavirajk kavirajk merged commit 7b89f20 into main Aug 3, 2022
@kavirajk kavirajk deleted the kavirajk/logql-label-check-bug branch August 3, 2022 07:48
lxwzy pushed a commit to lxwzy/loki that referenced this pull request Nov 7, 2022
…oesn't exist in logline (grafana#6766)

* fix(logql): Make `StringLabelfilter` work as expected

Consider Log lines

```
msg=hello subqueries=5
msg=hello subqueries=0
msg=hello
```

If you run following logql
```
{foo="bar"}|logfmt|subqueries!=0
```
it returns
```
msg=hello subqueries=5
```

But if you run following logql
```
{foo="bar"}|logfmt|subqueries!="0" # NOTE: "0" as string.
```
it returns
```
msg=hello subqueries=5
msg=hello
```

NOTE: it also returns log lines that doesn't contain `subqueries` label in the first
place.

This cause subtle error if used in metric queries using that `label` in unwrap.

e.g: following logql on above log lines returns
```
sum_over_time({foo="bar"}|logfmt|subqueries!="0"|unwrap subqueries[1m])
```

Returns `pipeline error: 'SampleExtractionErr'`. Because, the lines without `subqueries`
labels are not skipped during `ProcessLine` and during metric extraction, our extractor
sets this `SampleExtractionerror`.

Fixes: grafana#6713

* Add it on changelog

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>

* Fix linter

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>

* Add more tests and fix the edge case with equality to empty strings

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>

* typo on the comment

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>

* handle this edge case during metric extraction. Not in label filter

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>

* Fix linter

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>

* Remove commented code

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>

* Update changelog

Signed-off-by: Kaviraj <kavirajkanagaraj@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants