Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change generated LogQL queries to be case insensitive by default #53

Merged
merged 4 commits into from
Jun 12, 2023

Conversation

kelnage
Copy link
Collaborator

@kelnage kelnage commented May 4, 2023

String values in Sigma rules must be interpreted as case insensitive, but when checking equality with strings in LogQL, those values are treated as case sensitive. To workaround this and ensure broad compatibility, any such string values must be converted into regular expressions that have the case insensitive flag (?i) at the start.

However, such LogQL queries will inherently be slower than the same queries using string equality, and hence I have added a case_insensitive flag (default: True) in the backend, which can be used to toggle back to the previous behaviour. This should only be used when the rule converter is confident that the Sigma rule's string values will have the same case as the data they are querying over.

Note: at this point, Loki has an open issue grafana/loki#9294 around its handling of case-insensitive field equality - so this PR should not be merged into main until it is resolved.

Closes #50.

SigmaString values should be treated as case insensitive. But Loki
treats equality as being case sensitive. To resolve this, the backend
will now convert all SigmaStrings into regular expressions with the
case insensitive flag set (?i).

Such queries will be less performant - hence the introduction of a
"case_insensitive" boolean flag (default: True). If the performance of a
query is important and the rule is believed to accurately match the case
of the log data, setting this to False will mean the generated query
uses string equality instead (i.e., the query will be case sensitive).
Largely involved changing a=`b` to a=~`(?i)b`, but a few more involved
changes (especially including the line filter functionality, which does
not currently handle case-insensitive searches as effectively as it
could).

Added new test files that were basically identical to the previous
files, but with the case_insensitivity flag set to False to ensure both
cases are covered.
Switching most tests to case insensitive meant that there were fewer
tests on strings, slightly reducing the coverage of the test suite.
@kelnage kelnage added bug Something isn't working querying labels May 4, 2023
@kelnage kelnage self-assigned this May 4, 2023
@kelnage kelnage requested a review from a team as a code owner May 4, 2023 10:49
@kelnage kelnage linked an issue May 4, 2023 that may be closed by this pull request
@kelnage kelnage changed the title Change generated Loki queries to be case insensitive by default Change generated LogQL queries to be case insensitive by default May 4, 2023
@github-actions
Copy link

github-actions bot commented May 4, 2023

Pull Request Test Coverage Report for Build 4990008363

  • 11 of 11 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.004%) to 99.777%

Totals Coverage Status
Change from base Build 4989655407: 0.004%
Covered Lines: 447
Relevant Lines: 448

💛 - Coveralls

Copy link
Contributor

@romain-gaillard romain-gaillard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@kelnage kelnage merged commit ab6b3e6 into main Jun 12, 2023
9 checks passed
@kelnage kelnage deleted the case-insensitive-default branch June 12, 2023 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working querying
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Case sensitivity in generated LogQL queries
2 participants