Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fix log level detection #12651

Merged
merged 26 commits into from
May 6, 2024
Merged

fix: Fix log level detection #12651

merged 26 commits into from
May 6, 2024

Conversation

shantanualsi
Copy link
Contributor

@shantanualsi shantanualsi commented Apr 17, 2024

What this PR does / why we need it:
Currently detection of log levels is not truly case insensitive. For eg. "warn" or "WARN" is detected, however "Warn" is not. The implementation uses string.Contains over the entire log line to check for these levels.

However, checking ignoring case would need us to toLower the entire log line which is inefficient. This PR detects the level first -> "level": in case of JSON and level= in case of logfmt and then compares the next 12 characters after that ignoring case.

The benchmarks show similar result on my computer with the fixed version at times few 100 ns faster.

implementation from main branch:
image
implementation with Fix (This PR):
image

The PR also supports log level TRACE and FATAL and even 3 letter abbreviations that might be common.

Edit:
Added more improvements after some discussions-

  1. Using logfmt and jsonparsers to parse each log line. After benchmarking, this performs slightly better than the previous implementation
  2. Changed the label from level to detected_level so that it is clear for users
  3. Support "Level", "Severity" and "Lvl" in lowercase, uppercase and Pascal case.

Few caveats:
- "level" should still be lowercase or block case ("LEVEL"). Documentation for configuration is updated. This does not support labels like Severity yet.

Which issue(s) this PR fixes:
Ref: #12645

Special notes for your reviewer:

Checklist

  • Reviewed the CONTRIBUTING.md guide (required)
  • Documentation added
  • Tests updated
  • Title matches the required conventional commits format, see here
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/setup/upgrade/_index.md
  • For Helm chart changes bump the Helm chart version in production/helm/loki/Chart.yaml and update production/helm/loki/CHANGELOG.md and production/helm/loki/README.md. Example PR
  • If the change is deprecating or removing a configuration option, update the deprecated-config.yaml and deleted-config.yaml files respectively in the tools/deprecated-config-checker directory. Example PR

@shantanualsi shantanualsi changed the title Fix log level detection fix: Fix log level detection Apr 17, 2024
@shantanualsi shantanualsi marked this pull request as ready for review April 17, 2024 13:21
@shantanualsi shantanualsi requested a review from a team as a code owner April 17, 2024 13:21
@@ -904,42 +904,46 @@ func extractLogLevelFromLogLine(log string) string {
}

if firstNonSpaceChar == '{' && lastNonSpaceChar == '}' {
if strings.Contains(log, `:"err"`) || strings.Contains(log, `:"ERR"`) ||
strings.Contains(log, `:"error"`) || strings.Contains(log, `:"ERROR"`) {
levelIndex := strings.Index(log, `"level":`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

problem is that it could be Level, LEVEL, serverity, Serverity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, i've mentioned it in the caveats in notes.. can we impose this in our documentation? or make this string configurable (case sensetive)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we pass a hint to this function? for example, I think we could change this code to something like the following

firstEntry := stream.Entries[0]
firtsStructuredMetadata := logproto.FromLabelAdaptersToLabels(firstEntry.StructuredMetadata)
prevTs := firstEntry.Timestamp
addLogLevel := validationContext.allowStructuredMetadata &&
				validationContext.discoverLogLevels &&
				!lbs.Has(labelLevel)

logLevelField := detectLogLevelFieldFromLogEntry(firstEntry, firtsStructuredMetadata)

this would allow detectLogLEvelFieldFromLogEntry to do a case insensitive search for a bunch of different options (ie. level, severity, etc) since it was only doing it once for the whole stream per push. We could then pass that as a hint to detectLogLevelFromLogEntry, allowing it to short-circuit when passed.

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea, I think we should may be support different field name ?

almost wonder if we should use some fast json and logfmt parser really to grab the key !

@pull-request-size pull-request-size bot added size/L and removed size/M labels Apr 19, 2024
@github-actions github-actions bot added the type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories label Apr 19, 2024
@shantanualsi
Copy link
Contributor Author

I found an old issue, a comment on which claims that 3 letter abbreviations for levels (dbg, wrn) are common as well (i haven't seen them yet). I've added support for them too in this PR.

Also, supports "level" or "LEVEL" and updated the documentation.

@cyriltovena If everything else looks okay, we can merge this fix first?

To support different labels such as severity including any other cases of ``level`, I think it should be a part of configuration. We can add support for that in a separate PR.

@shantanualsi
Copy link
Contributor Author

almost wonder if we should use some fast json and logfmt parser really to grab the key !

this is a good idea, i'll try and benchmark this. My hunch is that it's going to be much slower than the current bruteforce way...

@cyriltovena
Copy link
Contributor

almost wonder if we should use some fast json and logfmt parser really to grab the key !

this is a good idea, i'll try and benchmark this. My hunch is that it's going to be much slower than the current bruteforce way...

don't know about that I think it can be quite good. I'm specially worried for structured log.

Copy link
Collaborator

@trevorwhitney trevorwhitney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about trying to detect the level field once per stream per push rather than on every line?

@@ -904,42 +904,46 @@ func extractLogLevelFromLogLine(log string) string {
}

if firstNonSpaceChar == '{' && lastNonSpaceChar == '}' {
if strings.Contains(log, `:"err"`) || strings.Contains(log, `:"ERR"`) ||
strings.Contains(log, `:"error"`) || strings.Contains(log, `:"ERROR"`) {
levelIndex := strings.Index(log, `"level":`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we pass a hint to this function? for example, I think we could change this code to something like the following

firstEntry := stream.Entries[0]
firtsStructuredMetadata := logproto.FromLabelAdaptersToLabels(firstEntry.StructuredMetadata)
prevTs := firstEntry.Timestamp
addLogLevel := validationContext.allowStructuredMetadata &&
				validationContext.discoverLogLevels &&
				!lbs.Has(labelLevel)

logLevelField := detectLogLevelFieldFromLogEntry(firstEntry, firtsStructuredMetadata)

this would allow detectLogLEvelFieldFromLogEntry to do a case insensitive search for a bunch of different options (ie. level, severity, etc) since it was only doing it once for the whole stream per push. We could then pass that as a hint to detectLogLevelFromLogEntry, allowing it to short-circuit when passed.

@cyriltovena
Copy link
Contributor

Actually think it's fine for doing this on every line, I suggest though we make a custom image that runs the detection without applying the change, to test in prod and see the impact.

@shantanualsi
Copy link
Contributor Author

shantanualsi commented Apr 22, 2024

I'm spending some time today using json-parser and logfmt parser and benchmark.

@shantanualsi
Copy link
Contributor Author

is it entirely safe to assume that the label for "LEVEL" will be exactly the same in every line within a stream?

@shantanualsi
Copy link
Contributor Author

shantanualsi commented Apr 22, 2024

Turns out parsing is ~3x slower than the brute force way. (using our inbuilt parser implementations from logql_log)

1st run is using no parsers (original implementation with bugfix). 2nd run makes an attempt to parse using json and logfmt
Check commit
Screenshot 2024-04-22 at 16 34 32

@@ -385,10 +396,10 @@ func (d *Distributor) Push(ctx context.Context, req *logproto.PushRequest) (*log
}

structuredMetadata := logproto.FromLabelAdaptersToLabels(entry.StructuredMetadata)
if addLogLevel && !structuredMetadata.Has(labelLevel) {
if addLogLevel && !hasAnyLevelLabels(structuredMetadata) {
logLevel := detectLogLevelFromLogEntry(entry, structuredMetadata)
entry.StructuredMetadata = append(entry.StructuredMetadata, logproto.LabelAdapter{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we consider not adding any data if we don't find the level ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we should still keep the detected_level and set it to unknown even if we cannot find it to keep the queries uniform for the user? Might break queries or rules if the label is absent for some log lines at random

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might break queries or rules if the label is absent for some log lines at random.

I can't think of an example, this label shouldn't be in used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed the change to not set detected_level if unknown

@shantanualsi
Copy link
Contributor Author

sorry, ruined the PR by doing pull --rebase. Will try to reflog and revert or raise another PR with changes

@shantanualsi shantanualsi marked this pull request as ready for review May 6, 2024 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/L type/docs Issues related to technical documentation; the Docs Squad uses this label across many repositories
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants