-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add support for discovering and adding log levels as structured metadata #12428
Conversation
return extractLogLevelFromLogLine(entry.Line) | ||
} | ||
|
||
func extractLogLevelFromLogLine(log string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add the benchmark please this will be useful to compare changes ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I'd love to have the benchmark in the PR if possible.
@@ -57,6 +58,13 @@ const ( | |||
|
|||
labelServiceName = "service_name" | |||
serviceUnknown = "unknown_service" | |||
labelLevel = "level" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we, at least for now, rename that to something like?
labelLevel = "level" | |
labelLevel = "level_detected" |
I'm a bit afraid it could lead to false detections, and then it would even overwrite the "real" detected label from the log line and change that to level_extracted
, which might break users workflows.
Take this logline as an example:
logger=tsdb.loki endpoint=queryData pluginId=loki dsName=grafanacloud-securityops-logs dsUID=grafanacloud-logs traceID=e6560d272e58b8bd94ed0d6da53c5a30 fromAlert=true end=2024-04-02T10:52:26.954369164Z step=30s query="{error=~\".+\"} " queryType=rangestatus=ok level=info
It's just an info
log, but the detection would see the error=
, which is part of the "parsed" query
label. Leading to add the logLevelError
as structured metadata, and the parsed level
label renamed to level_extracted
.
What this PR does / why we need it:
The most common way to query logs from Loki is using Grafana. Grafana relies on log levels to colour-code logs and histograms to help users see the distribution of logs by various levels. It would be good to have these log levels present in a uniform way for all the logs.
This PR adds support for detecting and adding log levels as structured metadata, if it is not already present in the stream labels or structured metadata. The feature is behind per-tenant configuration and is kept disabled by default.
For otlp logs, we are using severityNumber from the otel spec to map them to a log level as defined here. For non-otel logs we detect the log level by looking for specific keywords in the log line. If we do not find any of the keywords in the log line, we default to
info
level for that log. The initial implementation is simple and can be refined further.We are using
strings.Contains
which might seem in-optimal, but I did some benchmarks and compared it against regex and Aho–Corasick go implementations.strings.Contains
outperformed them in all the aspects. We can revisit this if we see any performance bottlenecks.Checklist
CHANGELOG.md
updated