-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Support usage trackers for received and discarded bytes. #11840
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have some mechanism for cleaning these metrics up if they haven't been updated recently, in which case they're just excess cardinality.
Also, it looks like disitrbutors processing Push
requests is roughly 10% of their CPU time, it would be good to know what kind of additional CPU cost we're incurring when we have some of these custom trackers for a tenant or two.
pkg/distributor/distributor.go
Outdated
type labelData struct { | ||
ls labels.Labels | ||
labels string | ||
hash uint64 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we modify this struct to just have the ls
? or do we call access the labels
enough that ls.String()
would be too expensive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's really weird. We use it to override stream.Labels
. I think this is because the parsing also sorts the labels.
pkg/distributor/validator.go
Outdated
return fmt.Errorf(validation.LineTooLongErrorMsg, maxSize, labels, len(entry.Line)) | ||
} | ||
|
||
if len(entry.StructuredMetadata) > 0 { | ||
if !ctx.allowStructuredMetadata { | ||
validation.DiscardedSamples.WithLabelValues(validation.DisallowedStructuredMetadata, ctx.userID).Inc() | ||
validation.DiscardedBytes.WithLabelValues(validation.DisallowedStructuredMetadata, ctx.userID).Add(float64(len(entry.Line))) | ||
if v.customStreamsTracker != nil { | ||
for _, matchedLbs := range ctx.customTrackerConfig.MatchTrackers(labels) { | ||
v.customStreamsTracker.DiscardedBytesAdd(ctx.userID, matchedLbs, float64(len(entry.Line))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be useful to pass the reason why the bytes have been discarded
pkg/loghttp/push/otlp.go
Outdated
@@ -145,6 +145,18 @@ func otlpToLokiPushRequest(ld plog.Logs, userID string, tenantsRetention Tenants | |||
labelsStr := streamLabels.String() | |||
|
|||
lbs := modelLabelsSetToLabelsList(streamLabels) | |||
|
|||
// Init custom streams tracking | |||
trackedLabels := customTrackersConfig.MatchTrackers(lbs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we take into account the labels from structured metadata to track them also?
pkg/loghttp/push/otlp.go
Outdated
|
||
for i := range trackedLabels { | ||
stats.logLinesBytesCustomTrackers[i].Bytes[tenantsRetention.RetentionPeriodFor(userID, lbs)] += int64(len(entry.Line)) | ||
stats.structuredMetadataBytesCustomTrackers[i].Bytes[tenantsRetention.RetentionPeriodFor(userID, lbs)] += int64(labelsSize(entry.StructuredMetadata) - resourceAttributesAsStructuredMetadataSize - scopeAttributesAsStructuredMetadataSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we need to include in this metric everything that is added to structured metadata
stats.structuredMetadataBytesCustomTrackers[i].Bytes[tenantsRetention.RetentionPeriodFor(userID, lbs)] += int64(labelsSize(entry.StructuredMetadata) - resourceAttributesAsStructuredMetadataSize - scopeAttributesAsStructuredMetadataSize) | |
stats.structuredMetadataBytesCustomTrackers[i].Bytes[tenantsRetention.RetentionPeriodFor(userID, lbs)] += int64(labelsSize(entry.StructuredMetadata)) |
pkg/loghttp/push/push.go
Outdated
if err != nil { | ||
return nil, nil, fmt.Errorf("couldn't parse labels: %w", err) | ||
} | ||
} | ||
trackedLabels := customTrackers.MatchTrackers(lbs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call it trackers
trackedLabels := customTrackers.MatchTrackers(lbs) | |
trackers := customTrackers.MatchTrackers(lbs) |
@vlad-diachenko and @cstyan I've changed the code to just inject a usage tracker. This is quite generic and allows us to encapsulate the labels filtering in summing in a few simple methods. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but I still think the additional CPU usage is worth measuring
Also, it looks like disitrbutors processing Push requests is roughly 10% of their CPU time, it would be good to know what kind of additional CPU cost we're incurring when we have some of these custom trackers for a tenant or two.
Thanks @cstyan. I'll measure the impact when I'm adding a concrete tracker. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great 💎
What this PR does / why we need it:
Currently, Loki exposes the
distributor_bytes_received_total
anddiscarded_bytes_total
metrics. These only expose the tenant and the discarded reason in the labels. In some cases it is useful to be specific to to certain streams. E.g. one might want to know how many bytes from a certain region were received and dropped. This is where usage trackers come in.They receive the added or discarded bytes of a stream and its labels. This can be used to build a custom tracker such as Mimir's grafana/mimir#1188 which uses regex matchers and exports the active series as a Prometheus metric.
Checklist
CONTRIBUTING.md
guide (required)CHANGELOG.md
updatedadd-to-release-notes
labeldocs/sources/setup/upgrade/_index.md
production/helm/loki/Chart.yaml
and updateproduction/helm/loki/CHANGELOG.md
andproduction/helm/loki/README.md
. Example PRdeprecated-config.yaml
anddeleted-config.yaml
files respectively in thetools/deprecated-config-checker
directory. Example PR