Adds push limits on profiles data #2157
Conversation
pkg/validation/limits.go
f.IntVar(&l.MaxQueryParallelism, "querier.max-query-parallelism", 0, "Maximum number of queries that will be scheduled in parallel by the frontend.")
f.IntVar(&l.MaxProfileSizeBytes, "validation.max-profile-size-bytes", 2*1024*1024, "Maximum size of a profile in bytes. This is based off the uncompressed size. 0 to disable.")
f.IntVar(&l.MaxProfileStacktraceSamples, "validation.max-profile-stacktrace-samples", 2000, "Maximum number of samples in a profile. 0 to disable.")
I wonder if this value is overly conservative. With the default 15s duration of a profiling session and a 100Hz sample rate, a single CPU core may emit up to 1,500 samples (edge case). Of course, most of them point to the same stack trace and are therefore aggregated, but a 32-core machine may easily generate a profile containing a few thousand samples with unique stack traces.
Also, senders of cumulative profiles may always hit this limit. This is especially true for Go heap profiles. The same goes for validation.max-profile-size-bytes: cumulative (before diffing) heap profiles may reach megabytes in size, so it may make sense to increase that limit to something like 4MB by default.
Another important factor here is pprof labels: they may dramatically increase the cardinality of samples. We should either drop them and dedup samples, or take them into account. I'd go for the former.
Should we set it to something between 4K and 16K?
Yeah, sounds good; let me update that.
LGTM! I'd maybe adjust some of the default values.
if depthLimit != 0 && len(s.LocationId) > depthLimit {
	// truncate the deepest frames
	s.LocationId = s.LocationId[:depthLimit]
}
It would be nice to have a stub like "other" – I think it could be a separate PR.
if depthLimit != 0 && len(s.LocationId) > depthLimit {
	// truncate the deepest frames
	s.LocationId = s.LocationId[:depthLimit]
}
I just realised that this truncates the parent locations... This may make the flamegraph unreadable. I guess we could achieve better results by cutting the stack tip instead:
s.LocationId = s.LocationId[len(s.LocationId)-depthLimit:]
Yeah, that wasn't intentional; let's fix it.
This adds 5 new limits.
Closes https://github.com/grafana/pyroscope-squad/issues/16