Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/parcacol: Group all profiles of a write request into one buffer #2202

Merged
merged 2 commits into from
Dec 1, 2022

Conversation

brancz
Copy link
Member

@brancz brancz commented Nov 30, 2022

$ benchstat old.txt new.txt
name                              old time/op    new time/op    delta
ProfileColumnStoreWriteSeries-10    2.74ms ± 4%    0.00ms ± 6%  -99.99%  (p=0.000 n=10+10)

name                              old alloc/op   new alloc/op   delta
ProfileColumnStoreWriteSeries-10    4.05MB ± 0%    0.00MB ± 0%  -99.99%  (p=0.000 n=10+10)

name                              old allocs/op  new allocs/op  delta
ProfileColumnStoreWriteSeries-10     32.3k ± 0%      0.0k ± 0%  -99.98%  (p=0.000 n=10+10)

@brancz brancz requested a review from a team as a code owner November 30, 2022 18:36
@brancz brancz changed the title Single parquet buffer pkg/parcacol: Group all profiles of a write request into one buffer Nov 30, 2022
Comment on lines 181 to 188
func sortedKeys(m map[string]struct{}) []string {
keys := make([]string, 0, len(m))
for k := range m {
keys = append(keys, k)
}
sort.Strings(keys)
return keys
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func sortedKeys(m map[string]struct{}) []string {
keys := make([]string, 0, len(m))
for k := range m {
keys = append(keys, k)
}
sort.Strings(keys)
return keys
}
func sortedKeys(in map[string]struct{}) []string {
if len(in) == 0 {
return []string{}
}
out := maps.Keys(in)
sort.Strings(out)
return out
}

Sadly the sort.Sort isn't generic yet, but when it's going to be, there will only be one of this ever.

@metalmatze
Copy link
Member

YAY! 🎉 🚀
Soooo happy this has the same impact for Parca.

Copy link
Contributor

@thorfour thorfour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This benchmark was previously used to optimize label handling so it
didn't need profiling data, but that's now changing.
Comment on lines +98 to +100
for _, s := range series {
for _, normalizedProfiles := range s.Samples {
for _, p := range normalizedProfiles {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is the biggest change I would still propose to this.
Given that we built this nested series>profiles>samples construct earlier on, but then use it here for just a bit of reusing labels, I'd propose to flatten this earlier already.
Basically, the Ingest takes in []normalizedSamples and we only have one loop to iterate over.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants