pkg/query: Improve flamegraph performance through dictionary unification #3651

Merged: 4 commits, Aug 21, 2023

Member

@brancz brancz commented Aug 19, 2023

Previously we looked up each individual string to be inserted into the resulting flamegraph, which was very inefficient since the same strings occur many times over. This made dictionary building slow, and it also meant comparisons were done on strings rather than integers.

Switching to dictionary unification and transposing the indices cut the benchmark time by roughly 43%:

$ benchstat old.txt new.txt
name                old time/op  new time/op  delta
ArrowFlamegraph-10  5.95ms ± 0%  3.38ms ± 0%  -43.28%  (p=0.008 n=5+5)
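
For readers unfamiliar with the technique, here is a minimal sketch of dictionary unification and index transposition. It uses plain slices rather than the Arrow types this PR operates on, and every name in it (`dictColumn`, `unifier`, `unify`, `remap`) is hypothetical and chosen for illustration only. The idea is that each distinct string is hashed once per input dictionary, after which rows are rewritten with integer lookups alone.

```go
package main

import "fmt"

// dictColumn is a hypothetical stand-in for a dictionary-encoded string
// column: each row stores an index into dict instead of the string itself.
type dictColumn struct {
	dict    []string
	indices []int32
}

// unifier is an illustrative helper (not the Arrow API). It merges several
// dictionaries into one unified dictionary.
type unifier struct {
	unified []string
	lookup  map[string]int32
}

func newUnifier() *unifier {
	return &unifier{lookup: map[string]int32{}}
}

// unify touches each entry of dict exactly once and returns a transposition:
// transpose[i] is the unified index of dict[i].
func (u *unifier) unify(dict []string) []int32 {
	transpose := make([]int32, len(dict))
	for i, s := range dict {
		idx, ok := u.lookup[s]
		if !ok {
			idx = int32(len(u.unified))
			u.unified = append(u.unified, s)
			u.lookup[s] = idx
		}
		transpose[i] = idx
	}
	return transpose
}

// remap rewrites row indices through the transposition using integer
// lookups only; no string is hashed or compared per row.
func remap(indices, transpose []int32) []int32 {
	out := make([]int32, len(indices))
	for i, idx := range indices {
		out[i] = transpose[idx]
	}
	return out
}

func main() {
	a := dictColumn{dict: []string{"main", "foo"}, indices: []int32{0, 1, 1, 0}}
	b := dictColumn{dict: []string{"foo", "bar"}, indices: []int32{1, 0, 0}}

	u := newUnifier()
	ta := u.unify(a.dict)
	tb := u.unify(b.dict)

	fmt.Println(u.unified)            // [main foo bar]
	fmt.Println(remap(a.indices, ta)) // [0 1 1 0]
	fmt.Println(remap(b.indices, tb)) // [2 1 1]
}
```

The per-row cost drops from a string hash and comparison to a single slice index, which is the effect the description above attributes to unification and transposition.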

alwaysmeticulous bot commented Aug 19, 2023

🤖 Meticulous spotted visual differences in 26 of 171 screens tested: view and approve differences detected.

Last updated for commit b8255ee. This comment will update as new commits are pushed.

Member Author

brancz commented Aug 20, 2023

The latest commit (removing maps from tracking labels) improves it by another 15%:

$ benchstat old.txt new.txt
name                old time/op  new time/op  delta
ArrowFlamegraph-10  3.39ms ± 2%  2.87ms ± 1%  -15.52%  (p=0.008 n=5+5)

@@ -270,9 +281,106 @@ func generateFlamegraphArrowRecord(ctx context.Context, mem memory.Allocator, tr
	return record, fb.cumulative, fb.maxHeight + 1, 0, nil
}

type transpositions struct {
	mappingIDIndicesData *array.Data
Contributor

nit: I think it would be nicer to unify these *Data and *Indices fields into a struct to halve the number of fields here and e.g. Release calls.

Member Author

as in encapsulate them in a struct, or how do you mean?

Contributor

Yes
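
A rough sketch of what this suggestion could look like, assuming each dictionary column keeps one data value and one indices value that both need releasing. The `releaser` interface stands in for Arrow's reference-counted objects, and `transposition`, `countingReleaser`, and the field names are hypothetical, not the code that was merged.

```go
package main

import "fmt"

// releaser stands in for reference-counted Arrow values, which expose
// a Release method. It is an assumption made to keep the sketch
// free of external dependencies.
type releaser interface{ Release() }

// transposition (hypothetical name) pairs the two values kept per
// dictionary column so they are created and released together.
type transposition struct {
	data    releaser // e.g. the data backing the transposed indices
	indices releaser // e.g. the typed index array built on top of data
}

// Release releases both halves of the pair, skipping unset fields.
func (t *transposition) Release() {
	if t.indices != nil {
		t.indices.Release()
	}
	if t.data != nil {
		t.data.Release()
	}
}

// transpositions now holds one field per column instead of two.
type transpositions struct {
	mappingFile transposition
	mappingID   transposition
}

// Release needs one call per column rather than one per field.
func (t *transpositions) Release() {
	t.mappingFile.Release()
	t.mappingID.Release()
}

// countingReleaser is a test double that counts Release calls.
type countingReleaser struct{ released *int }

func (c countingReleaser) Release() { *c.released++ }

func main() {
	n := 0
	ts := transpositions{
		mappingFile: transposition{data: countingReleaser{&n}, indices: countingReleaser{&n}},
		mappingID:   transposition{data: countingReleaser{&n}, indices: countingReleaser{&n}},
	}
	ts.Release()
	fmt.Println(n) // 4
}
```

Grouping the pair this way halves the number of fields on `transpositions` and keeps each release in one place, which is the benefit the nit points at.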

Member

@metalmatze metalmatze left a comment

You even included the UI changes 🎉

Comment on lines 839 to 844
if r.Mapping.IsValid(locationRow) {
	fb.builderMappingStart.Append(r.MappingStart.Value(locationRow))
	fb.builderMappingLimit.Append(r.MappingLimit.Value(locationRow))
	fb.builderMappingOffset.Append(r.MappingOffset.Value(locationRow))
	fb.builderMappingFileIndices.Append(t.mappingFileIndices.Value(r.MappingFile.GetValueIndex(locationRow)))
	fb.builderMappingBuildIDIndices.Append(t.mappingIDIndices.Value(r.MappingBuildID.GetValueIndex(locationRow)))
Member

Oh, very nice!

@@ -26,16 +32,20 @@ export function nodeLabel(
  showBinaryName: boolean
): string {
  const functionName: string | null = table.getChild(FIELD_FUNCTION_NAME)?.get(row);
  const labelsOnly: boolean | null = table.getChild(FIELD_LABELS_ONLY)?.get(row);
  const labels: string | null = table.getChild(FIELD_LABELS)?.get(row);
  console.log(labelsOnly, labels);
Member

We probably want to remove this before merging.

    return functionName;
  }

  if (level === 1 && labelsOnly !== null && labelsOnly && labels !== null && labels !== '') {
Member

If we now have this labelsOnly column to read from, we can probably remove passing the level all the way down.

@brancz brancz merged commit efcf48c into main Aug 21, 2023
38 checks passed
@brancz brancz deleted the arrow-dict-unifier branch August 21, 2023 14:56