This repository has been archived by the owner on Jul 19, 2023. It is now read-only.
Sort Profiles across row groups when flushing blocks #803
Merged
cyriltovena force-pushed the feat/sort-profiles branch from 1e2c0d7 to b13c199 (June 28, 2023 11:31)
cyriltovena force-pushed the feat/sort-profiles branch from b13c199 to 31f82fa (June 28, 2023 11:32)
cyriltovena commented Jun 28, 2023
I was able to speed this up quite significantly.
kolesnikovae approved these changes Jul 4, 2023
LGTM!
pkg/parquet/row_reader.go (outdated), comment on lines 109 to 116:
    n, err := r.reader.ReadRows(r.buff)
    if n == 0 {
        return false
    }
    if err != nil && err != io.EOF {
        r.err = err
        return false
    }
Maybe we should check the error before n == 0, so that we don't miss it.
yeah good point
simonswine pushed a commit to simonswine/pyroscope that referenced this pull request (Jul 18, 2023)
* Sort Profiles across row groups when flushing blocks
* Refactor and properly tests parquet reader and writer
* moar tests
* Fixes bad uint32 to int32 conversion causing maxval to be min
* add working tests back
* add missing tests back
* improve less function for profile rows
* check error first
Fixes #800
This PR k-way merges row groups on disk when we flush to a full block, which means a single series is now expected to appear in only one row group.
The only caveat is that we only cut the final file's row groups by row count, but I think we can live with this since we never cut by size.
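As a rough illustration of the k-way merge idea (not the implementation in this PR), the sketch below merges several already-sorted runs with a min-heap from `container/heap`. The names `cursor`, `minHeap`, and `mergeRuns` are hypothetical, and plain `int` values stand in for profile rows ordered by series.

```go
package main

import "container/heap"

// cursor tracks the read position inside one sorted input run.
type cursor struct {
	run []int
	pos int
}

// minHeap orders cursors by the value each one currently points at.
type minHeap []*cursor

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].run[h[i].pos] < h[j].run[h[j].pos] }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	n := len(old)
	c := old[n-1]
	*h = old[:n-1]
	return c
}

// mergeRuns merges already-sorted runs into a single sorted slice.
func mergeRuns(runs [][]int) []int {
	h := &minHeap{}
	total := 0
	for _, r := range runs {
		if len(r) > 0 { // empty runs contribute nothing
			*h = append(*h, &cursor{run: r})
			total += len(r)
		}
	}
	heap.Init(h)

	out := make([]int, 0, total)
	for h.Len() > 0 {
		c := (*h)[0] // cursor holding the smallest current value
		out = append(out, c.run[c.pos])
		c.pos++
		if c.pos == len(c.run) {
			heap.Pop(h) // this run is exhausted
		} else {
			heap.Fix(h, 0) // restore heap order after advancing
		}
	}
	return out
}
```

Because equal keys from different runs come out next to each other, all rows for one series end up contiguous in the merged output, which is what lets a series land in a single row group.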