Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #1746: Moving QUANTILE #1761

Merged
merged 26 commits into from May 19, 2021
Merged

Conversation

hawkfish
Copy link
Contributor

First pass at moving quantile implementation. There is more to do, but this has an important fix for the API by passing in NULL masks, which was a bug.

Richard Wesley and others added 24 commits April 28, 2021 13:43
Move ResizeState into QuantileState.
Eventually this will store the index
for moving quantile instead of the
values (which will be passed in.)
Simplify quantile to to leverate
temporal struct types.
Code and tests. Waiting on simpler MODE changes.
Move reused objects into the class
to improve framed performance by 2x.
Rework the infrastructure for `Frame` function support.
Reduce frame data to a union of both frames.
Refactor QUANTILE to remove Value.
Fix std:min/max vs Win64.
Rename Frame to Window.
Add support for NULLs to Window API
Check for oversized frames.
Add missing NULL detection and test it.
Reduce calls to Reset.
Removes one of the two main hot spots
in the fixed frame size test computation.

This code is shared with the segment tree
implementation, so it should help there too.
@hawkfish
Copy link
Contributor Author

The latest change is just a perf change, but it impacts all windowed aggregates by removing unnecessary memory allocations from most segment tree computations.

The perf improvement for windowed quantile (10M rows, fixed frame) is about 20%:

./build/reldebug/test/unittest test/sql/window/test_moving_quantile.test_slow  7.42s user 0.30s system 99% cpu 7.728 total
./build/reldebug/test/unittest test/sql/window/test_moving_quantile.test_slow  6.10s user 0.26s system 96% cpu 6.582 total

Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! The code looks good to me. Some more suggestions/ideas for the tests:

test/sql/window/test_quantile_window.test_slow Outdated Show resolved Hide resolved
test/sql/window/test_quantile_window.test_slow Outdated Show resolved Hide resolved
test/sql/window/test_mode_window.test Show resolved Hide resolved
Richard Wesley added 2 commits May 17, 2021 07:53
Add more tests for scattered NULLs.

Fix corner case in MODE Finalize
when the hash table is present but empty.
@Mytherin
Copy link
Collaborator

Thanks, looks great now!

@Mytherin Mytherin merged commit bc61944 into duckdb:master May 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants