Expose intermediary states in aggregation functions #16239
Unanswered
mgrenonville
asked this question in
Q&A
Replies: 2 comments
-
👋 -- perhaps https://docs.rs/datafusion/latest/datafusion/logical_expr/trait.Accumulator.html#tymethod.state is relevant
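For readers unfamiliar with that trait, here is a rough Python sketch of the idea behind it. It is not DataFusion code: the method names (update_batch, state, merge_batch, evaluate) mirror the methods on the Rust Accumulator trait, but the AvgAccumulator class and its (total, count) state layout are assumptions made up for illustration.

```python
class AvgAccumulator:
    """Toy average accumulator (hypothetical; only the method names mirror the trait)."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update_batch(self, values):
        # Fold raw input values into the running state.
        for v in values:
            self.total += v
            self.count += 1

    def state(self):
        # Intermediate state: enough information to resume or merge later.
        return (self.total, self.count)

    def merge_batch(self, states):
        # Combine intermediate states produced by other accumulators.
        for total, count in states:
            self.total += total
            self.count += count

    def evaluate(self):
        # Final value; None when no rows were seen.
        return self.total / self.count if self.count else None


# Two partial aggregations, e.g. over two partitions of the input.
a = AvgAccumulator()
a.update_batch([1.0, 2.0, 3.0])
b = AvgAccumulator()
b.update_batch([10.0])

# A final accumulator sees only the intermediate states, never the raw rows.
final = AvgAccumulator()
final.merge_batch([a.state(), b.state()])
result = final.evaluate()
```

The value returned by state() is what the question is about: if it could be surfaced to the user and stored, merge_batch could be applied to it in a later query.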
-
Thanks for your reply.
-
Hello,
While looking at DataFusion (what an awesome project!), I wondered whether it is possible to expose intermediary aggregation states (i.e., before merge_batch) to support what ClickHouse calls the "-State", "-Merge", and "-MergeState" combinators.
This allows ClickHouse to persist pre-aggregated data keyed by the grouping key, which compresses the data without losing the ability to filter it.
For example, uniqState returns a statistical structure (a kind of count-min sketch) that can be merged later, at query time. With this, it is easy to store one uniqState per minute and query uniqMerge per hour.
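To make the per-minute / per-hour idea concrete, here is a minimal Python sketch. It is not ClickHouse or DataFusion code: uniq_state and uniq_merge are hypothetical helpers named after the combinators, the sample events are invented, and an exact Python set stands in for the approximate structure a real implementation would store.

```python
from collections import defaultdict


def uniq_state(events):
    """Build one mergeable state per minute from (minute, user_id) events.

    Assumption: a plain set is used as the state; a real system would store
    a compact approximate sketch instead.
    """
    states = defaultdict(set)
    for minute, user_id in events:
        states[minute].add(user_id)
    return dict(states)


def uniq_merge(states):
    """Merge several stored states and return the distinct count."""
    merged = set()
    for state in states:
        merged |= state
    return len(merged)


# Invented sample data: (minute offset, user id).
events = [(0, "a"), (0, "b"), (1, "a"), (1, "c"), (61, "a"), (61, "d")]

# Persisted once, at ingestion time: one state per minute.
per_minute = uniq_state(events)

# At query time: regroup the stored states by hour and merge them.
per_hour = defaultdict(list)
for minute, state in per_minute.items():
    per_hour[minute // 60].append(state)

hourly_uniq = {hour: uniq_merge(states) for hour, states in per_hour.items()}
```

Because the stored rows are still keyed by minute, a query can filter on the minute (or any other grouping key) before merging, which is the property I would like to keep.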
Thanks