Fixed aggregation on sparse grouping columns #1068
e4da02f to 401bd18
Thank you, @JohanMabille.
LGTM modulo a few comments.
Here are some comments. I would wait for someone else's approval before proceeding, because I do not yet have comprehensive knowledge of Column's internals.
Also, is there a non-regression test we could craft for #528?
{
iter = std::make_optional(input_data.bit_vector()->first());
// We use 0 for the missing value group id
next_group_id++;
This LGTM, but I cannot tell whether this could invalidate some implicit invariant that the aggregation clause or its tests might have been relying on.
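To make the invariant in question concrete, here is a minimal, self-contained sketch of the scheme the snippet's comment describes: group id 0 is reserved for missing values, and real values get ids starting at 1. It is not ArcticDB's implementation. `SparseColumn` and `assign_group_ids` are hypothetical names, and a vector of `std::optional` stands in for the real column and its bit vector.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a sparse column: std::nullopt marks an absent row.
using SparseColumn = std::vector<std::optional<int>>;

// Group id 0 is reserved for rows whose grouping value is missing.
// Present values receive ids starting at 1, in order of first appearance.
std::vector<std::size_t> assign_group_ids(const SparseColumn& column) {
    constexpr std::size_t missing_group_id = 0;
    std::size_t next_group_id = 1;
    std::unordered_map<int, std::size_t> value_to_group;
    std::vector<std::size_t> row_to_group;
    row_to_group.reserve(column.size());

    for (const auto& value : column) {
        if (!value.has_value()) {
            row_to_group.push_back(missing_group_id);
            continue;
        }
        // try_emplace only inserts when the key is not already present.
        const auto [it, inserted] = value_to_group.try_emplace(*value, next_group_id);
        if (inserted) {
            ++next_group_id;
        }
        row_to_group.push_back(it->second);
    }
    // Every row, present or absent, must end up in exactly one group.
    assert(row_to_group.size() == column.size());
    return row_to_group;
}
```

Under this scheme, any downstream code that assumed group ids start at 0 for the first real value would be off by one, which is the kind of implicit invariant worth checking in the tests.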
e60a987 to 1b6d3d1
d5bc06b to ae94957
cpp/arcticdb/processing/clause.cpp
Outdated
@@ -304,6 +304,20 @@ Composite<ProcessingUnit> AggregationClause::process(std::shared_ptr<Store> stor
// 11.01 seconds with caching
// Not worth worrying about right now
robin_hood::unordered_flat_map<RawType, size_t> offset_to_group;

bool is_sparse = col.column_->is_sparse();
Could be const, although I suspect it won't make a lot of difference
ae94957 to 51f928e
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
51f928e to 6941e7f
Sparsy Sparsy Sparsy me
Reference Issues/PRs
Second part of #1007
Fixes #528
What does this implement or fix?
Any other comments?
mark_absent_rows on the grouping column, which is fragile. More generally, Column constructors should force the user to pass the size of the column for sparse columns.
Checklist
Checklist for code changes...
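As an illustration of the constructor suggestion under "Any other comments?", the sketch below shows one way a sparse column type could require its logical row count up front. `SparseColumnSketch` is a hypothetical class, not ArcticDB's `Column`. The storage layout (a presence mask plus a dense value buffer) and the assumption that callers pass row indices in ascending order are also assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical type, not ArcticDB's Column. The logical row count is a
// mandatory constructor argument, so absent rows (including trailing ones)
// are known at construction time instead of being marked later.
class SparseColumnSketch {
public:
    // `present` lists (row index, value) pairs for the rows that hold data.
    // Assumption for this sketch: pairs arrive in ascending row order.
    SparseColumnSketch(std::size_t row_count,
                       const std::vector<std::pair<std::size_t, double>>& present)
        : present_mask_(row_count, false) {
        values_.reserve(present.size());
        for (const auto& [row, value] : present) {
            if (row >= row_count) {
                throw std::out_of_range("row index exceeds declared row count");
            }
            if (present_mask_[row]) {
                throw std::invalid_argument("duplicate row index");
            }
            present_mask_[row] = true;
            values_.push_back(value);
        }
        // A sparse column can never hold more values than it has rows.
        assert(values_.size() <= row_count);
    }

    std::size_t row_count() const { return present_mask_.size(); }
    std::size_t present_count() const { return values_.size(); }
    bool is_present(std::size_t row) const { return present_mask_.at(row); }

private:
    std::vector<bool> present_mask_;  // one flag per logical row
    std::vector<double> values_;      // dense storage for present rows only
};
```

With such a constructor, the number of absent rows is simply `row_count() - present_count()`, so no separate call is needed to record them afterwards.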