coprocessor: Add batch aggregate function FIRST #4771
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
#[inline]
fn update(&mut self, _ctx: &mut EvalContext, value: &Option<T>) -> Result<()> {
    if let AggrFnStateFirst::Empty = self {
        // TODO: avoid this clone
😕 There's no way to avoid the clone since `value` is an `&Option<T>`, which cannot be moved out. You could avoid the clone only if `value` is `Option<T>` or `&mut Option<T>`.
Yes, we cannot avoid it unless we change the parameter type. But we do not always need the clone: for example, we can actually move the value out if it is owned in the RpnStackNode. Accomplishing that would require a big change.
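To make the ownership point in this thread concrete, here is a minimal standalone sketch. It is not the TiKV code; the three function names are hypothetical and only illustrate how the parameter type decides whether a clone is needed.

```rust
/// With a shared reference, the only way to obtain an owned value is to clone.
fn store_from_ref<T: Clone>(value: &Option<T>) -> Option<T> {
    value.clone()
}

/// With an owned value, it can simply be moved; no clone is needed.
fn store_from_owned<T>(value: Option<T>) -> Option<T> {
    value
}

/// With a mutable reference, the value can be moved out with `Option::take`,
/// which leaves `None` behind in the caller's slot.
fn store_from_mut<T>(value: &mut Option<T>) -> Option<T> {
    value.take()
}
```

Only the first variant requires `T: Clone`, which is why avoiding the clone would mean changing the `update` signature.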
But for an aggregate function, it's rare that we need to clone the value. FIRST is quite a rarely used aggregate function too. (I know only one case like this: `SELECT count(*), some_col from t;`) Maybe we can just ignore the problem here.
Yes. We can ignore the clone cost, since it is called only once for every group, which is fine.
Cool!
/run-integration-tests
Fine! Actually I wanted to see some comments like: "FIRST only needs the first input datum and there is no need to process the rest, so manually implement `update_repeat` and `update_vector`".
According to this PR maybe we need some capability to customize
/run-integration-tests
Oh, another unstable test, sadly: `SELECT col0 + col1 AS col0, col0 AS col0 FROM tab0 GROUP BY col0 order by col0;`
/run-integration-common-tests tidb-test=pr/826 tidb-private-test=pr/826
/run-integration-tests
/run-integration-common-test
/run-integration-tests
// FIRST outputs one column with the same type as its child
out_schema.push(aggr_def.take_field_type());

// FIRST doesn't need to cast, so using the expression directly.
It feels weird to add the comment here. 🤔
Yes, seems too trivial... Let me delete it.
/run-all-tests
/run-integration-tests
Signed-off-by: Yilin Chen sticnarf@gmail.com
What have you changed? (mandatory)
This PR adds aggregate function FIRST under the new batch aggregate function framework.
What are the type of the changes? (mandatory)
New feature
How has this PR been tested? (mandatory)
Unit test
Does this PR affect documentation (docs) or release note? (mandatory)
No
Does this PR affect tidb-ansible update? (mandatory)
No
The `FIRST` aggregate function is too special, and I feel very uncomfortable implementing it. Because every `AggrFunctionState` needs to implement `AggrFunctionStateUpdatePartial` for all eval types, I need to write a default impl of `AggrFunctionStateUpdatePartial` for my `AggrFnStateFirst` just for the invalid types. Another problem is that I find it hard to pass and store a reference in an `AggrFunctionState` under the current design.
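The shape of the first problem can be sketched roughly as below. This is only an illustration under assumptions: `UpdatePartial`, `State`, and `FirstInt` are hypothetical, simplified names, and the real `AggrFunctionState` and `AggrFunctionStateUpdatePartial` traits may be declared differently.

```rust
/// Hypothetical per-type update trait, generic over the value type.
trait UpdatePartial<T> {
    /// Default body, used for value types that the state does not support.
    fn update(&mut self, _value: &Option<T>) -> Result<(), String> {
        Err("unsupported eval type for this aggregate state".to_string())
    }
}

/// Hypothetical umbrella trait: a state must accept every eval type
/// (only three types are listed here to keep the sketch short).
trait State: UpdatePartial<i64> + UpdatePartial<f64> + UpdatePartial<Vec<u8>> {}

/// A FIRST state that is only meaningful for integers.
/// Outer `None` means no row seen yet; inner `None` means the first row was NULL.
struct FirstInt {
    first: Option<Option<i64>>,
}

/// The one real implementation: remember the first value and ignore the rest.
impl UpdatePartial<i64> for FirstInt {
    fn update(&mut self, value: &Option<i64>) -> Result<(), String> {
        if self.first.is_none() {
            self.first = Some(*value);
        }
        Ok(())
    }
}

// These impls exist only to satisfy the `State` bound; they fall back to the
// default body, which rejects the call.
impl UpdatePartial<f64> for FirstInt {}
impl UpdatePartial<Vec<u8>> for FirstInt {}

impl State for FirstInt {}
```

Every impl except the `i64` one is there only to satisfy the trait bound, which matches the discomfort described above.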