Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coprocessor: Add batch aggregate function FIRST #4771

Merged
merged 14 commits into from May 29, 2019

Conversation

sticnarf
Copy link
Contributor

@sticnarf sticnarf commented May 25, 2019

Signed-off-by: Yilin Chen sticnarf@gmail.com

What have you changed? (mandatory)

This PR adds aggregate function FIRST under the new batch aggregate function framework.

What are the type of the changes? (mandatory)

New feature

How has this PR been tested? (mandatory)

Unit test

Does this PR affect documentation (docs) or release note? (mandatory)

No

Does this PR affect tidb-ansible update? (mandatory)

No


FIRST aggregate function is too special. I feel very uncomfortable to implement it. As all AggrFunctionState need to implement AggrFunctionStateUpdatePartial of all eval types, I need to write default impl of AggrFunctionStateUpdatePartial for my AggrFnStateFirst just for invalid types. Another problem is that I find it hard to pass and store a reference in an AggrFunctionState under the current design.

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
#[inline]
fn update(&mut self, _ctx: &mut EvalContext, value: &Option<T>) -> Result<()> {
if let AggrFnStateFirst::Empty = self {
// TODO: avoid this clone
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😕 There's no way to avoid the clone since value is an &Option<T> which cannot be moved out. You could avoid the clone only if value is Option<T> or &mut Option<T>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we cannot avoid it unless we change the parameter type. But we do not always need the clone, for example, we can actually move out the value if it is owned in the RpnStackNode. To accomplish that, we need a big change.

Copy link
Contributor Author

@sticnarf sticnarf May 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But for an aggregate function, it's rare we need to clone the value. FIRST is quite a rarely used aggregate function too. (I know only one case like this: SELECT count(*), some_col from t;) Maybe we can just ignore the problem here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We can ignore the clone cost, since it is called only once for every group, which is fine.

breezewish
breezewish previously approved these changes May 27, 2019
Copy link
Member

@breezewish breezewish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

src/coprocessor/dag/aggr_fn/impl_first.rs Outdated Show resolved Hide resolved
#[inline]
fn update(&mut self, _ctx: &mut EvalContext, value: &Option<T>) -> Result<()> {
if let AggrFnStateFirst::Empty = self {
// TODO: avoid this clone
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We can ignore the clone cost, since it is called only once for every group, which is fine.

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@breezewish
Copy link
Member

/run-integration-tests

breezewish
breezewish previously approved these changes May 27, 2019
Copy link
Member

@breezewish breezewish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine! Actually I wanted to see some comments like.. "FIRST only needs the first input datum and there is no need to process the rest, so manually implement update_repeat and update_vector".

@breezewish
Copy link
Member

breezewish commented May 27, 2019

According to this PR maybe we need some capability to customize update_repeat and update_vector only (instead of re-implementing the whole XxxPartal). We can refine it later.

@sticnarf
Copy link
Contributor Author

/run-integration-tests

@breezewish
Copy link
Member

Oh, another unstable test sadly

SELECT col0 + col1 AS col0, col0 AS col0 FROM tab0 GROUP BY col0 order by col0;

@breezewish
Copy link
Member

/run-integration-common-tests tidb-test=pr/826 tidb-private-test=pr/826

@breezewish
Copy link
Member

/run-integration-tests

@sticnarf
Copy link
Contributor Author

/run-integration-common-test

@breezewish breezewish added sig/coprocessor SIG: Coprocessor status/LGT1 Status: PR - There is already 1 approval labels May 28, 2019
@breezewish
Copy link
Member

/run-integration-tests

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
// FIRST outputs one column with the same type as its child
out_schema.push(aggr_def.take_field_type());

// FIRST doesn't need to cast, so using the expression directly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel weird to add the comment here. 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, seems too trivial... Let me delete it.

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@lonng lonng added status/LGT2 Status: PR - There are already 2 approvals and removed status/LGT1 Status: PR - There is already 1 approval labels May 29, 2019
@sticnarf
Copy link
Contributor Author

/run-all-tests

@breezewish
Copy link
Member

/run-integration-tests

@breezewish breezewish merged commit bed9a3c into tikv:master May 29, 2019
breezewish pushed a commit to breezewish/tikv that referenced this pull request Jun 12, 2019
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
breezewish pushed a commit to breezewish/tikv that referenced this pull request Jun 12, 2019
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
breezewish pushed a commit to breezewish/tikv that referenced this pull request Jun 12, 2019
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
sticnarf added a commit to sticnarf/tikv that referenced this pull request Oct 27, 2019
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/coprocessor SIG: Coprocessor status/LGT2 Status: PR - There are already 2 approvals
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants