Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coprocessor/dag/expr: add some un-pushed builtin UDF(Reverse, ReverseBinary) #3435

Merged
merged 21 commits into from
Sep 3, 2018

Conversation

spongedu
Copy link
Collaborator

@spongedu spongedu commented Aug 11, 2018

What have you changed? (mandatory)

implement builtin function: Reverse, ReverseBinary in coprocessor.
This is for #3275

What are the type of the changes? (mandatory)

Improvement

How has this PR been tested? (mandatory)

unittest

Does this PR affect documentation (docs/docs-cn) update? (mandatory)

Does this PR affect tidb-ansible update? (mandatory)

Refer to a related PR or issue link (optional)

#3275

Benchmark result if necessary (optional)

Add a few positive/negative examples (optional)

Signed-off-by: spongedc <duchuangucas@live.com>
@sre-bot
Copy link
Contributor

sre-bot commented Aug 11, 2018

Hi contributor, thanks for your PR.

This patch needs to be approved by someone of admins. They should reply with "/ok-to-test" to accept this PR for running test automatically.

siddontang
siddontang previously approved these changes Aug 12, 2018
Copy link
Contributor

@siddontang siddontang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AndreMouche AndreMouche added sig/coprocessor SIG: Coprocessor contribution This PR is from a community contributor. labels Aug 13, 2018
AndreMouche
AndreMouche previously approved these changes Aug 13, 2018
Copy link
Member

@AndreMouche AndreMouche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

breezewish
breezewish previously approved these changes Aug 13, 2018
Copy link
Member

@breezewish breezewish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot! Please resolve conflicts.

Signed-off-by: spongedc <duchuangucas@live.com>
@spongedu
Copy link
Collaborator Author

@breeswish conflicts resolved. PTAL ~

ctx: &mut EvalContext,
row: &'a [Datum],
) -> Result<Option<Cow<'a, [u8]>>> {
let s = try_opt!(self.children[0].eval_string(ctx, row));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here we need to call eval_string_and_decode instead of call str::from_utf8 directly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hicqu . I've fix this. PTAL. I made some modifications on eval_string_and_decode, disable the charset check and use a TODO comment instead because different charsets may be pushed down from TiDB and some builtins depends on it. UPPER, LOWER for example.

@hicqu
Copy link
Contributor

hicqu commented Aug 13, 2018

rest LGTM.

breezewish
breezewish previously approved these changes Aug 13, 2018
Signed-off-by: spongedc <duchuangucas@live.com>
Signed-off-by: spongedc <duchuangucas@live.com>
breezewish
breezewish previously approved these changes Aug 13, 2018
if types::is_binary_str(self.children[0].get_tp()) {
return Ok(Some(s));
return Ok(Some(Cow::Owned(s.to_string().into_bytes())));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can directly s.into_bytes().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&str has no into_bytes method . We can either s.to_string().into_bytes() or s.as_bytes().to_vec(), I think they are equal....

Signed-off-by: spongedc <duchuangucas@live.com>
Signed-off-by: spongedc <duchuangucas@live.com>
return s.map(Some);
}
Err(box_err!("unsupported charset: {}", chrst))
// TODO: Deal with more supported codec in the future
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it for test case?

];
let mut ctx = EvalContext::default();
for (arg, exp) in cases {
let op = scalar_func_expr(ScalarFuncSig::Reverse, &[datum_expr(arg)]);
Copy link
Contributor

@hicqu hicqu Aug 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to do op.mut_field_type().set_charset(...). Now only utf8 is supported, so set it to utf8 is ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can take a look here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hicqu In TiDB, for binary string types such as BINARY, VARBINARY and so on, the charset and collate is set to binary. These settings are pushed down to the kv layer. If we panic on non-utf8 charsets, we can't deal with these binary string types. For example, type BINARY has the type setting as follows:

         for (input, exp) in cases {
             let input = string_datum_expr_with_tp(
                 input,
                 VAR_STRING,
                 BINARY_FLAG,
                 -1,
                 CHARSET_BIN.to_owned(),
                 COLLATION_BIN_ID,
             );

and the implementation of builtin UPPER in TiDB is as follows:

func (b *builtinUpperSig) evalString(row chunk.Row) (d string, isNull bool, err error) {
	d, isNull, err = b.args[0].EvalString(b.ctx, row)
	if isNull || err != nil {
		return d, isNull, errors.Trace(err)
	}

	if types.IsBinaryStr(b.args[0].GetType()) {
		return d, false, nil
	}

	return strings.ToUpper(d), false, nil
}

…val_string_and_decode'

Signed-off-by: spongedc <duchuangucas@live.com>
@spongedu
Copy link
Collaborator Author

@hicqu I've revert the changes on eval_string_and_decode, implement ReverseBinary and modified Reverse. PTAL :)

@spongedu spongedu changed the title coprocessor/dag/expr: add some un-pushed builtin UDF(Reverse) coprocessor/dag/expr: add some un-pushed builtin UDF(Reverse, ReverseBinary) Aug 14, 2018
@hicqu
Copy link
Contributor

hicqu commented Aug 14, 2018

LGTM. :)

hicqu
hicqu previously approved these changes Aug 14, 2018
AndreMouche
AndreMouche previously approved these changes Aug 15, 2018
Copy link
Member

@AndreMouche AndreMouche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@spongedu spongedu dismissed stale reviews from AndreMouche and hicqu via c222115 August 27, 2018 16:28
@spongedu
Copy link
Collaborator Author

@hicqu @AndreMouche PTAL

Copy link
Contributor

@huachaohuang huachaohuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@huachaohuang
Copy link
Contributor

@hicqu @breeswish Please check and approve if everything goes well.

@huachaohuang huachaohuang added the status/LGT1 Indicates that a PR has LGTM 1. label Aug 30, 2018
@huachaohuang
Copy link
Contributor

@hicqu @breeswish PTAL

@hicqu
Copy link
Contributor

hicqu commented Sep 3, 2018

LGTM.

@huachaohuang
Copy link
Contributor

/run-unit-test

@huachaohuang huachaohuang merged commit 5aa8768 into tikv:master Sep 3, 2018
sticnarf pushed a commit to sticnarf/tikv that referenced this pull request Oct 27, 2019
…Binary) (tikv#3435)

Signed-off-by: spongedc <duchuangucas@live.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution This PR is from a community contributor. sig/coprocessor SIG: Coprocessor status/LGT1 Indicates that a PR has LGTM 1.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants