ARROW-6034: [C++][Gandiva] Add string functions in Gandiva #4942
e2f1742 to 54c8299
Codecov Report
@@ Coverage Diff @@
## master #4942 +/- ##
==========================================
+ Coverage 87.98% 89.15% +1.16%
==========================================
Files 910 722 -188
Lines 133521 101821 -31700
Branches 1418 0 -1418
==========================================
- Hits 117483 90779 -26704
+ Misses 16028 11042 -4986
+ Partials 10 0 -10
Continue to review full report at Codecov.
@@ -68,7 +68,20 @@ std::vector<NativeFunction> GetStringFunctionRegistry() {

       NativeFunction("like", {}, DataTypeVector{utf8(), utf8()}, boolean(),
                      kResultNullIfNull, "gdv_fn_like_utf8_utf8",
-                     NativeFunction::kNeedsFunctionHolder)};
+                     NativeFunction::kNeedsFunctionHolder),
+      NativeFunction("substr", {"substring"}, DataTypeVector{utf8(), int64(), int64()},
These are non-obvious. Can you please add comments for the params/functions?
done
uint64_t ctx_ptr = reinterpret_cast<int64>(&ctx);
int32 out_len = 0;

char* out_str = substr_utf8_int64_int64(ctx_ptr, "asdf", 4, 1, 2, &out_len);
Can you please add a test for zero-length output?
substr_utf8_int64_int64(ctx_ptr, "asdf", 4, 1, 0, &out_len)
done
thanks for the change, @pprudhvi
Add following functions in Gandiva: substr(str, offset, len), substr(str, offset), concat(str1, str2), castVARCHAR(timestamp, len), convert_fromUTF8(binary)

Closes apache#4942 from pprudhvi/utf8-funcs and squashes the following commits:

f88773e <Prudhvi Porandla> add len 0 substr unittest
3900f8e <Prudhvi Porandla> static cast size_t to int32
2082241 <Prudhvi Porandla> add convert_fromUTF8 method
112c933 <Prudhvi Porandla> add castVARCHAR(timestamp) method
77d3cdd <Prudhvi Porandla> add concatOperator
9e2623f <Prudhvi Porandla> add unittests for substr
48c4d08 <Prudhvi Porandla> add substr methods

Authored-by: Prudhvi Porandla <prudhvi.porandla@icloud.com>
Signed-off-by: Pindikura Ravindra <ravindra@dremio.com>
Add following functions in Gandiva:
substr(str, offset, len), substr(str, offset), concat(str1, str2), castVARCHAR(timestamp, len), convert_fromUTF8(binary)
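
For readers unfamiliar with what substr(str, offset, len) is expected to return, including the zero-length case raised in review, here is a rough standalone sketch. It is not the Gandiva implementation: `substr_sketch` and `substr_sketch_examples` are hypothetical names, and the semantics (1-based offset, negative offset counting from the end, byte-wise indexing on ASCII input, empty result for a non-positive length or an out-of-range offset) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper for illustration; NOT Gandiva's substr_utf8_int64_int64.
// Assumed semantics (not confirmed by this PR):
//   - offset is 1-based; offset 0 is treated like 1
//   - a negative offset counts back from the end of the string
//   - indexing is byte-wise, so this is only meaningful for ASCII input
//   - a non-positive length or an out-of-range offset yields ""
std::string substr_sketch(const std::string& input, int64_t offset, int64_t length) {
  const int64_t in_len = static_cast<int64_t>(input.size());
  if (length <= 0 || in_len == 0) {
    return "";
  }

  // Convert the assumed 1-based (or negative) offset to a 0-based start index.
  int64_t start = 0;
  if (offset > 0) {
    start = offset - 1;
  } else if (offset < 0) {
    start = in_len + offset;
  }
  if (start < 0 || start >= in_len) {
    return "";
  }

  // Clamp the requested length to the bytes that remain after start.
  const int64_t available = in_len - start;
  const int64_t count = length < available ? length : available;
  return input.substr(static_cast<std::size_t>(start), static_cast<std::size_t>(count));
}

// Mirrors the shape of the calls discussed in review, under the assumptions above.
void substr_sketch_examples() {
  assert(substr_sketch("asdf", 1, 2) == "as");  // offset 1, length 2
  assert(substr_sketch("asdf", 1, 0).empty());  // zero-length output
}
```

Returning `std::string` keeps the sketch free of the context pointer and `out_len` out-parameter used by the real function, so it shows only the indexing logic.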