-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Implement the 'left' function in issue #98545 #98660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dreamquster
commented
Aug 21, 2023
- implement 'Left' function but document is needed to complete.
nik9000
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a guide for functions that'd be worth looking at!
#98648
I think you might have discovered a sneaky problem with substring. Or maybe I found it while reviewing your code. Either way, please have a look at the guide and see where you can get. For the utf-8 allocation stuff I'd be ok skipping it in this PR. I can look in a followup if you don't have time for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We try not to fail running queries. It's fine to fail them when planning though. This will fail. But if you add IllegalArgumentException to warnExceptions it'll return a warning instead.
I noticed that MySQL and Postgres don't fail here. They actually do different things, but it's worth thinking about what's right. https://www.db-fiddle.com/f/fB8wEtxY9JYQwrmEFtkmJ4/0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I get it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be lovely if you could do this without allocating on ever iteration. I think it's possible with something like:
static BytesRef process(@Fixed BytesRef out, @Fixed UTF8CodePoint cp, BytesRef str, int length) {
...
result.bytes = str.bytes;
result.offset = str.offset;
result.length = str.length;
for (int i = 0; i < length; i++) {
cp = UnicodeUtil.codePointAt(result.bytes, result.start, cp);
result.start += cp.numBytes;
result.length -= cp.numBytes;
}
result.length = Math.max(0, result.length);
return result;
}
At least, I think something like that'd work. Actually, I think I wrote right instead of left, but something like that.
With yours we have to convert the whole thing into a utf16 string and back again. Also, I'm a bit worried about places where utf-16 characters aren't a single code point. I think this'll cut them up differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm. It looks like Substring has the same problem....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option, I'm not sure if it's worth it, is to just return the SubstringEvaluator here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good that just return SubstringEvaluator(str, 0, length)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm. It looks like
Substringhas the same problem....
I just modify the string in place at first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's wrong with my opertion?
I check out a branch named 'left-function' at local and commit some code.
Then I merge the code from elastic repo into local branch and push rebasely to my forked repo.
But why is there such a large difference in the pull request?
|
Pinging @elastic/es-ql (Team:QL) |
|
Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL) |
|
Something seems to have broken in the merge! |
|
You can't merge and rebase. That rewrites all the commits in your branch.
Only merge.
…On Sat, Aug 26, 2023, 10:08 AM dreamquster ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/string/Left.java
<#98660 (comment)>
:
> + this.length = length;
+ }
+
+ @evaluator
+ static BytesRef process(BytesRef str, int length) {
+ if (str.length == 0) {
+ return str;
+ }
+ if (length < 0) {
+ throw new IllegalArgumentException("Length parameter cannot be negative, found [" + length + "]");
+ }
+
+ int codePointCount = UnicodeUtil.codePointCount(str);
+ int indexEnd = Math.min(codePointCount, length);
+ String s = str.utf8ToString();
+ return new BytesRef(s.substring(0, indexEnd));
What's wrong with my opertion?
I check out a branch named 'left-function' at local and commit some code.
Then I merge the code from elastic repo into local branch and push
rebasely to my forked repo.
But why is there such a large difference in the pull request?
—
Reply to this email directly, view it on GitHub
<#98660 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABUXIQKXB45XB47CZJJAX3XXH7MTANCNFSM6AAAAAA3YCDSDY>
.
You are receiving this because your review was requested.Message ID:
***@***.***>
|
2682334 to
ca3dc3a
Compare