Conversation

@dreamquster
Contributor

  1. Implement the 'Left' function. Documentation still needs to be added to complete it.

@elasticsearchmachine elasticsearchmachine added v8.11.0 needs:triage Requires assignment of a team area label external-contributor Pull request authored by a developer outside the Elasticsearch team labels Aug 21, 2023
@nik9000 nik9000 self-requested a review August 21, 2023 11:16
Member

@nik9000 nik9000 left a comment

I added a guide for functions that'd be worth looking at!
#98648

I think you might have discovered a sneaky problem with substring. Or maybe I found it while reviewing your code. Either way, please have a look at the guide and see where you can get. For the utf-8 allocation stuff I'd be ok skipping it in this PR. I can look in a followup if you don't have time for it.

Member

We try not to fail running queries. It's fine to fail them when planning though. This will fail. But if you add IllegalArgumentException to warnExceptions it'll return a warning instead.

I noticed that MySQL and Postgres don't fail here. They actually do different things, but it's worth thinking about what's right. https://www.db-fiddle.com/f/fB8wEtxY9JYQwrmEFtkmJ4/0
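To make the options in this comment concrete, here is a minimal, self-contained sketch of three ways to treat a negative length. It does not use the ES|QL evaluator framework: the class, the method names, and the `WARNINGS` list are hypothetical stand-ins, and the two lenient variants are only assumed to resemble the behaviours seen in the linked fiddle.

```java
import java.util.ArrayList;
import java.util.List;

final class NegativeLengthSketch {
    // Hypothetical stand-in for a warnings sink; not the real ES|QL mechanism.
    static final List<String> WARNINGS = new ArrayList<>();

    // Strict variant: a negative length is an error.
    static String leftStrict(String s, int length) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be non-negative, got " + length);
        }
        int total = s.codePointCount(0, s.length());
        int end = s.offsetByCodePoints(0, Math.min(length, total));
        return s.substring(0, end);
    }

    // Warn-and-null variant: the exception becomes a warning and the value is null,
    // so the rest of the query can keep running.
    static String leftWithWarning(String s, int length) {
        try {
            return leftStrict(s, length);
        } catch (IllegalArgumentException e) {
            WARNINGS.add(e.getMessage());
            return null;
        }
    }

    // Lenient variant 1: treat a negative length as zero and return "".
    static String leftClamped(String s, int length) {
        return leftStrict(s, Math.max(0, length));
    }

    // Lenient variant 2: a negative length drops that many code points from the end.
    static String leftDropTail(String s, int length) {
        int total = s.codePointCount(0, s.length());
        int keep = length >= 0 ? length : Math.max(0, total + length);
        return leftStrict(s, keep);
    }
}
```

Whichever variant is chosen, the point from the review still holds: the decision should be made deliberately rather than by letting the exception abort a running query.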

Contributor Author

Ok, I get it

Member

It'd be lovely if you could do this without allocating on every iteration. I think it's possible with something like:

static BytesRef process(@Fixed BytesRef out, @Fixed UTF8CodePoint cp, BytesRef str, int length) {
  ...
  // Point `out` at the same backing bytes instead of copying them.
  out.bytes = str.bytes;
  out.offset = str.offset;
  out.length = str.length;
  for (int i = 0; i < length; i++) {
    // Decode one code point in place, then step past its bytes.
    cp = UnicodeUtil.codePointAt(out.bytes, out.offset, cp);
    out.offset += cp.numBytes;
    out.length -= cp.numBytes;
  }
  out.length = Math.max(0, out.length);
  return out;
}

At least, I think something like that'd work. Actually, I think I wrote right instead of left, but something like that.

With yours we have to convert the whole thing into a UTF-16 string and back again. Also, I'm a bit worried about places where a code point isn't a single UTF-16 character. I think this'll cut them up differently.
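The concern about UTF-16 can be illustrated with a small, self-contained sketch. It assumes well-formed UTF-8 input and uses a hand-rolled lead-byte check rather than Lucene's `UnicodeUtil`, so `Utf8LeftSketch` and its methods are hypothetical names, not code from this PR. It finds the byte length of the first N code points by walking the bytes in place, and contrasts that with a `String.substring` that counts UTF-16 chars.

```java
import java.nio.charset.StandardCharsets;

final class Utf8LeftSketch {
    // Length in bytes of the UTF-8 sequence that starts with this lead byte.
    // Assumes the input is valid UTF-8 and `lead` is not a continuation byte.
    static int sequenceLength(byte lead) {
        int b = lead & 0xFF;
        if (b < 0x80) return 1; // 0xxxxxxx: ASCII
        if (b < 0xE0) return 2; // 110xxxxx
        if (b < 0xF0) return 3; // 1110xxxx
        return 4;               // 11110xxx
    }

    // Byte length of the first `codePoints` code points in utf8[offset, offset + length).
    // No String or array is allocated; only an index moves forward.
    static int leftByteLength(byte[] utf8, int offset, int length, int codePoints) {
        int pos = offset;
        int end = offset + length;
        for (int i = 0; i < codePoints && pos < end; i++) {
            pos += sequenceLength(utf8[pos]);
        }
        return Math.min(pos, end) - offset;
    }

    // Naive version that counts UTF-16 chars, so it can split a surrogate pair.
    static String leftByChars(String s, int n) {
        return s.substring(0, Math.min(n, s.length()));
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b"; // 'a', U+1F600 (one code point, two chars), 'b'
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        int len = leftByteLength(utf8, 0, utf8.length, 2);
        String byCodePoints = new String(utf8, 0, len, StandardCharsets.UTF_8);
        System.out.println(byCodePoints.codePointCount(0, byCodePoints.length()));
        System.out.println(Character.isHighSurrogate(leftByChars(s, 2).charAt(1)));
    }
}
```

In an evaluator, the returned byte length would be assigned to the reused output reference together with the original bytes and offset, which is what avoids the round trip through a `String`.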

Member

Hmmm. It looks like Substring has the same problem....

Member

Another option, I'm not sure if it's worth it, is to just return the SubstringEvaluator here instead.

Contributor Author

Just returning SubstringEvaluator(str, 0, length) sounds good.
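The delegation idea can be sketched as follows. This is not the generated `SubstringEvaluator`: `substring` here is a hypothetical helper that takes a zero-based code point index, following the `(str, 0, length)` arguments in the comment above, and the real function may index differently.

```java
final class LeftViaSubstringSketch {
    // Hypothetical code-point-based substring; `start` is a zero-based code point index.
    static String substring(String s, int start, int length) {
        int total = s.codePointCount(0, s.length());
        int from = Math.min(Math.max(start, 0), total);
        // Clamp the count so that `from + count` can neither overflow nor pass the end.
        int count = Math.min(Math.max(length, 0), total - from);
        int fromIdx = s.offsetByCodePoints(0, from);
        int toIdx = s.offsetByCodePoints(fromIdx, count);
        return s.substring(fromIdx, toIdx);
    }

    // Left is just a substring that starts at the beginning.
    static String left(String s, int length) {
        return substring(s, 0, length);
    }
}
```

Delegating keeps the two functions consistent, so any fix to how the substring logic handles code points would apply to both.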

Contributor Author

> Hmmm. It looks like Substring has the same problem....

Should I just modify the string in place to begin with?

Contributor Author

What did I do wrong?
I checked out a local branch named 'left-function' and committed some code.
Then I merged the code from the elastic repo into my local branch and pushed it, with a rebase, to my forked repo.
Why does the pull request show such a large diff?

@elasticsearchmachine elasticsearchmachine added the Team:QL (Deprecated) Meta label for query languages team label Aug 21, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine elasticsearchmachine removed the needs:triage Requires assignment of a team area label label Aug 21, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

@nik9000
Member

nik9000 commented Aug 26, 2023

Something seems to have broken in the merge!

@nik9000
Member

nik9000 commented Aug 26, 2023 via email

Labels

:Analytics/ES|QL AKA ESQL >enhancement external-contributor Pull request authored by a developer outside the Elasticsearch team Team:QL (Deprecated) Meta label for query languages team v8.11.0

3 participants