Use code points instead of grapheme clusters for string functions #3054
Codecov Report
@@ Coverage Diff @@
## master #3054 +/- ##
==========================================
+ Coverage 85.84% 85.88% +0.03%
==========================================
Files 289 289
Lines 51862 51844 -18
==========================================
+ Hits 44520 44525 +5
+ Misses 7342 7319 -23
cc @ovr
@alamb @andygrove this is reviewable now.
LGTM. I assume there should be a documentation update as well as part of this PR? Perhaps the rustdoc for
alamb
left a comment
Looks like a nice simplification and improvement to me ❤️ Thank you @Dandandan
cc @ovr who I think contributed the original implementation
LGTM, but I am not an original author 😄
I think this was @seddonm1
andygrove
left a comment
LGTM. Thanks @Dandandan
Benchmark runs are scheduled for baseline = a4fa44f and contender = 88f6548. 88f6548 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
Closes #3049
Rationale for this change
Matches the behavior of PostgreSQL and Spark, and improves performance. It also simplifies the code overall.
What changes are included in this PR?
Moves most functions, except `rpad`, `lpad` and `translate`, to use code points rather than grapheme clusters. Also simplifies the code and avoids some allocations.
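To illustrate the difference, here is a minimal sketch using only the Rust standard library. The helper names `char_length` and `left` are hypothetical and are not taken from this PR's diff; they only show what counting and slicing by code point looks like. Counting by grapheme cluster would instead need a segmentation library and would treat `e` followed by a combining accent as one unit.

```rust
/// Length of `s` in Unicode code points (scalar values).
/// Hypothetical helper for illustration, not the PR's actual code.
fn char_length(s: &str) -> usize {
    s.chars().count()
}

/// First `n` code points of `s`, returned as a borrowed slice so that
/// no new String is allocated. Hypothetical helper for illustration.
fn left(s: &str, n: usize) -> &str {
    // `char_indices` yields the byte offset at which each code point starts;
    // the offset of the n-th code point is where the prefix ends.
    match s.char_indices().nth(n) {
        Some((byte_idx, _)) => &s[..byte_idx],
        // Fewer than n + 1 code points: the whole string is the prefix.
        None => s,
    }
}
```

With these helpers, `char_length("e\u{301}")` is 2 (the letter plus the combining acute accent), even though a reader sees a single character, and `left("e\u{301}x", 1)` returns just `"e"`. Slicing at code point boundaries is always valid UTF-8, which is why the borrowed-slice approach works without extra allocation.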
Are there any user-facing changes?