-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-8214][SQL]Add function hex #6976
Conversation
|
||
/** | ||
* Convert every character in s to two hex digits. | ||
* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this line.
use Converts instead of Convert
Test build #35653 has finished for PR 6976 at commit
|
Test build #35664 has finished for PR 6976 at commit
|
* Converts every character in s to two hex digits. | ||
*/ | ||
private def hex(str: UTF8String): UTF8String = { | ||
if (str == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the null checking, as it's did in eval already.
LGTM, except some minor issues. Can you also remove the WIP? |
Test build #35818 has finished for PR 6976 at commit
|
@rxin could you please take a look at this ? |
* Converts every character in s to two hex digits. | ||
*/ | ||
private def hex(str: UTF8String): UTF8String = { | ||
doHex(str.toString.getBytes, str.length()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If use 'UTF8' as the encoding here, it should be str.getBytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would remove toString() here.
Test build #35950 has finished for PR 6976 at commit
|
Test build #35951 has finished for PR 6976 at commit
|
Test build #35957 has finished for PR 6976 at commit
|
* Converts every character in s to two hex digits. | ||
*/ | ||
private def hex(str: UTF8String): UTF8String = { | ||
doHex(str.getBytes, str.length()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
UTF8String.length
is the number of code points, not the number of bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would reuse the hex(bytes) method as a work around and add a unit test which contain chinese character to cover this.
Test build #35972 has finished for PR 6976 at commit
|
Test build #35978 has finished for PR 6976 at commit
|
LGTM, merging this into master, thanks! |
cc @chenghao-intel @adrian-wang