-
Notifications
You must be signed in to change notification settings - Fork 2.1k
feat: support binary arguments for StringConcat operator #21883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
5d46932
c539d00
905571c
5629996
6dc7a22
46bfc74
77768b9
5469c45
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -321,3 +321,40 @@ query T | |
| SELECT split_part(CAST(binary AS VARCHAR), 'o', 2) FROM t WHERE binary = X'466f6f'; | ||
| ---- | ||
| (empty) | ||
|
|
||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could we add a test with fixedsizebinary too?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, added support for this type as well |
||
| # Pipe concatenation of binaries always provides a binary | ||
| query ? | ||
| SELECT x'636166c3a9' || x'68656c6c6f'; | ||
| ---- | ||
| 636166c3a968656c6c6f | ||
|
|
||
| # Pipe concatenation of binary and other kind of binary also provides a binary | ||
| query ? | ||
| SELECT x'636166c3a9' || arrow_cast(x'68656c6c6f', 'LargeBinary'); | ||
| ---- | ||
| 636166c3a968656c6c6f | ||
|
|
||
| query ? | ||
| SELECT x'636166c3a9' || arrow_cast(arrow_cast(x'68656c6c6f', 'LargeBinary'), 'BinaryView'); | ||
| ---- | ||
| 636166c3a968656c6c6f | ||
|
|
||
| query ?T | ||
| SELECT x'636166c3a9' || arrow_cast(x'68656c6c6f', 'FixedSizeBinary(5)'), arrow_typeof(x'636166c3a9' || arrow_cast(x'68656c6c6f', 'FixedSizeBinary(5)')); | ||
| ---- | ||
| 636166c3a968656c6c6f Binary | ||
|
|
||
| query ?T | ||
| SELECT arrow_cast(x'6361', 'FixedSizeBinary(2)') || arrow_cast(x'68656c6c6f', 'FixedSizeBinary(5)'), arrow_typeof(arrow_cast(x'6361', 'FixedSizeBinary(2)') || arrow_cast(x'68656c6c6f', 'FixedSizeBinary(5)')); | ||
| ---- | ||
| 636168656c6c6f Binary | ||
|
|
||
| # Byte pipe operator is forbidden for mixed binary and text | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What about mixed binary? 1. query failed: DataFusion error: Error during planning: Cannot infer common string type for string concat operation Binary || LargeBinary
[SQL] SELECT x'636166c3a9' || arrow_cast(x'68656c6c6f', 'LargeBinary');
at /Users/jeffrey/Code/datafusion/datafusion/sqllogictest/test_files/binary.slt:331Is this intentional?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, I slipped it, thanks!. Added unit and SLT tests |
||
| query error DataFusion error: Error during planning: Cannot infer common string type for string concat operation Binary || Utf8 | ||
| SELECT x'c3a9' || 'hello'; | ||
|
|
||
| query error DataFusion error: Error during planning: Cannot infer common string type for string concat operation Utf8 || LargeBinary | ||
| SELECT 'hello' || arrow_cast(arrow_cast('hello', 'Binary'), 'LargeBinary'); | ||
|
|
||
| query error DataFusion error: Error during planning: Cannot infer common string type for string concat operation Utf8 || BinaryView | ||
| SELECT 'hello' || arrow_cast(arrow_cast('hello', 'Binary'), 'BinaryView'); | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before the change,
Binary || Utf8returned aUtf8string, andBinary || Binaryalso returnedUtf8. Now the first errors out, and the second returnsBinary, wouldn't this be a breaking change for users who are hitting this in a query?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is worth noting, as we decided to ban mixing according to this analysis.
I'll add
api changelabel and put an upgrade notice shortly.I intend to put a separate PR to harmonise
concatUDF behaviour (it allows mixing) with the pipe operator's behaviour to avoid confusion