[SPARK-28474][SQL] Hive 0.12 JDBC client can not handle binary type #25379
wangyum wants to merge 3 commits into apache:master from wangyum:SPARK-28474
Test build #108776 has finished for PR 25379 at commit
      return stringValue(((HiveDecimal)value));
    case BINARY_TYPE:
-     return stringValue((String)value);
+     return stringValue(new String((byte[])value));
We stored it as Array[Byte]:
sql/hive-thriftserver/v1.2.1/src/main/java/org/apache/hive/service/cli/ColumnValue.java
sql/hive-thriftserver/v2.3.5/src/main/java/org/apache/hive/service/cli/ColumnValue.java
sql/hive-thriftserver/v1.2.1/src/main/java/org/apache/hive/service/cli/ColumnValue.java
Test build #108796 has finished for PR 25379 at commit
...st/scala/org/apache/spark/sql/hive/thriftserver/SparkThriftServerProtocolVersionsSuite.scala
HyukjinKwon left a comment
looks good if the test works.
Test build #108804 has finished for PR 25379 at commit
Merged to master.
Hi, @wangyum and @HyukjinKwon.
    case BINARY_TYPE:
-     return stringValue((String)value);
+     String strVal = value == null ? null : UTF8String.fromBytes((byte[])value).toString();
+     return stringValue(strVal);
@wangyum, regarding #25480: this might be the cause. UTF8String.toString() is implemented as:
new String(getBytes(), StandardCharsets.UTF_8);
It decodes the bytes as UTF-8. IIRC, Java mangles the data if the bytes do not conform to the specified encoding (malformed sequences are replaced with U+FFFD).
With some protocol versions, the client might be unable to convert those mangled strings back to binary. That could explain the failure.
BTW, we should allow arbitrary binary data, for instance for use cases like images.
If you are unable to find the root cause, we can partially revert the change for the failing protocols for now.
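To illustrate the concern in the comment above, here is a minimal, standalone sketch (plain JDK only, not code from this PR) of why decoding arbitrary bytes as UTF-8 is lossy. The class and method names (BinaryRoundTrip, roundTripUtf8, roundTripLatin1) are hypothetical, and the ISO-8859-1 variant is included only as a contrast, not as a proposed fix.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of Spark or Hive.
class BinaryRoundTrip {
    // Decode bytes as UTF-8, then encode the string back to bytes,
    // as a string-based transport of binary values effectively does.
    static byte[] roundTripUtf8(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Contrast case: ISO-8859-1 maps each byte value to exactly one char,
    // so every byte array survives the round trip unchanged.
    static byte[] roundTripLatin1(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.ISO_8859_1);
        return decoded.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] text = "ABC".getBytes(StandardCharsets.UTF_8);
        // First bytes of a typical JPEG file; not valid UTF-8.
        byte[] image = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};

        System.out.println(Arrays.equals(text, roundTripUtf8(text)));     // true
        System.out.println(Arrays.equals(image, roundTripUtf8(image)));   // false
        System.out.println(Arrays.equals(image, roundTripLatin1(image))); // true
    }
}
```

Valid UTF-8 text survives, but the image bytes come back altered because the malformed sequences were replaced during decoding, so a client cannot recover the original binary from the string.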
What changes were proposed in this pull request?
This PR fixes the Hive 0.12 JDBC client's inability to handle the binary type:
Server log:
How was this patch tested?
unit tests