[SPARK-22435][SQL] Support processing array and map type using script #19652
jinxing64 wants to merge 1 commit into apache:master
t.getText doesn't work here; we need to process the token, e.g. remove the quotes.
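A minimal sketch of what "processing the token" could look like, assuming the raw token text is a string literal wrapped in a single pair of matching quotes. `unquote_token` is a hypothetical helper for illustration only, not Spark's parser code, and it does not handle escape sequences.

```python
def unquote_token(text: str) -> str:
    """Strip one pair of matching surrounding quotes from a raw token text.

    Hypothetical helper: assumes the token is either unquoted or wrapped in
    exactly one pair of single or double quotes. Escapes are left untouched.
    """
    if len(text) >= 2 and text[0] == text[-1] and text[0] in ("'", '"'):
        return text[1:-1]
    # Unquoted or mismatched input is returned unchanged.
    return text
```

Returning mismatched input unchanged keeps the helper safe to call on tokens that were never quoted.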
The writer writes records into the script's input stream. Shouldn't it be initialized with the input format?
DelimitedJSONSerDe doesn't support deserialization.
Use DelimitedJSONSerDe for the input SerDe and LazySimpleSerDe for the output SerDe.
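To illustrate the script side of this arrangement, here is a hedged sketch of a Python transform script. It assumes each input row arrives as one tab-separated line whose array and map columns are JSON text, and that the script answers with plain tab-delimited fields. The two-column layout and the names `transform_line` and `main` are assumptions for the example, not something this PR defines.

```python
import json
import sys


def transform_line(line: str) -> str:
    """Turn one input row into one tab-delimited output row.

    Assumed layout (hypothetical): column 0 is an array and column 1 is a
    map, both serialized as JSON text and separated by a tab.
    """
    array_json, map_json = line.rstrip("\n").split("\t")
    items = json.loads(array_json)    # JSON array  -> Python list
    mapping = json.loads(map_json)    # JSON object -> Python dict
    # Emit simple delimited fields: element count and key count.
    return f"{len(items)}\t{len(mapping)}"


def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read rows from stdin and write transformed rows to stdout."""
    for line in stdin:
        if line.strip():
            stdout.write(transform_line(line) + "\n")
```

When used as a real script, `main()` would be called at the bottom of the file. The streams are parameters so the function can be exercised with in-memory buffers.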
TODO: build JSON strings for more types
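As a rough illustration of what type-by-type JSON string building involves, the sketch below serializes a small set of value types recursively and rejects everything else. It is a Python analogy under assumed type coverage, not the code in this PR. `to_json_text` is a hypothetical name.

```python
import json


def to_json_text(value) -> str:
    """Build a compact JSON string for a limited set of types.

    Hypothetical sketch: handles None, bool, int, float, str, list/tuple
    and dict. Any other type raises TypeError, marking where support for
    more types would have to be added.
    """
    if value is None or isinstance(value, (bool, int, float, str)):
        # Scalars: json.dumps takes care of quoting and escaping.
        return json.dumps(value)
    if isinstance(value, (list, tuple)):
        return "[" + ",".join(to_json_text(v) for v in value) + "]"
    if isinstance(value, dict):
        # JSON object keys must be strings, so keys are stringified first.
        pairs = (json.dumps(str(k)) + ":" + to_json_text(v) for k, v in value.items())
        return "{" + ",".join(pairs) + "}"
    raise TypeError(f"unsupported type: {type(value).__name__}")
```

Raising on unknown types rather than falling back to a default string makes the unsupported cases visible.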
Test build #83399 has finished for PR 19652 at commit
4a99426 to 0d706ff
Test build #83446 has finished for PR 19652 at commit
Thanks for working on it. Will review it next week.
@gatorsmile
Test build #100089 has finished for PR 19652 at commit
What changes were proposed in this pull request?
Currently, using a script (e.g. Python) to process array or map types is not supported; it fails with the messages below:
org.apache.spark.sql.catalyst.expressions.UnsafeArrayData cannot be cast to [Ljava.lang.Object
org.apache.spark.sql.catalyst.expressions.UnsafeMapData cannot be cast to java.util.Map
This PR proposes to support it by using DelimitedJSONSerDe.
This PR also fixes a bug: when using an input row format with a script, no data is produced from ScriptTransformationExec.
How was this patch tested?
Tests added.