[SPARK-47922][SQL] Implement the try_parse_json expression #46141

Closed
wants to merge 89 commits
Changes from 27 commits
Commits (89)
81f4866
Implemented is_variant_null expression
harshmotw-db Apr 11, 2024
3673ef7
Merge branch 'apache:master' into master
harshmotw-db Apr 11, 2024
7acb773
minor change
harshmotw-db Apr 11, 2024
efe4ba1
Intermediate step in mapping from_json to parse_json
harshmotw-db Apr 11, 2024
e1e561d
Regenerated golden files
harshmotw-db Apr 12, 2024
c2329f2
Fixed some comments left by Chenhao (one still remaining)
harshmotw-db Apr 12, 2024
cb973a9
Merge branch 'apache:master' into master
harshmotw-db Apr 12, 2024
f82c11a
Fixed major bug in the StaticInvoke call in IsVariantNull
harshmotw-db Apr 12, 2024
85afe57
Added is_variant_null tests in VariantEndToEndSuite where it is worki…
harshmotw-db Apr 12, 2024
506dd20
Adding one more test to re trigger CI/CD tests. Currently, there are …
harshmotw-db Apr 12, 2024
c0607f0
regenerated golden files again
harshmotw-db Apr 12, 2024
af0ddc2
Added support for variant in from_json
harshmotw-db Apr 15, 2024
c2dd13a
Added unit tests to check if from_json is working as intended when sc…
harshmotw-db Apr 15, 2024
763a478
Update sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expr…
harshmotw-db Apr 15, 2024
82b3a27
Merge branch 'from_json_variant' of https://github.com/harshmotw-db/s…
harshmotw-db Apr 15, 2024
7767e55
Merge branch 'master' into master
harshmotw-db Apr 15, 2024
b32b78d
Merge branch 'master' into master
harshmotw-db Apr 16, 2024
6d8a588
temporary commit
harshmotw-db Apr 16, 2024
db4766d
change to help with merging
harshmotw-db Apr 16, 2024
a495b8d
change to help with merging
harshmotw-db Apr 16, 2024
2de1bec
Merge branch 'master' into from_json_variant
harshmotw-db Apr 16, 2024
1340ace
Fixed error message in error-conditions.json to incorporate variant type
harshmotw-db Apr 16, 2024
8e09ce0
Added intermediate support for variant in pyspark
harshmotw-db Apr 17, 2024
4a6b522
Added remaining scalar types to the Python Variant library
harshmotw-db Apr 18, 2024
678f02a
Removed unnecessary changes
harshmotw-db Apr 18, 2024
9a8ac8d
style improvements
harshmotw-db Apr 18, 2024
3c40bf2
Removed unnecessary changes
harshmotw-db Apr 18, 2024
74c8476
Removed unnecessary changes
harshmotw-db Apr 18, 2024
70c1336
Removed unnecessary changes
harshmotw-db Apr 18, 2024
2bfa77c
Removed unnecessary changes
harshmotw-db Apr 18, 2024
4f5b86e
Removed unnecessary changes
harshmotw-db Apr 18, 2024
5a3aed8
Removed unnecessary zone id in _to_python
harshmotw-db Apr 18, 2024
6db38ce
Removed unnecessary changes
harshmotw-db Apr 18, 2024
533c01a
Removed unnecessary changes
harshmotw-db Apr 18, 2024
f0742cf
Merge branch 'master' of https://github.com/harshmotw-db/spark into p…
harshmotw-db Apr 18, 2024
f53ee04
Resolved merge conflicts
harshmotw-db Apr 18, 2024
9ecbdc7
Merge branch 'master' of https://github.com/harshmotw-db/spark into p…
harshmotw-db Apr 18, 2024
69c53e8
Removed unnecessary changes
harshmotw-db Apr 18, 2024
402f380
Temporary commit
harshmotw-db Apr 18, 2024
428b354
Added changes recommended by Gene
harshmotw-db Apr 18, 2024
a5073da
Remove unnecessary changes
harshmotw-db Apr 18, 2024
5967553
Fixed Hyukjin's comment
harshmotw-db Apr 19, 2024
0afcfae
Fixed Python linting
harshmotw-db Apr 19, 2024
6920915
Merge branch 'python_scalar_variant' of https://github.com/harshmotw-…
harshmotw-db Apr 19, 2024
351f2d2
Made linting change
harshmotw-db Apr 19, 2024
b7d24ce
Replaced pytz with zoneinfo
harshmotw-db Apr 19, 2024
16e7b63
removed unnecessary change
harshmotw-db Apr 19, 2024
85ab57f
Fixed linting and a comment
harshmotw-db Apr 19, 2024
e3f494f
Fixed lint comment
harshmotw-db Apr 19, 2024
524a4f4
Fixed documentation message
harshmotw-db Apr 19, 2024
d70c2db
Fixed toJson function
harshmotw-db Apr 19, 2024
8cce917
Fixed minor error in _to_json
harshmotw-db Apr 19, 2024
473a0f2
Added quotes around date and timestamp in toJson
harshmotw-db Apr 19, 2024
6f06be3
minor changes
harshmotw-db Apr 19, 2024
bd4e0a9
minor changes
harshmotw-db Apr 19, 2024
764e633
Merge branch 'apache:master' into master
harshmotw-db Apr 19, 2024
a3ceee9
Implemented try_parse_json
harshmotw-db Apr 19, 2024
42910b9
regenerated golden files
harshmotw-db Apr 19, 2024
06c9a65
More python linting
harshmotw-db Apr 19, 2024
843e042
Removed unnecessary change
harshmotw-db Apr 19, 2024
2e3863e
Added change for document generation
harshmotw-db Apr 19, 2024
8ee42ae
Added aliases in scala and python for try_parse_json
harshmotw-db Apr 20, 2024
36edcbc
minor change
harshmotw-db Apr 20, 2024
db1eb06
more golden files
harshmotw-db Apr 20, 2024
0c80ede
scalafmt
harshmotw-db Apr 20, 2024
488a249
fix
harshmotw-db Apr 20, 2024
731f6b5
Merge branch 'apache:master' into master
harshmotw-db Apr 20, 2024
ee98221
Merge branch 'master' of https://github.com/harshmotw-db/spark
harshmotw-db Apr 20, 2024
308d760
Merge branch 'master' into try_parse_json
harshmotw-db Apr 20, 2024
0b00d2a
Merge branch 'master' into python_scalar_variant
harshmotw-db Apr 21, 2024
5780ab3
regenerated golden files
harshmotw-db Apr 21, 2024
692c01d
Changed implementation of try_parse_json from TryEval to something mo…
harshmotw-db Apr 22, 2024
3b25338
regenerated a golden file
harshmotw-db Apr 22, 2024
282ff02
Minor changes
harshmotw-db Apr 22, 2024
925bd00
Minor change
harshmotw-db Apr 22, 2024
d23cf61
regenerated golden files
harshmotw-db Apr 22, 2024
c029e38
Merge branch 'try_parse_json' of https://github.com/harshmotw-db/spar…
harshmotw-db Apr 22, 2024
74da802
Merge branch 'master' into try_parse_json
harshmotw-db Apr 22, 2024
5b6af19
Merge branch 'apache:master' into try_parse_json
harshmotw-db Apr 22, 2024
e1bb3e7
Merged and regenerated golden files
harshmotw-db Apr 22, 2024
7b25cb7
Merge branch 'apache:master' into master
harshmotw-db Apr 22, 2024
f810036
Removed changes that added try_parse_json
harshmotw-db Apr 22, 2024
2329391
Merge branch 'master' of https://github.com/harshmotw-db/spark into p…
harshmotw-db Apr 22, 2024
e7da9d2
Removed changes relating to try_parse_json
harshmotw-db Apr 23, 2024
2405005
One more minor change
harshmotw-db Apr 23, 2024
841dd4c
Resolved merge conflicts
harshmotw-db Apr 23, 2024
5534627
Merge branch 'apache:master' into master
harshmotw-db Apr 25, 2024
b78ea27
Resolved conflicts and added more test cases
harshmotw-db Apr 25, 2024
6aa7475
Made minor change to fix linting issue
harshmotw-db Apr 25, 2024
Original file line number Diff line number Diff line change
@@ -6964,6 +6964,18 @@ object functions {
fnWithOptions("from_json", options, e, schema)
}

/**
* Parses a JSON string and constructs a Variant value. Returns null if the input string is not
* a valid JSON value.
*
* @param json
* a string column that contains JSON data.
*
* @group variant_funcs
* @since 4.0.0
*/
def try_parse_json(json: Column): Column = Column.fn("try_parse_json", json)
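The contract documented above — `parse_json` throws on malformed input while `try_parse_json` yields null — can be sketched in plain Python with the stdlib `json` module. This is a behavioral sketch only, not Spark's Variant implementation:

```python
import json
from typing import Any, Optional

def parse_json(s: str) -> Any:
    # Strict flavor: propagate the parse error, like Spark's parse_json.
    return json.loads(s)

def try_parse_json(s: str) -> Optional[Any]:
    # Lenient flavor: swallow parse errors and return None, like try_parse_json.
    try:
        return json.loads(s)
    except ValueError:  # json.JSONDecodeError subclasses ValueError
        return None
```

Valid input parses identically through both; only the failure path differs.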

/**
* Parses a JSON string and constructs a Variant value.
*
@@ -2481,6 +2481,10 @@ class PlanGenerationTestSuite
Collections.singletonMap("allowNumericLeadingZeros", "true"))
}

functionTest("try_parse_json") {
fn.try_parse_json(fn.col("g"))
}

functionTest("to_json") {
fn.to_json(fn.col("d"), Map(("timestampFormat", "dd/MM/yyyy")))
}
@@ -0,0 +1,2 @@
Project [tryeval(staticinvoke(class org.apache.spark.sql.catalyst.expressions.variant.VariantExpressionEvalUtils$, VariantType, parseJson, g#0, StringType, true, false, true)) AS try_parse_json(g)#0]
+- LocalRelation <empty>, [id#0L, a#0, b#0, d#0, e#0, f#0, g#0]
@@ -0,0 +1,25 @@
{
"common": {
"planId": "1"
},
"project": {
"input": {
"common": {
"planId": "0"
},
"localRelation": {
"schema": "struct\u003cid:bigint,a:int,b:double,d:struct\u003cid:bigint,a:int,b:double\u003e,e:array\u003cint\u003e,f:map\u003cstring,struct\u003cid:bigint,a:int,b:double\u003e\u003e,g:string\u003e"
}
},
"expressions": [{
"unresolvedFunction": {
"functionName": "try_parse_json",
"arguments": [{
"unresolvedAttribute": {
"unparsedIdentifier": "g"
}
}]
}
}]
}
}
Binary file not shown.
8 changes: 8 additions & 0 deletions python/docs/source/reference/pyspark.sql/functions.rst
@@ -537,6 +537,14 @@ JSON Functions
to_json


VARIANT Functions
-----------------
.. autosummary::
:toctree: api/

try_parse_json


XML Functions
--------------
.. autosummary::
7 changes: 7 additions & 0 deletions python/pyspark/sql/connect/functions/builtin.py
@@ -2041,6 +2041,13 @@ def str_to_map(
str_to_map.__doc__ = pysparkfuncs.str_to_map.__doc__


def try_parse_json(col: "ColumnOrName") -> Column:
return _invoke_function("try_parse_json", _to_col(col))


try_parse_json.__doc__ = pysparkfuncs.try_parse_json.__doc__


def parse_json(col: "ColumnOrName") -> Column:
return _invoke_function("parse_json", _to_col(col))

33 changes: 32 additions & 1 deletion python/pyspark/sql/functions/builtin.py
@@ -15423,12 +15423,43 @@ def from_json(
return _invoke_function("from_json", _to_java_column(col), schema, _options_to_str(options))


@_try_remote_functions
def try_parse_json(
col: "ColumnOrName",
) -> Column:
"""
Parses a column containing a JSON string into a :class:`VariantType`. Returns None if the
string is not a valid JSON value.

.. versionadded:: 4.0.0

Parameters
----------
col : :class:`~pyspark.sql.Column` or str
a column or column name containing JSON formatted strings

Returns
-------
:class:`~pyspark.sql.Column`
a new column of VariantType.

Examples
--------
>>> df = spark.createDataFrame([ {'json': '''{ "a" : 1 }'''}, {'json': '''{a : 1}'''} ])
>>> df.select(to_json(try_parse_json(df.json))).collect()
[Row(to_json(try_parse_json(json))='{"a":1}'), Row(to_json(try_parse_json(json))=None)]
"""

return _invoke_function("try_parse_json", _to_java_column(col))


@_try_remote_functions
def parse_json(
col: "ColumnOrName",
) -> Column:
"""
Parses a column containing a JSON string into a :class:`VariantType`.
Parses a column containing a JSON string into a :class:`VariantType`. Throws exception if a
string represents an invalid JSON value.

.. versionadded:: 4.0.0

8 changes: 8 additions & 0 deletions python/pyspark/sql/tests/test_functions.py
@@ -1325,6 +1325,14 @@ def test_schema_of_json(self):
message_parameters={"arg_name": "json", "arg_type": "int"},
)

def test_try_parse_json(self):
df = self.spark.createDataFrame([{"json": """{ "a" : 1 }"""}, {"json": """{ a : 1 }"""}])
actual = df.select(
F.to_json(F.try_parse_json(df.json)).alias("var"),
).collect()
self.assertEqual("""{"a":1}""", actual[0]["var"])
self.assertEqual(None, actual[1]["var"])

def test_schema_of_csv(self):
with self.assertRaises(PySparkTypeError) as pe:
F.schema_of_csv(1)
@@ -821,7 +821,8 @@ object FunctionRegistry {
expression[JsonObjectKeys]("json_object_keys"),

// Variant
expression[ParseJson]("parse_json"),
expressionBuilder("parse_json", ParseJsonExpressionBuilder),
expressionBuilder("try_parse_json", TryParseJsonExpressionBuilder),
expression[IsVariantNull]("is_variant_null"),
expressionBuilder("variant_get", VariantGetExpressionBuilder),
expressionBuilder("try_variant_get", TryVariantGetExpressionBuilder),
@@ -31,16 +31,24 @@ import org.apache.spark.unsafe.types.{UTF8String, VariantVal}
*/
object VariantExpressionEvalUtils {

def parseJson(input: UTF8String): VariantVal = {
def parseJson(input: UTF8String, failOnError: Boolean = true): VariantVal = {
def parseJsonFailure(exception: Throwable): VariantVal = {
if (failOnError) {
throw exception
} else {
null
}
}
try {
val v = VariantBuilder.parseJson(input.toString)
new VariantVal(v.getValue, v.getMetadata)
} catch {
case _: VariantSizeLimitException =>
throw QueryExecutionErrors.variantSizeLimitError(VariantUtil.SIZE_LIMIT, "parse_json")
parseJsonFailure(QueryExecutionErrors
.variantSizeLimitError(VariantUtil.SIZE_LIMIT, "parse_json"))
case NonFatal(e) =>
throw QueryExecutionErrors.malformedRecordsDetectedInRecordParsingError(
input.toString, BadRecordException(() => input, cause = e))
parseJsonFailure(QueryExecutionErrors.malformedRecordsDetectedInRecordParsingError(
input.toString, BadRecordException(() => input, cause = e)))
}
}
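The `failOnError` pattern above — one parse routine with a local failure handler that either rethrows or yields null — can be sketched in Python (names are illustrative, not Spark's API):

```python
import json
from typing import Any, Optional

def parse_json_eval(input_str: str, fail_on_error: bool = True) -> Optional[Any]:
    # Local failure handler mirroring parseJsonFailure: rethrow or return null.
    def parse_json_failure(exc: Exception) -> None:
        if fail_on_error:
            raise exc
        return None
    try:
        return json.loads(input_str)
    except ValueError as e:
        return parse_json_failure(e)
```

Centralizing the branch in one helper keeps both catch arms identical except for the error object they pass in, which is the point of the refactor in this hunk.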

@@ -42,34 +42,30 @@ import org.apache.spark.types.variant._
import org.apache.spark.types.variant.VariantUtil.Type
import org.apache.spark.unsafe.types._

// scalastyle:off line.size.limit
@ExpressionDescription(
usage = "_FUNC_(jsonStr) - Parse a JSON string as a Variant value. Throws an exception when the string is not a valid JSON value.",
examples = """
Examples:
> SELECT _FUNC_('{"a":1,"b":0.8}');
{"a":1,"b":0.8}
""",
since = "4.0.0",
group = "variant_funcs"
)
// scalastyle:on line.size.limit
case class ParseJson(child: Expression)

/**
* The implementation for the `parse_json` and `try_parse_json` expressions. Parses a JSON
* string as a Variant value.
* @param child The string value to parse as a variant.
* @param failOnError Controls whether the expression should throw an exception or return null if
* the string does not represent a valid JSON value.
*/
case class ParseJson(child: Expression, failOnError: Boolean = true)
extends UnaryExpression with ExpectsInputTypes with RuntimeReplaceable {

override lazy val replacement: Expression = StaticInvoke(
VariantExpressionEvalUtils.getClass,
VariantType,
"parseJson",
Seq(child),
inputTypes,
Seq(child, Literal(failOnError, BooleanType)),
inputTypes :+ BooleanType,
returnNullable = false)

override def inputTypes: Seq[AbstractDataType] = StringType :: Nil

override def dataType: DataType = VariantType

override def prettyName: String = "parse_json"
override def prettyName: String = if (failOnError) "parse_json" else "try_parse_json"

override protected def withNewChildInternal(newChild: Expression): ParseJson =
copy(child = newChild)
@@ -425,6 +421,47 @@ case object VariantGet {
}
}

abstract class ParseJsonExpressionBuilderBase(failOnError: Boolean) extends ExpressionBuilder {
override def build(funcName: String, expressions: Seq[Expression]): Expression = {
val numArgs = expressions.length
if (numArgs == 1) {
ParseJson(expressions.head, failOnError)
} else {
throw QueryCompilationErrors.wrongNumArgsError(funcName, Seq(1), numArgs)
}
}
}
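The builder-base pattern in this hunk — one parameterized builder that validates arity and binds `failOnError`, instantiated once per function name — can be sketched schematically in Python (an analogy, not the Catalyst `ExpressionBuilder` API):

```python
from typing import Any, Callable, List, Tuple

def make_parse_json_builder(fail_on_error: bool) -> Callable[[str, List[Any]], Tuple]:
    # Shared base: check the argument count, then bind failOnError into
    # the constructed expression (here modeled as a plain tuple).
    def build(func_name: str, args: List[Any]) -> Tuple:
        if len(args) != 1:
            raise TypeError(f"{func_name} expects exactly 1 argument, got {len(args)}")
        return ("ParseJson", args[0], fail_on_error)
    return build

# Two concrete builders, analogous to ParseJsonExpressionBuilder and
# TryParseJsonExpressionBuilder registered in FunctionRegistry.
parse_json_builder = make_parse_json_builder(True)
try_parse_json_builder = make_parse_json_builder(False)
```

Both SQL names resolve to the same `ParseJson` expression; only the bound flag differs.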

// scalastyle:off line.size.limit
@ExpressionDescription(
usage = "_FUNC_(jsonStr) - Parse a JSON string as a Variant value. Throws an exception when the string is not a valid JSON value.",
examples = """
Examples:
> SELECT _FUNC_('{"a":1,"b":0.8}');
{"a":1,"b":0.8}
""",
since = "4.0.0",
group = "variant_funcs"
)
// scalastyle:on line.size.limit
object ParseJsonExpressionBuilder extends ParseJsonExpressionBuilderBase(true)

// scalastyle:off line.size.limit
@ExpressionDescription(
usage = "_FUNC_(jsonStr) - Parse a JSON string as a Variant value. Returns NULL when the string is not a valid JSON value.",
examples = """
Examples:
> SELECT _FUNC_('{"a":1,"b":0.8}');
{"a":1,"b":0.8}
> SELECT _FUNC_('{"a":1,');
NULL
""",
since = "4.0.0",
group = "variant_funcs"
)
// scalastyle:on line.size.limit
object TryParseJsonExpressionBuilder extends ParseJsonExpressionBuilderBase(false)

abstract class VariantGetExpressionBuilderBase(failOnError: Boolean) extends ExpressionBuilder {
override def build(funcName: String, expressions: Seq[Expression]): Expression = {
val numArgs = expressions.length
11 changes: 11 additions & 0 deletions sql/core/src/main/scala/org/apache/spark/sql/functions.scala
@@ -6594,6 +6594,17 @@ object functions {
fnWithOptions("from_json", options, e, schema)
}

/**
* Parses a JSON string and constructs a Variant value. Returns null if the input string is not
* a valid JSON value.
*
* @param json a string column that contains JSON data.
*
* @group variant_funcs
* @since 4.0.0
*/
def try_parse_json(json: Column): Column = Column.fn("try_parse_json", json)

/**
* Parses a JSON string and constructs a Variant value.
*
@@ -437,9 +437,10 @@
| org.apache.spark.sql.catalyst.expressions.aggregate.VarianceSamp | var_samp | SELECT var_samp(col) FROM VALUES (1), (2), (3) AS tab(col) | struct<var_samp(col):double> |
| org.apache.spark.sql.catalyst.expressions.aggregate.VarianceSamp | variance | SELECT variance(col) FROM VALUES (1), (2), (3) AS tab(col) | struct<variance(col):double> |
| org.apache.spark.sql.catalyst.expressions.variant.IsVariantNull | is_variant_null | SELECT is_variant_null(parse_json('null')) | struct<is_variant_null(parse_json(null)):boolean> |
| org.apache.spark.sql.catalyst.expressions.variant.ParseJson | parse_json | SELECT parse_json('{"a":1,"b":0.8}') | struct<parse_json({"a":1,"b":0.8}):variant> |
| org.apache.spark.sql.catalyst.expressions.variant.ParseJsonExpressionBuilder | parse_json | SELECT parse_json('{"a":1,"b":0.8}') | struct<parse_json({"a":1,"b":0.8}):variant> |
| org.apache.spark.sql.catalyst.expressions.variant.SchemaOfVariant | schema_of_variant | SELECT schema_of_variant(parse_json('null')) | struct<schema_of_variant(parse_json(null)):string> |
| org.apache.spark.sql.catalyst.expressions.variant.SchemaOfVariantAgg | schema_of_variant_agg | SELECT schema_of_variant_agg(parse_json(j)) FROM VALUES ('1'), ('2'), ('3') AS tab(j) | struct<schema_of_variant_agg(parse_json(j)):string> |
| org.apache.spark.sql.catalyst.expressions.variant.TryParseJsonExpressionBuilder | try_parse_json | SELECT try_parse_json('{"a":1,"b":0.8}') | struct<try_parse_json({"a":1,"b":0.8}):variant> |
| org.apache.spark.sql.catalyst.expressions.variant.TryVariantGetExpressionBuilder | try_variant_get | SELECT try_variant_get(parse_json('{"a": 1}'), '$.a', 'int') | struct<try_variant_get(parse_json({"a": 1}), $.a):int> |
| org.apache.spark.sql.catalyst.expressions.variant.VariantGetExpressionBuilder | variant_get | SELECT variant_get(parse_json('{"a": 1}'), '$.a', 'int') | struct<variant_get(parse_json({"a": 1}), $.a):int> |
| org.apache.spark.sql.catalyst.expressions.xml.XPathBoolean | xpath_boolean | SELECT xpath_boolean('<a><b>1</b></a>','a/b') | struct<xpath_boolean(<a><b>1</b></a>, a/b):boolean> |
@@ -88,6 +88,46 @@ class VariantEndToEndSuite extends QueryTest with SharedSparkSession {
check("[0.0, 1.00, 1.10, 1.23]", "[0,1,1.1,1.23]")
}

test("try_parse_json/to_json round-trip") {
def check(input: String, output: String = "INPUT IS OUTPUT"): Unit = {
val df = Seq(input).toDF("v")
val variantDF = df.selectExpr("to_json(try_parse_json(v)) as v").select(Column("v"))
val expected = if (output != "INPUT IS OUTPUT") output else input
checkAnswer(variantDF, Seq(Row(expected)))
}

check("null")
check("true")
check("false")
check("-1")
check("1.0E10")
check("\"\"")
check("\"" + ("a" * 63) + "\"")
check("\"" + ("b" * 64) + "\"")
// scalastyle:off nonascii
check("\"" + ("你好,世界" * 20) + "\"")
// scalastyle:on nonascii
check("[]")
check("{}")
// scalastyle:off nonascii
check(
"[null, true, false,-1, 1e10, \"\\uD83D\\uDE05\", [ ], { } ]",
"[null,true,false,-1,1.0E10,\"😅\",[],{}]"
)
// scalastyle:on nonascii
check("[0.0, 1.00, 1.10, 1.23]", "[0,1,1.1,1.23]")
// Places where parse_json should fail and therefore, try_parse_json should return null
check("{1:2}", null)
check("{\"a\":1", null)
check("{\"a\":[a,b,c]}", null)
}
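The round-trip property this test exercises — `to_json(try_parse_json(s))` yields compact JSON for valid input and null otherwise — can be approximated with Python's `json` module. Note two deliberate gaps in the analogy: Python canonicalizes numbers differently than Variant (`1.00` round-trips to `1.0`, not `1`), and Spark distinguishes a variant `null` value from SQL NULL, which the inline try/except below preserves by not conflating parsed `None` with parse failure:

```python
import json
from typing import Optional

def round_trip(s: str) -> Optional[str]:
    # to_json(try_parse_json(s)): None on invalid input, compact JSON otherwise.
    try:
        v = json.loads(s)
    except ValueError:
        return None
    return json.dumps(v, separators=(",", ":"), ensure_ascii=False)
```

For example, a JSON `null` input still round-trips to the string `"null"`, while truncated or unquoted input maps to `None`.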

test("try_parse_json with invalid input type") {
// This test is required because the type checking logic in try_parse_json is custom.
val exception = intercept[Exception](spark.sql("select try_parse_json(1)"))
assert(exception != null)
}

test("to_json with nested variant") {
val df = Seq(1).toDF("v")
val variantDF1 = df.select(
@@ -57,6 +57,15 @@ class VariantSuite extends QueryTest with SharedSparkSession {
}
}

test("basic try_parse_json alias") {
val df = spark.createDataFrame(Seq(Row("""{ "a" : 1 }"""), Row("""{ a : 1 }""")).asJava,
new StructType().add("json", StringType))
val actual = df.select(to_json(try_parse_json(col("json")))).collect()

assert(actual(0)(0) == """{"a":1}""")
assert(actual(1)(0) == null)
}

test("basic parse_json alias") {
val df = spark.createDataFrame(Seq(Row("""{ "a" : 1 }""")).asJava,
new StructType().add("json", StringType))