[SPARK-48150][SQL] try_parse_json output should be declared as nullable
### What changes were proposed in this pull request?

The `try_parse_json` expression added in #46141 declares improper output nullability: the `try_` version's output must be marked as nullable. This PR corrects the nullability and adds a test.

### Why are the changes needed?

Incorrectly declaring an expression's output as non-nullable when it is actually nullable may lead to crashes.
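
To illustrate the failure mode, here is a minimal, self-contained sketch. It does not use Spark's classes: `Expr`, `lengthOf`, and `tryParse` are hypothetical names invented for this example. It assumes a consumer that skips its null check whenever an expression declares itself non-nullable, which is the general kind of shortcut a wrong nullability declaration can break.

```scala
// Toy model only: none of these names come from Spark.
object NullabilitySketch {
  // An expression declares whether it may return null, and can be evaluated.
  final case class Expr(declaredNullable: Boolean, eval: () => String)

  // A consumer that trusts the declaration: it skips the null check
  // when the expression claims to be non-nullable.
  def lengthOf(e: Expr): Option[Int] =
    if (e.declaredNullable) Option(e.eval()).map(_.length)
    else Some(e.eval().length) // NullPointerException if eval() returns null

  // Hypothetical lenient parser: returns null instead of throwing on bad input.
  def tryParse(s: String): String =
    if (s.startsWith("[") && s.endsWith("]")) s else null

  def main(args: Array[String]): Unit = {
    // Declared non-nullable, but the lenient parser returns null for "[".
    val wrong = Expr(declaredNullable = false, () => tryParse("["))
    // Declared nullable, so the consumer guards against null.
    val right = Expr(declaredNullable = true, () => tryParse("["))

    println(scala.util.Try(lengthOf(wrong)).isFailure) // the unguarded path fails
    println(lengthOf(right))                           // the guarded path yields None
  }
}
```

In this toy model the only difference between the two expressions is the declared flag, which mirrors the one-line change from `returnNullable = false` to `returnNullable = !failOnError` in this commit.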

### Does this PR introduce _any_ user-facing change?

Yes, it affects output nullability and thus may affect query result schemas.

### How was this patch tested?

New unit test cases.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #46409 from JoshRosen/fix-try-parse-json-nullability.

Authored-by: Josh Rosen <joshrosen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
JoshRosen authored and dongjoon-hyun committed May 7, 2024
1 parent 0862f69 commit 8cf602a
Showing 3 changed files with 11 additions and 2 deletions.
@@ -1,2 +1,2 @@
-Project [staticinvoke(class org.apache.spark.sql.catalyst.expressions.variant.VariantExpressionEvalUtils$, VariantType, parseJson, g#0, false, StringType, BooleanType, true, false, true) AS try_parse_json(g)#0]
+Project [staticinvoke(class org.apache.spark.sql.catalyst.expressions.variant.VariantExpressionEvalUtils$, VariantType, parseJson, g#0, false, StringType, BooleanType, true, true, true) AS try_parse_json(g)#0]
 +- LocalRelation <empty>, [id#0L, a#0, b#0, d#0, e#0, f#0, g#0]
@@ -59,7 +59,7 @@ case class ParseJson(child: Expression, failOnError: Boolean = true)
     "parseJson",
     Seq(child, Literal(failOnError, BooleanType)),
     inputTypes :+ BooleanType,
-    returnNullable = false)
+    returnNullable = !failOnError)

   override def inputTypes: Seq[AbstractDataType] = StringType :: Nil

@@ -810,6 +810,15 @@ class VariantExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
       "Hello")
   }

+  test("SPARK-48150: ParseJson expression nullability") {
+    assert(!ParseJson(Literal("["), failOnError = true).replacement.nullable)
+    assert(ParseJson(Literal("["), failOnError = false).replacement.nullable)
+    checkEvaluation(
+      ParseJson(Literal("["), failOnError = false).replacement,
+      null
+    )
+  }
+
   test("cast to variant") {
     def check[T : TypeTag](input: T, expectedJson: String): Unit = {
       val cast = Cast(Literal.create(input), VariantType, evalMode = EvalMode.ANSI)