Skip to content

[Core] Spark assert_true and raise_error function support #5991

@gaoyangxiaozhu

Description

@gaoyangxiaozhu

Description

spark assert_true and raise_error function not compatible with current velox type system and function register interface.
as below query a example,

val df = spark.sql("select assert_true(1 > 0) as v, id from t")
df: org.apache.spark.sql.DataFrame = [v: void, id: bigint]
df.schema
res3: org.apache.spark.sql.types.StructType = StructType(StructField(v,NullType,true),StructField(id,LongType,true))

it would generate void type and NullType in output schema which would convert to unknown type in velox, for raise_error, it expect void return type but velox function always expect a non-void return type.

One way i tried is convert raise_error from void return type to string type but it need we update not only expression itself but also the output type schema also from NullType to StringType, otherwise would encounter mem allocation issue based on my local test.

Update:
RaiseError's data type is NullType. Ditto for AssertTrue as it is replaced by IF + RaiseError.
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/misc.scala#L82

@zhouyuan / @PHILO-HE / @FelixYBW

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions