bug: null IN () returns false instead of null under legacy null-in-empty behavior #4786

Description

@mbutrovich

Describe the bug

For a null operand against an empty IN list, Comet always returns false. Spark returns null when spark.sql.legacy.nullInEmptyListBehavior=true. Comet checks neither the config nor whether the operand is null.

Steps to reproduce

Empty IN lists are not expressible in SQL, so build the expression via the DataFrame API. ConvertToLocalRelation and OptimizeIn must be disabled, otherwise the IN is folded away before it reaches Comet. Add this test to a suite extending CometTestBase (the config exists in Spark 4+):

test("null IN empty list under legacy behavior") {
  assume(org.apache.comet.CometSparkSessionExtensions.isSpark40Plus)
  withSQLConf(
    "spark.sql.optimizer.excludedRules" ->
      ("org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation," +
        "org.apache.spark.sql.catalyst.optimizer.OptimizeIn"),
    "spark.sql.legacy.nullInEmptyListBehavior" -> "true") {
    val data: Seq[(Integer, Integer)] =
      Seq((Integer.valueOf(1), Integer.valueOf(1)), (null, Integer.valueOf(2)))
    withParquetTable(data, "t") {
      val df = sql("SELECT _1 AS a FROM t").select(col("a"), col("a").isin())
      checkSparkAnswer(df)
    }
  }
}

== Results ==
!== Spark Answer - 2 ==            == Comet Answer - 2 ==
 struct<a:int,(a IN ()):boolean>   struct<a:int,(a IN ()):boolean>
 [1,false]                         [1,false]
![null,null]                       [null,false]

Expected behavior

null IN () returns null under legacy behavior, matching Spark. (Under ANSI-on, the Spark 4 default, false is expected and Comet already matches, so only the legacy path is affected.)
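
To make the expected semantics concrete, here is a minimal sketch in plain Scala that models `value IN (list)` under SQL three-valued logic. It is not Spark or Comet code: `InSemantics`, `evalIn`, and the `legacyNullInEmpty` parameter are hypothetical names used only for illustration, and `None` stands in for SQL NULL.

```scala
// Hypothetical model of `value IN (list)`; None represents SQL NULL.
// This only illustrates the expected results, not how Spark or Comet implement IN.
object InSemantics {
  def evalIn(
      value: Option[Int],
      list: Seq[Option[Int]],
      legacyNullInEmpty: Boolean): Option[Boolean] = {
    if (list.isEmpty) {
      // Empty list: the legacy flag only changes the result for a null operand.
      if (value.isEmpty && legacyNullInEmpty) None else Some(false)
    } else {
      value match {
        // Null operand against a non-empty list is always null.
        case None => None
        case Some(v) =>
          if (list.contains(Some(v))) Some(true) // a definite match
          else if (list.contains(None)) None     // no match, but a null element leaves it unknown
          else Some(false)                       // no match and no nulls
      }
    }
  }
}
```

In this model, `InSemantics.evalIn(None, Seq.empty, legacyNullInEmpty = true)` yields `None`, which corresponds to the `[null,null]` row in the Spark answer above, while the behavior reported for Comet corresponds to always taking the `Some(false)` branch for an empty list.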

Additional context

Found while enabling CometLocalTableScanExec by default (#4393), but reproduces over a Parquet scan. Upstream test: EmptyInSuite "IN with empty list".
