
feat: Implement ANSI support for UnaryMinus #465

Closed
andygrove opened this issue May 23, 2024 · 1 comment · Fixed by #471
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

Comments

@andygrove
Member

What is the problem the feature request solves?

Comet does not support ANSI mode for UnaryMinus.

Create test data

val df = Seq(Int.MaxValue, Int.MinValue).toDF("a")
df.write.parquet("/tmp/int.parquet")
spark.read.parquet("/tmp/int.parquet").createTempView("t")

Test with ANSI mode disabled

Behavior is correct with ANSI mode disabled:

scala> spark.conf.set("spark.sql.ansi.enabled", false)

scala> spark.conf.set("spark.comet.enabled", false)

scala> spark.sql("select a, -a from t").show
+-----------+-----------+
|          a|      (- a)|
+-----------+-----------+
| 2147483647|-2147483647|
|-2147483648|-2147483648|
+-----------+-----------+


scala> spark.conf.set("spark.comet.enabled", true)

scala> spark.sql("select a, -a from t").show
24/05/23 13:55:00 WARN CometSparkSessionExtensions$CometExecRule: Comet cannot execute some parts of this plan natively because CollectLimit is not supported
+-----------+-----------+
|          a|      (- a)|
+-----------+-----------+
| 2147483647|-2147483647|
|-2147483648|-2147483648|
+-----------+-----------+

Test with ANSI mode enabled

With ANSI mode enabled, Spark throws an exception, but Comet does not.

spark.conf.set("spark.sql.ansi.enabled", true)
spark.conf.set("spark.comet.ansi.enabled", true)


scala> spark.conf.set("spark.comet.enabled", false)

scala> spark.sql("select a, -a from t").show
24/05/23 13:55:36 WARN CometSparkSessionExtensions$CometExecRule: Using Comet's experimental support for ANSI mode.
24/05/23 13:55:36 ERROR Executor: Exception in task 0.0 in stage 18.0 (TID 18)
org.apache.spark.SparkArithmeticException: [ARITHMETIC_OVERFLOW] integer overflow. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.


scala> spark.conf.set("spark.comet.enabled", true)

scala> spark.sql("select a, -a from t").show
24/05/23 13:55:48 WARN CometSparkSessionExtensions$CometExecRule: Using Comet's experimental support for ANSI mode.
24/05/23 13:55:48 WARN CometSparkSessionExtensions$CometExecRule: Comet cannot execute some parts of this plan natively because CollectLimit is not supported
+-----------+-----------+
|          a|      (- a)|
+-----------+-----------+
| 2147483647|-2147483647|
|-2147483648|-2147483648|
+-----------+-----------+
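For context, the expected ANSI behavior is that negating Int.MinValue raises an arithmetic overflow error rather than wrapping around, since 2147483648 is not representable as a 32-bit integer. A minimal sketch of that check on the native side, using Rust's built-in checked arithmetic (this is an illustration, not Comet's actual implementation; the function name and error string are hypothetical):

```rust
/// Hypothetical ANSI-mode unary minus for i32 values.
/// `i32::checked_neg` returns `None` when negating `i32::MIN`,
/// because the result (2147483648) overflows i32.
fn ansi_negate(v: i32) -> Result<i32, String> {
    v.checked_neg()
        .ok_or_else(|| "[ARITHMETIC_OVERFLOW] integer overflow".to_string())
}

fn main() {
    // Negating i32::MAX is fine: -2147483647 is representable.
    assert_eq!(ansi_negate(i32::MAX), Ok(-2147483647));
    // Negating i32::MIN must fail in ANSI mode instead of
    // silently wrapping back to -2147483648 as in the repro above.
    assert!(ansi_negate(i32::MIN).is_err());
}
```

With non-ANSI semantics the equivalent would be a wrapping negation (`v.wrapping_neg()`), which reproduces the `-2147483648 → -2147483648` row shown in the output above.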

Describe the potential solution

No response

Additional context

No response

@andygrove added the enhancement (New feature or request) and good first issue (Good for newcomers) labels on May 23, 2024
@vaibhawvipul
Contributor

I am working on this.
