[SPARK-56913][SQL] Simplify BinaryArithmetic byte/short codegen under ANSI mode#55938

Open
gengliangwang wants to merge 1 commit into apache:master from gengliangwang:SPARK-56913-arithmetic-byte-short

Conversation

@gengliangwang
Member

Title: [SPARK-56913][SQL] Refactor BinaryArithmetic byte/short codegen under ANSI mode
Base: master (independent)
Head: gengliangwang:SPARK-56913-arithmetic-byte-short

What changes were proposed in this pull request?

Introduce ArithmeticUtils.java with six static helpers (byteAddExact, byteSubtractExact, byteMultiplyExact, shortAddExact, shortSubtractExact, shortMultiplyExact) and use them from BinaryArithmetic.doGenCode and from Add / Subtract / Multiply.nullSafeEval.

The Byte/Short ANSI overflow-check branch of BinaryArithmetic.doGenCode previously emitted ~7 lines per call site (int tmpResult + overflow check + cast back). After this PR it emits a single ArithmeticUtils.<type><Op>Exact(...) call.
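To make the before/after shape concrete, here is a minimal sketch of what such helpers could look like. It is not the PR's actual `ArithmeticUtils.java`: the class name `ArithmeticUtilsSketch`, the `overflow` helper, and the message text are assumptions for illustration, and a plain `ArithmeticException` stands in for Spark's SQL-formatted `BINARY_ARITHMETIC_OVERFLOW` error. Only two of the six helpers are shown; the rest would follow the same widen, check, narrow pattern.

```java
// Illustrative sketch only; NOT the ArithmeticUtils.java added by the PR.
final class ArithmeticUtilsSketch {
    private ArithmeticUtilsSketch() {}

    // Widen both operands to int, add, and verify the result still fits in a byte.
    static byte byteAddExact(byte left, byte right) {
        int result = left + right;
        if (result < Byte.MIN_VALUE || result > Byte.MAX_VALUE) {
            throw overflow(left + " + " + right);
        }
        return (byte) result;
    }

    // Same pattern for short; the int product of two shorts cannot itself overflow.
    static short shortMultiplyExact(short left, short right) {
        int result = left * right;
        if (result < Short.MIN_VALUE || result > Short.MAX_VALUE) {
            throw overflow(left + " * " + right);
        }
        return (short) result;
    }

    // Assumption: stand-in for Spark's BINARY_ARITHMETIC_OVERFLOW error construction.
    private static ArithmeticException overflow(String expression) {
        return new ArithmeticException(
            "BINARY_ARITHMETIC_OVERFLOW: " + expression + " caused overflow");
    }

    public static void main(String[] args) {
        // Inline form, roughly the shape the description says was emitted per call site:
        byte a = 100, b = 27;
        int tmpResult = a + b;
        if (tmpResult < Byte.MIN_VALUE || tmpResult > Byte.MAX_VALUE) {
            throw overflow(a + " + " + b);
        }
        byte inlineValue = (byte) tmpResult;

        // Helper form: the same check collapsed into one static call.
        byte helperValue = byteAddExact(a, b);
        System.out.println(inlineValue == helperValue);

        try {
            byteAddExact((byte) 100, (byte) 28); // 128 does not fit in a byte
        } catch (ArithmeticException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With helpers of this shape, each generated call site shrinks to a single expression such as `ArithmeticUtilsSketch.byteAddExact(left, right)`, and the overflow check lives in one place instead of being repeated in every generated method.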

The eval-path counterparts for Add/Subtract/Multiply also delegate to the helpers under ANSI mode, replacing the previous fall-through to numeric.plus/minus/times (which threw a generic ArithmeticException) -- the eval path now produces the same SQL-formatted BINARY_ARITHMETIC_OVERFLOW error as the codegen path.

Primitive int/long/float/double branches are intentionally left inline (single bytecode op; routing through a static method would be a runtime regression).

Why are the changes needed?

Part of SPARK-56908 (umbrella). The Byte/Short ANSI branch is the largest single inline body in BinaryArithmetic.doGenCode.

Does this PR introduce any user-facing change?

No. Compiled behavior is identical; the eval path now produces a SQL-formatted overflow error matching the codegen path (the previous generic ArithmeticException was an inconsistency).

How was this patch tested?

build/sbt "catalyst/testOnly *ArithmeticExpressionSuite"

35/35 pass.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 1.x

@gengliangwang
Member Author


Stack overview (SPARK-56908 umbrella)

This PR is part of a stack of 8 PRs against SPARK-56908. Order:

  1. #55934: [SPARK-56909][SQL] Simplify Cast to int/long codegen under ANSI mode (base of this stack)
  2. #55935: [SPARK-56910][SQL] Simplify Cast to byte/short codegen under ANSI mode
  3. #55936: [SPARK-56911][SQL] Simplify Cast to decimal codegen under ANSI mode
  4. #55937: [SPARK-56912][SQL] Simplify Cast to boolean codegen under ANSI mode
  5. #55939: [SPARK-56914][SQL] Simplify decimal arithmetic codegen under ANSI mode (depends on #55936)
  6. #55938: [SPARK-56913][SQL] Simplify BinaryArithmetic byte/short codegen under ANSI mode (independent)
  7. #55940: [SPARK-56915][SQL] Simplify MakeDate/MakeInterval codegen under ANSI mode (independent)
  8. #55941: [SPARK-56916][SQL] Simplify ElementAt array codegen under ANSI mode (independent)

PRs 1-4 are linearly stacked on each other (each branch is based on the previous one). PR 5 (decimal arithmetic) is stacked on top of PR 3 (cast decimal) since it uses CastUtils.changePrecisionExact. PRs 6, 7, 8 branch off master independently.

@gengliangwang force-pushed the SPARK-56913-arithmetic-byte-short branch from 7344920 to b38e73a on May 18, 2026, 17:05