
[SPARK-38997][SPARK-39037][SQL][FOLLOWUP] PushableColumnWithoutNestedColumn need be translated to predicate too #36776

Closed

Conversation


@beliefer beliefer commented Jun 6, 2022

What changes were proposed in this pull request?

#35768 assumed that the expressions inside And, Or and Not must be predicates.
#36370 and #36325 added support for pushing down expressions in GROUP BY and ORDER BY, so a child of And, Or or Not can now be a FieldReference.column(name).
FieldReference.column(name) is not a predicate, so the assert may fail.
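The failure mode can be sketched with a tiny standalone model (all types here, `FieldRef`, `V2Pred`, `ColumnAsPredicate` and `toPredicate`, are illustrative stand-ins, not Spark's actual connector API):

```scala
// Illustrative model only: a predicate node such as Not needs predicate
// children, but the extended push-down can hand it a bare column reference.
sealed trait V2Expr
case class FieldRef(name: String) extends V2Expr  // a column reference, NOT a predicate
sealed trait V2Pred extends V2Expr
case class NotPred(child: V2Pred) extends V2Pred
// Hypothetical wrapper standing in for "translate the column into a predicate".
case class ColumnAsPredicate(ref: FieldRef) extends V2Pred

// Asserting the child is already a predicate fails for a bare FieldRef;
// translating it first, as this followup does, avoids the failed assert.
def toPredicate(e: V2Expr): V2Pred = e match {
  case p: V2Pred   => p
  case r: FieldRef => ColumnAsPredicate(r)
}
```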

Why are the changes needed?

This PR fixes the bug for PushableColumnWithoutNestedColumn.

Does this PR introduce any user-facing change?

'Yes'.
It makes the push-down framework behave more correctly.

How was this patch tested?

New tests

@github-actions github-actions bot added the SQL label Jun 6, 2022
@beliefer beliefer changed the title [SPARK-38997][SPARK-39037][FOLLOWUP] PushableColumnWithoutNestedColumn need be translated to predicate too [SPARK-38997][SPARK-39037][SQL][FOLLOWUP] PushableColumnWithoutNestedColumn need be translated to predicate too Jun 6, 2022

beliefer commented Jun 7, 2022

ping @huaxingao cc @cloud-fan

```diff
@@ -55,8 +55,13 @@ class V2ExpressionBuilder(
       } else {
         Some(FieldReference(name))
       }
-    case pushableColumn(name) if !nestedPredicatePushdownEnabled =>
-      Some(FieldReference.column(name))
+    case col @ pushableColumn(name) if !nestedPredicatePushdownEnabled =>
```

can we merge the code a bit more?

```scala
case col @ pushableColumn(name) =>
  val ref = if (nestedPredicatePushdownEnabled) ... else ...
  if (predicate) ... else ...
```
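Filled in, the merged shape might look like the following self-contained sketch; the extractor, the `predicate` flag and the `AsPredicate` wrapper are assumptions for illustration, not the real `V2ExpressionBuilder` code:

```scala
// Hedged sketch of the merged match, NOT Spark's actual V2ExpressionBuilder.
case class FieldReference(name: String)
object FieldReference {
  def column(name: String): FieldReference = FieldReference(name)
}
// Stand-in for Spark's pushable-column extractor.
object pushableColumn {
  def unapply(name: String): Option[String] = Some(name)
}
// Hypothetical wrapper turning a column reference into a predicate.
case class AsPredicate(ref: FieldReference)

def build(expr: String, nestedPredicatePushdownEnabled: Boolean,
          predicate: Boolean): Option[Any] = expr match {
  case pushableColumn(name) =>
    // Choose the reference form once, instead of two near-identical cases...
    val ref =
      if (nestedPredicatePushdownEnabled) FieldReference(name)
      else FieldReference.column(name)
    // ...then wrap it as a predicate only where a predicate is required.
    if (predicate) Some(AsPredicate(ref)) else Some(ref)
}
```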


+1


OK

@cloud-fan

thanks, merging to master/3.3!

@cloud-fan cloud-fan closed this in 125555c Jun 9, 2022
cloud-fan pushed a commit that referenced this pull request Jun 9, 2022
…Column` need be translated to predicate too

### What changes were proposed in this pull request?
#35768 assumed that the expressions inside `And`, `Or` and `Not` must be predicates.
#36370 and #36325 added support for pushing down expressions in `GROUP BY` and `ORDER BY`, so a child of `And`, `Or` or `Not` can now be a `FieldReference.column(name)`.
`FieldReference.column(name)` is not a predicate, so the assert may fail.

### Why are the changes needed?
This PR fixes the bug for `PushableColumnWithoutNestedColumn`.

### Does this PR introduce _any_ user-facing change?
'Yes'.
It makes the push-down framework behave more correctly.

### How was this patch tested?
New tests

Closes #36776 from beliefer/SPARK-38997_SPARK-39037_followup.

Authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 125555c)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

beliefer commented Jun 9, 2022

@cloud-fan @huaxingao Thank you for the review.

chenzhx pushed a commit to chenzhx/spark that referenced this pull request Jun 13, 2022
chenzhx pushed a commit to chenzhx/spark that referenced this pull request Jun 15, 2022
chenzhx added a commit to Kyligence/spark that referenced this pull request Jun 15, 2022
…mal binary arithmetic (#481)

* [SPARK-39270][SQL] JDBC dialect supports registering dialect specific functions

### What changes were proposed in this pull request?
The built-in functions in Spark are not the same as those in JDBC databases.
We can give users the chance to register dialect-specific functions.

### Why are the changes needed?
JDBC dialect supports registering dialect specific functions

### Does this PR introduce _any_ user-facing change?
'No'.
New feature.

### How was this patch tested?
New tests.

Closes apache#36649 from beliefer/SPARK-39270.

Authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

* [SPARK-39413][SQL] Capitalize sql keywords in JDBCV2Suite

### What changes were proposed in this pull request?
`JDBCV2Suite` contains some test cases in which SQL keywords are not capitalized.
This PR capitalizes the SQL keywords in `JDBCV2Suite`.

### Why are the changes needed?
Capitalize SQL keywords in `JDBCV2Suite`.

### Does this PR introduce _any_ user-facing change?
'No'.
Just update test cases.

### How was this patch tested?
N/A.

Closes apache#36805 from beliefer/SPARK-39413.

Authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: huaxingao <huaxin_gao@apple.com>

* [SPARK-38997][SPARK-39037][SQL][FOLLOWUP] PushableColumnWithoutNestedColumn` need be translated to predicate too

### What changes were proposed in this pull request?
apache#35768 assumed that the expressions inside `And`, `Or` and `Not` must be predicates.
apache#36370 and apache#36325 added support for pushing down expressions in `GROUP BY` and `ORDER BY`, so a child of `And`, `Or` or `Not` can now be a `FieldReference.column(name)`.
`FieldReference.column(name)` is not a predicate, so the assert may fail.

### Why are the changes needed?
This PR fixes the bug for `PushableColumnWithoutNestedColumn`.

### Does this PR introduce _any_ user-facing change?
'Yes'.
It makes the push-down framework behave more correctly.

### How was this patch tested?
New tests

Closes apache#36776 from beliefer/SPARK-38997_SPARK-39037_followup.

Authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

* [SPARK-39316][SQL] Merge PromotePrecision and CheckOverflow into decimal binary arithmetic

### What changes were proposed in this pull request?

The main change:
- Add a new method `resultDecimalType` in `BinaryArithmetic`
- Add a new expression `DecimalAddNoOverflowCheck` for the internal decimal add used by e.g. `Sum`/`Average`; the differences from `Add` are:
  - `DecimalAddNoOverflowCheck` does not check overflow
  - `DecimalAddNoOverflowCheck` takes `dataType` as an input parameter
- Merge the decimal precision code of `DecimalPrecision` into each arithmetic data type, so every arithmetic expression reports the accurate decimal type, and remove the now-unused expression `PromotePrecision` and related code
- Merge `CheckOverflow` into the arithmetic eval and code-gen code paths, so every arithmetic expression can handle the overflow case at runtime

Merge `PromotePrecision` into `dataType`, for example, `Add`:
```scala
override def resultDecimalType(p1: Int, s1: Int, p2: Int, s2: Int): DecimalType = {
  val resultScale = max(s1, s2)
  if (allowPrecisionLoss) {
    DecimalType.adjustPrecisionScale(max(p1 - s1, p2 - s2) + resultScale + 1,
      resultScale)
  } else {
    DecimalType.bounded(max(p1 - s1, p2 - s2) + resultScale + 1, resultScale)
  }
}
```
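As a worked example, the formula above can be re-computed standalone; this is a simplified sketch, since the real `adjustPrecisionScale` may also reduce the scale when the 38-digit maximum is exceeded:

```scala
// Simplified re-computation of Add's result decimal type per the formula above.
val MaxPrecision = 38  // Spark's DecimalType maximum precision

def addResultType(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
  val resultScale = math.max(s1, s2)
  // one extra integral digit to absorb a possible carry
  val precision = math.max(p1 - s1, p2 - s2) + resultScale + 1
  (math.min(precision, MaxPrecision), resultScale)
}
```

For example, adding DECIMAL(28, 2) and DECIMAL(18, 2) yields DECIMAL(29, 2).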

Merge `CheckOverflow`, for example, `Add` eval:
```scala
dataType match {
  case decimalType: DecimalType =>
    val value = numeric.plus(input1, input2)
    checkOverflow(value.asInstanceOf[Decimal], decimalType)
  ...
}
```
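The runtime check can be illustrated with plain `BigDecimal`; this is a hedged sketch, as Spark's actual behavior also depends on the ANSI flag, which chooses between returning null and throwing:

```scala
// Illustrative overflow check: round to the target scale, then verify the
// value still fits in `precision` total digits; None models the overflow case.
def checkOverflow(value: BigDecimal, precision: Int, scale: Int): Option[BigDecimal] = {
  val rounded = value.setScale(scale, BigDecimal.RoundingMode.HALF_UP)
  if (rounded.precision > precision) None else Some(rounded)
}
```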

Note that `CheckOverflow` is still useful after this PR, e.g. in `RowEncoder`. We can do further cleanup in a separate PR.

### Why are the changes needed?

Fix the bug of `TypeCoercion`, for example:
```sql
SELECT CAST(1 AS DECIMAL(28, 2))
UNION ALL
SELECT CAST(1 AS DECIMAL(18, 2)) / CAST(1 AS DECIMAL(18, 2));
```

Relax the decimal precision at runtime, so we do not need a redundant Cast.

### Does this PR introduce _any_ user-facing change?

yes, bug fix

### How was this patch tested?

Pass existing tests and add some bug-fix tests in `decimalArithmeticOperations.sql`

Closes apache#36698 from ulysses-you/decimal.

Lead-authored-by: ulysses-you <ulyssesyou18@gmail.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Co-authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

* fix ut

Co-authored-by: Jiaan Geng <beliefer@163.com>
Co-authored-by: ulysses-you <ulyssesyou18@gmail.com>
Co-authored-by: Wenchen Fan <cloud0fan@gmail.com>
Co-authored-by: Wenchen Fan <wenchen@databricks.com>
yabola pushed a commit to Kyligence/spark that referenced this pull request Jun 21, 2022
leejaywei pushed a commit to Kyligence/spark that referenced this pull request Jul 14, 2022
zheniantoushipashi pushed a commit to Kyligence/spark that referenced this pull request Aug 8, 2022
RolatZhang pushed a commit to Kyligence/spark that referenced this pull request Aug 29, 2023