[SPARK-22036][SQL] Decimal multiplication with high precision/scale often returns NULL

## What changes were proposed in this pull request?

When an operation between Decimals produces a number that is not exactly representable with the result's precision and scale, Spark returns `NULL`. This was done to mirror Hive's behavior, but it goes against SQL ANSI 2011, which states that "If the result cannot be represented exactly in the result type, then whether it is rounded or truncated is implementation-defined". Moreover, Hive has since changed its behavior to comply with the standard, thanks to HIVE-15331. Therefore, this PR proposes to:

- update the rules that determine the result precision and scale to match the new ones Hive introduced in HIVE-15331;
- round the result of an operation when it is not exactly representable with the result's precision and scale, instead of returning `NULL`;
- introduce a new config, `spark.sql.decimalOperations.allowPrecisionLoss`, which defaults to `true` (i.e. the new behavior), so that users can switch back to the previous one.

Hive's behavior matches SQL Server's. The only difference is that Hive adjusts the precision and scale for all arithmetic operations, while SQL Server's documentation says it does so only for multiplication and division. This PR follows Hive's behavior.

A more detailed explanation is available here: https://mail-archives.apache.org/mod_mbox/spark-dev/201712.mbox/%3CCAEorWNAJ4TxJR9NBcgSFMD_VxTg8qVxusjP%2BAJP-x%2BJV9zH-yA%40mail.gmail.com%3E

## How was this patch tested?

Modified and added UTs. Compared results against Hive and SQL Server.

Author: Marco Gaido <email@example.com>

Closes #20023 from mgaido91/SPARK-22036.

(cherry picked from commit e28eb43)
Signed-off-by: Wenchen Fan <firstname.lastname@example.org>
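For illustration, a minimal sketch of the precision/scale adjustment this change adopts: when the ideal result precision exceeds the 38-digit maximum, the scale is reduced to fit, but at least `min(scale, 6)` fractional digits are kept (6 being Spark's minimum adjusted scale). The `adjust` helper below is hypothetical, written only to mirror the rule described above, not the actual `DecimalType` code:

```scala
object DecimalAdjustSketch {
  val MaxPrecision = 38
  val MinimumAdjustedScale = 6 // minimum fractional digits preserved under precision loss

  // Given the "ideal" (unbounded) precision and scale of an arithmetic result,
  // return the bounded precision/scale pair after the HIVE-15331-style adjustment.
  def adjust(precision: Int, scale: Int): (Int, Int) = {
    if (precision <= MaxPrecision) {
      (precision, scale) // fits as-is, no adjustment needed
    } else {
      val intDigits = precision - scale // digits left of the decimal point are kept
      val minScale = math.min(scale, MinimumAdjustedScale)
      val adjustedScale = math.max(MaxPrecision - intDigits, minScale)
      (MaxPrecision, adjustedScale)
    }
  }

  def main(args: Array[String]): Unit = {
    // e.g. decimal(38,10) * decimal(9,0): ideal result is decimal(48,10),
    // which is adjusted down to decimal(38,6)
    println(DecimalAdjustSketch.adjust(38 + 9 + 1, 10 + 0))
  }
}
```

With `spark.sql.decimalOperations.allowPrecisionLoss=false`, this adjustment is skipped and overflowing results behave as before.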
Showing with 434 additions and 82 deletions.
- +5 −0 docs/sql-programming-guide.md
- +84 −30 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala
- +1 −1 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala
- +12 −0 sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
- +44 −1 sql/catalyst/src/main/scala/org/apache/spark/sql/types/DecimalType.scala
- +2 −2 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala
- +10 −10 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecisionSuite.scala
- +47 −0 sql/core/src/test/resources/sql-tests/inputs/typeCoercion/native/decimalArithmeticOperations.sql
- +227 −18 ...core/src/test/resources/sql-tests/results/typeCoercion/native/decimalArithmeticOperations.sql.out
- +2 −2 sql/core/src/test/resources/sql-tests/results/typeCoercion/native/decimalPrecision.sql.out
- +0 −18 sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala