[SPARK-25454][SQL] add a new config for picking minimum precision for integral literals

## What changes were proposed in this pull request?

#20023 proposed to allow precision loss during decimal operations, to reduce the possibility of overflow. This is a behavior change and is protected by the DECIMAL_OPERATIONS_ALLOW_PREC_LOSS config. However, that PR introduced another behavior change that is not protected by a config: picking a minimum precision for integral literals. This PR adds a new config for it: `spark.sql.legacy.literal.pickMinimumPrecision`.

This allows users to work around the issue in SPARK-25454, which is caused by a long-standing bug involving negative decimal scales.
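
For illustration, a minimal sketch of the workaround (the config name and query come from this patch; the `spark` session setup is assumed):

```scala
// Work around the SPARK-25454 negative-scale bug by disabling the
// minimum-precision behavior for integral literals.
spark.conf.set("spark.sql.legacy.literal.pickMinimumPrecision", "false")
spark.sql("select 26393499451 / (1e6 * 1000)").show()
// Expected result per the new test: 26.3934994510000
```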

## How was this patch tested?

A new test in SQLQuerySuite.

Closes #22494 from cloud-fan/decimal.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
(cherry picked from commit d0990e3)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
cloud-fan authored and gatorsmile committed Sep 27, 2018
1 parent 2381d60 commit 26d893a
Showing 3 changed files with 24 additions and 4 deletions.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala

@@ -290,11 +290,13 @@ object DecimalPrecision extends TypeCoercionRule {
     // potentially losing 11 digits of the fractional part. Using only the precision needed
     // by the Literal, instead, the result would be DECIMAL(38 + 1 + 1, 18), which would
     // become DECIMAL(38, 16), safely having a much lower precision loss.
-    case (l: Literal, r) if r.dataType.isInstanceOf[DecimalType]
-      && l.dataType.isInstanceOf[IntegralType] =>
+    case (l: Literal, r) if r.dataType.isInstanceOf[DecimalType] &&
+        l.dataType.isInstanceOf[IntegralType] &&
+        SQLConf.get.literalPickMinimumPrecision =>
       b.makeCopy(Array(Cast(l, DecimalType.fromLiteral(l)), r))
-    case (l, r: Literal) if l.dataType.isInstanceOf[DecimalType]
-      && r.dataType.isInstanceOf[IntegralType] =>
+    case (l, r: Literal) if l.dataType.isInstanceOf[DecimalType] &&
+        r.dataType.isInstanceOf[IntegralType] &&
+        SQLConf.get.literalPickMinimumPrecision =>
       b.makeCopy(Array(l, Cast(r, DecimalType.fromLiteral(r))))
     // Promote integers inside a binary expression with fixed-precision decimals to decimals,
     // and fixed-precision decimals in an expression with floats / doubles to doubles
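
For context, a minimal sketch of the precision picking this hunk gates (the resulting types are assumptions inferred from `DecimalType`'s behavior, not output from this patch):

```scala
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.types.{DecimalType, LongType}

// Narrowest decimal that can hold the literal 1000L: DecimalType(4, 0).
val narrow = DecimalType.fromLiteral(Literal(1000L))
// Default decimal for any LongType value, regardless of magnitude: DecimalType(20, 0).
val wide = DecimalType.forType(LongType)
```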
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

@@ -1103,6 +1103,15 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
 
+  val LITERAL_PICK_MINIMUM_PRECISION =
+    buildConf("spark.sql.legacy.literal.pickMinimumPrecision")
+      .internal()
+      .doc("When an integral literal is used in decimal operations, pick a minimum precision " +
+        "required by the literal if this config is true, to make the resulting precision and/or " +
+        "scale smaller. This can reduce the possibility of precision loss and/or overflow.")
+      .booleanConf
+      .createWithDefault(true)
+
   val SQL_OPTIONS_REDACTION_PATTERN =
     buildConf("spark.sql.redaction.options.regex")
       .doc("Regex to decide which keys in a Spark SQL command's options map contain sensitive " +
@@ -1519,6 +1528,8 @@ class SQLConf extends Serializable with Logging {
 
   def decimalOperationsAllowPrecisionLoss: Boolean = getConf(DECIMAL_OPERATIONS_ALLOW_PREC_LOSS)
 
+  def literalPickMinimumPrecision: Boolean = getConf(LITERAL_PICK_MINIMUM_PRECISION)
+
   def continuousStreamingExecutorQueueSize: Int = getConf(CONTINUOUS_STREAMING_EXECUTOR_QUEUE_SIZE)
 
   def continuousStreamingExecutorPollIntervalMs: Long =
sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala

@@ -2824,6 +2824,13 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
     val result = ds.flatMap(_.bar).distinct
     result.rdd.isEmpty
   }
+
+  test("SPARK-25454: decimal division with negative scale") {
+    // TODO: completely fix this issue even when LITERAL_PICK_MINIMUM_PRECISION is true.
+    withSQLConf(SQLConf.LITERAL_PICK_MINIMUM_PRECISION.key -> "false") {
+      checkAnswer(sql("select 26393499451 / (1e6 * 1000)"), Row(BigDecimal("26.3934994510000")))
+    }
+  }
 }
 
 case class Foo(bar: Option[String])
