@@ -175,10 +175,12 @@ object DecimalPrecision extends TypeCoercionRule {
           resultType, nullOnOverflow)
 
     case expr @ IntegralDivide(
-        e1 @ DecimalType.Expression(p1, s1), e2 @ DecimalType.Expression(p2, s2)) =>
+        e1 @ DecimalType.Expression(p1, s1), e2 @ DecimalType.Expression(p2, s2), returnLong) =>
       val widerType = widerDecimalType(p1, s1, p2, s2)
-      val promotedExpr =
-        IntegralDivide(promotePrecision(e1, widerType), promotePrecision(e2, widerType))
+      val promotedExpr = IntegralDivide(
+        promotePrecision(e1, widerType),
+        promotePrecision(e2, widerType),
+        returnLong)
       if (expr.dataType.isInstanceOf[DecimalType]) {
         // This follows division rule
         val intDig = p1 - s1 + s2
@@ -403,11 +403,18 @@ case class Divide(left: Expression, right: Expression) extends DivModLike {
   """,
   since = "3.0.0")
 // scalastyle:on line.size.limit
-case class IntegralDivide(left: Expression, right: Expression) extends DivModLike {
+case class IntegralDivide(
+    left: Expression,
+    right: Expression,
+    returnLong: Boolean) extends DivModLike {
Contributor:

Can we just add a private val returnLong = SQLConf.get.integralDivideReturnLong in the class body? Then the config value is fixed when the expression is created, and it can be serialized to executors.

The Spark Expression constructor is somewhat exposed to end users when they call functions in SQL. BTW, Cast already uses a val to store config values.
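The pattern under discussion, reading a session config once when the expression is built, can be sketched outside Spark like this. FakeConf and Div below are illustrative stand-ins, not Spark's SQLConf or Expression classes:

```scala
// Illustrative stand-in for a mutable session config; not Spark's SQLConf.
object FakeConf {
  var integralDivideReturnLong: Boolean = false
}

case class Div(left: Int, right: Int, returnLong: Boolean) {
  // Auxiliary constructor: reads the config once, at construction time,
  // so later config changes cannot flip the flag on an existing expression.
  def this(left: Int, right: Int) = this(left, right, FakeConf.integralDivideReturnLong)
}

object ConfigCaptureDemo {
  def main(args: Array[String]): Unit = {
    val d1 = new Div(6, 4)                 // captures returnLong = false
    FakeConf.integralDivideReturnLong = true
    val d2 = new Div(6, 4)                 // captures returnLong = true
    val d1Copy = d1.copy(left = 8)         // copy() keeps the captured flag
    println((d1.returnLong, d2.returnLong, d1Copy.returnLong))
  }
}
```

Each expression keeps the flag that was current when it was created, and `copy` during a tree transform preserves it, which addresses the concern raised in the next comment.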

Contributor:

That can potentially change value every time you transform the tree.

Contributor:

Or how about we create two expressions, IntegralDivide and IntegralDivideReturnLong? I'm just worried that we would allow end users to specify the returnLong parameter, which then becomes an API.

Member Author:

> I'm just worried that we would allow end users to specify the returnLong parameter, which then becomes an API.

We don't allow that:

SELECT div(3, 2, false);

fails with:

Invalid number of arguments for function div. Expected: 2; Found: 3; line 1 pos 7
org.apache.spark.sql.AnalysisException: Invalid number of arguments for function div. Expected: 2; Found: 3; line 1 pos 7
	at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$.$anonfun$expression$7(FunctionRegistry.scala:618)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$.$anonfun$expression$4(FunctionRegistry.scala:602)
	at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:121)
	at org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1418)

I ran the command on the PR changes.

Contributor:

So the non-expression parameter doesn't count? Then I'm fine with it.

Member Author:

Yes, we count only expressions, see:

val params = Seq.fill(expressions.size)(classOf[Expression])
val f = constructors.find(_.getParameterTypes.toSeq == params).getOrElse {
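That lookup can be reproduced in miniature with plain Java reflection. Expr and TwoArg below are hypothetical stand-ins for Expression and IntegralDivide, not Spark's classes:

```scala
// Hypothetical miniature of the registry's constructor lookup.
class Expr
class TwoArg(val left: Expr, val right: Expr, val returnLong: Boolean) {
  def this(left: Expr, right: Expr) = this(left, right, false)
}

object LookupDemo {
  def main(args: Array[String]): Unit = {
    val expressions = Seq(new Expr, new Expr)   // the user passed two SQL arguments
    val params = Seq.fill(expressions.size)(classOf[Expr])
    // Only a constructor whose parameter list is exactly N Expression types
    // matches; the (Expr, Expr, Boolean) constructor is invisible to this
    // search, so a user cannot reach it from SQL.
    val ctor = classOf[TwoArg].getConstructors.find(_.getParameterTypes.toSeq == params)
    println(ctor.map(_.getParameterCount))      // prints Some(2)
  }
}
```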


+  def this(left: Expression, right: Expression) = {
+    this(left, right, SQLConf.get.integralDivideReturnLong)
+  }
+
   override def inputType: AbstractDataType = TypeCollection(IntegralType, DecimalType)
 
-  override def dataType: DataType = if (SQLConf.get.integralDivideReturnLong) {
+  override def dataType: DataType = if (returnLong) {
     LongType
   } else {
     left.dataType

@@ -416,7 +423,7 @@ case class IntegralDivide(left: Expression, right: Expression) extends DivModLike {
   override def symbol: String = "/"
   override def decimalMethod: String = "quot"
   override def decimalToDataTypeCodeGen(decimalResult: String): String = {
-    if (SQLConf.get.integralDivideReturnLong) {
+    if (returnLong) {
       s"$decimalResult.toLong()"
     } else {
       decimalResult
@@ -433,7 +440,7 @@ case class IntegralDivide(left: Expression, right: Expression) extends DivModLike {
         d.asIntegral.asInstanceOf[Integral[Any]]
     }
     val divide = integral.quot _
-    if (SQLConf.get.integralDivideReturnLong) {
+    if (returnLong) {
       val toLong = integral.asInstanceOf[Integral[Any]].toLong _
       (x, y) => {
         val res = divide(x, y)

@@ -451,6 +458,12 @@ case class IntegralDivide(left: Expression, right: Expression) extends DivModLike {
   override def evalOperation(left: Any, right: Any): Any = div(left, right)
 }
 
+object IntegralDivide {
+  def apply(left: Expression, right: Expression): IntegralDivide = {
+    new IntegralDivide(left, right)
+  }
+}
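For reference, the extractor proposed in the next comment could look roughly like this. The sketch uses stand-in types (Expr, IntDiv, IntDivPair), and the PR ultimately did not add such an extractor:

```scala
// Stand-in types; a sketch of a two-argument extractor for a
// three-field case class.
class Expr
case class IntDiv(left: Expr, right: Expr, returnLong: Boolean)

// A separate extractor object sidesteps any interaction with the
// case class's synthetic three-argument unapply.
object IntDivPair {
  def unapply(e: IntDiv): Option[(Expr, Expr)] = Some((e.left, e.right))
}

object UnapplyDemo {
  def main(args: Array[String]): Unit = {
    val e = IntDiv(new Expr, new Expr, returnLong = true)
    // Matches on the two children only; the flag is read via the binding,
    // as in the reviewer's `case e @ IntegralDivide(l, r) if e.returnLong`.
    val flag = e match {
      case ex @ IntDivPair(_, _) => ex.returnLong
      case _ => false
    }
    println(flag)   // prints true
  }
}
```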
Contributor:

Shall we define unapply as well? Most of the time we don't care about the returnLong parameter. When we do, we should write case e @ IntegralDivide(left, right) if e.returnLong.

Member Author:

There is only one place where we unapply IntegralDivide. It is here: https://github.com/apache/spark/pull/27628/files#diff-8e1575bb706d6f7e8b5ea0b175eaeafcR178. And returnLong is not checked there; it is just extracted and copied through unchanged.

Member Author (@MaxGekk, Feb 19, 2020):

If I define unapply, it will not be used. I'm not sure it is needed.


 @ExpressionDescription(
   usage = "expr1 _FUNC_ expr2 - Returns the remainder after `expr1`/`expr2`.",
   examples = """