[SPARK-8279][SQL]Add math function round #6938

yjshen · 2015-06-22T18:01:58Z

JIRA: https://issues.apache.org/jira/browse/SPARK-8279

rxin · 2015-06-22T18:15:43Z

Jenkins, add to whitelist.

marmbrus · 2015-06-22T18:35:20Z

ok to test

SparkQA · 2015-06-22T18:41:55Z

Test build #35467 has finished for PR 6938 at commit 0495feb.

This patch fails Scala style tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Round(children: Seq[Expression]) extends Expression
- trait BigDecimalConverter[T]

SparkQA · 2015-06-22T20:56:19Z

Test build #35469 has finished for PR 6938 at commit c5ce169.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Round(children: Seq[Expression]) extends Expression
- trait BigDecimalConverter[T]

chenghao-intel · 2015-06-23T06:34:21Z

Is this a duplicate of #6836?

yjshen · 2015-06-23T06:36:35Z

@chenghao-intel, yep, I wasn't aware there were two JIRAs for Round. Would you mind looking at this one as well?

chenghao-intel · 2015-06-23T06:50:09Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala

@@ -312,3 +315,90 @@ case class Logarithm(left: Expression, right: Expression)
    """
  }
 }
+
+case class Round(children: Seq[Expression]) extends Expression {


Actually, expressions now support multiple constructors; see https://github.com/apache/spark/pull/6806/files#diff-d788f93e29b4d25cdd7d60328587678bR229

chenghao-intel · 2015-06-23T07:02:11Z

As most of the issues I raised are solved in #6836, would you mind jumping over there and leaving some comments?
BTW, as https://issues.apache.org/jira/browse/SPARK-8279 suggests, perhaps we'd better focus on the udf_round_3 test case, which is not covered by #6836. What do you think?

yjshen · 2015-06-23T07:43:06Z

@chenghao-intel, I think the main difference between this and #6836 is whether to make Round act like GenericUDFRound, if we want to support udf_round and udf_round_3.
I suppose following Hive's semantics is a good choice.

SparkQA · 2015-06-23T08:02:33Z

Test build #35521 has finished for PR 6938 at commit be02d5b.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

chenghao-intel · 2015-06-23T08:07:45Z

Yes, #6836 follows Hive's GenericUDFRound, but we should keep the input / output as o.a.s.s.types.Decimal, not BigDecimal (if it's a DecimalType); otherwise it will cause problems when interacting with other Spark SQL expressions. See https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/CatalystTypeConverters.scala#L291

yjshen · 2015-06-23T08:18:00Z

Oh, I think there is a misunderstanding about the round method and BigDecimalConverter. BDC just acts as an implicit type parameter of round: it infers the first argument's type (for example, Byte), calls the corresponding constructor of BigDecimal, and uses its setScale method to round. Finally, round uses fromBigDecimal to convert the result back to the original dataType, i.e. to Byte in our example.

Therefore, I preserve the dataType of children(0). The only exceptions are String and Binary, which are regarded as Double in Hive.
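
For readers following along, the type-class pattern described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the code in this PR: only the names `BigDecimalConverter` and `fromBigDecimal` come from the comment, while `toBigDecimal`, `RoundSketch`, the two converter instances, and the choice of `HALF_UP` rounding are assumptions made for illustration.

```scala
import scala.math.BigDecimal.RoundingMode

// Hypothetical reconstruction of the type class described in the comment.
// Only BigDecimalConverter and fromBigDecimal are names taken from the thread.
trait BigDecimalConverter[T] {
  def toBigDecimal(in: T): BigDecimal   // assumed name
  def fromBigDecimal(bd: BigDecimal): T
}

object BigDecimalConverter {
  // One instance per supported input type; two are enough to show the idea.
  implicit object ByteConverter extends BigDecimalConverter[Byte] {
    def toBigDecimal(in: Byte): BigDecimal = BigDecimal(in.toInt)
    def fromBigDecimal(bd: BigDecimal): Byte = bd.toByte
  }
  implicit object DoubleConverter extends BigDecimalConverter[Double] {
    def toBigDecimal(in: Double): BigDecimal = BigDecimal(in)
    def fromBigDecimal(bd: BigDecimal): Double = bd.toDouble
  }
}

object RoundSketch {
  // The implicit parameter is resolved from the static type of `in`, so the
  // result has the same type as the input. HALF_UP is an assumption here.
  def round[T](in: T, scale: Int)(implicit bdc: BigDecimalConverter[T]): T =
    bdc.fromBigDecimal(bdc.toBigDecimal(in).setScale(scale, RoundingMode.HALF_UP))
}
```

With these definitions, `RoundSketch.round(45.toByte, -1)` stays a `Byte` and `RoundSketch.round(2.5, 0)` stays a `Double`, which is the "preserve the dataType of children(0)" behaviour the comment describes.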

chenghao-intel · 2015-06-23T08:24:15Z

Oh, sorry, you did use Decimal for Round.eval; please ignore my previous comment.

SparkQA · 2015-06-23T10:43:06Z

Test build #35539 has finished for PR 6938 at commit 40f4a99.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Round(child: Expression, scale: Expression) extends Expression
- trait BigDecimalConverter[T]

chenghao-intel · 2015-06-23T11:23:25Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala

+
+  def children: Seq[Expression] = Seq(child, scale)
+
+  def nullable: Boolean = true


Shouldn't this depend on child.nullable || scale.nullable?

Probably more than that: Hive supports String, Double.NaN, and Double.Infinity as input, and all of these would produce a null result.

OK, let's say both children are literals, e.g. Literal(123.0, FloatType) and Literal(1, IntegerType). Should it still be nullable?

It is not the end of the world to have nullable be more conservative, since it is technically correct to be nullable. However, if there is a more accurate way to determine nullability, we should use it.

nullable is very useful in expression optimization, so we'd better handle it properly.

I mean nullable is sometimes determined at runtime: a non-null string or Double.NaN is not null itself, but would eval to null in Round.
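
The point of this exchange can be illustrated with a small stand-alone model. Everything below is hypothetical: `Expr`, `Lit`, and `NullableRound` are simplified stand-ins rather than Catalyst's classes, and the NaN / Infinity handling simply follows the behaviour described in the comments above.

```scala
import scala.math.BigDecimal.RoundingMode

// Hypothetical mini-model of an expression tree; not Catalyst's real classes.
sealed trait Expr {
  def nullable: Boolean
  def eval(): Any
}

case class Lit(value: Any) extends Expr {
  def nullable: Boolean = value == null
  def eval(): Any = value
}

case class NullableRound(child: Expr, scale: Expr) extends Expr {
  // Conservative answer: child.nullable || scale.nullable would be wrong,
  // because a non-null NaN or Infinity input still evaluates to null below.
  def nullable: Boolean = true

  def eval(): Any = (child.eval(), scale.eval()) match {
    case (null, _) | (_, null) => null
    case (d: Double, _)
        if java.lang.Double.isNaN(d) || java.lang.Double.isInfinite(d) => null
    case (d: Double, s: Int) =>
      BigDecimal(d).setScale(s, RoundingMode.HALF_UP).toDouble
    case _ => null // other input types are out of scope for this sketch
  }
}
```

Here `Lit(Double.NaN).nullable` is false, yet `NullableRound(Lit(Double.NaN), Lit(1)).eval()` returns null, so nullability cannot be derived from the children alone under these assumptions.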

SparkQA · 2015-06-23T17:36:47Z

Test build #35555 has finished for PR 6938 at commit 479fa9b.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Round(child: Expression, scale: Expression) extends Expression
- trait BigDecimalConverter[T]

yjshen · 2015-06-23T17:38:20Z

@rxin @marmbrus, mind reviewing this?

SparkQA · 2015-06-24T09:20:53Z

Test build #35660 has finished for PR 6938 at commit c14f64d.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Round(child: Expression, scale: Expression) extends Expression
- trait BigDecimalConverter[T]

yjshen · 2015-06-24T10:55:01Z

@chenghao-intel, I refactored eval and moved the type-specific branching out.

SparkQA · 2015-07-14T10:43:25Z

Test build #37221 has finished for PR 6938 at commit 392b65b.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Round(child: Expression, scale: Expression)

rxin · 2015-07-15T06:17:26Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala

+  override def left: Expression = child
+  override def right: Expression = scale
+
+  override def children: Seq[Expression] = Seq(child, scale)


i don't think you need this

rxin · 2015-07-15T06:29:18Z

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MathFunctionsSuite.scala

+
+    // round_scale > current_scale would result in precision increase
+    // and not allowed by o.a.s.s.types.Decimal.changePrecision, therefore null
+    (0 to 7).foreach { i =>


I was also using i for array index here?

ok - can you at least move bdResults closer to this loop?
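
The code comment in the hunk above says that rounding to a scale larger than the current one would increase precision, which Decimal.changePrecision rejects, so the result is null. A rough model of that rule using plain `scala.math.BigDecimal` is sketched below; `ChangePrecisionSketch`, `roundWithin`, and `maxPrecision` are hypothetical names, and this is not how o.a.s.s.types.Decimal is implemented.

```scala
import scala.math.BigDecimal.RoundingMode

object ChangePrecisionSketch {
  // Hypothetical helper: rescale `value`, and return None (standing in for a
  // SQL null) when the result needs more digits than `maxPrecision` allows.
  def roundWithin(value: BigDecimal, scale: Int, maxPrecision: Int): Option[BigDecimal] = {
    val rounded = value.setScale(scale, RoundingMode.HALF_UP)
    if (rounded.precision > maxPrecision) None else Some(rounded)
  }
}
```

For a value such as 3.1415 with a precision budget of 5 digits, rounding to scale 2 fits, while asking for scale 6 pads the value to 7 digits and is rejected.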

rxin · 2015-07-15T06:30:22Z

OK I'm going to merge this. Please submit a patch to fix the minor comments.

rxin · 2015-07-15T06:34:55Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/math.scala

+    }
+  }
+
+  override def prettyName: String = "round"


you can remove this, since the expression is already named Round

SparkQA · 2015-07-15T07:58:23Z

Test build #37326 has finished for PR 6938 at commit 07a124c.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- case class Round(child: Expression, scale: Expression)

yjshen mentioned this pull request Jun 23, 2015

[SPARK-8206][SQL][WIP]Add function round #6836

Closed

chenghao-intel reviewed Jun 23, 2015

yjshen changed the title ~~[SPARK-8279][SQL][WIP]Add math function round~~ [SPARK-8279][SQL]Add math function round Jun 23, 2015

yjshen added 18 commits July 14, 2015 15:13

Add decimal support to Round

56db4bb

more tests on round

7c83e13

add round functions in o.a.s.sql.functions

9be894e

refactor Round's constructor

6cd9a64

codegen versioned eval

2077888

DataFrame API modification

5486b2d

modify checkInputDataTypes using foldable

1b87540

refactor eval and genCode

e6f44c4

revert accidental change

9bd6930

make round's inner method's name more meaningful

b0bff79

rely on implict cast to handle string input

c3b9839

use TypeCollection to specify wanted input and implicit cast

d10be4a

tiny style fix

9555e35

rebase & inputTypes update

8c7a949

refactor round to make it readable

31dfe7c

Add dataframe function test

302a78a

address reviews

61760ee

add negative scale test in DecimalSuite

392b65b

yjshen force-pushed the udf_round_3 branch from 0a7208f to 392b65b Compare July 14, 2015 09:04

rxin reviewed Jul 15, 2015

remove useless def children

07a124c

rxin reviewed Jul 15, 2015

asfgit closed this in f0e1297 Jul 15, 2015

rxin reviewed Jul 15, 2015