# [SPARK-25609][TESTS] Reduce time of test for SPARK-22226
## What changes were proposed in this pull request?

This PR changes the test introduced for SPARK-22226 so that analysis and optimization are not run on the plan. The scope of the test is code generation, and running analysis and optimization is expensive and unnecessary for that purpose.

The UT was also moved to `CodeGenerationSuite`, which is a better place given the scope of the test.
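
For context, the relocated test builds the Catalyst expressions directly and only compiles the projection, so neither the analyzer nor the optimizer runs. A minimal sketch of the idea, mirroring the diff below (it assumes the Catalyst test DSL import, which provides the `<` operator and the `sqrt` helper used here):

```scala
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.types.IntegerType

// 10000 int columns yield 20000 projection expressions; generating code
// for a projection this wide forces splitExpressions to spread the work
// across many methods so that none exceeds the JVM's 64KB method limit.
val attrs = (1 to 10000).map(i => AttributeReference(s"_$i", IntegerType)())
val lit = Literal(1000)
val exprs = attrs.flatMap(a => Seq(If(lit < a, lit, a), sqrt(a)))

// Compiling the projection is the whole test: it throws if codegen
// produces an oversized method. No DataFrame, no analysis, no optimizer.
UnsafeProjection.create(exprs, attrs)
```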

## How was this patch tested?

Running the UT on code without the SPARK-22226 fix fails; with the fix in place it passes. The execution time is about 50% of the original: on my laptop the test now runs in about 23 seconds instead of 50.

Closes #22629 from mgaido91/SPARK-25609.

Authored-by: Marco Gaido <marcogaido91@gmail.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
mgaido91 authored and gatorsmile committed Oct 5, 2018
1 parent 3ae4f07 commit 85a9359
Showing 2 changed files with 10 additions and 12 deletions.
10 changes: 10 additions & 0 deletions sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerationSuite.scala

@@ -346,6 +346,16 @@ class CodeGenerationSuite extends SparkFunSuite with ExpressionEvalHelper {
     projection(row)
   }
 
+  test("SPARK-22226: splitExpressions should not generate codes beyond 64KB") {
+    val colNumber = 10000
+    val attrs = (1 to colNumber).map(colIndex => AttributeReference(s"_$colIndex", IntegerType)())
+    val lit = Literal(1000)
+    val exprs = attrs.flatMap { a =>
+      Seq(If(lit < a, lit, a), sqrt(a))
+    }
+    UnsafeProjection.create(exprs, attrs)
+  }
+
   test("SPARK-22543: split large predicates into blocks due to JVM code size limit") {
     val length = 600
 
12 changes: 0 additions & 12 deletions sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala

@@ -2408,18 +2408,6 @@ class DataFrameSuite extends QueryTest with SharedSQLContext {
       Seq(Row(7, 1, 1), Row(7, 1, 2), Row(7, 2, 1), Row(7, 2, 2), Row(7, 3, 1), Row(7, 3, 2)))
   }
 
-  test("SPARK-22226: splitExpressions should not generate codes beyond 64KB") {
-    val colNumber = 10000
-    val input = spark.range(2).rdd.map(_ => Row(1 to colNumber: _*))
-    val df = sqlContext.createDataFrame(input, StructType(
-      (1 to colNumber).map(colIndex => StructField(s"_$colIndex", IntegerType, false))))
-    val newCols = (1 to colNumber).flatMap { colIndex =>
-      Seq(expr(s"if(1000 < _$colIndex, 1000, _$colIndex)"),
-        expr(s"sqrt(_$colIndex)"))
-    }
-    df.select(newCols: _*).collect()
-  }
-
   test("SPARK-22271: mean overflows and returns null for some decimal variables") {
     val d = 0.034567890
     val df = Seq(d, d, d, d, d, d, d, d, d, d).toDF("DecimalCol")
