
refine UT framework to promote GPU evaluation, Part 1 #10851

Closed
@@ -1485,6 +1485,13 @@ val GPU_COREDUMP_PIPE_PATTERN = conf("spark.rapids.gpu.coreDump.pipePattern")
.stringConf
.createWithDefault(false.toString)

val FOLDABLE_NON_LIT_ALLOWED = conf("spark.rapids.sql.test.isFoldableNonLitAllowed")
Collaborator:

I'm not sure we need this at all. The foldable check was put in place a long time ago, when each expression had to have explicit support for scalar values. We refactored the code a while ago and removed that. The fact that it works for testing means the refactor was correct, and I think we can just delete the check entirely.
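For context, a "foldable non-literal" is an expression that Catalyst can constant-fold at planning time but that is not itself a Literal node. A minimal sketch against Spark's Catalyst API (the expression classes are Spark's own; the demo object is hypothetical):

```scala
import org.apache.spark.sql.catalyst.expressions.{Concat, Expression, Literal}

object FoldableNonLitDemo {
  def main(args: Array[String]): Unit = {
    // A Literal is trivially foldable.
    val lit: Expression = Literal("a")

    // Concat of two literals is foldable (it can be constant-folded at plan
    // time), but the node itself is not a Literal -- this is the
    // "foldable non-literal" case the old check guarded against.
    val foldableNonLit: Expression = Concat(Seq(Literal("a"), Literal("b")))

    assert(lit.foldable)
    assert(foldableNonLit.foldable)
    assert(!foldableNonLit.isInstanceOf[Literal])
  }
}
```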

Collaborator Author:


Should I do it in this PR or in a separate one?

Collaborator:


I am fine with doing it later as this is an internal config, and it might be a large-ish change. But at the same time I don't want to forget about it.

Collaborator Author:


This issue will be tracked in #10887.

.doc("Only to be used in tests. If `true` the foldable expressions that are not literals " +
"will be allowed")
.internal()
.booleanConf
.createWithDefault(false)

val TEST_CONF = conf("spark.rapids.sql.test.enabled")
.doc("Intended to be used by unit tests, if enabled all operations must run on the " +
"GPU or an error happens.")
@@ -2428,6 +2435,8 @@ class RapidsConf(conf: Map[String, String]) extends Logging {

lazy val isTestEnabled: Boolean = get(TEST_CONF)

lazy val isFoldableNonLitAllowed: Boolean = get(FOLDABLE_NON_LIT_ALLOWED)

/**
* Convert a string value to the injection configuration OomInjection.
*
@@ -1108,7 +1108,7 @@ abstract class BaseExprMeta[INPUT <: Expression](
case _ => ExpressionContext.getRegularOperatorContext(this)
}

val isFoldableNonLitAllowed: Boolean = false
val isFoldableNonLitAllowed: Boolean = conf.isFoldableNonLitAllowed

// There are 4 levels of timezone check in GPU plan tag phase:
// Level 1: Check whether an expression is related to timezone. This is achieved by
@@ -0,0 +1,26 @@
/*
* Copyright (c) 2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

/*** spark-rapids-shim-json-lines
{"spark": "330"}
spark-rapids-shim-json-lines ***/
package org.apache.spark.sql.rapids.suites

import org.apache.spark.sql.catalyst.expressions.JsonExpressionsSuite
import org.apache.spark.sql.rapids.utils.{RapidsJsonConfTrait, RapidsTestsTrait}

class RapidsJsonExpressionsSuite
extends JsonExpressionsSuite with RapidsTestsTrait with RapidsJsonConfTrait {}
@@ -20,6 +20,7 @@ spark-rapids-shim-json-lines ***/
package org.apache.spark.sql.rapids.suites

import org.apache.spark.sql.JsonFunctionsSuite
import org.apache.spark.sql.rapids.utils.RapidsSQLTestsTrait
import org.apache.spark.sql.rapids.utils.{RapidsJsonConfTrait, RapidsSQLTestsTrait}

class RapidsJsonFunctionsSuite extends JsonFunctionsSuite with RapidsSQLTestsTrait {}
class RapidsJsonFunctionsSuite
extends JsonFunctionsSuite with RapidsSQLTestsTrait with RapidsJsonConfTrait {}
@@ -24,13 +24,13 @@ import org.apache.spark.sql.execution.datasources.{InMemoryFileIndex, NoopCache}
import org.apache.spark.sql.execution.datasources.json.JsonSuite
import org.apache.spark.sql.execution.datasources.v2.json.JsonScanBuilder
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.rapids.utils.RapidsSQLTestsBaseTrait
import org.apache.spark.sql.rapids.utils.{RapidsJsonConfTrait, RapidsSQLTestsBaseTrait}
import org.apache.spark.sql.sources
import org.apache.spark.sql.types.{IntegerType, StructType}
import org.apache.spark.sql.util.CaseInsensitiveStringMap

class RapidsJsonSuite extends JsonSuite with RapidsSQLTestsBaseTrait {

class RapidsJsonSuite
extends JsonSuite with RapidsSQLTestsBaseTrait with RapidsJsonConfTrait {
/** Returns full path to the given file in the resource folder */
override protected def testFile(fileName: String): String = {
getWorkspaceFilePath("sql", "core", "src", "test", "resources").toString + "/" + fileName
@@ -0,0 +1,42 @@
/*
* Copyright (c) 2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

/*** spark-rapids-shim-json-lines
{"spark": "330"}
spark-rapids-shim-json-lines ***/
package org.apache.spark.sql.rapids.utils

import org.scalatest.{BeforeAndAfterAll, Suite}

import org.apache.spark.sql.internal.SQLConf

trait RapidsJsonConfTrait extends BeforeAndAfterAll { this: Suite =>
override def beforeAll(): Unit = {
super.beforeAll()
SQLConf.get.setConfString("spark.rapids.sql.expression.JsonTuple", true.toString)
SQLConf.get.setConfString("spark.rapids.sql.expression.GetJsonObject", true.toString)
SQLConf.get.setConfString("spark.rapids.sql.expression.JsonToStructs", true.toString)
SQLConf.get.setConfString("spark.rapids.sql.expression.StructsToJson", true.toString)
}

override def afterAll(): Unit = {
SQLConf.get.unsetConf("spark.rapids.sql.expression.JsonTuple")
SQLConf.get.unsetConf("spark.rapids.sql.expression.GetJsonObject")
SQLConf.get.unsetConf("spark.rapids.sql.expression.JsonToStructs")
SQLConf.get.unsetConf("spark.rapids.sql.expression.StructsToJson")
super.afterAll()
}
Comment on lines +35 to +41
Collaborator:


This is not how I thought my previous comment would be addressed. Shouldn't beforeAll and afterAll largely share the same implementation as the other unit tests based on SparkQueryCompareTestSuite, especially the cleanup afterward?

override def afterAll(): Unit = {
  super.afterAll()
  TrampolineUtil.cleanupAnyExistingSession()
}

And we should just make sure that we reuse it for the new UTs.

Collaborator Author (@binmahone), May 24, 2024:

RapidsJsonConfTrait is an extra trait that injects more JSON-related configs; it will still call the beforeAll/afterAll methods of its parent traits (because of trait linearization).
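The linearization behavior described here can be sketched with plain Scala stackable traits (all names below are hypothetical): each override calls super, so a mixed-in trait adds to, rather than replaces, its parents' setup.

```scala
trait BaseSuiteLike {
  def beforeAll(): Unit = println("parent setup (e.g. start SparkSession)")
}

// Stackable trait: super.beforeAll() resolves along the linearization order,
// so the parent's setup always runs before the extra configs are injected.
trait ExtraJsonConf extends BaseSuiteLike {
  override def beforeAll(): Unit = {
    super.beforeAll()
    println("inject JSON-related configs")
  }
}

object LinearizationDemo extends ExtraJsonConf {
  def main(args: Array[String]): Unit = beforeAll()
  // prints:
  //   parent setup (e.g. start SparkSession)
  //   inject JSON-related configs
}
```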

Collaborator:


I am talking about the fact that we already have code that cleans up the SparkSession via TrampolineUtil.cleanupAnyExistingSession(). Instead of adding special logic for this one test, we could reuse the more general cleanup code.

Collaborator Author:


As discussed with Gera offline, we'll use #10886 to track this.
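For reference, the reviewer's suggestion amounts to something like the following sketch (the trait name is hypothetical; TrampolineUtil.cleanupAnyExistingSession() is the existing general cleanup quoted above, and its import path is project-specific):

```scala
import org.scalatest.{BeforeAndAfterAll, Suite}

// Hypothetical trait: reuse the general session cleanup instead of
// suite-specific unsetConf calls in afterAll.
// (import of TrampolineUtil omitted; its package lives in this project)
trait RapidsSessionCleanup extends BeforeAndAfterAll { this: Suite =>
  override def afterAll(): Unit = {
    super.afterAll()
    // Drops any SparkSession left behind, and its conf along with it.
    TrampolineUtil.cleanupAnyExistingSession()
  }
}
```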

}
@@ -121,6 +121,12 @@ object RapidsSQLTestsBaseTrait {
"org.apache.spark.sql.rapids.ExecutionPlanCaptureCallback")
.set("spark.sql.warehouse.dir", warehouse)
.set("spark.sql.cache.serializer", "com.nvidia.spark.ParquetCachedBatchSerializer")
.set("spark.sql.session.timeZone", "UTC")
Member:


Do we really want to force UTC in the tests? We have partial support for timezones, and it's likely more will be added later. Typically what we've done in the past is leave it up to the environment setup to specify the timezone to be tested (e.g., CI scripts set the TZ environment variable or other JVM settings) rather than have the tests smash the timezone directly. That way the tests can be run across multiple timezones by changing the environment before running them.

This comment applies to a couple of other places below where the timezone is being smashed to UTC.

Collaborator Author:


The consideration for spark.sql.session.timeZone is explained in a previous comment.

Collaborator Author:


@GaryShen2008 told me that we are already suggesting customers set the timezone to UTC to facilitate GPU execution, so at least we're not testing a fictitious scenario.

Collaborator:


We should enable non-UTC test cases later, but as a first step, setting UTC here lets us test the fixed configuration we support now. It helps us detect failures in the UTC case. Without this UTC setting, test cases may pass only because of a fallback to the CPU; currently we haven't been able to detect wrong fallbacks in UTs.

Member:


I'm not sure why we're not setting UTC in the environment of the test setup, which would let us reuse these tests. If we're going to leave this as-is, there needs to be a TODO comment linking to a tracking issue to fix this.

Collaborator Author (@binmahone), May 23, 2024:


I opened a TODO issue for this (#10874) and will add it to the code comments.

.set("spark.rapids.sql.explain", "ALL")
Member:


This will make the tests very verbose. Is this intentional, or debug code accidentally left in?

Collaborator Author:


Both are intentional.

  • For "spark.sql.session.timeZone": we found that many expressions won't take the GPU path unless the timezone is set to UTC. Since we want to run as many test cases as possible on the GPU, we'd better use the UTC timezone.

  • For "spark.rapids.sql.explain": since tests/src/test/resources/log4j2.properties already sets the console appender's log level to error by default, we won't see verbose logs even if spark.rapids.sql.explain is set to ALL here. The benefit of setting it to ALL is that when we want to check and debug test cases, we only need to change log4j2.properties, instead of changing both log4j2.properties and adding .set("spark.rapids.sql.explain", "ALL") in code.

Member:


"Since we want to test as many test cases as possible in GPU, we'd better use UTC timezone"

I'm not questioning that we need to control the timezone for these tests; the discussion is about how to control it. Normally we don't want hardcoding like this, because then the tests cannot check whether we're doing the right thing in any timezone other than UTC, and, as I said, we do have some timezone support with more on the way. Hardcoding this precludes using these tests to verify any other timezone. Having the test environment setup script control the timezone makes the tests reusable.
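The environment-driven approach being advocated can be sketched like this (the helper object is hypothetical; only the TZ environment variable comes from the discussion above):

```scala
import java.util.TimeZone

object TestTimeZoneSetup {
  // Resolve the timezone from the environment, defaulting to UTC, so CI can
  // re-run the same suites under different timezones without code changes.
  def resolve(): String = sys.env.getOrElse("TZ", "UTC")

  def main(args: Array[String]): Unit = {
    val tz = resolve()
    TimeZone.setDefault(TimeZone.getTimeZone(tz))
    // The test harness SparkConf would then use the resolved value:
    //   .set("spark.sql.session.timeZone", tz)
    println(s"running tests under timezone: $tz")
  }
}
```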

// uncomment below config to run `strict mode`, where fallback to CPU is treated as fail
// .set("spark.rapids.sql.test.enabled", "true")
// .set("spark.rapids.sql.test.allowedNonGpu",
// "SerializeFromObjectExec,DeserializeToObjectExec,ExternalRDDScanExec")
.setAppName("rapids spark plugin running Vanilla Spark UT")

conf
@@ -19,7 +19,7 @@
spark-rapids-shim-json-lines ***/
package org.apache.spark.sql.rapids.utils

import org.apache.spark.sql.rapids.suites.{RapidsCastSuite, RapidsDataFrameAggregateSuite, RapidsJsonFunctionsSuite, RapidsJsonSuite, RapidsMathFunctionsSuite, RapidsRegexpExpressionsSuite, RapidsStringExpressionsSuite, RapidsStringFunctionsSuite}
import org.apache.spark.sql.rapids.suites.{RapidsCastSuite, RapidsDataFrameAggregateSuite, RapidsJsonExpressionsSuite, RapidsJsonFunctionsSuite, RapidsJsonSuite, RapidsMathFunctionsSuite, RapidsRegexpExpressionsSuite, RapidsStringExpressionsSuite, RapidsStringFunctionsSuite}

// Some settings' line length exceeds 100
// scalastyle:off line.size.limit
@@ -41,7 +41,28 @@ class RapidsTestSettings extends BackendTestSettings {
.exclude("SPARK-17641: collect functions should not collect null values", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10772"))
.exclude("SPARK-19471: AggregationIterator does not initialize the generated result projection before using it", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10772"))
.exclude("SPARK-24788: RelationalGroupedDataset.toString with unresolved exprs should not fail", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10801"))
enableSuite[RapidsJsonExpressionsSuite]
.exclude("from_json - invalid data", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("from_json - input=empty array, schema=struct, output=single row with null", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("from_json - input=empty object, schema=struct, output=single row with null", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("SPARK-20549: from_json bad UTF-8", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("from_json with timestamp", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("to_json - array", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("to_json - array with single empty row", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("to_json - empty array", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("to_json with timestamp", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("SPARK-21513: to_json support map[string, struct] to json", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("SPARK-21513: to_json support map[struct, struct] to json", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("SPARK-21513: to_json support map[string, integer] to json", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("to_json - array with maps", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("to_json - array with single map", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
.exclude("from_json missing fields", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10849"))
enableSuite[RapidsJsonFunctionsSuite]
.exclude("from_json invalid json", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10852"))
.exclude("to_json - array", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10852"))
.exclude("to_json - map", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10852"))
.exclude("to_json - array of primitive types", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10852"))
.exclude("SPARK-33134: return partial results only for root JSON objects", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10852"))
enableSuite[RapidsJsonSuite]
.exclude("Casting long as timestamp", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10773"))
.exclude("Write timestamps correctly with timestampFormat option and timeZone option", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10773"))
@@ -58,24 +79,14 @@ class RapidsTestSettings extends BackendTestSettings {
.exclude("SPARK-37360: Timestamp type inference for a mix of TIMESTAMP_NTZ and TIMESTAMP_LTZ", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10773"))
enableSuite[RapidsMathFunctionsSuite]
enableSuite[RapidsRegexpExpressionsSuite]
.exclude("RegexReplace", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10774"))
.exclude("RegexExtract", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10774"))
.exclude("RegexExtractAll", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10774"))
.exclude("SPLIT", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10774"))
enableSuite[RapidsStringExpressionsSuite]
.exclude("concat", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("string substring_index function", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("format_number / FormatNumber", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("SPARK-22498: Concat should not generate codes beyond 64KB", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("SPARK-22549: ConcatWs should not generate codes beyond 64KB", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("SPARK-22550: Elt should not generate codes beyond 64KB", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("StringComparison", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("Substring", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("ascii for string", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("base64/unbase64 for string", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("encode/decode for string", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("SPARK-22603: FormatString should not generate codes beyond 64KB", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("LOCATE", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("LPAD/RPAD", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("REPEAT", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("length for string / binary", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
.exclude("ParseUrl", KNOWN_ISSUE("https://github.com/NVIDIA/spark-rapids/issues/10775"))
enableSuite[RapidsStringFunctionsSuite]
}