Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-29237][SQL] Prevent real function names in expression example template #25924

Closed
wants to merge 8 commits into from
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
Expand Up @@ -56,6 +56,8 @@ public String getArguments() {
return arguments;
}

public String getOriginalExamples() { return examples; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that this is worth adding just for testing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for @srowen 's opinion.

@MaxGekk . Instead of adding this API, shall we use the following format in the test case?

val examplesField = classOf[ExpressionInfo].getDeclaredField("examples")
examplesField.setAccessible(true)
val examples = examplesField.get(e).asInstanceOf[String]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or can we use @VisibleForTesting at least?


public String getExamples() {
return replaceFunctionName(examples);
}
Expand Down
Expand Up @@ -450,7 +450,7 @@ case class IntegralDivide(left: Expression, right: Expression) extends DivModLik
Examples:
> SELECT 2 _FUNC_ 1.8;
0.2
> SELECT MOD(2, 1.8);
> SELECT _FUNC_(2, 1.8);
MaxGekk marked this conversation as resolved.
Show resolved Hide resolved
0.2
""")
case class Remainder(left: Expression, right: Expression) extends DivModLike {
Expand Down
Expand Up @@ -1097,7 +1097,7 @@ case class TimeAdd(start: Expression, interval: Expression, timeZoneId: Option[S
usage = "_FUNC_(timestamp, timezone) - Given a timestamp like '2017-07-14 02:40:00.0', interprets it as a time in UTC, and renders that time as a timestamp in the given time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.",
examples = """
Examples:
> SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul');
> SELECT _FUNC_('2016-08-31', 'Asia/Seoul');
2016-08-31 09:00:00
""",
since = "1.5.0",
Expand Down
Expand Up @@ -1123,7 +1123,7 @@ case class SubstringIndex(strExpr: Expression, delimExpr: Expression, countExpr:
4
> SELECT _FUNC_('bar', 'foobarbar', 5);
7
> SELECT POSITION('bar' IN 'foobarbar');
> SELECT _FUNC_('bar' IN 'foobarbar');
wangyum marked this conversation as resolved.
Show resolved Hide resolved
4
""",
since = "1.5.0")
Expand Down Expand Up @@ -1732,9 +1732,9 @@ case class Left(str: Expression, len: Expression, child: Expression) extends Run
Examples:
> SELECT _FUNC_('Spark SQL ');
10
> SELECT CHAR_LENGTH('Spark SQL ');
> SELECT _FUNC_('Spark SQL ');
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better show examples for binary data instead of demonstrating synonyms of the function name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it would be nice to show to users all synonyms in a separate list somewhere in the description

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. +1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

10
> SELECT CHARACTER_LENGTH('Spark SQL ');
> SELECT _FUNC_('Spark SQL ');
10
""",
since = "1.5.0")
Expand Down
15 changes: 15 additions & 0 deletions sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
Expand Up @@ -115,6 +115,21 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession {
}
}

test("using _FUNC_ instead of function names in examples") {
val exampleRe = "(>.*;)".r
val ignoreSet = Set("org.apache.spark.sql.catalyst.expressions.CaseWhen")
spark.sessionState.functionRegistry.listFunction().foreach { funcId =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @MaxGekk . This is super useful.

val info = spark.sessionState.catalog.lookupFunctionInfo(funcId)
val className = info.getClassName
withClue(s"Expression class '$className'") {
val exprExamples = info.getOriginalExamples
if (!exprExamples.isEmpty && !ignoreSet.contains(className)) {
assert(exampleRe.findAllIn(exprExamples).toSet.forall(_.contains("_FUNC_")))
}
}
}
}

test("SPARK-6743: no columns from cache") {
Seq(
(83, 0, 38),
Expand Down