[SPARK-39319][CORE][SQL] Make query contexts as a part of SparkThrowable #37209

Closed
wants to merge 34 commits into from

MaxGekk
Member

@MaxGekk MaxGekk commented Jul 17, 2022

What changes were proposed in this pull request?

This PR adds a new interface, QueryContext, to Spark core and allows getting an instance of QueryContext from Spark exceptions of the type SparkThrowable. QueryContext should help users figure out where an error occurred while executing a query in Spark SQL.

The PR also adds SqlQueryContext, one implementation of QueryContext, to Spark SQL's Origin, which contains the context of TreeNodes plus a textual summary of the error. The context value in Origin has all the necessary structural info about the fragment of the SQL query to which an error can be linked.

All Spark exceptions are modified to accept an optional QueryContext and a pre-built text summary. Accordingly, SQL expressions initialize the new context and pass it to exceptions.
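
A rough sketch of the shape described above, written in Java since QueryContext is proposed as a Java interface. This is not the code from the PR: the accessor names simply mirror the JSON fields in the example in the next section, `HasQueryContext` stands in for SparkThrowable, and `SimpleContext`, `DivideByZeroError`, and `describe` are hypothetical names invented for illustration.

```java
import java.util.Optional;

public class QueryContextSketch {
  /** Hypothetical, simplified stand-in for the proposed QueryContext interface. */
  interface QueryContext {
    String objectType();   // "" for the main query, otherwise e.g. "VIEW"
    String objectName();
    int indexStart();
    int indexEnd();
    String fragment();
  }

  /** Hypothetical stand-in for an exception type that can expose a query context. */
  interface HasQueryContext {
    Optional<QueryContext> queryContext();
  }

  /** Immutable context describing a fragment of SQL text inside a named object. */
  static final class SimpleContext implements QueryContext {
    private final String objectType, objectName, fragment;
    private final int indexStart, indexEnd;

    SimpleContext(String objectType, String objectName,
                  int indexStart, int indexEnd, String fragment) {
      this.objectType = objectType;
      this.objectName = objectName;
      this.indexStart = indexStart;
      this.indexEnd = indexEnd;
      this.fragment = fragment;
    }

    @Override public String objectType() { return objectType; }
    @Override public String objectName() { return objectName; }
    @Override public int indexStart() { return indexStart; }
    @Override public int indexEnd() { return indexEnd; }
    @Override public String fragment() { return fragment; }
  }

  /** An exception that optionally carries the context of the failing fragment. */
  static final class DivideByZeroError extends ArithmeticException implements HasQueryContext {
    private final QueryContext context; // null when no context is available

    DivideByZeroError(String message, QueryContext context) {
      super(message);
      this.context = context;
    }

    @Override public Optional<QueryContext> queryContext() {
      return Optional.ofNullable(context);
    }
  }

  /** Renders a one-line location hint, or "" when no context is attached. */
  static String describe(HasQueryContext error) {
    return error.queryContext()
        .map(c -> c.objectType() + " " + c.objectName()
            + " [" + c.indexStart() + ", " + c.indexEnd() + "]: " + c.fragment())
        .orElse("");
  }

  public static void main(String[] args) {
    QueryContext ctx = new SimpleContext("VIEW", "default.v1", 36, 41, "1 / 0");
    System.out.println(describe(new DivideByZeroError("Division by zero", ctx)));
  }
}
```

The point of the structured context is that callers can read the object name, offsets, and fragment directly instead of parsing them out of a message string.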

Closes #36702

Why are the changes needed?

In the future, this will enrich the information in error messages. With the change, it becomes possible to have a new pretty-printed error message format like:

> SELECT * FROM v1;

{
  "errorClass" : [ "DIVIDE_BY_ZERO" ],
  "parameters" : [
    {
      "name" : "config",
      "value" : "spark.sql.ansi.enabled"
    }
  ],
  "sqlState" : "42000",
  "context" : {
    "objectType" : "VIEW",
    "objectName" : "default.v1",
    "indexStart" : 36,
    "indexEnd" : 41,
    "fragment" : "1 / 0"
  }
}

Does this PR introduce any user-facing change?

Yes. The PR changes Spark's exceptions by replacing the type of queryContext from String to Option[QueryContext]. User code that uses queryContext can fail.
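
To illustrate why caller code can break: code written against a plain string context does not fit an optional structured one without changes. The sketch below is an illustration under assumptions, not Spark's API. It uses Java's `Optional` as a stand-in for Scala's `Option`, `QueryContext` here is a one-method placeholder, and `appendOld` and `appendNew` are hypothetical helpers.

```java
import java.util.Optional;

public class QueryContextMigrationSketch {
  /** Placeholder for the structured context; only the fragment is modeled here. */
  interface QueryContext {
    String fragment();
  }

  /** Before: the context is a pre-rendered string, empty when absent. */
  static String appendOld(String message, String queryContext) {
    return queryContext.isEmpty() ? message : message + "\n" + queryContext;
  }

  /** After: the context is optional and structured, so absence is explicit. */
  static String appendNew(String message, Optional<QueryContext> queryContext) {
    return queryContext
        .map(c -> message + "\n" + c.fragment())
        .orElse(message);
  }

  public static void main(String[] args) {
    QueryContext ctx = () -> "1 / 0";
    System.out.println(appendOld("Division by zero", "1 / 0"));
    System.out.println(appendNew("Division by zero", Optional.of(ctx)));
    System.out.println(appendNew("Division by zero", Optional.empty()));
  }
}
```

A caller that concatenated or compared the old string value has to be rewritten in the style of `appendNew`, which is the user-facing break mentioned above.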

How was this patch tested?

By running the modified test suites:

$ build/sbt "test:testOnly *DecimalExpressionSuite"
$ build/sbt "test:testOnly *TreeNodeSuite"

and affected test suites:

$ build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite"

Authored-by: Max Gekk <max.gekk@gmail.com>
Co-authored-by: Gengliang Wang <gengliang@apache.org>

@MaxGekk MaxGekk changed the title [WIP][SQL] Make query contexts as a part of SparkThrowable [WIP][SPARK-39319][SQL] Make query contexts as a part of SparkThrowable Jul 19, 2022
@MaxGekk MaxGekk changed the title [WIP][SPARK-39319][SQL] Make query contexts as a part of SparkThrowable [WIP][SPARK-39319][CORE][SQL] Make query contexts as a part of SparkThrowable Jul 19, 2022
@MaxGekk MaxGekk changed the title [WIP][SPARK-39319][CORE][SQL] Make query contexts as a part of SparkThrowable [SPARK-39319][CORE][SQL] Make query contexts as a part of SparkThrowable Jul 19, 2022
@MaxGekk MaxGekk marked this pull request as ready for review July 19, 2022 15:37
@MaxGekk
Member Author

MaxGekk commented Jul 19, 2022

@gengliangwang @cloud-fan Could you review this PR, please?

@gengliangwang
Member

@MaxGekk Sorry, I was focusing on other work today. I will review it tomorrow. Thanks for taking it over!

   def getMessage(
       errorClass: String,
       errorSubClass: String,
       messageParameters: Array[String],
-      queryContext: String = ""): String = {
+      queryContext: Option[QueryContext]): String = {
Contributor

I think we just need a string parameter here.

Member Author

I created an interface which extends QueryContext and has an additional method to get a textual summary.

@@ -119,7 +119,7 @@ private[spark] class SparkArithmeticException(
     errorClass: String,
     errorSubClass: Option[String] = None,
     messageParameters: Array[String],
-    queryContext: String = "")
+    queryContext: Option[QueryContext] = None)
Contributor

We should pass both QueryContext and summary: String

Member Author

I think it would be better to pass one thing. I created some kind of enriched QueryContext which can give us a textual summary.
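
One way to read "pass one thing" is an interface that extends the context and derives the summary itself, so an exception or message builder needs a single parameter. The sketch below is only an assumption about that design, not the interface from this PR: `QueryContextWithSummary`, `context`, and `getMessage` are hypothetical names, `QueryContext` is cut down to three accessors, and the summary wording is made up.

```java
public class QueryContextSummarySketch {
  /** Cut-down placeholder for the structured context. */
  interface QueryContext {
    String objectType();   // "" for the main query
    String objectName();
    String fragment();
  }

  /** Hypothetical enriched context: the summary is derived from the structured fields. */
  interface QueryContextWithSummary extends QueryContext {
    default String summary() {
      String where = objectType().isEmpty()
          ? "main query"
          : objectType() + " " + objectName();
      return "Error in " + where + ": " + fragment();
    }
  }

  /** Builds an enriched context from plain values. */
  static QueryContextWithSummary context(String objectType, String objectName, String fragment) {
    return new QueryContextWithSummary() {
      @Override public String objectType() { return objectType; }
      @Override public String objectName() { return objectName; }
      @Override public String fragment() { return fragment; }
    };
  }

  /** Takes one context argument instead of a (context, summary) pair; null means no context. */
  static String getMessage(String text, QueryContextWithSummary context) {
    return context == null ? text : text + "\n" + context.summary();
  }

  public static void main(String[] args) {
    System.out.println(getMessage("Division by zero", context("VIEW", "default.v1", "1 / 0")));
    System.out.println(getMessage("Division by zero", null));
  }
}
```

The trade-off is the one raised in the review: a single argument keeps signatures small, but it introduces an extra interface that then has to be kept out of the public API.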

@MaxGekk
Member Author

MaxGekk commented Jul 23, 2022

@gengliangwang @cloud-fan Could you review the PR, please? I addressed your comments. The test failures do not seem to be related to the changes.

public interface QueryContext {
  // The object type of the query which throws the exception.
  // If the exception is directly from the main query, it should be an empty string.
  // Otherwise, it should be the exact object type in upper case. For example, a "VIEW".
Contributor

Not related to this PR, but we should use Javadoc for public APIs so that it shows up in our API doc. The same goes for SparkThrowable. We can fix them all in a followup.

 * @since 3.4.0
 */
@Evolving
public interface QueryContextSummary extends QueryContext {
Contributor

Does this need to be a public API?

Member Author

Made it a private one.

Contributor

Shall we use private[spark] to make sure it won't show up in the API doc?

Member Author

This is Java. Does such syntax work here?

Contributor

Since it's for internal use only, we can write it in Scala.

Member

+1, I don't think we need this interface. Having SQLQueryContext is enough.

Member Author

Let me try to remove it ...

@MaxGekk
Member Author

MaxGekk commented Jul 26, 2022

The failures below:

  • Build modules: sql - other tests
  • Run TPC-DS queries with SF=1

are not related to the changes. @cloud-fan @gengliangwang Could you have a look at the PR, please?

In the log, they failed because of:

failed with exit code 137

which means the process was killed with SIGKILL (137 = 128 + 9), most likely because of OOM. Should we bump memory for Build modules: sql - other tests? WDYT, @HyukjinKwon?

@HyukjinKwon
Member

TPC-DS should be fixed now. For sql - other tests, we might need to retrigger it for now.

@MaxGekk
Member Author

MaxGekk commented Jul 26, 2022

TPC-DS should be fixed now. For sql - other tests, we might need to retrigger it for now.

@HyukjinKwon I have rebased on the recent master. Will see. Thank you.

Member

@gengliangwang gengliangwang left a comment

LGTM.

@MaxGekk
Member Author

MaxGekk commented Jul 26, 2022

Merging to master. Thank you, @gengliangwang @cloud-fan, for the review.

@MaxGekk MaxGekk closed this in e9eb28e Jul 26, 2022
4 participants