New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-39319][CORE][SQL] Make query contexts as a part of SparkThrowable
#37209
Conversation
SparkThrowable
SparkThrowable
SparkThrowable
SparkThrowable
SparkThrowable
SparkThrowable
@gengliangwang @cloud-fan Could you review this PR, please. |
@MaxGekk Sorry I was focusing on other works today. I will review it tomorrow. Thanks for taking it over! |
def getMessage( | ||
errorClass: String, | ||
errorSubClass: String, | ||
messageParameters: Array[String], | ||
queryContext: String = ""): String = { | ||
queryContext: Option[QueryContext]): String = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we just need a string parameter here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created an interface which extends QueryContext
and has an additional method to get textual summary.
@@ -119,7 +119,7 @@ private[spark] class SparkArithmeticException( | |||
errorClass: String, | |||
errorSubClass: Option[String] = None, | |||
messageParameters: Array[String], | |||
queryContext: String = "") | |||
queryContext: Option[QueryContext] = None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should pass both QueryContext
and summary: String
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to pass one thing. I created some kind of enriched QueryContext which can give us textual summary.
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/trees/TreeNodeSuite.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/SqlQueryContext.scala
Outdated
Show resolved
Hide resolved
Co-authored-by: Gengliang Wang gengliang@apache.org
@gengliangwang @cloud-fan Could you review the PR, please. I addressed your comments. The test failures are not related to the changes, it seems. |
public interface QueryContext { | ||
// The object type of the query which throws the exception. | ||
// If the exception is directly from the main query, it should be an empty string. | ||
// Otherwise, it should be the exact object type in upper case. For example, a "VIEW". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not related to this PR, but we should use javadoc for public APIs, which will show up in our API doc. The same to SparkThrowable
. We can fix them all in a followup.
* @since 3.4.0 | ||
*/ | ||
@Evolving | ||
public interface QueryContextSummary extends QueryContext { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this need to be a public API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made it as a private one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we use private[spark]
? to make sure it won't show up in the API doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is Java. Does such syntax work here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's for internal use only, we can write it in Scala.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, I don't think we need this interface. Having SQLQueryContext is enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me try to remove it ...
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/SqlQueryContext.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
Outdated
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/SqlQueryContext.scala
Outdated
Show resolved
Hide resolved
The failures below:
are not related to the changes. @cloud-fan @gengliangwang Could you have a look at the PR, please. In the log, the failed because of:
which means OOM. Should we bump memory for |
TPC-Ds should be fixed now. For sql- other tests, we might need to retrigger it for now. |
@HyukjinKwon I have rebased on the recent master. Will see. Thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Merging to master. Thank you, @gengliangwang @cloud-fan for review. |
What changes were proposed in this pull request?
In the PR, I propose to add new interface
QueryContext
Spark core, and allow to get an instance ofQueryContext
from Spark's exceptions of the typeSparkThrowable
. For instance,QueryContext
should help users to figure out where an error occur while executing queries in Spark SQL.Also this PR adds
SqlQueryContext
as one of implementation ofQueryContext
to Spark SQLOrigin
which contains a context of TreeNodes + textual summary of the error. Thecontext
value inOrigin
will have all necessary structural info about the fragment of SQL query to which an error can be linked.All Spark's exceptions are modified to accept the optional
QueryContext
and pre-built text summary. Apparently, SQL expressions init and pass new context to exceptions.Closes #36702
Why are the changes needed?
In the future, this enriches the information of the error message. With the change, it is possible to have a new pretty printing format error message like
Does this PR introduce any user-facing change?
Yes. The PR changes Spark's exception by replacing the type of
queryContext
fromString
toOption[QueryContext]
. User's code can fail if it usesqueryContext
.How was this patch tested?
By running the modified test suites:
and affected test suites:
Authored-by: Max Gekk max.gekk@gmail.com
Co-authored-by: Gengliang Wang gengliang@apache.org