Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-33600][SQL] Group exception messages in execution/datasources/v2 #31619

Closed
wants to merge 15 commits into from

Conversation

karenfeng
Copy link
Contributor

What changes were proposed in this pull request?

This PR groups exception messages in execution/datasources/v2.

Why are the changes needed?

It will largely help with standardization of error messages and its maintenance.

Does this PR introduce any user-facing change?

No. Error messages remain unchanged.

How was this patch tested?

No new tests - pass all original tests to make sure it doesn't break any existing behavior.

Signed-off-by: Karen Feng <karen.feng@databricks.com>
@github-actions github-actions bot added the SQL label Feb 23, 2021
@SparkQA
Copy link

SparkQA commented Feb 23, 2021

Test build #135368 has finished for PR 31619 at commit edb529c.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Signed-off-by: Karen Feng <karen.feng@databricks.com>
@SparkQA
Copy link

SparkQA commented Feb 23, 2021

Test build #135370 has finished for PR 31619 at commit d004f5c.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@beliefer beliefer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There exists most Exception occusr in execution stage, so we should put them into QueryExecutionErros.

@@ -125,7 +126,7 @@ class InMemoryTableSessionCatalog extends TestV2SessionCatalogBase[InMemoryTable

newTable
case _ =>
throw new NoSuchTableException(ident)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -83,7 +84,7 @@ private class PartitionIterator[T](reader: PartitionReader[T]) extends Iterator[

override def next(): T = {
if (!hasNext) {
throw new java.util.NoSuchElementException("End of stream")
throw QueryExecutionErrors.endOfStreamError()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NoSuchElementException is generated during execute and should be placed in QueryExecutionErrors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is already located in QueryExecutionErrors.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is already located in QueryExecutionErrors.

Oh. I see. But fix the other comments please.

}
} else if (!ifExists) {
throw new NoSuchNamespaceException(ns)
throw QueryCompilationErrors.noSuchNamespaceError(ns)
Copy link
Contributor

@beliefer beliefer Feb 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NoSuchNamespaceException is generated during execute and should be placed in QueryExecutionErrors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in the separate comment, I grouped these into QueryCompilationErrors as they are a type of AnalysisException. Can you clarify?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could look the description of root ticket.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read the root ticket correctly, all AnalysisExceptions should end up in QueryCompilationErrors:

A general rule of thumb for grouping exceptions:

AnalysisException => QueryCompilationErrors
SparkException, RuntimeException(UnsupportedOperationException, IllegalStateException...) => QueryExecutionErrors
ParseException => QueryParsingErrors

Do you believe this rule of thumb does not apply here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. But NoSuchNamespaceException occurs in execution stage. Although it extends AnalysisException.
cc @cloud-fan How about your suggestion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically for exceptions that occur in the execution stage, i.e SparkPlan.execute, we want to treat these exceptions as query execution errors. But for commands, since they are executed eagerly, users can immediately see the errors after submitting the commands. It makes more sense to group them as query compilation errors. What do you think @beliefer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got it. Thanks!

@@ -37,7 +37,7 @@ case class DropTableExec(
invalidateCache()
if (purge) catalog.purgeTable(ident) else catalog.dropTable(ident)
} else if (!ifExists) {
throw new NoSuchTableException(ident)
throw QueryCompilationErrors.noSuchTableError(ident)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -74,10 +75,10 @@ case class AtomicReplaceTableExec(
identifier, tableSchema, partitioning.toArray, tableProperties.asJava)
} catch {
case e: NoSuchTableException =>
throw new CannotReplaceMissingTableException(identifier, Some(e))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -278,7 +277,7 @@ class V2SessionCatalog(catalog: SessionCatalog)
false

case _ =>
throw new NoSuchNamespaceException(namespace)
throw QueryCompilationErrors.noSuchNamespaceError(namespace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -74,7 +74,7 @@ case class CreateTableAsSelectExec(
return Nil
}

throw new TableAlreadyExistsException(ident)
throw QueryCompilationErrors.tableAlreadyExistsError(ident)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -200,10 +200,10 @@ case class AtomicReplaceTableAsSelectExec(
ident, schema, partitioning.toArray, properties.asJava)
} catch {
case e: NoSuchTableException =>
throw new CannotReplaceMissingTableException(ident, Some(e))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -153,7 +153,7 @@ case class ReplaceTableAsSelectExec(
invalidateCache(catalog, table, ident)
catalog.dropTable(ident)
} else if (!orCreate) {
throw new CannotReplaceMissingTableException(ident)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -109,7 +109,7 @@ case class AtomicCreateTableAsSelectExec(
return Nil
}

throw new TableAlreadyExistsException(ident)
throw QueryCompilationErrors.tableAlreadyExistsError(ident)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

…3600

Signed-off-by: Karen Feng <karen.feng@databricks.com>
@karenfeng
Copy link
Contributor Author

There exists most Exception occusr in execution stage, so we should put them into QueryExecutionErros.

Thanks for taking a look @beliefer! My impression for this grouping work is that the QueryCompilationErrors file should contain all AnalysisExceptions, regardless of where the exception was thrown. The errors that I grouped into QueryCompilationErrors wrapped AnalysisException, eg. NoSuchNamespaceException. Could you clarify why you think these exceptions should go into QueryExecutionErrors?

@SparkQA
Copy link

SparkQA commented Mar 1, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40190/

@SparkQA
Copy link

SparkQA commented Mar 1, 2021

Test build #135611 has finished for PR 31619 at commit 21bfce8.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Signed-off-by: Karen Feng <karen.feng@databricks.com>
@SparkQA
Copy link

SparkQA commented Mar 1, 2021

Test build #135613 has finished for PR 31619 at commit 3c0463a.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 1, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40192/

Signed-off-by: Karen Feng <karen.feng@databricks.com>
@SparkQA
Copy link

SparkQA commented Mar 1, 2021

Test build #135614 has finished for PR 31619 at commit 862ce67.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

Signed-off-by: Karen Feng <karen.feng@databricks.com>
@SparkQA
Copy link

SparkQA commented Mar 2, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40194/

@SparkQA
Copy link

SparkQA commented Mar 2, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40194/

@SparkQA
Copy link

SparkQA commented Mar 2, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40193/

@SparkQA
Copy link

SparkQA commented Mar 2, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40193/

@SparkQA
Copy link

SparkQA commented Mar 2, 2021

Test build #135615 has finished for PR 31619 at commit 847ee74.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Signed-off-by: Karen Feng <karen.feng@databricks.com>
@SparkQA
Copy link

SparkQA commented Mar 4, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40350/

@SparkQA
Copy link

SparkQA commented Mar 4, 2021

Test build #135767 has finished for PR 31619 at commit 36c0ada.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

Signed-off-by: Karen Feng <karen.feng@databricks.com>
@SparkQA
Copy link

SparkQA commented Mar 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40368/

@SparkQA
Copy link

SparkQA commented Mar 5, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40368/

@SparkQA
Copy link

SparkQA commented Mar 5, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40373/

@SparkQA
Copy link

SparkQA commented Mar 5, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40373/

Copy link
Contributor

@allisonwang-db allisonwang-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Left a few comments. We can remove the [WIP] in the PR title and change it to [SPARK-33600][SQL]


class CannotReplaceMissingTableException(
tableIdentifier: Identifier,
message: String,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might not want to change the parameters for the class here.

def cannotReplaceMissingTableError(
    tableIdentifier: Identifier, cause: Option[Throwable] = None): Throwable = {
  new CannotReplaceMissingTableException(tableIdentifier, cause)
}

class QueryExecutionException(message: String, cause: Throwable = null)
extends Exception(message, cause)
extends SparkException(message, cause)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to know that there is a QueryExecutionException... We should leave the exception unchanged in PR to avoid unwanted side effects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this to inside catalyst so that it would be visible; I don't think this should be problematic as the package is still the same.

@karenfeng karenfeng changed the title [WIP][SPARK-33600] Group exception messages in execution/datasources/v2 [SPARK-33600][SQL] Group exception messages in execution/datasources/v2 Mar 10, 2021
…3600

Signed-off-by: Karen Feng <karen.feng@databricks.com>
Signed-off-by: Karen Feng <karen.feng@databricks.com>
@SparkQA
Copy link

SparkQA commented Mar 11, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40536/

@SparkQA
Copy link

SparkQA commented Mar 11, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40536/

Signed-off-by: Karen Feng <karen.feng@databricks.com>
@SparkQA
Copy link

SparkQA commented Mar 11, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40562/

@SparkQA
Copy link

SparkQA commented Mar 11, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40562/

Copy link
Contributor

@allisonwang-db allisonwang-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

…3600

Signed-off-by: Karen Feng <karen.feng@databricks.com>
@karenfeng
Copy link
Contributor Author

@cloud-fan, can you take another look?

@shaneknapp
Copy link
Contributor

test this please

@SparkQA
Copy link

SparkQA commented Mar 17, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40762/

@SparkQA
Copy link

SparkQA commented Mar 18, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40762/

@karenfeng
Copy link
Contributor Author

@cloud-fan, can you take a look?

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in 0d91f9c Mar 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
6 participants