[SPARK-37214][SQL] Fail query analysis earlier with invalid identifiers #34490
cloud-fan wants to merge 2 commits into apache:master
We can't throw NoSuchTableException here. The CatalogV2Utils.loadTable swallows NoSuchTableException and delays the error to CheckAnalysis, and we lost the actual error message.
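A minimal sketch of the problem described in this comment, assuming a helper that turns `NoSuchTableException` into `None`. All names here (`SwallowSketch`, `loadTableStrict`, `loadTableOpt`, `check`) and the exception class are hypothetical stand-ins, not Spark's real `CatalogV2Utils` or `CheckAnalysis` code:

```scala
// Hypothetical stand-in for Spark's exception class; not the real one.
final class NoSuchTableException(msg: String) extends Exception(msg)

object SwallowSketch {
  // Pretend catalog lookup: rejects namespaces that are not exactly one part,
  // using a specific, helpful message.
  def loadTableStrict(namespace: Seq[String], name: String): String = {
    if (namespace.length != 1) {
      throw new NoSuchTableException(
        "requires a single-part namespace, but got " +
          namespace.mkString("[", ", ", "]"))
    }
    s"${namespace.head}.$name"
  }

  // Mimics a helper that swallows NoSuchTableException and returns None,
  // which discards the specific message above.
  def loadTableOpt(namespace: Seq[String], name: String): Option[String] = {
    try {
      Some(loadTableStrict(namespace, name))
    } catch {
      case _: NoSuchTableException => None
    }
  }

  // A later check only sees None, so it can report just a generic error.
  def check(namespace: Seq[String], name: String): Either[String, String] =
    loadTableOpt(namespace, name)
      .toRight(s"Table or view not found: ${(namespace :+ name).mkString(".")}")
}
```

In this sketch, `SwallowSketch.check(Seq("a", "b", "c"), "tbl")` yields only the generic "Table or view not found" text, which is why throwing a different exception type (one that is not swallowed) preserves the real cause.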
It's a parser error, I moved the test to https://github.com/apache/spark/pull/34490/files#diff-0db82228a944d6070203b6ac834c3c8d9b16b985fc3ff319ed7408fc3b6a6c23R363
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #144926 has finished for PR 34490 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #144932 has finished for PR 34490 at commit
}.getMessage
assert(message.contains(
  "Table or view not found: a.b.c.tbl"))
assert(message.contains("requires a single-part namespace"))
Hmm, this could be a little misleading, since it sounds like SHOW COLUMNS IN is only supported for the session catalog (although it is currently not supported for v2 catalogs). This may be OK until we start supporting this command for v2 catalogs.
s"V2 session catalog requires a single-part namespace: ${ident.quoted}")
def requiresSinglePartNamespaceError(ns: Seq[String]): Throwable = {
  new AnalysisException(
    "spark_catalog requires a single-part namespace, but got " + ns.mkString("[", ", ", "]"))
If you don't mind, can we use SESSION_CATALOG_NAME instead of spark_catalog?
- "spark_catalog requires a single-part namespace, but got " + ns.mkString("[", ", ", "]"))
+ s"${SESSION_CATALOG_NAME} requires a single-part namespace, but got " +
+ ns.mkString("[", ", ", "]"))
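As a rough, self-contained sketch of what the suggested change amounts to: the error builder interpolates a shared constant instead of repeating the literal catalog name. `AnalysisException` below is a hypothetical stand-in class, `ErrorsSketch` is an invented wrapper object, and the constant's value is assumed to be `"spark_catalog"` based on the message quoted in the diff:

```scala
// Hypothetical stand-in for Spark's AnalysisException; not the real class.
final class AnalysisException(message: String) extends Exception(message)

object ErrorsSketch {
  // Assumed value, taken from the literal used in the original message.
  val SESSION_CATALOG_NAME: String = "spark_catalog"

  // Builds the error with the constant interpolated, so the message stays
  // in sync if the catalog name is ever defined differently.
  def requiresSinglePartNamespaceError(ns: Seq[String]): Throwable =
    new AnalysisException(
      s"$SESSION_CATALOG_NAME requires a single-part namespace, but got " +
        ns.mkString("[", ", ", "]"))
}
```

For example, `ErrorsSketch.requiresSinglePartNamespaceError(Seq("a", "b")).getMessage` produces the same text as the hard-coded version, with the offending namespace rendered as `[a, b]`.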
dongjoon-hyun left a comment
+1, LGTM (with one nit comment, and I also agree with @imback82's comment.)
BTW, @cloud-fan. According to the JIRA type, is this targeting only Apache Spark 3.3.0?
This just fixes an error message, so we don't have to backport. But since the change is small, I think it should be safe to backport to 3.2 at least.
Kubernetes integration test starting
Kubernetes integration test status failure
thanks for review, merging to master/3.2!
### What changes were proposed in this pull request?
This is a followup of #31427, which introduced two issues:
1. When we lookup `spark_catalog.t`, we failed earlier with `The namespace in session catalog must have exactly one name part` before that PR, now we fail very late in `CheckAnalysis` with `NoSuchTableException`
2. The error message is a bit confusing now. We report `Table t not found` even if table `t` exists.

This PR fixes the 2 issues.

### Why are the changes needed?
save analysis time and improve error message

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
updated test

Closes #34490 from cloud-fan/table.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 8ab9d63)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Test build #144978 has finished for PR 34490 at commit
comparePlans(parsed2, expected2)

val v3 = "CREATE TEMPORARY VIEW a.b AS SELECT 1"
intercept(v3, "It is not allowed to add database prefix")
@cloud-fan, seems like this test is flaky:
https://github.com/apache/spark/runs/4209199644
https://github.com/apache/spark/runs/4235793982
https://github.com/apache/spark/runs/4223996038
See also SPARK-37308
I thought this was an old test so didn't take an action but just noticed that it's pretty new. Probably we should revert this one - seems pretty frequently failing.
I'm gonna partially revert this change alone for now, because the other tests seem to be passing fine.
…alyzer error

### What changes were proposed in this pull request?
followup of #34490 in branch-3.2, which moves the test for checking `notAllowedToAddDBPrefixForTempViewError` in the parser phase. But it only passes in master. In branch-3.2, the error happens in the analyzer phase. diverge happens in PR #34283

### Why are the changes needed?
fix test

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
pass the restored test.

Closes #34633 from linhongliu-db/SPARK-37214-3.2.

Authored-by: Linhong Liu <linhong.liu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…flakiness

### What changes were proposed in this pull request?
This PR reverts one of the tests added at #34490 which is (pretty seriously) flaky.

### Why are the changes needed?
Other tests seem passing fine so it's likely test specific issue. so only test is proposed to be reverted for now.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
CI in this PR should test this out.

Closes #34639 from HyukjinKwon/SPARK-37308.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
This is a followup of apache#31427, which introduced two issues:
1. When we lookup `spark_catalog.t`, we failed earlier with `The namespace in session catalog must have exactly one name part` before that PR, now we fail very late in `CheckAnalysis` with `NoSuchTableException`
2. The error message is a bit confusing now. We report `Table t not found` even if table `t` exists.

This PR fixes the 2 issues.

### Why are the changes needed?
save analysis time and improve error message

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
updated test

Closes apache#34490 from cloud-fan/table.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 8ab9d63)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit e55bab5)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
This is a followup of #31427, which introduced two issues:
1. When we lookup `spark_catalog.t`, we failed earlier with `The namespace in session catalog must have exactly one name part` before that PR, now we fail very late in `CheckAnalysis` with `NoSuchTableException`
2. The error message is a bit confusing now. We report `Table t not found` even if table `t` exists.

This PR fixes the 2 issues.
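To illustrate the first issue, here is a simplified sketch of failing early on an invalid identifier, before any table lookup happens. It assumes a toy rule in which the first name part is the catalog, the last is the table name, and everything in between is the namespace. `EarlyCheckSketch` and `resolve` are hypothetical names, and this is not Spark's actual resolution logic:

```scala
object EarlyCheckSketch {
  // Assumed name of the session catalog, as it appears in the identifiers above.
  val SessionCatalog: String = "spark_catalog"

  // Splits a multi-part identifier and validates the namespace immediately,
  // returning either an error message or (database, table).
  def resolve(parts: Seq[String]): Either[String, (String, String)] = parts match {
    case SessionCatalog +: rest if rest.nonEmpty =>
      val ns = rest.init
      val name = rest.last
      if (ns.length != 1) {
        // Fail here, with the real cause, instead of deferring to a later
        // "table not found" check.
        Left(s"$SessionCatalog requires a single-part namespace, but got " +
          ns.mkString("[", ", ", "]"))
      } else {
        Right((ns.head, name))
      }
    case _ =>
      Left(s"unsupported in this sketch: ${parts.mkString(".")}")
  }
}
```

With this rule, `spark_catalog.t` has an empty namespace, so `resolve(Seq("spark_catalog", "t"))` reports the namespace problem directly rather than claiming that table `t` does not exist.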
Why are the changes needed?
Save analysis time and improve the error message.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Updated test.