[SPARK-47939][SQL] Implement a new Analyzer rule to move ParameterizedQuery inside ExplainCommand and DescribeQueryCommand #46209

@vladimirg-db vladimirg-db commented Apr 24, 2024

What changes were proposed in this pull request?

Mark DescribeQueryCommand and ExplainCommand as SupervisingCommand (they don't expose their wrapped nodes as children, but supervise them internally). Introduce a new Analyzer rule, MoveParameterizedQueriesDown, which moves ParameterizedQuery inside SupervisingCommand so that parameters are resolved correctly.
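
To make the idea concrete, here is a minimal, self-contained sketch of the push-down on a toy plan tree. It does not use Spark: `Plan`, `OneRow`, `Project`, `Explain`, and `Describe` are hypothetical stand-ins, and the toy `ParameterizedQuery`, `SupervisingCommand`, and `MoveParameterizedQueriesDown` only borrow the names from this PR. Handling of nested supervising commands is an assumption of the sketch, not something stated in the description.

```scala
// Toy stand-ins for Catalyst plan nodes. These are NOT Spark's real classes;
// every type below is a simplified, hypothetical model for illustration.
sealed trait Plan
case object OneRow extends Plan
case class Project(exprs: Seq[String], child: Plan) extends Plan

// Carries the argument values that must be bound to `?` markers in `child`.
case class ParameterizedQuery(child: Plan, args: Seq[Any]) extends Plan

// A command that keeps its wrapped plan in a field instead of exposing it
// as a child, so generic tree traversal does not reach the wrapped plan.
trait SupervisingCommand extends Plan {
  def withTransformedSupervisedPlan(transformer: Plan => Plan): Plan
}

case class Explain(plan: Plan) extends SupervisingCommand {
  def withTransformedSupervisedPlan(transformer: Plan => Plan): Plan =
    copy(plan = transformer(plan))
}

case class Describe(plan: Plan) extends SupervisingCommand {
  def withTransformedSupervisedPlan(transformer: Plan => Plan): Plan =
    copy(plan = transformer(plan))
}

object MoveParameterizedQueriesDown {
  // Rewrites ParameterizedQuery(Command(q)) into Command(ParameterizedQuery(q)).
  // Any other plan is returned unchanged.
  def apply(plan: Plan): Plan = plan match {
    case pq @ ParameterizedQuery(cmd: SupervisingCommand, _) => pushDown(pq, cmd)
    case other => other
  }

  // Assumption of this sketch: if the supervised plan is itself a supervising
  // command, keep descending until an ordinary query plan is found.
  private def pushDown(pq: ParameterizedQuery, cmd: SupervisingCommand): Plan =
    cmd.withTransformedSupervisedPlan {
      case nested: SupervisingCommand => pushDown(pq, nested)
      case inner => pq.copy(child = inner)
    }
}
```

After the rewrite the parameter wrapper sits directly above the query that contains the `?` marker, which is where a binding step would need it to be.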

Why are the changes needed?

Parameterized EXPLAIN and DESCRIBE queries:

  • spark.sql("describe select ?", Array(1)).show();
  • spark.sql("explain select ?", Array(1)).show();

fail with:

org.apache.spark.sql.catalyst.ExtendedAnalysisException: [UNBOUND_SQL_PARAMETER] Found the unbound parameter: _16. Please, fix args and provide a mapping of the parameter to either a SQL literal or collection constructor functions such as map(), array(), struct(). SQLSTATE: 42P02; line 1 pos 16;
'Project [unresolvedalias(posparameter(16))]
+- OneRowRelation

Does this PR introduce any user-facing change?

Yes. Parameterized EXPLAIN and DESCRIBE queries now work instead of failing with UNBOUND_SQL_PARAMETER.

How was this patch tested?

  • Run sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
  • Run sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala
  • New tests for SQLQuerySuite

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Apr 24, 2024
@vladimirg-db vladimirg-db force-pushed the vladimirg-db/make-explain-and-describe-work-with-parameters branch 9 times, most recently from ced5ad6 to 74dd15c Compare April 25, 2024 17:56
/**
* Transforms the supervised plan using `transformer` and returns a copy of this `SupervisingCommand`.
*/
def withTransformedSupervisedPlan(transformer: (LogicalPlan) => LogicalPlan): LogicalPlan

@vladimirg-db vladimirg-db Apr 25, 2024


This can be rewritten as two methods:

def supervisedPlan: LogicalPlan
def withNewSupervisedPlan(newPlan: LogicalPlan): LogicalPlan

But IMO the current impl is more concise
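
For comparison, a rough sketch of both shapes on a toy plan type. Nothing here is Spark code: `Plan`, `Leaf`, `Wrapper`, `SupervisingA`, `SupervisingB`, `ExplainA`, and `ExplainB` are hypothetical names used only for illustration. It shows that the single-method form can be derived from the two-method form, at the cost of one extra abstract member per command.

```scala
// Toy plan type; a hypothetical stand-in for Catalyst's LogicalPlan.
sealed trait Plan
case class Leaf(name: String) extends Plan
case class Wrapper(tag: String, child: Plan) extends Plan

// Variant A: the single-method shape from the diff.
trait SupervisingA extends Plan {
  def withTransformedSupervisedPlan(transformer: Plan => Plan): Plan
}

// Variant B: the two-method alternative, with the transform derived from it.
trait SupervisingB extends Plan {
  def supervisedPlan: Plan
  def withNewSupervisedPlan(newPlan: Plan): Plan

  def withTransformedSupervisedPlan(transformer: Plan => Plan): Plan =
    withNewSupervisedPlan(transformer(supervisedPlan))
}

case class ExplainA(plan: Plan) extends SupervisingA {
  def withTransformedSupervisedPlan(transformer: Plan => Plan): Plan =
    copy(plan = transformer(plan))
}

case class ExplainB(plan: Plan) extends SupervisingB {
  def supervisedPlan: Plan = plan
  def withNewSupervisedPlan(newPlan: Plan): Plan = copy(plan = newPlan)
}
```

Both variants give the same result for a caller that only needs to rewrite the wrapped plan. Variant B additionally exposes the wrapped plan for reading, which the single-method form keeps hidden.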

@vladimirg-db vladimirg-db marked this pull request as ready for review April 25, 2024 18:15
@vladimirg-db

@milastdbx @stefankandic @cloud-fan hi! This PR is ready for review

Mark DescribeQueryCommand and ExplainCommand as SupervisingCommand (they don't expose their wrapped nodes, but supervise them internally). Move ParameterizedQuery into SupervisingCommand for parameters to be resolved correctly.
@vladimirg-db vladimirg-db force-pushed the vladimirg-db/make-explain-and-describe-work-with-parameters branch from 6115b5f to ba9f7cb Compare April 26, 2024 13:01
@cloud-fan

thanks, merging to master!

@cloud-fan cloud-fan closed this in 0e52b59 Apr 29, 2024
JacobZheng0927 pushed a commit to JacobZheng0927/spark that referenced this pull request May 11, 2024
…dQuery inside ExplainCommand and DescribeQueryCommand

Closes apache#46209 from vladimirg-db/vladimirg-db/make-explain-and-describe-work-with-parameters.

Authored-by: Vladimir Golubev <vladimir.golubev@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>