Skip to content

Conversation

@davidm-db
Copy link
Contributor

What changes were proposed in this pull request?

This PR aims to re-enable SQL Scripting in Spark 4.1, ahead of its RC.

Why are the changes needed?

There was a previous misunderstanding/miscommunication, that I did not notice in time. The feature was disabled because it was thought that it is not ready, due to unclosed JIRA items.

However, it was just a case of not 100% up-to-date items. I have cleaned up and classified remaining work items (children of SPARK-48338) into 4 categories:

  • M0 - basic support
  • M1 - features and changes required to enable SQL Scripting by default
  • M2 - follow-up improvements and additional functionalities that are non-fundamental and should not block M1
  • M3 - potential improvements for the future, need investigation

M0 and M1 are done, meaning the feature is stable, useful and also ready to be used.

M2 will improve some of the aspects of using it by introducing a newer and more user-friendly statements, like SIGNAL/RESIGNAL and GET DIAGNOSTICS, as well as do some very minor optimizations. However, without those, feature is still ready to be used as is.

The only missing aspect is documentation, for which we have a PR (#50592) but it was closed. I'll work on re-submitting that one.

Does this PR introduce any user-facing change?

I am not sure 100% how this classifies - the feature was already present in 4.0, but was not enabled by default.
Anyways, the feature is completely orthogonal to all other standalone statement execution paths, so it is just adding to functionality of Spark.
There is no difference in behavior for any standalone SQL statement compared to when SQL Scripting is turned off.

How was this patch tested?

SQL Scripting has a thorough support in tests.
CI ensures the feature does not affect any other.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Nov 21, 2025
@davidm-db
Copy link
Contributor Author

@cloud-fan
Copy link
Contributor

User can use scripting already by enabling the config, so we need documentation anyway. I don't think this should block 4.1 RC, but if we can finish the documentation before the vote passes, I think re-enabling it is a reasonable choice.

@davidm-db
Copy link
Contributor Author

but if we can finish the documentation before the vote passes, I think re-enabling it is a reasonable choice

@cloud-fan I just created a PR, based on Serge's previous proposal: #53155

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for polishing the feature with the documentation and proposing this, @davidm-db .

As a release manager, I'll block this until I re-verify this again. For now, please consider this as -1 because there is not enough time for the community to re-evaluate your suggestion yet.

Please note that I agree with @cloud-fan 's comment mostly and I don't think this is a blocker for Apache Spark 4.1.0 itself. We can put this and the documentation as a part of Apache Spark 4.2.0 also.

@dongjoon-hyun
Copy link
Member

In addition, please use a different JIRA ID instead of SPARK-54261.

@dongjoon-hyun
Copy link
Member

User can use scripting already by enabling the config, so we need documentation anyway.

Just for the record, I reviewed and merged the documentation PR to the master branch for Apache Spark 4.2.0. For branch-4.1, I tried to deliver the documentation but there is a conflict with branch-4.1. So, the documentation is not in branch-4.1 and there is no documentation PR to fit branch-4.1 neither so far.

@dongjoon-hyun
Copy link
Member

In addition, please use a different JIRA ID instead of SPARK-54261.

Please address the review comment to move forward, @davidm-db .

@davidm-db davidm-db changed the title [SPARK-54261][SQL][FOLLOW UP] Re-enable spark.sql.scripting.enabled by default in 4.1 branch [SPARK-54499][SQL] Re-enable spark.sql.scripting.enabled by default in 4.1 branch Nov 25, 2025
@davidm-db davidm-db force-pushed the davidm-db/enable-sql-scripting-4.1 branch from c4838f6 to 050e3f7 Compare November 25, 2025 07:32
@github-actions github-actions bot added the DOCS label Nov 25, 2025
@davidm-db
Copy link
Contributor Author

@dongjoon-hyun resolved the comment and added a commit to revert the "enablement" section from the docs. sorry, it was middle of the night here when you wrote your last comments here and on the other ticket.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-54499][SQL] Re-enable spark.sql.scripting.enabled by default in 4.1 branch [SPARK-54499][SQL][4.1] Re-enable spark.sql.scripting.enabled by default for Spark 4.1.0 Nov 25, 2025
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.

Merged to branch-4.1 for Apache Spark 4.1.0.

Thank you so much for keeping sharing the status and polishing this for 4.1.0, @davidm-db and @cloud-fan .

dongjoon-hyun pushed a commit that referenced this pull request Nov 25, 2025
…fault for Spark 4.1.0

### What changes were proposed in this pull request?

This PR aims to re-enable SQL Scripting in Spark 4.1, ahead of its RC.

### Why are the changes needed?

There was a previous misunderstanding/miscommunication, that I did not notice in time. The feature was disabled because it was thought that it is not ready, due to unclosed JIRA items.

However, it was just a case of not 100% up-to-date items. I have cleaned up and classified remaining work items (children of [SPARK-48338](https://issues.apache.org/jira/browse/SPARK-48338)) into 4 categories:
- M0 - basic support
- M1 - features and changes required to enable SQL Scripting by default
- M2 - follow-up improvements and additional functionalities that are non-fundamental and should not block M1
- M3 - potential improvements for the future, need investigation

M0 and M1 are done, meaning the feature is stable, useful and also ready to be used.

M2 will improve some of the aspects of using it by introducing a newer and more user-friendly statements, like SIGNAL/RESIGNAL and GET DIAGNOSTICS, as well as do some very minor optimizations. However, without those, feature is still ready to be used as is.

The only missing aspect is documentation, for which we have a PR (#50592) but it was closed. I'll work on re-submitting that one.

### Does this PR introduce _any_ user-facing change?

I am not sure 100% how this classifies - the feature was already present in 4.0, but was not enabled by default.
Anyways, the feature is completely orthogonal to all other standalone statement execution paths, so it is just adding to functionality of Spark.
There is no difference in behavior for any standalone SQL statement compared to when SQL Scripting is turned off.

### How was this patch tested?

SQL Scripting has a thorough support in tests.
CI ensures the feature does not affect any other.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #53154 from davidm-db/davidm-db/enable-sql-scripting-4.1.

Authored-by: David Milicevic <david.milicevic@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants