[SPARK-56700][SS] Make DataStreamReader.name public #55651

Closed
ericm-db wants to merge 4 commits into apache:master from ericm-db:datastreamreader-name-public

Contributor

@ericm-db ericm-db commented May 1, 2026

What changes were proposed in this pull request?

Remove the private[sql] access modifier from DataStreamReader.name and add the method as a public abstract API to the DataStreamReader base class.

  • Added abstract def name(sourceName: String): this.type to the API base class (sql/api/.../DataStreamReader.scala)
  • Changed both classic and connect implementations from private[sql] def name to override def name
  • Moved Scaladoc to the base class; implementations use @inheritdoc
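
A minimal sketch of the shape the list above describes, not the actual Spark sources. Every name here except `name(sourceName: String): this.type` is a hypothetical stand-in (`SketchStreamReader`, `ClassicSketchReader`, `ConnectSketchReader`, `format`, `describe`), and the method bodies are assumptions made only so the example runs on its own:

```scala
// Hypothetical stand-in for the API base class: the method is declared
// abstract and public here, and the Scaladoc lives only in this one place.
abstract class SketchStreamReader {
  /** Assigns a name to the streaming source being defined. */
  def name(sourceName: String): this.type

  // Illustrative helpers, not part of the change described in this PR.
  def format(source: String): this.type
  def describe: String
}

// Hypothetical "classic" implementation: `override def` replaces what
// would previously have been a `private[sql] def`.
final class ClassicSketchReader extends SketchStreamReader {
  private var source: String = "unknown"
  private var assignedName: Option[String] = None

  /** @inheritdoc */
  override def name(sourceName: String): this.type = {
    assignedName = Some(sourceName)
    this
  }

  override def format(source: String): this.type = {
    this.source = source
    this
  }

  override def describe: String = {
    val shown = assignedName.getOrElse("unnamed")
    s"classic(format=$source, name=$shown)"
  }
}

// Hypothetical "connect" implementation with the same public signature.
final class ConnectSketchReader extends SketchStreamReader {
  private var source: String = "unknown"
  private var assignedName: Option[String] = None

  /** @inheritdoc */
  override def name(sourceName: String): this.type = {
    assignedName = Some(sourceName)
    this
  }

  override def format(source: String): this.type = {
    this.source = source
    this
  }

  override def describe: String = {
    val shown = assignedName.getOrElse("unnamed")
    s"connect(format=$source, name=$shown)"
  }
}

object NamedSourceSketch {
  def main(args: Array[String]): Unit = {
    // Callers program against the base class; both implementations chain the same way.
    val readers: Seq[SketchStreamReader] =
      Seq(new ClassicSketchReader, new ConnectSketchReader)
    readers.foreach(r => println(r.format("rate").name("rate_events").describe))
  }
}
```

Returning `this.type` rather than the base class keeps the concrete reader type through a chain of builder calls, so implementation-specific methods stay reachable after `name` is called.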

Why are the changes needed?

The name method was introduced in SPARK-56453 as private[sql] while the API was being finalized. Now that the feature is ready, making it public allows users to assign names to streaming sources for stable checkpoint metadata and source evolution.

Does this PR introduce any user-facing change?

Yes. DataStreamReader.name(sourceName) is now a public @Experimental API available to all users. Previously it was package-private to org.apache.spark.sql.

How was this patch tested?

Existing tests cover the name functionality. This change only modifies the access level.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.6)

Contributor

@anishshri-db anishshri-db left a comment

lgtm thanks

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.

Member

@dongjoon-hyun dongjoon-hyun left a comment

Could you make the CI happy, @ericm-db ?

Scalastyle checks passed.
The scalafmt check failed on sql/connect or sql/connect at following occurrences:

org.apache.maven.plugin.MojoExecutionException: Scalafmt: Unformatted files found
Error:  Failed to execute goal org.antipathy:mvn-scalafmt_2.13:1.1.1713302731.c3d0074:format (default-cli) on project spark-sql-api_2.13: Error formatting Scala files: Scalafmt: Unformatted files found -> [Help 1]

Before submitting your change, please make sure to format your code using the following command:
./build/mvn scalafmt:format -Dscalafmt.skip=false -Dscalafmt.validateOnly=false -Dscalafmt.changedOnly=false -pl sql/api -pl sql/connect/common -pl sql/connect/server -pl sql/connect/shims -pl sql/connect/client/jvm

ericm-db and others added 4 commits May 5, 2026 09:35
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ericm-db ericm-db force-pushed the datastreamreader-name-public branch from fa3df24 to 40bef88 Compare May 5, 2026 16:35
anishshri-db pushed a commit that referenced this pull request May 6, 2026
Closes #55651 from ericm-db/datastreamreader-name-public.

Authored-by: ericm-db <eric.marnadi@databricks.com>
Signed-off-by: Anish Shrigondekar <anish.shrigondekar@databricks.com>
(cherry picked from commit 0af3d42)
Signed-off-by: Anish Shrigondekar <anish.shrigondekar@databricks.com>