@hvanhovell
Contributor

What changes were proposed in this pull request?

This PR fixes a bunch of issues with SQL API docs for Scala and Java:

  • Removes the now-defunct Connect API docs. The current SQL docs provide enough coverage.
  • Removes the following internal packages from the docs: org.apache.spark.sql.artifact, org.apache.spark.sql.scripting, and org.apache.spark.sql.ml.
  • Fixes the removal of the org.apache.spark.errors and org.apache.spark.sql.errors packages from the docs.
  • Marks a bunch of internal classes in the org.apache.spark.sql package as private[sql] to remove them from the docs.
  • Moves the TableValuedFunctionArgument interface from org.apache.spark.sql to org.apache.spark.sql.internal because it kept showing up in the docs.
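
For readers less familiar with qualified access modifiers, here is a minimal sketch of what marking a class as private[sql] does. The package org.example.sql and the class names InternalPlanHelper and PublicFacade are hypothetical and are not taken from the Spark code base. The sketch only shows the visibility rule; it assumes, as is the usual Scaladoc default, that such package-private classes are left out of the generated docs.

```scala
// Hypothetical package and class names, used only for illustration.
package org.example.sql {

  // Visible to code under org.example.sql, hidden from callers outside it.
  private[sql] class InternalPlanHelper {
    def describe: String = "internal detail"
  }

  // Public entry point: the only class here that is part of the public API.
  class PublicFacade {
    // Allowed: PublicFacade lives in the same package as the helper.
    def describe: String = new InternalPlanHelper().describe
  }
}

object VisibilityDemo {
  def main(args: Array[String]): Unit = {
    // Compiles: PublicFacade is public.
    println(new org.example.sql.PublicFacade().describe)
    // Would not compile from here, because InternalPlanHelper is private[sql]:
    // new org.example.sql.InternalPlanHelper()
  }
}
```

The helper stays usable inside the package while disappearing from the public surface, which is why this is a lighter-weight option than moving a class to a different package.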

Why are the changes needed?

Readable docs are important!

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manual inspection of the Java and Scala docs.

Was this patch authored or co-authored using generative AI tooling?

No.

.map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/errors")))
.map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/sql/errors")))
.map(_.filterNot(_.getCanonicalPath.contains("org/apache/hive")))
.map(_.filterNot(_.getCanonicalPath.contains("org/apache/spark/sql/v2/avro")))

Contributor

Shall we also exclude org.apache.spark.sql.avro?

Contributor Author

Well, the problem is that org.apache.spark.sql.avro.functions is a user-facing API. I can mark the entire datasource implementation as private[sql], but I'd prefer to move it elsewhere.

Contributor

Ah, I see. Yeah, we should move it to org.apache.spark.sql.execution.datasources.avro, following the other file sources.
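
To make the point of this thread concrete, here is a rough sketch of substring-based path filtering in the style of the reviewed lines. The object DocSourceFilter, its method names, and the sample file paths are hypothetical; only the four path fragments come from the snippet under review. It illustrates why excluding org/apache/spark/sql/v2/avro leaves org/apache/spark/sql/avro (and therefore the user-facing functions object) in the docs.

```scala
import java.io.File

// Hypothetical helper; the real build applies the filters inline.
object DocSourceFilter {
  // Path fragments copied from the reviewed filterNot calls.
  val excludedFragments: Seq[String] = Seq(
    "org/apache/spark/errors",
    "org/apache/spark/sql/errors",
    "org/apache/hive",
    "org/apache/spark/sql/v2/avro")

  // A path is excluded when it contains any of the fragments.
  def isExcluded(path: String): Boolean =
    excludedFragments.exists(fragment => path.contains(fragment))

  // Same check applied to files, mirroring filterNot(_.getCanonicalPath.contains(...)).
  // Note: this assumes '/' as the path separator.
  def keepDocumented(files: Seq[File]): Seq[File] =
    files.filterNot(file => isExcluded(file.getCanonicalPath))

  def main(args: Array[String]): Unit = {
    // Hypothetical source paths, for illustration only.
    val samples = Seq(
      "/src/org/apache/spark/sql/avro/functions.scala",
      "/src/org/apache/spark/sql/v2/avro/AvroTable.scala",
      "/src/org/apache/spark/sql/errors/QueryErrors.scala")
    samples.foreach(p => println(s"$p excluded=${isExcluded(p)}"))
  }
}
```

Because the match is a plain substring test, adding org/apache/spark/sql/avro as a fragment would also drop the functions file, which is the reason given above for moving the datasource implementation instead.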

@hvanhovell
Contributor Author

Merging to 4.0/master.

@asfgit asfgit closed this in aefaa66 Feb 6, 2025
asfgit pushed a commit that referenced this pull request Feb 6, 2025
…sues

Closes #49800 from hvanhovell/SPARK-49371.

Authored-by: Herman van Hovell <herman@databricks.com>
Signed-off-by: Herman van Hovell <herman@databricks.com>
(cherry picked from commit aefaa66)
Signed-off-by: Herman van Hovell <herman@databricks.com>
zifeif2 pushed a commit to zifeif2/spark that referenced this pull request Nov 14, 2025
…sues

Closes apache#49800 from hvanhovell/SPARK-49371.

Authored-by: Herman van Hovell <herman@databricks.com>
Signed-off-by: Herman van Hovell <herman@databricks.com>
(cherry picked from commit 72aa207)
Signed-off-by: Herman van Hovell <herman@databricks.com>