[SPARK-57116][SQL][PYTHON][DOC] Fix versionadded/@since for kll_merge_agg_* (4.1.0 -> 4.1.2) #56135
Closed
zhengruifeng wants to merge 1 commit into
…0 -> 4.1.2)
kll_merge_agg_{bigint,float,double} were annotated as introduced in
4.1.0 across Python, Scala, and the SQL ExpressionDescription, but the
master commit that added them (SPARK-54785, fc15f72 on 2026-01-12)
landed after v4.1.0 was tagged (2025-12-11), and the corresponding
branch-4.1 cherry-pick (a39c1b8) shipped in v4.1.2 (2026-05-16).
No 4.1.0 or 4.1.1 release contains them.
Update the annotation to 4.1.2 across:
- python/pyspark/sql/functions/builtin.py (.. versionadded:: in 3
docstrings)
- sql/api/src/main/scala/org/apache/spark/sql/functions.scala
(@since in 15 overload Scaladocs)
- sql/catalyst/.../aggregate/kllAggregates.scala
(since = in 3 @ExpressionDescription annotations)
Member
Yeah let's try to avoid backporting them to old branches next time.
HyukjinKwon
approved these changes
May 27, 2026
cloud-fan
approved these changes
May 27, 2026
dongjoon-hyun
requested changes
May 27, 2026
Member
Please make a new JIRA issue because we cannot make a follow-up for the released version, @zhengruifeng .
Apache Spark 4.1.2 is already released.
This is a documentation bug of Apache Spark 4.1.2 and should be fixed at 4.1.3.
Contributor
Author
@dongjoon-hyun thanks for pointing it out, filed https://issues.apache.org/jira/browse/SPARK-57116 for this fix
dongjoon-hyun
approved these changes
May 28, 2026
Member
dongjoon-hyun
left a comment
+1, LGTM. Thank you, @zhengruifeng .
zhengruifeng
added a commit
that referenced
this pull request
May 28, 2026
…_agg_* (4.1.0 -> 4.1.2)
### What changes were proposed in this pull request?
Fix the `versionadded` / `since` / `ExpressionDescription.since` annotation on `kll_merge_agg_{bigint,float,double}`, changing `4.1.0` to `4.1.2`. Touches three files:
- `python/pyspark/sql/functions/builtin.py` — 3 `.. versionadded::` docstrings
- `sql/api/src/main/scala/org/apache/spark/sql/functions.scala` — 15 `since` Scaladocs (5 overloads × 3 types)
- `sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/kllAggregates.scala` — 3 `ExpressionDescription(since = ...)` annotations
### Why are the changes needed?
All three places currently claim "Since 4.1.0", but:
| Tag | Date | Contains `kll_merge_agg_*`? |
|---|---|---|
| v4.1.0 | 2025-12-11 | No |
| v4.1.1 | 2026-01-02 | No |
| v4.1.2 | 2026-05-16 | **Yes** (via cherry-pick `a39c1b8e5e2`) |
| v4.2.0-preview2 | 2026-02-05 | Yes (via master commit `fc15f726eab`) |
The introducing commit on master (SPARK-54785 / #53548, `fc15f726eab` on 2026-01-12) landed *after* v4.1.0 was tagged, and the branch-4.1 cherry-pick (`a39c1b8e5e2`) shipped first in v4.1.2. No 4.1.0/4.1.1 release contains these functions.
`4.1.2` is the earliest stable release in which they are available, so that is the correct value across all three annotation sites.
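The selection rule above (earliest stable tag that contains the functions) can be sketched in Python. This is only an illustration: the `Tag` class, the helper names, and the convention that a `-` suffix marks a pre-release tag are assumptions made for this sketch, and the data is copied from the table above.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Tag:
    """One row of the release table (hypothetical helper type)."""
    name: str
    released: date
    contains_feature: bool


# Rows copied from the table above.
TAGS = [
    Tag("v4.1.0", date(2025, 12, 11), False),
    Tag("v4.1.1", date(2026, 1, 2), False),
    Tag("v4.1.2", date(2026, 5, 16), True),
    Tag("v4.2.0-preview2", date(2026, 2, 5), True),
]


def is_stable(tag_name: str) -> bool:
    # Assumption for this sketch: pre-release tags carry a "-" suffix.
    return "-" not in tag_name


def version_key(tag_name: str) -> Tuple[int, ...]:
    # "v4.1.2" -> (4, 1, 2); any "-suffix" is ignored for ordering.
    core = tag_name.removeprefix("v").split("-")[0]
    return tuple(int(part) for part in core.split("."))


def earliest_stable_release(tags: List[Tag]) -> Optional[str]:
    """Return the lowest stable version that contains the feature, if any."""
    candidates = [t for t in tags if t.contains_feature and is_stable(t.name)]
    if not candidates:
        return None
    return min(candidates, key=lambda t: version_key(t.name)).name.removeprefix("v")


EARLIEST = earliest_stable_release(TAGS)
```

The preview tag is dated earlier than v4.1.2 but is excluded because it is not a stable release, which is why the annotation value is `4.1.2`.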
### Does this PR introduce _any_ user-facing change?
Documentation-only change. The rendered Python API ref, Scaladoc, and SQL function reference currently render "New in version 4.1.0" / "Since: 4.1.0", which is misleading; this PR corrects all of them to `4.1.2`.
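To illustrate why the rendered version matters to readers, here is a minimal sketch of a version gate that user code might apply before calling `kll_merge_agg_*`. The function name `has_kll_merge_agg` and the threshold constant are hypothetical and not part of Spark; pre-release or snapshot version strings are deliberately reported as unknown rather than guessed.

```python
import re
from typing import Optional

# Earliest stable release that contains kll_merge_agg_*, per this PR.
MIN_STABLE = (4, 1, 2)

_STABLE_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def has_kll_merge_agg(spark_version: str) -> Optional[bool]:
    """Hypothetical helper: True/False for stable versions, None otherwise.

    Assumption: a stable version string is exactly "major.minor.patch";
    anything else (preview, snapshot) is treated as unknown.
    """
    match = _STABLE_VERSION.match(spark_version)
    if match is None:
        return None
    version = tuple(int(group) for group in match.groups())
    return version >= MIN_STABLE
```

A gate written against the old "4.1.0" annotation would have accepted 4.1.0 and 4.1.1, neither of which contains these functions.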
### How was this patch tested?
Doc/annotation-only change. Manually verified each touched site is one of the 15 `kll_merge_agg_*` overload Scaladocs, the 3 KllMergeAgg expression descriptions, or the 3 Python docstrings — sibling annotations on `kll_sketch_agg_*` (which legitimately shipped in v4.1.0) are left unchanged.
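The manual check described above could in principle be scripted for the Python side. The following is a rough sketch, not the procedure used for this PR: it parses source text with `ast`, collects the `.. versionadded::` value for functions whose name starts with a given prefix, and runs against a made-up `SAMPLE` string that only stands in for `builtin.py` (the stub signatures and docstrings are hypothetical).

```python
import ast
import re
from typing import Dict, Optional

# Matches the Sphinx directive, e.g. ".. versionadded:: 4.1.2".
VERSIONADDED = re.compile(r"\.\.\s+versionadded::\s*(\S+)")


def versionadded_by_function(source: str, prefix: str) -> Dict[str, Optional[str]]:
    """Map each function named `prefix*` to its versionadded value (or None)."""
    found: Dict[str, Optional[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.name.startswith(prefix):
            doc = ast.get_docstring(node) or ""
            match = VERSIONADDED.search(doc)
            found[node.name] = match.group(1) if match else None
    return found


# Hypothetical stand-in for the real file; not actual Spark source.
SAMPLE = '''
def kll_merge_agg_bigint(col):
    """Hypothetical stub.

    .. versionadded:: 4.1.2
    """


def kll_sketch_agg_bigint(col):
    """Hypothetical stub.

    .. versionadded:: 4.1.0
    """
'''

MERGE_VERSIONS = versionadded_by_function(SAMPLE, "kll_merge_agg_")
SKETCH_VERSIONS = versionadded_by_function(SAMPLE, "kll_sketch_agg_")
```

Checking both prefixes mirrors the intent of the manual verification: the `kll_merge_agg_*` entries should read `4.1.2`, while the `kll_sketch_agg_*` siblings should still read `4.1.0`.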
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (model: claude-opus-4-7)
Closes #56135 from zhengruifeng/spark-doc-fixes-dev1.
Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
(cherry picked from commit 63f9c88)
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
Contributor
Author
thanks, merged to master/4.x/4.2/4.1