[SPARK-57116][SQL][PYTHON][DOC] Fix versionadded/@since for kll_merge_agg_* (4.1.0 -> 4.1.2)#56135

Closed
zhengruifeng wants to merge 1 commit into apache:master from zhengruifeng:spark-doc-fixes-dev1

Contributor

@zhengruifeng zhengruifeng commented May 27, 2026

What changes were proposed in this pull request?

Fix the versionadded / @since / ExpressionDescription.since annotation on kll_merge_agg_{bigint,float,double}, changing 4.1.0 to 4.1.2. Touches three files:

  • python/pyspark/sql/functions/builtin.py — 3 .. versionadded:: docstrings
  • sql/api/src/main/scala/org/apache/spark/sql/functions.scala — 15 @since Scaladocs (5 overloads × 3 types)
  • sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/kllAggregates.scala — 3 @ExpressionDescription(since = ...) annotations

Why are the changes needed?

All three places currently claim "Since 4.1.0", but:

| Tag | Date | Contains kll_merge_agg_*? |
|---|---|---|
| v4.1.0 | 2025-12-11 | No |
| v4.1.1 | 2026-01-02 | No |
| v4.1.2 | 2026-05-16 | Yes (via cherry-pick a39c1b8e5e2) |
| v4.2.0-preview2 | 2026-02-05 | Yes (via master commit fc15f726eab) |

The introducing commit on master (SPARK-54785 / #53548, fc15f726eab on 2026-01-12) landed after both v4.1.0 and v4.1.1 were tagged, and the branch-4.1 cherry-pick (a39c1b8e5e2) first shipped in v4.1.2. No 4.1.0 or 4.1.1 release contains these functions.

4.1.2 is the earliest stable release in which they are available, so that is the correct value across all three annotation sites.
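
The selection rule described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tag data is copied from the table in this description, and `Tag`, `is_stable`, `version_key`, and `earliest_stable_release` are hypothetical names, not part of Spark or its release tooling. "Stable" is assumed to mean a tag without a pre-release suffix such as `-preview2`.

```python
from typing import NamedTuple, Optional


class Tag(NamedTuple):
    name: str           # git tag, e.g. "v4.1.2"
    contains_fn: bool   # whether the tag contains kll_merge_agg_*


# Data copied from the table in this PR description.
TAGS = [
    Tag("v4.1.0", False),
    Tag("v4.1.1", False),
    Tag("v4.1.2", True),
    Tag("v4.2.0-preview2", True),
]


def is_stable(tag: Tag) -> bool:
    """Assumption: a tag is stable when it has no pre-release suffix."""
    return "-" not in tag.name


def version_key(tag: Tag) -> tuple:
    """Turn "v4.1.2" into (4, 1, 2) so versions compare numerically."""
    numeric = tag.name.lstrip("v").split("-")[0]
    return tuple(int(part) for part in numeric.split("."))


def earliest_stable_release(tags) -> Optional[str]:
    """Return the lowest stable version that contains the functions, if any."""
    candidates = [t for t in tags if t.contains_fn and is_stable(t)]
    if not candidates:
        return None
    return min(candidates, key=version_key).name.lstrip("v")


print(earliest_stable_release(TAGS))
```

With the table's data, the preview tag is filtered out and the remaining candidate is the value used for the annotations.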

Does this PR introduce any user-facing change?

Documentation-only change. The rendered Python API ref, Scaladoc, and SQL function reference currently render "New in version 4.1.0" / "Since: 4.1.0", which is misleading; this PR corrects all of them to 4.1.2.

How was this patch tested?

Doc/annotation-only change. Manually verified each touched site is one of the 15 kll_merge_agg_* overload Scaladocs, the 3 KllMergeAgg expression descriptions, or the 3 Python docstrings — sibling annotations on kll_sketch_agg_* (which legitimately shipped in v4.1.0) are left unchanged.
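
A manual check like this could also be scripted. The sketch below is an assumption, not the procedure used for this patch: it runs a regular expression over an in-memory sample that imitates the Python docstring layout and reports the `.. versionadded::` value found for each function. The sample text, the function bodies, and the `versionadded_by_function` helper are hypothetical.

```python
import re

# Hypothetical sample that imitates the docstring layout; not the real builtin.py.
SAMPLE = '''
def kll_sketch_agg_bigint(col):
    """
    Sketch aggregate.

    .. versionadded:: 4.1.0
    """


def kll_merge_agg_bigint(col):
    """
    Merge aggregate.

    .. versionadded:: 4.1.2
    """
'''

# Match a function name, then lazily skip to the first versionadded directive after it.
PATTERN = re.compile(
    r"def (?P<name>\w+)\(.*?\.\. versionadded:: (?P<version>[\d.]+)",
    re.DOTALL,
)


def versionadded_by_function(source: str) -> dict:
    """Map each function name to the first versionadded value that follows it."""
    return {m.group("name"): m.group("version") for m in PATTERN.finditer(source)}


found = versionadded_by_function(SAMPLE)
for name, version in found.items():
    # Expected values follow this PR: merge functions 4.1.2, sketch siblings 4.1.0.
    expected = "4.1.2" if name.startswith("kll_merge_agg_") else "4.1.0"
    status = "ok" if version == expected else "MISMATCH"
    print(f"{name}: {version} ({status})")
```

One limitation of this sketch: a function with no `versionadded` directive would be paired with the directive of the next function, so a real audit would need a proper parser such as the `ast` module.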

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (model: claude-opus-4-7)

@zhengruifeng zhengruifeng force-pushed the spark-doc-fixes-dev1 branch from adb0e1b to 38b0441 May 27, 2026 06:15
@zhengruifeng zhengruifeng changed the title from "[PYTHON][DOCS] 4.2 audit follow-ups: document TwsTester, fix kll_merge_agg_* versionadded" to "[PYTHON][DOCS] Fix versionadded for kll_merge_agg_* (4.1.0 -> 4.2.0)" May 27, 2026
@zhengruifeng zhengruifeng deleted the spark-doc-fixes-dev1 branch May 27, 2026 06:20
@zhengruifeng zhengruifeng restored the spark-doc-fixes-dev1 branch May 27, 2026 06:20
@zhengruifeng zhengruifeng reopened this May 27, 2026
@zhengruifeng zhengruifeng force-pushed the spark-doc-fixes-dev1 branch from 38b0441 to 692d455 May 27, 2026 06:21
@zhengruifeng zhengruifeng changed the title from "[PYTHON][DOCS] Fix versionadded for kll_merge_agg_* (4.1.0 -> 4.2.0)" to "[PYTHON][DOCS] Fix versionadded for kll_merge_agg_* (4.1.0 -> 4.1.2)" May 27, 2026
@zhengruifeng zhengruifeng changed the title from "[PYTHON][DOCS] Fix versionadded for kll_merge_agg_* (4.1.0 -> 4.1.2)" to "[SPARK-57095][PYTHON][DOCS] Fix versionadded for kll_merge_agg_* (4.1.0 -> 4.1.2)" May 27, 2026
…0 -> 4.1.2)

kll_merge_agg_{bigint,float,double} were annotated as introduced in
4.1.0 across Python, Scala, and the SQL ExpressionDescription, but the
master commit that added them (SPARK-54785, fc15f72 on 2026-01-12)
landed after v4.1.0 was tagged (2025-12-11), and the corresponding
branch-4.1 cherry-pick (a39c1b8) shipped in v4.1.2 (2026-05-16).
No 4.1.0 or 4.1.1 release contains them.

Update the annotation to 4.1.2 across:
  - python/pyspark/sql/functions/builtin.py (.. versionadded:: in 3
    docstrings)
  - sql/api/src/main/scala/org/apache/spark/sql/functions.scala
    (@since in 15 overload Scaladocs)
  - sql/catalyst/.../aggregate/kllAggregates.scala
    (since = in 3 @ExpressionDescription annotations)
@zhengruifeng zhengruifeng force-pushed the spark-doc-fixes-dev1 branch from 692d455 to 88eb1b3 May 27, 2026 06:31
@zhengruifeng zhengruifeng changed the title from "[SPARK-57095][PYTHON][DOCS] Fix versionadded for kll_merge_agg_* (4.1.0 -> 4.1.2)" to "[SPARK-57095][DOCS] Fix versionadded for kll_merge_agg_* (4.1.0 -> 4.1.2)" May 27, 2026
@zhengruifeng zhengruifeng changed the title from "[SPARK-57095][DOCS] Fix versionadded for kll_merge_agg_* (4.1.0 -> 4.1.2)" to "[SQL][PYTHON][DOCS] Fix versionadded/@since for kll_merge_agg_* (4.1.0 -> 4.1.2)" May 27, 2026
@zhengruifeng zhengruifeng marked this pull request as ready for review May 27, 2026 06:43
@zhengruifeng zhengruifeng changed the title from "[SQL][PYTHON][DOCS] Fix versionadded/@since for kll_merge_agg_* (4.1.0 -> 4.1.2)" to "[SPARK-54785][SQL][PYTHON][DOCS][FOLLOWUP] Fix versionadded/@since for kll_merge_agg_* (4.1.0 -> 4.1.2)" May 27, 2026
@HyukjinKwon
Member

Yeah, let's try to avoid backporting them to old branches next time.

Member

@dongjoon-hyun dongjoon-hyun left a comment

Please file a new JIRA issue, because we cannot make a follow-up for an already released version, @zhengruifeng.

Apache Spark 4.1.2 is already released.

This is a documentation bug in Apache Spark 4.1.2 and should be fixed in 4.1.3.

@zhengruifeng zhengruifeng changed the title from "[SPARK-54785][SQL][PYTHON][DOCS][FOLLOWUP] Fix versionadded/@since for kll_merge_agg_* (4.1.0 -> 4.1.2)" to "[SPARK-57116][SQL][PYTHON][DOC] Fix versionadded/@since for kll_merge_agg_* (4.1.0 -> 4.1.2)" May 28, 2026
@zhengruifeng
Contributor Author

@dongjoon-hyun thanks for pointing it out, filed https://issues.apache.org/jira/browse/SPARK-57116 for this fix

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you, @zhengruifeng.

zhengruifeng added a commit that referenced this pull request May 28, 2026
…_agg_* (4.1.0 -> 4.1.2)

### What changes were proposed in this pull request?

Fix the `versionadded` / `since` / `ExpressionDescription.since` annotation on `kll_merge_agg_{bigint,float,double}`, changing `4.1.0` to `4.1.2`. Touches three files:

- `python/pyspark/sql/functions/builtin.py` — 3 `.. versionadded::` docstrings
- `sql/api/src/main/scala/org/apache/spark/sql/functions.scala` — 15 `since` Scaladocs (5 overloads × 3 types)
- `sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/kllAggregates.scala` — 3 `ExpressionDescription(since = ...)` annotations

### Why are the changes needed?

All three places currently claim "Since 4.1.0", but:

| Tag | Date | Contains `kll_merge_agg_*`? |
|---|---|---|
| v4.1.0 | 2025-12-11 | No |
| v4.1.1 | 2026-01-02 | No |
| v4.1.2 | 2026-05-16 | **Yes** (via cherry-pick `a39c1b8e5e2`) |
| v4.2.0-preview2 | 2026-02-05 | Yes (via master commit `fc15f726eab`) |

The introducing commit on master (SPARK-54785 / #53548, `fc15f726eab` on 2026-01-12) landed *after* v4.1.0 was tagged, and the branch-4.1 cherry-pick (`a39c1b8e5e2`) shipped first in v4.1.2. No 4.1.0/4.1.1 release contains these functions.

`4.1.2` is the earliest stable release in which they are available, so that is the correct value across all three annotation sites.

### Does this PR introduce _any_ user-facing change?

Documentation-only change. The rendered Python API ref, Scaladoc, and SQL function reference currently render "New in version 4.1.0" / "Since: 4.1.0", which is misleading; this PR corrects all of them to `4.1.2`.

### How was this patch tested?

Doc/annotation-only change. Manually verified each touched site is one of the 15 `kll_merge_agg_*` overload Scaladocs, the 3 KllMergeAgg expression descriptions, or the 3 Python docstrings — sibling annotations on `kll_sketch_agg_*` (which legitimately shipped in v4.1.0) are left unchanged.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (model: claude-opus-4-7)

Closes #56135 from zhengruifeng/spark-doc-fixes-dev1.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
(cherry picked from commit 63f9c88)
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
@zhengruifeng
Contributor Author

thanks, merged to master/4.x/4.2/4.1

@zhengruifeng zhengruifeng deleted the spark-doc-fixes-dev1 branch May 28, 2026 06:51