[SPARK-56120][PYTHON][TESTS][FOLLOWUP] Make _WindowAggArrowBenchMixin scenarios lazy by viirya · Pull Request #56167 · apache/spark

viirya · 2026-05-28T01:50:35Z

What changes were proposed in this pull request?

Convert _WindowAggArrowBenchMixin in python/benchmarks/bench_eval_type.py
to the lazy _scenario_configs + @staticmethod _build_scenario(name) pattern
used by every other mixin in the file, matching the immediately-following
_WindowAggPandasBenchMixin.

Why are the changes needed?

SPARK-56244 follow-up (commit 1c807ad) removed eager _scenarios = _build_scenarios() from all mixins so that importing the benchmark module no longer materializes every scenario's Arrow data -- a prerequisite for accurate per-scenario peakmem_* readings under ASV (ASV reports the max RSS observed in the worker process, so any import-time allocation inflates every subsequent peakmem result).

SPARK-56120 (78aaf11728b, merged the day after the follow-up) reintroduced the eager pattern in _WindowAggArrowBenchMixin, leaving it as the only mixin in the file still doing class-body data construction. As a result, WindowAggArrowUDFPeakmemBench readings are dominated by the global import-time allocation rather than the per-scenario footprint.

Measured locally with tracemalloc:

before: import peak = 394.54 MiB
after: import peak = 29.17 MiB

Does this PR introduce any user-facing change?

No. Test-only change in the benchmark module.

How was this patch tested?

Imported python.benchmarks.bench_eval_type and asserted the lazy structure is in place (_scenario_configs present, _scenarios absent, _build_scenario is a staticmethod).
Ran WindowAggArrowUDFTimeBench.setup + time_worker for (many_groups_sm, few_groups_sm) x (sum_udf, mean_multi_udf).
Ran WindowAggArrowUDFPeakmemBench.setup + peakmem_worker for many_groups_sm/sum_udf.
Compared import-time peak memory before/after (numbers above).

Was this patch authored or co-authored using generative AI tooling?

Yes. Generated-by: Claude Code (claude-opus-4-7)

… scenarios lazy ### What changes were proposed in this pull request? Convert `_WindowAggArrowBenchMixin` in `python/benchmarks/bench_eval_type.py` to the lazy `_scenario_configs` + `@staticmethod _build_scenario(name)` pattern used by every other mixin in the file, matching the immediately-following `_WindowAggPandasBenchMixin`. ### Why are the changes needed? SPARK-56244 follow-up (commit 1c807ad) removed eager `_scenarios = _build_scenarios()` from all mixins so that importing the benchmark module no longer materializes every scenario's Arrow data -- a prerequisite for accurate per-scenario `peakmem_*` readings under ASV (ASV reports the max RSS observed in the worker process, so any import-time allocation inflates every subsequent peakmem result). SPARK-56120 (`78aaf11728b`, merged the day after the follow-up) reintroduced the eager pattern in `_WindowAggArrowBenchMixin`, leaving it as the only mixin in the file still doing class-body data construction. As a result, `WindowAggArrowUDFPeakmemBench` readings are dominated by the global import-time allocation rather than the per-scenario footprint. Measured locally with `tracemalloc`: - before: import peak = 394.54 MiB - after: import peak = 29.17 MiB ### Does this PR introduce _any_ user-facing change? No. Test-only change in the benchmark module. ### How was this patch tested? - Imported `python.benchmarks.bench_eval_type` and asserted the lazy structure is in place (`_scenario_configs` present, `_scenarios` absent, `_build_scenario` is a staticmethod). - Ran `WindowAggArrowUDFTimeBench.setup` + `time_worker` for `(many_groups_sm, few_groups_sm) x (sum_udf, mean_multi_udf)`. - Ran `WindowAggArrowUDFPeakmemBench.setup` + `peakmem_worker` for `many_groups_sm/sum_udf`. - Compared import-time peak memory before/after (numbers above). ### Was this patch authored or co-authored using generative AI tooling? Yes. Generated-by: Claude Code (claude-opus-4-7)

dongjoon-hyun

+1, LGTM.

…scenarios lazy ### What changes were proposed in this pull request? Convert `_WindowAggArrowBenchMixin` in `python/benchmarks/bench_eval_type.py` to the lazy `_scenario_configs` + `staticmethod _build_scenario(name)` pattern used by every other mixin in the file, matching the immediately-following `_WindowAggPandasBenchMixin`. ### Why are the changes needed? SPARK-56244 follow-up (commit 1c807ad) removed eager `_scenarios = _build_scenarios()` from all mixins so that importing the benchmark module no longer materializes every scenario's Arrow data -- a prerequisite for accurate per-scenario `peakmem_*` readings under ASV (ASV reports the max RSS observed in the worker process, so any import-time allocation inflates every subsequent peakmem result). SPARK-56120 (`78aaf11728b`, merged the day after the follow-up) reintroduced the eager pattern in `_WindowAggArrowBenchMixin`, leaving it as the only mixin in the file still doing class-body data construction. As a result, `WindowAggArrowUDFPeakmemBench` readings are dominated by the global import-time allocation rather than the per-scenario footprint. Measured locally with `tracemalloc`: - before: import peak = 394.54 MiB - after: import peak = 29.17 MiB ### Does this PR introduce _any_ user-facing change? No. Test-only change in the benchmark module. ### How was this patch tested? - Imported `python.benchmarks.bench_eval_type` and asserted the lazy structure is in place (`_scenario_configs` present, `_scenarios` absent, `_build_scenario` is a staticmethod). - Ran `WindowAggArrowUDFTimeBench.setup` + `time_worker` for `(many_groups_sm, few_groups_sm) x (sum_udf, mean_multi_udf)`. - Ran `WindowAggArrowUDFPeakmemBench.setup` + `peakmem_worker` for `many_groups_sm/sum_udf`. - Compared import-time peak memory before/after (numbers above). ### Was this patch authored or co-authored using generative AI tooling? Yes. Generated-by: Claude Code (claude-opus-4-7) Closes #56167 from viirya/SPARK-56120-followup. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com> (cherry picked from commit 7da33f3) Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>

viirya · 2026-05-28T06:43:53Z

Merged to master/branch-4.x/branch-4.2.

Thanks @dongjoon-hyun

…ings ### What changes were proposed in this pull request? Reduce duplication between Arrow and Pandas sibling mixins in `python/benchmarks/bench_eval_type.py` by making the Pandas variant subclass the Arrow variant, mirroring the existing iter-subclasses-noniter pattern. Applies to two pairs: - `_ScalarPandasBenchMixin` now subclasses `_ScalarArrowBenchMixin` - `_GroupedAggPandasBenchMixin` now subclasses `_GroupedAggArrowBenchMixin` The shared `_build_scenario` and `_write_scenario` are pulled up into the Arrow base, with the eval type parameterized via the `_eval_type` class attribute (the Scalar pair already used this; this PR extends the same pattern to GroupedAgg). `_build_scenario` is converted from `staticmethod` to `classmethod` so subclasses read their own `_scenario_configs`. As a follow-on benefit, `_GroupedAggArrowIterBenchMixin` (already a subclass of the Arrow base) drops its now-redundant copy of `_write_scenario`. Net diff: +27 / -102 lines. ### Why are the changes needed? Before: each sibling pair had two near-identical `_write_scenario` bodies, differing only in the hard-coded `PythonEvalType.SQL_...` constant and the UDF set. Any change to the protocol-writing logic (runner_conf, eval_conf, payload framing) had to be applied in lock-step across both sibling halves -- a known footgun. The pattern already used by the iter subclasses (override `_eval_type` + `_udfs`, inherit everything else) generalizes cleanly to the Arrow/Pandas axis. The Window and CogroupedMap pairs are intentionally left out of this PR to avoid conflicting with two in-flight PRs (#56167 makes the Window Arrow mixin lazy; #56171 renames `wide_values` to `wide_cols` in the Cogroup pair). Both pairs can be folded into the same base-class pattern in a follow-up once those land. ### Does this PR introduce _any_ user-facing change? No. Test-only change in the benchmark module. ### How was this patch tested? - Structural: confirmed `_eval_type` resolves correctly via MRO for all seven affected mixins (Scalar Arrow / Arrow-Iter / Pandas / Pandas-Iter, GroupedAgg Arrow / Arrow-Iter / Pandas). - Confirmed `_ScalarPandasBenchMixin._scenario_configs` still holds the pandas-sized row counts (1M `pure_ints`), not the Arrow base's 5M -- the main MRO-resolution risk of switching `_build_scenario` to `classmethod`. - Ran `setup` + `time_worker` end-to-end for all 7 `*TimeBench` classes (including both UDF arities for the GroupedAgg variants). - Ran `peakmem_worker` for one bench class per pair. ### Was this patch authored or co-authored using generative AI tooling? Yes. Generated-by: Claude Code (claude-opus-4-7) Closes #56173 from viirya/SPARK-55724-sibling-base-followup. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>

…ings ### What changes were proposed in this pull request? Reduce duplication between Arrow and Pandas sibling mixins in `python/benchmarks/bench_eval_type.py` by making the Pandas variant subclass the Arrow variant, mirroring the existing iter-subclasses-noniter pattern. Applies to two pairs: - `_ScalarPandasBenchMixin` now subclasses `_ScalarArrowBenchMixin` - `_GroupedAggPandasBenchMixin` now subclasses `_GroupedAggArrowBenchMixin` The shared `_build_scenario` and `_write_scenario` are pulled up into the Arrow base, with the eval type parameterized via the `_eval_type` class attribute (the Scalar pair already used this; this PR extends the same pattern to GroupedAgg). `_build_scenario` is converted from `staticmethod` to `classmethod` so subclasses read their own `_scenario_configs`. As a follow-on benefit, `_GroupedAggArrowIterBenchMixin` (already a subclass of the Arrow base) drops its now-redundant copy of `_write_scenario`. Net diff: +27 / -102 lines. ### Why are the changes needed? Before: each sibling pair had two near-identical `_write_scenario` bodies, differing only in the hard-coded `PythonEvalType.SQL_...` constant and the UDF set. Any change to the protocol-writing logic (runner_conf, eval_conf, payload framing) had to be applied in lock-step across both sibling halves -- a known footgun. The pattern already used by the iter subclasses (override `_eval_type` + `_udfs`, inherit everything else) generalizes cleanly to the Arrow/Pandas axis. The Window and CogroupedMap pairs are intentionally left out of this PR to avoid conflicting with two in-flight PRs (#56167 makes the Window Arrow mixin lazy; #56171 renames `wide_values` to `wide_cols` in the Cogroup pair). Both pairs can be folded into the same base-class pattern in a follow-up once those land. ### Does this PR introduce _any_ user-facing change? No. Test-only change in the benchmark module. ### How was this patch tested? - Structural: confirmed `_eval_type` resolves correctly via MRO for all seven affected mixins (Scalar Arrow / Arrow-Iter / Pandas / Pandas-Iter, GroupedAgg Arrow / Arrow-Iter / Pandas). - Confirmed `_ScalarPandasBenchMixin._scenario_configs` still holds the pandas-sized row counts (1M `pure_ints`), not the Arrow base's 5M -- the main MRO-resolution risk of switching `_build_scenario` to `classmethod`. - Ran `setup` + `time_worker` end-to-end for all 7 `*TimeBench` classes (including both UDF arities for the GroupedAgg variants). - Ran `peakmem_worker` for one bench class per pair. ### Was this patch authored or co-authored using generative AI tooling? Yes. Generated-by: Claude Code (claude-opus-4-7) Closes #56173 from viirya/SPARK-55724-sibling-base-followup. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com> (cherry picked from commit 40e9a14) Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>

…p Arrow/Pandas siblings ### What changes were proposed in this pull request? Follow-up to SPARK-57137. Extend the Arrow/Pandas sibling base-mixin pattern to the remaining two pairs in `python/benchmarks/bench_eval_type.py`: - `_WindowAggPandasBenchMixin` now subclasses `_WindowAggArrowBenchMixin` - `_CogroupedMapPandasBenchMixin` now subclasses `_CogroupedMapArrowBenchMixin` The shared `_build_scenario` / `_write_scenario` are pulled up into the Arrow base, with the eval type parameterized via the `_eval_type` class attribute and `_build_scenario` converted from `staticmethod` to `classmethod` so subclasses read their own `_scenario_configs` (the same mechanism SPARK-57137 used for the Scalar and GroupedAgg pairs). - Window: the Pandas half drops `_scenario_configs` entirely (identical to the Arrow variant) and keeps only `_eval_type`, its UDFs, and `params`. - Cogroup: the Arrow `_udfs` values are normalized to `(func, n_args)` tuples to match the Pandas sibling, so the inherited `_write_scenario` works unchanged. The Pandas half keeps its own scaled-down `_scenario_configs` and the extra 3-arg `key_identity_udf` variant. Net diff: +39 / -102 lines. ### Why are the changes needed? These two pairs were intentionally left out of SPARK-57137 to avoid conflicting with two in-flight PRs (#56167 made the Window Arrow mixin scenarios lazy; #56171 renamed `wide_values` to `wide_cols` in the Cogroup pair). Both have since merged, so the pairs can now be folded into the same base-class pattern. Before this change, each sibling pair carried two near-identical `_write_scenario` bodies differing only in the `PythonEvalType.SQL_...` constant (and, for Cogroup, the UDF set) -- a known footgun where any protocol-writing change had to be applied in lock-step across both halves. ### Does this PR introduce _any_ user-facing change? No. Test-only change in the benchmark module. ### How was this patch tested? - Structural: confirmed `_eval_type` resolves correctly via MRO for all four affected mixins; confirmed `_CogroupedMapPandasBenchMixin._scenario_configs` still holds the scaled-down pandas row counts (not the Arrow base's), the main MRO-resolution risk of switching `_build_scenario` to `classmethod`. - Ran `setup` + `time_worker` end-to-end for all four affected `*TimeBench` classes across every UDF (including the Pandas-only 3-arg `key_identity_udf`). - Ran `peakmem_worker` (disk-replay path) for the Window and Cogroup Pandas classes. - Confirmed the generated wire bytes are byte-identical to the pre-refactor output for Window and Cogroup Arrow; the Cogroup Pandas UDF pickles differ only in the embedded class-hierarchy reference (same length, identical execution), matching what SPARK-57137 produced for the Scalar/GroupedAgg pairs. ### Was this patch authored or co-authored using generative AI tooling? Yes. Generated-by: Claude Code (claude-opus-4-8) Closes #56194 from viirya/SPARK-55724-window-cogroup-base. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>

…p Arrow/Pandas siblings ### What changes were proposed in this pull request? Follow-up to SPARK-57137. Extend the Arrow/Pandas sibling base-mixin pattern to the remaining two pairs in `python/benchmarks/bench_eval_type.py`: - `_WindowAggPandasBenchMixin` now subclasses `_WindowAggArrowBenchMixin` - `_CogroupedMapPandasBenchMixin` now subclasses `_CogroupedMapArrowBenchMixin` The shared `_build_scenario` / `_write_scenario` are pulled up into the Arrow base, with the eval type parameterized via the `_eval_type` class attribute and `_build_scenario` converted from `staticmethod` to `classmethod` so subclasses read their own `_scenario_configs` (the same mechanism SPARK-57137 used for the Scalar and GroupedAgg pairs). - Window: the Pandas half drops `_scenario_configs` entirely (identical to the Arrow variant) and keeps only `_eval_type`, its UDFs, and `params`. - Cogroup: the Arrow `_udfs` values are normalized to `(func, n_args)` tuples to match the Pandas sibling, so the inherited `_write_scenario` works unchanged. The Pandas half keeps its own scaled-down `_scenario_configs` and the extra 3-arg `key_identity_udf` variant. Net diff: +39 / -102 lines. ### Why are the changes needed? These two pairs were intentionally left out of SPARK-57137 to avoid conflicting with two in-flight PRs (#56167 made the Window Arrow mixin scenarios lazy; #56171 renamed `wide_values` to `wide_cols` in the Cogroup pair). Both have since merged, so the pairs can now be folded into the same base-class pattern. Before this change, each sibling pair carried two near-identical `_write_scenario` bodies differing only in the `PythonEvalType.SQL_...` constant (and, for Cogroup, the UDF set) -- a known footgun where any protocol-writing change had to be applied in lock-step across both halves. ### Does this PR introduce _any_ user-facing change? No. Test-only change in the benchmark module. ### How was this patch tested? - Structural: confirmed `_eval_type` resolves correctly via MRO for all four affected mixins; confirmed `_CogroupedMapPandasBenchMixin._scenario_configs` still holds the scaled-down pandas row counts (not the Arrow base's), the main MRO-resolution risk of switching `_build_scenario` to `classmethod`. - Ran `setup` + `time_worker` end-to-end for all four affected `*TimeBench` classes across every UDF (including the Pandas-only 3-arg `key_identity_udf`). - Ran `peakmem_worker` (disk-replay path) for the Window and Cogroup Pandas classes. - Confirmed the generated wire bytes are byte-identical to the pre-refactor output for Window and Cogroup Arrow; the Cogroup Pandas UDF pickles differ only in the embedded class-hierarchy reference (same length, identical execution), matching what SPARK-57137 produced for the Scalar/GroupedAgg pairs. ### Was this patch authored or co-authored using generative AI tooling? Yes. Generated-by: Claude Code (claude-opus-4-8) Closes #56194 from viirya/SPARK-55724-window-cogroup-base. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com> (cherry picked from commit 481f8a9) Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>

dongjoon-hyun approved these changes May 28, 2026

View reviewed changes

viirya mentioned this pull request May 28, 2026

[SPARK-57137][PYTHON][TESTS] Share base mixin across Arrow/Pandas siblings #56173

Closed

viirya closed this in 7da33f3 May 28, 2026

viirya deleted the SPARK-56120-followup branch May 28, 2026 06:43

viirya mentioned this pull request May 29, 2026

[SPARK-57138][PYTHON][TEST] Share base mixin across Window and Cogroup Arrow/Pandas siblings #56194

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPARK-56120][PYTHON][TESTS][FOLLOWUP] Make _WindowAggArrowBenchMixin scenarios lazy#56167

[SPARK-56120][PYTHON][TESTS][FOLLOWUP] Make _WindowAggArrowBenchMixin scenarios lazy#56167
viirya wants to merge 1 commit into
apache:masterfrom
viirya:SPARK-56120-followup

viirya commented May 28, 2026

Uh oh!

dongjoon-hyun left a comment

Uh oh!

viirya commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

viirya commented May 28, 2026

What changes were proposed in this pull request?

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

Uh oh!

dongjoon-hyun left a comment

Choose a reason for hiding this comment

Uh oh!

viirya commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants