[SPARK-55899][PYTHON][TEST] Add ASV microbenchmark for SQL_ARROW_BATCHED_UDF #54702

Closed
Yicong-Huang wants to merge 2 commits into apache:master from Yicong-Huang:SPARK-55724/bench/arrow-batch-udf

Yicong-Huang commented Mar 9, 2026

What changes were proposed in this pull request?

Add ASV microbenchmarks for SQL_ARROW_BATCHED_UDF.

Why are the changes needed?

Part of SPARK-55724. Establishes baseline performance metrics for SQL_ARROW_BATCHED_UDF before future refactoring work.

Does this PR introduce any user-facing change?

No. Benchmark files only.

How was this patch tested?

COLUMNS=120 asv run --python=same --bench "ArrowBatched" --attribute "repeat=(3,5,5.0)":

ArrowBatchedUDFTimeBench (SQL_ARROW_BATCHED_UDF):

=================== ============== =============== ===============
--                                       udf
------------------- ----------------------------------------------
      scenario       identity_udf   stringify_udf   nullcheck_udf
=================== ============== =============== ===============
  sm_batch_few_col   62.1+-0.2ms     66.1+-0.8ms     61.2+-0.1ms
 sm_batch_many_col    154+-0.4ms     155+-0.4ms      154+-0.3ms
  lg_batch_few_col    148+-0.3ms     157+-0.4ms      147+-0.5ms
 lg_batch_many_col     623+-2ms       624+-2ms        620+-3ms
     pure_ints        220+-0.5ms     231+-0.7ms       220+-6ms
    pure_floats       224+-0.8ms      262+-1ms       225+-0.7ms
    pure_strings       414+-1ms      415+-0.6ms       404+-1ms
    mixed_types        311+-1ms      318+-0.8ms      308+-0.7ms
=================== ============== =============== ===============
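
As a reading aid, the timing cells above can be compared numerically. The sketch below is not part of the PR: `parse_cell` and `relative_overhead` are hypothetical helper names, and the cell format (a value, `+-`, an error, then a unit) is assumed from the table as printed here. It computes how much slower `stringify_udf` is than `identity_udf` in the `pure_floats` scenario.

```python
import re

# Assumed cell format, taken from the table above: "<value>+-<error><unit>".
_CELL = re.compile(r"^\s*([0-9.]+)\+-([0-9.]+)(ms|us|s)\s*$")
_TO_MS = {"us": 1e-3, "ms": 1.0, "s": 1e3}


def parse_cell(cell):
    """Return (value_ms, error_ms) for one timing cell."""
    match = _CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognized timing cell: {cell!r}")
    value, error, unit = match.groups()
    scale = _TO_MS[unit]
    return float(value) * scale, float(error) * scale


def relative_overhead(baseline, candidate):
    """Fractional slowdown of candidate relative to baseline (0.10 means 10% slower)."""
    base_ms, _ = parse_cell(baseline)
    cand_ms, _ = parse_cell(candidate)
    return cand_ms / base_ms - 1.0


# pure_floats row: identity_udf vs. stringify_udf
print(f"{relative_overhead('224+-0.8ms', '262+-1ms'):.1%}")  # prints 17.0%
```

The same helper applied to the other rows gives differences of a few percent or less, so the `pure_floats` / `stringify_udf` cell is the only one in this table where the UDF body stands out from the per-batch overhead.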

ArrowBatchedUDFPeakmemBench (SQL_ARROW_BATCHED_UDF):

=================== ============== =============== ===============
--                                       udf
------------------- ----------------------------------------------
      scenario       identity_udf   stringify_udf   nullcheck_udf
=================== ============== =============== ===============
  sm_batch_few_col       119M            119M            118M
 sm_batch_many_col       123M            123M            123M
  lg_batch_few_col       124M            124M            122M
 lg_batch_many_col       159M            160M            159M
     pure_ints           122M            123M            122M
    pure_floats          124M            125M            123M
    pure_strings         125M            125M            124M
    mixed_types          123M            124M            123M
=================== ============== =============== ===============
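
For readers unfamiliar with ASV, the two tables above have the shape ASV produces for a benchmark parameterized over two axes (`scenario` and `udf`). The following is a minimal sketch of that layout, not the benchmark code from this PR: the class name, the batch shapes, and the use of plain Python lists in place of Arrow batches are all assumptions made so the example runs with the standard library alone. What it relies on from ASV is only the documented convention that `params`/`param_names` define the grid, `setup` receives the parameter values, and methods prefixed `time_` and `peakmem_` are measured for wall time and peak memory respectively.

```python
class ArrowBatchedUDFSketchBench:
    """Hypothetical ASV-style benchmark; not the class added in this PR."""

    # ASV runs every combination of these values and labels the axes
    # with param_names, which yields a scenario x udf table.
    params = [
        ["sm_batch_few_col", "lg_batch_few_col"],
        ["identity_udf", "stringify_udf"],
    ]
    param_names = ["scenario", "udf"]

    # Assumed (rows, columns) per scenario; the PR's real sizes are not shown.
    _SHAPES = {"sm_batch_few_col": (1_000, 5), "lg_batch_few_col": (10_000, 5)}
    # Stand-ins for the UDF bodies named in the tables.
    _UDFS = {"identity_udf": lambda value: value, "stringify_udf": str}

    def setup(self, scenario, udf):
        # Called by ASV before each measurement, outside the timed region.
        rows, cols = self._SHAPES[scenario]
        self.columns = [list(range(rows)) for _ in range(cols)]
        self.func = self._UDFS[udf]

    def _run(self):
        # Apply the UDF to every value of every column.
        return [[self.func(value) for value in column] for column in self.columns]

    def time_udf(self, scenario, udf):
        # "time_" prefix: ASV reports wall-clock time for this method.
        self._run()

    def peakmem_udf(self, scenario, udf):
        # "peakmem_" prefix: ASV reports peak process memory for this method.
        self._run()


# The class is plain Python, so it can also be exercised without ASV:
bench = ArrowBatchedUDFSketchBench()
bench.setup("sm_batch_few_col", "stringify_udf")
bench.time_udf("sm_batch_few_col", "stringify_udf")
```

The results above come from two separate classes (`ArrowBatchedUDFTimeBench` and `ArrowBatchedUDFPeakmemBench`); the sketch puts both method kinds on one class only for brevity.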

Was this patch authored or co-authored using generative AI tooling?

No

Yicong-Huang changed the title from "[SPARK-55724][PYTHON][TEST] Add ASV microbenchmark for SQL_ARROW_BATCHED_UDF" to "[SPARK-55899][PYTHON][TEST] Add ASV microbenchmark for SQL_ARROW_BATCHED_UDF" on Mar 9, 2026
@zhengruifeng (Contributor)

merged to master
