
[SPARK-47310][SS] Add micro-benchmark for merge operations for multiple values in value portion of state store #45865

Closed
wants to merge 6 commits into from

Contributor

@anishshri-db anishshri-db commented Apr 3, 2024

What changes were proposed in this pull request?

Add a micro-benchmark for merge operations that store multiple values in the value portion of the state store.

Why are the changes needed?

The micro-benchmark measures how merge operations perform with and without row tracking (trackTotalNumberOfRows).

As the results show, merge without tracking is consistently about 3x faster:

merging 10000 rows with 10 values per key (10000 rows to overwrite - rate 100):  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------------------------------------------------------
RocksDB (trackTotalNumberOfRows: true)                                                    519            533           7          0.0       51916.6       1.0X
RocksDB (trackTotalNumberOfRows: false)                                                   171            177           3          0.1       17083.9       3.0X
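
As a quick sanity check on how the columns relate, the following minimal sketch recomputes "Per Row(ns)" and "Relative" from the "Best Time(ms)" values above. The helper names `per_row_ns` and `relative` are made up for illustration and are not part of Spark's benchmark framework. Because the printed times are rounded to whole milliseconds, the per-row figures come out close to, but not exactly equal to, the ones in the table.

```python
# Best Time (ms) values copied from the benchmark output above.
ROWS = 10_000
best_ms = {
    "trackTotalNumberOfRows: true": 519,
    "trackTotalNumberOfRows: false": 171,
}


def per_row_ns(ms, rows=ROWS):
    """Convert a total time in milliseconds into nanoseconds per row."""
    return ms * 1_000_000 / rows


def relative(baseline_ms, ms):
    """Speedup of a case against the baseline (the first row of the table)."""
    return baseline_ms / ms


baseline = best_ms["trackTotalNumberOfRows: true"]
for name, ms in best_ms.items():
    print(f"{name}: {per_row_ns(ms):.0f} ns/row, {relative(baseline, ms):.1f}X")
```

The ratio 519 / 171 rounds to 3.0, which matches the "Relative" column.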

GH Actions here:

The difference is even larger when running locally (more than 7x faster without tracking).
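
One plausible reason for the gap, offered here as an assumption rather than something this PR states, is that keeping a row count forces a lookup on every write so the store can tell whether the key is new, while an untracked merge can append blindly. The toy model below only counts those extra lookups. `ToyStore` and `run` are hypothetical names, and nothing here calls the real RocksDB or Spark state store APIs.

```python
class ToyStore:
    """Hypothetical in-memory stand-in for a state store; not the RocksDB API."""

    def __init__(self, track_rows):
        self.track_rows = track_rows
        self.data = {}      # key -> list of merged values
        self.num_keys = 0   # maintained only when track_rows is True
        self.lookups = 0    # reads issued on the write path

    def merge(self, key, value):
        if self.track_rows:
            # Assumed cost of tracking: read before write to see if the key is new.
            self.lookups += 1
            if key not in self.data:
                self.num_keys += 1
        # Append the value to the list stored under this key.
        self.data.setdefault(key, []).append(value)


def run(track_rows, keys=100, values_per_key=10):
    """Merge values_per_key values into each of `keys` keys and return the store."""
    store = ToyStore(track_rows)
    for k in range(keys):
        for v in range(values_per_key):
            store.merge(k, v)
    return store


tracked = run(track_rows=True)
untracked = run(track_rows=False)
print("lookups with tracking:", tracked.lookups)
print("lookups without tracking:", untracked.lookups)
```

Both stores end up with identical contents; only the tracked one pays one lookup per merge, which is the overhead the benchmark is meant to surface.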

Does this PR introduce any user-facing change?

No

How was this patch tested?

Test-only change.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Apr 3, 2024
@anishshri-db anishshri-db changed the title [SPARK-47310] Add microbenchmark for merge operations for multiple values in value portion of state store [SPARK-47310][SS] Add microbenchmark for merge operations for multiple values in value portion of state store Apr 3, 2024
@anishshri-db anishshri-db marked this pull request as draft April 3, 2024 23:51
@anishshri-db anishshri-db marked this pull request as ready for review April 4, 2024 21:29
@anishshri-db
Contributor Author

@sahnib @HeartSaVioR - PTAL, thx !

@anishshri-db anishshri-db changed the title [SPARK-47310][SS] Add microbenchmark for merge operations for multiple values in value portion of state store [SPARK-47310][SS] Add micro-benchmark for merge operations for multiple values in value portion of state store Apr 4, 2024
Contributor

@sahnib sahnib left a comment


LGTM, thanks for adding this.

Contributor

@HeartSaVioR HeartSaVioR left a comment


+1

@HeartSaVioR
Contributor

Thanks! Merging to master.

3 participants