[SPARK-47310][SS] Add micro-benchmark for merge operations for multiple values in value portion of state store

### What changes were proposed in this pull request?
Add a micro-benchmark for merge operations on multiple values in the value portion of the state store.
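To illustrate what is being measured, here is a toy Python model of a merge that appends several values under one key, with and without row tracking. It is not Spark's or RocksDB's implementation: `ToyStateStore`, `merge`, and the `lookups` counter are hypothetical names, and the idea that tracking costs one extra existence check per merge is an assumption made for illustration, not something this commit states.

```python
class ToyStateStore:
    """Hypothetical in-memory stand-in for a key -> list-of-values store."""

    def __init__(self, track_total_rows: bool):
        self.track_total_rows = track_total_rows
        self.data = {}      # key -> list of merged values
        self.num_keys = 0   # maintained only when tracking is enabled
        self.lookups = 0    # extra existence checks caused by tracking

    def merge(self, key, value):
        if self.track_total_rows:
            # Assumption: tracking needs to know whether the key is new,
            # which costs one extra read before every merge.
            self.lookups += 1
            if key not in self.data:
                self.num_keys += 1
        # The merge itself: append the value to the key's existing values.
        self.data.setdefault(key, []).append(value)


def run(track_total_rows: bool, keys: int = 100, values_per_key: int = 10):
    store = ToyStateStore(track_total_rows)
    for k in range(keys):
        for v in range(values_per_key):
            store.merge(k, v)
    return store


tracked = run(True)
untracked = run(False)
```

In this model both stores end up with identical contents; the tracked store additionally performs one existence check per merged value, which is the kind of overhead the benchmark is meant to quantify.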

### Why are the changes needed?
Micro-benchmark to understand merge performance with and without row tracking (`trackTotalNumberOfRows`).

As shown in the results, merge without tracking is consistently about 3x faster (2.9X to 3.0X across the four overwrite rates).

```
merging 10000 rows with 10 values per key (10000 rows to overwrite - rate 100):  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------------------------------------------------------
RocksDB (trackTotalNumberOfRows: true)                                                    519            533           7          0.0       51916.6       1.0X
RocksDB (trackTotalNumberOfRows: false)                                                   171            177           3          0.1       17083.9       3.0X
```
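The derived columns in the table can be cross-checked from `Per Row(ns)` alone. The sketch below assumes that `Rate(M/s)` is the reciprocal of the per-row time and that `Relative` is the first row's per-row time divided by the current row's; the helper name `derived_columns` is hypothetical, and the rounding to one decimal place mirrors how the table is printed.

```python
def derived_columns(per_row_ns: float, baseline_per_row_ns: float):
    """Recompute Rate(M/s) and Relative from per-row times in nanoseconds."""
    # 1000 ns per microsecond: rows per microsecond equals million rows per second.
    rate_m_per_s = 1e3 / per_row_ns
    # Speedup relative to the first (baseline) row of the table.
    relative = baseline_per_row_ns / per_row_ns
    return round(rate_m_per_s, 1), round(relative, 1)


# Values taken from the table above (rate 100).
with_tracking = derived_columns(51916.6, 51916.6)
without_tracking = derived_columns(17083.9, 51916.6)
```

With these inputs the tracked row comes out as rate 0.0 and relative 1.0, and the untracked row as rate 0.1 and relative 3.0, matching the printed columns.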

GH Actions here:
- https://github.com/anishshri-db/spark/actions/runs/8559698160
- https://github.com/anishshri-db/spark/actions/runs/8559694994

The difference is even larger when running locally (more than 7x faster without tracking).

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Test-only change.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #45865 from anishshri-db/task/SPARK-47310.

Authored-by: Anish Shrigondekar <anish.shrigondekar@databricks.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
anishshri-db authored and HeartSaVioR committed Apr 5, 2024
1 parent 18072b5 commit 1efbf43
Showing 3 changed files with 265 additions and 79 deletions.
@@ -6,33 +6,66 @@ OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
putting 10000 rows (10000 rows to overwrite - rate 100): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------------------
-In-memory 9 10 1 1.1 894.7 1.0X
-RocksDB (trackTotalNumberOfRows: true) 41 42 2 0.2 4064.6 0.2X
-RocksDB (trackTotalNumberOfRows: false) 15 15 1 0.7 1466.8 0.6X
+In-memory 9 10 1 1.1 936.2 1.0X
+RocksDB (trackTotalNumberOfRows: true) 41 42 1 0.2 4068.9 0.2X
+RocksDB (trackTotalNumberOfRows: false) 15 16 1 0.7 1500.4 0.6X

OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
putting 10000 rows (5000 rows to overwrite - rate 50): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------------------
-In-memory 9 10 0 1.1 893.1 1.0X
-RocksDB (trackTotalNumberOfRows: true) 40 40 1 0.3 3959.7 0.2X
-RocksDB (trackTotalNumberOfRows: false) 15 16 1 0.7 1510.8 0.6X
+In-memory 9 11 1 1.1 929.8 1.0X
+RocksDB (trackTotalNumberOfRows: true) 40 41 1 0.3 3955.7 0.2X
+RocksDB (trackTotalNumberOfRows: false) 15 16 1 0.7 1497.3 0.6X

OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
putting 10000 rows (1000 rows to overwrite - rate 10): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------------------
-In-memory 9 9 0 1.1 872.0 1.0X
-RocksDB (trackTotalNumberOfRows: true) 39 40 1 0.3 3887.2 0.2X
-RocksDB (trackTotalNumberOfRows: false) 15 16 0 0.7 1532.3 0.6X
+In-memory 9 10 1 1.1 907.5 1.0X
+RocksDB (trackTotalNumberOfRows: true) 39 40 1 0.3 3886.5 0.2X
+RocksDB (trackTotalNumberOfRows: false) 15 16 1 0.7 1497.2 0.6X

OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
putting 10000 rows (0 rows to overwrite - rate 0): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
---------------------------------------------------------------------------------------------------------------------------------
-In-memory 9 10 1 1.1 874.5 1.0X
-RocksDB (trackTotalNumberOfRows: true) 40 41 1 0.3 3967.1 0.2X
-RocksDB (trackTotalNumberOfRows: false) 15 16 0 0.7 1526.2 0.6X
+In-memory 9 10 1 1.1 904.0 1.0X
+RocksDB (trackTotalNumberOfRows: true) 39 40 1 0.3 3859.8 0.2X
+RocksDB (trackTotalNumberOfRows: false) 15 16 0 0.7 1497.2 0.6X


+================================================================================================
+merge rows
+================================================================================================
+
+OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+AMD EPYC 7763 64-Core Processor
+merging 10000 rows with 10 values per key (10000 rows to overwrite - rate 100): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
+--------------------------------------------------------------------------------------------------------------------------------------------------------------
+RocksDB (trackTotalNumberOfRows: true) 519 533 7 0.0 51916.6 1.0X
+RocksDB (trackTotalNumberOfRows: false) 171 177 3 0.1 17083.9 3.0X
+
+OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+AMD EPYC 7763 64-Core Processor
+merging 10000 rows with 10 values per key (5000 rows to overwrite - rate 50): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
+------------------------------------------------------------------------------------------------------------------------------------------------------------
+RocksDB (trackTotalNumberOfRows: true) 506 521 7 0.0 50644.0 1.0X
+RocksDB (trackTotalNumberOfRows: false) 170 176 3 0.1 17022.0 3.0X
+
+OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+AMD EPYC 7763 64-Core Processor
+merging 10000 rows with 10 values per key (1000 rows to overwrite - rate 10): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
+------------------------------------------------------------------------------------------------------------------------------------------------------------
+RocksDB (trackTotalNumberOfRows: true) 493 508 6 0.0 49319.3 1.0X
+RocksDB (trackTotalNumberOfRows: false) 169 175 3 0.1 16897.6 2.9X
+
+OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
+AMD EPYC 7763 64-Core Processor
+merging 10000 rows with 10 values per key (0 rows to overwrite - rate 0): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
+--------------------------------------------------------------------------------------------------------------------------------------------------------
+RocksDB (trackTotalNumberOfRows: true) 495 508 6 0.0 49462.5 1.0X
+RocksDB (trackTotalNumberOfRows: false) 169 175 3 0.1 16896.6 2.9X


================================================================================================
@@ -43,33 +76,33 @@ OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
trying to delete 10000 rows from 10000 rows(10000 rows are non-existing - rate 100): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
-In-memory 0 1 0 20.9 47.9 1.0X
-RocksDB (trackTotalNumberOfRows: true) 40 40 1 0.3 3956.8 0.0X
-RocksDB (trackTotalNumberOfRows: false) 15 16 1 0.6 1541.9 0.0X
+In-memory 0 1 0 26.3 38.0 1.0X
+RocksDB (trackTotalNumberOfRows: true) 39 41 1 0.3 3942.0 0.0X
+RocksDB (trackTotalNumberOfRows: false) 15 16 1 0.7 1529.2 0.0X

OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
trying to delete 10000 rows from 10000 rows(5000 rows are non-existing - rate 50): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-In-memory 8 10 1 1.3 773.4 1.0X
-RocksDB (trackTotalNumberOfRows: true) 40 41 1 0.2 4024.1 0.2X
-RocksDB (trackTotalNumberOfRows: false) 15 16 1 0.7 1537.8 0.5X
+In-memory 8 9 1 1.3 790.4 1.0X
+RocksDB (trackTotalNumberOfRows: true) 40 41 1 0.2 4036.7 0.2X
+RocksDB (trackTotalNumberOfRows: false) 15 16 0 0.7 1536.9 0.5X

OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
trying to delete 10000 rows from 10000 rows(1000 rows are non-existing - rate 10): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
-In-memory 8 10 1 1.2 817.7 1.0X
-RocksDB (trackTotalNumberOfRows: true) 41 42 1 0.2 4111.7 0.2X
-RocksDB (trackTotalNumberOfRows: false) 15 16 0 0.6 1540.5 0.5X
+In-memory 8 10 1 1.2 847.0 1.0X
+RocksDB (trackTotalNumberOfRows: true) 41 42 1 0.2 4099.8 0.2X
+RocksDB (trackTotalNumberOfRows: false) 16 16 0 0.6 1563.3 0.5X

OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
trying to delete 10000 rows from 10000 rows(0 rows are non-existing - rate 0): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------------------------------------------
-In-memory 8 10 1 1.2 820.0 1.0X
-RocksDB (trackTotalNumberOfRows: true) 41 42 1 0.2 4133.0 0.2X
-RocksDB (trackTotalNumberOfRows: false) 15 16 0 0.7 1526.2 0.5X
+In-memory 9 10 1 1.2 859.4 1.0X
+RocksDB (trackTotalNumberOfRows: true) 41 42 1 0.2 4118.9 0.2X
+RocksDB (trackTotalNumberOfRows: false) 15 16 1 0.7 1507.8 0.6X


================================================================================================
@@ -80,32 +113,30 @@ OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
evicting 10000 rows (maxTimestampToEvictInMillis: 9999) from 10000 rows: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------------------------------------
-In-memory 8 9 0 1.2 805.5 1.0X
-RocksDB (trackTotalNumberOfRows: true) 39 40 1 0.3 3888.6 0.2X
-RocksDB (trackTotalNumberOfRows: false) 15 16 0 0.7 1538.2 0.5X
+In-memory 8 9 1 1.2 831.0 1.0X
+RocksDB (trackTotalNumberOfRows: true) 40 40 1 0.3 3956.6 0.2X
+RocksDB (trackTotalNumberOfRows: false) 16 16 0 0.6 1571.3 0.5X

OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
evicting 5000 rows (maxTimestampToEvictInMillis: 4999) from 10000 rows: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------------------------------------
-In-memory 8 8 0 1.3 754.7 1.0X
-RocksDB (trackTotalNumberOfRows: true) 21 22 0 0.5 2091.7 0.4X
-RocksDB (trackTotalNumberOfRows: false) 9 9 0 1.1 916.1 0.8X
+In-memory 8 8 1 1.3 787.6 1.0X
+RocksDB (trackTotalNumberOfRows: true) 21 22 0 0.5 2112.6 0.4X
+RocksDB (trackTotalNumberOfRows: false) 9 9 0 1.1 932.9 0.8X

OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
evicting 1000 rows (maxTimestampToEvictInMillis: 999) from 10000 rows: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------
-In-memory 7 8 1 1.4 692.8 1.0X
-RocksDB (trackTotalNumberOfRows: true) 7 7 0 1.5 654.6 1.1X
-RocksDB (trackTotalNumberOfRows: false) 4 4 0 2.4 423.8 1.6X
+In-memory 7 8 0 1.4 715.7 1.0X
+RocksDB (trackTotalNumberOfRows: true) 7 7 0 1.5 676.3 1.1X
+RocksDB (trackTotalNumberOfRows: false) 4 5 0 2.3 442.3 1.6X

OpenJDK 64-Bit Server VM 21.0.2+13-LTS on Linux 6.5.0-1016-azure
AMD EPYC 7763 64-Core Processor
evicting 0 rows (maxTimestampToEvictInMillis: -1) from 10000 rows: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------------------------------
-In-memory 0 0 0 24.2 41.2 1.0X
-RocksDB (trackTotalNumberOfRows: true) 3 3 0 3.4 290.1 0.1X
-RocksDB (trackTotalNumberOfRows: false) 3 3 0 3.4 290.6 0.1X
-
-
+In-memory 0 0 0 23.8 41.9 1.0X
+RocksDB (trackTotalNumberOfRows: true) 3 3 0 3.2 309.5 0.1X
+RocksDB (trackTotalNumberOfRows: false) 3 3 0 3.2 309.9 0.1X
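Because the result rows in this listing are whitespace-separated, they can be read back into named fields for comparison across runs. The following is a minimal sketch, assuming every result row ends with the six columns named in the table headers; `BenchmarkRow` and `parse_row` are hypothetical names, and any leading diff marker is stripped if present.

```python
from typing import NamedTuple


class BenchmarkRow(NamedTuple):
    name: str
    best_ms: int
    avg_ms: int
    stdev_ms: int
    rate_m_per_s: float
    per_row_ns: float
    relative: float


def parse_row(line: str) -> BenchmarkRow:
    """Parse one result row such as 'In-memory 9 10 1 1.1 936.2 1.0X'."""
    # Drop an optional leading '+' or '-' diff marker and surrounding spaces.
    text = line.lstrip("+-").strip()
    # The case name may contain spaces, so split the six numeric columns
    # off the right-hand side and keep the remainder as the name.
    name, best, avg, stdev, rate, per_row, rel = text.rsplit(None, 6)
    return BenchmarkRow(
        name, int(best), int(avg), int(stdev),
        float(rate), float(per_row), float(rel.rstrip("X")),
    )


row = parse_row("RocksDB (trackTotalNumberOfRows: false) 171 177 3 0.1 17083.9 3.0X")
```

Splitting from the right keeps the parser independent of how many words the case name contains.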
