ORC-1704: Migration to Scala 2.13 of Apache Spark 3.5.1 at SparkBenchmark #1912

cxzl25 · 2024-04-25T03:27:39Z

What changes were proposed in this pull request?

This PR aims to migrate SparkBenchmark to the Scala 2.13 build of Apache Spark 3.5.1.

Why are the changes needed?

#1909 (review)

How was this patch tested?

Tested locally by running the benchmark:

java -jar spark/target/orc-benchmarks-spark-2.1.0-SNAPSHOT.jar spark data -format=parquet -compress zstd -data taxi

Benchmark                                  (compression)  (dataset)  (format)  Mode  Cnt          Score       Error  Units
SparkBenchmark.partialRead                          zstd       taxi   parquet  avgt    5      17211.731 ± 11836.315  us/op
SparkBenchmark.partialRead:bytesPerRecord           zstd       taxi   parquet  avgt    5          0.002                  #
SparkBenchmark.partialRead:ops                      zstd       taxi   parquet  avgt    5         10.000                  #
SparkBenchmark.partialRead:perRecord                zstd       taxi   parquet  avgt    5          0.001 ±     0.001  us/op
SparkBenchmark.partialRead:records                  zstd       taxi   parquet  avgt    5  113791180.000                  #

Was this patch authored or co-authored using generative AI tooling?

No

dongjoon-hyun

+1, LGTM. Thank you, @cxzl25 .

…mark

### What changes were proposed in this pull request?

This PR aims to migrate to Scala 2.13 of Apache Spark 3.5.1 at SparkBenchmark.

### Why are the changes needed?

#1909 (review)

### How was this patch tested?

local test

```bash
java -jar spark/target/orc-benchmarks-spark-2.1.0-SNAPSHOT.jar spark data -format=parquet -compress zstd -data taxi
```

```
Benchmark                                  (compression)  (dataset)  (format)  Mode  Cnt          Score       Error  Units
SparkBenchmark.partialRead                          zstd       taxi   parquet  avgt    5      17211.731 ± 11836.315  us/op
SparkBenchmark.partialRead:bytesPerRecord           zstd       taxi   parquet  avgt    5          0.002                  #
SparkBenchmark.partialRead:ops                      zstd       taxi   parquet  avgt    5         10.000                  #
SparkBenchmark.partialRead:perRecord                zstd       taxi   parquet  avgt    5          0.001 ±     0.001  us/op
SparkBenchmark.partialRead:records                  zstd       taxi   parquet  avgt    5  113791180.000                  #
```

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #1912 from cxzl25/ORC-1704.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit dc634cb)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

dongjoon-hyun · 2024-04-25T04:12:02Z

Merged to main/2.0.

2.13

e24892b

github-actions bot added BUILD JAVA labels Apr 25, 2024

dongjoon-hyun approved these changes Apr 25, 2024

dongjoon-hyun added this to the 2.0.1 milestone Apr 25, 2024

dongjoon-hyun closed this in dc634cb Apr 25, 2024
