ORC-1704: Migration to Scala 2.13 of Apache Spark 3.5.1 at SparkBenchmark #1912

Closed

@cxzl25 cxzl25 commented Apr 25, 2024

What changes were proposed in this pull request?

This PR migrates SparkBenchmark to the Scala 2.13 build of Apache Spark 3.5.1.
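
For context, Spark's Maven artifacts carry the Scala binary version as a suffix on the artifact ID, so a migration like this usually means switching from the `_2.12` to the `_2.13` artifacts. A minimal sketch, assuming the benchmark module depends on `spark-sql` directly; the actual `pom.xml` changes in this PR are not shown on this page, so treat the fragment as illustrative only:

```xml
<!-- Hypothetical fragment: the real dependency list in this PR may differ. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <!-- The _2.13 suffix selects the Scala 2.13 build of Spark. -->
  <artifactId>spark-sql_2.13</artifactId>
  <version>3.5.1</version>
</dependency>
```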

Why are the changes needed?

#1909 (review)

How was this patch tested?

local test

```bash
java -jar spark/target/orc-benchmarks-spark-2.1.0-SNAPSHOT.jar spark data -format=parquet -compress zstd -data taxi
```

```
Benchmark                                  (compression)  (dataset)  (format)  Mode  Cnt          Score       Error  Units
SparkBenchmark.partialRead                          zstd       taxi   parquet  avgt    5      17211.731 ± 11836.315  us/op
SparkBenchmark.partialRead:bytesPerRecord           zstd       taxi   parquet  avgt    5          0.002                  #
SparkBenchmark.partialRead:ops                      zstd       taxi   parquet  avgt    5         10.000                  #
SparkBenchmark.partialRead:perRecord                zstd       taxi   parquet  avgt    5          0.001 ±     0.001  us/op
SparkBenchmark.partialRead:records                  zstd       taxi   parquet  avgt    5  113791180.000                  #
```

Was this patch authored or co-authored using generative AI tooling?

No


@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you, @cxzl25.

@dongjoon-hyun dongjoon-hyun added this to the 2.0.1 milestone Apr 25, 2024
dongjoon-hyun pushed a commit that referenced this pull request Apr 25, 2024
…mark

Closes #1912 from cxzl25/ORC-1704.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit dc634cb)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun

Merged to main/2.0.
