Test Spark 4.0.0-SNAPSHOT #1909
base: main
java/bench/spark/pom.xml
Outdated
<exclude>META-INF/DUMMY.DSA</exclude>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
Spark has since fixed this problem by upgrading its arrow-vector version ([SPARK-47981][BUILD] Upgrade Arrow to 16.0.0), so no modification is needed here.
@@ -74,7 +74,7 @@
 @BenchmarkMode(Mode.AverageTime)
 @OutputTimeUnit(TimeUnit.MICROSECONDS)
 @AutoService(OrcBenchmark.class)
-@Fork(jvmArgsAppend = "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED")
+@Fork(jvmArgsAppend = {"--add-opens=java.base/sun.nio.ch=ALL-UNNAMED", "--add-opens=java.base/sun.util.calendar=ALL-UNNAMED"})
Caused by: java.lang.IllegalAccessException: symbolic reference class is not accessible: class sun.util.calendar.ZoneInfo, from interface org.apache.spark.sql.catalyst.util.SparkDateTimeUtils (unnamed module @2b71fc7e)
at java.base/java.lang.invoke.MemberName.makeAccessException(MemberName.java:955)
at java.base/java.lang.invoke.MethodHandles$Lookup.checkSymbolicClass(MethodHandles.java:3686)
at java.base/java.lang.invoke.MethodHandles$Lookup.resolveOrFail(MethodHandles.java:3646)
at java.base/java.lang.invoke.MethodHandles$Lookup.findVirtual(MethodHandles.java:2680)
at org.apache.spark.sql.catalyst.util.SparkDateTimeUtils.$init$(SparkDateTimeUtils.scala:206)
at org.apache.spark.sql.catalyst.util.DateTimeUtils$.<clinit>(DateTimeUtils.scala:41)
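For anyone reproducing this locally, the following is a minimal sketch (not part of the PR) that reports whether `java.base` opens a given package to the calling code. The class name `AddOpensCheck` and its helper methods are hypothetical names used only for illustration; the sketch assumes JDK 9 or later, since it relies on `java.lang.Module`. On a JDK that strongly encapsulates internals, `sun.util.calendar` is expected to show as not open unless the matching `--add-opens` flag is passed to the JVM.

```java
// Hypothetical helper, not part of ORC or Spark: checks package openness at runtime.
class AddOpensCheck {

    /** Builds the JVM flag that opens a java.base package to all unnamed modules. */
    static String addOpensFlag(String packageName) {
        return "--add-opens=java.base/" + packageName + "=ALL-UNNAMED";
    }

    /** Returns true if java.base opens the given package to this class's module. */
    static boolean isOpenToCaller(String packageName) {
        Module javaBase = Object.class.getModule();
        Module caller = AddOpensCheck.class.getModule();
        return javaBase.isOpen(packageName, caller);
    }

    public static void main(String[] args) {
        // The two packages named in the @Fork annotation of the diff above.
        String[] packages = {"sun.nio.ch", "sun.util.calendar"};
        for (String pkg : packages) {
            boolean open = isOpenToCaller(pkg);
            System.out.println(pkg + " open: " + open
                + (open ? "" : " (consider " + addOpensFlag(pkg) + ")"));
        }
    }
}
```

Running it once without flags and once with `--add-opens=java.base/sun.util.calendar=ALL-UNNAMED` should show the difference that the `@Fork(jvmArgsAppend = ...)` change is meant to produce in the forked benchmark JVM.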
java/bench/spark/src/java/org/apache/orc/bench/spark/SparkBenchmark.java
Outdated
Thank you for testing this. 😄
I'd recommend creating a JIRA for the migration to Scala 2.13 of Apache Spark 3.5.1 first. :)
…mark

### What changes were proposed in this pull request?
This PR aims to migrate to Scala 2.13 of Apache Spark 3.5.1 at SparkBenchmark.

### Why are the changes needed?
#1909 (review)

### How was this patch tested?
local test

```bash
java -jar spark/target/orc-benchmarks-spark-2.1.0-SNAPSHOT.jar spark data -format=parquet -compress zstd -data taxi
```

```
Benchmark                                  (compression)  (dataset)  (format)  Mode  Cnt          Score       Error  Units
SparkBenchmark.partialRead                          zstd       taxi   parquet  avgt    5      17211.731 ± 11836.315  us/op
SparkBenchmark.partialRead:bytesPerRecord           zstd       taxi   parquet  avgt    5          0.002                  #
SparkBenchmark.partialRead:ops                      zstd       taxi   parquet  avgt    5         10.000                  #
SparkBenchmark.partialRead:perRecord                zstd       taxi   parquet  avgt    5          0.001 ±     0.001  us/op
SparkBenchmark.partialRead:records                  zstd       taxi   parquet  avgt    5  113791180.000                  #
```

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #1912 from cxzl25/ORC-1704.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…mark (cherry picked from commit dc634cb) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Hi, @cxzl25. Sorry for asking this, but could you rebase this PR once more?
Thank you!
java/bench/spark/src/java/org/apache/orc/bench/spark/SparkBenchmark.java
Outdated
…nchmark runs on JDK17

### What changes were proposed in this pull request?
This PR aims to fix `sun.util.calendar` IllegalAccessException when SparkBenchmark runs on JDK17.

### Why are the changes needed?
#1909 (comment)

### How was this patch tested?
GA

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #1919 from cxzl25/ORC-1707.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
…nchmark runs on JDK17 (cherry picked from commit 5bb2346) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
Why are the changes needed?
How was this patch tested?
GA
Was this patch authored or co-authored using generative AI tooling?
No