[SPARK-25658][SQL][TEST] Refactor HashByteArrayBenchmark to use main method #22652
Test build #97040 has finished for PR 22652 at commit
retest this please
Test build #97043 has finished for PR 22652 at commit
retest this please
Test build #97047 has finished for PR 22652 at commit
 * {{{
 *   1. without sbt: bin/spark-submit --class <this class> <spark sql test jar>
 *   2. build/sbt "sql/test:runMain <this class>"
 *   3. generate result: SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain <this class>"
`sql/test` -> `catalyst/test`. If we use `sql/test`, the result will be generated in the `sql` module instead of the `catalyst` module.
It's for both line 32 and 33.
You are right. Thanks
Also, please check line 31, too.
Could you review and merge wangyum#17?
Test build #97059 has finished for PR 22652 at commit
Test build #97060 has finished for PR 22652 at commit
Test build #97066 has finished for PR 22652 at commit
import org.apache.spark.sql.catalyst.expressions.{HiveHasher, XXH64}
import org.apache.spark.unsafe.Platform
import org.apache.spark.unsafe.hash.Murmur3_x86_32

/**
 * Synthetic benchmark for MurMurHash 3 and xxHash64.
 * To run this benchmark:
 * {{{
 *   1. without sbt: bin/spark-submit --class <this class> <spark catalyst test jar>
Is this a correct guide? `BenchmarkBase` is in a different jar file, isn't it?
It seems that we missed this because we thought this was a legacy guide that had worked before.
Yes, you are right:
LM-SHC-16502798:spark yumwang$ bin/spark-submit --class org.apache.spark.sql.HashByteArrayBenchmark ./sql/catalyst/target/spark-catalyst_2.11-3.0.0-SNAPSHOT-tests.jar
18/10/07 07:35:09 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/benchmark/BenchmarkBase
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    ......
The correct usage should be:
bin/spark-submit --class org.apache.spark.sql.HashByteArrayBenchmark --jars ./core/target/spark-core_2.11-3.0.0-SNAPSHOT-tests.jar ./sql/catalyst/target/spark-catalyst_2.11-3.0.0-SNAPSHOT-tests.jar
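Since the failure above only shows up once the JVM starts loading classes, it may help to check for both test jars up front. The following is a minimal sketch, not part of Spark: the function name `benchmark_cmd` and its argument order are hypothetical, and it only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark): print the spark-submit command for a
# benchmark class, but only if both test jars are present on disk.
benchmark_cmd() {
  local class="$1" core_tests_jar="$2" module_tests_jar="$3" jar
  for jar in "$core_tests_jar" "$module_tests_jar"; do
    if [ ! -f "$jar" ]; then
      echo "missing test jar: $jar" >&2
      return 1
    fi
  done
  # --jars puts the core tests jar (which holds BenchmarkBase) on the classpath.
  echo "bin/spark-submit --class $class --jars $core_tests_jar $module_tests_jar"
}
```

Called with the class name and the two jar paths from the command above, it prints that same command. If either jar has not been built yet, it returns non-zero and names the missing file.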
Test build #97075 has finished for PR 22652 at commit
Test build #97074 has finished for PR 22652 at commit
retest this please
Test build #97077 has finished for PR 22652 at commit
Thank you, @wangyum .
+1, LGTM. Merged to master.
…method

## What changes were proposed in this pull request?

Refactor `HashByteArrayBenchmark` to use main method.
1. use `spark-submit`:
```console
bin/spark-submit --class org.apache.spark.sql.HashByteArrayBenchmark --jars ./core/target/spark-core_2.11-3.0.0-SNAPSHOT-tests.jar ./sql/catalyst/target/spark-catalyst_2.11-3.0.0-SNAPSHOT-tests.jar
```

2. Generate benchmark result:
```console
SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "catalyst/test:runMain org.apache.spark.sql.HashByteArrayBenchmark"
```

## How was this patch tested?

manual tests

Closes apache#22652 from wangyum/SPARK-25658.

Lead-authored-by: Yuming Wang <wgyumg@gmail.com>
Co-authored-by: Yuming Wang <yumwang@ebay.com>
Co-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
Refactor `HashByteArrayBenchmark` to use main method.
1. use `spark-submit`: `bin/spark-submit --class org.apache.spark.sql.HashByteArrayBenchmark --jars ./core/target/spark-core_2.11-3.0.0-SNAPSHOT-tests.jar ./sql/catalyst/target/spark-catalyst_2.11-3.0.0-SNAPSHOT-tests.jar`
2. Generate benchmark result: `SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "catalyst/test:runMain org.apache.spark.sql.HashByteArrayBenchmark"`
How was this patch tested?
manual tests