-
Notifications
You must be signed in to change notification settings - Fork 28.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-32130][SQL][FOLLOWUP] Enable timestamps inference in JsonBenchmark #28981
[SPARK-32130][SQL][FOLLOWUP] Enable timestamps inference in JsonBenchmark #28981
Conversation
@dongjoon-hyun May I ask you to review this PR. |
Thank you for a quick follow-up~ |
read date from files 16583 16612 29 0.6 1658.3 0.2X | ||
timestamp strings 3927 3963 40 2.5 392.7 0.7X | ||
parse timestamps from Dataset[String] 52827 53004 243 0.2 5282.7 0.0X | ||
infer timestamps from Dataset[String] 101108 101644 769 0.1 10110.8 0.0X |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
from_json(date) 43755 43782 27 0.2 4375.5 0.1X | ||
read timestamp text from files 2616 2641 34 3.8 261.6 1.0X | ||
read timestamps from files 37481 37517 58 0.3 3748.1 0.1X | ||
infer timestamps from files 84774 84964 201 0.1 8477.4 0.0X |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. I checked both places in the benchmark code and test result.
…mark ### What changes were proposed in this pull request? Set the JSON option `inferTimestamp` to `true` for the cases that measure perf of timestamp inference. ### Why are the changes needed? The PR #28966 disabled timestamp inference by default. As a consequence, some benchmarks don't measure perf of timestamp inference from JSON fields. This PR explicitly enable such inference. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By re-generating results of `JsonBenchmark`. Closes #28981 from MaxGekk/json-inferTimestamps-disable-by-default-followup. Authored-by: Max Gekk <max.gekk@gmail.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 42f01e3) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Merged to master/3.0. This is irrelevant to Jenkins unit tests. |
What changes were proposed in this pull request?
Set the JSON option
inferTimestamp
totrue
for the cases that measure perf of timestamp inference.Why are the changes needed?
The PR #28966 disabled timestamp inference by default. As a consequence, some benchmarks don't measure perf of timestamp inference from JSON fields. This PR explicitly enable such inference.
Does this PR introduce any user-facing change?
No
How was this patch tested?
By re-generating results of
JsonBenchmark
.