-
Notifications
You must be signed in to change notification settings - Fork 28.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-33635][SS] Adjust the order of check in KafkaTokenUtil.needTokenUpdate to remedy perf regression #31056
Conversation
…enUpdate to remedy perf regression
I just made the smallest change as a short-term solution (the problem will persist for end users using Kafka delegation token), as Spark 3.1.0 RC vote is happening and I don't want to drag the release too much. It would be ideal if we deal with long-term solution (making overhead of |
The further optimization should be depending on the possibility of "changes" in SparkConf on the fly. If we don't allow it (or at least Kafka/token related configuration), we just need to call |
In addition, the time in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a quick fix, it looks good as it just reorders the conditions. Even we will deal with long-term solution, I think it is still no harm to have the quick fix now.
Test build #133711 has finished for PR 31056 at commit
|
Kubernetes integration test starting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you, @HeartSaVioR and @viirya . I agree with you with this fix.
Merged to master/3.1/3.0.
…enUpdate to remedy perf regression ### What changes were proposed in this pull request? This PR proposes to adjust the order of check in KafkaTokenUtil.needTokenUpdate, so that short-circuit applies on the non-delegation token cases (insecure + secured without delegation token) and remedies the performance regression heavily. ### Why are the changes needed? There's a serious performance regression between Spark 2.4 vs Spark 3.0 on read path against Kafka data source. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually ran a reproducer (https://github.com/codegorillauk/spark-kafka-read with modification to just count instead of writing to Kafka topic) with measuring the time. > the branch applying the change with adding measurement https://github.com/HeartSaVioR/spark/commits/debug-SPARK-33635-v3.0.1 > the branch only adding measurement https://github.com/HeartSaVioR/spark/commits/debug-original-ver-SPARK-33635-v3.0.1 > the result (before the fix) count: 10280000 Took 41.634007047 secs 21/01/06 13:16:07 INFO KafkaDataConsumer: debug ver. 17-original 21/01/06 13:16:07 INFO KafkaDataConsumer: Total time taken to retrieve: 82118 ms > the result (after the fix) count: 10280000 Took 7.964058475 secs 21/01/06 13:08:22 INFO KafkaDataConsumer: debug ver. 17 21/01/06 13:08:22 INFO KafkaDataConsumer: Total time taken to retrieve: 987 ms Closes #31056 from HeartSaVioR/SPARK-33635. Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com> (cherry picked from commit fa93090) Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
…enUpdate to remedy perf regression ### What changes were proposed in this pull request? This PR proposes to adjust the order of check in KafkaTokenUtil.needTokenUpdate, so that short-circuit applies on the non-delegation token cases (insecure + secured without delegation token) and remedies the performance regression heavily. ### Why are the changes needed? There's a serious performance regression between Spark 2.4 vs Spark 3.0 on read path against Kafka data source. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually ran a reproducer (https://github.com/codegorillauk/spark-kafka-read with modification to just count instead of writing to Kafka topic) with measuring the time. > the branch applying the change with adding measurement https://github.com/HeartSaVioR/spark/commits/debug-SPARK-33635-v3.0.1 > the branch only adding measurement https://github.com/HeartSaVioR/spark/commits/debug-original-ver-SPARK-33635-v3.0.1 > the result (before the fix) count: 10280000 Took 41.634007047 secs 21/01/06 13:16:07 INFO KafkaDataConsumer: debug ver. 17-original 21/01/06 13:16:07 INFO KafkaDataConsumer: Total time taken to retrieve: 82118 ms > the result (after the fix) count: 10280000 Took 7.964058475 secs 21/01/06 13:08:22 INFO KafkaDataConsumer: debug ver. 17 21/01/06 13:08:22 INFO KafkaDataConsumer: Total time taken to retrieve: 987 ms Closes #31056 from HeartSaVioR/SPARK-33635. Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com> (cherry picked from commit fa93090) Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
cc @HyukjinKwon since he is the release manager of Spark 3.1.0. |
Kubernetes integration test status success |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Late LGTM.
Since |
What changes were proposed in this pull request?
This PR proposes to adjust the order of check in KafkaTokenUtil.needTokenUpdate, so that short-circuit applies on the non-delegation token cases (insecure + secured without delegation token) and remedies the performance regression heavily.
Why are the changes needed?
There's a serious performance regression between Spark 2.4 vs Spark 3.0 on read path against Kafka data source.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manually ran a reproducer (https://github.com/codegorillauk/spark-kafka-read with modification to just count instead of writing to Kafka topic) with measuring the time.
https://github.com/HeartSaVioR/spark/commits/debug-SPARK-33635-v3.0.1
https://github.com/HeartSaVioR/spark/commits/debug-original-ver-SPARK-33635-v3.0.1
count: 10280000
Took 41.634007047 secs
21/01/06 13:16:07 INFO KafkaDataConsumer: debug ver. 17-original
21/01/06 13:16:07 INFO KafkaDataConsumer: Total time taken to retrieve: 82118 ms
count: 10280000
Took 7.964058475 secs
21/01/06 13:08:22 INFO KafkaDataConsumer: debug ver. 17
21/01/06 13:08:22 INFO KafkaDataConsumer: Total time taken to retrieve: 987 ms