[SPARK-34115][CORE] Check SPARK_TESTING as lazy val to avoid slowdown #31244
…s with many environment variables.
cc @dongjoon-hyun and @holdenk FYI. According to the JIRA, this seems to particularly affect K8S, which automatically creates many environment variables.
Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
Jenkins, ok to test
Test build #134225 has started for PR 31244 at commit
Kubernetes integration test starting
Kubernetes integration test status success
retest this please
Kubernetes integration test starting
Kubernetes integration test status success
Test build #134228 has finished for PR 31244 at commit
Merged to master, branch-3.1 and branch-3.0. It is technically an improvement but I think it's really safe to backport.
Thanks for your contribution @nob13
### What changes were proposed in this pull request?

Check SPARK_TESTING as a lazy val to avoid a slowdown when there are many environment variables.

### Why are the changes needed?

If there are many environment variables, sys.env is very slow. Because Utils.isTesting is called very often during DataFrame optimization, this can slow down evaluation considerably.

An example that triggers the problem can be found in the bug ticket https://issues.apache.org/jira/browse/SPARK-34115

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

With the example provided in the ticket.

Closes #31244 from nob13/bug/34115.

Lead-authored-by: Norbert Schultz <norbert.schultz@reactivecore.de>
Co-authored-by: Norbert Schultz <noschultz@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit c3d8352)
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
Looks good to me, too.
nit: Could you update the PR description accordingly?
Thank you, @nob13 and @HyukjinKwon and @maropu
What changes were proposed in this pull request?
Check SPARK_TESTING as a lazy val to avoid a slowdown when there are many environment variables.
Why are the changes needed?
If there are many environment variables, sys.env is very slow. Because Utils.isTesting is called very often during DataFrame optimization, this can slow down evaluation considerably.
An example that triggers the problem can be found in the bug ticket https://issues.apache.org/jira/browse/SPARK-34115
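The cost comes from how the lookup is written: in the Scala standard library, `sys.env` is a method that copies the process environment into a new immutable map each time it is called, so a check that runs on every optimizer step pays that cost repeatedly. The following is a minimal sketch of the lazy val idea, not the actual patch to `Utils`; the object `TestingFlag`, its method names, and the `envLookups` counter are hypothetical and exist only for illustration.

```scala
object TestingFlag {
  // Hypothetical counter, present only so the sketch can show how often
  // the environment is actually read. Not thread-safe; illustration only.
  var envLookups: Int = 0

  // Reads the flag from the environment. Every call to sys.env builds a
  // fresh immutable Map from the process environment.
  private def readEnvFlag(): Boolean = {
    envLookups += 1
    sys.env.contains("SPARK_TESTING")
  }

  // Uncached variant: the environment is copied on every call.
  def isTestingUncached: Boolean = readEnvFlag()

  // Cached variant: the lazy val is evaluated on first access and the
  // result is reused afterwards.
  private lazy val cachedEnvFlag: Boolean = readEnvFlag()
  def isTestingCached: Boolean = cachedEnvFlag
}
```

Calling `TestingFlag.isTestingCached` any number of times reads the environment once, whereas each call to `isTestingUncached` reads it again. The trade-off is that the cached value never changes, which is acceptable here on the assumption that the JVM's view of the environment is fixed for the life of the process.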
Does this PR introduce any user-facing change?
No
How was this patch tested?
With the example provided in the ticket.