-
Notifications
You must be signed in to change notification settings - Fork 28k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-20247][CORE] Add jar but this jar is missing later shouldn't affect jobs that doesn't use this jar #17558
Conversation
Test build #75590 has started for PR 17558 at commit |
@wangyum what if the task requires that jar? From your fix what I got is that you catch the exception and make it warning log instead, but what if that task requires the jar, will your fix suppress the exception or defer the exception to others like |
Test FAILed. |
agreed, why would the jar be missing? |
|
@wangyum , the fix of your PR is more like a bug fix, whereas the comment above is actually a feature request, these two things are not completely matched. I would suggest to focus on shriftserver side to address your requirement, not a bandaid like fix in the executor side. |
gentle ping @wangyum, is there any opinion on ^ ? |
@HyukjinKwon |
Just for clarity, this approach is broken - removal of jar will not cause already loaded classes in jvm to be removed. Pursuing this approach is brittle and will definitely break when classes from the jar have already been used : causing runtime failures. Proper resolution would be for appropriate classloader to be dropped and new one created with the replaced jar (refer to how j2ee containers handle this scenario for war files (for example)). |
I just checked out the spark branch-2.1 branch. I can build everything successfully if I do not run the tests, but if I run the tests, I see this unit test failure:
Is the there a problem with the build, or do is there something I can do to fix this locally? |
Just to make sure, are they consistently being failed or just flaky tests? |
I ran it again, and got a different failure this time. Still in the core module, but not sure if its before or after the tests that failed the first time.
I'll try again. It takes a long time to run each time. Over 20 minutes just to get to the failed test, and its not even 1/3 of the way through all the tests. |
The 3rd time I ran, it ran for 42 minutes, and failed further on in catalyst tests. Like you say, it does seem that the tests are flaky, but why? The failures seem so random.
|
The 4th time it failed here again:
|
It continues to fail with one of the above errors. Here is the command I use to build. |
I ran one more time and the tests hung on serializer manager integration.
I will checkout based on the 2.1.1 tag next. That should be stable. |
What changes were proposed in this pull request?
Catch exception when jar is missing, as SPARK-20247 described.
How was this patch tested?
unit tests and manual tests