[JENKINS-19453] Interrupted class loading can lead to NoClassDefFoundError #19
I get what I think is the expected stack trace when the new test fails:
I feel like the right solution would be that when we catch
Tried
without success.
        }
    }
});
Thread.sleep(2500);
This is so that the first two class loads succeed but the third fails. A better test would use semaphores rather than timing (cf. the test before this one).
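A minimal sketch of what semaphore-based coordination could look like, in place of a fixed `Thread.sleep`. This is not the test from the pull request: the class name, the method name, and the three "steps" are hypothetical stand-ins for the three class loads, and the never-released semaphore stands in for a blocking remote call.

```java
import java.util.concurrent.Semaphore;

class SemaphoreCoordinatedInterrupt {
    /**
     * Runs a worker that performs three steps and interrupts it only once it
     * has signalled that the first two are done. Returns the number of steps
     * that completed.
     */
    static int runAndInterruptAtThirdStep() throws InterruptedException {
        Semaphore reachedThirdStep = new Semaphore(0);
        Semaphore neverReleased = new Semaphore(0); // hypothetical stand-in for a blocking remote call
        int[] completed = {0};

        Thread worker = new Thread(() -> {
            try {
                completed[0]++;             // step 1 succeeds
                completed[0]++;             // step 2 succeeds
                reachedThirdStep.release(); // tell the test thread where we are
                neverReleased.acquire();    // step 3 blocks until interrupted
                completed[0]++;             // not reached in this sketch
            } catch (InterruptedException e) {
                // step 3 was interrupted, as the test intends
            }
        });
        worker.start();

        reachedThirdStep.acquire(); // wait for the signal instead of sleeping
        worker.interrupt();
        worker.join();              // join makes the worker's writes visible here
        return completed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("steps completed: " + runAndInterruptAtThirdStep());
    }
}
```

Because `Semaphore.acquire` also throws `InterruptedException` when the interrupt flag is already set on entry, the outcome does not depend on whether the interrupt arrives just before or just after the worker starts blocking.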
Thank you for a pull request! Please check this document for how the Jenkins project handles pull requests.
So the problem here is that if the thread gets interrupted while
I think the crux of the issue is that, in general, a failed class load is never retried. That is certainly the case when another class load has triggered a recursive class load, as in this test case.

In #19 Jesse proposed to fix this by dropping the whole RemoteClassLoader. But this will not work, since that RemoteClassLoader could have loaded other classes that might already be running. Dropping a classloader will not drop these classes, and we end up just creating another classloader that loads incompatible classes. That is a disaster waiting to happen.

In this fix, we simply make findClass non-interruptible. That is, if a blocking remote operation gets interrupted, we catch that, retry, and refuse to give up. We remember to set the interrupt flag back on, however, so that the interrupt signal doesn't disappear. The net effect is as if the interrupt was delivered a bit later than usual.

The way I see it, this is the only way to fix this. The nesting of control structures here is horrible, but I can't think of any better way to write this.
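The retry-and-restore pattern described above can be sketched roughly as follows. This is a generic illustration using only `java.util.concurrent`, not the actual `RemoteClassLoader` change: the class name `UninterruptibleGet` and the helper `getUninterruptibly` are hypothetical, and the `CompletableFuture` stands in for the remote class-image fetch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

class UninterruptibleGet {
    /**
     * Waits for the future to complete, retrying whenever the wait is
     * interrupted. The interrupt flag is set again before returning, so the
     * caller still observes the interrupt, only later than usual.
     */
    public static <T> T getUninterruptibly(Future<T> future) throws ExecutionException {
        boolean interrupted = false;
        try {
            while (true) {
                try {
                    return future.get();
                } catch (InterruptedException e) {
                    // Remember the interrupt, but refuse to give up on the wait.
                    interrupted = true;
                }
            }
        } finally {
            if (interrupted) {
                // Restore the flag so the interrupt signal does not disappear.
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<byte[]> img = new CompletableFuture<>();
        // Complete the future from another thread after a short delay.
        new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException ignored) {
                // not expected in this sketch
            }
            img.complete(new byte[] {1, 2, 3});
        }).start();

        Thread.currentThread().interrupt(); // simulate an interrupt arriving mid-load
        byte[] bytes = getUninterruptibly(img);
        System.out.println("bytes: " + bytes.length
                + ", interrupt flag restored: " + Thread.currentThread().isInterrupted());
    }
}
```

The `finally` block is what keeps the interrupt from being swallowed: the wait itself cannot be cut short, but the flag is still set when control returns to the caller.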
Future<byte[]> img = cr.classImage.resolve(channel, name.replace('.', '/') + ".class");
if (img.isDone()) {
    boolean interrupted = false;
    try {//
Did you mean to add a comment here?
I think your approach is right. Since unfortunately Java class loading is not atomic with respect to static initializers (i.e. the JVM fails to cleanly roll back to the prior state if some classes are loaded but then an initializer blocks loading of one), probably the best we can do is to just doggedly force the class to be loaded.
Rebased and integrated toward 2.33.
JENKINS-19453: interrupting class loading irrecoverably breaks the class being loaded. Filing for evaluation; so far just have a failing test.