Close EPoll file descriptors during CRaC snapshotting #13308
Motivation:
This change intends to support an application using Netty with native epoll (e.g. Quarkus app) to perform the Checkpoint and Restore on JVM implementing this, specifically using OpenJDK CRaC or future versions of OpenJDK. Package org.crac is a facade that either forwards the invocation to actual implementation or provides a no-op implementation.
Modification:
The biggest risk factor here is that formerly final fields with file descriptors can change; theoretically this could affect performance in absence of checkpoint/restore process. However most likely a modern JVM is smart enough to assume that the field does not change, and optimistically consider it constant anyway.
Result:
File descriptors are closed before checkpoint and re-opened after restore. Eventloop thread is blocked during the snapshotting process to avoid using closed FDs.
Signed-off-by: Radim Vansa <rvansa@azul.com>
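The close-then-block behaviour described in the PR text above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `BlockingLoopSketch`, its integer `epollFd`, and all method names are hypothetical stand-ins, and the two callbacks only mirror the general before-checkpoint / after-restore shape of a CRaC resource without depending on `org.crac`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: none of these names come from Netty or org.crac.
final class BlockingLoopSketch {
    // Fake descriptor numbers; a real transport would call into native code.
    private static final AtomicInteger NEXT_FD = new AtomicInteger(3);

    private final ExecutorService loopThread = Executors.newSingleThreadExecutor();
    // Mutable on purpose: a final field could not be re-opened after restore.
    private volatile int epollFd = NEXT_FD.getAndIncrement();
    private CompletableFuture<Void> restoreSignal;
    private CompletableFuture<Void> reopened;

    int fd() {
        return epollFd;
    }

    // Returns only after the loop thread has closed its descriptor and parked.
    void beforeCheckpoint() {
        CompletableFuture<Void> closed = new CompletableFuture<>();
        CompletableFuture<Void> signal = new CompletableFuture<>();
        CompletableFuture<Void> done = new CompletableFuture<>();
        restoreSignal = signal;
        reopened = done;
        loopThread.execute(() -> {
            epollFd = -1;                         // "close" the descriptor
            closed.complete(null);
            signal.join();                        // block the loop until restore
            epollFd = NEXT_FD.getAndIncrement();  // "re-open" a fresh descriptor
            done.complete(null);
        });
        closed.join();
    }

    // Releases the parked loop thread and waits until the descriptor is back.
    void afterRestore() {
        restoreSignal.complete(null);
        reopened.join();
    }

    void shutdown() {
        loopThread.shutdown();
    }

    public static void main(String[] args) {
        BlockingLoopSketch loop = new BlockingLoopSketch();
        loop.beforeCheckpoint();
        System.out.println("during checkpoint, fd = " + loop.fd());
        loop.afterRestore();
        System.out.println("after restore, fd = " + loop.fd());
        loop.shutdown();
    }
}
```

Because the loop thread is parked between the two callbacks, no task queued on it can observe the closed descriptor, which is the property the Result paragraph relies on.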
@rvansa we can't depend on CRaC as a non-optional dependency, as it is GPL. See https://github.com/openjdk/crac/blob/crac/LICENSE
@normanmaurer Understood, I'll check for those licensing issues. I'll close this for now and reopen when I have the answer. Thanks!
@rvansa never mind, it seems like the facade is OK: https://github.com/CRaC/org.crac/blob/master/LICENSE ?
Yes, you're right, the facade is BSD. The main CRaC project inherits its license from OpenJDK, so GPL 2 with the Classpath exception.
Btw., is there a list of organization-level contributors for Netty? As I no longer work for Red Hat but have moved to Azul Systems, I should check whether they're already on that list.
@rvansa you mean in terms of the CLA?
Yes, I probably shouldn't sign it as an individual contributor.
@rvansa Azul is not on the list yet... Azul can sign it here: https://docs.google.com/forms/d/e/1FAIpQLSfXTK6SnWWFbR50DFhoZq2floCFOUBH3kG8sZP77im5Rknctg/viewform?formkey=dG9wTmhoeGNGd1MtdFdtVXl4TlVSNlE6MQ
Hi @rvansa! I would expect that the other transports need similar treatment (at least NIO? No idea if Mac users can benefit from CRaC).
Also, I think we would want to use some sort of "indirection", as we want to keep our core transports "zero dependency".
Hi @franz1981, NIO-based stuff is handled directly in the JDK. The reference implementation in https://github.com/openjdk/crac is limited to Linux, as it uses a slightly modified version of CRIU to perform the actual snapshot after getting things ready on the JVM side. @normanmaurer Do you think that shading the
@rvansa What about the file descriptors we have in open channels? What about the native memory we've allocated? How are these handled in the JVM?
@chrisvest The application is responsible for closing any open file descriptors via the same notifications as I present here, e.g. in Quarkus using the native EPoll we do that over here [1]. Without the native epoll implementation that was sufficient, but with native epoll I had to apply the tweak presented in this PR. Native memory is fine; CRIU will snapshot that and it will be restored transparently. In fact, CRIU could restore even some file descriptors, but in general keeping anything that communicates with anything outside the process (even e.g. the filesystem) is at risk. Therefore CRaC is more restrictive; the aim is to be able to migrate to a different machine and/or replicate the snapshotted process.
Thanks for the pull request, @rvansa! I would expect some automated test to be part of this pull request, though.
@trustin Running a full-blown test would require downloading a custom JDK and CRIU binary, and requires Linux (CRaC currently does not support Macs). Possibly this could be simplified using a Docker image, but in any case it would be pretty heavyweight.
@rvansa just curious: why not use an external mechanism to do it?
@franz1981 I am not sure if I follow. In the use case I have, the existing connections are closed, but in addition to that this PR needs to close these extra file descriptors. By blocking the eventloop I make sure that there's no one that could observe the internal, closed state.
What I mean, @rvansa, is that you can open up the relevant methods (to close fds and mark the state so that no pending work can ever be submitted, and/or to recreate fds) on the event loop (group) in order to do the same from outside (as an @unstable API) and be sure that all event loops have reached this "quiescent" state without preventing each other from making progress.
That's my point; in terms of responsibility:
Maybe the person that performs the fix is the same, but in one case you have a talking Unstable API and in the other a supported snapshotting mechanism.
Exactly, my goal is that anyone using Netty as a dependency does not have to care about implementation details. If this were somehow exposed in the API (exposing some suspend/resume methods but not implementing the Resource and registering for checkpoint), I don't see how this could be satisfied. My goal is really to not push any burden downstream. I don't think that we can solve the most generic case of multiple interdependent eventloops here. I would have no objections to making those
In my previous comment I was referring to a situation where 2 client connection pools used the same event loop, so trying to block the eventloop there 2 times did not work. If, in that case, I could guarantee that the eventloop would be blocked anyway (it was not, as that was not the epoll impl. but NIO), some parts of the code would not be needed at all.
I see that you're asking for a generic solution to a problem we don't have and don't know; we only theorize about it. Without a proper problem definition it's hard to sketch a solution. Therefore I propose to fix the problem we can already define, and add a little bit of flexibility by enabling subclasses to override the behaviour. Would that work for you?
I would prefer to keep any external deps outside Netty, as said in the last 2 points in my previous comment.
If CRaC becomes the de facto standard/a solution supported universally on the JDK, then I see no harm in bringing such a dep in. Beware: mine is not the POV of the project lead/maintainer, I'm just a committer, so @normanmaurer can give his POV on this one.
I agree with @franz1981 here... I would like to keep the core zero-dependency. Can't we maybe fix this with some sort of SPI?
An SPI solution might be possible, though anything that requires user intervention will complicate life for users:
I think that all this is too much fuss, so I have doubts whether it would happen. Before CRaC becomes part of a future OpenJDK, do you think that substantial user interest might convince you to integrate it as a direct dependency?
@rvansa yeah, if there is enough interest we can revisit... For now I would prefer not to add this dependency. Like I said, if we can make it work with an SPI or something like that, I would be OK with it.
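One way the SPI idea raised in this thread could look is sketched below, purely as an assumption about its shape: the core defines a small hook interface and discovers an implementation with the JDK's `java.util.ServiceLoader`, falling back to a no-op when nothing is on the classpath. `SnapshotSupport`, `register`, and `SpiSketch` are hypothetical names; nothing like them is confirmed to exist in Netty.

```java
import java.util.ServiceLoader;

// Hypothetical sketch of a zero-dependency SPI; not an existing Netty API.
final class SpiSketch {
    /** Hook that an optional module (e.g. one depending on org.crac) could implement. */
    public interface SnapshotSupport {
        void register(Runnable beforeCheckpoint, Runnable afterRestore);
    }

    /** Used when no provider is found: the callbacks are simply never invoked. */
    static final SnapshotSupport NO_OP = (beforeCheckpoint, afterRestore) -> { };

    /** Returns the first provider found on the classpath, or the no-op fallback. */
    static SnapshotSupport load() {
        for (SnapshotSupport support : ServiceLoader.load(SnapshotSupport.class)) {
            return support;
        }
        return NO_OP;
    }

    public static void main(String[] args) {
        SnapshotSupport support = load();
        support.register(
                () -> System.out.println("closing FDs"),
                () -> System.out.println("re-opening FDs"));
        System.out.println("provider found: " + (support != NO_OP));
    }
}
```

With this shape the core keeps no compile-time dependency on any snapshotting library, at the cost that users must add a provider artifact themselves, which is the extra user intervention objected to above.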
Based on the previous discussion, I'll close this for now and eventually prepare another PR that will allow the FDs to be closed externally.
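A rough sketch of what "closed externally" might mean, following the suggestion earlier in the thread that each event loop expose suspend/resume operations and that the caller wait for all loops to become quiescent. Everything here is hypothetical (`ExternalSuspendSketch`, `Loop`, `suspendAll`, the boolean standing in for a descriptor); it is not the follow-up PR's design.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: an externally driven suspend/resume, not a Netty API.
final class ExternalSuspendSketch {
    static final class Loop {
        private final ExecutorService thread = Executors.newSingleThreadExecutor();
        private boolean suspended;               // guarded by "this"
        private volatile boolean fdOpen = true;  // stand-in for a native descriptor

        // Rejects new work once suspension has been requested.
        synchronized boolean submit(Runnable task) {
            if (suspended) {
                return false;
            }
            thread.execute(task);
            return true;
        }

        // Queues the close behind already accepted tasks; does not park the thread.
        synchronized CompletableFuture<Void> suspend() {
            suspended = true;
            return CompletableFuture.runAsync(() -> fdOpen = false, thread);
        }

        // Re-opens the descriptor first, then starts accepting work again.
        synchronized CompletableFuture<Void> resume() {
            return CompletableFuture.runAsync(() -> fdOpen = true, thread)
                    .thenRun(this::acceptWork);
        }

        private synchronized void acceptWork() {
            suspended = false;
        }

        boolean isFdOpen() {
            return fdOpen;
        }

        void shutdown() {
            thread.shutdown();
        }
    }

    // Completes once every loop has closed its descriptor.
    static CompletableFuture<Void> suspendAll(List<Loop> loops) {
        CompletableFuture<?>[] pending = new CompletableFuture<?>[loops.size()];
        for (int i = 0; i < pending.length; i++) {
            pending[i] = loops.get(i).suspend();
        }
        return CompletableFuture.allOf(pending);
    }

    public static void main(String[] args) {
        List<Loop> loops = List.of(new Loop(), new Loop());
        suspendAll(loops).join();
        System.out.println("all loops quiescent");
        for (Loop loop : loops) {
            loop.resume().join();
            loop.shutdown();
        }
    }
}
```

Unlike the blocking approach in this PR, the loops here are suspended independently, so one loop waiting on another cannot stall the whole group; the trade-off is that the caller, not Netty, decides when to suspend and resume.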