Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close EPoll file descriptors during CRaC snapshotting #13308

Closed
wants to merge 1 commit into from

Conversation

rvansa
Copy link
Contributor

@rvansa rvansa commented Mar 30, 2023

Motivation:

This change intends to support an application using Netty with native epoll (e.g. Quarkus app) to perform the Checkpoint and Restore on JVM implementing this, specifically using OpenJDK CRaC or future versions of OpenJDK. Package org.crac is a facade that either forwards the invocation to actual implementation or provides a no-op implementation.

Modification:

The biggest risk factor here is that formerly final fields with file descriptors can change; theoretically this could affect performance in absence of checkpoint/restore process. However most likely a modern JVM is smart enough to assume that the field does not change, and optimistically consider it constant anyway.

Result:

File descriptors are closed before checkpoint and re-opened after restore. Eventloop thread is blocked during the snapshotting process to avoid using closed FDs.

Motivation:

This change intends to support an application using Netty with native epoll (e.g. Quarkus app) to perform the Checkpoint and Restore on JVM implementing this, specifically using OpenJDK CRaC or future versions of OpenJDK. Package org.crac is a facade that either forwards the invocation to actual implementation or provides a no-op implementation.

Modification:

The biggest risk factor here is that formerly final fields with file descriptors can change; theoretically this could affect performance in absence of checkpoint/restore process. However most likely a modern JVM is smart enough to assume that the field does not change, and optimistically consider it constant anyway.

Result:

File descriptors are closed before checkpoint and re-opened after restore. Eventloop thread is blocked during the snapshotting process to avoid using closed FDs.

Signed-off-by: Radim Vansa <rvansa@azul.com>
@normanmaurer
Copy link
Member

@rvansa we can't depend on crac as an non-optional dependency as it is GPL . See https://github.com/openjdk/crac/blob/crac/LICENSE

@rvansa
Copy link
Contributor Author

rvansa commented Mar 30, 2023

@normanmaurer Understood, I'll check for those licensing issues. I'll close this for now and reopen when I have the answer. Thanks!

@rvansa rvansa closed this Mar 30, 2023
@normanmaurer
Copy link
Member

@rvansa never mind it seems like the facade is ok https://github.com/CRaC/org.crac/blob/master/LICENSE ?

@rvansa rvansa reopened this Mar 30, 2023
@rvansa
Copy link
Contributor Author

rvansa commented Mar 30, 2023

Yes, you're right, the facade is BSD. Main CRaC project inherits OpenJDK, so GPL 2 with classpath exception.

@rvansa
Copy link
Contributor Author

rvansa commented Mar 30, 2023

Btw. is there a list for organization-level contributors for Netty? As I no longer work for Red Hat, but moved to Azul Systems I should check if they're already on that list.

@normanmaurer
Copy link
Member

@rvansa you mean in terms of CLA ?

@rvansa
Copy link
Contributor Author

rvansa commented Mar 30, 2023

Yes, I probably shouldn't sign it as individual contributor.

@normanmaurer
Copy link
Member

Copy link
Contributor

@franz1981 franz1981 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HI @rvansa !! I would expect that the other transport need a similar treatment (at least NIO? No idea if Mac users can benefit by CRaC)

@normanmaurer
Copy link
Member

Also I think we would want to use some sort of "indirection" as we want to keep our core transports "zero dependency"

@rvansa
Copy link
Contributor Author

rvansa commented Mar 30, 2023

Hi @franz1981 , NIO-based stuff is handled directly in JDK. The reference implementation in https://github.com/openjdk/crac is limited to Linux only as this uses a slightly modified version of CRIU to perform the actual snapshot after getting things ready from the JVM side.

@normanmaurer Do you think that shading the org.crac facade into the JAR would be acceptable? It is a minimalistic 'indirection' on its own.

@chrisvest
Copy link
Contributor

@rvansa What about the file descriptors we have in open channels? What about the native memory we've allocated?

How are these handled in the JVM?

@rvansa
Copy link
Contributor Author

rvansa commented Mar 30, 2023

@chrisvest The application is responsible for closing any open file descriptors via the same notifications as I present here, E.g. in Quarkus using the native EPoll we do that over here [1]. Without the native epoll implementation that was sufficient, but with native epoll I had to apply the tweak presented in this PR.

Native memory is fine, CRIU will snapshot that and it will be restored transparently. In fact, CRIU could restore even some file descriptors, but in general keeping anything that communicates with anything out of the process (even e.g. the filesystem) is at risk. Therefore CRaC is more restrictive; the aim is to be able to migrate to a different machine and/or replicate the snapshoted process.

[1] https://github.com/quarkusio/quarkus/blob/main/extensions/vertx-http/runtime/src/main/java/io/quarkus/vertx/http/runtime/VertxHttpRecorder.java#L1285-L1303

@trustin
Copy link
Member

trustin commented Mar 30, 2023

Thanks for the pull request, @rvansa! I would expect some automated test to be part of this pull request, though.

@rvansa
Copy link
Contributor Author

rvansa commented Mar 30, 2023

@trustin Running a full-blown test would require downloading a custom JDK and CRIU binary, and requires Linux (CRaC currently does not support Macs). Possibly this could be simplified using a Docker image, but in any case it would be pretty heavyweight.
Alternatively I could mock CRaC mechanics and invoke the beforeCheckpoint/afterRestore manually. Asserting that the file descriptors are really closed would be possible using /proc/self/fd directory contents.
Which type of test would you prefer?

@franz1981
Copy link
Contributor

@rvansa just curious: why not using an external mechanism to do it?
If you are outside the event loop threads requesting all the existing connections to close should still work and would likely benefit from being in a quiescient state (ie no pending work, by definition) in a more natural way, because external request are always enqueued in a FIFO manner.

@rvansa
Copy link
Contributor Author

rvansa commented May 16, 2023

@franz1981 I am not sure if I follow. In the use case I have the existing connections are closed but in addition to that this PR needs to close these extra file descriptors. By blocking the eventloop I make sure that there's noone that could observe the internal, closed state.
I realized this has the limitation if anyone tries to do a similar thing on a different place in the application, waiting for the signal from a task enqueued to the eventloop (as a part of the C/R) - this waiting would block forever. If this proves to be a problem this would need a solution using centralized per-event-loop component.

@franz1981
Copy link
Contributor

franz1981 commented May 16, 2023

What I mean @rvansa is that you can open up the relevant methods (to close fds and mark the state as no pending work can ever be submitted and/or to recreate fds) on the event loop (group) in order to do the same from outside (as an @unstable API) and be sure that all event loops has reached this "quiescent" state without preventing each others to make progress.

If this proves to be a problem this would need a solution using centralized per-event-loop component.

that's my point; in term of responsibility:

  • who is in charge to fix a concurrency/racing cond problem if you make use on unstable APi from outside?
  • who is in charge if in Netty we expose some hook to react to something on CRaC but we made it wrong?

Maybe the person that perform the fix is the same but in one case you have a talking Unstable API and in the other a supported snapshotting mechanism.
What's clear to me if, whatever is external (via unsafe methods) or internal (via this proposal) we need to be sure it just work, while here I see we would like your comment to be solved upfront, if possible.

@rvansa
Copy link
Contributor Author

rvansa commented May 17, 2023

whatever is external (via unsafe methods) or internal (via this proposal) we need to be sure it just work

Exactly, my goal is that anyone using Netty as dependency does not have to care about implementation details. If this would be somehow exposed into API (exposing some suspend/resume methods but not implementing the Resource and registering for checkpoint), I don't see how this could be satisfied. My goal is to really not push any burden downstream.

I don't think that we can solve the most generic case of multiple interdependent eventloops in here. I would have no objections to making those openFileDescriptors/closeFileDescriptors protected (and maybe tag with @UnstableApi if you think it's needed) - that way the beforeCheckpoint/afterRestore can be overridden in a more complex use case.

In my previous comment I was referring to a situation where 2 client connection pools used the same event loop, so trying to block the eventloop in there 2 times did not work. If, in that case, I could guarantee that the eventloop would be blocked anyway (it was not as that was not epoll impl. but NIO), some parts of the code would not be needed at all.

I see that you're asking for a generic solution to a problem we don't have, and don't know, we only theoretize about it. Without a proper problem definition it's hard to sketch a solution. Therefore I propose to fix the problem we can already define, and add a little bit of flexibility through enabling subclasses to override the behaviour. Would that work for you?

@franz1981
Copy link
Contributor

franz1981 commented May 23, 2023

Would that work for you?

I would prefer to keep any external deps outside Netty, as said in the last 2 points in my previous comment.
Meaning:

  • provide the hooks to allow users interested into it to use them (and/or build a default solution that's not maintained by the Netty team, in external repo, to alleviate the life of users, as per the your comment on downstream burden) and marking such hooks as UnstableAPI
  • writes tests that makes sure such APIs does what they mean and work correctly respectful to the known states of the event-loop(s)

If CraC become the standard the facto/a solution supported universally on the JDk, then I see no harm to bring such dep in.

Beware; mine is not the pov of the prj lead/maintainer, I'm just a committer, so @normanmaurer can give his pov on this one

@normanmaurer
Copy link
Member

I agree with @franz1981 here... I would like to keep the core zero-dependency. Can't we maybe fix this with some sort of SPI ?

@rvansa
Copy link
Contributor Author

rvansa commented May 23, 2023

SPI solution might be possible, though any thing that requires user intervention will complicate the life for users:

  • They would have to start thinking about it first. They will get an error about an anonymous file descriptor being open during the checkpoint; finding that this originates from Netty requires non-trivial debugging as it's opened in native code. There would have to be a special tool they would use that detects that they have Netty amongst dependencies and suggest to add something that fixes it as another dependency.
  • Version of Netty and the fix could run out of sync.
  • Only adding a dependency is not enough, because the classes are loaded lazily. Either the application would have to explicitly call into it, or the SPI would need to actively check if an implementation (the fix) is present (e.g. through a ServiceLoader).

I think that all this is too much fuss, so I have doubts if it would happen. Before becoming part of future OpenJDK, do you think that substantial user interest might convince you into integration as a direct dependency?

@normanmaurer
Copy link
Member

@rvansa yeah if there is enough interest we can revisit... For now I would prefer to not add this dependency. Like I said if we can make it work with SPI or something like that I would be ok with it.

@rvansa
Copy link
Contributor Author

rvansa commented May 30, 2023

Based on the previous discussion, I'll close this for now and eventually prepare another PR that will allow the FDs to be closed externally.

@rvansa rvansa closed this May 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants