Unable to checkpoint container with `-nvproxy` after the introduction of `driverABI` #9649
Comments
This makes sense. The driver ABI should be savable. Happy to review your PR. Note, though, that this implies the container must be restored on a host with the same NVIDIA driver version. If the driver version can change, then the ABI would need to be rebuilt (which requires extra work).
Gotcha. This is true in our case (for the most part :) ).
This allows containers started with `-nvproxy` to be checkpointed, essentially ignoring the state of the `driverABI`. The downside of this path is "[...] this would imply that the container must be restored on a host with the same nvidia driver version." Closes: google#9649 (comment) cc @ayushr2
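The trade-off described above can be illustrated with a minimal, standalone Go sketch. It does not use gVisor's own save/restore machinery; it uses `encoding/gob` only to show the idea of skipping the ABI on checkpoint and rebuilding it on restore. All names here (`proxyState`, `abiFor`, `save`, `restore`) and the version strings `"100.0"` and `"200.0"` are hypothetical placeholders, not taken from the actual patch.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// driverABI stands in for a per-driver-version table of handlers.
// Function values cannot be serialized, so this state is never saved.
type driverABI struct {
	ioctls map[uint32]func() string
}

// abiFor is a hypothetical constructor that builds the ABI for a
// given driver version. The versions listed are placeholders.
func abiFor(version string) (*driverABI, error) {
	known := map[string]bool{"100.0": true, "200.0": true}
	if !known[version] {
		return nil, fmt.Errorf("unsupported driver version %q", version)
	}
	return &driverABI{ioctls: map[uint32]func() string{
		1: func() string { return "handled by " + version },
	}}, nil
}

// proxyState is the checkpointed object. gob encodes only exported
// fields, so the unexported abi field is skipped on save.
type proxyState struct {
	DriverVersion string
	abi           *driverABI
}

// save serializes the state without the ABI.
func save(p *proxyState) ([]byte, error) {
	var buf bytes.Buffer
	err := gob.NewEncoder(&buf).Encode(p)
	return buf.Bytes(), err
}

// restore decodes the state and rebuilds the ABI. It refuses to
// restore when the host driver differs from the checkpointed one.
func restore(data []byte, hostVersion string) (*proxyState, error) {
	var p proxyState
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&p); err != nil {
		return nil, err
	}
	if p.DriverVersion != hostVersion {
		return nil, fmt.Errorf("checkpoint taken with driver %s, host has %s",
			p.DriverVersion, hostVersion)
	}
	abi, err := abiFor(hostVersion)
	if err != nil {
		return nil, err
	}
	p.abi = abi
	return &p, nil
}

func main() {
	data, err := save(&proxyState{DriverVersion: "100.0"})
	if err != nil {
		panic(err)
	}
	if _, err := restore(data, "100.0"); err != nil {
		panic(err)
	}
	_, err = restore(data, "200.0")
	fmt.Println("restore on a different driver:", err)
}
```

Supporting a changed driver version would mean replacing the version check with logic that rebuilds the ABI for the new host and migrates any version-dependent state, which is the "extra work" mentioned in the review.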
Submitted the patch here. Thanks a lot for the review.
Description
Similar to #9363, the `driverABI` struct doesn't implement `SaverLoader`.

I applied a similar patch to #9385 and am able to checkpoint containers with `-nvproxy` successfully (still testing restore; patch below). I'm happy to submit a PR, but I'm wondering whether this makes sense and what the implications of not saving this state are.

The patch would be made here.
Does this make sense?