pre-dump support #808

Merged on Jan 18, 2022 (4 commits)

adrianreber
Contributor

CRIU supports the concept of pre-copy migration. Instead of creating a
complete checkpoint of a process, it is possible to write only the memory
of the process to disk while the process keeps running. The first
memory-only checkpoint can then be transferred to the migration
destination. During the transfer it is possible to take a second
checkpoint which writes only the memory pages that have changed since
the previous checkpoint. This way the second checkpoint can be much
smaller, again taken while the process keeps running, so the amount of
data that needs to be transferred to the migration destination might be
smaller and the migration downtime can be reduced. This only makes sense
if the number of memory pages that change is rather small. There is no
limit on the number of pre-copy iterations.
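
To make the size argument above concrete, here is a small simulation of the idea. It is only a sketch and does not call CRIU or crun: the function name `simulate_pre_copy` and its parameters are made up for illustration, and the assumption that the process dirties a fixed number of random pages between checkpoints is a simplification.

```python
import random


def simulate_pre_copy(total_pages, dirty_per_round, pre_dumps, seed=0):
    """Return the number of pages written by each checkpoint.

    The first checkpoint writes every page. Each later checkpoint writes
    only the pages touched since the previous one. The last entry is the
    final dump, which follows `pre_dumps` memory-only checkpoints.
    """
    rng = random.Random(seed)
    written = []
    dirty = set(range(total_pages))  # nothing has been dumped yet
    for _ in range(pre_dumps + 1):   # the pre-dumps, then the final dump
        written.append(len(dirty))
        # While this checkpoint is being transferred, the process keeps
        # running and dirties some pages again (assumed: random pages).
        dirty = {rng.randrange(total_pages) for _ in range(dirty_per_round)}
    return written


sizes = simulate_pre_copy(total_pages=10_000, dirty_per_round=200, pre_dumps=2)
print(sizes)
```

With few dirty pages per round, every checkpoint after the first stays small. Raising `dirty_per_round` towards `total_pages` shows why pre-copy stops paying off for processes that change most of their memory between checkpoints.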

This commit takes the interface as implemented in runc and implements it
for crun. Podman already uses the pre-dump as implemented by runc.

This commit also makes sure that the underlying software stack supports
the pre-dump mechanism. CRIU uses the kernel's dirty page tracking, which
is not available on all architectures (aarch64 does not implement it)
and might not be enabled in the kernel. If the user wants to use pre-dump
on a system without dirty page tracking, crun will fail early and inform
the user.
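
For readers unfamiliar with dirty page tracking: on Linux it is exposed as the soft-dirty bit (bit 55) of each 64-bit entry in `/proc/PID/pagemap`, which is reset by writing `4` to `/proc/PID/clear_refs`. The sketch below decodes that bit and shows the fail-early idea in isolation. It is not how crun performs the check (that goes through CRIU), and `soft_dirty_flags`, `PreDumpUnsupported` and `check_pre_dump` are hypothetical names used only here.

```python
import struct

SOFT_DIRTY_BIT = 1 << 55  # bit 55 of a /proc/PID/pagemap entry


def soft_dirty_flags(raw):
    """Decode raw pagemap bytes (one native 64-bit entry per page)."""
    count = len(raw) // 8
    entries = struct.unpack(f"={count}Q", raw[: count * 8])
    return [bool(entry & SOFT_DIRTY_BIT) for entry in entries]


class PreDumpUnsupported(RuntimeError):
    """Hypothetical error type for this sketch."""


def check_pre_dump(pre_dump_requested, tracking_available):
    """Fail before any checkpoint work starts if pre-dump cannot work."""
    if pre_dump_requested and not tracking_available:
        raise PreDumpUnsupported(
            "pre-dump requested, but dirty page tracking is not available"
        )


# Two fake pagemap entries: the first page is soft-dirty, the second is not.
sample = struct.pack("=2Q", SOFT_DIRTY_BIT, 0)
flags = soft_dirty_flags(sample)
```

Checking up front keeps the error close to the user's request instead of surfacing it halfway through a checkpoint.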

This crun pre-dump implementation relies on libcriu interfaces which are
not yet part of the latest release (3.16.1). At least CRIU 3.16.2 or 3.17
is therefore required to use pre-dump in combination with crun.
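
The version requirement can be expressed as a simple comparison. This is a minimal sketch, assuming the CRIU version is available as a dotted string; `MIN_CRIU_FOR_PRE_DUMP`, `parse_version` and `supports_pre_dump` are illustrative names, not part of crun or libcriu.

```python
MIN_CRIU_FOR_PRE_DUMP = (3, 16, 2)  # minimum taken from the description above


def parse_version(text):
    """Turn '3.17' or '3.16.2' into a (major, minor, sublevel) tuple."""
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (3 - len(parts))  # '3.17' is treated as 3.17.0
    return tuple(parts[:3])


def supports_pre_dump(criu_version):
    """True if the given CRIU version is new enough for crun's pre-dump."""
    return parse_version(criu_version) >= MIN_CRIU_FOR_PRE_DUMP
```

Tuples compare element by element, so `3.17` is accepted even though its last component is lower than that of `3.16.2`.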

@adrianreber adrianreber force-pushed the 2021-12-13-pre-dump-support branch 4 times, most recently from dbcb7c2 to 9162635 Compare December 14, 2021 11:01
Member

@giuseppe giuseppe left a comment

LGTM

@giuseppe
Member

the PR is still in Draft, is there anything more left to do before we can merge?

@adrianreber
Contributor Author

> the PR is still in Draft, is there anything more left to do before we can merge?

There are still a few libcriu changes under review and I would like to wait with merging this until they are merged on the CRIU side:

I do not expect the API to change on the CRIU side during review, but still, if it works for you, we should wait for a couple of days.

@giuseppe
Member

> I do not expect the API to change on the CRIU side during review, but still, if it works for you, we should wait for a couple of days.

sure, that works for me.

This documents the two pre-copy migration related checkpoint options
--pre-dump and --parent-path.

This adds the memhog command to tests/init. This command requires a
parameter telling init how many MB of memory should be allocated.

In addition to allocating the memory and writing each page once, the
memhog command changes a memory page every 0.1 seconds. This is useful to
test crun's pre-dump support.

Signed-off-by: Adrian Reber <areber@redhat.com>

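
The memhog behaviour described above can be sketched as follows. This is not the code added to tests/init. It is a rough Python illustration that assumes a 4096-byte page size, cycles through the pages in order (the commit message does not say which page is changed), and takes a `rounds` limit so that the example terminates.

```python
import time

PAGE_SIZE = 4096  # assumed page size for this sketch


def memhog(megabytes, rounds, interval=0.1):
    """Allocate memory, touch every page once, then keep dirtying pages."""
    buf = bytearray(megabytes * 1024 * 1024)
    pages = len(buf) // PAGE_SIZE
    if pages == 0:
        return buf
    # Write each page once so that all of the memory is really in use.
    for page in range(pages):
        buf[page * PAGE_SIZE] = 1
    # Change one page per interval so later checkpoints see dirty pages.
    for i in range(rounds):
        offset = (i % pages) * PAGE_SIZE
        buf[offset] = (buf[offset] + 1) % 256
        time.sleep(interval)
    return buf


buf = memhog(megabytes=1, rounds=3, interval=0)
```

A workload like this gives a pre-dump test a large, mostly idle memory area with a slow, steady trickle of changed pages.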
This extends the checkpoint/restore test to also test the newly
implemented pre-dump support.

Signed-off-by: Adrian Reber <areber@redhat.com>
@giuseppe
Member

@adrianreber I am planning to cut a release in the next few days (most likely tomorrow).

Would it be safer to wait for the next release to merge these changes?

@adrianreber
Contributor Author

> @adrianreber I am planning to cut a release in the next few days (most likely tomorrow).
>
> Would it be safer to wait for the next release to merge these changes?

Yes. One of the PRs this depends on has been merged and the other one is really close, but this can wait for the next release.

@giuseppe
Member

@adrianreber deps are merged. Can we merge this one as well?

@adrianreber adrianreber marked this pull request as ready for review January 18, 2022 16:21
@adrianreber
Contributor Author

> @adrianreber deps are merged. Can we merge this one as well?

Yes. This will still be disabled in CI because it depends on a new CRIU release, but we are already talking about a new release. As soon as that is available, crun CI should pick it up.

@rhatdan rhatdan merged commit 38e1b5e into containers:main Jan 18, 2022