-
Notifications
You must be signed in to change notification settings - Fork 561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't dump pages which only contain zero bytes #2331
base: criu-dev
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
<?xml version = "1.0" encoding = "UTF-8"?> | ||
<!DOCTYPE suite SYSTEM "http://testng.org/testng-1.0.dtd" > | ||
|
||
<suite name = "Suite2"> | ||
<parameter name="checkpointOpt" value="--skip-zero-pages"/> | ||
<parameter name="restoreOpt" value=""/> | ||
|
||
<test name = "test1-FileRead"> | ||
<parameter name="testname" value="FileRead"/> | ||
<classes> | ||
<class name = "org.criu.java.tests.CheckpointRestore"/> | ||
</classes> | ||
</test> | ||
|
||
<test name = "test2-ReadWrite"> | ||
<parameter name="testname" value="ReadWrite"/> | ||
<classes> | ||
<class name = "org.criu.java.tests.CheckpointRestore"/> | ||
</classes> | ||
</test> | ||
|
||
<test name = "test3-MemoryMappings"> | ||
<parameter name="testname" value="MemoryMappings"/> | ||
<classes> | ||
<class name = "org.criu.java.tests.CheckpointRestore"/> | ||
</classes> | ||
</test> | ||
|
||
<test name = "test4-MultipleFileRead"> | ||
<parameter name="testname" value="MultipleFileRead"/> | ||
<classes> | ||
<class name = "org.criu.java.tests.CheckpointRestore"/> | ||
</classes> | ||
</test> | ||
|
||
<test name = "test5-MultipleFileWrite"> | ||
<parameter name="testname" value="MultipleFileWrite"/> | ||
<classes> | ||
<class name = "org.criu.java.tests.CheckpointRestore"/> | ||
</classes> | ||
</test> | ||
|
||
<test name = "test6-Sockets"> | ||
<parameter name="testname" value="Sockets"/> | ||
<parameter name="checkpointOpt" value="--tcp-established --skip-zero-pages"/> | ||
<parameter name="restoreOpt" value="--tcp-established"/> | ||
<classes> | ||
<class name = "org.criu.java.tests.CheckpointRestore"/> | ||
</classes> | ||
</test> | ||
|
||
<test name = "test7-SocketsListen"> | ||
<parameter name="testname" value="SocketsListen"/> | ||
<parameter name="checkpointOpt" value="--tcp-established --skip-zero-pages"/> | ||
<parameter name="restoreOpt" value="--tcp-established"/> | ||
<classes> | ||
<class name = "org.criu.java.tests.CheckpointRestore"/> | ||
</classes> | ||
</test> | ||
|
||
<test name = "test8-SocketsConnect"> | ||
<parameter name="testname" value="SocketsConnect"/> | ||
<parameter name="checkpointOpt" value="--tcp-established --skip-zero-pages"/> | ||
<parameter name="restoreOpt" value="--tcp-established"/> | ||
<classes> | ||
<class name = "org.criu.java.tests.CheckpointRestore"/> | ||
</classes> | ||
</test> | ||
|
||
<test name = "test9-SocketsMultiple"> | ||
<parameter name="testname" value="SocketsMultiple"/> | ||
<parameter name="checkpointOpt" value="--tcp-established --skip-zero-pages"/> | ||
<parameter name="restoreOpt" value="--tcp-established"/> | ||
<classes> | ||
<class name = "org.criu.java.tests.CheckpointRestore"/> | ||
</classes> | ||
</test> | ||
|
||
<test name = "test10-SocketsData"> | ||
<parameter name="testname" value="SocketsData"/> | ||
<parameter name="checkpointOpt" value="--tcp-established --skip-zero-pages"/> | ||
<parameter name="restoreOpt" value="--tcp-established"/> | ||
<classes> | ||
<class name = "org.criu.java.tests.CheckpointRestore"/> | ||
</classes> | ||
|
||
</test> | ||
|
||
</suite> |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1052,6 +1052,7 @@ def __init__(self, opts): | |
self.__sat = bool(opts['sat']) | ||
self.__dedup = bool(opts['dedup']) | ||
self.__mdedup = bool(opts['noauto_dedup']) | ||
self.__skip_zero_pages = bool(opts['skip_zero_pages']) | ||
self.__user = bool(opts['user']) | ||
self.__rootless = bool(opts['rootless']) | ||
self.__leave_stopped = bool(opts['stop']) | ||
|
@@ -1381,6 +1382,9 @@ def dump(self, action, opts=[]): | |
if self.__dedup: | ||
a_opts += ["--auto-dedup"] | ||
|
||
if self.__skip_zero_pages: | ||
a_opts += ["--skip-zero-pages"] | ||
|
||
a_opts += ["--timeout", "10"] | ||
|
||
criu_dir = os.path.dirname(os.getcwd()) | ||
|
@@ -2083,7 +2087,7 @@ def run_test(self, name, desc, flavor): | |
'dedup', 'sbs', 'freezecg', 'user', 'dry_run', 'noauto_dedup', | ||
'remote_lazy_pages', 'show_stats', 'lazy_migrate', 'stream', | ||
'tls', 'criu_bin', 'crit_bin', 'pre_dump_mode', 'mntns_compat_mode', | ||
'rootless') | ||
'rootless', 'skip_zero_pages') | ||
arg = repr((name, desc, flavor, {d: self.__opts[d] for d in nd})) | ||
|
||
if self.__use_log: | ||
|
@@ -2697,6 +2701,9 @@ def get_cli_args(): | |
rp.add_argument("--noauto-dedup", | ||
help="Manual deduplicate images on iterations", | ||
action='store_true') | ||
rp.add_argument("--skip-zero-pages", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @simonis Would you be able to also enable testing with existing ZDTM tests using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @simonis It was great meeting you at FOSDEM and your talk was very good! We can use something like the following patch to run the existing ZDTM tests with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @avagin Thank you for the review! Would it be sufficient to run the following tests? diff --git a/scripts/ci/run-ci-tests.sh b/scripts/ci/run-ci-tests.sh
index ef7e869e0..8f5e25d03 100755
--- a/scripts/ci/run-ci-tests.sh
+++ b/scripts/ci/run-ci-tests.sh
@@ -268,6 +268,9 @@ make -C test/others/rpc/ run
./test/zdtm.py run -t zdtm/transition/maps007 --pre 2 --page-server --dedup
./test/zdtm.py run -t zdtm/transition/maps007 --pre 2 --pre-dump-mode read
+# Run tests with --skip-zero-pages
+./test/zdtm.py run --skip-zero-pages -T '.*maps0.*'
+
./test/zdtm.py run -t zdtm/transition/pid_reuse --pre 2 # start time based pid reuse detection
./test/zdtm.py run -t zdtm/transition/pidfd_store_sk --rpc --pre 2 # pidfd based pid reuse detection
|
||
help="Don't dump pages containing only zero bytes", | ||
action='store_true') | ||
rp.add_argument("--nocr", | ||
help="Do not CR anything, just check test works", | ||
action='store_true') | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the idea to read process memory twice. We have to avoid this.
btw: #2292 solves the same problem in a more optimal way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it would be better if we can avoid reading process memory twice.
I think this is different. In #2292, we exclude pages with zero PFN (
PAGE_IS_PFNZERO
), while this option skips zero-filled memory (e.g., memory that has been filled with zeros usingmemset()
would be skipped with this option).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just a performance or a correctness issue? If it's just about performance, I think the benefit might justify the additional overhead and after all the feature is on by default.
In the case this is a correctness problem, do you have a suggestion how we can avoid this?
PS: and yes, @rst0git is right - this change is about skipping regular pages which are filled with only zero bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we need a zdtm test, which can reproduce such a zero page without PAGE_IS_PFNZERO but with zero data. Maybe, we even want to fix kernel to report such a page as "PAGE_IS_PFNZERO" instead.
I agree with Andrei that manually checking page content with memcmp is an anti-pattern. Nack from me, at list in current state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also agree. Reading a page second time and using
memcmp()
sounds not optimal.I still like the idea of not including empty pages in the checkpoint, but it sounds difficult.
If the kernel could track it, that would be nice. Not sure the kernel has a better alternative than
memcmp()
to find a zeroed memory page.At this point I think it would be nice to see some numbers. How much faster is restoring if something like this PR is applied. Although I don't like the
memcmp()
it would only be used if the corresponding command-line option is explicitly selected by the user. Maybe that makes it acceptable. Can the second reading of the page be avoided?Maybe some post-processing of the checkpoint image would be an alternative. Remove the empty pages after checkpointing and have support during restore to handle pages like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we need to read process pages to do this check? Why can't we do that before dumping these pages into the image (page_xfer_dump_pages)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, I thought it does not work, upd: that was stupid of me not to enable --skip-zero-pages, with option it works fine, sorry.
Second, If Java put so much effort to have those zeroed pages in RSS isn't it a bad idea to restore those pages like they are "PAGE_IS_PFNZERO" ones? =)
upd2: If we want to preserve them in RSS, we can remember those special zero-filled pages in images on dump without saving their data and then on restore put them to RSS by writing zeroes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Snorch, Java (or OpenJDK based JVMs to be more specific) was not designed and optimized for cloud/container use cases but rather for large, monolithic application servers. In the old days, pretouching/zeroing memory was a way to pre-allocate physical memory and not potentially get it from swap later. For current cloud/container use cases the huge memory footprint can be problem. With check-pointing, a small image is more important and COWing a PAGE_IS_PFNZERO page is much faster then loading and populating it from disk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for your suggestion @avagin. I'm short of time for the next week because of FOSDEM/Jfokus but I'll try to come up with a new version which moves the zero check to
page_xfer_dump_pages()
afterwards.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@simonis FOSDEM was my favorite conference when I lived close by. btw @mihalicyn are there too, he is one of criu maintainers. He will be happy to help with any questions.