forked from checkpoint-restore/criu
criu/plugin: Add environment variable to cap size of buffers. #10
Open: fdavid-amd wants to merge 382 commits into criu-dev from fdavid/cap-buffer-size
This kernel feature contained some bugs initially. Those logs are useful in identifying what the underlying issue is and which kernel patch to backport. Signed-off-by: Michal Clapinski <mclapinski@google.com>
Signed-off-by: Adrian Reber <areber@redhat.com>
As our pr_* functions are complex and can call different system calls inside before actual printing (e.g. gettimeofday for timestamps) actual errno at the time of printing may be changed. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
As our pr_* functions are complex and can call different system calls inside before actual printing (e.g. gettimeofday for timestamps) actual errno at the time of printing may be changed. Let's just use %s + strerror(errno) instead of %m with pr_* functions to be explicit that errno to string transformation happens before calling anything else. Note: tcp_repair_off is called from pie with no pr_perror defined due to CR_NOGLIBC set and if I use errno variable there I get "Unexpected undefined symbol: `__errno_location'. External symbol in PIE?", so it seems there is no way to print errno there, so let's just skip it. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
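The ordering problem described above can be sketched in a few lines of C. This is only an illustration: `log_line` and `report_failure` are hypothetical names, and `log_line` merely imitates a pr_* helper that overwrites errno before it prints.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a pr_* helper: it does some work before printing
 * that may overwrite errno, as a timestamp lookup might. */
static int log_line(char *out, size_t len, const char *what, const char *reason)
{
	errno = 0; /* simulate an internal call clobbering errno */
	return snprintf(out, len, "%s: %s", what, reason);
}

/* Format "<what>: <reason>", resolving errno to text at the call site. */
static int report_failure(char *out, size_t len, const char *what)
{
	assert(out != NULL && len > 0);
	/* strerror(errno) is evaluated before log_line() starts running. */
	return log_line(out, len, what, strerror(errno));
}
```

Because the argument is evaluated before the call, the message reflects the errno value seen by the caller, whatever the helper does internally.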
We see that libbsd redefines __has_include to be always true, which breaks such checks for rseq. The idea behind this patch is to remove the use of libbsd functions and always export our replacement functions.

Using __strlcat and __strlcpy everywhere in existing code:

  git grep --files-with-matches "strlcat" | xargs sed -i 's/strlcat/__strlcat/g'
  git grep --files-with-matches "strlcpy" | xargs sed -i 's/strlcpy/__strlcpy/g'

Fixes: checkpoint-restore#2036
Suggested-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
… bsd headers

We see that libbsd redefines __has_include to be always true, which breaks such checks for rseq. The idea behind this patch is to put all uses of libbsd functions to separate c files and only export wrapper functions for them.

Using __setproctitle and __setproctitle_init everywhere in existing code:

  git grep --files-with-matches "setproctitle" | xargs sed -i 's/setproctitle/__setproctitle/g'
  git grep --files-with-matches "setproctitle_init" | xargs sed -i 's/setproctitle_init/__setproctitle_init/g'

Fixes: checkpoint-restore#2036
Suggested-by: Andrei Vagin <avagin@google.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
This patch sets VMA_AREA_REGULAR on hugetlb and anon shmem VMAs since they can be handled the same way as other kinds of regular memory. Co-authored-by: Ivanq <imachug@yandex.ru> Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
If trying to open /proc/$pid/map_files/x-x for a given VMA fails with EPERM (can happen in unprivileged mode when running in a non-init user ns), fall back to reading the content from /proc/$pid/mem. Co-authored-by: Ivanq <imachug@yandex.ru> Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
Co-authored-by: Ivanq <imachug@yandex.ru> Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
We see that when lint is called for push action git has only one last commit which makes make indent with git-clang-format fail to operate. Fix it by increasing fetch depth to one more commit. Fixes: checkpoint-restore#2066 Fixes: d6db333 ("clang-format: rework make indent to check specific commits") Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Fixes: 237bd26 ("remap: Rename global lock", 2017-05-18) Signed-off-by: Michał Mirosław <emmir@google.com>
Make the code a bit more readable by uncovering a while loop from an if() goto sequence. Signed-off-by: Michał Mirosław <emmir@google.com>
Checking errno in outer function is really strange, also saving errno of mount syscall after calling pr_perror is completely wrong. So let's try to simplify things. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Simplify code a bit: make exit codes of those functions more transparent, rename ret to exit_code. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
This is done to follow 'Linux kernel coding style', same change was added to .clang-format in linux kernel source recently: torvalds/linux@d7f6604341c74 We don't change it in current code base but let's follow it in all future uses. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
This patch applies the '-ffreestanding' flag that was introduced with checkpoint-restore#1726 to MIPS. Fixes: checkpoint-restore#1725 Signed-off-by: Radostin Stoyanov <radostin@redhat.com>
Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
If we don't have access to map_files and instead have to get the data from /proc/$pid/mem we can close and reset the fd before passing it to do_dump_one_shmem() which can then check it before trying to seek past holes, eliminating the need for a separate seek_data_supported boolean. Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
CAP_CHECKPOINT_RESTORE does not give access to /proc/$pid/map_files in user namespaces. In order to test that CRIU in unprivileged mode can dump and restore anonymous shared memory pages we will run the maps00 tests in a user namespace. Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
If we can't access a map_files entry directly and instead have to follow the link and access the file via a filesystem path we need to properly deal with files on btrfs subvolumes. Signed-off-by: Younes Manton <ymanton@ca.ibm.com>
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
The IP_FREEBIND option is supported for RAW sockets, why not save it while we do this for other ip sockets anyway? One difference is that for SOCK_RAW there is no fallback between IP_FREEBIND and IPV6_FREEBIND, see: https://github.com/torvalds/linux/blob/ef4d3ea40565a781c25847e9cb96c1bd9f462bc6/net/ipv6/ipv6_sockglue.c#L1497 So let's have explicit IPV6_FREEBIND for ipv6. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
We see systemd-resolved relying on these options, and after migration the options are lost and systemd-resolved stops serving dns requests. The socket options make kernel add cmsg with destination address to packets, see more how systemd-resolved uses them: https://github.com/systemd/systemd/blob/00a60eaf5fcb3a0e415349aa649f2699550d26b0/src/resolve/resolved-manager.c#L826 Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Just creates ipv4/ipv6 raw/dgram sockets with IP_PKTINFO and IP_FREEBIND socket options enabled/disabled and checks that these options persist. Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
The --ghost-fiemap option was introduced with checkpoint-restore#1963. It enables an optimized algorithm based on the fiemap ioctl that can reduce the number of syscalls used to checkpoint highly sparse ghost files. This option is enabled by default. It can be disabled with --no-ghost-fiemap when using SEEK_HOLE/SEEK_DATA is preferred. In addition, an automatic fallback to SEEK_HOLE/SEEK_DATA is used for filesystems that do not support fiemap. Co-authored-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
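For reference, the SEEK_HOLE/SEEK_DATA approach mentioned as the fallback can be sketched as a loop over data extents. This is a generic illustration, not CRIU's ghost-file code; `count_data_bytes` is a hypothetical helper, and the fallback `#define`s assume the Linux values for the two constants.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: Linux values, used only if the headers did not expose them. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/* Sum the sizes of all data extents of fd; holes are skipped.
 * Returns -1 on error. Note that this moves the file position. */
static off_t count_data_bytes(int fd)
{
	off_t pos = 0, total = 0;

	assert(fd >= 0);
	for (;;) {
		off_t data = lseek(fd, pos, SEEK_DATA);
		if (data < 0) {
			if (errno == ENXIO)
				break; /* no data at or after pos */
			perror("lseek(SEEK_DATA)");
			return -1;
		}
		/* There is always an implicit hole at end of file. */
		off_t hole = lseek(fd, data, SEEK_HOLE);
		if (hole < 0) {
			perror("lseek(SEEK_HOLE)");
			return -1;
		}
		total += hole - data;
		pos = hole;
	}
	return total;
}
```

Each extent costs two lseek() calls here, which is the per-extent syscall overhead that a fiemap-based walk is meant to reduce for highly sparse files.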
If we build tags for our repo:

  [criu]$ make tags
    GEN     tags

And then run codespell, we get an error:

  [criu]$ codespell
  ./tags:3755: struc ==> struct

Let's exclude tags file from codespell search, this would add usability to `make lint`.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
When an error happens at the file dumping stage, the only information about the task we are dumping is its PID. For debugging purposes, show the task's @comm early. It proves useful when trying to understand which of the dumped applications is "guilty" in a broken dump when the pid is not there anymore. Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
In Python 3 b'' == '' is False. This causes the info action to fail with

  File "/usr/lib/python3.11/site-packages/crit-3.17-py3.11.egg/pycriu/images/images.py", line 178, in count
    size, = struct.unpack('i', buf)
            ^^^^^^^^^^^^^^^^^^^^^^^
  struct.error: unpack requires a buffer of 4 bytes

Reported-by: Sankalp Acharya (@sankalp-12)
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
When -- after restore -- sockets can't communicate, the test times out while waiting on recvfrom(). Since the communication is local, send() works instantaneously - so mark sockets with SOCK_NONBLOCK and report failure if the message is not received immediately. Signed-off-by: Michał Mirosław <emmir@google.com>
All test logs are flooded with the "userns is supported" messages... Signed-off-by: Andrei Vagin <avagin@gmail.com>
Currently page_size() returns an unsigned int value, which after a "bitwise not" is promoted to unsigned long, e.g. in handle_page_fault in uffd.c. Since the value is unsigned, the promotion fills the upper bits with zeros, which results in the loss of the upper bits of the page-fault address. So make page_size() return unsigned long to avoid this situation. Signed-off-by: Vladislav Khmelevsky <och95@yandex.ru>
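The promotion issue can be reproduced in isolation. The sketch below assumes a 64-bit Linux target where unsigned long is 64 bits wide; `align_down_narrow` and `align_down_wide` are hypothetical names, not functions from uffd.c.

```c
#include <assert.h>

/* Mask is computed in unsigned int and only then widened, so its upper
 * 32 bits are zero and the upper address bits are cleared. */
static unsigned long align_down_narrow(unsigned long addr, unsigned int page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return addr & ~(page_size - 1);
}

/* Mask is computed in unsigned long, so the upper address bits survive. */
static unsigned long align_down_wide(unsigned long addr, unsigned long page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return addr & ~(page_size - 1);
}
```

For the address 0x7f0012345678 and a 4096-byte page, the narrow variant yields 0x12345000 while the wide variant yields 0x7f0012345000.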
Currently, most of the time we don't have problems with the VVAR segment and lazy restore, because when the VDSO is parked there is a munmap call that calls UFFDIO_UNREGISTER on the destination address. But we don't want to enable userfaultfd for VDSO and VVAR in the first place.
Signed-off-by: Michal Clapinski <mclapinski@google.com>
It means CRIU has to close it when it is not needed. It looks more logically correct and matches the behaviour of the RESTORE_EXT_FILE callback. Signed-off-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
This patch adds the `libdrm-dev` package to the list of CRIU dependencies installed in CI to build CRIU with amdgpu plugin. Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
amdgpu_plugin.c:930:6: error: variable 'buffer' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
        if (ret) {
            ^~~
amdgpu_plugin.c:988:8: note: uninitialized use occurs here
        xfree(buffer);

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
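The warning pattern above, and one common way to silence it, can be sketched as follows. This is a generic illustration with hypothetical names (`process`, `fail_early`); it does not reproduce the code in amdgpu_plugin.c, and whether the actual patch uses this exact fix is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A function with a shared cleanup label that may be reached before the
 * buffer has been allocated. */
static int process(size_t len, int fail_early)
{
	char *buffer = NULL; /* initialised so the exit path can always free it */
	int ret = 0;

	assert(len > 0);
	if (fail_early) {
		ret = -1;
		goto exit; /* without the initialiser, buffer would be garbage here */
	}

	buffer = malloc(len);
	if (!buffer) {
		ret = -1;
		goto exit;
	}
	memset(buffer, 0, len);

exit:
	free(buffer); /* free(NULL) is a no-op */
	return ret;
}
```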
One memfd can be shared by a few restored files. Only one of these files is restored with a file created with memfd_open. Others are restored by reopening memfd files via /proc/self/fd/. It seems unnecessary for restoring memfd memory mappings. We can always use the origin file. Signed-off-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
The "ColumnLimit: 120" is not only allowing lines to be longer than 80 characters but it also forces line wrapping at 120 characters. If total expression length is more than 120 characters, clang-format will try to wrap it as close to 120 as it can, it would not even allow to wrap at 80 characters if we really want it. But as we all know 80 characters is Linux kernel coding style default and as far as our coding style is based on it it is really strange to prohibit wrapping lines at 80 characters... Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
GCC's lto source:

> To avoid this problem the compiler must assume that it sees the
> whole program when doing link-time optimization. Strictly
> speaking, the whole program is rarely visible even at link-time.
> Standard system libraries are usually linked dynamically or not
> provided with the link-time information. In GCC, the whole
> program option (@option{-fwhole-program}) asserts that every
> function and variable defined in the current compilation
> unit is static, except for function @code{main} (note: at
> link time, the current unit is the union of all objects compiled
> with LTO). Since some functions and variables need to
> be referenced externally, for example by another DSO or from an
> assembler file, GCC also provides the function and variable
> attribute @code{externally_visible} which can be used to disable
> the effect of @option{-fwhole-program} on a specific symbol.

As far as I read gcc's source, ipa_comdats() will avoid placing symbols that are either already in a user-defined section or have externally_visible attribute into new optimized gcc sections.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
fork_and_ptrace_attach has to fork a child with CLONE_UNTRACED, so that strace doesn't trace it. Signed-off-by: Andrei Vagin <avagin@gmail.com>
read_ns_sys_file() can return an error, but we are trying to parse a buffer before checking a return code. CID 417395 (#3 of 3): String not null terminated (STRING_NULL) 2. string_null: Passing unterminated string buf to strtol, which expects a null-terminated string. Signed-off-by: Andrei Vagin <avagin@gmail.com>
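The ordering fix described above (check the read result first, then terminate, then parse) might look like the sketch below. `read_long_from_fd` is a hypothetical stand-in; it does not reproduce read_ns_sys_file() or its signature.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Read a small decimal value from fd.
 * Returns 0 and stores the value in *out, or -1 on any failure. */
static int read_long_from_fd(int fd, long *out)
{
	char buf[32];
	char *end;
	ssize_t n;

	assert(out != NULL);
	n = read(fd, buf, sizeof(buf) - 1);
	if (n <= 0)
		return -1; /* check the result before touching buf */
	buf[n] = '\0'; /* strtol() needs a NUL-terminated string */

	errno = 0;
	*out = strtol(buf, &end, 10);
	if (errno || end == buf)
		return -1; /* overflow, or no digits at all */
	return 0;
}
```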
This check is redundant as line 201 checks for this condition. Signed-off-by: Taemin Ha <taeminha@cs.utexas.edu> Signed-off-by: Andrei Vagin <avagin@gmail.com>
The is_native field is a boolean. Therefore, else if() can be changed to a simple else{}. Signed-off-by: Taemin Ha <taeminha@cs.utexas.edu> Signed-off-by: Andrei Vagin <avagin@gmail.com>
The condition meant to check fd2 instead of fd1, which is checked in line 24. Signed-off-by: Taemin Ha <taeminha@cs.utexas.edu> Signed-off-by: Andrei Vagin <avagin@gmail.com>
Line 131 checks if (ret >= 0). Line 133 could be replaced by a simple else statement. Signed-off-by: Taemin Ha <taeminha@cs.utexas.edu> Signed-off-by: Andrei Vagin <avagin@gmail.com>
Eventpollentry's fields are set only when ret == 3 or ret == 6. The remaining cases can be grouped together to an error Signed-off-by: Taemin Ha <taemin.ha@utexas.edu> Signed-off-by: Andrei Vagin <avagin@gmail.com>
At this point the correct position is already restored, so reading from the fd results in the position being moved forward by 5 bytes. Fixes: 9191f87 ("criu/files-reg.c: add build-id validation functionality") Signed-off-by: Michal Clapinski <mclapinski@google.com>
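One general way to inspect a file's leading bytes without disturbing an already restored position is pread(), which takes an explicit offset. The sketch below shows that technique only; it is an assumption that it resembles the actual fix, and `has_magic` is a hypothetical helper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Compare the first len bytes of fd with magic.
 * Returns 1 on match, 0 on mismatch, -1 on error or short read.
 * The file position of fd is left untouched because pread() is used. */
static int has_magic(int fd, const char *magic, size_t len)
{
	char buf[16];

	assert(len <= sizeof(buf));
	if (pread(fd, buf, len, 0) != (ssize_t)len) {
		perror("pread");
		return -1;
	}
	return memcmp(buf, magic, len) == 0;
}
```

A plain read() of the same 5 bytes would advance the position by 5, which is the symptom the commit message describes.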
Signed-off-by: Michal Clapinski <mclapinski@google.com>
Signed-off-by: Michal Clapinski <mclapinski@google.com>
cgroup_ifpriomap test needs net_prio cgroup, which might not be available. Make the .checkskip script check it. Signed-off-by: Michał Mirosław <emmir@google.com>
Newer versions of pip use an isolated virtual environment when building Python projects. However, when the source code of CRIT is copied into the isolated environment, the symlink for `../lib/py` (pycriu) becomes invalid. As a workaround, we used the `--no-build-isolation` option for `pip install`. However, this functionality has issues in some versions of PIP [1, 2]. To fix this problem, this patch adds separate packages for pycriu and crit, and each package is installed independently. [1] pypa/pip#8221 [2] pypa/pip#8165 (comment) Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Do not use $(USERCFLAGS) for anything other than what the user provides. Signed-off-by: Marcus Folkesson <marcus.folkesson@gmail.com>
A friendly reminder that this PR had no activity for 30 days.
The amdgpu plugin creates a memory buffer the size of the largest VRAM bo (buffer object). On some systems, VRAM size exceeds RAM size, so the largest bo might be larger than the available memory.
Add an environment variable, KFD_MAX_BUFFER_SIZE, which caps the size of this buffer. By default it is set to 0 and has no effect. When active, any bo larger than its value is saved to or restored from file in multiple passes.
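The capping logic described above can be sketched roughly as follows. Only the variable name KFD_MAX_BUFFER_SIZE comes from this PR; the helper names are hypothetical, and parsing the value as a plain byte count is an assumption about the format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Read the cap from the environment; unset or 0 means "no cap".
 * Assumption: the value is a plain byte count. */
static uint64_t max_buffer_size_from_env(void)
{
	const char *val = getenv("KFD_MAX_BUFFER_SIZE");

	if (!val)
		return 0;
	return strtoull(val, NULL, 0);
}

/* Size of the staging buffer: the largest bo, limited by the cap if set. */
static uint64_t staging_buffer_size(uint64_t largest_bo, uint64_t cap)
{
	return (cap && largest_bo > cap) ? cap : largest_bo;
}

/* Number of save/restore passes needed to move one bo through the buffer. */
static uint64_t passes_needed(uint64_t bo_size, uint64_t buffer_size)
{
	assert(buffer_size > 0);
	return (bo_size + buffer_size - 1) / buffer_size;
}
```

With a cap of 4096 bytes, for instance, a 10000-byte bo would go through the buffer in three passes, while a cap of 0 keeps the buffer at the size of the largest bo.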