Branch: criu-dev
Commits on Mar 25, 2016
  1. rework criu check logic

    SaiedKazemi authored and xemul committed Mar 16, 2016
    The "criu check" command to check if the kernel is properly configured
    to run criu is broken.
    
    The "criu check --ms" command used to be the way to tell criu to check
    only for features that have been merged upstream.  But recent kernels
    have a set of features whose absence doesn't necessarily mean that
    dump or restore will fail but rather *may* fail depending on whether
    the process tree uses those features.
    
    This patch deprecates --ms and introduces --extra, --experimental,
    and --all.  See "criu --help" or "man criu" for more info.
    
    Typical use cases are:
    
    	$ sudo criu check
    	<zero or more warnings and errors...>
    	Looks good.
    	$ echo $?
    	0
    
    	$ sudo criu check --extra
    	<zero or more warnings and errors...>
    	Looks good.
    	$ echo $?
    	0
    
    	$ sudo criu check --extra
    	<one or more warnings...>
    	Looks good but some kernel features are missing
    	which, depending on your process tree, may cause
    	dump or restore failure.
    	$ echo $?
    	1
    
    	$ sudo criu check --feature list
    	mnt_id aio_remap timerfd tun userns fdinfo_lock seccomp_suspend \
    		seccomp_filters loginuid cgroupns
    
    	$ sudo criu check --feature mnt_id
    	Warn  (cr-check.c:283): fdinfo doesn't contain the mnt_id field
    	$ echo $?
    	1
    
    	$ sudo criu check --feature tun
    	tun is supported
    	$ echo $?
    	0
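
    A script can branch on the exit status of a single-feature check
    instead of parsing the output.  A minimal Python sketch, assuming criu
    is on PATH and the caller has sufficient privileges; the helper name
    feature_supported and its injectable run parameter are hypothetical
    and exist only so the logic can be exercised without criu installed:

```python
import subprocess


def feature_supported(name, run=subprocess.run):
    """Return True if `criu check --feature NAME` exits with status 0."""
    # Exit status 0 means the feature is supported; any non-zero status is
    # treated as "missing" (it may also mean the check itself failed).
    result = run(
        ["criu", "check", "--feature", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

    For example, feature_supported("tun") would return True on the system
    shown above and feature_supported("mnt_id") would return False.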
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Commits on Mar 23, 2016
  1. netfilter: add -n to iptables and ip6tables calls

    SaiedKazemi authored and xemul committed Mar 12, 2016
    To preload netfilter modules, criu runs "iptables -L" and "ip6tables -L"
    before starting to dump or restore a process tree.
    
    On systems with many entries, the above commands without the -n option
    take a long time because each address is resolved through DNS.  With
    -n, addresses are printed numerically and the lookups are skipped.
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
    Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Commits on Aug 18, 2015
  1. Add OverlayFS support to docker_cr.sh

    SaiedKazemi authored and xemul committed Aug 17, 2015
    The main purpose of this patch is to add OverlayFS support to docker_cr.sh
    for external checkpoint and restore.  It also does a bit of cleaning
    and minor enhancements.
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Jul 13, 2015
  1. Need bigger log buffer to avoid message truncation

    SaiedKazemi authored and xemul committed Jul 6, 2015
    The help message of CRIU has grown in size and is truncated because the
    size of the private buffer in log.c is too small.  This patch increases
    the size of the buffer.
    
    [ The "bad" message is the --help output one ]
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Apr 7, 2015
  1. Do not fail if /tmp does not exist

    SaiedKazemi authored and xemul committed Apr 6, 2015
    Currently if /tmp does not exist, CRIU fails because it will not be
    able to create a temporary directory there.  But when checkpointing
    and restoring containers, we cannot rely on the existence of /tmp.
    For such containers, we should use root (/).  The temporary directory
    will be removed after CRIU is done.
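
    The fallback can be sketched in Python as follows.  This is an
    illustration of the idea, not CRIU's C code; the function name
    make_work_dir and the ".criu." prefix are hypothetical:

```python
import os
import tempfile


def make_work_dir(bases=("/tmp", "/")):
    """Create a temporary directory under the first base that exists."""
    for base in bases:
        if os.path.isdir(base):
            # mkdtemp creates a uniquely named directory and returns its path.
            return tempfile.mkdtemp(prefix=".criu.", dir=base)
    raise FileNotFoundError("no usable base directory for temporary files")
```

    The caller is expected to remove the returned directory when it is
    done, for instance with shutil.rmtree().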
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Acked-by: Andrew Vagin <avagin@odin.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Mar 18, 2015
  1. Dismantle cgyard in non-detached restore mode.

    SaiedKazemi authored and xemul committed Mar 16, 2015
    If the --restore-detached command line option is not specified during
    restore, CRIU should unmount and remove the temporary cgyard directory
    tree before waiting for the restored process to exit.  Otherwise, all
    the temporary cgyard mount points will remain mounted and visible.
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Feb 16, 2015
  1. Do not call listen() when SO_REUSEADDR is off

    SaiedKazemi authored and xemul committed Feb 16, 2015
    For an established TCP connection, the send queue is restored in two
    steps: in step (1), we retransmit the data that was sent before but not
    yet acknowledged, and in step (2), we transmit the data that was never
    sent outside before.  The TCP_REPAIR option is disabled before step (2)
    and re-enabled after step (2) (without this patch).
    
    If the amount of data to be sent in step (2) is large, the TCP_REPAIR
    flag on the socket can remain off for some time (O(milliseconds)).  If a
    listen() is called on another socket bound to the same port during this
    time window, it fails, because turning TCP_REPAIR off clears the
    SO_REUSEADDR flag on the socket.
    
    This patch adds a mutex (reuseaddr_lock) per port number, so that a
    listen() on a port number does not happen while SO_REUSEADDR for another
    socket on the same port is off.
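
    The locking discipline can be illustrated with an in-process analogy
    using Python threads.  This is only a sketch of the idea of one lock
    per port number; the names port_lock, reuseaddr_off and guarded_listen
    are hypothetical, and it says nothing about how reuseaddr_lock is
    stored or shared in CRIU itself:

```python
import threading
from collections import defaultdict
from contextlib import contextmanager

_registry_guard = threading.Lock()
_port_locks = defaultdict(threading.Lock)


def port_lock(port):
    """Return the single lock shared by every socket bound to `port`."""
    with _registry_guard:
        return _port_locks[port]


@contextmanager
def reuseaddr_off(port):
    """Hold the port's lock for the window in which SO_REUSEADDR is off."""
    with port_lock(port):
        yield


def guarded_listen(sock, port, backlog=128):
    """Call listen() only while no socket on `port` has SO_REUSEADDR off."""
    with port_lock(port):
        sock.listen(backlog)
```

    The code that sends the step (2) data would run inside
    "with reuseaddr_off(port):", so a concurrent guarded_listen() on the
    same port blocks until the window closes.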
    
    Thanks to Amey Deshpande <ameyd@google.com> for debugging.
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Feb 9, 2015
  1. Ignore mnt_id value for AUFS file descriptors.

    SaiedKazemi authored and xemul committed Feb 9, 2015
    Starting with version 3.15, the kernel provides a mnt_id field in
    /proc/<pid>/fdinfo/<fd>.  However, the value provided by the kernel for
    AUFS file descriptors obtained by opening a file in /proc/<pid>/map_files
    is incorrect.
    
    Below is an example for a Docker container running Nginx.  The mntid
    program below mimics CRIU by opening a file in /proc/1/map_files and
    using the descriptor to obtain its mnt_id.  As shown below, mnt_id is
    set to 22 by the kernel but it does not exist in the mount namespace of
    the container.  Therefore, CRIU fails with the error:
    
    	"Unable to look up the 22 mount"
    
    In the global namespace, 22 is the root of AUFS (/var/lib/docker/aufs).
    
    This patch sets the mnt_id of these AUFS descriptors to -1, mimicking
    pre-3.15 kernel behavior.
    
    	$ docker ps
    	CONTAINER ID        IMAGE                    ...
    	3850a63ee857        nginx-streaming:latest   ...
    	$ docker exec -it 38 bash -i
    	root@3850a63ee857:/# ps -e
    	  PID TTY          TIME CMD
    	    1 ?        00:00:00 nginx
    	    7 ?        00:00:00 nginx
    	   31 ?        00:00:00 bash
    	   46 ?        00:00:00 ps
    	root@3850a63ee857:/# ./mntid 1
    	open("/proc/1/map_files/400000-4b8000") = 3
    	cat /proc/49/fdinfo/3
    	pos:	0
    	flags:	0100000
    	mnt_id:	22
    	root@3850a63ee857:/# awk '{print $1 " " $2}' /proc/1/mountinfo
    	87 58
    	103 87
    	104 87
    	105 104
    	106 104
    	107 104
    	108 87
    	109 87
    	110 87
    	111 87
    	root@3850a63ee857:/# exit
    	$ grep 22 /proc/self/mountinfo
    	22 21 8:1 /var/lib/docker/aufs /var/lib/docker/aufs ...
    	44 22 0:35 / /var/lib/docker/aufs/mnt/<ID> ...
    	$
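
    The effect of the patch can be sketched in Python.  The helper names
    and the is_aufs_map_file flag are hypothetical; how CRIU actually
    detects an AUFS descriptor is not shown here:

```python
def parse_mnt_id(fdinfo_text):
    """Return mnt_id from /proc/<pid>/fdinfo/<fd> text, or -1 if absent."""
    for line in fdinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "mnt_id":
            return int(value)  # int() tolerates the surrounding whitespace
    # No mnt_id line at all, as on kernels older than 3.15.
    return -1


def effective_mnt_id(fdinfo_text, is_aufs_map_file):
    """Ignore the kernel-reported mnt_id for AUFS map_files descriptors."""
    if is_aufs_map_file:
        return -1
    return parse_mnt_id(fdinfo_text)
```

    Applied to the fdinfo output in the transcript, parse_mnt_id() yields
    22, while effective_mnt_id() with is_aufs_map_file=True yields -1.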
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Jan 22, 2015
  1. Rework fixup_aufs_vma_fd() for non-AUFS links

    SaiedKazemi authored and xemul committed Jan 21, 2015
    This patch reworks fixup_aufs_vma_fd() to let symbolic links in
    /proc/<pid>/map_files that are not pointing to AUFS branch names follow
    the non-AUFS application logic.
    
    The use case that prompted this commit was an application mapping
    /dev/zero as shared and writeable which shows up in map_files as:
    
    lrw------- ... 7fc5c5a5f000-7fc5c5a60000 -> /dev/zero (deleted)
    
    If the AUFS support code reads the link, it will have to strip off the
    " (deleted)" string added by the kernel but core CRIU code already
    does this.
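
    For illustration, stripping the kernel's suffix from a link target
    amounts to the following Python sketch (the function name is
    hypothetical and this is not the CRIU implementation):

```python
DELETED_SUFFIX = " (deleted)"


def strip_deleted(link_target):
    """Return (path, was_deleted) for a map_files link target."""
    if link_target.endswith(DELETED_SUFFIX):
        return link_target[: -len(DELETED_SUFFIX)], True
    return link_target, False
```

    With the link shown above, strip_deleted("/dev/zero (deleted)")
    returns ("/dev/zero", True).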
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Jan 19, 2015
  1. Fix AUFS pathname handling when branch is not exposed

    SaiedKazemi authored and xemul committed Jan 19, 2015
    The code that fixes up AUFS pathnames associated with vma entries (see
    commit d8b41b6) should handle cases where an entry does not expose
    the branch pathname (e.g., pointing to a device like /dev/zero).
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Jan 12, 2015
  1. Allow the veth-pair option to specify a bridge

    SaiedKazemi authored and xemul committed Jan 12, 2015
    When restoring a pair of veth devices that had one end inside a namespace
    or container and the other end outside, CRIU creates a new veth pair,
    puts one end in the namespace/container, and names the other end from
    what's specified in the --veth-pair IN=OUT command line option.
    
    This patch allows for appending a bridge name to the OUT string in the
    form of OUT@<BRIDGE-NAME> in order for CRIU to move the outside veth to
    the named bridge.  For example, --veth-pair eth0=veth1@br0 tells CRIU
    to name the peer of eth0 veth1 and move it to bridge br0.
    
    This is a simple and handy extension of the --veth-pair option that
    obviates the need for an action script although one can still do the same
    (and possibly more) if they prefer to use action scripts.
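
    The extended argument syntax can be sketched as a small Python parser.
    The function name and the error messages are hypothetical; CRIU's own
    parsing is done in C and may differ in how it validates input:

```python
def parse_veth_pair(arg):
    """Split 'IN=OUT' or 'IN=OUT@BRIDGE' into (inside, outside, bridge)."""
    inside, eq, rest = arg.partition("=")
    if not eq or not inside or not rest:
        raise ValueError(f"expected IN=OUT[@BRIDGE], got {arg!r}")
    outside, at, bridge = rest.partition("@")
    if not outside or (at and not bridge):
        raise ValueError(f"expected IN=OUT[@BRIDGE], got {arg!r}")
    # bridge is None when no '@<BRIDGE-NAME>' part was given.
    return inside, outside, bridge or None
```

    parse_veth_pair("eth0=veth1@br0") returns ("eth0", "veth1", "br0"),
    and parse_veth_pair("eth0=veth1") returns ("eth0", "veth1", None).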
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  2. Try to determine the bind mount file for dockerinit

    SaiedKazemi authored and xemul committed Jan 2, 2015
    This patch adds code to the contrib/docker_cr.sh helper script for trying
    to determine the external bind mount file for dockerinit if the user
    has not explicitly specified it via the DOCKERINIT_BINARY environment
    variable.
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Dec 19, 2014
  1. Explain how an inherit fd can be reused.

    SaiedKazemi authored and xemul committed Dec 19, 2014
    Add comment to inherit_fd_reused() explaining how an inherit fd may be
    closed or reused outside the inherit fd logic.
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Dec 10, 2014
  1. tests: Pipe.c Test Program

    SaiedKazemi authored and xemul committed Dec 9, 2014
    Hi Saied,
    
    This patch adds your test in the criu test system.
    
    Signed-off-by: Andrey Vagin <avagin@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  2. Add inherit fd support

    SaiedKazemi authored and xemul committed Dec 9, 2014
    There are cases where a process's file descriptor cannot be restored
    from the checkpoint images.  For example, a pipe file descriptor with
    one end in the checkpointed process and the other end in a separate
    process (that was not part of the checkpointed process tree) cannot be
    restored because after checkpoint the pipe will be broken.
    
    There are also cases where the user wants to use a new file during
    restore instead of the original file at checkpoint time.  For example,
    the user wants to change the log file of a process from /path/to/oldlog
    to /path/to/newlog.
    
    In these cases, criu's caller should set up a new file descriptor to be
    inherited by the restored process and specify the file descriptor with the
    --inherit-fd command line option.  The argument of --inherit-fd has the
    format fd[%d]:%s, where %d tells criu which of its own file descriptors
    to use for restoring the file identified by %s.
    
    As a debugging aid, if the argument has the format debug[%d]:%s, it tells
    criu to write out the string after colon to the file descriptor %d.  This
    can be used, for example, as an easy way to leave a "restore marker"
    in the output stream of the process.
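
    Both argument forms share the same shape, which the following Python
    sketch parses.  The function name is hypothetical and the key strings
    in the usage note are made-up examples, not the exact keys criu
    expects:

```python
import re

# kind[fd]:rest, where kind is "fd" or "debug" and fd is a decimal number.
_ARG_RE = re.compile(r"^(fd|debug)\[(\d+)\]:(.*)$", re.DOTALL)


def parse_inherit_fd(arg):
    """Split an --inherit-fd argument into (kind, fd, rest)."""
    match = _ARG_RE.match(arg)
    if match is None:
        raise ValueError(f"expected fd[N]:KEY or debug[N]:TEXT, got {arg!r}")
    kind, fd, rest = match.groups()
    return kind, int(fd), rest
```

    parse_inherit_fd("fd[7]:some/key") returns ("fd", 7, "some/key"), and
    parse_inherit_fd("debug[1]:restored") returns ("debug", 1, "restored").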
    
    It's important to note that inherit fd support breaks applications
    that depend on the state of the file descriptor being inherited.  So,
    consider inherit fd only for specific use cases that you know for sure
    won't break the application.
    
    For examples please visit http://criu.org/Category:HOWTO.
    
    v2: Added a check in send_fd_to_self() to avoid closing an inherit fd.
        Also, as an extra measure of caution, added checks in the inherit fd
        look up functions to make sure that the inherit fd hasn't been reused.
        The patch also includes minor cosmetic changes.
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Sep 9, 2014
  1. Add a convenience shell script for Docker container C/R

    SaiedKazemi authored and xemul committed Sep 3, 2014
    Since the command line for checkpointing and restoring Docker containers
    is very long and there are some manual steps involved before restoring
    a container, it's much easier to use a shell script to automate the work.
    
    One would simply do:
    
    $ sudo docker_cr.sh -c
    $ sudo docker_cr.sh -r
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Acked-by: Filipe Brandenburger <filbranden@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Aug 27, 2014
  1. Use --root instead of --aufs-root

    SaiedKazemi authored and xemul committed Aug 26, 2014
    When dumping Docker containers using the AUFS graph driver, we can
    use the --root option instead of --aufs-root for specifying the
    container's root.  This patch obviates the need for --aufs-root
    and makes dump CLI more consistent with restore CLI.
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Aug 21, 2014
  1. Added AUFS support.

    SaiedKazemi authored and xemul committed Aug 20, 2014
    The AUFS support code handles the "bad" information that we get from
    the kernel in /proc/<pid>/map_files and /proc/<pid>/mountinfo files.
    For details see comments in sysfs_parse.c.
    
    The main motivation for this work was dumping and restoring Docker
    containers which by default use the AUFS graph driver.  For dump,
    --aufs-root <container_root> should be added to the command line options.
    For restore, there is no need for AUFS-specific command line options
    but the container's AUFS filesystem should already be set up before
    calling criu restore.
    
    [ xemul: With AUFS files sometimes, in particular -- in case of a
      mapping of an executable file (likely the one created at elf load),
      in the /proc/pid/map_files/xxx link target we see not the path
      by which the file is seen in AUFS, but the path by which AUFS
      accesses this file from one of its "branches". In order to fix
      the path we get the info about branches from sysfs and when we
      meet such a file, we cut the branch part of the path. ]
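
    The path fixup described in the note can be sketched in Python.  The
    function name and the example paths are hypothetical; the list of
    branches is assumed to have been read from sysfs already:

```python
def strip_branch(path, branches):
    """Return `path` with a leading AUFS branch directory removed."""
    # Try longer branch paths first so nested branch directories match
    # before their parents.
    for branch in sorted(branches, key=len, reverse=True):
        branch = branch.rstrip("/")
        if branch and path.startswith(branch + "/"):
            return path[len(branch):]
    # Not under any branch: the path is already the one seen through AUFS.
    return path
```

    With a branch "/aufs/diff/abc", the link target
    "/aufs/diff/abc/usr/sbin/nginx" becomes "/usr/sbin/nginx", while a
    path outside every branch is returned unchanged.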
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Jun 25, 2014
  1. cg: skip name= in cgroup named hierarchies

    SaiedKazemi authored and xemul committed Jun 25, 2014
    Skip the string "name=" when recreating cgroups directories in cgyard.
    For example, systemd's entries in cgroup.img are:
    
    	name: "name=systemd"
    	path: "/user/1000.user/4.session"
    
    When creating the systemd subdir, name= should not be part of the name.
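
    In Python terms the rule is a simple prefix strip.  The function name
    is hypothetical and the sketch is not the CRIU implementation:

```python
def cgyard_dir_name(controller):
    """Return the directory name to use for a cgroup hierarchy."""
    prefix = "name="
    if controller.startswith(prefix):
        # Named hierarchy: drop the "name=" marker, keep the name itself.
        return controller[len(prefix):]
    return controller
```

    cgyard_dir_name("name=systemd") returns "systemd"; a controller string
    without the prefix is returned unchanged.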
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Jun 17, 2014
  1. pie: A quick workaround for PR_SET_DUMPABLE == 2 restore error.

    SaiedKazemi authored and xemul committed Jun 17, 2014
    [ xemul: It's a temporary workaround not to block the -rc2 release.
      Once we have some better solution, this will be rolled back. ]
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Mar 11, 2014
  1. pie: Fix pie.lds.S.in script to work with the gold linker

    SaiedKazemi authored and xemul committed Mar 11, 2014
    The pie.lds.S.in needs two minor changes to work with the gold (/usr/bin/gold)
    linker. These changes are compatible with /usr/bin/ld and make the linker
    script more portable.
    
    The first change is adding a comma before /DISCARD/ so that the grammar
    won't be ambiguous. Otherwise, gold treats it as a part of the assignment
    and would generate a syntax error about the "unexpected ':'".
    
    The second change is moving initialization of __export_parasite_args
    to inside the SECTIONS command because it references the dot symbol.
    Otherwise, gold would generate the error "invalid reference to dot
    symbol outside of SECTIONS clause".
    
    Signed-off-by: Saied Kazemi <saied@google.com>
    Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>