Skip to content
Permalink
Branch: criu-dev
Commits on Apr 27, 2015
  1. add RPC options for for --enable-fs and --skip_mount

    utrace authored and xemul committed Apr 25, 2015
    Subject.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  2. fix parse_mnt_flags() to dump/restore STRICTATIME correctly

    utrace authored and xemul committed Apr 24, 2015
    CRIU always retores the mounts as MNT_RELATIME. This is because the
    kernel uses this mode by default, so we need to pass MS_STRICTATIME
    explicitely if we didn't see "noatime" or "MS_RELATIME".
    
    While at it, make mnt_opt2flag[] and sb_opt2flag "static", otherwise
    gcc actually creates these arrays on stack even if there are "const".
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Apr 24, 2015
  1. simplify the asprintf() failure handling in add_fsname_auto()

    utrace authored and xemul committed Apr 23, 2015
    Contrary to what I naively thought, the contents of fsauto_names
    is undefined if asprintf(&fsauto_names) and this was fixed by
    a052e0b "check return code of asprintf".
    
    But we can simplify this code a bit. If we rely on return value from
    asprintf(), we can simply nullify fsauto_names on failure and avoid
    the assymetrical "return false".
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Apr 22, 2015
  1. report fd/path if dump_one_reg_file()->lookup_nsid_by_mnt_id() fails

    utrace authored and xemul committed Apr 21, 2015
    "Unable to look up the %d mount" doesn't really help to understand
    the problem, add a bit more info.
    
    And perhaps it makes more sense to change dump_task_files_seized()
    to report fd/path if dump_one_file() fails.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  2. report the potential selinux problem if mmap_seized() fails

    utrace authored and xemul committed Apr 21, 2015
    selinux can deny mmap(PROT_WRITE | PROT_EXEC) and in this case it is
    not clear why CRIU fails, "Can't allocate memory for parasite blob"
    doesn't tell too much. Add a pr_warn() hint for the user.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Cyrill Gorcunov<gorcunov@openvz.org>
    Acked-by: Ruslan Kuprieiev <rkuprieiev@cloudlinux.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Apr 21, 2015
  1. make it possible to use --enable-fs more than once

    utrace authored and xemul committed Apr 20, 2015
    Change add_fsname_auto() to join multiple --enable-fs options.
    
    Note: "all" always wins, and "--enable-fs foo,all,bar" results
    in fsauto_names = "all" too.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  2. fix the wrong usage of strtok() in fsname_is_auto()

    utrace authored and xemul committed Apr 20, 2015
    I am stupid. fsname_is_auto() can't use strtok(), the 2nd call will
    see zeroes instead of commas in fsauto_names.
    
    Add the css_contains() helper and change fsname_is_auto() to use it.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Apr 14, 2015
  1. sanitize ->mntinfo_list initialization in collect_mntinfo()

    utrace authored and xemul committed Apr 14, 2015
    Currently this doesn't matter correctness-wise (with or without the
    previous changes), but imho collect_mntinfo() needs a cleanup. We
    should not return with ->mntinfo_list pointing to the freed memory
    on failure, even if currently this failure is fatal and nobody will
    ever use this pointer.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Tested-by: Tycho Andersen <tycho.andersen@canonical.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  2. kill the awful check before mntinfo_add_list() in collect_mntns()

    utrace authored and xemul committed Apr 14, 2015
    This check was added by commit aebfabb "mnt: add --ext-mount-map
    auto option", but unless I am totally confused it actually belongs
    to the (already reverted) 246367e "add walk_all flag to
    walk_namespaces".
    
    Remove it. It is no longer needed and it was very unobvious.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Tested-by: Tycho Andersen <tycho.andersen@canonical.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  3. revert 246367e "add walk_all flag to walk_namespaces"

    utrace authored and xemul committed Apr 14, 2015
    We no longer need to populate ext_ns->mnt.mntinfo_list until
    resolve_external_mounts(). We can rely on find_ext_ns_id() which
    does collect_mntinfo() on demand.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Tested-by: Tycho Andersen <tycho.andersen@canonical.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  4. change find_ext_ns_id() to do collect_mntinfo(ext_ns) on demand

    utrace authored and xemul committed Apr 14, 2015
    Currently we rely on the fact that ->mntinfo_list was already
    collected by walk_namespaces(walk_all => true), but we are going
    to change this.
    
    This patch simply adds collect_mntinfo(ns) into find_ext_ns_id() if
    ->mntinfo_list == NULL. This is all we need for this ns_id if it was
    not initialized by collect_mnt_namespaces().
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Tested-by: Tycho Andersen <tycho.andersen@canonical.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  5. introduce find_ext_ns_id()

    utrace authored and xemul committed Apr 14, 2015
    Preparation. Extract the "search the criu's mount info" code from
    resolve_external_mounts() into the new simple helper, find_ext_ns_id().
    
    Also change resolve_external_mounts() to check ext_ns == NULL rather
    than !opts.autodetect_ext_mounts. Cosmetic.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Tested-by: Tycho Andersen <tycho.andersen@canonical.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  6. do_new_mount() should clear all do_change_type() bits

    utrace authored and xemul committed Apr 14, 2015
    do_new_mount() clears MS_SHARED but this is not enough. It should clear
    all bits processed in restore_shared_options().
    
    The patch also adds MS_UNBINDABLE to MS_CHANGE_TYPE_MASK even if it is
    not currently used. Just to match the kernel's do_change_type() check.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  7. propogate the error if resolve_external_mounts() fails

    utrace authored and xemul committed Apr 14, 2015
    collect_mnt_namespaces() returns with ret=0 if resolve_external_mounts()
    fails.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  8. simplify the "ignore filesystem-subtype" logic

    utrace authored and xemul committed Apr 14, 2015
    We can simply overwrite the dot symbol right after the kernel reports
    it to us.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Tycho Andersen <tycho.andersen@canonical.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Apr 10, 2015
  1. make resolve_source() more friendly to FSTYPE__AUTO mounts

    utrace authored and xemul committed Apr 10, 2015
    resolve_source() insists on kdev_major() == 0, and this makes sense.
    However, at least FSTYPE__AUTO can try to use mi->source as a block
    device and pray it will work.
    
    [ Also bout this change from Oleg:
    
    Let me send another (last) functional change before the promised
    cleanups we discussed.
    
    To remind, without this patch I still can't dump/restore /home and
    /boot on my testing machine. --enable-fs xfs "works" in a sense that
    "dump" succeeds. But "restore" fails.
    
    However. Lets forget this for the moment. To me resolve_source() looks
    just wrong. Sure, I agree, it is not safe to blindly use mi->source if
    kdev_major() != 0. But this means that we should not have dumped this
    mountpoint, simply because we can't restore it.
    
    Yes, currently this works because fstypes[] contains only the diskless
    filesystems, but still.
    
    So this probably needs more cleanups too, and this patch doesn't make
    this logic look better.
    
    To me, we should do something like
    
    	static char *resolve_source(struct mount_info *mi)
    	{
    		if (kdev_major(mi->s_dev) == 0)
    			/*
    			 * Anonymous block device. Kernel creates them for
    			 * diskless mounts.
    			 */
    			return mi->source;
    
    		if (mi->fstype->code != FSTYPE__AUTO) {
    			pr_err("OOPS! something is wrong!!!\n");
    			return NULL;
    		}
    
    		// OK, this is FSTYPE__AUTO, it should "just work"
    		// by definition. Or the user should blame himself.
    
    		struct stat st;
    
    		if (stat(mi->source, &st) || !S_ISBLK(st.st_mode) ||
    		    major(st.st_rdev) != kdev_major(mi->s_dev) ||
    		    minor(st.st_rdev) != kdev_minor(mi->s_dev))
    			pr_warn("Hmm, can't verify blkdev. Lets see if mount will work...\n");
    
    		return mi->source;
    	}
    
    But this patch only does a minimal change to make FSTYPE__AUTO work
    with blkdev.
    
    ]
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  2. introduce --enable-fs cli option

    utrace authored and xemul committed Apr 9, 2015
    Finally add --enable-fs option to specify the comma separated list of
    filesystem names which should be treated as FSTYPE_AUTO.
    
    Note: obviously this option is not safe, use at your own risk. "dump"
    will always succeed if the mntpoint is auto, but "restore" can fail or
    do something wrong if mount(src, mountpoint, flags, options) can not
    actually "just work" as FSTYPE_AUTO logic expects.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  3. dump/restore fstype->name if FSTYPE__AUTO

    utrace authored and xemul committed Apr 9, 2015
    Add the new mnt_entry->fsname member and change dump_one_mountpoint()
    to save pm->fstype->name if fstype == FSTYPE__AUTO.
    
    Change collect_mnt_from_image() to pass this ->fsname to decode_fstype()
    which falls back to __find_fstype_by_name(fsname, true) if FSTYPE__AUTO.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  4. introduce __fsname_is_auto(name, force_auto)

    utrace authored and xemul committed Apr 9, 2015
    Simple preparation to simplify the review of the next patch. Turn
    find_fstype_by_name(name) into __find_fstype_by_name(name, force_auto)
    and reimplement find_fstype_by_name() as a trivial wrapper on top.
    
    This allows "restore" to specify that this particular fsname was treated
    as FSTYPE__AUTO by "dump".
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  5. introduce fsname_is_auto(name) and FSTYPE__AUTO

    utrace authored and xemul committed Apr 9, 2015
    The comment in find_fstype_by_name() says:
    
    	just mounting anything is wrong
    
    and this is true in general, but:
    
    	almost every fs has its own features
    
    this is not true in a sense that a lot of supported filesystems do not
    need any special processing: FSTYPE__PROC, FSTYPE__SYSFS, and more. More
    importantly, this logic does not allow to spicify from the command line
    that (say) currently unsupported hugetlbfs can "just work", do_new_mount()
    should only pass the right name/options.
    
    This patch adds the new FSTYPE__AUTO code, find_fstype_by_name(name) adds
    the new entry if fsname_is_auto(name) returns true. We do not care that
    different fstype's can have the same FSTYPE__AUTO code, fstype->code has
    no meaning unless we need to do something special with this fs, but in
    this case it should not be FSTYPE__AUTO by definition.
    
    Note: currently find_fstype_by_name() just returns true, it is obviously
    pointless to "dump" until we teach "restore" to handle FSTYPE__AUTO.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  6. reserve the extra room in fstypes[]

    utrace authored and xemul committed Apr 9, 2015
    Preparation. Enlarge fstypes[] to make it possible to add the new
    fstype's dynamically.
    
    This means ths find_fstype_by_name() and decode_fstype() need the
    additional ->name == NULL check to terminate the search.
    
    Also change them to start with "i == 1", we rely on the fact that
    fstypes[0] is FSTYPE__UNSUPPORTED anyway.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Apr 3, 2015
  1. introduce --skip-mnt cli option

    utrace authored and xemul committed Apr 2, 2015
    Which obviously can be used to "ignore" the mounts we do not want or
    need to dump. The user should know what he does.
    
    Note: this patch changes parse_mountinfo() to check should_skip_mount().
    This is because imo we want to filter out the unwanted mounts asap, af
    if they do not exist. This increases the chances the dumping will fail
    if something else depends on this mount. Say, another mountpoint or an
    opened file.
    
    Perhaps it makes sense to teach should_skip_mount() to use fnmatch()
    and/or look at the optional "(fs|mnt)=" prefix to skip by fsname too.
    
    To me it would be better to force the user of this option to understand
    what it does. Say, if "dump" fails because the child mount can't find
    the skipped parent, he should add another --skip-mnt option or do not
    dump. Otherwise, if we do this automagically the user can probably be
    surpised, he might even miss the fact that we skip more than he asked.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  2. pass "bool for_dump" argument down to collect_mntinfo() and parse_mou…

    utrace authored and xemul committed Mar 31, 2015
    …ntinfo()
    
    Preparation.
    
    1. Add the new "bool for_dump" arg to collect/parse_mntinfo().
    
    2. Introduce "struct collect_mntns_arg" to pass the additional
       "bool for_dump" field to collect_mntinfo() and change it to
       pass this boolean to collect_mntinfo()->parse_mountinfo() path.
    
    3. Change other callers of collect_mntinfo() to pass "false".
    
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Mar 30, 2015
  1. parse_mountinfo_ent: xrealloc(new->mountpoint) can fail

    utrace authored and xemul committed Mar 29, 2015
    This is pure theoretical, especially in this particular case when we
    actually want to (likely) free the unused memory. Still the code which
    ignores potential error doesn't look good.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  2. parse_mountinfo_ent: fix the leakage of "opt"

    utrace authored and xemul committed Mar 29, 2015
    1. parse_mountinfo_ent() mixes "return -1" and "goto err" on failure,
       this looks confusing and inconsistent.
    
    2. And buggy. It forgets to free(opt) if parse_mnt_flags() fails.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  3. parse_mountinfo_ent: kill the wrong xfree(new->mountpoint)

    utrace authored and xemul committed Mar 29, 2015
    The caller will do this on failure too. So this is unnecessary and wrong
    because we do not nullify ->mountpoint.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  4. parse_mountinfo: fix and simplify the usage of r_fstype

    utrace authored and xemul committed Mar 29, 2015
    1. parse_mountinfo() forgets to free(fst) if parse_mountinfo_ent()
       succeeds.
    
    2. The usage of fst/r_fstype is ovecomplicated for no reason.
    
    Just change the parse_mountinfo() paths to populate/use/free this
    fsname unconditionally, and move the ownership to the caller. There
    is no reason to check FSTYPE__UNSUPPORTED and/or fallback to ->name.
    
    Better yet, we could even turn fsname into the local "char []" and
    avoid %ms and free(), but then we would need to pass the length of
    this buffer to parse_mountinfo_ent().
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  5. parse_mountinfo: add the "end" block into the main loop

    utrace authored and xemul committed Mar 29, 2015
    Preparation to simplify the review. parse_mountinfo() assumes that:
    
    1. The "err:" block does all the necessary cleanups on failure.
    
       This is wrong, see the next patch.
    
    2. We can never skip the mountpoint.
    
       This is true, but we are going to change this.
    
    s/goto err/goto end/ in the main loop, add the "end:" label which inserts
    the new mount_info into the list and then checks ret != 0 to figure out
    whether we need to abort.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Mar 27, 2015
  1. mount: always report ->mnt_id as decimal

    utrace authored and xemul committed Mar 26, 2015
    validate_mounts() prints ->mnt_id in hex when it reports the failure.
    This complicates the understanding because this ->mnt_id is printed as
    decimal elsewhere, including /proc/$pid/mountinfo.
    
    parse_mountinfo() adds "0x" at least and this is just pr_info(), but
    lets change it too.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Andrew Vagin <avagin@openvz.org>
    Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Mar 26, 2015
  1. dump/x86: sanitize the ERESTART_RESTARTBLOCK -> EINTR transition

    utrace authored and xemul committed Mar 26, 2015
    1. The -ERESTART_RESTARTBLOCK case in get_task_regs() depends on kernel
       internals too much, and for no reason. We shouldn't rely on fact that
       a) we are going to do sigreturn() and b) restore_sigcontext() always
       sets restart_block->fn = do_no_restart_syscall which returns -EINTR.
    
       Just change this code to enforce -EINTR after restore, this is what
       we actually want until we teach criu to handle ERESTART_RESTARTBLOCK.
    
    2. Add pr_warn() to make the potential bug-reports more understandable,
       a sane application should handle -EINTR correctly but this is not
       always the case.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Acked-by: Andrew Vagin <avagin@parallels.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Mar 20, 2015
  1. restore/x86: restore_gpregs() needs to initialize ->ss as well

    utrace authored and xemul committed Mar 19, 2015
    Before the recent "x86_64,signal: Fix SS handling for signals delivered
    to 64-bit programs" kernel patch, sigreturn paths forgot to restore ->ss
    after return from the signal handler.
    
    Now that the kernel was fixed, restore_gpregs() has to initialize ->ss
    too, it is no longer ignored.
    
    Note: this is the minimal fix. In the long term we probably should not
    dump/restore the segment registers at all. We can use sigcontext filled
    by the target kernel and modify the general-purpose regs.
    
    Reported-and-tested-by: Andrey Wagin <avagin@gmail.com>
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Acked-by: Andrew Vagin <avagin@openvz.org>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Commits on Dec 10, 2014
  1. criu/testing: caps00: don't hang if PR_CAPBSET_DROP fails

    utrace authored and xemul committed Dec 7, 2014
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
  2. criu: check: don't leak the child if PTRACE_ATTACH fails

    utrace authored and xemul committed Dec 6, 2014
    PS: never ever use PTRACE_KILL.
    
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
You can’t perform that action at this time.