Test #2

Merged
merged 2 commits into from
Sep 25, 2015

Conversation

avagin
Owner

@avagin avagin commented Sep 25, 2015

No description provided.

avagin added a commit that referenced this pull request Sep 25, 2015
@avagin avagin merged commit 2aadae1 into master Sep 25, 2015
avagin added a commit that referenced this pull request Dec 4, 2015
It's used to restore bind-mounts. For example, we cut the common
part of bind-mounts:

Core was generated by `criu restore -vvvv --file-locks --tcp-established --evasive-devices --manage-cg'.
Program terminated with signal 11, Segmentation fault.
741                     BUG_ON(target_root[tok] == '\0');
(gdb) bt
 #0  0x000000000045eef2 in cut_root_for_bind (target_root=0x1e00f20 "/", source_root=0x1e04910 "/vzt/del/vzctl-rm-me.X99UVU8/.criu.cgyard.D5Dfcv/zdtmtst/") at mount.c:741
 #1  0x000000000045f594 in do_bind_mount (mi=mi@entry=0x1e00dd0) at mount.c:2035
 #2  0x000000000045fd02 in do_mount_one (mi=0x1e00dd0) at mount.c:2191
 #3  0x000000000046241f in mnt_tree_for_each (fn=0x45fc80 <do_mount_one>, start=0x1e044d0) at mount.c:1759
 #4  populate_mnt_ns () at mount.c:2729
 #5  prepare_mnt_ns () at mount.c:2843
 #6  0x000000000045a3c3 in prepare_namespace (item=0x7fe10b9ce050, clone_flags=2080505856) at namespaces.c:1311
 #7  0x000000000043383e in restore_task_with_children (_arg=0x7ffd0f7faae0) at cr-restore.c:1535
 #8  0x00007fe10acb41ad in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

https://jira.sw.ru/browse/PSBM-41932

Reported-by: Virtuozzo QA Team
avagin pushed a commit that referenced this pull request Dec 29, 2015
This is a fixlet to patch #2.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Looks-good-to-me: Andrew Vagin <avagin@virtuozzo.com>
avagin added a commit that referenced this pull request Apr 7, 2016
CID 159478 (#2 of 2): Resource leak (RESOURCE_LEAK)
8. leaked_handle: Handle variable sk going out of scope leaks the handle.
avagin added a commit that referenced this pull request Apr 7, 2016
CID 159478 (#2 of 2): Resource leak (RESOURCE_LEAK)
8. leaked_handle: Handle variable sk going out of scope leaks the handle.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Message-Id: <1460050184-6293-2-git-send-email-avagin@openvz.org>
avagin added a commit that referenced this pull request Apr 8, 2016
CID 159478 (#2 of 2): Resource leak (RESOURCE_LEAK)
8. leaked_handle: Handle variable sk going out of scope leaks the handle.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Message-Id: <1460050184-6293-2-git-send-email-avagin@openvz.org>
avagin added a commit that referenced this pull request Apr 8, 2016
CID 159478 (#2 of 2): Resource leak (RESOURCE_LEAK)
8. leaked_handle: Handle variable sk going out of scope leaks the handle.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Message-Id: <1460050184-6293-2-git-send-email-avagin@openvz.org>
avagin added a commit that referenced this pull request Apr 12, 2016
CID 159478 (#2 of 2): Resource leak (RESOURCE_LEAK)
8. leaked_handle: Handle variable sk going out of scope leaks the handle.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin added a commit that referenced this pull request May 16, 2016
CID 159478 (#2 of 2): Resource leak (RESOURCE_LEAK)
8. leaked_handle: Handle variable sk going out of scope leaks the handle.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin added a commit that referenced this pull request Jun 2, 2016
It can be dead-locked:
 #0  0x00007fafbf49f6ac in __lll_lock_wait_private () from /lib64/libc.so.6
 #1  0x00007fafbf44af1c in _L_lock_2460 () from /lib64/libc.so.6
 #2  0x00007fafbf44ad57 in __tz_convert () from /lib64/libc.so.6
 #3  0x00000000004022e2 in test_msg (format=0x404508 "Receive signal %d\n") at msg.c:51
 #4  <signal handler called>
 #5  0x00007fafbf3f2483 in __GI__IO_vfscanf () from /lib64/libc.so.6
 #6  0x00007fafbf408f27 in vsscanf () from /lib64/libc.so.6
 #7  0x00007fafbf4032f7 in sscanf () from /lib64/libc.so.6
 #8  0x00007fafbf449ba6 in __tzset_parse_tz () from /lib64/libc.so.6
 #9  0x00007fafbf44c4cb in __tzfile_compute () from /lib64/libc.so.6
 #10 0x00007fafbf44ae17 in __tz_convert () from /lib64/libc.so.6
 #11 0x00000000004022e2 in test_msg (format=format@entry=0x40458c "PASS\n") at msg.c:51
 #12 0x0000000000401ceb in main (argc=<optimized out>, argv=<optimized out>) at ptrace_sig.c:172

https://jira.sw.ru/browse/PSBM-47772

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
avagin added a commit that referenced this pull request Jun 3, 2016
It can be dead-locked:
 #0  0x00007fafbf49f6ac in __lll_lock_wait_private () from /lib64/libc.so.6
 #1  0x00007fafbf44af1c in _L_lock_2460 () from /lib64/libc.so.6
 #2  0x00007fafbf44ad57 in __tz_convert () from /lib64/libc.so.6
 #3  0x00000000004022e2 in test_msg (format=0x404508 "Receive signal %d\n") at msg.c:51
 #4  <signal handler called>
 #5  0x00007fafbf3f2483 in __GI__IO_vfscanf () from /lib64/libc.so.6
 #6  0x00007fafbf408f27 in vsscanf () from /lib64/libc.so.6
 #7  0x00007fafbf4032f7 in sscanf () from /lib64/libc.so.6
 #8  0x00007fafbf449ba6 in __tzset_parse_tz () from /lib64/libc.so.6
 #9  0x00007fafbf44c4cb in __tzfile_compute () from /lib64/libc.so.6
 #10 0x00007fafbf44ae17 in __tz_convert () from /lib64/libc.so.6
 #11 0x00000000004022e2 in test_msg (format=format@entry=0x40458c "PASS\n") at msg.c:51
 #12 0x0000000000401ceb in main (argc=<optimized out>, argv=<optimized out>) at ptrace_sig.c:172

https://jira.sw.ru/browse/PSBM-47772

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
avagin added a commit that referenced this pull request Jun 3, 2016
It can be dead-locked:
 #0  0x00007fafbf49f6ac in __lll_lock_wait_private () from /lib64/libc.so.6
 #1  0x00007fafbf44af1c in _L_lock_2460 () from /lib64/libc.so.6
 #2  0x00007fafbf44ad57 in __tz_convert () from /lib64/libc.so.6
 #3  0x00000000004022e2 in test_msg (format=0x404508 "Receive signal %d\n") at msg.c:51
 #4  <signal handler called>
 #5  0x00007fafbf3f2483 in __GI__IO_vfscanf () from /lib64/libc.so.6
 #6  0x00007fafbf408f27 in vsscanf () from /lib64/libc.so.6
 #7  0x00007fafbf4032f7 in sscanf () from /lib64/libc.so.6
 #8  0x00007fafbf449ba6 in __tzset_parse_tz () from /lib64/libc.so.6
 #9  0x00007fafbf44c4cb in __tzfile_compute () from /lib64/libc.so.6
 #10 0x00007fafbf44ae17 in __tz_convert () from /lib64/libc.so.6
 #11 0x00000000004022e2 in test_msg (format=format@entry=0x40458c "PASS\n") at msg.c:51
 #12 0x0000000000401ceb in main (argc=<optimized out>, argv=<optimized out>) at ptrace_sig.c:172

https://jira.sw.ru/browse/PSBM-47772

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
avagin added a commit that referenced this pull request Jun 11, 2016
It can be dead-locked:
 #0  0x00007fafbf49f6ac in __lll_lock_wait_private () from /lib64/libc.so.6
 #1  0x00007fafbf44af1c in _L_lock_2460 () from /lib64/libc.so.6
 #2  0x00007fafbf44ad57 in __tz_convert () from /lib64/libc.so.6
 #3  0x00000000004022e2 in test_msg (format=0x404508 "Receive signal %d\n") at msg.c:51
 #4  <signal handler called>
 #5  0x00007fafbf3f2483 in __GI__IO_vfscanf () from /lib64/libc.so.6
 #6  0x00007fafbf408f27 in vsscanf () from /lib64/libc.so.6
 #7  0x00007fafbf4032f7 in sscanf () from /lib64/libc.so.6
 #8  0x00007fafbf449ba6 in __tzset_parse_tz () from /lib64/libc.so.6
 #9  0x00007fafbf44c4cb in __tzfile_compute () from /lib64/libc.so.6
 #10 0x00007fafbf44ae17 in __tz_convert () from /lib64/libc.so.6
 #11 0x00000000004022e2 in test_msg (format=format@entry=0x40458c "PASS\n") at msg.c:51
 #12 0x0000000000401ceb in main (argc=<optimized out>, argv=<optimized out>) at ptrace_sig.c:172

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin added a commit that referenced this pull request Jun 15, 2016
CID 159478 (#2 of 2): Resource leak (RESOURCE_LEAK)
8. leaked_handle: Handle variable sk going out of scope leaks the handle.

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin added a commit that referenced this pull request Jun 15, 2016
It can be dead-locked:
 #0  0x00007fafbf49f6ac in __lll_lock_wait_private () from /lib64/libc.so.6
 #1  0x00007fafbf44af1c in _L_lock_2460 () from /lib64/libc.so.6
 #2  0x00007fafbf44ad57 in __tz_convert () from /lib64/libc.so.6
 #3  0x00000000004022e2 in test_msg (format=0x404508 "Receive signal %d\n") at msg.c:51
 #4  <signal handler called>
 #5  0x00007fafbf3f2483 in __GI__IO_vfscanf () from /lib64/libc.so.6
 #6  0x00007fafbf408f27 in vsscanf () from /lib64/libc.so.6
 #7  0x00007fafbf4032f7 in sscanf () from /lib64/libc.so.6
 #8  0x00007fafbf449ba6 in __tzset_parse_tz () from /lib64/libc.so.6
 #9  0x00007fafbf44c4cb in __tzfile_compute () from /lib64/libc.so.6
 #10 0x00007fafbf44ae17 in __tz_convert () from /lib64/libc.so.6
 #11 0x00000000004022e2 in test_msg (format=format@entry=0x40458c "PASS\n") at msg.c:51
 #12 0x0000000000401ceb in main (argc=<optimized out>, argv=<optimized out>) at ptrace_sig.c:172

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin pushed a commit that referenced this pull request Jun 24, 2016
Fix CID 163485 (#2 of 2): Dereference null return value (NULL_RETURNS)
7. dereference: Dereferencing a pointer that might be null dest when
calling handle_user_fault.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin added a commit that referenced this pull request Jul 7, 2016
It can be dead-locked:
 #0  0x00007fafbf49f6ac in __lll_lock_wait_private () from /lib64/libc.so.6
 #1  0x00007fafbf44af1c in _L_lock_2460 () from /lib64/libc.so.6
 #2  0x00007fafbf44ad57 in __tz_convert () from /lib64/libc.so.6
 #3  0x00000000004022e2 in test_msg (format=0x404508 "Receive signal %d\n") at msg.c:51
 #4  <signal handler called>
 #5  0x00007fafbf3f2483 in __GI__IO_vfscanf () from /lib64/libc.so.6
 #6  0x00007fafbf408f27 in vsscanf () from /lib64/libc.so.6
 #7  0x00007fafbf4032f7 in sscanf () from /lib64/libc.so.6
 #8  0x00007fafbf449ba6 in __tzset_parse_tz () from /lib64/libc.so.6
 #9  0x00007fafbf44c4cb in __tzfile_compute () from /lib64/libc.so.6
 #10 0x00007fafbf44ae17 in __tz_convert () from /lib64/libc.so.6
 #11 0x00000000004022e2 in test_msg (format=format@entry=0x40458c "PASS\n") at msg.c:51
 #12 0x0000000000401ceb in main (argc=<optimized out>, argv=<optimized out>) at ptrace_sig.c:172

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin pushed a commit that referenced this pull request Jul 13, 2016
Fix CID 163485 (#2 of 2): Dereference null return value (NULL_RETURNS)
7. dereference: Dereferencing a pointer that might be null dest when
calling handle_user_fault.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin added a commit that referenced this pull request Jul 13, 2016
Program terminated with signal SIGSEGV, Segmentation fault.
 #0  0x0000000000435744 in cr_pre_dump_finish (ret=0) at cr-dump.c:1452
1452			pr_info("\tPre-dumping %d\n", ctl->pid.virt);
(gdb) bt
 #0  0x0000000000435744 in cr_pre_dump_finish (ret=0) at cr-dump.c:1452
 #1  cr_pre_dump_tasks (pid=pid@entry=24) at cr-dump.c:1556
 #2  0x000000000041f665 in main (argc=<optimized out>, argv=0x7ffda430e818, envp=<optimized out>) at crtools.c:753

https://github.com/xemul/criu/issues/189

Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin added a commit that referenced this pull request Aug 16, 2016
Program terminated with signal SIGSEGV, Segmentation fault.
 #0  0x0000000000435744 in cr_pre_dump_finish (ret=0) at cr-dump.c:1452
1452			pr_info("\tPre-dumping %d\n", ctl->pid.virt);
(gdb) bt
 #0  0x0000000000435744 in cr_pre_dump_finish (ret=0) at cr-dump.c:1452
 #1  cr_pre_dump_tasks (pid=pid@entry=24) at cr-dump.c:1556
 #2  0x000000000041f665 in main (argc=<optimized out>, argv=0x7ffda430e818, envp=<optimized out>) at crtools.c:753

https://github.com/xemul/criu/issues/189

Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin pushed a commit that referenced this pull request Aug 16, 2016
Fix CID 163485 (#2 of 2): Dereference null return value (NULL_RETURNS)
7. dereference: Dereferencing a pointer that might be null dest when
calling handle_user_fault.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin added a commit that referenced this pull request Sep 9, 2016
It can be dead-locked:

 #0  0x00007fafbf49f6ac in __lll_lock_wait_private () from /lib64/libc.so.6
 #1  0x00007fafbf44af1c in _L_lock_2460 () from /lib64/libc.so.6
 #2  0x00007fafbf44ad57 in __tz_convert () from /lib64/libc.so.6
 #3  0x00000000004022e2 in test_msg (format=0x404508 "Receive signal %d\n") at msg.c:51
 #4  <signal handler called>
 #5  0x00007fafbf3f2483 in __GI__IO_vfscanf () from /lib64/libc.so.6
 #6  0x00007fafbf408f27 in vsscanf () from /lib64/libc.so.6
 #7  0x00007fafbf4032f7 in sscanf () from /lib64/libc.so.6
 #8  0x00007fafbf449ba6 in __tzset_parse_tz () from /lib64/libc.so.6
 #9  0x00007fafbf44c4cb in __tzfile_compute () from /lib64/libc.so.6
 #10 0x00007fafbf44ae17 in __tz_convert () from /lib64/libc.so.6
 #11 0x00000000004022e2 in test_msg (format=format@entry=0x40458c "PASS\n") at msg.c:51
 #12 0x0000000000401ceb in main (argc=<optimized out>, argv=<optimized out>) at ptrace_sig.c:172

https://jira.sw.ru/browse/PSBM-47772

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
avagin pushed a commit that referenced this pull request Sep 12, 2016
Fix CID 163485 (#2 of 2): Dereference null return value (NULL_RETURNS)
7. dereference: Dereferencing a pointer that might be null dest when
calling handle_user_fault.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin pushed a commit that referenced this pull request Sep 28, 2016
I discovered that the scripts/ suffix is added to __nmk_dir despite
the fact it already contains it, ending in obviously wrong filenames
like scripts/nmk/scripts/scripts/msg.mk. As those files are non-existent,
make tried to recreate every .mk file, spawning a child to execute the
'true' command, like this (part of "make -dr" output):

> Considering target file '../scripts/nmk/scripts/scripts/include.mk'.
>  File '../scripts/nmk/scripts/scripts/include.mk' does not exist.
>  Finished prerequisites of target file
> '../scripts/nmk/scripts/scripts/include.mk'.
> Must remake target '../scripts/nmk/scripts/scripts/include.mk'.
> Putting child 0x564ec1768740 (../scripts/nmk/scripts/scripts/include.mk)
> PID 21633 on the chain.
> Live child 0x564ec1768740 (../scripts/nmk/scripts/scripts/include.mk)
> PID 21633
> Reaping winning child 0x564ec1768740 PID 21633
> Removing child 0x564ec1768740 PID 21633 from chain.

The fix was to remove the extra scripts/, but once I did it, I found
out problem #2: these targets, being defined in content that is often
included at the beginning of Makefiles, hijack the default make
target (the first one in the Makefile), breaking the usual and
expected make behavior, and forcing the use of .DEFAULT_GOAL.

Finally, I don't know why these targets are there, i.e. what purpose
they serve. Maybe it was done to exclude any implicit rules to
re-make those files, but there are no such rules as far as I can see.

So, in order to address problem #2, I have removed these targets.
I don't see any harm in doing that; let me know if it breaks anything.

Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin pushed a commit that referenced this pull request Oct 17, 2016
I discovered that the scripts/ suffix is added to __nmk_dir despite
the fact it already contains it, ending in obviously wrong filenames
like scripts/nmk/scripts/scripts/msg.mk. As those files are non-existent,
make tried to recreate every .mk file, spawning a child to execute the
'true' command, like this (part of "make -dr" output):

> Considering target file '../scripts/nmk/scripts/scripts/include.mk'.
>  File '../scripts/nmk/scripts/scripts/include.mk' does not exist.
>  Finished prerequisites of target file
> '../scripts/nmk/scripts/scripts/include.mk'.
> Must remake target '../scripts/nmk/scripts/scripts/include.mk'.
> Putting child 0x564ec1768740 (../scripts/nmk/scripts/scripts/include.mk)
> PID 21633 on the chain.
> Live child 0x564ec1768740 (../scripts/nmk/scripts/scripts/include.mk)
> PID 21633
> Reaping winning child 0x564ec1768740 PID 21633
> Removing child 0x564ec1768740 PID 21633 from chain.

The fix was to remove the extra scripts/, but once I did it, I found
out problem #2: these targets, being defined in content that is often
included at the beginning of Makefiles, hijack the default make
target (the first one in the Makefile), breaking the usual and
expected make behavior, and forcing the use of .DEFAULT_GOAL.

Finally, I don't know why these targets are there, i.e. what purpose
they serve. Maybe it was done to exclude any implicit rules to
re-make those files, but there are no such rules as far as I can see.

So, in order to address problem #2, I have removed these targets.
I don't see any harm in doing that; let me know if it breaks anything.

Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin pushed a commit that referenced this pull request Oct 17, 2016
Fix CID 163485 (#2 of 2): Dereference null return value (NULL_RETURNS)
7. dereference: Dereferencing a pointer that might be null dest when
calling handle_user_fault.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin added a commit that referenced this pull request Oct 26, 2016
A root mount namespace list is used to resolve paths to
unix sockets if they are placed on btrfs.

This patch fixes a crash:
 #0 mount_resolve_path at criu/mount.c:213
 #1 phys_stat_resolve_dev at criu/mount.c:240
 #2 phys_stat_dev_match at criu/mount.c:256
 #3 unix_process_name at criu/sk-unix.c:565
 #4 unix_collect_one at criu/sk-unix.c:620
 #5 unix_receive_one at criu/sk-unix.c:692
 #6 nlmsg_receive at criu/libnetlink.c:45
 #7 do_rtnl_req at criu/libnetlink.c:119
 #8 do_collect_req at criu/sockets.c:610
 #9 collect_sockets at criu/sockets.c:636

travis-ci: success for cr-check: fill up a root task mount namespace
https://bugzilla.redhat.com/show_bug.cgi?id=1381351
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
avagin pushed a commit that referenced this pull request Oct 27, 2018
By exhaustive testing I understand a test suite that generates as many
states to try to C/R as possible by trying all the possible sequences
of system calls. Since such a generation, if done on all the Linux API
we support in CRIU, would produce bazillions of processes, I propose to
start with something simple.

As a starting point -- unix stream sockets with abstract names that
can be created and used by a single process :)

The script generates situations that unix sockets can get into by
using a pre-defined set of system calls. In this patch the syscalls
are socket, listen, bind, accept, connect and send. Also the number
of system calls to use (i.e. -- the depth of the tree) is limited by
the --depth option.

There are three things that can be done with a generated 'state':

I) Generate :) and show

Generation is done by recursively doing everything that is possible
(and makes sense) in a given state. To reduce the size of the tree
some meaningless branches are cut, e.g. creating a socket and closing
it right after that, creating two similar sockets one-by-one and some
more.
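
The recursive generation with pruning can be sketched on a toy model. This
is not the script from this patch: the state encoding (one code per socket),
the helper names steps() and generate(), and the reduced syscall set
(socket, bind, listen, close) are assumptions made only to show the shape
of a depth-limited walk that cuts the two branches named above.

```python
from typing import Iterator, Optional, Set, Tuple

# Toy state: one code per socket. A = unnamed, B = bound, BL = listening,
# X = closed. (Hypothetical encoding, much simpler than the real script.)
State = Tuple[str, ...]


def replace(state: State, i: int, code: str) -> State:
    """Return a copy of state with socket i set to code."""
    return state[:i] + (code,) + state[i + 1:]


def steps(state: State, just_created: Optional[int]) -> Iterator[Tuple[State, Optional[int]]]:
    """Yield (next_state, index of the socket created by this step, if any)."""
    # socket(): cut the branch when a fresh unnamed socket already exists
    # ("creating two similar sockets one-by-one").
    if "A" not in state:
        yield state + ("A",), len(state)
    for i, code in enumerate(state):
        if code == "A":
            yield replace(state, i, "B"), None   # bind()
        elif code == "B":
            yield replace(state, i, "BL"), None  # listen()
        # close(): cut the branch right after the socket was created.
        if code != "X" and i != just_created:
            yield replace(state, i, "X"), None


def generate(depth: int, state: State = (), just_created: Optional[int] = None,
             seen: Optional[Set[State]] = None) -> Set[State]:
    """Collect every state reachable with at most depth toy syscalls."""
    seen = set() if seen is None else seen
    seen.add(state)
    if depth > 0:
        for nxt, created in steps(state, just_created):
            generate(depth - 1, nxt, created, seen)
    return seen
```

Calling generate(3) walks every sequence of up to three toy syscalls and
returns the set of distinct states, which is the role --depth plays in the
text above.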

Shown on the screen is a cryptic string, e.g. the 'SA-CX-MX_SBL' one,
describing the sockets in the state. This is how it can be decoded:

 - sockets are delimited with _
 - first goes type (S -- stream, D -- datagram)
 - next goes name state (A -- no name, B -- with name, X -- socket is
   not in FD table, i.e. closed or not yet accepted)
 - next may go letter L meaning that the socket is listening
 - -Cx -- socket is connected and x is the peer's name state
 - -Ixyz -- socket has incoming connections queue and xyz are the
   connect()-ors name states
 - -Mxyz -- socket has messages and xyz are the senders' name states

The example above means that we have two sockets:

 - SA-CX-MX: stream, with no name, connected to a dead one and with a
   message from a dead one
 - SBL: stream, with name, listening
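
The decoding rules above can be written down as a small parser. This is
only a sketch based on the rules as listed here; the names SockDesc,
decode_socket() and decode_state() are hypothetical and are not taken
from the script in this patch.

```python
from dataclasses import dataclass, field
from typing import List, Optional

TYPES = {"S": "stream", "D": "datagram"}
NAMES = {"A": "no name", "B": "with name", "X": "not in FD table"}


@dataclass
class SockDesc:
    type: str
    name: str
    listening: bool = False
    peer: Optional[str] = None                          # from -Cx
    incoming: List[str] = field(default_factory=list)   # from -Ixyz
    messages: List[str] = field(default_factory=list)   # from -Mxyz


def decode_socket(desc: str) -> SockDesc:
    """Decode one socket description such as 'SA-CX-MX' or 'SBL'."""
    head, *parts = desc.split("-")
    sock = SockDesc(type=TYPES[head[0]], name=NAMES[head[1]],
                    listening=head[2:] == "L")
    for part in parts:
        kind, states = part[0], [NAMES[c] for c in part[1:]]
        if kind == "C":
            sock.peer = states[0]
        elif kind == "I":
            sock.incoming = states
        elif kind == "M":
            sock.messages = states
        else:
            raise ValueError("unknown section: " + part)
    return sock


def decode_state(state: str) -> List[SockDesc]:
    """Sockets in a state string are delimited with '_'."""
    return [decode_socket(s) for s in state.split("_")]
```

For the example string, decode_state('SA-CX-MX_SBL') gives two entries: a
stream socket with no name whose peer and message sender are not in the FD
table, and a named listening stream socket.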

Next printed is the sequence of system calls to get into it, e.g. this
is how to get into the state above:

	socket(S) = 1
	bind(1, $name-1)
	listen(1)
	socket(S) = 2
	connect(2, $name-1)
	accept(1) = 3
	send(2, $message-0)
	send(3, $message-0)
	close(3)

The program has created a stream socket, bound it, listened on it, then
created another stream socket, connected it to the 1st one, then accepted
the connection, sent a message in each direction and closed the accepted
end, leaving the first socket in the description (SA-CX-MX) connected to
the dead socket with a message from it.
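
For reference, the same sequence can be replayed by hand with Python's
socket module. This is a sketch, not the code from this patch: replay()
is a hypothetical helper, the abstract name passed in is made up by the
caller, and abstract unix socket names are Linux-only.

```python
import socket


def replay(name: bytes):
    """Replay the syscall list above.

    name must start with b"\\0" so that it lands in the abstract
    namespace (Linux-only). Returns the listening socket and the
    connecting socket; the accepted end is closed before returning.
    """
    s1 = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)  # socket(S) = 1
    s1.bind(name)                                           # bind(1, $name-1)
    s1.listen(1)                                            # listen(1)
    s2 = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)  # socket(S) = 2
    s2.connect(name)                                        # connect(2, $name-1)
    s3, _ = s1.accept()                                     # accept(1) = 3
    s2.send(b"message-0")                                   # send(2, $message-0)
    s3.send(b"message-0")                                   # send(3, $message-0)
    s3.close()                                              # close(3)
    return s1, s2
```

After replay(b"\0some-name") returns, s2 is the socket described as
SA-CX-MX and s1 is the one described as SBL.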

II) Run the state

This is when the test actually creates a process that does the syscalls
required to get into the generated state (and hopefully gets into it).

III) Check C/R of the state

This is the trickiest part when it comes to the R step -- it's not
clear how to validate that the state restored is correct. But if only
trying to dump the state -- it's just calling criu dump. The state
string description is used as the images dir.

One may choose only to generate the states with --gen option. One may
choose only to run the states with --run option. The latter is useful
to verify that the states generator is actually producing valid
states. If no options are given, the state is also dumped (restore is to
come later).

For now the usage experience is like this:

- Going --depth 10 --gen (i.e. just generating all possible states
  that are achievable with 10 syscalls) produces 44 unique states in
  0.01 seconds. The generated result covers some static tests we have
  in zdtm :)  More generation stats:
   --depth 15 : 1.1 sec   / 72 states
   --depth 18 : 13.2 sec  / 89 states
   --depth 20 : 1 m 8 sec / 101 states

- Running and trying with criu is checked with --depth 9. Criu fails
  to dump the state SA-CX-MX_SBL (shown above) with the error

  Error (criu/sk-queue.c:151): recvmsg fail: error: Connection reset by peer

Nearest plans:

1. Add generators for on-disk socket names (now only abstract).
   Here an interesting case is when names overlap and one socket gets
   a name of another, but isn't accessible by it

2. Add datagram sockets.
   Here it'd be fun to look at how many-to-one connections are
   generated and checked.

3. Add socketpair()-s.

Further plans:

1. Cut the tree better to allow for deeper tree scan.

2. Add restore.

3. Add SCM-s

4. Have the exhaustive testing for other resources.

Changes since v1:

* Added DGRAM sockets :)

  Dgram sockets are trickier than STREAM, as they can reconnect from
  one peer to another. Thus just limiting the tree depth results in
  weird states when a socket just changes peer. In the v1 of this patch
  new sockets were added to the state only when old ones reported that
  there's nothing that can be done with them. This limited the amount
  of stupid branches, but this strategy doesn't work with dgram due to
  reconnect. Due to this, change #2:

* Added the --sockets NR option to limit the amount of sockets.

  This allows throwing new sockets into the state on each step, which
  made a lot of interesting states for DGRAM ones.

* Added the 'restore' stage and checks after it.

  After the process is restored, the script performs as many checks as
  possible, having the expected state description in memory. The checks
  verify that the values below, obtained from the real sockets, match
  the expectations in the generated state:

   - socket itself
   - name
   - listen state
   - pending connections
   - messages in queue (sender is not checked)
   - connectivity

  The latter is checked last, after all queues should be empty, by
  sending control messages with socket.recv() method.

* Added --keep option to run all tests even if one of them fails.

  And print nice summary at the end.

So far the test found several issues:

- Dump doesn't work for half-closed connection with unread messages
- Pending half-closed connection is not restored
- Socket name is not restored
- Message is not restored

New TODO:

- Check listen state is still possible to accept connections (?)
- Add socketpair()s
- Add on-disk names
- Add SCM-s
- Exhaustive script for other resources

Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
avagin pushed a commit that referenced this pull request Jul 15, 2019
Segmentation fault was raised while trying to restore a process with
tty. Coredump file says this is caused by uninitialized tty_mutex:
        (gdb) where
        #0  0x00000000004d7270 in atomic_add_return (i=1, v=0x0) at
        include/common/asm/atomic.h:34
        #1  0x00000000004d7398 in mutex_lock (m=0x0) at
        include/common/lock.h:151
        #2  0x00000000004d840c in __pty_open_ptmx_index (index=3, flags=2,
        cb=0x4dce50 <open_pty>, arg=0x11, path=0x5562e0 "ptmx") at
        criu/tty.c:603
        #3  0x00000000004dced8 in pty_create_ptmx_index (dfd=17, index=3,
        flags=2) at criu/tty.c:2384

Since init_tty_mutex() is reentrant, just call it before
mutex_lock().

Signed-off-by: Deng Guangxing <dengguangxing@huawei.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
avagin pushed a commit that referenced this pull request Sep 7, 2019
Segmentation fault was raised while trying to restore a process with
tty. Coredump file says this is caused by uninitialized tty_mutex:
        (gdb) where
        #0  0x00000000004d7270 in atomic_add_return (i=1, v=0x0) at
        include/common/asm/atomic.h:34
        #1  0x00000000004d7398 in mutex_lock (m=0x0) at
        include/common/lock.h:151
        #2  0x00000000004d840c in __pty_open_ptmx_index (index=3, flags=2,
        cb=0x4dce50 <open_pty>, arg=0x11, path=0x5562e0 "ptmx") at
        criu/tty.c:603
        #3  0x00000000004dced8 in pty_create_ptmx_index (dfd=17, index=3,
        flags=2) at criu/tty.c:2384

Since init_tty_mutex() is reentrant, just call it before
mutex_lock().

Signed-off-by: Deng Guangxing <dengguangxing@huawei.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
avagin pushed a commit that referenced this pull request Sep 7, 2019
Segmentation fault was raised while trying to restore a process with
tty. Coredump file says this is caused by uninitialized tty_mutex:
        (gdb) where
        #0  0x00000000004d7270 in atomic_add_return (i=1, v=0x0) at
        include/common/asm/atomic.h:34
        #1  0x00000000004d7398 in mutex_lock (m=0x0) at
        include/common/lock.h:151
        #2  0x00000000004d840c in __pty_open_ptmx_index (index=3, flags=2,
        cb=0x4dce50 <open_pty>, arg=0x11, path=0x5562e0 "ptmx") at
        criu/tty.c:603
        #3  0x00000000004dced8 in pty_create_ptmx_index (dfd=17, index=3,
        flags=2) at criu/tty.c:2384

Since init_tty_mutex() is reentrant, just call it before
mutex_lock().

Signed-off-by: Deng Guangxing <dengguangxing@huawei.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
avagin added a commit that referenced this pull request Mar 30, 2020
CID 226486 (#2 of 2): Resource leak (RESOURCE_LEAK)
21. leaked_storage: Variable mi going out of scope leaks the storage it points to.

Signed-off-by: Andrei Vagin <avagin@gmail.com>
avagin pushed a commit that referenced this pull request Sep 29, 2020
CID 302717 (#2 of 2): Resource leak (RESOURCE_LEAK)
 Variable dirnew going out of scope leaks the storage it points to.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Sep 29, 2020
CID 226486 (#1 of 2): Resource leak (RESOURCE_LEAK)
 Variable mi going out of scope leaks the storage it points to.

CID 226486 (#2 of 2): Resource leak (RESOURCE_LEAK)
 Variable mi going out of scope leaks the storage it points to.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Sep 29, 2020
CID 226485 (#1 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

CID 226485 (#2 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

CID 226485 (#3 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Sep 29, 2020
CID 226478 (#1 of 2): Double close (USE_AFTER_FREE)
 Calling close(int) closes handle fd which has already been closed.

CID 226478 (#2 of 2): Double close (USE_AFTER_FREE)
 Calling close(int) closes handle fd which has already been closed.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Sep 29, 2020
CID 73358 (#2 of 2): Argument cannot be negative (NEGATIVE_RETURNS)
 sk is passed to a parameter that cannot be negative.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 1, 2020
CID 302717 (#2 of 2): Resource leak (RESOURCE_LEAK)
 Variable dirnew going out of scope leaks the storage it points to.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 1, 2020
CID 226486 (#1 of 2): Resource leak (RESOURCE_LEAK)
 Variable mi going out of scope leaks the storage it points to.

CID 226486 (#2 of 2): Resource leak (RESOURCE_LEAK)
 Variable mi going out of scope leaks the storage it points to.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 1, 2020
CID 226485 (#1 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

CID 226485 (#2 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

CID 226485 (#3 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

Also changed epoll_prepare() to check the return value of epoll_create()
against '< 0' instead of '== -1' to make Coverity happy.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 1, 2020
CID 226478 (#1 of 2): Double close (USE_AFTER_FREE)
 Calling close(int) closes handle fd which has already been closed.

CID 226478 (#2 of 2): Double close (USE_AFTER_FREE)
 Calling close(int) closes handle fd which has already been closed.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 1, 2020
CID 73358 (#2 of 2): Argument cannot be negative (NEGATIVE_RETURNS)
 sk is passed to a parameter that cannot be negative.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 5, 2020
CID 302717 (#2 of 2): Resource leak (RESOURCE_LEAK)
 Variable dirnew going out of scope leaks the storage it points to.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 5, 2020
CID 226486 (#1 of 2): Resource leak (RESOURCE_LEAK)
 Variable mi going out of scope leaks the storage it points to.

CID 226486 (#2 of 2): Resource leak (RESOURCE_LEAK)
 Variable mi going out of scope leaks the storage it points to.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 5, 2020
CID 226485 (#1 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

CID 226485 (#2 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

CID 226485 (#3 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

Also changed epoll_prepare() to check the return value of epoll_create()
against '< 0' instead of '== -1' to make Coverity happy.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 5, 2020
CID 226478 (#1 of 2): Double close (USE_AFTER_FREE)
 Calling close(int) closes handle fd which has already been closed.

CID 226478 (#2 of 2): Double close (USE_AFTER_FREE)
 Calling close(int) closes handle fd which has already been closed.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 5, 2020
CID 73358 (#2 of 2): Argument cannot be negative (NEGATIVE_RETURNS)
 sk is passed to a parameter that cannot be negative.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 12, 2020
CID 302717 (#2 of 2): Resource leak (RESOURCE_LEAK)
 Variable dirnew going out of scope leaks the storage it points to.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 12, 2020
CID 226486 (#1 of 2): Resource leak (RESOURCE_LEAK)
 Variable mi going out of scope leaks the storage it points to.

CID 226486 (#2 of 2): Resource leak (RESOURCE_LEAK)
 Variable mi going out of scope leaks the storage it points to.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 12, 2020
CID 226485 (#1 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

CID 226485 (#2 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

CID 226485 (#3 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

Also changed epoll_prepare() to check the return value of epoll_create()
against '< 0' instead of '== -1' to make Coverity happy.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 12, 2020
CID 226478 (#1 of 2): Double close (USE_AFTER_FREE)
 Calling close(int) closes handle fd which has already been closed.

CID 226478 (#2 of 2): Double close (USE_AFTER_FREE)
 Calling close(int) closes handle fd which has already been closed.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 12, 2020
CID 73358 (#2 of 2): Argument cannot be negative (NEGATIVE_RETURNS)
 sk is passed to a parameter that cannot be negative.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 29, 2020
CID 302717 (#2 of 2): Resource leak (RESOURCE_LEAK)
 Variable dirnew going out of scope leaks the storage it points to.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 29, 2020
CID 226486 (#1 of 2): Resource leak (RESOURCE_LEAK)
 Variable mi going out of scope leaks the storage it points to.

CID 226486 (#2 of 2): Resource leak (RESOURCE_LEAK)
 Variable mi going out of scope leaks the storage it points to.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 29, 2020
CID 226485 (#1 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

CID 226485 (#2 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

CID 226485 (#3 of 3): Resource leak (RESOURCE_LEAK)
 Variable events going out of scope leaks the storage it points to

Also changed epoll_prepare() to check the return value of epoll_create()
against '< 0' instead of '== -1' to make Coverity happy.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 29, 2020
CID 226478 (#1 of 2): Double close (USE_AFTER_FREE)
 Calling close(int) closes handle fd which has already been closed.

CID 226478 (#2 of 2): Double close (USE_AFTER_FREE)
 Calling close(int) closes handle fd which has already been closed.

Signed-off-by: Adrian Reber <areber@redhat.com>
avagin pushed a commit that referenced this pull request Oct 29, 2020
CID 73358 (#2 of 2): Argument cannot be negative (NEGATIVE_RETURNS)
 sk is passed to a parameter that cannot be negative.

Signed-off-by: Adrian Reber <areber@redhat.com>