Error (criu/net.c:1738): iptables-restore -w failed #469

Open
avagin opened this issue Apr 5, 2018 · 14 comments

@avagin
Member

commented Apr 5, 2018

====================== Run zdtm/static/cmdlinenv00 in uns ======================
Start test
Test is SUID
./cmdlinenv00 --pidfile=cmdlinenv00.pid --outfile=cmdlinenv00.out --arg1=arg1 --arg2=arg2 --arg3=arg3
Run criu dump
Run criu restore
=[log]=> dump/zdtm/static/cmdlinenv00/186/1/restore.log
------------------------ grep Error ------------------------
(00.291498)      1: 	Running ip rule delete table local
(00.334470)      1: 	Running ip rule restore
(00.348395)      1: 	Running iptables-restore -w for iptables-restore -w
Fatal: can't open lock file /run/xtables.lock: Permission denied
(00.361974)      1: Error (criu/util.c:841): exited, status=4
(00.361990)      1: Error (criu/net.c:1738): iptables-restore -w failed
(00.361998)      1: Error (criu/net.c:2389): Can't create net_ns
(00.362025)      1: Error (criu/util.c:1566): Can't wait or bad status: errno=0, status=65280
(00.362656) Error (criu/cr-restore.c:2313): Failed to switch restore stage to CR_STATE_PREPARE_NAMESPACES
(00.410145) Error (criu/mount.c:3175): mnt: Can't remove the directory /tmp/.criu.mntns.2ighcL: No such file or directory
(00.410168) uns: calling exit_usernsd[0x468120] (-1, 1)
(00.410208) uns: daemon calls 0x468120 (203, -1, 1)
(00.410216) uns: `- daemon exits w/ 0
(00.410915) uns: daemon stopped
(00.410923) Error (criu/cr-restore.c:2523): Restoring FAILED.
------------------------ ERROR OVER ------------------------
############## Test zdtm/static/cmdlinenv00 FAIL at CRIU restore ###############

To: @tkhai
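
For readers decoding the log above: `status=65280` on the "Can't wait or bad status" line looks like a raw wait(2) status rather than an exit code. Assuming that is the case, 65280 is 0xFF00, which the standard macros decode to exit code 255. The helper name below is hypothetical; this is a minimal sketch, not CRIU code.

```c
#include <sys/wait.h>

/* Decode a raw wait(2) status.
 * Returns the exit code for a normal exit, or -1 otherwise
 * (for example, when the child was killed by a signal). */
int exit_code_from_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

With this helper, `exit_code_from_status(65280)` yields 255.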

@avagin

Member Author

commented Apr 5, 2018

Cc: @Snorch

@tkhai

Contributor

commented Apr 6, 2018

This came after iptables commit 80d8bfaac9e2430d710084a10ec78e68bd61e6ec "iptables: insist that the lock is held."

@avagin

Member Author

commented Apr 6, 2018

@tkhai iptables-restore -w is called from a user namespace, and it doesn't have permission to open the host's /run/xtables.lock:

[avagin@laptop linux-task-diag]$ ls -l /run/xtables.lock 
-rw-------. 1 root root 0 Mar 23 22:20 /run/xtables.lock

Why do we need to call iptables-restore with -w? Can it race with something else?
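
The permission failure can be reproduced outside of CRIU with a small probe. This is a sketch under the assumption that the caller only needs to know which errno open(2) reports; the function name is hypothetical. Run inside a user namespace against a root-owned 0600 file like the one listed above, it would be expected to return EACCES, matching the "Permission denied" in the log.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to open an existing lock file read-only.
 * Returns 0 on success, or the errno value set by open(2) on failure. */
int probe_lock_file(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return errno;

    close(fd);
    return 0;
}
```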

@tkhai

Contributor

commented Apr 9, 2018

Without the option the restore fails with:
(00.202572) 1: Running iptables-restore for iptables-restore
Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
(00.216730) 1: Error (criu/util.c:709): exited, status=4
(00.216772) 1: Error (criu/net.c:1328): iptables-restore failed

Two parallel restores use the same /run/xtables.lock and race each other.

We should fix this by adding "chmod 0666 /run/xtables.lock" somewhere in the scripts. You know better than I do where the appropriate place for it is in the zdtm or travis scripts.
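
To illustrate the race described above, here is a minimal sketch of a "wait for the lock" helper built on flock(2). It is not the iptables implementation; the function name, the 100 ms polling interval, and the return convention are assumptions made for illustration. Two callers that open the same path compete for one exclusive lock, and the second one either waits or gives up.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <time.h>
#include <unistd.h>

/* Open (creating if needed) a lock file and take an exclusive lock,
 * polling every 100 ms for up to wait_seconds.
 * Returns the locked fd, or -1 on error or timeout. */
int lock_with_wait(const char *path, int wait_seconds)
{
    assert(path != NULL);

    int fd = open(path, O_CREAT | O_RDONLY, 0600);
    if (fd < 0)
        return -1;

    for (int waited_ms = 0; ; waited_ms += 100) {
        if (flock(fd, LOCK_EX | LOCK_NB) == 0)
            return fd;              /* lock acquired */

        /* Give up on real errors or once the wait budget is spent. */
        if (errno != EWOULDBLOCK || waited_ms >= wait_seconds * 1000) {
            close(fd);
            return -1;
        }

        struct timespec pause = { 0, 100 * 1000 * 1000 };
        nanosleep(&pause, NULL);
    }
}
```

With `wait_seconds` set to 0 the helper behaves like a call without -w: it fails immediately if another holder exists.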

@avagin

Member Author

commented Apr 9, 2018

We can't do this. It will mean that any user will be able to take this lock. This problem affects other users, not only our tests.

@Snorch

Contributor

commented Apr 9, 2018

I think this lock is meant to be per net namespace; we don't care about simultaneous iptables-restore runs in different namespaces. So we could make /run/xtables.lock available to a container's root user (but it should be a per-container file).
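
One way to picture a per-namespace lock is to derive the file name from the namespace itself. The sketch below uses the inode number of a namespace file such as /proc/self/ns/net, which identifies the namespace, to build a path. The function name and the `/run/xtables.<inode>.lock` naming scheme are hypothetical and are not something iptables or CRIU does.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Build a lock path that differs per namespace by embedding the inode
 * number of ns_path (for example "/proc/self/ns/net").
 * Returns 0 on success, -1 if stat fails or buf is too small. */
int per_netns_lock_path(const char *ns_path, char *buf, size_t len)
{
    struct stat st;

    assert(ns_path != NULL && buf != NULL);

    if (stat(ns_path, &st) != 0)
        return -1;

    int n = snprintf(buf, len, "/run/xtables.%llu.lock",
                     (unsigned long long)st.st_ino);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```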

@tkhai

Contributor

commented Apr 9, 2018

We are affected in tests only, so we can simply do the chmod. Tests are always executed in their own /.

@rst0git

Collaborator

commented Apr 9, 2018

Hi @tkhai, I also get this error when restoring a libvirt-lxc container with userns enabled. Would it be possible to do the chmod in criu/net.c?

@tkhai

Contributor

commented Apr 9, 2018

I think we should try to suggest something like this for iptables:
diff --git a/iptables/xshared.c b/iptables/xshared.c
index 06db72d4..4704e5a4 100644
--- a/iptables/xshared.c
+++ b/iptables/xshared.c
@@ -254,7 +254,12 @@ static int xtables_lock(int wait, struct timeval *wait_interval)
        time_left.tv_sec = wait;
        time_left.tv_usec = 0;
 
-       fd = open(XT_LOCK_NAME, O_CREAT, 0600);
+       if (link("/proc/self/ns/net", XT_LOCK_NAME) != 0 &&
+               errno != EEXIST) {
+               fprintf(stderr, "Fatal: can't create lock file\n");
+               return XT_LOCK_FAILED;
+       }
+       fd = open(XT_LOCK_NAME, O_RDONLY);
        if (fd < 0) {
                fprintf(stderr, "Fatal: can't open lock file %s: %s\n",
                        XT_LOCK_NAME, strerror(errno));

@rst0git

Collaborator

commented Apr 9, 2018

Yes, I agree! The following diff works for me as well.

diff --git a/iptables/xshared.c b/iptables/xshared.c
index 06db72d..b9ee339 100644
--- a/iptables/xshared.c
+++ b/iptables/xshared.c
@@ -254,7 +254,12 @@ static int xtables_lock(int wait, struct timeval *wait_interval)
        time_left.tv_sec = wait;
        time_left.tv_usec = 0;
 
-       fd = open(XT_LOCK_NAME, O_CREAT, 0600);
+       if (link("/proc/self/ns/net", XT_LOCK_NAME) != 0 &&
+               errno != EEXIST) {
+               fprintf(stderr, "Fatal: can't create lock file\n");
+               return XT_LOCK_FAILED;
+       }
+       fd = open(XT_LOCK_NAME, O_RDONLY);
        if (fd < 0) {
                fprintf(stderr, "Fatal: can't open lock file %s: %s\n",
                        XT_LOCK_NAME, strerror(errno));
@tkhai

Contributor

commented Apr 9, 2018

Hm, though s/link/symlink/
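
A sketch of the symlink variant, written as a standalone helper rather than as a patch to iptables. The likely reason for the substitution is that link(2) cannot create hard links across filesystems (it fails with EXDEV), and /proc is a different filesystem from /run; that reading is an assumption, as are the function name and parameters.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Make lock_path a symlink to target (for example "/proc/self/ns/net"),
 * tolerating an already existing lock_path, then open it read-only.
 * Returns an fd on success, -1 on failure. */
int open_lock_via_symlink(const char *target, const char *lock_path)
{
    assert(target != NULL && lock_path != NULL);

    if (symlink(target, lock_path) != 0 && errno != EEXIST)
        return -1;

    /* open() follows the symlink, so the fd refers to the target. */
    return open(lock_path, O_RDONLY);
}
```

Because "/proc/self" resolves per process, every opener would reach the file for its own network namespace, which is what makes the lock per-namespace in this scheme.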

avagin added the bug label Apr 22, 2018

criupatchwork pushed a commit to criupatchwork/criu that referenced this issue May 4, 2018

net: mount a new tmpfs if it isn't enough rights to open /run/xtables.lock

Starting with iptables 1.6.2, we have to use the --wait option,
but it doesn't work properly with userns, because in this case,
we don't have enough rights to open /run/xtables.lock.

(00.174703)      1: 	Running iptables-restore -w for iptables-restore -w Fatal: can't open lock file /run/xtables.lock: Permission denied
(00.192058)      1: Error (criu/util.c:842): exited, status=4
(00.192080)      1: Error (criu/net.c:1738): iptables-restore -w failed
(00.192088)      1: Error (criu/net.c:2389): Can't create net_ns
(00.192131)      1: Error (criu/util.c:1567): Can't wait or bad status: errno=0, status=65280

This patch works around this problem by mounting a tmpfs on /run.

checkpoint-restore#469

Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>

avagin added a commit to avagin/criu that referenced this issue May 7, 2018

travis: workaround a problem with /run/xtables.lock
checkpoint-restore#469
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>

avagin added a commit to avagin/criu that referenced this issue May 8, 2018

travis: rollback to fedora 27
We have a few issues with fc28. For example:
checkpoint-restore#469
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>

avagin added a commit that referenced this issue May 8, 2018

travis: rollback to fedora 27
We have a few issues with fc28. For example:
#469

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>

criupatchwork pushed a commit to criupatchwork/criu that referenced this issue May 9, 2018

net: workaround a problem when iptables can't open /run/xtables.lock
Starting with iptables 1.6.2, we have to use the --wait option,
but it doesn't work properly with userns, because in this case,
we don't have enough rights to open /run/xtables.lock.

(00.174703)      1: 	Running iptables-restore -w for iptables-restore -w Fatal: can't open lock file /run/xtables.lock: Permission denied
(00.192058)      1: Error (criu/util.c:842): exited, status=4
(00.192080)      1: Error (criu/net.c:1738): iptables-restore -w failed
(00.192088)      1: Error (criu/net.c:2389): Can't create net_ns
(00.192131)      1: Error (criu/util.c:1567): Can't wait or bad status: errno=0, status=65280

This patch works around this problem by mounting a tmpfs on /run.
Net namespaces are restored in a separate process, so we can create a
new mount namespace and create new mounts.

checkpoint-restore#469

Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
@dfsdevops

commented May 9, 2018

I think I'm running into this issue or something similar. Can somebody confirm?

Ubuntu 18.04 LTS vanilla install, LXD 3.0, CRIU 3.6. All packages are vanilla from apt, with no additional or special configuration. Any time I try to do any stateful snapshotting (lxc snapshot with --stateful) or any live migration, I get a failure from CRIU.

Last few lines from the logs:

(01.266229) 6733 (native) is going to execute the syscall 1, required is 15
pie: 151: 151: new_sp=0x7fbf16bac008 ip 0x7fbf166cfbb7
(01.266430) 6733 was trapped
(01.266494) `- Expecting exit
(01.266613) 6733 was trapped
(01.266663) 6733 (native) is going to execute the syscall 3, required is 15
(01.266783) 6733 was trapped
(01.266824) `- Expecting exit
(01.266936) 6733 was trapped
(01.266983) 6733 (native) is going to execute the syscall 3, required is 15
(01.267082) 6733 was trapped
(01.267171) `- Expecting exit
(01.267296) 6733 was trapped
(01.267343) 6733 (native) is going to execute the syscall 15, required is 15
(01.267439) 6733 was stopped
(01.267947) Unlock network
(01.267984) Running network-unlock scripts
iptables-restore: invalid option -- 'w'
ip6tables-restore: invalid option -- 'w'
(01.310267) Unfreezing tasks into 1
(01.310367)     Unseizing 6411 into 1
(01.310452)     Unseizing 6568 into 1
(01.310513)     Unseizing 6582 into 1
(01.310572)     Unseizing 6733 into 1
(01.310631)     Unseizing 6762 into 1
(01.310689)     Unseizing 6984 into 1
(01.310931)     Unseizing 6986 into 1
(01.311050)     Unseizing 6987 into 1
(01.311171)     Unseizing 6988 into 1
(01.311238)     Unseizing 6989 into 1
(01.311298)     Unseizing 6991 into 1
(01.311351)     Unseizing 6992 into 1
(01.311418)     Unseizing 6993 into 1
(01.311474)     Unseizing 7007 into 1
(01.311538)     Unseizing 7014 into 1
(01.311640)     Unseizing 7020 into 1
(01.311844) Error (criu/cr-dump.c:1709): Dumping FAILED.
@avagin

Member Author

commented May 10, 2018

@wetpaste No, it is another issue. Could you create a new issue and attach logs?
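
For context on the "invalid option -- 'w'" lines in that log: the commit messages in this thread mention iptables 1.6.2 as the version from which the --wait option has to be used. Assuming that older iptables-restore binaries simply do not know -w, a caller could gate the flag on the version. The sketch below parses a version line such as "iptables v1.6.1"; the function name and the input format are assumptions, and this is not what CRIU does.

```c
#include <assert.h>
#include <stdio.h>

/* Look for "vMAJOR.MINOR.PATCH" in version_line.
 * Returns 1 if the version is at least 1.6.2, 0 if it is older,
 * and -1 if no version could be found. */
int restore_supports_wait(const char *version_line)
{
    assert(version_line != NULL);

    for (const char *p = version_line; *p != '\0'; p++) {
        int major, minor, patch;

        if (*p == 'v' &&
            sscanf(p, "v%d.%d.%d", &major, &minor, &patch) == 3) {
            if (major != 1)
                return major > 1;
            if (minor != 6)
                return minor > 6;
            return patch >= 2;
        }
    }
    return -1;
}
```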

avagin added a commit that referenced this issue May 12, 2018

net: workaround a problem when iptables can't open /run/xtables.lock
Starting with iptables 1.6.2, we have to use the --wait option,
but it doesn't work properly with userns, because in this case,
we don't have enough rights to open /run/xtables.lock.

(00.174703)      1: 	Running iptables-restore -w for iptables-restore -w Fatal: can't open lock file /run/xtables.lock: Permission denied
(00.192058)      1: Error (criu/util.c:842): exited, status=4
(00.192080)      1: Error (criu/net.c:1738): iptables-restore -w failed
(00.192088)      1: Error (criu/net.c:2389): Can't create net_ns
(00.192131)      1: Error (criu/util.c:1567): Can't wait or bad status: errno=0, status=65280

This patch works around this problem by mounting a tmpfs on /run.
Net namespaces are restored in a separate process, so we can create a
new mount namespace and create new mounts.

#469

Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
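
The approach described in the commit message can be sketched roughly as follows: in a separate process, create a new mount namespace, stop mount events from propagating back to the host, mount a fresh tmpfs on /run, and only then touch /run/xtables.lock. This is an illustration under stated assumptions, not the actual CRIU patch; the function names, the callback interface, and the exit-code convention are hypothetical. The raw `unshare` syscall is used only so the example does not depend on feature-test macros.

```c
#include <fcntl.h>
#include <linux/sched.h>   /* CLONE_NEWNS */
#include <sys/mount.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

/* Example callback: create and open the lock file on the fresh tmpfs. */
int open_xtables_lock(void)
{
    int fd = open("/run/xtables.lock", O_CREAT | O_RDONLY, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/* Run fn in a child process that has its own mount namespace with a
 * private tmpfs on /run.
 * Returns 0 if fn succeeded, 1 if the namespace or mount setup failed,
 * 2 if fn failed, and -1 on fork/wait errors. */
int with_private_run(int (*fn)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (syscall(SYS_unshare, CLONE_NEWNS) != 0 ||
            /* Keep our mounts from propagating to the parent namespace. */
            mount(NULL, "/", NULL, MS_SLAVE | MS_REC, NULL) != 0 ||
            mount("tmpfs", "/run", "tmpfs", 0, NULL) != 0)
            _exit(1);
        _exit(fn() == 0 ? 0 : 2);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Without sufficient privileges (root, or root inside a user namespace) the setup step fails and the helper returns 1; the host's /run is never modified because the mounts live only in the child's namespace.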
