lxc-start fails when root mount propagation mode set to shared #4

Closed
unitrq opened this Issue Sep 21, 2012 · 3 comments

Comments

Projects
None yet
3 participants
Contributor

unitrq commented Sep 21, 2012

Recent systemd versions set root mount propagation mode to "shared" (commit: http://cgit.freedesktop.org/systemd/systemd/commit/?id=b3ac5f8cb98757416d8660023d6564a7c411f0a0).
This mode prevents lxc from starting containers, resulting in errors like this - http://sourceforge.net/tracker/index.php?func=detail&aid=3559833&group_id=163076&atid=826303
As suggested here - https://bugs.archlinux.org/task/31211, remounting root with --make-rprivate option workarounds this issue.

Owner

stgraber commented Nov 12, 2012

This has been discussed a fair bit on the lxc mailing list over the past few weeks.
Remounting / with --make-rprivate isn't something lxc should do as it'll likely lead to a lot of other problems on the host.

At the moment some people are discussing with systemd upstream to find a compromise to fix what is ultimately a systemd design flaw.

I'm closing the bug against lxc as this should be fixed in systemd or in the kernel and not worked around in lxc.

@stgraber stgraber closed this Nov 12, 2012

Contributor

unitrq commented Nov 14, 2012

that's all I wanted to know, thanks a lot for explanation

So any permanent fix? Are you aware that we can't use lxc containers on F18 and F19 hosts? This is bad.

@fgrehm fgrehm referenced this issue in fgrehm/vagrant-lxc Jul 21, 2013

Closed

Can't start VM on Fedora 19 #113

stgraber added a commit to stgraber/lxc that referenced this issue Dec 19, 2013

remove static_lock()/static_unlock() and start to use thread local st…
…orage (v2)

While testing lxc#106, I found that concurrent starts
are hanging time to time. I then reproduced the same problem in master and got following;

 [caglar@oOo:~] sudo gdb -p 16221
 (gdb) bt
 #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 #1  0x00007f495526515c in _L_lock_982 () from /lib/x86_64-linux-gnu/libpthread.so.0
 #2  0x00007f4955264fab in __GI___pthread_mutex_lock (mutex=0x7f49556d4600 <static_mutex>) at pthread_mutex_lock.c:64
 #3  0x00007f49554b27a6 in lock_mutex (l=l@entry=0x7f49556d4600 <static_mutex>) at lxclock.c:78
 #4  0x00007f49554b2dac in static_lock () at lxclock.c:330
 #5  0x00007f4955498f71 in lxc_global_config_value (option_name=option_name@entry=0x7f49554c02cf "cgroup.use") at utils.c:273
 #6  0x00007f495549926c in default_cgroup_use () at utils.c:366
 #7  0x00007f49554953bd in lxc_cgroup_load_meta () at cgroup.c:94
 #8  0x00007f495548debc in lxc_spawn (handler=handler@entry=0x7f49200af300) at start.c:783
 #9  0x00007f495548e7a7 in __lxc_start (name=name@entry=0x7f49200b48a0 "lxc-test-concurrent-4", conf=conf@entry=0x7f49200b2030, ops=ops@entry=0x7f49556d3900 <start_ops>, data=data@entry=0x7f495487db90,
    lxcpath=lxcpath@entry=0x7f49200b2010 "/var/lib/lxc") at start.c:951
 #10 0x00007f495548eb9c in lxc_start (name=0x7f49200b48a0 "lxc-test-concurrent-4", argv=argv@entry=0x7f495487dbe0, conf=conf@entry=0x7f49200b2030, lxcpath=0x7f49200b2010 "/var/lib/lxc") at start.c:1048
 #11 0x00007f49554b68f1 in lxcapi_start (c=0x7f49200b1dd0, useinit=<optimized out>, argv=0x7f495487dbe0) at lxccontainer.c:648
 #12 0x0000000000401317 in do_function (arguments=0x1aa80b0) at concurrent.c:94
 #13 0x0000000000401499 in concurrent (arguments=<optimized out>) at concurrent.c:130
 #14 0x00007f4955262f6e in start_thread (arg=0x7f495487e700) at pthread_create.c:311
 #15 0x00007f4954f8d9cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

It looks like both parent and child end up with locked mutex thus deadlocks.

I ended up placing values in the thread local storage pool, instead of doing "unlock the lock in the child" dance

Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

stgraber added a commit that referenced this issue Dec 19, 2013

remove static_lock()/static_unlock() and start to use thread local st…
…orage (v2)

While testing #106, I found that concurrent starts
are hanging time to time. I then reproduced the same problem in master and got following;

 [caglar@oOo:~] sudo gdb -p 16221
 (gdb) bt
 #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 #1  0x00007f495526515c in _L_lock_982 () from /lib/x86_64-linux-gnu/libpthread.so.0
 #2  0x00007f4955264fab in __GI___pthread_mutex_lock (mutex=0x7f49556d4600 <static_mutex>) at pthread_mutex_lock.c:64
 #3  0x00007f49554b27a6 in lock_mutex (l=l@entry=0x7f49556d4600 <static_mutex>) at lxclock.c:78
 #4  0x00007f49554b2dac in static_lock () at lxclock.c:330
 #5  0x00007f4955498f71 in lxc_global_config_value (option_name=option_name@entry=0x7f49554c02cf "cgroup.use") at utils.c:273
 #6  0x00007f495549926c in default_cgroup_use () at utils.c:366
 #7  0x00007f49554953bd in lxc_cgroup_load_meta () at cgroup.c:94
 #8  0x00007f495548debc in lxc_spawn (handler=handler@entry=0x7f49200af300) at start.c:783
 #9  0x00007f495548e7a7 in __lxc_start (name=name@entry=0x7f49200b48a0 "lxc-test-concurrent-4", conf=conf@entry=0x7f49200b2030, ops=ops@entry=0x7f49556d3900 <start_ops>, data=data@entry=0x7f495487db90,
    lxcpath=lxcpath@entry=0x7f49200b2010 "/var/lib/lxc") at start.c:951
 #10 0x00007f495548eb9c in lxc_start (name=0x7f49200b48a0 "lxc-test-concurrent-4", argv=argv@entry=0x7f495487dbe0, conf=conf@entry=0x7f49200b2030, lxcpath=0x7f49200b2010 "/var/lib/lxc") at start.c:1048
 #11 0x00007f49554b68f1 in lxcapi_start (c=0x7f49200b1dd0, useinit=<optimized out>, argv=0x7f495487dbe0) at lxccontainer.c:648
 #12 0x0000000000401317 in do_function (arguments=0x1aa80b0) at concurrent.c:94
 #13 0x0000000000401499 in concurrent (arguments=<optimized out>) at concurrent.c:130
 #14 0x00007f4955262f6e in start_thread (arg=0x7f495487e700) at pthread_create.c:311
 #15 0x00007f4954f8d9cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

It looks like both parent and child end up with locked mutex thus deadlocks.

I ended up placing values in the thread local storage pool, instead of doing "unlock the lock in the child" dance

Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

stgraber pushed a commit to stgraber/lxc that referenced this issue Feb 10, 2015

Merge pull request #4 from fatih/master.containers
container: change XxxContainers() return behaviour

z-image pushed a commit to z-image/lxc that referenced this issue Oct 16, 2016

remove static_lock()/static_unlock() and start to use thread local st…
…orage (v2)

While testing lxc#106, I found that concurrent starts
are hanging time to time. I then reproduced the same problem in master and got following;

 [caglar@oOo:~] sudo gdb -p 16221
 (gdb) bt
 #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
 #1  0x00007f495526515c in _L_lock_982 () from /lib/x86_64-linux-gnu/libpthread.so.0
 #2  0x00007f4955264fab in __GI___pthread_mutex_lock (mutex=0x7f49556d4600 <static_mutex>) at pthread_mutex_lock.c:64
 #3  0x00007f49554b27a6 in lock_mutex (l=l@entry=0x7f49556d4600 <static_mutex>) at lxclock.c:78
 #4  0x00007f49554b2dac in static_lock () at lxclock.c:330
 #5  0x00007f4955498f71 in lxc_global_config_value (option_name=option_name@entry=0x7f49554c02cf "cgroup.use") at utils.c:273
 #6  0x00007f495549926c in default_cgroup_use () at utils.c:366
 #7  0x00007f49554953bd in lxc_cgroup_load_meta () at cgroup.c:94
 #8  0x00007f495548debc in lxc_spawn (handler=handler@entry=0x7f49200af300) at start.c:783
 #9  0x00007f495548e7a7 in __lxc_start (name=name@entry=0x7f49200b48a0 "lxc-test-concurrent-4", conf=conf@entry=0x7f49200b2030, ops=ops@entry=0x7f49556d3900 <start_ops>, data=data@entry=0x7f495487db90,
    lxcpath=lxcpath@entry=0x7f49200b2010 "/var/lib/lxc") at start.c:951
 #10 0x00007f495548eb9c in lxc_start (name=0x7f49200b48a0 "lxc-test-concurrent-4", argv=argv@entry=0x7f495487dbe0, conf=conf@entry=0x7f49200b2030, lxcpath=0x7f49200b2010 "/var/lib/lxc") at start.c:1048
 #11 0x00007f49554b68f1 in lxcapi_start (c=0x7f49200b1dd0, useinit=<optimized out>, argv=0x7f495487dbe0) at lxccontainer.c:648
 #12 0x0000000000401317 in do_function (arguments=0x1aa80b0) at concurrent.c:94
 #13 0x0000000000401499 in concurrent (arguments=<optimized out>) at concurrent.c:130
 #14 0x00007f4955262f6e in start_thread (arg=0x7f495487e700) at pthread_create.c:311
 #15 0x00007f4954f8d9cd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

It looks like both parent and child end up with locked mutex thus deadlocks.

I ended up placing values in the thread local storage pool, instead of doing "unlock the lock in the child" dance

Signed-off-by: S.Çağlar Onur <caglar@10ur.org>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment