Alternative to wake_with() #22

Open
nyh opened this issue Jul 16, 2013 · 1 comment

Comments

@nyh
Contributor

nyh commented Jul 16, 2013

Right now, code like condvar_wake_one/all() uses wake_with() to allow waking a thread that might exit as soon as we wake it.

Avi doesn't like wake_with() for two reasons: it is an unorthodox API (so users need to know when and how to use it), and it slows down every wake() with two atomic fetch-adds.

Why does condvar_wake_one/all() need to use wake_with at all?
It already does:

  m.lock();
  waiter = wr->t;
  wr->t = nullptr;
  waiter->wake();
  m.unlock();

If the waiter regained the lock when waking up, we would know the thread cannot exit before wake_one/all releases this lock, and the problem would be solved. But we don't regain the lock (except in the timeout case), because we feared that the waiter would often go back to sleep immediately after waking, waiting for the lock which the waker is still holding.

The idea (suggested by Avi) is, instead of using wake_with() as a replacement for the above code, to just make the above code efficient.

As explained above, the waiter will regain the lock at the end of its wait. The waker will do:

  m.lock();
  preempt_disable();
  waiter = wr->t;
  wr->t = nullptr;
  waiter->wake();
  m.unlock();
  preempt_enable();

If the waiter thread is on the same CPU as the waker, this is efficient - the waiter will not start to run (and try to regain the lock) until the preempt_enable() after we release the lock. This is more efficient than the wake_with() case because now we have a thread-local increment (preempt_disable()) instead of the atomic increment used by wake_with().
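
For illustration, here is a minimal sketch of this difference, with made-up names (this is not OSv's actual implementation):

  // Sketch only - illustrative, not OSv's real code.  The counter is
  // thread-local (in OSv it could be per-CPU), so disabling and enabling
  // preemption is a plain increment/decrement, not an atomic read-modify-write.
  thread_local unsigned preempt_counter = 0;

  inline void preempt_disable() { ++preempt_counter; }
  // A real implementation would also check for a pending preemption here.
  inline void preempt_enable()  { --preempt_counter; }
  // Checked by the scheduler before preempting the current thread.
  inline bool preemptable()     { return preempt_counter == 0; }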

If the waiter thread is on a different CPU, the preempt_disable() won't help, and it can start running before we finish m.unlock(). In the usual case it will only take a short time until m.unlock() completes, so all we need is to give mutex_lock() the ability to do a bit of spinning before going to sleep. We want this feature anyway. This will be a performance improvement over wake_with() only if, in the typical case, m.unlock() finishes more quickly than the waiter thread starts running, so that no spinning iterations are needed at all. We'll need to verify that this is actually the case - otherwise none of this will be an improvement at all.
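
A rough sketch of the kind of spinning meant here, assuming a hypothetical try_lock()/wait_for_lock() split inside the mutex (not the actual OSv mutex code):

  // Sketch only - try_lock(), wait_for_lock() and spin_iterations are
  // illustrative names, not OSv's real mutex internals.
  void mutex::lock()
  {
      constexpr int spin_iterations = 100;
      for (int i = 0; i < spin_iterations; ++i) {
          if (try_lock()) {
              return;              // the waker's m.unlock() completed while we spun
          }
          __builtin_ia32_pause();  // relax the CPU between attempts
      }
      wait_for_lock();             // still contended after a short spin: go to sleep
  }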

@nyh
Contributor Author

nyh commented Jul 16, 2013

Guy suggested the following:

  1. He likes the idea of wake() inside the lock.
  2. There's no need to add the preempt_disable/enable() - instead, the scheduler should make better decisions: the waker thread should not be preempted just because the woken waiter has a low vruntime (because it slept for a while). E.g., we need a "minimum time slice" (e.g., 1ms) and should not preempt the waker before that much time has passed, even if higher-priority threads are woken (see the sketch after this list).
  3. In the different-CPU case, since we believe/assume that usually no spinning is needed anyway, we don't even need to implement spinning - if the sleep-on-wakeup situation is very rare, we don't care if it is slow.
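
A minimal sketch of the check point 2 describes, with illustrative names and the 1ms figure taken from the suggestion above (not the actual scheduler code):

  #include <chrono>
  #include <cstdint>

  // Sketch only: don't preempt the running (waker) thread before a minimum
  // time slice has elapsed, even if the freshly woken thread has a much lower
  // vruntime because it slept for a while.
  bool should_preempt(std::chrono::nanoseconds current_ran_for,
                      uint64_t current_vruntime, uint64_t woken_vruntime)
  {
      constexpr auto min_time_slice = std::chrono::milliseconds(1);
      if (current_ran_for < min_time_slice) {
          return false;    // let the waker finish (e.g. release the mutex) first
      }
      return woken_vruntime < current_vruntime;
  }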

So basically the only change we need to make to the code is to improve the scheduler's handling of this case (and, of course, to start using the new wake-inside-lock idiom).

NOTE: In the condvar case, using this new idiom means regaining the lock at the end of the wait. Since the mutex code also uses atomic increments (and more), this will offset any performance saving from not using wake_with(). So I don't believe this new idiom will give any better performance; it will perhaps just be easier for users to use.
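
For context, a sketch of what the wait side could look like with the new idiom - the wait_record, add_waiter() and mutex names here are illustrative, not the actual condvar code:

  // Sketch only.  After waking, the waiter regains the condvar's internal
  // mutex, so the waker's m.unlock() must finish before this thread can go on
  // (and possibly exit).  That lock/unlock pair has its own atomics, which is
  // what offsets the saving from dropping wake_with().
  void condvar::wait(mutex* user_mutex)
  {
      wait_record wr(sched::thread::current());

      internal_mutex.lock();
      add_waiter(&wr);                 // link wr into the wait list
      internal_mutex.unlock();

      user_mutex->unlock();
      sched::thread::wait_until([&] { return wr.t == nullptr; });  // sleep until woken

      internal_mutex.lock();           // new idiom: regain the lock after waking
      internal_mutex.unlock();

      user_mutex->lock();
  }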

penberg pushed a commit that referenced this issue Aug 26, 2013
If a crashed OSv guest is restarted, ZFS mount causes a GPF in early
startup:

  VFS: mounting zfs at /usr
  zfs: mounting osv/usr from device /dev/vblk1
  Aborted

GDB backtrace points finger at zfs_rmnode():

  #0  processor::halt_no_interrupts () at ../../arch/x64/processor.hh:212
  #1  0x00000000003e7f2a in osv::halt () at ../../core/power.cc:20
  #2  0x000000000021cdd4 in abort (msg=0x636df0 "Aborted\n") at ../../runtime.cc:95
  #3  0x000000000021cda2 in abort () at ../../runtime.cc:86
  #4  0x000000000044c149 in osv::generate_signal (siginfo=..., ef=0xffffc0003ffe7008) at ../../libc/signal.cc:44
  #5  0x000000000044c220 in osv::handle_segmentation_fault (addr=72, ef=0xffffc0003ffe7008) at ../../libc/signal.cc:55
  #6  0x0000000000366df3 in page_fault (ef=0xffffc0003ffe7008) at ../../core/mmu.cc:876
  #7  <signal handler called>
  #8  0x0000000000345eaa in zfs_rmnode (zp=0xffffc0003d1de400)
      at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:611
  #9  0x000000000035650c in zfs_zinactive (zp=0xffffc0003d1de400)
      at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:1355
  #10 0x0000000000345be1 in zfs_unlinked_drain (zfsvfs=0xffffc0003ddfe000)
      at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c:523
  #11 0x000000000034f45c in zfsvfs_setup (zfsvfs=0xffffc0003ddfe000, mounting=true)
      at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:881
  #12 0x000000000034f7a4 in zfs_domount (vfsp=0xffffc0003de02000, osname=0x6b14cb "osv/usr")
      at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1016
  #13 0x000000000034f98c in zfs_mount (mp=0xffffc0003de02000, dev=0x6b14d7 "/dev/vblk1", flags=0, data=0x6b14cb)
      at ../../bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1415
  #14 0x0000000000406852 in sys_mount (dev=0x6b14d7 "/dev/vblk1", dir=0x6b14a3 "/usr", fsname=0x6b14d3 "zfs", flags=0, data=0x6b14cb)
      at ../../fs/vfs/vfs_mount.c:171
  #15 0x00000000003eff97 in mount_usr () at ../../fs/vfs/main.cc:1415
  #16 0x0000000000203a89 in do_main_thread (_args=0xffffc0003fe9ced0) at ../../loader.cc:215
  #17 0x0000000000448575 in pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, pthread_private::thread_attr const*)::{lambda()#1}::operator()() const () at ../../libc/pthread.cc:59
  #18 0x00000000004499d3 in std::_Function_handler<void(), pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, const pthread_private::thread_attr*)::__lambda0>::_M_invoke(const std::_Any_data &) (__functor=...)
      at ../../external/gcc.bin/usr/include/c++/4.8.1/functional:2071
  #19 0x000000000037e602 in std::function<void ()>::operator()() const (this=0xffffc0003e170038)
      at ../../external/gcc.bin/usr/include/c++/4.8.1/functional:2468
  #20 0x00000000003bae3e in sched::thread::main (this=0xffffc0003e170010) at ../../core/sched.cc:581
  #21 0x00000000003b8c92 in sched::thread_main_c (t=0xffffc0003e170010) at ../../arch/x64/arch-switch.hh:133
  #22 0x0000000000399c8e in thread_main () at ../../arch/x64/entry.S:101

The problem is that ZFS tries to check if the znode is an attribute
directory and trips over zp->z_vnode being NULL.  However, as explained
in commit b7ee91e ("zfs: port vop_lookup"), we don't even support
extended attributes so drop the check completely for OSv.
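
For reference, the check in question presumably has this shape (an upstream-style sketch, not the exact OSv patch):

  /*
   * Sketch of the offending check in zfs_rmnode(), upstream-style.  ZTOV(zp)
   * expands to zp->z_vnode, which is NULL here on OSv, so the v_type
   * dereference faults.  Since OSv does not support extended attributes,
   * the whole block can simply be removed.
   */
  if (ZTOV(zp)->v_type == VDIR && (zp->z_pflags & ZFS_XATTR)) {
          /* purge the attribute directory's contents */
  }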
penberg pushed a commit that referenced this issue Aug 26, 2013
Fix mincore() to deal with unmapped addresses like msync() does.

This fixes a SIGSEGV in libunwind's access_mem() when the leak detector is enabled:

   (gdb) bt
  #0  page_fault (ef=0xffffc0003ffe7008) at ../../core/mmu.cc:871
  #1  <signal handler called>
  #2  ContiguousSpace::block_start_const (this=<optimized out>, p=0x77d2f3968)
      at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/oops/oop.inline.hpp:411
  #3  0x00001000008ae16c in GenerationBlockStartClosure::do_space (this=0x2000001f9100, s=<optimized out>)
      at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/memory/generation.cpp:242
  #4  0x00001000007f097c in DefNewGeneration::space_iterate (this=0xffffc0003fb68c00, blk=0x2000001f9100, usedOnly=<optimized out>)
      at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/memory/defNewGeneration.cpp:480
  #5  0x00001000008aca0e in Generation::block_start (this=<optimized out>, p=<optimized out>)
      at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/memory/generation.cpp:251
  #6  0x0000100000b06d2f in os::print_location (st=st@entry=0x2000001f9560, x=32165017960, verbose=verbose@entry=false)
      at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/runtime/os.cpp:868
  #7  0x0000100000b11b5b in os::print_register_info (st=0x2000001f9560, context=0x2000001f9740)
      at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:839
  #8  0x0000100000c6cde8 in VMError::report (this=0x2000001f9610, st=st@entry=0x2000001f9560)
      at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:551
  #9  0x0000100000c6da3b in VMError::report_and_die (this=this@entry=0x2000001f9610)
      at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/share/vm/utilities/vmError.cpp:984
  #10 0x0000100000b1109f in JVM_handle_linux_signal (sig=11, info=0x2000001f9bb8, ucVoid=0x2000001f9740,
      abort_if_unrecognized=<optimized out>)
      at /usr/src/debug/java-1.7.0-openjdk-1.7.0.25-2.3.12.3.fc19.x86_64/openjdk/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp:528
  #11 0x000000000039f242 in call_signal_handler (frame=0x2000001f9b10) at ../../arch/x64/signal.cc:69
  #12 <signal handler called>
  #13 0x000000000057d721 in access_mem ()
  #14 0x000000000057cb1d in dwarf_get ()
  #15 0x000000000057ce51 in _ULx86_64_step ()
  #16 0x00000000004315fd in backtrace (buffer=0x1ff9d80 <memory::alloc_tracker::remember(void*, int)::bt>, size=20)
      at ../../libc/misc/backtrace.cc:16
  #17 0x00000000003b8d99 in memory::alloc_tracker::remember (this=0x1777ae0 <memory::tracker>, addr=0xffffc0004508de00, size=54)
      at ../../core/alloctracker.cc:59
  #18 0x00000000003b0504 in memory::tracker_remember (addr=0xffffc0004508de00, size=54) at ../../core/mempool.cc:43
  #19 0x00000000003b2152 in std_malloc (size=54) at ../../core/mempool.cc:723
  #20 0x00000000003b259c in malloc (size=54) at ../../core/mempool.cc:856
  #21 0x0000100001615e4c in JNU_GetStringPlatformChars (env=env@entry=0xffffc0003a4dc1d8, jstr=jstr@entry=0xffffc0004591b800,
      isCopy=isCopy@entry=0x0) at ../../../src/share/native/common/jni_util.c:801
  #22 0x000010000161ada6 in Java_java_io_UnixFileSystem_getBooleanAttributes0 (env=0xffffc0003a4dc1d8, this=<optimized out>,
      file=<optimized out>) at ../../../src/solaris/native/java/io/UnixFileSystem_md.c:111
  #23 0x000020000021ed8e in ?? ()
  #24 0x00002000001faa58 in ?? ()
  #25 0x00002000001faac0 in ?? ()
  #26 0x00002000001faa50 in ?? ()
  #27 0x0000000000000000 in ?? ()

Spotted by Avi Kivity.
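
A rough sketch of the kind of fix described in the first line of the commit message, with an illustrative is_mapped() helper (not the actual OSv patch):

  #include <cerrno>
  #include <cstddef>

  // Illustrative helper - stands in for whatever mapping check msync() uses.
  bool is_mapped(const void* addr, size_t length);

  // Sketch only: like msync(), mincore() should verify that the queried range
  // is actually mapped instead of assuming it is and faulting.
  int mincore(void* addr, size_t length, unsigned char* vec)
  {
      if (!is_mapped(addr, length)) {
          errno = ENOMEM;    // POSIX: part of the range is not mapped
          return -1;
      }
      // ... fill in vec[] for each resident page in the mapped range ...
      return 0;
  }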