ceph-fuse: start up log on parent process before shutdown #12358

Merged
jcsp merged 1 commit into ceph:master from tchaikov:wip-18157 on Dec 14, 2016

4 participants

@tchaikov
Contributor
tchaikov commented Dec 7, 2016 edited

In this change we use global_init_postfork_start() to restart the log
after it is stopped by Preforker; otherwise we hit an assert in the
Ceph context and logging teardown.

  • ceph_fuse.cc:
    • rewrite the fork hackery using Preforker helper class
    • write "starting ceph client" message to cerr, as the cout was closed
      by global_init_postfork_start()
  • fuse_ll.cc: write -1 to signal the parent process that init is done.
  • Preforker.h: add a helper method that returns the fd to which the child
    process can write an int to notify the parent of its status.
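
The notification scheme in the last two bullets can be sketched as follows. This is a minimal illustration under stated assumptions, not Ceph's Preforker: the function name run_child_and_wait_for_status() and the -2 failure sentinel are hypothetical, and a plain pipe() stands in for whatever channel Preforker actually uses. The child writes one int to the write end, and the parent blocks on the read end until that int arrives.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not Ceph's Preforker): fork a child, let it report an
// int status over a pipe, and return that status in the parent.
// Returns -2 (an arbitrary sentinel for this sketch) on setup or read failure.
int run_child_and_wait_for_status(int status_to_report) {
  int fds[2];
  if (pipe(fds) != 0) {
    return -2;
  }

  pid_t pid = fork();
  if (pid < 0) {
    close(fds[0]);
    close(fds[1]);
    return -2;
  }

  if (pid == 0) {
    // Child: keep only the write end and report the status as a single int.
    close(fds[0]);
    ssize_t n = write(fds[1], &status_to_report, sizeof(status_to_report));
    close(fds[1]);
    _exit(n == static_cast<ssize_t>(sizeof(status_to_report)) ? 0 : 1);
  }

  // Parent: keep only the read end and block until the child reports.
  close(fds[1]);
  int status = -2;
  ssize_t n;
  do {
    n = read(fds[0], &status, sizeof(status));
  } while (n < 0 && errno == EINTR);
  close(fds[0]);

  // Reap the child so it does not linger as a zombie.
  int wstatus = 0;
  waitpid(pid, &wstatus, 0);

  return n == static_cast<ssize_t>(sizeof(status)) ? status : -2;
}
```

Calling run_child_and_wait_for_status(-1) models the "write -1 to signal the parent process that init is done" step: the parent gets -1 back once the child has written it.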

Fixes: http://tracker.ceph.com/issues/18157
Signed-off-by: Kefu Chai kchai@redhat.com

@tchaikov tchaikov added this to the kraken milestone Dec 7, 2016
@jcsp
Contributor
jcsp commented Dec 7, 2016

Seems reasonable (I've never used these helpers before)

@gregsfortytwo
Member

I haven't used these either but the Preforker is a lot more compact than I'd thought and this looks good to me. I'll run some local tests with it today.

@gregsfortytwo
Member

Yep, passes my basic local tests just fine.

@liewegas
Member
liewegas commented Dec 7, 2016

This crashes when I use -f:

gnit:build (master) 03:56 PM $ sudo bin/ceph-fuse mnt -f
2016-12-07 15:56:50.311131 7f46ce835f40 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-12-07 15:56:50.311525 7f46ce835f40 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-12-07 15:56:50.315068 7f46ce835f40 -1 WARNING: the following dangerous and experimental features are enabled: *
2016-12-07 15:56:50.317368 7f46ce835f40 -1 init, newargv = 0x55ca4a7e3200 newargc=11
ceph-fuse[9276]: starting ceph client
ceph-fuse[9276]: starting fuse
2016-12-07 15:56:50.333408 7f46c029c700 -1 fuse_ll: do_init: safe_write failed with error (9) Bad file descriptor
*** Caught signal (Aborted) **
 in thread 7f46c029c700 thread_name:ceph-fuse
 ceph version 11.0.2-2329-gdb5b2ab (db5b2ab32dea46901ee38752d76f65d97bd5833f)
 1: (()+0x25b2de) [0x55ca40e482de]
 2: (()+0x115c0) [0x7f46ccb0f5c0]
 3: (gsignal()+0x9f) [0x7f46cb69b92f]
 4: (abort()+0x16a) [0x7f46cb69d52a]
 5: (()+0x1a165d) [0x55ca40d8e65d]
 6: (()+0x1528c) [0x7f46ce19c28c]
 7: (()+0x164c1) [0x7f46ce19d4c1]
 8: (()+0x12c68) [0x7f46ce199c68]
 9: (()+0x76ca) [0x7f46ccb056ca]
 10: (clone()+0x5f) [0x7f46cb76df6f]
2016-12-07 15:56:50.334321 7f46c029c700 -1 *** Caught signal (Aborted) **
 in thread 7f46c029c700 thread_name:ceph-fuse

 ceph version 11.0.2-2329-gdb5b2ab (db5b2ab32dea46901ee38752d76f65d97bd5833f)
 1: (()+0x25b2de) [0x55ca40e482de]
 2: (()+0x115c0) [0x7f46ccb0f5c0]
 3: (gsignal()+0x9f) [0x7f46cb69b92f]
 4: (abort()+0x16a) [0x7f46cb69d52a]
 5: (()+0x1a165d) [0x55ca40d8e65d]
 6: (()+0x1528c) [0x7f46ce19c28c]
 7: (()+0x164c1) [0x7f46ce19d4c1]
 8: (()+0x12c68) [0x7f46ce199c68]
 9: (()+0x76ca) [0x7f46ccb056ca]
 10: (clone()+0x5f) [0x7f46cb76df6f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

  -116> 2016-12-07 15:56:50.311131 7f46ce835f40 -1 WARNING: the following dangerous and experimental features are enabled: *
  -114> 2016-12-07 15:56:50.311525 7f46ce835f40 -1 WARNING: the following dangerous and experimental features are enabled: *
  -110> 2016-12-07 15:56:50.315068 7f46ce835f40 -1 WARNING: the following dangerous and experimental features are enabled: *
   -77> 2016-12-07 15:56:50.317368 7f46ce835f40 -1 init, newargv = 0x55ca4a7e3200 newargc=11
    -1> 2016-12-07 15:56:50.333408 7f46c029c700 -1 fuse_ll: do_init: safe_write failed with error (9) Bad file descriptor
     0> 2016-12-07 15:56:50.334321 7f46c029c700 -1 *** Caught signal (Aborted) **
 in thread 7f46c029c700 thread_name:ceph-fuse

 ceph version 11.0.2-2329-gdb5b2ab (db5b2ab32dea46901ee38752d76f65d97bd5833f)
 1: (()+0x25b2de) [0x55ca40e482de]
 2: (()+0x115c0) [0x7f46ccb0f5c0]
 3: (gsignal()+0x9f) [0x7f46cb69b92f]
 4: (abort()+0x16a) [0x7f46cb69d52a]
 5: (()+0x1a165d) [0x55ca40d8e65d]
 6: (()+0x1528c) [0x7f46ce19c28c]
 7: (()+0x164c1) [0x7f46ce19d4c1]
 8: (()+0x12c68) [0x7f46ce199c68]
 9: (()+0x76ca) [0x7f46ccb056ca]
 10: (clone()+0x5f) [0x7f46cb76df6f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Aborted

Applying the simple fix for now; we can follow up on this later.
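
For reference, the "error (9) Bad file descriptor" in the log above is the errno that write() reports when the descriptor is not open for writing. A minimal sketch, assuming a POSIX system; write_status() is a hypothetical helper, not Ceph's safe_write(), and it says nothing about which descriptor value do_init() actually saw in this run:

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>

// Hypothetical helper (not Ceph's safe_write): try to write one int to fd.
// Returns 0 on success, or the errno value from the failed write().
int write_status(int fd, int status) {
  ssize_t n = write(fd, &status, sizeof(status));
  return n < 0 ? errno : 0;
}
```

write_status(-1, -1) returns EBADF, and so does a write to a pipe end that has already been closed.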

@liewegas liewegas removed this from the kraken milestone Dec 7, 2016
@tchaikov tchaikov ceph-fuse: rewrite the fork hackery using Prefork
in this change we use global_init_postfork_start() to restart the
log after it is stopped by Preforker.

* ceph_fuse.cc:
   - rewrite the fork hackery using Preforker helper class
   - write "starting ceph client" message to cerr, as the cout was
     closed by global_init_postfork_start()
* fuse_ll.cc: write -1 to signal the parent process that init is
  done.
* Preforker.h: add a helper method that returns the fd to which the
  child process can write an int to notify the parent of its status.

Signed-off-by: Kefu Chai <kchai@redhat.com>
83aaa55
@tchaikov
Contributor
tchaikov commented Dec 8, 2016

@liewegas fixed and repushed.

If ceph-fuse is running in the foreground, fork() is not called, so get_signal_fd() should not return a non-zero fd, and do_init() should not write to fd_on_success in this case.
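
A minimal sketch of that guard, assuming the convention described above where a signal fd of 0 means "not forked, nobody to notify". ForkState and notify_parent_if_forked() are hypothetical stand-ins for illustration, not the actual Preforker or do_init() code:

```cpp
#include <cassert>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical stand-in for the fork bookkeeping; not Ceph's Preforker.
struct ForkState {
  bool forked = false;  // stays false when running in the foreground (-f)
  int write_fd = -1;    // write end of the notification channel, if forked

  // Assumed convention for this sketch: 0 means "no parent to signal".
  int get_signal_fd() const { return forked ? write_fd : 0; }
};

// Returns true if a status was written to the parent, false if the write was
// skipped (foreground mode) or failed.
bool notify_parent_if_forked(const ForkState& st, int status) {
  int fd = st.get_signal_fd();
  if (fd == 0) {
    return false;  // foreground: no fork() happened, so do not write
  }
  ssize_t n = write(fd, &status, sizeof(status));
  return n == static_cast<ssize_t>(sizeof(status));
}
```

With a default-constructed ForkState the call is a no-op, which is the foreground case; with forked set and write_fd pointing at the write end of a pipe, the status reaches the reader. Using 0 as the "no fd" value relies on the notification fd never being descriptor 0.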

@gregsfortytwo gregsfortytwo added this to the kraken milestone Dec 9, 2016
@gregsfortytwo
Member

If this works now I think we probably want it in Kraken for the behavior change.

@liewegas
Member

@gregsfortytwo @jcsp go ahead and merge once it's tested?

@jcsp jcsp merged commit 69e03da into ceph:master Dec 14, 2016

3 checks passed

  • Signed-off-by: all commits in this PR are signed
  • Unmodified Submodules: submodules for project are unmodified
  • default: Build finished.
@tchaikov tchaikov deleted the tchaikov:wip-18157 branch Dec 14, 2016