rgw/notifications: delete bucket notification object when empty #39944

Merged
yuvalif merged 1 commit into ceph:master from yuvalif:wip-yuval-fix-49650 on Apr 11, 2021

Conversation

yuvalif (Contributor) commented Mar 9, 2021

Fixes: https://tracker.ceph.com/issues/49650

Signed-off-by: Yuval Lifshitz ylifshit@redhat.com
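
The gist of the change, as a minimal sketch (the struct layout and helper names below are illustrative stand-ins, not the actual RGW symbols from this patch): when the last notification is removed from a bucket, delete the per-bucket notification object itself rather than writing back an empty one.

 #include <map>
 #include <string>

 // Illustrative stand-ins for the RGW pubsub types (not the real definitions).
 struct rgw_pubsub_topic_filter {};
 struct rgw_pubsub_bucket_topics {
   std::map<std::string, rgw_pubsub_topic_filter> topics;
 };

 // Hypothetical persistence helpers; in RGW these would translate to RADOS
 // writes/removals against the bucket's notification object.
 int write_bucket_topics_object(const rgw_pubsub_bucket_topics&) { return 0; }
 int delete_bucket_topics_object() { return 0; }

 // Sketch of the fix: after removing a notification, delete the notification
 // object if nothing remains, instead of persisting an empty object.
 int remove_notification(rgw_pubsub_bucket_topics& bucket_topics,
                         const std::string& notification_id)
 {
   bucket_topics.topics.erase(notification_id);
   if (bucket_topics.topics.empty()) {
     return delete_bucket_topics_object();  // last notification gone
   }
   return write_bucket_topics_object(bucket_topics);  // persist reduced set
 }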


Available Jenkins commands:
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

mattbenjamin (Contributor) left a comment

lgtm

yuvalif (Contributor, Author) commented Apr 1, 2021

@cbodley after rebasing onto the latest code I consistently get the following crash (which seems unrelated to the code change above) when running the multisite test suite locally.
I could not reproduce it without the code changes from this PR.

backtrace:

 in thread 7fab39558640 thread_name:radosgw

 ceph version 17.0.0-2661-g9561b4dde37 (9561b4dde37fe1ecdd471151bdca5997536feb53) quincy (dev)
 1: /root/projects/another-ceph/build/lib/libradosgw.so.2(+0x2799040) [0x7fab949aa040]
 2: /lib64/libpthread.so.0(+0x141e0) [0x7fab8f2231e0]
 3: (RGWCoroutinesStack::wakeup()+0x10) [0x7fab943e4fee]
 4: (RGWCoroutine::wakeup()+0x1f) [0x7fab943e976b]
 5: (RGWMetaSyncCR::wakeup(int)+0xab) [0x7fab94258e2f]
 6: (RGWRemoteMetaLog::wakeup(int)+0x37) [0x7fab9423f3f3]
 7: (RGWMetaSyncStatusManager::wakeup(int)+0x24) [0x7fab944f7476]
 8: (RGWMetaSyncProcessorThread::wakeup_sync_shards(std::set<int, std::less<int>, std::allocator<int> >&)+0x77) [0x7fab944f8975]
 9: (RGWRados::wakeup_meta_sync_shards(std::set<int, std::less<int>, std::allocator<int> >&)+0x5e) [0x7fab944a4a52]
 10: (rgw::sal::RGWRadosStore::wakeup_meta_sync_shards(std::set<int, std::less<int>, std::allocator<int> >&)+0x27) [0x7fab9466739b]
 11: (RGWOp_MDLog_Notify::execute(optional_yield)+0x558) [0x7fab93f0950a]
 12: (rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, req_state*, optional_yield, bool)+0x14a9) [0x7fab93eec346]
 13: (process_request(rgw::sal::RGWStore*, RGWREST*, RGWRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSocket*, optional_yield, rgw::dmclock::Scheduler*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*, int*)+0x1b92) [0x7fab93eeec0a]
 14: /root/projects/another-ceph/build/lib/libradosgw.so.2(+0x1b3fcb7) [0x7fab93d50cb7]
 15: /root/projects/another-ceph/build/lib/libradosgw.so.2(+0x1b3a619) [0x7fab93d4b619]
 16: /root/projects/another-ceph/build/lib/libradosgw.so.2(+0x1b47de7) [0x7fab93d58de7]
 17: /root/projects/another-ceph/build/lib/libradosgw.so.2(+0x1b4aef9) [0x7fab93d5bef9]
 18: /root/projects/another-ceph/build/lib/libradosgw.so.2(+0x1b4adac) [0x7fab93d5bdac]
 19: /root/projects/another-ceph/build/lib/libradosgw.so.2(+0x1b4ac21) [0x7fab93d5bc21]
 20: /root/projects/another-ceph/build/lib/libradosgw.so.2(+0x1b4a9f5) [0x7fab93d5b9f5]
 21: /root/projects/another-ceph/build/lib/libradosgw.so.2(+0x1b4a296) [0x7fab93d5b296]
 22: make_fcontext()

log messages before the crash:

   -13> 2021-04-01T20:35:59.930+0300 7fab6ede4640 20 garbage collection: RGWGC::process cls_rgw_gc_queue_list_entries returned with return value:0, entries.size=0, truncated=0, next_marker=''
   -12> 2021-04-01T20:35:59.930+0300 7fab6b5dd640 20 cr:s=0x55ab3857e3c0:op=0x55ab39498300:25RGWDataSyncShardControlCR: operate()
   -11> 2021-04-01T20:35:59.930+0300 7fab6b5dd640 20 cr:s=0x55ab3857e500:op=0x55ab39498a00:25RGWDataSyncShardControlCR: operate()
   -10> 2021-04-01T20:35:59.930+0300 7fab60dc8640  5 ERROR: can't read user header: ret=-2
    -9> 2021-04-01T20:35:59.930+0300 7fab6b5dd640 20 cr:s=0x55ab3857e640:op=0x55ab39497c00:25RGWDataSyncShardControlCR: operate()
    -8> 2021-04-01T20:35:59.930+0300 7fab60dc8640  5 ERROR: sync_user() failed, user=zone.user ret=-2
    -7> 2021-04-01T20:35:59.930+0300 7fab6b5dd640 20 cr:s=0x55ab3857e780:op=0x55ab38a4b100:25RGWDataSyncShardControlCR: operate()
    -6> 2021-04-01T20:35:59.930+0300 7fab60dc8640 20 RGWUserStatsCache: sync user=tester
    -5> 2021-04-01T20:35:59.930+0300 7fab6ede4640  1 -- 10.46.11.34:0/345805247 --> [v2:10.46.11.34:6808/2776614,v1:10.46.11.34:6809/2776614] -- osd_op(unknown.0.0:75 2.1d 2:beaa4be4:gc::gc.7:head [call lock.unlock in=34b] snapc 0=[] ondisk+write+known_if_redirected e31) v8 -- 0x55ab392b0400 con 0x55ab395dd800
    -4> 2021-04-01T20:35:59.930+0300 7fab6b5dd640 20 cr:s=0x55ab3857e8c0:op=0x55ab38a49500:25RGWDataSyncShardControlCR: operate()
    -3> 2021-04-01T20:35:59.930+0300 7fab6b5dd640 20 cr:s=0x55ab3857ea00:op=0x55ab38a48700:25RGWDataSyncShardControlCR: operate()
    -2> 2021-04-01T20:35:59.930+0300 7fab6b5dd640 20 cr:s=0x55ab3857eb40:op=0x55ab38a4b800:25RGWDataSyncShardControlCR: operate()
    -1> 2021-04-01T20:35:59.930+0300 7fab39558640 -1 *** Caught signal (Segmentation fault) **

teuthology is also showing crashes (though with different backtraces):
http://qa-proxy.ceph.com/teuthology/yuvalif-2021-04-01_21:32:36-rgw:multisite-wip-yuval-fix-49650-distro-basic-gibba/6015470/
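
Reading the backtrace, the segfault happens while the REST op thread handling a mdlog notify (RGWOp_MDLog_Notify::execute()) walks the meta sync wakeup chain down to RGWCoroutinesStack::wakeup(). One plausible failure mode (purely an assumption from the trace, not a confirmed diagnosis) is a lifetime race: the coroutine stack is torn down on one thread while another thread still holds a raw pointer to it and calls wakeup(). A minimal illustration of that pattern, with made-up types:

 // Illustrative stand-in for a coroutine stack (not the real
 // RGWCoroutinesStack); wakeup() dereferences members of the stack.
 struct CoroutineStack {
   bool ready = false;
   void wakeup() { ready = true; }  // touches freed memory if *this is gone
 };

 // Illustrative sync manager holding a raw, unsynchronized pointer.
 struct SyncManager {
   CoroutineStack* stack = nullptr;

   void shutdown() {        // e.g. sync machinery restarted or torn down
     delete stack;
     stack = nullptr;
   }

   void wakeup_shards() {   // called from the REST op thread
     if (stack) {           // racy: stack may be deleted between the check
       stack->wakeup();     // and the call -> use-after-free / SIGSEGV
     }
   }
 };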

yuvalif (Contributor, Author) commented Apr 4, 2021

A different crash is seen when running master without any changes. See: https://tracker.ceph.com/issues/50135

yuvalif (Contributor, Author) commented Apr 6, 2021

yuvalif (Contributor, Author) commented Apr 11, 2021
