
rgw: notifications on object replication #43371

Merged
merged 1 commit into ceph:master on Apr 28, 2022

Conversation

@liavt (Contributor) commented Sep 30, 2021

Optionally publishes notifications upon object replication between zones in the same zone group.

New notification types were added to facilitate this change.

Uses commits and API from #39192

Fixes: no ticket

Signed-off-by: Liav Turkia liav.turkia@gmail.com

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug

Available Jenkins commands:
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

@github-actions

This pull request can no longer be automatically merged: a rebase is needed and changes have to be manually resolved

* we need to fetch info about source object, so that we can determine
* the correct policy configuration. This can happen if there are multiple
* policy rules, and some depend on the object tagging */
yield call(new RGWStatRemoteObjCR(sync_env->async_rados,
Contributor:

so you're making this Stat request unconditional? my vague understanding is that need_more_info is very rarely needed at the moment, so this could be an expensive change. this Stat is also inherently racy, because the remote could overwrite the object between our calls to Stat and FetchRemoteObj

whatever info you need from this Stat request, you should get from FetchRemoteObj instead

Comment on lines 2462 to 2467
int ret = rgw::notify::publish_reserve(dpp, rgw::notify::ObjectSyncedCreate, notify_res, &obj_tags);
if (ret < 0) {
ldpp_dout(dpp, 1) << "ERROR: reserving notification failed, with error: " << ret << dendl;
// no need to return, the sync already happened
} else {
ret = rgw::notify::publish_commit(&obj, src_size, src_mtime, src_etag, 0/* version id */, rgw::notify::ObjectSyncedCreate, notify_res, dpp);
Contributor:

these are blocking calls to librados, which we shouldn't use in coroutines. blocking here means that sync can't make progress with other buckets/objects in the meantime

it would probably be easier to trigger this stuff from RGWRados::fetch_remote_obj(), which is already being called synchronously via the RGWAsyncRadosProcessor thread pool

@@ -2441,6 +2442,33 @@ class RGWObjFetchCR : public RGWCoroutine {
}
return set_cr_error(retcode);
}

// notify that object has synced to this zone
Contributor:

note that a call to FetchRemoteObj doesn't actually mean that we synced the object. it sends a GET request with the If-Modified-Since header so that it only transfers the object if it's newer than our local copy. if not, we just get ERR_NOT_MODIFIED and don't have to do anything

you can find some extra logic in RGWAsyncFetchRemoteObj::_send_request() for perf counters that tries to detect whether or not it actually transferred anything

@yuvalif (Contributor) commented Oct 3, 2021:

thanks @cbodley !

@liavt, moving the notification code to RGWAsyncFetchRemoteObj::_send_request() would simplify the code, on top of resolving the issues in the PR.
it already has: RGWObjectCtx, rgw::sal::Attrs attr, rgw::sal::RadosBucket and rgw::sal::RadosObject available.

in addition, you can use the value of bytes_transferred to verify that the object was indeed synced

Contributor (author):

I moved the notification code to RGWAsyncFetchRemoteObj::_send_request() and reverted changes to the coroutine, including the one for need_more_info
Commit: 6211ee4

@yuvalif (Contributor) commented Oct 3, 2021

@liavt please don't forget to sign your commits (use: git commit -s to automatically do that).
note that you don't have to do that right now, you can sign the one commit you will have after you squash them.

stale bot commented Jan 9, 2022

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@stale stale bot added the stale label Jan 9, 2022
@mattbenjamin (Contributor) commented:
@liavt hi; the original LC notifications PR has merged, please try to rebase this on master

@stale stale bot removed the stale label Jan 9, 2022
@yuvalif (Contributor) commented Feb 9, 2022

@liavt please remove Matt's commit: 9f6ffe1 (since it is already merged to master) and rebase

@liavt (author) commented Feb 9, 2022

I removed Matt's commit and rebased the pull request; it still requires testing, however (and it will be squashed and signed after everything is done)

@mattbenjamin (Contributor) commented:
thanks guys, yay. Just fyi, this should have a tracker ticket, as well :)

@yuvalif (Contributor) commented Feb 9, 2022

> I removed Matt's commit and rebased the pull request, it still requires testing however (and will be squashed and signed after everything is done)

this is great!
please follow these instructions for manual testing (let me know if you run into issues): https://gist.github.com/yuvalif/d50bcc3d27121732ffbbafe7b7146112

when I tried that (several months ago), it was eventually crashing with something that seemed unrelated to the code changed here :-(
note that you cannot start the RGW under gdb; instead, you attach gdb after it has already started (but before you do something that may crash it...)

run:

pgrep -a radosgw

to get the pid of the RGW you want to attach to.
and then:

gdb bin/radosgw -p <pid>

note that, for some unknown reason, after a while gdb will stop on a "read()" function (even if no breakpoint is set).
just hit ENTER and then "c" (=continue)

@yuvalif (Contributor) commented Feb 9, 2022

regarding the rebase, i don't think we should see this commit here: 08e7383

  • on your local branch, keep only the commits with your code changes
  • probably better to squash your commits at that point (git rebase -i <hash>^), so that you have less work later on in fixing conflicts
  • fetch the latest upstream master
  • rebase your code on top of the latest master (and fix conflicts): git rebase origin/master (assuming "origin" is the upstream and not your fork)

@liavt (author) commented Feb 10, 2022

> regarding the rebase, i don't think we should see this commit here: 08e7383
>
>   • on your local branch, keep only the commits with your code changes
>   • probably better to squash your commits at that point (git rebase -i <hash>^), so that you have less work later on in fixing conflicts
>   • fetch the latest upstream master
>   • rebase your code on top of the latest master (and fix conflicts): git rebase origin/master (assuming "origin" is the upstream and not your fork)

The rebase should be fixed, the extra commit is gone

@@ -713,20 +712,22 @@ int RGWAsyncFetchRemoteObj::_send_request(const DoutPrefixProvider *dpp)
}
}

// NOTE: we create a mutable copy of bucket.get_tenant as the get_notification function expects a std::string&, not const
Contributor:

i think you can just change the function to use const std::string&
nothing should change the tenant inside the function

Contributor (author):

Changing the function to use const std::string& has a lot of other side effects and changes to other code; namely, the class reservation_t from rgw_notify.h would need to be updated to have a const tenant, as would any class that deals with notifications (there are a couple).
Ideally I think it should be a separate PR if anything, as it touches a lot of unrelated files and isn't strictly required for this one.

Contributor:

agree, making all of them const is a good thing.
but let's keep it out of scope of this PR

ceph::make_timespan(g_conf()->rgw_op_thread_timeout),
ceph::make_timespan(g_conf()->rgw_op_thread_suicide_timeout),
&m_tp) {
ceph::make_timespan(g_conf()->rgw_op_thread_timeout),
@yuvalif (Contributor) commented Feb 15, 2022:

nit: please avoid indentation changes in existing code in the file. here as well as lines 77, 78, 112, 130, 667

@@ -42,6 +42,14 @@ namespace rgw::notify {
return "s3:ObjectLifecycle:Transition:Current";
case ObjectTransitionNoncurrent:
return "s3:ObjectLifecycle:Transition:Noncurrent";
case ObjectSynced:
Contributor:

nit: seems like indentation is off
(we use 2 spaces and expand tabs)

std::string tenant(bucket.get_tenant());

std::unique_ptr<rgw::sal::Notification> notify
= store->get_notification(dpp, &dest_obj, &src_obj, &obj_ctx, rgw::notify::ObjectSyncedCreate,
Contributor:

please use nullptr instead of src_obj
names are confusing, but for this function "src" and "dest" mean that an object is copied to another object (which is not the case here).

ldpp_dout(dpp, 0) << "sending sync notification" << dendl;

// send notification that object was succesfully synced
std::string user_id = "0";
Contributor:

probably better to have "req_id" = "0" and "user_id" = "rgw sync"

= store->get_notification(dpp, &dest_obj, &src_obj, &obj_ctx, rgw::notify::ObjectSyncedCreate,
&bucket, user_id,
tenant,
user_id, null_yield);
Contributor:

this should be "req_id" and not "user_id"

ldpp_dout(dpp, 1) << "ERROR: reserving notification failed, with error: " << ret << dendl;
// no need to return, the sync already happened
} else {
ret = rgw::notify::publish_commit(&src_obj, src_obj.get_obj_size(), src_mtime, "" /* etag */, "0"/* version id */, rgw::notify::ObjectSyncedCreate, notify_res, dpp);
Contributor:

can you not get the real etag and version id here?

it looks like you could get the etag from the string *petag argument of fetch_remote_obj()

dest_obj.get_instance() should give you the version id

} else {
// r >= 0
if (bytes_transferred) {
ldpp_dout(dpp, 0) << "sending sync notification" << dendl;
Contributor:

can we drop this log message? in most cases, notifications for sync won't be enabled on the bucket, so this message would be misleading

Contributor (author):

👍
It was here for debugging right now; it can be removed once the commit is finalized for the PR

Contributor:

sounds good

@yuvalif (Contributor) commented Feb 23, 2022

@liavt I think the code is ready! next steps for this PR:

  • move out of draft state
  • squash commits
  • sign commit
  • add documentation
    • new types compatibility doc
    • add a note there with short description of the new notifications and also explaining they have to be set explicitly on each zone
  • teuthology regression testing (I will own that part)

next steps for other PRs:

  • add integration test
  • finish the "example" PR
  • fix comment

@yuvalif (Contributor) commented Feb 28, 2022

jenkins test make check

@yuvalif yuvalif self-requested a review February 28, 2022 08:51
@yuvalif (Contributor) commented Mar 7, 2022

jenkins test docs

@yuvalif (Contributor) commented Mar 7, 2022

teuthology run has 6 failures:
http://pulpito.front.sepia.ceph.com/yuvalif-2022-03-07_12:52:41-rgw-wip-yuval-sync-notifications-distro-basic-smithi/
mostly overlap with baseline: http://pulpito.front.sepia.ceph.com/yuriw-2022-03-04_22:06:30-rgw-master-distro-default-smithi/.

however, bucket notification test is failing due to RGW crash:

1: /lib64/libpthread.so.0(+0x12c20) [0x7ff38ba43c20]
2: /lib64/librados.so.2(+0xfe053) [0x7ff38dd7c053]
3: /lib64/librados.so.2(+0xff01b) [0x7ff38dd7d01b]
4: /lib64/librados.so.2(+0x1041d6) [0x7ff38dd821d6]
5: (DispatchQueue::fast_dispatch(boost::intrusive_ptr<Message> const&)+0x1a0) [0x7ff38c642ba0]
6: (ProtocolV2::handle_message()+0x12c6) [0x7ff38c72b346]
7: (ProtocolV2::handle_read_frame_dispatch()+0x238) [0x7ff38c73dc28]
8: (ProtocolV2::_handle_read_frame_epilogue_main()+0x85) [0x7ff38c73dd15]
9: (ProtocolV2::handle_read_frame_epilogue_main(std::unique_ptr<ceph::buffer::v15_2_0::ptr_node, ceph::buffer::v15_2_0::ptr_node::disposer>&&, int)+0x204) [0x7ff38c73f2c4]
10: (ProtocolV2::run_continuation(Ct<ProtocolV2>&)+0x3c) [0x7ff38c726a3c]
11: (AsyncConnection::process()+0x789) [0x7ff38c6ed0f9]
12: (EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x1547) [0x7ff38c74cf87]
13: /usr/lib64/ceph/libceph-common.so.2(+0x61aa4e) [0x7ff38c754a4e]
14: /lib64/libstdc++.so.6(+0xc2ba3) [0x7ff38aa69ba3]
15: /lib64/libpthread.so.0(+0x817f) [0x7ff38ba3917f]
16: clone()

the backtrace does not seem related to the code changes; however, it is probably due to something in this PR.
log around the crash is here:
http://qa-proxy.ceph.com/teuthology/yuvalif-2022-03-07_12:52:41-rgw-wip-yuval-sync-notifications-distro-basic-smithi/6724876/remote/smithi161/crash/posted/2022-03-07T13%3A17%3A35.979405Z_1c4627dc-6ad8-4206-b24e-d327760efea1/log

@liavt (author) commented Mar 8, 2022

The pulpito links do not seem to work for me, the requests time out

Regarding the backtrace, it doesn't seem to pass through anything changed by the PR. Do you know of a way to reproduce this locally, or an area of the code that could be affected by the changes?

@yuvalif (Contributor) commented Mar 13, 2022

> The pulpito links do not seem to work for me, the requests time out
>
> Regarding the backtrace, it doesn't seem like that passes through anything changed by the PR. Do you know if there is a way to reproduce this locally or have an area of the code that could be affected by the changes?

sorry, can you try this link (the above ones require the sepia VPN)?
https://pulpito.ceph.com/yuvalif-2022-03-07_12:52:41-rgw-wip-yuval-sync-notifications-distro-basic-smithi/

you can run the bucket notification test locally to see why it crashes. the directory holding the test code and instructions is here:
https://github.com/ceph/ceph/tree/master/src/test/rgw/bucket_notification
you can first try without the kafka/amqp setup (their tests will be skipped).
if the crash is not reproduced then you can try these tests as well.

= store->get_notification(dpp, &dest_obj, nullptr, &obj_ctx, rgw::notify::ObjectSyncedCreate,
&dest_bucket, user_id,
tenant,
req_id, null_yield);
Contributor:

using null_yield here may have a negative impact on sync performance.
in such a case, the notification sending code would just block using a mutex until the ack for the notification is received.
this needs more investigation since we are already inside a coroutine, there should be a way to yield and do async waits.

Contributor:

> this needs more investigation since we are already inside a coroutine, there should be a way to yield and do async waits.

unfortunately these are two different kinds of coroutines. they could potentially work together like you describe (i.e. a RGWCoroutine could spawn::spawn() a yield context and wait on it), but that would be a big project. that could be a stepping stone on the way to a better multisite, but we might also be able to avoid it - that's worth discussion

but classes like RGWAsyncFetchRemoteObj that inherit from RGWAsyncRadosRequest are actually being handed off to a thread pool in RGWAsyncRadosProcessor to run synchronously. fetch_remote_obj() itself is blocking, so this blocking null_yield is harmless

@yuvalif (Contributor) commented Apr 7, 2022

can still see crashes here: http://qa-proxy.ceph.com/teuthology/yuvalif-2022-04-07_08:28:12-rgw:notifications-wip-yuval-sync-notifications-distro-basic-smithi/6780814/remote/smithi135/crash/posted/

crash here makes more sense (even though it is not in the code introduced in this PR). from the log:

-174> 2022-04-07T08:48:14.734+0000 7f6fc0119700 20 AMQP run: multiple n/acks received with tag=131 and result=0
-173> 2022-04-07T08:48:14.734+0000 7f6fc0119700 20 AMQP run: invoking callback with tag=125
-154> 2022-04-07T08:48:14.739+0000 7f6fc0119700 -1 *** Caught signal (Aborted) **
 in thread 7f6fc0119700 thread_name:amqp_manager

 ceph version 17.0.0-11458-gd46c8509 (d46c85091b2d82ebff5510920a39fc8c7071290c) quincy (dev)
 1: /lib64/libpthread.so.0(+0x12ce0) [0x7f6ff50a3ce0]
 2: gsignal()
 3: abort()
 4: /lib64/libc.so.6(+0x21c89) [0x7f6ff36c9c89]
 5: /lib64/libc.so.6(+0x473a6) [0x7f6ff36ef3a6]
 6: /lib64/libpthread.so.0(+0x13ace) [0x7f6ff50a4ace]
 7: (RGWPubSubAMQPEndpoint::Waiter::finish(int)+0xc3) [0x7f6ff80fa0d3]
 8: (rgw::amqp::Manager::run()+0x234e) [0x7f6ff840fe3e]
 9: /lib64/libstdc++.so.6(+0xc2ba3) [0x7f6ff40c9ba3]
 10: /lib64/libpthread.so.0(+0x81cf) [0x7f6ff50991cf]
 11: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

looks like an AMQP ack was received, triggering the call to RGWPubSubAMQPEndpoint::Waiter::finish() that would release the async wait.

@yuvalif (Contributor) commented Apr 12, 2022

after another rebase, notification tests are not crashing anymore:
http://pulpito.front.sepia.ceph.com/yuvalif-2022-04-12_04:19:06-rgw:notifications-wip-yuval-sync-notifications-distro-basic-smithi/
the failure is expected, unrelated to this change, and handled by another PR

@liavt please rebase and force push the rebased code

@yuvalif (Contributor) commented Apr 13, 2022

multisite tests are also passing:
http://pulpito.front.sepia.ceph.com/yuvalif-2022-04-13_08:44:21-rgw:multisite-wip-yuval-sync-notifications-distro-basic-smithi/
failures are expected:

  • version_suspended_incremental_sync is also failing on master
  • 9 pubsub tests are also failing on master

std::string tenant(dest_bucket.get_tenant());

std::unique_ptr<rgw::sal::Notification> notify
= store->get_notification(dpp, &dest_obj, nullptr, &obj_ctx, rgw::notify::ObjectSyncedCreate,
Contributor:

this does not compile now:

error: no matching function for call to ‘rgw::sal::RadosStore::get_notification(...

the function changes to:

rgw_sal_rados.h:97:43: note: candidate: ‘virtual std::unique_ptr<rgw::sal::Notification> rgw::sal::RadosStore::get_notification(const DoutPrefixProvider*, rgw::sal::Object*, rgw::sal::Object*, rgw::notify::EventType, rgw::sal::Bucket*, std::string&, std::string&, std::string&, optional_yield)’

I think you should remove: obj_ctx

Signed-off-by: liavt <liav.turkia@gmail.com>
@liavt (author) commented Apr 26, 2022

@yuvalif I removed obj_ctx and it compiles for me now

@yuvalif (Contributor) commented Apr 27, 2022

notification tests are passing after rebase (errors exist in the baseline as well):
http://pulpito.front.sepia.ceph.com/yuvalif-2022-04-27_12:08:42-rgw:multisite-wip-yuval-syn-notifications-distro-basic-smithi/

@yuvalif yuvalif removed the needs-qa label Apr 27, 2022
@ceph-jenkins (Collaborator) commented:

Can one of the admins verify this patch?

@mattbenjamin (Contributor) left a review:

lgtm

@@ -30,6 +30,14 @@ namespace rgw::notify {
return "s3:ObjectLifecycle:NoncurrentExpiration";
case ObjectDeleteMarkerExpiration:
return "s3:ObjectLifecycle:DeleteMarkerExpiration";
case ObjectSynced:
return "s3:ObjectSynced:*";
Contributor:

I did use s3:ObjectLifecycle in my lifecycle notifications change, but should we possibly use rgw: here (and maybe there too, I don't know; it's a little less clear, given that the lifecycle features are AWS-compatible)?

@yuvalif (Contributor) commented Apr 28, 2022

remaining work on the feature: https://gist.github.com/yuvalif/0db188fc63db40af3229f7bd63407bfb
should be done in a followup PR

@yuvalif yuvalif merged commit cc5354e into ceph:master Apr 28, 2022
11 checks passed
@tchaikov (Contributor) commented:
@yuvalif hi Yuval, in future, could you please add "Reviewed-by" lines in the merge commit when merging pull requests?

@mattbenjamin (Contributor) commented:
> @yuvalif hi Yuval, in future, could you please add "Reviewed-by" lines in the merge commit when merging pull requests?

Isn't that something we can rely on GitHub to do, since we have approving GitHub reviewers?
