rgw: sync modules, metadata search #10731

Merged
merged 35 commits into from Oct 10, 2016


yehudasa
Member

Make data sync more modular, so that sync modules can be added. One example of such a sync module is the "log" module, which logs every object that needs to be synced. Other examples (yet to be implemented) are a metadata indexing module and a backup (to external storage) module.
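
To illustrate the idea, here is a minimal sketch of a pluggable sync module with a "log" implementation. All names (`ObjectEvent`, `SyncModule`, `LogSyncModule`, `run_sync`) are hypothetical and chosen for illustration; they are not the interfaces introduced by this PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not the RGW classes from this PR.
struct ObjectEvent {
  std::string bucket;
  std::string key;
};

class SyncModule {
public:
  virtual ~SyncModule() = default;
  // Called once for every object that data sync decides must be synced.
  virtual int sync_object(const ObjectEvent& ev) = 0;
};

// A "log" module: records each object that needs syncing instead of copying it.
class LogSyncModule : public SyncModule {
public:
  int sync_object(const ObjectEvent& ev) override {
    entries.push_back("SYNC_LOG: " + ev.bucket + "/" + ev.key);
    return 0;
  }
  std::vector<std::string> entries;
};

// The sync loop depends only on the interface, so modules are swappable.
inline int run_sync(SyncModule& module, const std::vector<ObjectEvent>& pending) {
  for (const auto& ev : pending) {
    int ret = module.sync_object(ev);
    if (ret < 0) {
      return ret;  // stop at the first failure
    }
  }
  return 0;
}
```

An indexing or backup module would, under the same assumption, be another subclass that overrides `sync_object()`.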

@yehudasa yehudasa force-pushed the wip-rgw-sync-plugins branch 2 times, most recently from 3796429 to 26774b3 Compare August 19, 2016 12:31
@yehudasa yehudasa changed the title [DNM] rgw: sync modules rgw: sync modules, metadata search Aug 26, 2016
@yehudasa yehudasa added the rgw label Aug 26, 2016
return ret;
}

{ /* opening scope so that we can do goto, sorry */
Contributor

copy/paste? i don't see a goto

map<string, bufferlist>::iterator iter = src_attrs.find(RGW_ATTR_ETAG);
if (iter != src_attrs.end()) {
  bufferlist& etagbl = iter->second;
  *petag = string(etagbl.c_str(), etagbl.length());
Contributor

there's a bufferlist::to_str() for this (and unlike bufferlist::c_str(), it doesn't require reallocating and copying into a contiguous buffer if the bufferlist has multiple segments)
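
A rough sketch of the difference, using a toy segmented buffer. `SegmentedBuffer` and its `flatten_count` counter are hypothetical stand-ins written for this example; this is not ceph's `bufferlist` implementation, only a model of the behaviour described in the comment above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a segmented buffer; NOT ceph's bufferlist.
class SegmentedBuffer {
public:
  void append(const std::string& segment) { segments.push_back(segment); }

  std::size_t length() const {
    std::size_t n = 0;
    for (const auto& s : segments) n += s.size();
    return n;
  }

  // Flattens all segments into one contiguous segment, then exposes it.
  // With more than one segment this allocates, copies, and mutates the buffer.
  const char* c_str() {
    if (segments.size() > 1) {
      std::string flat;
      flat.reserve(length());
      for (const auto& s : segments) flat += s;
      segments.assign(1, flat);
      ++flatten_count;
    }
    return segments.empty() ? "" : segments.front().c_str();
  }

  // Builds the result by appending each segment; the buffer is left untouched.
  std::string to_str() const {
    std::string out;
    out.reserve(length());
    for (const auto& s : segments) out += s;
    return out;
  }

  int flatten_count = 0;  // how many times c_str() had to rebuild the buffer

private:
  std::vector<std::string> segments;
};
```

In this model both calls produce the same string, but only `c_str()` rebuilds the buffer first, which is the extra copy the comment is pointing at.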


void set_result(ceph::real_time& _mtime,
                uint64_t _size,
                map<string, bufferlist>& _attrs) {
Contributor

consider taking _attrs by rvalue ref, so it's obvious to the caller that _attrs is being moved
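
For illustration, a minimal sketch of that suggestion. `StatResult` and the `Attrs` alias are hypothetical, and `std::string` stands in for `bufferlist` so the example compiles on its own.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Stand-in for map<string, bufferlist>; std::string replaces bufferlist here.
using Attrs = std::map<std::string, std::string>;

// Hypothetical result holder, loosely modelled on the set_result() above.
struct StatResult {
  uint64_t size = 0;
  Attrs attrs;

  // Taking Attrs&& forces callers to write std::move(), which documents
  // at the call site that their map is being consumed.
  void set_result(uint64_t _size, Attrs&& _attrs) {
    size = _size;
    attrs = std::move(_attrs);
  }
};
```

With this signature, passing a named map without `std::move()` fails to compile, so a caller cannot hand over its attrs by accident.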

RGWSyncModulesManager *get_sync_modules_manager() {
  return sync_modules_manager;
}
RGWSyncModuleInstanceRef& get_sync_module() {
Contributor

consider returning by value or const reference. by reference allows the caller to modify, i.e. store->get_sync_module().reset()
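
A small sketch of the concern, assuming `RGWSyncModuleInstanceRef` behaves like a `std::shared_ptr`. The `Store`, `SyncModuleInstance`, and `get_sync_module_mutable()` names are hypothetical and exist only to contrast the two accessor styles.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins; the real ref type is assumed to be shared_ptr-like.
struct SyncModuleInstance {
  int id = 1;
};
using SyncModuleInstanceRef = std::shared_ptr<SyncModuleInstance>;

class Store {
public:
  Store() : sync_module(std::make_shared<SyncModuleInstance>()) {}

  // Mutable reference: callers can reset or replace the store's own pointer.
  SyncModuleInstanceRef& get_sync_module_mutable() { return sync_module; }

  // Const reference: callers can use the instance but cannot reseat the pointer.
  const SyncModuleInstanceRef& get_sync_module() const { return sync_module; }

private:
  SyncModuleInstanceRef sync_module;
};
```

Calling `reset()` through the const accessor does not compile, while the mutable accessor lets any caller clear the store's module.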

* in this case, we're not returning the object's content, only the prepended
* extra metadata
*/
total_len = 0;
Contributor

@cbodley cbodley Sep 21, 2016

cool, so stat_remote_obj() works like a normal GET request that skips the data - and we get the size from Rgwx-Object-Size instead of Content-Length?

Member Author

yeah
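
As a sketch of the exchange above: when the response carries no object data, the object size is read from the `Rgwx-Object-Size` header rather than from `Content-Length`. The `Headers` alias and `object_size_from_headers()` are hypothetical helpers, and the exact-case header lookup is a simplifying assumption (real HTTP header names are case-insensitive).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical header map; keys are matched with exact case for simplicity.
using Headers = std::map<std::string, std::string>;

// Reads the object size from Rgwx-Object-Size. Content-Length is deliberately
// ignored, since a stat-style response does not carry the object's data.
// Returns false if the header is missing or does not start with a number.
inline bool object_size_from_headers(const Headers& headers, uint64_t* size) {
  auto it = headers.find("Rgwx-Object-Size");
  if (it == headers.end()) {
    return false;
  }
  try {
    *size = std::stoull(it->second);
  } catch (const std::exception&) {
    return false;
  }
  return true;
}
```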



class RGWSyncModulesManager {
  Mutex lock;
Contributor

unused

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Can only sync from tiers that can export data.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Needed for sync module instance configuration

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
@yehudasa
Member Author

yehudasa commented Oct 7, 2016

@cbodley addressed your comments, repushed

Instead of having it as part of the data sync module. Since we only have a
single sync_module, having it there will make it easier to get its properties.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Use the new rgwx-stat HTTP param, which allows getting only an object's
metadata. Use that when calling stat_remote_object().

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Moving code that fetches remote object meta to its own classes.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
No real code change

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
If key is not passed in, don't try to sign the request.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
We're not necessarily going to connect to rgw/s3 endpoints,
we only need store param to handle s3 signing.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
sync module that will handle rgw metadata indexing.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
zone configuration now includes two new fields: sync_from_all,
which is boolean, and sync_from, which is a list of zones to
sync from. By default sync_from_all is set to true. Sync will
happen from all the zones if sync_from_all is true, or only from
the specified zones if sync_from_all is false. We also check to
see whether the zone can export data (depending on tier_type).

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
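
The sync_from_all / sync_from rule described in the commit message above can be sketched as follows. `ZoneConfig` and `syncs_from()` are hypothetical names used for illustration, and the `source_can_export` flag stands in for the tier_type check; this is not the code from the PR.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical model of the two new zone fields.
struct ZoneConfig {
  bool sync_from_all = true;        // default, per the commit message
  std::set<std::string> sync_from;  // zone names, used when sync_from_all is false
};

// Decides whether `zone` should sync from `source_zone`.
// `source_can_export` stands in for the tier_type-based export check.
inline bool syncs_from(const ZoneConfig& zone,
                       const std::string& source_zone,
                       bool source_can_export) {
  if (!source_can_export) {
    return false;  // can only sync from tiers that can export data
  }
  if (zone.sync_from_all) {
    return true;
  }
  return zone.sync_from.count(source_zone) > 0;
}
```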
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
add reference to --sync-from*

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Using the zone name is easier and clearer.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
@cbodley
Contributor

cbodley commented Oct 7, 2016

looks good. is it passing test_multi.py?

@yehudasa
Member Author

yehudasa commented Oct 7, 2016

@cbodley yes

@cbodley cbodley merged commit 4ededdb into ceph:master Oct 10, 2016
2 participants