rgw: sync modules, metadata search #10731
return ret;
}

{ /* opening scope so that we can do goto, sorry */
copy/paste? i don't see a goto
map<string, bufferlist>::iterator iter = src_attrs.find(RGW_ATTR_ETAG);
if (iter != src_attrs.end()) {
bufferlist& etagbl = iter->second;
*petag = string(etagbl.c_str(), etagbl.length());
there's a bufferlist::to_str() for this (and unlike bufferlist::c_str(), it doesn't require reallocating and copying into a contiguous buffer if the bufferlist has multiple segments)
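To illustrate the point about segments, here is a minimal sketch using a hypothetical SegmentedBuffer type. It is a stand-in written for this example and is not Ceph's bufferlist; it only shows how a to_str()-style method can build the string by appending each segment in turn, without first flattening the buffer itself.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a segmented buffer. NOT Ceph's bufferlist.
struct SegmentedBuffer {
  std::vector<std::string> segments;

  // Total number of bytes across all segments.
  std::size_t length() const {
    std::size_t n = 0;
    for (const auto& s : segments) {
      n += s.size();
    }
    return n;
  }

  // Build the result by appending segment by segment. The buffer's own
  // storage is left untouched, so no intermediate contiguous copy is made.
  std::string to_str() const {
    std::string out;
    out.reserve(length());
    for (const auto& s : segments) {
      out.append(s);
    }
    return out;
  }
};
```

With this shape, the etag assignment would reduce to a single call on the buffer, assuming the real class offers an equivalent method as the comment says.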

void set_result(ceph::real_time& _mtime,
uint64_t _size,
map<string, bufferlist>& _attrs) {
consider taking _attrs by rvalue ref, so it's obvious to the caller that _attrs is being moved
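A minimal sketch of what that suggestion could look like. StatResult and its members are hypothetical names chosen for this example, and std::string stands in for bufferlist; only the parameter style mirrors the set_result() under review.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical holder type; std::string stands in for bufferlist.
struct StatResult {
  std::uint64_t size = 0;
  std::map<std::string, std::string> attrs;

  // Taking _attrs by rvalue reference forces the caller to write
  // std::move(...) (or pass a temporary), which makes the transfer of
  // ownership visible at the call site.
  void set_result(std::uint64_t _size,
                  std::map<std::string, std::string>&& _attrs) {
    size = _size;
    attrs = std::move(_attrs);
  }
};
```

A call such as result.set_result(size, attrs) with a named map no longer compiles; the caller has to write result.set_result(size, std::move(attrs)) and should not rely on the contents of attrs afterwards.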
RGWSyncModulesManager *get_sync_modules_manager() {
return sync_modules_manager;
}
RGWSyncModuleInstanceRef& get_sync_module() {
consider returning by value or const reference. by reference allows the caller to modify, i.e. store->get_sync_module().reset()
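A small sketch of the const-reference variant, assuming the Ref type is a std::shared_ptr alias. Store, Module and ModuleRef are hypothetical names for this example, not the types in the PR.

```cpp
#include <memory>

// Hypothetical types standing in for the store and its sync module.
struct Module {
  int id = 0;
};
using ModuleRef = std::shared_ptr<Module>;

class Store {
  ModuleRef sync_module = std::make_shared<Module>();

public:
  // Returning a const reference lets callers read or copy the handle,
  // but store.get_sync_module().reset() is rejected by the compiler.
  const ModuleRef& get_sync_module() const {
    return sync_module;
  }
};
```

Callers can still take their own copy and reset that copy without affecting the store. Note that the pointed-to object remains mutable through the handle; only the handle held by the store is protected.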
* in this case, we're not returning the object's content, only the prepended
* extra metadata
*/
total_len = 0;
cool, so stat_remote_obj() works like a normal GET request that skips the data - and we get the size from Rgwx-Object-Size instead of Content-Length?
yeah
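As a rough sketch of the exchange above: when the body is skipped, Content-Length no longer reflects the object, so the size has to come from the Rgwx-Object-Size header. The function name and the plain header map are assumptions made for this example (requires C++17), and real header lookups would normally be case-insensitive.

```cpp
#include <charconv>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper: reads the object size from a stat-style response.
// The header name comes from the review thread; everything else is assumed.
std::optional<std::uint64_t> object_size_from_headers(
    const std::map<std::string, std::string>& headers) {
  auto it = headers.find("Rgwx-Object-Size");
  if (it == headers.end()) {
    return std::nullopt;  // header missing: size unknown
  }
  const std::string& value = it->second;
  const char* first = value.data();
  const char* last = value.data() + value.size();
  std::uint64_t size = 0;
  auto [ptr, ec] = std::from_chars(first, last, size);
  if (ec != std::errc() || ptr != last) {
    return std::nullopt;  // not a clean decimal number
  }
  return size;
}
```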

class RGWSyncModulesManager {
Mutex lock;
unused
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Can only sync from tiers that can export data. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Needed for sync module instance configuration Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
@cbodley addressed your comments, repushed
Instead of having it as part of the data sync module. Since we only have a single sync_module, having it there will make it easier to get its properties. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Use new rgwx-stat http param that allows getting only object's meta. Use that when calling stat_remote_object(). Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Moving code that fetches remote object meta to its own classes. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
No real code change Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
If key is not passed in, don't try to sign the request. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
We're not necessarily going to connect to rgw/s3 endpoints, we only need store param to handle s3 signing. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
sync module that will handle rgw metadata indexing. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Zone configuration now includes two new fields: sync_from_all, which is a boolean, and sync_from, which is a list of zones to sync from. By default sync_from_all is set to true. Sync happens from all zones when sync_from_all is true, or only from the specified zones when it is false. We also check whether the zone can export data (depending on tier_type). Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
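The rule in this commit message can be sketched roughly as follows. ZoneSyncConfig and should_sync_from are hypothetical names for this example, and identifying zones by plain strings is an assumption; the actual check in the PR may be structured differently.

```cpp
#include <set>
#include <string>

// Hypothetical view of the two new zone fields described above.
struct ZoneSyncConfig {
  bool sync_from_all = true;        // default per the commit message
  std::set<std::string> sync_from;  // consulted only when sync_from_all is false
};

// Decide whether this zone should pull data from source_zone.
// source_can_export models the tier_type-dependent export check.
bool should_sync_from(const ZoneSyncConfig& cfg,
                      const std::string& source_zone,
                      bool source_can_export) {
  if (!source_can_export) {
    return false;  // cannot sync from a tier that does not export data
  }
  if (cfg.sync_from_all) {
    return true;
  }
  return cfg.sync_from.count(source_zone) > 0;
}
```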
add reference to --sync-from* Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
Using the zone name is easier and clearer. Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
looks good. is it passing test_multi.py?
@cbodley yes
Make data sync more modular, so that we could add sync modules. An example for such sync module is the "log" module that logs every object that needs to be synced. Other examples (yet to be implemented) are meta indexing module, or backup (to external storage) module.
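As a loose illustration of the modular idea described here, the sketch below defines a small handler interface and a logging implementation. SyncHandler, LogSyncHandler and sync_object are hypothetical names for this example and are not taken from the PR's classes.

```cpp
#include <string>
#include <vector>

// Hypothetical plug-in interface: data sync calls into it for every
// object that needs to be synced.
class SyncHandler {
public:
  virtual ~SyncHandler() = default;
  virtual void sync_object(const std::string& bucket,
                           const std::string& key) = 0;
};

// A "log"-style module: records each object instead of copying it.
class LogSyncHandler : public SyncHandler {
  std::vector<std::string>& log_;

public:
  explicit LogSyncHandler(std::vector<std::string>& log) : log_(log) {}

  void sync_object(const std::string& bucket,
                   const std::string& key) override {
    log_.push_back("sync: " + bucket + "/" + key);
  }
};
```

Other modules, such as metadata indexing or backup to external storage, would plug in the same way by providing their own implementation of the interface.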