rgw: sync modules, metadata search #10731

Merged
merged 35 commits into from Oct 10, 2016

2 participants

@yehudasa
Member

Make data sync more modular so that sync modules can be added. One example of such a module is the "log" module, which logs every object that needs to be synced. Other examples (yet to be implemented) are a metadata indexing module and a module that backs up to external storage.
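
To make the idea concrete, here is a minimal sketch of what a pluggable sync module with a "log" implementation could look like. All names here (`SyncEvent`, `SyncModule`, `LogSyncModule`, `replay`) are hypothetical and simplified for illustration; they are not the interfaces added by this PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a data sync module interface.
// None of these names are taken from the PR.
struct SyncEvent {
  std::string bucket;
  std::string key;
};

class SyncModule {
public:
  virtual ~SyncModule() = default;
  // Called for every object that data sync decides needs syncing.
  virtual void sync_object(const SyncEvent& ev) = 0;
  // Called when an object was removed on the source zone.
  virtual void remove_object(const SyncEvent& ev) = 0;
};

// A "log" module: records what would be synced instead of copying data.
class LogSyncModule : public SyncModule {
public:
  void sync_object(const SyncEvent& ev) override {
    lines.push_back("SYNC " + ev.bucket + "/" + ev.key);
  }
  void remove_object(const SyncEvent& ev) override {
    lines.push_back("REMOVE " + ev.bucket + "/" + ev.key);
  }
  std::vector<std::string> lines;
};

// The sync engine only sees the abstract interface, so a different
// module (indexing, backup) could be swapped in without changing it.
inline void replay(SyncModule& module, const std::vector<SyncEvent>& to_sync) {
  for (const auto& ev : to_sync) {
    module.sync_object(ev);
  }
}

// Usage example: feed two events through the log module.
inline void demo_log_module() {
  LogSyncModule log;
  std::vector<SyncEvent> events;
  events.push_back(SyncEvent{"photos", "a.jpg"});
  events.push_back(SyncEvent{"photos", "b.jpg"});
  replay(log, events);
  assert(log.lines.size() == 2);
}
```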

@yehudasa yehudasa added the feature label Aug 15, 2016
@yehudasa yehudasa changed the title from [DNM] rgw: sync modules to rgw: sync modules, metadata search Aug 26, 2016
@yehudasa yehudasa added the rgw label Aug 26, 2016
@cbodley cbodley was assigned by yehudasa Aug 26, 2016
src/rgw/rgw_rados.cc
+ return ret;
+ }
+
+ { /* opening scope so that we can do goto, sorry */
@cbodley
cbodley Sep 21, 2016 Contributor

copy/paste? i don't see a goto

src/rgw/rgw_rados.cc
+ map<string, bufferlist>::iterator iter = src_attrs.find(RGW_ATTR_ETAG);
+ if (iter != src_attrs.end()) {
+ bufferlist& etagbl = iter->second;
+ *petag = string(etagbl.c_str(), etagbl.length());
@cbodley
cbodley Sep 21, 2016 Contributor

there's a bufferlist::to_str() for this (and unlike bufferlist::c_str(), it doesn't require reallocating and copying into a contiguous buffer if the bufferlist has multiple segments)
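
The difference the comment describes can be illustrated with a toy segmented buffer. This is a sketch only: `SegmentedBuffer` is a hypothetical stand-in, not ceph's `bufferlist`, and its internals are an assumption chosen to show why a `c_str()`-style accessor may have to rebuild the buffer while a `to_str()`-style accessor does not.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a segmented buffer; NOT ceph's bufferlist.
class SegmentedBuffer {
public:
  void append(const std::string& segment) { segments_.push_back(segment); }

  std::size_t length() const {
    std::size_t n = 0;
    for (const auto& s : segments_) n += s.size();
    return n;
  }

  // c_str()-style access: a raw pointer needs contiguous memory, so a
  // multi-segment buffer is first flattened into a single allocation.
  // This mutates the buffer's internal layout.
  const char* c_str() {
    if (segments_.size() > 1) {
      std::string flat;
      flat.reserve(length());
      for (const auto& s : segments_) flat += s;
      segments_.clear();
      segments_.push_back(std::move(flat));
      ++rebuilds_;
    }
    return segments_.empty() ? "" : segments_.front().c_str();
  }

  // to_str()-style access: copies each segment straight into the result
  // and leaves the buffer's own layout untouched.
  std::string to_str() const {
    std::string out;
    out.reserve(length());
    for (const auto& s : segments_) out += s;
    return out;
  }

  std::size_t segment_count() const { return segments_.size(); }
  std::size_t rebuilds() const { return rebuilds_; }

private:
  std::vector<std::string> segments_;
  std::size_t rebuilds_ = 0;
};

// Usage example: to_str() does not trigger a rebuild, c_str() does.
inline void demo_segmented_buffer() {
  SegmentedBuffer bl;
  bl.append("abc");
  bl.append("def");
  std::string etag = bl.to_str();
  assert(etag == "abcdef");
  assert(bl.rebuilds() == 0);
  std::string via_c(bl.c_str(), bl.length());
  assert(via_c == etag);
  assert(bl.rebuilds() == 1);
}
```

With the pattern from the diff, `string(x.c_str(), x.length())`, the data is copied twice for a multi-segment buffer: once to flatten, once into the string.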

+
+
+class RGWSyncModulesManager {
+ Mutex lock;
@cbodley
cbodley Sep 21, 2016 Contributor

unused

src/rgw/rgw_sync_module.h
+
+ void set_result(ceph::real_time& _mtime,
+ uint64_t _size,
+ map<string, bufferlist>& _attrs) {
@cbodley
cbodley Sep 21, 2016 Contributor

consider taking _attrs by rvalue ref, so it's obvious to the caller that _attrs is being moved
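
A minimal sketch of the suggested signature, under simplifying assumptions: `StatResult` is a hypothetical class, `std::string` stands in for `bufferlist`, and the `_mtime` parameter is omitted to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Simplified stand-in: std::string replaces bufferlist, and the mtime
// parameter from the diff is dropped.
class StatResult {
public:
  // Taking attrs by rvalue reference forces callers to write std::move(),
  // so the transfer of ownership is visible at the call site.
  void set_result(std::uint64_t size,
                  std::map<std::string, std::string>&& attrs) {
    size_ = size;
    attrs_ = std::move(attrs);
  }

  std::uint64_t size() const { return size_; }
  const std::map<std::string, std::string>& attrs() const { return attrs_; }

private:
  std::uint64_t size_ = 0;
  std::map<std::string, std::string> attrs_;
};

// Usage example: passing an lvalue without std::move() would not compile.
inline void demo_set_result() {
  std::map<std::string, std::string> attrs;
  attrs["etag"] = "abc123";  // illustrative key, not a real RGW attr name
  StatResult result;
  result.set_result(42, std::move(attrs));
  assert(result.size() == 42);
  assert(result.attrs().at("etag") == "abc123");
}
```

With a plain `map&` parameter the same move happens silently, and the caller may keep using a map that has already been emptied.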

src/rgw/rgw_rados.h
+ RGWSyncModulesManager *get_sync_modules_manager() {
+ return sync_modules_manager;
+ }
+ RGWSyncModuleInstanceRef& get_sync_module() {
@cbodley
cbodley Sep 21, 2016 Contributor

consider returning by value or const reference. by reference allows the caller to modify, i.e. store->get_sync_module().reset()
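
The concern can be shown with a small sketch. It assumes `RGWSyncModuleInstanceRef` behaves like a `std::shared_ptr`; `Store`, `SyncModuleInstance`, and the `_mutable` accessor name are hypothetical and exist only to contrast the two return types.

```cpp
#include <cassert>
#include <memory>

struct SyncModuleInstance {
  int id = 0;
};
using InstanceRef = std::shared_ptr<SyncModuleInstance>;

class Store {
public:
  Store() : sync_module_(std::make_shared<SyncModuleInstance>()) {}

  // Mutable reference: callers can reset or replace the store's pointer.
  InstanceRef& get_sync_module_mutable() { return sync_module_; }

  // Const reference: callers can use the instance but cannot reseat
  // the pointer held by the store.
  const InstanceRef& get_sync_module() const { return sync_module_; }

private:
  InstanceRef sync_module_;
};

// Usage example contrasting the two accessors.
inline void demo_accessors() {
  Store store;
  InstanceRef copy = store.get_sync_module();
  copy.reset();  // only drops the caller's own reference
  assert(store.get_sync_module() != nullptr);

  store.get_sync_module_mutable().reset();  // clears the store's member
  assert(store.get_sync_module() == nullptr);
}
```

Returning by value has the same safety property as the const reference, at the cost of one reference-count increment per call.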

+ * in this case, we're not returning the object's content, only the prepended
+ * extra metadata
+ */
+ total_len = 0;
@cbodley
cbodley Sep 21, 2016 edited Contributor

cool, so stat_remote_obj() works like a normal GET request that skips the data - and we get the size from Rgwx-Object-Size instead of Content-Length?

yehudasa added some commits Jul 5, 2016
@yehudasa yehudasa rgw: initial data plugin definition and default implementation
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
841ac38
@yehudasa yehudasa rgw: use data sync module callbacks
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
739aa58
@yehudasa yehudasa rgw: define sync modules manager, instance
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
f30c966
@yehudasa yehudasa rgw: define zone tier type, sync from appropriate tiers only
Can only sync from tiers that can export data.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
b184f8d
@yehudasa yehudasa rgw: add tier config for zone params
Needed for sync module instance configuration

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
f2d547e
@yehudasa yehudasa rgw_admin: can set/modify zone tier's config
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
ddbda51
@yehudasa
Member
yehudasa commented Oct 7, 2016

@cbodley addressed your comments, repushed

yehudasa added some commits Aug 3, 2016
@yehudasa yehudasa rgw: define sync_module on RGWRados
Instead of having it as part of the data sync module. Since we only have a
single sync_module, having it there will make it easier to get its properties.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2abd5ac
@yehudasa yehudasa rgw: non-rgw tier is not writeable
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
f84bd9d
@yehudasa yehudasa rgw: add a simple logging sync module
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
561dd09
@yehudasa yehudasa rgw: helper to stat remote obj
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
006bd49
@yehudasa yehudasa rgw: add cr to stat remote obj
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
d75ffe4
@yehudasa yehudasa rgw: propagate attrs, mtime, size of remote object
Use new rgwx-stat http param that allows getting only object's
meta. Use that when calling stat_remote_object().

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2dcafc7
@yehudasa yehudasa rgw: log sync module gets source object's meta
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
2c0f1a3
@yehudasa yehudasa rgw: some abstraction around log sync module
Moving code that fetches remote object meta to its own classes.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
a88e2b3
@yehudasa yehudasa rgw: move the rgw sync code module around
No real code change

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
90ccfbf
@yehudasa yehudasa rgw: REST client, don't sign requests if empty key
If key is not passed in, don't try to sign the request.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
0df0eab
@yehudasa yehudasa rgw: allow null store in RGWRESTConn
We're not necessarily going to connect to rgw/s3 endpoints,
we only need store param to handle s3 signing.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
f1e9905
@yehudasa yehudasa rgw: a new cr to send http PUT requests
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
e2477e5
@yehudasa yehudasa rgw: initial implementation of elasticsearch sync module
sync module that will handle rgw metadata indexing.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
faa90fa
@yehudasa yehudasa cmake: fix linkage of ceph_test_librgw_file_nfsns
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
cf37268
@yehudasa yehudasa rgw: es sync module, send object info to elasticsearch
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
448b98a
@yehudasa yehudasa rgw: es sync module, store object attrs
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
162c683
@yehudasa yehudasa rgw: es sync module, store acl information
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
5e72430
@yehudasa yehudasa rgw: utility function to dump iso8601
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
a7c3c4f
@yehudasa yehudasa rgw: es sync module, keep object mtime
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
4c96e9b
@yehudasa yehudasa rgw: es sync module, store custom metadata
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
9b0bf84
@yehudasa yehudasa rgw: rest conn functions cleanup, only append zonegroup if not empty
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
4e41af1
@yehudasa yehudasa rgw: add cr to send DELETE to remote endpoint
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
24bc83e
@yehudasa yehudasa rgw: es module, remove entry on delete
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
61902f7
@yehudasa yehudasa rgw: support partial mesh for zone sync
The zone configuration now includes two new fields: sync_from_all,
which is a boolean, and sync_from, which is a list of zones to
sync from. By default sync_from_all is set to true. Sync happens
from all zones when sync_from_all is true, or only from the zones
listed in sync_from when it is false. We also check whether the
source zone can export data (depending on tier_type).

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
8bd2642
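
The rule described in the partial mesh commit message above can be sketched as a single predicate. This is an illustration under stated assumptions: `ZoneSyncConfig` and `should_sync_from` are hypothetical names, and `source_exports_data` is a stand-in for the tier_type check, whose actual implementation is not shown on this page.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified view of the two zone fields named in the
// commit message; the real zone structures hold much more.
struct ZoneSyncConfig {
  bool sync_from_all = true;        // default per the commit message
  std::set<std::string> sync_from;  // consulted only when sync_from_all is false
};

// Whether this zone should pull data from source_zone.
// source_exports_data models the tier_type check (assumption).
inline bool should_sync_from(const ZoneSyncConfig& cfg,
                             const std::string& source_zone,
                             bool source_exports_data) {
  if (!source_exports_data) {
    return false;  // never sync from a tier that cannot export data
  }
  if (cfg.sync_from_all) {
    return true;   // full mesh: every exporting zone is a source
  }
  return cfg.sync_from.count(source_zone) > 0;  // partial mesh
}

// Usage example: restrict a zone to a single source.
inline void demo_partial_mesh() {
  ZoneSyncConfig cfg;
  assert(should_sync_from(cfg, "us-east", true));
  cfg.sync_from_all = false;
  cfg.sync_from.insert("us-east");
  assert(should_sync_from(cfg, "us-east", true));
  assert(!should_sync_from(cfg, "us-west", true));
}
```

The zone names are made up for the example.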
@yehudasa yehudasa rgw_admin: config options to set sync_from and sync_from_all
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
5f3c41f
@yehudasa yehudasa rgw_admin: update usage
add reference to --sync-from*

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
962449f
@yehudasa yehudasa rgw: setting sync-from zone by name not by id
Using the zone name is easier and clearer.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
9ef728a
@yehudasa yehudasa rgw_admin: sync status command shows if not syncing from zone
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
8c030fd
@yehudasa yehudasa rgw: index metadata in elasticsearch using realm name for path
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
881cb98
@cbodley
Contributor
cbodley commented Oct 7, 2016

looks good. is it passing test_multi.py?

@yehudasa
Member
yehudasa commented Oct 7, 2016

@cbodley yes

@cbodley cbodley merged commit 4ededdb into ceph:master Oct 10, 2016

2 checks passed

Signed-off-by: all commits in this PR are signed
default: Build finished.