rgw structures rework #11485

Merged
merged 29 commits into from Mar 9, 2017

4 participants
@yehudasa
Member

yehudasa commented Oct 13, 2016

Create new types:

  • rgw_pool (to represent pools instead of rgw_bucket)
  • rgw_raw_obj (to represent the raw rados object instead of rgw_obj)
  • rgw_obj_index_key

Don't store placement info in rgw_bucket where possible (explicit placement is still kept for backward compatibility when dealing with older data). The head object will always reside in the bucket's default placement target pool (from the zone config); the tail depends on what's in the manifest.

Rework rgw_obj: rename some of the fields, stop referring to the oid as "object", and don't auto-generate the oid. Encode the name, ns, and instance instead of the oid. Split rgw_obj_key and rgw_obj_index_key; rgw_obj handles both as needed.

Pools can now be defined with a namespace ([:ns]), and a few of the default pools are consolidated.
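As a rough illustration of the rework described above, the sketch below models a pool with an optional namespace and an object key whose oid is derived on demand from name, ns, and instance rather than stored. The type names `pool_sketch`, `obj_key_sketch`, and `raw_obj_sketch`, as well as the exact `_ns:instance_name` oid layout, are assumptions made for illustration and are not the actual definitions from this PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for rgw_pool: a pool name plus an optional namespace.
struct pool_sketch {
  std::string name;
  std::string ns;  // empty means the default namespace
};

// Hypothetical stand-in for rgw_obj_key: only the logical parts are stored.
struct obj_key_sketch {
  std::string name;
  std::string ns;
  std::string instance;

  // Derive the oid when needed instead of storing (and encoding) it.
  // Assumed layout: "_<ns>[:<instance>]_<name>". A plain name that starts
  // with '_' gets one extra leading '_' so it is not mistaken for a ns prefix.
  std::string get_oid() const {
    if (ns.empty() && instance.empty()) {
      if (name.empty() || name[0] != '_') {
        return name;
      }
      return "_" + name;
    }
    std::string oid = "_" + ns;
    if (!instance.empty()) {
      oid += ":" + instance;
    }
    oid += "_" + name;
    return oid;
  }
};

// Hypothetical stand-in for rgw_raw_obj: a rados object is just pool + oid.
struct raw_obj_sketch {
  pool_sketch pool;
  std::string oid;
};
```

Under these assumptions, a key with name `foo` and ns `shadow` yields the oid `_shadow_foo`, and the raw object is that oid paired with whichever pool the placement logic selects.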

@yehudasa yehudasa changed the title from [DNM] rgw code cleanup to [DNM] rgw structures rework Dec 16, 2016

@cbodley

Contributor

cbodley commented Dec 19, 2016

@yehudasa did you forget to commit test_rgw_common.cc? getting this error from cmake:

CMake Error at src/test/rgw/CMakeLists.txt:25 (add_library):
  Cannot find source file:

    test_rgw_common.cc
@yehudasa

Member

yehudasa commented Dec 19, 2016

@cbodley fixed now

@cbodley

looking really good!

the new structure names make it much clearer what each piece is for and how it's used

the namespace changes on top are really elegant, using escape characters for the parsing, and having a single rgw_init_ioctx() to handle the pool creation and the call to set_namespace() 👍

your commit messages made it a lot easier to review (thanks!), and the unit tests are a big help for validation

map<string, bufferlist> attrset;
RGWRawObjState() {}
RGWRawObjState(const RGWRawObjState& rhs) : obj (rhs.obj) {

@cbodley

cbodley Dec 21, 2016

Contributor

it doesn't look like the copy ctor is doing anything special - can we just let the compiler generate it?
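One way to act on this suggestion is to drop the hand-written copy constructor: when every member is itself copyable, the implicitly generated one performs the same member-wise copy. The sketch below uses a hypothetical `raw_obj_state_sketch` with stand-in members (`std::string` in place of `rgw_raw_obj` and `bufferlist`, plus an illustrative `size` field); it is not the actual RGWRawObjState.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <type_traits>

// Hypothetical stand-in for RGWRawObjState with simplified member types.
struct raw_obj_state_sketch {
  std::string obj;                             // stand-in for rgw_raw_obj
  std::uint64_t size = 0;                      // illustrative field
  std::map<std::string, std::string> attrset;  // stand-in for map<string, bufferlist>
  // No user-declared copy constructor: the compiler generates a
  // member-wise copy, which is what a hand-written one would do anyway.
};

// Compile-time check that the implicit copy constructor is available.
static_assert(std::is_copy_constructible<raw_obj_state_sketch>::value,
              "implicit copy constructor should be generated");
```

A side benefit of this approach is that newly added members are copied automatically, so the copy constructor cannot silently fall out of date.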

ENCODE_FINISH(bl);
}
void decode_from_bucket(bufferlist::iterator& bl);

@cbodley

cbodley Dec 21, 2016

Contributor

not used or implemented?

@@ -121,7 +121,8 @@ class RGWReadDataSyncStatusMarkersCR : public RGWShardCollectCR {
RGWDataSyncEnv *env;
const int num_shards;
int shard_id{0};
int shard_id{0};;

@cbodley

cbodley Dec 21, 2016

Contributor

extra ;

}
map<string, bufferlist> attrs = s->bucket_attrs;
attrs[RGW_ATTR_CORS] = cors_bl;
op_ret = rgw_bucket_set_attrs(store, s->bucket_info, attrs, &s->bucket_info.objv_tracker);

@cbodley

cbodley Dec 21, 2016

Contributor

why is the is_object_op case not needed here (and RGWDeleteCORS)?

const rgw_obj& src_obj,
int versioning_status,
uint16_t bilog_flags = 0,
const ceph::real_time& expiration_time = ceph::real_time());
/** Delete a raw object.*/
int delete_raw_obj(const rgw_raw_obj& obj);

@cbodley

cbodley Dec 21, 2016

Contributor

is there supposed to be a semantic difference between this and delete_system_obj()? they are basically equivalent, since objv_tracker defaults to null

@yehudasa

yehudasa Dec 22, 2016

Member

trying to separate the notion of a raw obj vs a system obj. Maybe this can be reworked later to avoid duplicate code though.

RGWObjectCtx obj_ctx(store);
RGWBucketInfo bucket_info;
int ret = store->get_bucket_instance_info(obj_ctx, bucket, bucket_info, NULL, NULL);

@cbodley

cbodley Dec 21, 2016

Contributor

it looks like the callers (the RGWRados::bucket_index_*() functions, at least) already have a RGWBucketInfo - can BucketShard::init() take that instead of rgw_bucket to avoid this extra io?

/*
* Ceph - scalable distributed file system
*
* Copyright (C) 2013 eNovance SAS <licensing@enovance.com>

@cbodley

cbodley Dec 21, 2016

Contributor

bad copy-paste?

ns->clear();
return;
}
if (key[1] == '_') {

@cbodley

cbodley Dec 21, 2016

Contributor

is there anything before key[0] and key[1] that checks the string length?
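A minimal sketch of what a length-checked version could look like follows. The helper name `split_oid_sketch`, its bool return, and the assumed encoding (`name`, `__name` for a plain name starting with an underscore, `_ns_name` for a namespaced one) are illustrative assumptions, not the code under review.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: split an encoded oid into namespace and name.
// Assumed encoding:
//   "name"     -> no namespace
//   "__name"   -> no namespace, name is "_name" (escaped leading '_')
//   "_ns_name" -> namespace "ns", name "name"
// Returns false for input that does not fit this scheme.
bool split_oid_sketch(const std::string& oid, std::string* ns, std::string* name) {
  ns->clear();
  name->clear();
  if (oid.empty()) {
    return false;  // nothing to parse, and oid[0] must not be inspected
  }
  if (oid[0] != '_') {
    *name = oid;  // plain name, no namespace
    return true;
  }
  // oid[0] == '_' here; check the size before touching oid[1].
  if (oid.size() < 2) {
    return false;  // a lone "_" is malformed
  }
  if (oid[1] == '_') {
    *name = oid.substr(1);  // drop the escaping underscore
    return true;
  }
  // Look for the '_' that terminates the namespace.
  const std::size_t pos = oid.find('_', 2);
  if (pos == std::string::npos || pos + 1 == oid.size()) {
    return false;  // no terminator, or nothing after it
  }
  *ns = oid.substr(1, pos - 1);
  *name = oid.substr(pos + 1);
  return true;
}
```

Each index is guarded by an explicit size check, so no position beyond the end of the string is read even for inputs such as `""` or `"_"`.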

* part of the given namespace, it returns false.
*/
static bool oid_to_key_in_ns(const string& oid, rgw_obj_key *key, const string& ns) {
string obj_ns;

@cbodley

cbodley Dec 21, 2016

Contributor

unused variable

@@ -1424,6 +1425,83 @@ bool RGWUserCaps::is_valid_cap_type(const string& tp)
return false;
}
static ssize_t unescape_str(const string& s, ssize_t ofs, char esc_char, char special_char, string *dest)

@cbodley

cbodley Dec 21, 2016

Contributor

string::npos is a size_t, so that's probably a better fit for the return type
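To make the suggestion concrete, here is a sketch of an unescape helper that uses `size_t` for both the offset and the return value, so the result can be compared directly against `std::string::npos`. The names `unescape_str_sketch` and `parse_pool_sketch`, and the choice of a backslash escape with `:` as the separator, are assumptions for illustration; the actual function body is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copy characters from s, starting at ofs, into *dest until an unescaped
// special_char is found. An esc_char makes the next character literal.
// Returns the index just past the special_char, or std::string::npos if
// the end of the string was reached first.
std::size_t unescape_str_sketch(const std::string& s, std::size_t ofs,
                                char esc_char, char special_char,
                                std::string* dest) {
  dest->clear();
  bool esc = false;
  for (std::size_t i = ofs; i < s.size(); ++i) {
    const char c = s[i];
    if (!esc && c == esc_char) {
      esc = true;  // remember the escape, emit nothing yet
      continue;
    }
    if (!esc && c == special_char) {
      return i + 1;  // caller resumes right after the separator
    }
    dest->push_back(c);
    esc = false;
  }
  return std::string::npos;
}

// Hypothetical use: split "pool:ns", where a literal ':' is written "\:".
void parse_pool_sketch(const std::string& str, std::string* pool, std::string* ns) {
  const std::size_t pos = unescape_str_sketch(str, 0, '\\', ':', pool);
  if (pos == std::string::npos) {
    ns->clear();  // no separator: the pool has no namespace
    return;
  }
  unescape_str_sketch(str, pos, '\\', ':', ns);
}
```

With an unsigned return type there is no signed/unsigned comparison against `npos`, and the offset returned by one call can be passed straight into the next.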

@yehudasa

Member

yehudasa commented Feb 25, 2017

@cbodley PR that adds ragweed support here: #13644

@yehudasa yehudasa changed the title from [DNM] rgw structures rework to rgw structures rework Mar 3, 2017

@yehudasa yehudasa changed the title from rgw structures rework to [DNM] rgw structures rework Mar 3, 2017

@yehudasa yehudasa changed the title from [DNM] rgw structures rework to rgw structures rework Mar 8, 2017

yehudasa added some commits Oct 8, 2016

rgw: introduce rgw_pool, rgw_raw_obj
Pools are represented by rgw_pool (and not rgw_bucket anymore),
and we use rgw_raw_obj to reference rados objs and all 'system'
objects (vs rgw_obj that is used for rgw objects).

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: separate RGWObjState
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: more fixes and adjustments following rgw_pool, rgw_raw_obj
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: remove unneeded virtual declarations
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: decode rgw_raw_obj as rgw_obj when it's old object
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: use rgw_raw_obj in manifest code
This drags in multiple related changes that are needed in order to
support that.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: simple manifest compaction
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
test/rgw: extend obj, manifest unitests
Test rgw_raw_obj and upgrade of old rgw_obj, rgw_bucket and
old manifest.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: implicit rgw_bucket placement, manifest uses rgw_raw_obj
Two main changes here:
1. Newly created rgw_bucket does not have a predetermined placement pools
assigned to it. The placement_id param in the objects themselves points
at where the data is located. This affects object's tail location, head
is located where the bucket instance's placement rule points at.
2. Modify object manifest to use rgw_raw_obj instead of rgw_obj.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: fix bucket overwrite
got broken through the rgw_bucket cleanup related work

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: clean rgw_obj
Instead of storing the oid and the name, just store the name
and calculate it when needed (same goes to locator). Also give more
coherent names to the various fields.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: add rgw_obj_index_key, kill RGWObjEnt
Use rgw_obj_index_key to represent entries in bucket index (typedef of
cls_rgw_obj_key). Get rid of RGWObjEnt, it was duplicate of rgw_bucket_dir_entry
anyway.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: keep rgw_obj key info in rgw_obj_key field
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
cls/version: add more useful logging
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: move placement rule out of rgw_bucket
Bucket's placement rule is in the bucket instance's info. Object's
placement rule is in the manifest

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: compilation and other fixes following rebase
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: add namespace to rgw_pool
add a namespace field to the rgw_pool struct

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>

yehudasa added some commits Dec 16, 2016

rgw: handle pools namespace
Use rgw_pool all around, and replace librados::create_ioctx() with
helper that also sets the namespace.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: modify default pools to use namespaces
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: fix async cr operation
Fix crash due to code cleanup. Changes scope of obj ref.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
mrgw.sh: fix script
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw_admin: remove broken check
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: metadata put of bucket instance sets bucket_id
Need to parse the bucket id off the entry and then set it on the
bucket struct.

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: get_obj_state() checks for empty oids
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: new rest api to retrieve object layout
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: more fixes following rebase
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
qa/tasks/radosgw_admin: adjust test to new bucket structure
Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>
rgw: fix crash when listing objects via swift
Fixes: http://tracker.ceph.com/issues/19249

Signed-off-by: Yehuda Sadeh <yehuda@redhat.com>

@yehudasa yehudasa merged commit 3d29012 into ceph:master Mar 9, 2017

3 checks passed

  • Signed-off-by: all commits in this PR are signed
  • Unmodifed Submodules: submodules for project are unmodified
  • default: Build finished.
@smithfarm

Contributor

smithfarm commented Mar 22, 2017

@yehudasa I understand you'd like to backport a9ec5e8 from this PR, but to which stable release(s)?

(Backport field of http://tracker.ceph.com/issues/19249 is currently empty)

@smithfarm

Contributor

smithfarm commented on src/rgw/rgw_rest_swift.cc in 5cf5ab4 Apr 13, 2017

@yehudasa Is this just a cleanup or was the & in the previous version a mistake?

Member

yehudasa replied Apr 13, 2017

@smithfarm the types are different now, can't take a reference

Contributor

smithfarm replied Apr 13, 2017

Thanks - I guess that means my kraken backport conflict resolution is good.

manifest->set_tail_bucket(_b);
manifest->set_head(_h, 0);
manifest->set_tail_placement(placement_rule, _b);
manifest->set_head(placement_rule, _obj, 0);

@fangyuxiangGL

fangyuxiangGL May 5, 2017

Contributor

so the tail stripes and the head object will still be stored under the same placement_rule?

@fangyuxiangGL

Contributor

fangyuxiangGL commented May 5, 2017

@yehudasa

"head object will always reside in the bucket's default placement target pool, (from the zone config), tail depends on what's in the manifest."

Judging by your code, the head object and tail stripes are still stored in rados the same way.
