os/bluestore: refactor ExtentMap::update to avoid preceeding db updat… #12394

ifed01 · 2016-12-08T15:44:56Z

…e if reshard takes place.

The update method might add some 'set' ops to the transaction before detecting that resharding is needed.
There is no need to apply these ops after the reshard takes place, since the corresponding records will be removed anyway.

Signed-off-by: Igor Fedotov ifed@mirantis.com

liewegas

I think we can simplify this: the only reason we ever call update() twice is when update() itself returns true. That means update() can stage the changes and only put them in t once it is sure it won't return true. Which means this change could be local to update() only, right?

Also, if we can avoid the memory allocations of map<>, that would be nice. Like, put the bufferlist in Shard, and string the dirty ones together in an intrusive_list or something. Or just put a single local vector<Shard*> on the stack or something.
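For illustration, here is a minimal sketch of the staging idea suggested above. It is not the BlueStore code: `Transaction`, `Shard`, `Staged`, and the `max_encoded` threshold are hypothetical stand-ins, and `std::string` takes the place of `bufferlist`. The point is only that encoded shards are held in a local vector and handed to the transaction after the loop has finished without asking for a reshard.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a key/value transaction: it only records set() calls.
struct Transaction {
  std::vector<std::pair<std::string, std::string>> sets;
  void set(const std::string& key, const std::string& value) {
    sets.emplace_back(key, value);
  }
};

// Hypothetical, heavily simplified shard.
struct Shard {
  std::string key;      // DB key for this shard
  std::string payload;  // data that would be encoded
  bool dirty;           // needs to be written back
};

// Returns true if a reshard is needed; in that case t is left untouched,
// so no 'set' ops are queued for records that the reshard would remove.
bool update(std::vector<Shard>& shards, Transaction& t, std::size_t max_encoded) {
  struct Staged {
    Shard* shard;
    std::string encoded;
  };
  std::vector<Staged> staged;  // local staging area, lives only for this call

  for (auto& s : shards) {
    if (!s.dirty) {
      continue;
    }
    std::string encoded = s.payload;  // placeholder for the real encoding step
    if (encoded.size() > max_encoded) {
      return true;  // reshard needed: drop everything staged so far
    }
    staged.push_back({&s, std::move(encoded)});
  }

  // No reshard needed: only now schedule the DB updates.
  for (auto& st : staged) {
    t.set(st.shard->key, st.encoded);
    st.shard->dirty = false;
  }
  return false;
}
```

With this shape the caller can run a reshard and call `update()` again without first having to undo any ops from the failed attempt.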

ifed01 · 2016-12-14T14:21:53Z

@liewegas fixed and rebased

liewegas · 2016-12-14T15:36:20Z

src/os/bluestore/BlueStore.cc

+
+    struct dirty_shard_t{
+    string* key = nullptr;
+    bufferlist bl;


If we put the bufferlist dirty_bl in Shard, we can avoid this structure entirely, right?

liewegas · 2016-12-14T15:36:26Z

src/os/bluestore/BlueStore.cc

+    &dirty_shard_t::dirty_list_item> > dirty_shard_list_t;
+
+
+    vector<dirty_shard_t> encoded_shards;


liewegas · 2016-12-14T15:36:57Z

src/os/bluestore/BlueStore.cc

 	unsigned nn;
+        bufferlist& bl = encoded_shards[pos].bl;


bufferlist& bl = p->dirty_bl;

liewegas · 2016-12-14T15:37:32Z

src/os/bluestore/BlueStore.cc

+    //schedule DB update for dirty shards
+    auto it = dirty_shards.begin();
+    while( it != dirty_shards.end()) {
+      t->set(PREFIX_OBJ, *(it->key), it->bl);


and here we'd clear dirty_bl again?

I guess this uses a bit more RAM...

ifed01

Yeah, the higher per-shard RAM usage is the main reason I chose the approach above. I believe we may need to think about reducing the Onode in-memory size one day, since it affects onode caching effectiveness: a larger onode means fewer entries in the cache and worse performance. And even an empty bufferlist isn't that small. Hence IMHO it's overkill to have a bufferlist instance for each onode's shard, especially considering that it is only used locally.
Do you still think we want to keep it with the shard?

And a small bonus of my approach: we don't need to enumerate ALL the shards twice. I iterate over the dirty shards only at the final stage.
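To make the memory argument concrete, here is a rough sketch comparing a shard with and without an embedded buffer member. The struct layouts are hypothetical and `std::string` stands in for `bufferlist`, so the actual numbers for BlueStore will differ; the sketch only shows that a per-shard member costs space in every cached shard, dirty or not.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical shard without a per-shard staging buffer.
struct ShardLean {
  std::uint32_t offset;
  bool loaded;
  bool dirty;
};

// Same shard with an embedded buffer (std::string as a stand-in for bufferlist).
struct ShardWithBuffer {
  std::uint32_t offset;
  bool loaded;
  bool dirty;
  std::string dirty_bl;  // occupies space even while empty
};

// Extra bytes each shard carries when the buffer is a member.
constexpr std::size_t per_shard_overhead() {
  return sizeof(ShardWithBuffer) - sizeof(ShardLean);
}

// Extra resident bytes for an onode with n_shards shards, regardless of
// how many of them are dirty at any given moment.
std::size_t resident_overhead(std::size_t n_shards) {
  return n_shards * per_shard_overhead();
}
```

A local staging container, by contrast, holds buffers only for the shards that are dirty during one update() call and releases them when the call returns.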

liewegas · 2016-12-14T15:59:48Z

Yeah, let's go with what you have. Can you clean up the whitespace, and remove the unneeded intrusive member item in Shard?

ifed01 · 2016-12-14T16:48:47Z

@liewegas - cleaned-up

ifed01 added the bluestore label Dec 8, 2016

liewegas requested changes Dec 12, 2016

ifed01 force-pushed the wip-bluestore-fix-reshard branch from 584c8a5 to b12c444 Compare December 14, 2016 14:20

liewegas reviewed Dec 14, 2016

src/os/bluestore/BlueStore.cc

&dirty_shard_t::dirty_list_item> > dirty_shard_list_t;

vector<dirty_shard_t> encoded_shards;

and this

Igor Fedotov added 2 commits December 14, 2016 16:21

os/bluestore: refactor ExtentMap::update to avoid preceeding db updat…

693ee50

…e if reshard takes place Signed-off-by: Igor Fedotov <ifedotov@mirantis.com>

os/bluestore: remove redundant onode parameter in ExtentMap methods

c66977b

Signed-off-by: Igor Fedotov <ifedotov@mirantis.com>

ifed01 force-pushed the wip-bluestore-fix-reshard branch from b12c444 to c66977b Compare December 14, 2016 16:45

liewegas approved these changes Dec 19, 2016

liewegas added needs-qa wip-sage-testing labels Dec 19, 2016

liewegas merged commit fcbabdd into ceph:master Dec 22, 2016

ifed01 deleted the wip-bluestore-fix-reshard branch December 22, 2016 15:41
