Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MOSDOp/do_op/handle_op: Simple Messenger optimization #5211

Merged
merged 2 commits into from Sep 30, 2015

Conversation

Projects
None yet
5 participants
@jjlakis
Copy link

jjlakis commented Jul 13, 2015

Optimization of Simple Messenger according to Sage's proposition, consulted with Intel and HP teams.

  1. Couple of checks moved to parallelized workers
  2. Message decoding splitted to partial and final decoding, and move final decoding to workers
  3. Partial decoding optimized to contain only fields necessary to enqueue message to workers

We observed decreasing latency for reads, there's no visible improvement for writes. Here's the results for Filestore:
image
by Szymon Graczyk, Intel

Also, after moving checks and message decoding to workers, we observed increased activity of worker threads:
image
by Stephen Blinick, Intel

@jjlakis

This comment has been minimized.

Copy link
Author

jjlakis commented Jul 13, 2015

@gregsfortytwo , it would be nice to see you review again :)

@jjlakis jjlakis changed the title Simple Messenger optimization MOSDOp/do_op/handle_op: Simple Messenger optimization Jul 13, 2015

@gregsfortytwo gregsfortytwo self-assigned this Jul 13, 2015

@jjlakis

This comment has been minimized.

Copy link
Author

jjlakis commented Jul 13, 2015

@liewegas , your propositions for SM optimization

@@ -267,7 +273,7 @@ struct ceph_osd_request_head {
}

virtual void decode_payload() {
bufferlist::iterator p = payload.begin();
p = payload.begin();

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Jul 13, 2015

Member

Before doing this we should probably assert that partialDecodeNeeded and fullDecodeNeeded are set as expected?

This comment has been minimized.

@jjlakis

jjlakis Jul 13, 2015

Author

added

@@ -35,6 +35,7 @@
#define CEPH_FEATURE_OSD_HBMSGS (1ULL<<28)
#define CEPH_FEATURE_MDSENC (1ULL<<29)
#define CEPH_FEATURE_OSDHASHPSPOOL (1ULL<<30)
#define CEPH_FEATURE_NEW_ENCODING (1ULL<<30)

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Jul 13, 2015

Member

Heh, I know I made a reference to maybe letting it share with another feature, but I meant if we had another feature go in during the same dev release, not trying to match it up with an existing one! Just give it an unused number so we can disambiguate between brand new clients that support putting the pgid at the front, versus those that understand the new OSDMap encoding that this covers. ;)

This comment has been minimized.

@jjlakis

jjlakis Jul 13, 2015

Author

Somehow, I claimed that all the numbers are taken, my mistake.

::encode(mtime, payload);
::encode(reassert_version, payload);
::encode(oloc, payload);
::encode(pgid, payload);

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Jul 13, 2015

Member

So we're here encoding in the classic order because the other end doesn't understand the new order, but as part of that we also should set the message version to be the older one (5, I think?). You just need to overwrite head.version with the appropriate value.

@gregsfortytwo gregsfortytwo removed their assignment Jul 13, 2015

@gregsfortytwo

This comment has been minimized.

Copy link
Member

gregsfortytwo commented Jul 13, 2015

A few pieces still off but overall this looks really good.

I appreciate pushing the review-provoked patches on top but this PR is small enough that you should feel free to squash down any changes you like as well.

Thanks!

@jjlakis

This comment has been minimized.

Copy link
Author

jjlakis commented Jul 13, 2015

Thanks @gregsfortytwo , we're focused here on our own local configuration so I wasn't prepared to make correct pull request ready to merge with master :) . Thanks again for your help!

@@ -64,6 +64,7 @@
// duplicated since it was introduced at the same time as MIN_SIZE_RECOVERY
#define CEPH_FEATURE_OSD_PROXY_FEATURES (1ULL<<49) /* overlap w/ above */
#define CEPH_FEATURE_MON_METADATA (1ULL<<50)
#define CEPH_FEATURE_NEW_ENCODING (1ULL<<51) /* New, v6 encoding */

This comment has been minimized.

@liewegas

liewegas Aug 5, 2015

Member

CEPH_FEATURE_NEW_OSDOP_ENCODING ? "new encoding" isn't very descriptive

This comment has been minimized.

@jjlakis

@liewegas liewegas self-assigned this Aug 5, 2015

@liewegas liewegas added the performance label Aug 5, 2015

@@ -242,14 +236,27 @@ struct ceph_osd_request_head {
::encode_nohead(snaps, payload);
} else {

This comment has been minimized.

@liewegas

liewegas Aug 5, 2015

Member

} else if ((features & CEPH_FEATURE_NEW_OSDOP_ENCODING) == 0) {
header.version = 5;
...
} else {
new encoding ...
}


// In old versions, final decoding is done in first step
finalDecodeNeeded = false;
} else if (header.version < 6) {
// new decode

This comment has been minimized.

@liewegas

liewegas Aug 5, 2015

Member

change this to "v6 decode"?

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Aug 5, 2015

Aside from the cosmetic/style issues, this looks good!

Can you please restructure this into two patches:

  1. move the OSD checks from handle_op into do_op
  2. all of the MOSDOp changes and the OSD change that uses it (calls finish_decode())

Thanks!

@jjlakis

This comment has been minimized.

Copy link
Author

jjlakis commented Aug 6, 2015

@liewegas You mean 2 commits or 2 different pull requests? First thing - done

@jjlakis jjlakis force-pushed the jjlakis:smc3 branch from 1cf8544 to b3aee8c Aug 6, 2015

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Aug 6, 2015

Two commits.. this looks great. Thanks!

@liewegas liewegas added the needs-qa label Aug 6, 2015

@liewegas liewegas added this to the infernalis milestone Aug 12, 2015

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Aug 14, 2015

2015-08-14 14:23:50.497138 7f35c67fe700 -1 ./messages/MOSDOp.h: In function 'const eversion_t& MOSDOp::get_version()' thread 7f35c67fe700 time 2015-08-14 14:23:50.418958
./messages/MOSDOp.h: 87: FAILED assert(!final_decode_needed)

ceph version 9.0.2-1219-g941ea50 (941ea50)
1: (ceph::ceph_assert_fail(char const, char const, int, char const_)+0x8b) [0x7f35e8bba56b]
2: (()+0x277bb8) [0x7f35e8740bb8]
3: (PG::can_discard_op(std::shared_ptr&)+0x29) [0x7f35e8883b49]
4: (PG::can_discard_request(std::shared_ptr&)+0x55) [0x7f35e8884395]
5: (ReplicatedPG::do_request(std::shared_ptr&, ThreadPool::TPHandle&)+0x86) [0x7f35e88eeca6]
6: (OSD::dequeue_op(boost::intrusive_ptr, std::shared_ptr, ThreadPool::TPHandle&)+0x3bd) [0x7f35e876832d]
7: (PGQueueable::RunVis::operator()(std::shared_ptr&)+0x5d) [0x7f35e876854d]
8: (OSD::ShardedOpWQ::process(unsigned int, ceph::heartbeat_handle_d)+0x687) [0x7f35e878c2f7]
9: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x85f) [0x7f35e8baaf9f]
10: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f35e8bacea0]
11: (()+0x7df5) [0x7f35e7a08df5]
12: (clone()+0x6d) [0x7f35e5b891ad]
NOTE: a copy of the executable, or objdump -rdS <executable> is needed to interpret this.

@liewegas liewegas removed this from the infernalis milestone Aug 17, 2015

@jjlakis

This comment has been minimized.

Copy link
Author

jjlakis commented Aug 18, 2015

Sorry for this issues + I was on vacation. I can't reproduce it but I see the bug. @liewegas what will be the better choice?
a) use condition m->is_final_decoded() instead of m->version > 0 - it may be difficult to read
b) remove asserts from get_... formulas of fields decoded in final decoding and assume that they are decoded (and use its default values in conditions [i.e. version == 0]) - as before

@tchaikov

This comment has been minimized.

Copy link
Contributor

tchaikov commented Sep 9, 2015

@0003ydlom this needs rebase.

@jjlakis jjlakis force-pushed the jjlakis:smc3 branch from c4893cb to 8f06fde Sep 14, 2015

@jjlakis

This comment has been minimized.

Copy link
Author

jjlakis commented Sep 14, 2015

@liewegas @tchaikov it's done

@jjlakis jjlakis force-pushed the jjlakis:smc3 branch from 8f06fde to 33af425 Sep 16, 2015

@jjlakis jjlakis force-pushed the jjlakis:smc3 branch from 33af425 to 9581e69 Sep 22, 2015

@tchaikov

This comment has been minimized.

Copy link
Contributor

tchaikov commented Sep 25, 2015

@liewegas and @gregsfortytwo this pr passed the rados run on my test branch. is it good to merge?

@gregsfortytwo

This comment has been minimized.

Copy link
Member

gregsfortytwo commented Sep 25, 2015

LGTM

@tchaikov

This comment has been minimized.

Copy link
Contributor

tchaikov commented Sep 27, 2015

@0003ydlom could you rebase again? i think it's ready to merge then. thanks.

@liewegas

This comment has been minimized.

Copy link
Member

liewegas commented Sep 27, 2015

Jacek J. Lakis added some commits Aug 6, 2015

Jacek J. Lakis Jacek J. Łakis
handle_op/do_op: Moving couple of checks from dispatcher to paralleli…
…zed workers

Signed-off-by: Jacek J. Lakis <jacek.lakis@intel.com>
Jacek J. Lakis Jacek J. Łakis
MOSDOp::decode : Splitting message decoding, new version
Signed-off-by: Jacek J. Lakis <jacek.lakis@intel.com>

@jjlakis jjlakis force-pushed the jjlakis:smc3 branch from 9581e69 to 818d790 Sep 28, 2015

liewegas added a commit that referenced this pull request Sep 30, 2015

Merge pull request #5211 from 0003ydlom/smc3
MOSDOp/do_op/handle_op: Simple Messenger optimization

Reviewed-by: Sage Weil <sage@redhat.com>
Reviewed-by: Greg Farnum <gfarnum@redhat.com>
Tested-by: Kefu Chai <kchai@redhat.com>

@liewegas liewegas merged commit 756f809 into ceph:master Sep 30, 2015

XinzeChi added a commit to XinzeChi/ceph that referenced this pull request Nov 9, 2015

MOSDRepOp: Simple Messenger optimization
the origin idea is from ceph#5211.

Signed-off-by: Xinze Chi <xinze@xsky.com>

XinzeChi added a commit to XinzeChi/ceph that referenced this pull request Nov 9, 2015

MOSDRepOp: Simple Messenger optimization
the origin idea is from ceph#5211.

Signed-off-by: Xinze Chi <xinze@xsky.com>

XinzeChi added a commit to XinzeChi/ceph that referenced this pull request Nov 9, 2015

MOSDRepOpReply: Simple Messenger optimization
the origin idea is from ceph#5211.

Signed-off-by: Xinze Chi <xinze@xsky.com>

XinzeChi added a commit to XinzeChi/ceph that referenced this pull request Nov 9, 2015

MOSDRepOp: Simple Messenger optimization
the origin idea is from ceph#5211.

Signed-off-by: Xinze Chi <xinze@xsky.com>

XinzeChi added a commit to XinzeChi/ceph that referenced this pull request Nov 9, 2015

MOSDRepOpReply: Simple Messenger optimization
the origin idea is from ceph#5211.

Signed-off-by: Xinze Chi <xinze@xsky.com>

XinzeChi added a commit to XinzeChi/ceph that referenced this pull request Nov 11, 2015

msg: ignore request_redirect_t encode/decode when we not need
reusing the feature introduced by ceph#5211

Signed-off-by: Xinze Chi <xinze@xsky.com>

XinzeChi added a commit to XinzeChi/ceph that referenced this pull request Nov 11, 2015

msg: ignore request_redirect_t encode/decode when we not need
reusing the feature introduced by ceph#5211

Signed-off-by: Xinze Chi <xinze@xsky.com>

XinzeChi added a commit to XinzeChi/ceph that referenced this pull request Nov 11, 2015

msg: ignore request_redirect_t encode/decode when we not need
reusing the feature introduced by ceph#5211

Signed-off-by: Xinze Chi <xinze@xsky.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.