New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

common/TrackedOp: various cleanups and optimizations #12537

Merged
merged 17 commits into from Jan 31, 2017

Conversation

Projects
None yet
4 participants
@liewegas
Member

liewegas commented Dec 16, 2016

  • move to intrusive_ptr
  • use intrusive list
  • preallocate event vector
  • increase dout level (so that it doesn't run by default)
  • cache rendered op description
  • lots of cleanup

Perf shows me going from ~9% to 2.5% with defaults.

@liewegas

This comment has been minimized.

Member

liewegas commented Dec 22, 2016

retest this please

@tchaikov tchaikov self-assigned this Jan 12, 2017

@gregsfortytwo

Mostly looks okay, but some questions.

I'm curious how much of the change is attributable merely to reducing the logging. In the commit you say 2-3%, but I'm surprised switching from shared_ptr to intrusive_ptr atomics is responsible for another 3-4.5%...(maybe killing xlist does it?).

@@ -173,6 +173,10 @@ void MutationImpl::cleanup()
drop_pins();
}
void MutationImpl::_dump_op_descriptor_unlocked(ostream& stream) const
{
stream << "Mutation";

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Jan 13, 2017

Member

Well, I guess this puts a few more of the untracked changes into the OpTracker. We should do better output than a blanket "Mutation" for them though.

This comment has been minimized.

@liewegas

liewegas Jan 13, 2017

Member

They don't actually get tracked since with don't initialize them with an OpTracker*, so it doesn't really matter. We just have to implement the method.

@@ -68,10 +69,6 @@ class MMDSFragmentNotify;
class ESubtreeMap;
struct MDRequestImpl;
typedef ceph::shared_ptr<MDRequestImpl> MDRequestRef;
struct MDSlaveUpdate;

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Jan 13, 2017

Member

We're generally trying to reduce our dependencies instead of including stuff in more places — can we just update these instead of sticking Mutation.h everywhere?

This comment has been minimized.

@liewegas

liewegas Jan 13, 2017

Member

Unfortunately no, because the intrusive_ptr needs access to the refcount in the parent class. :(

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Jan 14, 2017

Member

Oh, and we imply use of all those functions by using the intrusive_ptr as a parameter? sigh

#include "msg/Message.h"
#include "include/memory.h"
#include "common/RWLock.h"
#include <atomic>
#define OPTRACKER_PREALLOC_EVENTS 20

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Jan 13, 2017

Member

Any reason not to make this a config value? How did you choose 20? (That seems awfully small.)

This comment has been minimized.

@liewegas

liewegas Jan 13, 2017

Member

I think osd ops are usually <10, so it seemed safe. If it's more on the MDS can can bump it up.. 50?

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Jan 14, 2017

Member

Well it's ops in flight at a time, right? SSD OSDs can handle a lot a more than 10 at a time (at least, I assume).
Dunno how much it actually matters since vector is reasonably intelligent about internal allocations anyway, though.

@@ -161,7 +164,7 @@ inline ostream& operator<<(ostream &out, const MutationImpl &mut)
return out;
}
typedef ceph::shared_ptr<MutationImpl> MutationRef;
typedef boost::intrusive_ptr<MutationImpl> MutationRef;

This comment has been minimized.

@gregsfortytwo

gregsfortytwo Jan 13, 2017

Member

We probably want an alias in the ceph namespace for intrusive_ptr the same way we did for shared_ptr, so we can switch it around as they add it to the STL or use it with alternative systems.

This comment has been minimized.

@liewegas

liewegas Jan 13, 2017

Member

will do that in a follow-on cleanup pr

@liewegas

This comment has been minimized.

Member

liewegas commented Jan 13, 2017

My guess is it was the allocations (vector of events, intrusive_list). It's been weeks, though, so I don't remember which change was responsible for what.

@liewegas

This comment has been minimized.

Member

liewegas commented Jan 13, 2017

rebased

@gregsfortytwo

Reviewed-by: Greg Farnum gfarnum@redhat.com

Not sure if you were ready to merge this or not @liewegas, but it looks good.

@liewegas

This comment has been minimized.

Member

liewegas commented Jan 14, 2017

@liewegas

This comment has been minimized.

Member

liewegas commented Jan 14, 2017

unsigned bits) {
for (auto i = from->begin(); i != from->end(); ) {
if (split_request(&(*i), match, bits)) {
to->push_back(*i);

This comment has been minimized.

@tchaikov

tchaikov Jan 16, 2017

Contributor

could just use

to->splice(to->end(), *from, i++);

instead of doing push_back() and erase().

This comment has been minimized.

@liewegas

liewegas Jan 16, 2017

Member

split won't work i think because it's dependent on split_request. this code is broken, though, because an op can only be on one intrusive list at a time; fixing

@@ -1537,7 +1549,7 @@ class OSD : public Dispatcher,
Mutex::Locker l(session->session_dispatch_lock);
clear_session_waiting_on_map(session);
for (map<spg_t, list<OpRequestRef> >::const_iterator i =
for (map<spg_t, boost::intrusive::list<OpRequest> >::const_iterator i =

This comment has been minimized.

@tchaikov

tchaikov Jan 16, 2017

Contributor

maybe auto

f->dump_int("num to keep", history_size);
f->dump_int("duration to keep", history_duration);
f->open_object_section("op_history");
f->dump_int("history_size", history_size);

This comment has been minimized.

@tchaikov

tchaikov Jan 16, 2017

Contributor

nit, the prefix of "history" is a little bit redundant.

@yuriw

This comment has been minimized.

Contributor

yuriw commented Jan 17, 2017

@liewegas I can't build notcmalloc xenial =>

/build/ceph-11.1.0-6840-g9a89891/src/osd/OpRequest.cc:19:31: fatal error: tracing/oprequest.h: No such file or directory #include "tracing/oprequest.h"

https://jenkins.ceph.com/job/ceph-dev-new-build/ARCH=x86_64,AVAILABLE_ARCH=x86_64,AVAILABLE_DIST=trusty,DIST=trusty,MACHINE_SIZE=huge/560/console

@liewegas

This comment has been minimized.

Member

liewegas commented Jan 22, 2017

this is responsible for the valgrind errors, see e.g.
/a/sage-2017-01-21_02:26:55-rados-wip-sage-testing---basic-smithi/733623

liewegas added some commits Dec 14, 2016

messages/MLock: drop unnecessary #include
Signed-off-by: Sage Weil <sage@redhat.com>
mds/Mutation: make Mutation a TrackedOp
This seems silly now, but is needed in order to switch from shared_ptr to
intrusive_ptr.  The TrackOp is the refcounted thing, and we want to be
able to cast beween MutationRefs and MDRequestRefs, so... we need to make
sure the refcounting is done via a common parent, TrackedOp.

Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: use explicit ref count, intrusive_ptr
This is more efficient and more explicit than shared_ptr.  It will also
allow us to put these in intrusive lists.

Note that before we used the shared_ptr OnDelete property only for the
live ops and left it off for the history.  Here we rename is_tracked to
state and explicit note whether we are untracked, live, or history.

Also, now that we #include OpRequest in osd_types.h, we have to include
the .cc file (and ~OpRequest definition) in libcommon to avoid a link
error on all the unit tests etc.

Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: xlist -> intrusive::list
Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: make TrackedOp an intrusive::list::list_base_hook<>
This lets us put these in efficient lists.  Note that callers must take
care to bump the ref count manually!

Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: add get() and put()
Signed-off-by: Sage Weil <sage@redhat.com>
osd: use intrusive list for session op queue
We can avoid these list<> allocations in the fast dispatch path.

Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: drop unneeded _mark_event; pass c string
Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: add const char* and string variants of event
This avoids the full std::string (copy) for most events.  Wrap it up
into an Event class to hide some of the guts.

Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: fix formatter output
Spaces not allowed.  Lowercase by convention.

Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: use preallocated vector for events
This avoids an allocation per mark_event() call.

Signed-off-by: Sage Weil <sage@redhat.com>

liewegas added some commits Dec 16, 2016

common/TrackedOp: cache op description
Only calculate it once.  Invalidate from OSD after a full decode (when it
changes).

Signed-off-by: Sage Weil <sage@redhat.com>
osd/OpRequest: drop useless helpers
No callers!

Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: take option time arg
Fix the callers to use this instead of calling into Tracker directly.

Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: manage current in parent class
No reason for child to muck with this!

Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: some cleanup
Signed-off-by: Sage Weil <sage@redhat.com>
common/TrackedOp: log at level 6
This is above the default log level.  I have yet to encounter a situation
where the optracker output was helpful during a crash, and this saves
a significant amount of cpu time (~2-3% in my test).

Signed-off-by: Sage Weil <sage@redhat.com>

@liewegas liewegas merged commit dc2330b into ceph:master Jan 31, 2017

3 checks passed

Signed-off-by all commits in this PR are signed
Details
Unmodifed Submodules submodules for project are unmodified
Details
default Build finished.
Details

@liewegas liewegas deleted the liewegas:wip-optracker branch Jan 31, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment