common/OpHistory: move insert/cleanup into separate thread #20540
Replace push_back with explicit constructor with emplace_back for minor perf increase. Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
No need to do this twice. Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
src/common/TrackedOp.h
Outdated
_break_thread(false) { }

void BreakThread();
void InsertOp(utime_t& now, TrackedOpRef op);
lower_case not CamelCaps for these please
Oops, forgot.
src/common/TrackedOp.cc
Outdated
return;

opsvc.InsertOp(now, op);
}
would it make sense to put this function in the header so the bool check gets inlined? that'll let us skip one additional stack frame/function call
I'll check. Maybe it'll also make sense to inline OpServiceThread::insert_op().
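To illustrate the inlining suggestion above, here is a minimal sketch in which both the shutdown check and the queue insert are defined in the class body, so they are implicitly inline and the compiler may fold the check into the call site. `History`, `HistoryService`, `Timestamp` and `OpRef` are hypothetical stand-ins, not the actual Ceph types.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical stand-ins for utime_t and TrackedOpRef.
using Timestamp = double;
using OpRef = int;

class HistoryService {
  std::deque<std::pair<Timestamp, OpRef>> queue_;
public:
  // Defined in the class body, hence implicitly inline.
  void insert_op(const Timestamp& now, OpRef op) {
    queue_.emplace_back(now, op);
  }
  std::size_t size() const { return queue_.size(); }
};

class History {
  std::atomic<bool> shutdown_{false};
  HistoryService svc_;
public:
  // Header-defined fast path: the flag check can be inlined at the
  // call site, so a shut-down history costs no extra function call.
  void insert(const Timestamp& now, OpRef op) {
    if (shutdown_.load(std::memory_order_acquire))
      return;
    svc_.insert_op(now, op);
  }
  void on_shutdown() { shutdown_.store(true, std::memory_order_release); }
  std::size_t queued() const { return svc_.size(); }
};
```

Whether the compiler actually inlines these calls depends on optimization settings; defining them in the header only makes it possible.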
awesome to see this mitigates most of the current optracker overhead! just a few style nits
Yep, good design but a few more nits. :)
src/common/TrackedOp.h
Outdated
~OpHistory() {
assert(arrived.empty());
assert(duration.empty());
assert(slow_op.empty());
}
void insert(utime_t now, TrackedOpRef op);
void insert(utime_t& now, TrackedOpRef op);
void _insert_delayed(utime_t& now, TrackedOpRef op);
Please make any pass-by-reference values const. (It's part of our style guide and assures users that the function won't modify their value.)
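As a small illustration of the const-reference rule above: a `const` reference parameter avoids the copy, promises the caller that the value is left untouched, and also accepts temporaries, which a non-const reference would reject. `Stamp` and `seconds_of` are hypothetical names used only for this sketch.

```cpp
// Hypothetical stand-in for a timestamp type such as utime_t.
struct Stamp {
  long sec;
};

// const reference: no copy is made, the caller's value cannot be
// modified through `now`, and temporaries bind to it as well.
inline long seconds_of(const Stamp& now) {
  return now.sec;
}
```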
src/common/TrackedOp.cc
Outdated
opsvc.InsertOp(now, op);
}

void OpHistory::_insert_delayed(utime_t& now, TrackedOpRef op)
Can we come up with a better name than _insert_delayed(), given that it's now finished being delayed? I think I'd prefer even just _insert() if nothing better suggests itself.
src/common/TrackedOp.cc
Outdated
void OpHistoryServiceThread::InsertOp(utime_t& now, TrackedOpRef op) {
queue_spinlock.lock();
_external_queue.emplace_back(now, op);
This allocates, right? Is using a mutex really not okay here?
I wanted threads to wait for lock to be freed without being preempted. See the graph, there's a slight difference in how it affects machinery performance.
4a368ab to 313a35c
gregsfortytwo wrote
branch-predictor wrote
Yeah, I get that, but there are reasons people tell you not to use spinlocks when memory allocation might happen. So I wonder if we can do some trick to allocate the memory and then put it on the end of the list. I'm probably worrying about it too much given the simple case presented here, though. (Or, probably out of scope here, but did you consider giving a separate queue to each OSD op thread and having the OpTracker one pick off the front of each? That would greatly reduce the sharing... I notice you emphasize the preserved ordering but the OpTracker worker could handle that by walking forward with timestamp comparisons, and I don't think it's that big a deal anyway.)
gregsfortytwo wrote
That's what I'm thinking too. I mean, sure - your concerns are perfectly valid, but it's not like I'm allocating megabytes or even kilobytes of data. Just a few hundred bytes max, which should be little enough to not get slowed down badly by the memory allocator.
Yeah, I've been thinking about it too, but at this point I think it's too early for that. Let's see how far this one takes us, and then let's optimize further. I already see a few possibilities to optimize it without complicating matters much.
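For reference, the "allocate first, then put it on the end of the list" idea discussed above could be sketched roughly as follows. This is not what the PR does; it assumes a `std::list`-backed queue so that `splice` can relink an already allocated node while the spinlock is held. `SpinLock`, `OpQueue`, `Timestamp` and `OpRef` are hypothetical names for illustration only.

```cpp
#include <atomic>
#include <list>
#include <mutex>
#include <utility>

// Hypothetical stand-ins for utime_t and TrackedOpRef.
using Timestamp = double;
using OpRef = int;
using Entry = std::pair<Timestamp, OpRef>;

// Minimal busy-wait lock; usable with std::lock_guard.
class SpinLock {
  std::atomic<bool> locked_{false};
public:
  void lock() {
    while (locked_.exchange(true, std::memory_order_acquire)) {
      // spin until the holder releases the lock
    }
  }
  void unlock() { locked_.store(false, std::memory_order_release); }
};

class OpQueue {
  SpinLock lock_;
  std::list<Entry> queue_;
public:
  void push(const Timestamp& now, OpRef op) {
    // The node is allocated here, outside the critical section.
    std::list<Entry> node;
    node.emplace_back(now, op);
    std::lock_guard<SpinLock> guard(lock_);
    // splice only relinks pointers; nothing is allocated under the lock.
    queue_.splice(queue_.end(), node);
  }

  // Hand everything queued so far to the consumer in one swap.
  std::list<Entry> drain() {
    std::list<Entry> out;
    std::lock_guard<SpinLock> guard(lock_);
    out.swap(queue_);
    return out;
  }
};
```

The trade-off is one heap node per op instead of the amortized growth of a contiguous container, so whether this helps would have to be measured.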
abbbcd2 to 52d8f09
Cluster that's flooded with incoming ops (and enabled optracker) is bottlenecked by OpHistory::insert. Reduce that by:
- pushing incoming ops into separate queue that'll be processed by separate thread.
- using std::atomic_bool for shutdown flag so ops_history_lock doesn't need to be taken as often
Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
It's unused anyway. Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
Now that it has its own processing thread, it must be shut down explicitly or it'll sigsegv randomly. Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
Reorder smaller fields around so they're aligning naturally, regaining a few bytes of storage. Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
52d8f09 to 136642d
Be nice to squash a few of those down, but looks good.
Eek, wrong button.
@liewegas @gregsfortytwo does this qualify for backport to luminous?
I don't have strong feelings either way. It probably qualifies but it's enough of a change I wouldn't do it immediately or casually, I guess?
Yeah, let's give it some time in master to make sure there isn't fallout before backporting
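Cluster that's flooded with incoming ops (and enabled optracker) is bottlenecked by OpHistory::insert. Reduce that by:
- pushing incoming ops into separate queue that'll be processed by separate thread.
- using std::atomic_bool for shutdown flag so ops_history_lock doesn't need to be taken as often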
My initial testing has shown this noticeably reduced optracker impact on cluster performance:
Using a separate thread ("threaded optracker") didn't improve things by much, and neither did replacing the OpHistorySvc thread mutex with a spinlock ("threaded optracker + spin"). Removing the ops_history_lock from the processing path (by either removing it entirely or replacing the shutdown bool flag with an atomic) did the trick: the optracker perf impact is still there, albeit much smaller.
Note that I intentionally used a spin loop with scaling sleep, as condition variables/signaling turned out to be too slow for this purpose and actually made it work much worse. A side effect of the scaling sleep is that it reduces the CPU time consumed by the OpHistorySvc thread, as it processes data in batches. This might add some latency before data shows up in OpHistory, up to around 128ms, but data is still guaranteed to go in FIFO order.
Signed-off-by: Piotr Dałek <piotr.dalek@corp.ovh.com>
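The polling loop with scaling sleep described above can be sketched roughly as follows. This is a simplified illustration, not the code from this PR: the 1 ms starting interval and the doubling policy are assumptions (only the roughly 128 ms upper bound comes from the description), a plain `std::mutex` stands in for the queue lock, and `BatchWorker` with its `int` payload is a hypothetical name.

```cpp
#include <algorithm>
#include <atomic>
#include <chrono>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

class BatchWorker {
  std::mutex mtx_;               // stand-in for the queue lock
  std::deque<int> incoming_;     // ops queued by producers
  std::vector<int> processed_;   // written only by the worker thread
  std::atomic<bool> stop_{false};
  std::thread thread_;           // declared last: starts after the rest

  void run() {
    using std::chrono::milliseconds;
    const milliseconds min_sleep(1), max_sleep(128);  // assumed bounds
    milliseconds sleep = min_sleep;
    for (;;) {
      // Read the flag before draining, so that ops pushed before
      // shutdown() are still picked up by the final pass.
      const bool stopping = stop_.load(std::memory_order_acquire);
      std::deque<int> batch;
      {
        std::lock_guard<std::mutex> guard(mtx_);
        batch.swap(incoming_);   // take the whole batch in one go
      }
      processed_.insert(processed_.end(), batch.begin(), batch.end());
      if (stopping)
        break;
      if (batch.empty()) {
        // Nothing arrived: back off, doubling up to the cap.
        std::this_thread::sleep_for(sleep);
        sleep = std::min(sleep * 2, max_sleep);
      } else {
        sleep = min_sleep;       // busy again: poll quickly
      }
    }
  }

public:
  BatchWorker() : thread_([this] { run(); }) {}
  ~BatchWorker() {
    if (thread_.joinable())
      shutdown();
  }

  void push(int op) {
    std::lock_guard<std::mutex> guard(mtx_);
    incoming_.push_back(op);
  }

  // Must be called (or the destructor run) before the object goes away.
  std::vector<int> shutdown() {
    stop_.store(true, std::memory_order_release);
    thread_.join();
    return processed_;
  }
};
```

Because the worker swaps out the whole queue each pass and appends it in order, FIFO ordering is preserved, and an idle worker wakes at most once per capped interval.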