Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Save TLog resources by letting peek request only spilled data. #1584

Merged
merged 5 commits into from
Jun 21, 2019

Conversation

alexmiller-apple
Copy link
Contributor

@alexmiller-apple alexmiller-apple commented May 15, 2019

A profile on mighty showed that we're spending ~10%-15% of CPU doing peekMessagesFromMemory, which are largely going to be for spilled pieces of data, and thus completely wasted.

This involves a protocol change, so it's not 6.1 patchable.

There's another way to go with this, which would be to do this as a heuristic entirely on the TLog, via looking at if the request is for a begin that's within N million versions of the current version, and then taking endVersion an knownDurableVersion from the peek reply, and doing the same heuristic on the client side. That feels potentially trickier to feel comfortable with, but would be 6.1 patchable instead.

If a peek is entirely fulfilled from spilled data, then it's likely that
the next peek will be also.  It is thus wasteful for each of these peeks
to call peekMessagesFromMemory, which memcpy's excessively, and then
throw all that data away without using it.

Now, TLogs will give a hint back to peek cursors about if the provided
reply was served entirely from the spilled data, which peek curors then
feed back as the hint into their next request.

At some point, a cursor will send a request for only spilled data, get
an incomplete response, and then be told to send its next request as one
that peeks from memory as well, and then it will fully catch up.
If there's some spilled data, there's probably a lot of spilled data,
and now we can pull all of it faster.
@alexmiller-apple
Copy link
Contributor Author

alexmiller-apple commented May 15, 2019

I realized I was down to a one-line diff to address #1529 ParallelPeekGetMore when we've peeked only spilled data.

I haven't fully thought through if this is a good idea in all cases, but it does pass correctness ¯\_(ツ)_/¯

@alexmiller-apple
Copy link
Contributor Author

@fdb-build, test this please

@@ -158,6 +158,7 @@ ACTOR Future<Void> serverPeekParallelGetMore( ILogSystem::ServerPeekCursor* self
expectedBegin = res.end;
self->futureResults.pop_front();
self->results = res;
self->onlySpilled = res.onlySpilled;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When switching from parallelGetMore back to regular GetMore, we still want to process the requests we already have outstanding to the TLogs to both avoid wasting work by the logs and to avoid an oscillation where the tlog spills more just as we stop using parallelGetMore. I think this can be done by uses the size of futureResults as another indicator that this function should be called, and then avoid issuing more requests is onlySpilled==false

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants