[TIEREDSTORAGE] Only seek when reading unexpected entry #5356

Merged
merged 3 commits into from Oct 30, 2019

Contributor

ivankelly commented Oct 10, 2019

The normal pattern for reading from an offloaded ledger is that the
reader will read the ledger sequentially from start to end. This means
that once a user reads an entry, we should expect that the next entry
they read will be the next entry in the ledger.

The initial implementation of the BlobStoreBackedReadHandleImpl (and
the S3 variant that preceded it) didn't take this into
account. Instead it did a lookup in the index each time, to find the
block that contained the entry, and then read forward in the block
until it found the entry requested. This is fine for the first few
entries in the block, but costly for the last ones, because every
read scans forward again from the start of the block.

This PR changes the read behaviour to seek only if the entryId read
from the block is either:

  • greater than the entry we were expecting to read, in which case we
    need to seek backwards in the block.
  • less than the entry expected, but also belonging to a different
    block to the expected entry, in which case we need to seek to the
    correct block.
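
The two conditions above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the class `SeekDecisionSketch`, the methods `needsSeek` and `blockOffsetFor`, and the in-memory `TreeMap` index are all hypothetical stand-ins for the real offload index.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the seek rule; not the actual BlobStoreBackedReadHandleImpl. */
final class SeekDecisionSketch {
    // Assumed index shape: first entry id of each block -> byte offset of that block.
    private final TreeMap<Long, Long> blockIndex = new TreeMap<>();

    SeekDecisionSketch(Map<Long, Long> firstEntryToOffset) {
        blockIndex.putAll(firstEntryToOffset);
    }

    /** Offset of the block holding entryId: the greatest first-entry key <= entryId. */
    long blockOffsetFor(long entryId) {
        Map.Entry<Long, Long> block = blockIndex.floorEntry(entryId);
        if (block == null) {
            throw new IllegalArgumentException("No block contains entry " + entryId);
        }
        return block.getValue();
    }

    /**
     * Decide whether a seek is needed after reading an entry header from the stream.
     *
     * @param entryRead     entry id actually found at the current stream position
     * @param entryExpected entry id the caller asked for
     */
    boolean needsSeek(long entryRead, long entryExpected) {
        if (entryRead == entryExpected) {
            return false; // sequential read: already in the right place
        }
        if (entryRead > entryExpected) {
            return true; // overshot: must seek backwards
        }
        // Behind the expected entry: seek only if it lives in a different block,
        // otherwise keep reading forward within the current block.
        return blockOffsetFor(entryRead) != blockOffsetFor(entryExpected);
    }
}
```

With an index of blocks starting at entries 0, 100 and 200, reading entry 3 while expecting entry 5 would not seek (same block, keep reading forward), while reading entry 50 while expecting entry 150 would.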

This change improves read performance significantly. Ad hoc benchmarks
show that we can read from offloaded topics at ~160MB/s whereas
previously we could only manage <10MB/s.

@@ -116,7 +116,7 @@ public int read(byte[] b, int off, int len) throws IOException {

     @Override
     public void seek(long position) {
-        log.debug("Seeking to {} on {}/{}, current position {}", position, bucket, key, cursor);
+        log.info("Seeking to {} on {}/{}, current position {} (bufStart:{}, bufEnd:{})", position, bucket, key, cursor, bufferOffsetStart, bufferOffsetEnd);

sijie Oct 10, 2019

Contributor

Can this be annoying?

ivankelly Oct 10, 2019

Author Contributor

ah yes, should be debug

sijie Oct 14, 2019

Contributor

@ivankelly can you change it to debug please?

skyrocknroll Oct 17, 2019

Contributor

@ivankelly This is a very important change we are rooting for in 2.4.2. Merging this would be great.

Contributor

aahmed-se commented Oct 23, 2019

@ivankelly can you address the comments

sijie added 2 commits Oct 24, 2019
sijie approved these changes Oct 24, 2019
Contributor

sijie commented Oct 24, 2019

I reverted info logging to debug logging. once it passes CI, it is ready to merge.

Member

wolfstudy commented Oct 30, 2019

run cpp tests

wolfstudy merged commit 43bc790 into apache:master Oct 30, 2019
3 checks passed
Jenkins: C++ / Python Tests SUCCESS
Jenkins: Integration Tests SUCCESS
Jenkins: Java 8 - Unit Tests SUCCESS