Introduce Offset Index #1376

sijie · 2018-04-30T16:51:24Z

FEATURE REQUEST

Please describe the feature you are requesting.

Currently, entries in a ledger are indexed only by entry id. It would be good to also have an index by byte offset. This would allow supporting APIs like:

readEntries(long startEntryId, int maxBytes)
readEntries(long startOffset, long endOffset)
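For illustration, one way such an index could work is to record the starting byte offset of each entry and binary-search it. This is only a minimal in-memory sketch: `OffsetIndexSketch` and its methods are hypothetical names, not existing BookKeeper APIs, and it assumes every entry has a non-zero length.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory offset index for one ledger (not a BookKeeper API). */
final class OffsetIndexSketch {
    // startOffsets.get(i) is the byte offset at which entry i begins.
    private final List<Long> startOffsets = new ArrayList<>();
    private long totalBytes = 0;

    /** Records an entry of the given length and returns its entry id. */
    long append(int entryLength) {
        // Assumption: empty entries are rejected so that start offsets stay unique.
        if (entryLength <= 0) {
            throw new IllegalArgumentException("entryLength must be positive");
        }
        startOffsets.add(totalBytes);
        totalBytes += entryLength;
        return startOffsets.size() - 1;
    }

    /** Returns the id of the entry that contains the byte at the given offset. */
    long entryIdForOffset(long offset) {
        if (offset < 0 || offset >= totalBytes) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
        int pos = Collections.binarySearch(startOffsets, Long.valueOf(offset));
        // Exact hit: offset is the first byte of entry pos.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the
        // containing entry is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Returns the byte offset at which the given entry begins. */
    long startOffsetOf(long entryId) {
        return startOffsets.get((int) entryId);
    }

    /** Returns the total number of bytes recorded so far. */
    long totalBytes() {
        return totalBytes;
    }
}
```

With such a lookup, an offset-range read could be translated into an entry-id range before issuing the existing entry-based reads.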

Indicate the importance of this issue to you (blocker, must-have, should-have, nice-to-have). Are you currently using any workarounds to address this issue?

nice-to-have

Provide any additional detail on your proposed use case for this feature.

An entry-oriented (per-request) API is not resource-friendly when prefetching or batching reads, because the size of the data a request returns is not known up front. An offset-oriented API is much better for estimating resource usage.
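As a rough sketch of the resource-usage point, a byte budget lets a reader decide how many entries to batch once entry sizes are known. `ByteBudgetBatchSketch` and `entriesWithinBudget` are hypothetical names, the entry lengths are assumed to come from some offset index, and admitting an oversized first entry is an assumption of this sketch rather than anything specified here.

```java
/** Hypothetical helper showing batching under a byte budget (not a BookKeeper API). */
final class ByteBudgetBatchSketch {

    /**
     * Returns how many consecutive entries, starting at startEntryId,
     * fit within maxBytes given the known length of each entry.
     */
    static int entriesWithinBudget(long[] entryLengths, int startEntryId, long maxBytes) {
        int count = 0;
        long used = 0;
        for (int id = startEntryId; id < entryLengths.length; id++) {
            long next = used + entryLengths[id];
            // Assumption: always admit the first entry so that a single
            // oversized entry cannot stall the reader.
            if (count > 0 && next > maxBytes) {
                break;
            }
            used = next;
            count++;
        }
        return count;
    }
}
```

With only entry ids available, the same decision would require fetching the entries first, which is the cost the offset index is meant to avoid.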

eolivelli · 2018-04-30T17:35:58Z

It would be great for storage use cases, like reading a batch of sequential entries with a random access pattern (as opposed to tailing reads).

eolivelli · 2018-04-30T17:36:26Z

A new wire protocol RPC would be great as well.

Tielem · 2019-02-13T15:00:51Z

FEATURE REQUEST

Please describe the feature you are requesting.

We create continuous streams of growing data, e.g. ledgers with entities.
However, we also require random access to the underlying data stream.

Proposed APIs:

getLastAddConfirmedByteOffset(): long

Returns the byte offset of the last byte of the last entry committed to the ledger.

readBytes(long startOffset, long endOffset): Enumerator<Byte[]>

Reads all bytes between startOffset and endOffset (inclusive), returned per stored entry.
If endOffset is beyond the end of the ledger, the behavior should be the same as for readEntries.
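To make the inclusive-range, per-entry semantics concrete, here is a small sketch over an in-memory list of entries. `ReadBytesSketch` is a hypothetical name, the linear scan stands in for a real offset index, and simply stopping at the last entry when endOffset is past the end is a placeholder, not the readEntries behavior referred to above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of readBytes(startOffset, endOffset); not a BookKeeper API. */
final class ReadBytesSketch {

    /** Returns the bytes in [startOffset, endOffset], one array per overlapping entry. */
    static List<byte[]> readBytes(List<byte[]> entries, long startOffset, long endOffset) {
        if (startOffset < 0 || endOffset < startOffset) {
            throw new IllegalArgumentException("invalid range");
        }
        List<byte[]> out = new ArrayList<>();
        long entryStart = 0; // byte offset of the first byte of the current entry
        for (byte[] entry : entries) {
            long entryEnd = entryStart + entry.length - 1; // inclusive
            boolean overlaps = entry.length > 0
                    && entryEnd >= startOffset
                    && entryStart <= endOffset;
            if (overlaps) {
                // Clamp the requested range to this entry, relative to its start.
                int from = (int) (Math.max(startOffset, entryStart) - entryStart);
                int to = (int) (Math.min(endOffset, entryEnd) - entryStart) + 1; // exclusive
                out.add(Arrays.copyOfRange(entry, from, to));
            }
            entryStart += entry.length;
            if (entryStart > endOffset) {
                break; // every later entry starts after the requested range
            }
        }
        return out;
    }
}
```

For entries `{1,2,3}`, `{4,5}`, `{6,7,8,9}` and the range 1 to 5, this yields three arrays: the tail of the first entry, the whole second entry, and the first byte of the third.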

readBytes(long startOffset): Enumerator<Byte[]>

Reads all bytes from startOffset to the current confirmed end of the ledger, returned per stored entry.

Indicate the importance of this issue to you (blocker, must-have, should-have, nice-to-have). Are you currently using any workarounds to address this issue?

Currently using a different solution to tackle this use case, with less durability/scalability/etc.
It would make our future architecture simpler and better overall.

Provide any additional detail on your proposed use case for this feature.

Uncommitted APIs

While not needed for our use case, uncommitted APIs might be good to have for API completeness.

Polling APIs

While not immediately needed for our use case (we can handle this with other polling mechanisms), it would be useful to be able to open a binary read on a ledger:
starting at a given startOffset, keep receiving byte[] until either the handler is closed or endOffset is reached.
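The polling behavior described above could be sketched as a small state machine that is offered each newly confirmed entry and hands back only the bytes still owed to the caller. `PollingReadSketch`, `onEntry`, and `isDone` are hypothetical names, and the sketch assumes entries are offered in order together with their starting byte offsets.

```java
import java.util.Arrays;

/** Hypothetical polling reader state; none of these names exist in BookKeeper. */
final class PollingReadSketch implements AutoCloseable {
    private final long endOffset; // inclusive upper bound of the read
    private long nextOffset;      // next byte offset not yet delivered
    private boolean closed;

    PollingReadSketch(long startOffset, long endOffset) {
        if (startOffset < 0 || endOffset < startOffset) {
            throw new IllegalArgumentException("invalid range");
        }
        this.nextOffset = startOffset;
        this.endOffset = endOffset;
    }

    /** True once the reader was closed or every byte up to endOffset was delivered. */
    boolean isDone() {
        return closed || nextOffset > endOffset;
    }

    /**
     * Offers a newly confirmed entry that begins at entryStartOffset.
     * Returns the part of it inside [nextOffset, endOffset], or null if nothing is due.
     */
    byte[] onEntry(long entryStartOffset, byte[] entry) {
        long entryEnd = entryStartOffset + entry.length - 1; // inclusive
        if (isDone() || entry.length == 0
                || entryEnd < nextOffset || entryStartOffset > endOffset) {
            return null;
        }
        int from = (int) (Math.max(nextOffset, entryStartOffset) - entryStartOffset);
        int to = (int) (Math.min(endOffset, entryEnd) - entryStartOffset) + 1; // exclusive
        nextOffset = entryStartOffset + to;
        return Arrays.copyOfRange(entry, from, to);
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

A real implementation would also need a way to wait for new entries to be confirmed; that part is left out here.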

sijie added the type/feature, area/protocol, area/client, and area/bookie labels Apr 30, 2018

sijie mentioned this issue May 17, 2018

LedgerEntry#getLength does not do what the documentation says #1411

Open

ivankelly added the triage/week-10 label Mar 8, 2019

sijie mentioned this issue Jan 15, 2020

ISSUE-1376: Introduce Offset Index streamnative/bookkeeper-achieved#65

Open
