Introduce Offset Index #1376

Open
sijie opened this issue Apr 30, 2018 · 3 comments

@sijie
Member

sijie commented Apr 30, 2018

FEATURE REQUEST

  1. Please describe the feature you are requesting.

Currently, entries in a ledger are indexed by entry id. It would be good to also have an index by offset. This would allow supporting APIs like:

  • readEntries(long startEntryId, int maxBytes)
  • readEntries(long startOffset, long endOffset)

  2. Indicate the importance of this issue to you (blocker, must-have, should-have, nice-to-have). Are you currently using any workarounds to address this issue?

nice-to-have

  3. Provide any additional detail on your proposed use case for this feature.

An entry-oriented (per-request) API makes resource usage hard to control when prefetching or batching reads. An offset-oriented API is much better for estimating resource usage.
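
To make the idea concrete, here is a minimal in-memory sketch of what such an index could look like. It is only an illustration: `OffsetIndex`, `append`, `entryIdForOffset` and `lastEntryWithinBudget` are hypothetical names, not existing BookKeeper APIs, and the choice to always return at least one entry when the first entry exceeds `maxBytes` is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory offset index; not part of BookKeeper. */
final class OffsetIndex {
    // endOffsets.get(i) is the exclusive end offset of entry i,
    // i.e. the total size in bytes of entries 0..i.
    private final List<Long> endOffsets = new ArrayList<>();

    /** Records a newly appended entry and returns its entry id. */
    long append(int entrySizeBytes) {
        if (entrySizeBytes < 0) {
            throw new IllegalArgumentException("negative entry size");
        }
        long prev = endOffsets.isEmpty() ? 0L : endOffsets.get(endOffsets.size() - 1);
        endOffsets.add(prev + entrySizeBytes);
        return endOffsets.size() - 1;
    }

    /** Byte offset at which the given entry starts. */
    long startOffsetOf(long entryId) {
        if (entryId < 0 || entryId >= endOffsets.size()) {
            throw new IndexOutOfBoundsException("unknown entry id " + entryId);
        }
        return entryId == 0 ? 0L : endOffsets.get((int) (entryId - 1));
    }

    /** Id of the entry containing the byte at this offset, or -1 if out of range. */
    long entryIdForOffset(long offset) {
        if (offset < 0) {
            return -1;
        }
        int idx = firstEndGreaterThan(offset);
        return idx < endOffsets.size() ? idx : -1;
    }

    /**
     * Last entry id (inclusive) that can be read starting at startEntryId
     * without exceeding maxBytes. Assumption: at least one entry is always
     * returned so that a reader makes progress.
     */
    long lastEntryWithinBudget(long startEntryId, int maxBytes) {
        long budgetEnd = startOffsetOf(startEntryId) + maxBytes;
        int idx = firstEndGreaterThan(budgetEnd);
        return Math.max(startEntryId, idx - 1);
    }

    // Binary search: index of the first entry whose end offset is > value.
    private int firstEndGreaterThan(long value) {
        int lo = 0;
        int hi = endOffsets.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (endOffsets.get(mid) > value) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }
}
```

With such an index, readEntries(startEntryId, maxBytes) reduces to computing `lastEntryWithinBudget` and issuing an ordinary entry-id range read, and readEntries(startOffset, endOffset) reduces to two `entryIdForOffset` lookups.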

@eolivelli
Contributor

It would be great for storage use cases, like reading a batch of sequential entries with a random access pattern (as opposed to tailing reads).

@eolivelli
Contributor

A new wire protocol RPC would be great as well.

@Tielem

Tielem commented Feb 13, 2019

FEATURE REQUEST

  1. Please describe the feature you are requesting.

We create continuous streams of growing data, e.g. ledgers with entities.
However, we also require random access to the underlying data stream.

Proposed APIs:

  • getLastAddConfirmedByteOffset(): long

Returns the byte offset of the last byte of the last entry committed to the ledger.

  • readBytes(long startOffset, long endOffset): Enumerator<Byte[]>

Reads all bytes between startOffset and endOffset (inclusive), returned per stored entry.
In case the endOffset is beyond the end of the ledger, the behavior should be the same as readEntries.

  • readBytes(long startOffset): Enumerator<Byte[]>

Reads all bytes from startOffset to current confirmed end of ledger, returned per stored entry.
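
The semantics proposed above could be sketched roughly as follows, using an in-memory stand-in for a ledger. Everything here is an assumption for illustration: `InMemoryByteLedger` and `addEntry` are hypothetical, `List<byte[]>` stands in for `Enumerator<Byte[]>`, and clamping an `endOffset` beyond the end of the ledger is just one possible behavior that may differ from what readEntries does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical in-memory ledger used only to illustrate the proposed API. */
final class InMemoryByteLedger {
    private final List<byte[]> entries = new ArrayList<>();
    private final List<Long> startOffsets = new ArrayList<>();
    private long totalBytes = 0;

    /** Appends an entry; in this sketch every added entry counts as confirmed. */
    void addEntry(byte[] data) {
        startOffsets.add(totalBytes);
        entries.add(data.clone());
        totalBytes += data.length;
    }

    /** Offset of the last byte of the last entry, or -1 for an empty ledger. */
    long getLastAddConfirmedByteOffset() {
        return totalBytes - 1;
    }

    /** Reads bytes in [startOffset, endOffset], returned per stored entry. */
    List<byte[]> readBytes(long startOffset, long endOffset) {
        List<byte[]> out = new ArrayList<>();
        // Assumption: an endOffset beyond the ledger end is clamped.
        long end = Math.min(endOffset, getLastAddConfirmedByteOffset());
        if (startOffset < 0 || startOffset > end) {
            return out;
        }
        // Linear scan for brevity; a real offset index would binary search.
        for (int i = 0; i < entries.size(); i++) {
            byte[] entry = entries.get(i);
            if (entry.length == 0) {
                continue; // empty entries contribute no bytes
            }
            long entryStart = startOffsets.get(i);
            long entryEnd = entryStart + entry.length - 1; // inclusive
            if (entryEnd < startOffset) {
                continue;
            }
            if (entryStart > end) {
                break;
            }
            // Trim the first and last entries to the requested range.
            int from = (int) (Math.max(startOffset, entryStart) - entryStart);
            int to = (int) (Math.min(end, entryEnd) - entryStart) + 1; // exclusive
            out.add(Arrays.copyOfRange(entry, from, to));
        }
        return out;
    }

    /** Reads from startOffset to the current confirmed end of the ledger. */
    List<byte[]> readBytes(long startOffset) {
        return readBytes(startOffset, Long.MAX_VALUE);
    }
}
```

For entries "abc", "defg" and "hi", a call to readBytes(1, 4) in this sketch yields two chunks, "bc" and "de", because results are split at entry boundaries.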

  2. Indicate the importance of this issue to you (blocker, must-have, should-have, nice-to-have). Are you currently using any workarounds to address this issue?

We currently use a different solution for this use case, with less durability, scalability, etc.
This feature would make our future architecture simpler and better overall.

  3. Provide any additional detail on your proposed use case for this feature.

  • Uncommitted APIs

While not needed for our use case, uncommitted variants of these APIs might be good to have for API completeness.

  • Polling APIs

While not immediately needed for our use case (we can handle this with other polling mechanisms), it would be useful to be able to open a continuous binary read on a ledger:
starting at a given startOffset, keep receiving byte[] until either the handler is closed or endOffset is reached.
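
A rough sketch of that polling behavior, under heavy simplification: the ledger is modeled as a flat byte array snapshot of the confirmed data, and `OffsetTailer`, `poll` and `close` are hypothetical names rather than any existing API.

```java
import java.util.Arrays;
import java.util.function.Consumer;

/** Hypothetical tailing reader over a byte-offset range; illustration only. */
final class OffsetTailer {
    private final long endOffset; // inclusive
    private long nextOffset;
    private boolean closed;

    OffsetTailer(long startOffset, long endOffset) {
        if (startOffset < 0 || endOffset < startOffset) {
            throw new IllegalArgumentException("invalid offset range");
        }
        this.nextOffset = startOffset;
        this.endOffset = endOffset;
    }

    /**
     * Delivers confirmed bytes that have not been delivered yet.
     * Returns true while more data may follow, false once the tailer
     * is closed or endOffset has been reached.
     */
    boolean poll(byte[] confirmedLedgerBytes, Consumer<byte[]> handler) {
        if (closed) {
            return false;
        }
        long lastReadable = Math.min(endOffset, confirmedLedgerBytes.length - 1L);
        if (nextOffset <= lastReadable) {
            handler.accept(Arrays.copyOfRange(
                    confirmedLedgerBytes, (int) nextOffset, (int) (lastReadable + 1)));
            nextOffset = lastReadable + 1;
        }
        if (nextOffset > endOffset) {
            closed = true; // the requested range is fully delivered
        }
        return !closed;
    }

    /** Stops delivery; later polls deliver nothing. */
    void close() {
        closed = true;
    }
}
```

A caller would invoke `poll` whenever the confirmed end of the ledger advances; each call hands over only the bytes added since the previous call.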
