Fix the read performance issue in the offload readAsync #12443
(cherry picked from commit b4d05ac)
Motivation
In #12123, I added a seek operation to the readAsync method.
It makes sure the data stream always seeks to the position of the first entry
to read, which avoids an EOF exception.
However, an offload index entry groups a set of entries into a range,
so the seek operation moves the position to the first entry in that range.
This introduces a performance issue, because every read operation then
reads from the first entry in the range until it finds the first entry that
was actually requested.
If we simply remove the seek operation, the readAsync method throws an EOF
exception again. This PR limits when the seek operation is performed.
Modifications
Add an available method to the backedInputStream to find out how many bytes
we can still read from the stream.
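To illustrate the idea, here is a minimal, self-contained sketch. It is not the code from this PR: the class `SeekLimitSketch`, the fixed entry layout, the index granularity of 100 entries, and the rule used to decide when to seek are all assumptions made for the example. It only shows why an unconditional seek to the start of an index range rescans entries, and how a check based on `available()` can skip the seek while still avoiding a read past the end of the stream.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the Pulsar code base. */
final class SeekLimitSketch {
    static final int ENTRIES_PER_RANGE = 100;              // assumed index granularity
    static final int HEADER = Integer.BYTES + Long.BYTES;  // [length][entryId]

    private final ByteBuffer data;   // stands in for the offloaded object
    private final int entryCount;
    private final int payloadSize;
    int headersScanned = 0;          // how many entry headers were read in total

    SeekLimitSketch(int entryCount, int payloadSize) {
        this.entryCount = entryCount;
        this.payloadSize = payloadSize;
        data = ByteBuffer.allocate(entryCount * (HEADER + payloadSize));
        for (long id = 0; id < entryCount; id++) {
            data.putInt(payloadSize).putLong(id).put(new byte[payloadSize]);
        }
        data.position(0);
    }

    /** Bytes still readable from the current position. */
    int available() {
        return data.remaining();
    }

    /** Offset of the first entry of the index range that contains entryId. */
    private int rangeStartOffset(long entryId) {
        long firstInRange = (entryId / ENTRIES_PER_RANGE) * ENTRIES_PER_RANGE;
        return (int) (firstInRange * (HEADER + payloadSize));
    }

    /** True if the entry at the current position is in first's range and not past it. */
    private boolean usablePosition(long first) {
        long idHere = data.getLong(data.position() + Integer.BYTES); // peek, no advance
        return idHere <= first && idHere / ENTRIES_PER_RANGE == first / ENTRIES_PER_RANGE;
    }

    /** Reads entry ids first..last; with limitSeek the seek happens only when needed. */
    List<Long> read(long first, long last, boolean limitSeek) {
        if (first < 0 || last < first || last >= entryCount) {
            throw new IllegalArgumentException("invalid entry range");
        }
        // Seek when the stream is exhausted (would hit EOF) or positioned unusably.
        boolean mustSeek = !limitSeek || available() < HEADER || !usablePosition(first);
        if (mustSeek) {
            data.position(rangeStartOffset(first));
        }
        List<Long> ids = new ArrayList<>();
        long next = first;
        while (next <= last) {
            int length = data.getInt();
            long id = data.getLong();
            headersScanned++;
            if (id == next) {
                ids.add(id);
                next++;
            }
            data.position(data.position() + length); // skip the payload
        }
        return ids;
    }
}
```

In this sketch, two consecutive reads of entries 150 to 159 and 160 to 169 both rescan from entry 100 when the seek is unconditional. With the limited seek, the second read continues from the position the first read left behind.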
Verifying this change
(Please pick either of the following options)
This change is a trivial rework / code cleanup without any test coverage.
(or)
This change is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Does this pull request potentially affect one of the following parts:
If yes was chosen, please highlight the changes
Documentation
Check the box below and label this PR (if you have committer privilege).
Need to update docs?
doc-required
(If you need help on updating docs, create a doc issue)
no-need-doc
(Please explain why)
doc
(If this PR contains doc changes)