[fix](be) Keep prefetch reader alive for async tasks #63796
Merged
### What problem does this PR solve?

Issue Number: close apache#25509

Related PR: apache#61248

Problem Summary: S3/OSS prefetch timeout can cancel and close PrefetchBufferedReader while an async PrefetchBuffer task is still running. The task kept PrefetchBuffer alive but only stored the underlying FileReader as a raw pointer, so the owner could destroy the reader before the async task resumed on the error path and logged reader metadata. Keep a shared FileReader reference in each PrefetchBuffer so the async prefetch task cannot outlive the reader it dereferences, and add a unit test that covers close timeout while the prefetch read is blocked.

### Release note

None

### Check List (For Author)

- Test: git diff --check; git diff --cached --check; attempted build-support/clang-format.sh and build-support/check-format.sh, but llvm@16/clang-format is not installed; attempted ./run-be-ut.sh --run buffered_reader_test, but it failed before running tests because JAVA_HOME points to JDK 11 and JDK_17 is not set
- Behavior changed: No
- Does this need documentation: No
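The ownership change described in the Problem Summary can be illustrated with a minimal sketch. It is not the Doris code: `FakeFileReader`, `FakePrefetchBuffer`, the placeholder path `s3://bucket/key`, and the function name are hypothetical, and the pending async task is modeled as a stored callable that runs after the owner has dropped its references, rather than as a real thread-pool task.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the underlying file reader (not the Doris class).
struct FakeFileReader {
    std::string path;
};

// Hypothetical stand-in for a prefetch buffer that co-owns its reader
// through a shared_ptr instead of holding a raw pointer.
struct FakePrefetchBuffer {
    explicit FakePrefetchBuffer(std::shared_ptr<FakeFileReader> r) : reader(std::move(r)) {}

    // Models the error path that logs reader metadata.
    std::string describe_failure() const { return "prefetch failed for " + reader->path; }

    std::shared_ptr<FakeFileReader> reader;
};

// The owner drops its references before the pending task resumes.
std::string simulate_close_before_task_resumes() {
    auto reader = std::make_shared<FakeFileReader>(FakeFileReader{"s3://bucket/key"});
    std::weak_ptr<FakeFileReader> observer = reader;
    auto buffer = std::make_shared<FakePrefetchBuffer>(reader);

    // The pending task keeps the buffer alive by capturing its shared_ptr.
    std::function<std::string()> pending_task = [buffer] { return buffer->describe_failure(); };

    reader.reset();  // owner closes and releases the reader
    buffer.reset();  // owner releases the buffer
    assert(!observer.expired());  // the task's buffer still holds the reader

    std::string message = pending_task();  // safe: the reader is still alive
    pending_task = nullptr;                // task finished; last references go away
    assert(observer.expired());
    return message;
}
```

With a raw `FakeFileReader*` member instead of the `shared_ptr`, the call to `describe_failure()` in this sketch would dereference a destroyed object once the owner had released the reader.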
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Contributor
Author
/review
Contributor
Review Summary
No blocking findings found in this PR. The change keeps the underlying FileReader alive through pending prefetch tasks by storing a FileReaderSPtr in each PrefetchBuffer, which matches the reported async lifetime failure path.
Critical checkpoint conclusions:
- Goal and tests: The PR addresses the close-timeout/use-after-destroy path for async prefetch tasks, and adds a BE unit test covering a blocked prefetch read followed by close timeout.
- Scope: The implementation is small and focused on ownership lifetime plus the targeted test.
- Concurrency: The touched path is concurrent between query/close threads and the prefetch thread pool. Existing mutex/condition-variable state transitions are preserved; the new shared ownership only extends reader lifetime and does not add a new lock order.
- Lifecycle: The main lifecycle concern is the async task outliving PrefetchBufferedReader; the shared pointer in PrefetchBuffer covers the reader dereferences after owner destruction. No static/global initialization concern was introduced.
- Configuration: No new configuration item was added. The test temporarily changes buffered_reader_read_timeout_ms and restores it.
- Compatibility: No storage format, RPC, function-symbol, or FE/BE protocol compatibility impact found.
- Parallel paths: The modified constructor is the only PrefetchBuffer creation path found.
- Conditional checks: No new special conditional behavior was introduced in production code.
- Test coverage: Coverage is targeted to the reported lifetime race; I did not identify an additional missing negative case that should block this PR.
- Test results: I reviewed the added test logic but did not run BE UT in this review environment.
- Observability: Existing warning logs on prefetch failure retain path/offset/status context; no new observability appears required.
- Transaction/persistence/data-write correctness: Not applicable to this IO reader lifetime change.
- FE/BE variable passing: Not applicable.
- Performance: The extra shared pointer copies are bounded by the small prefetch-buffer count and are not on the per-read hot copy path; no material performance issue found.
User focus points: No additional user-provided review focus was present.
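The Configuration checkpoint notes that the test temporarily changes buffered_reader_read_timeout_ms and restores it. One common way to make such a restore robust is a scope guard, sketched below. This is an assumption about technique, not a description of how the PR's test is written: `ScopedOverride`, `fake_read_timeout_ms`, its default of 10000, and `timeout_inside_scope` are all hypothetical names and values.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for a mutable config value (not the Doris config object).
std::int64_t fake_read_timeout_ms = 10000;

// Overrides a value for the lifetime of the guard and restores the saved
// value on destruction, including when the scope exits early.
template <typename T>
class ScopedOverride {
public:
    ScopedOverride(T& target, T value) : target_(target), saved_(target) {
        target_ = std::move(value);
    }
    ~ScopedOverride() { target_ = std::move(saved_); }

    ScopedOverride(const ScopedOverride&) = delete;
    ScopedOverride& operator=(const ScopedOverride&) = delete;

private:
    T& target_;
    T saved_;
};

// Returns the value observed while the override is active.
std::int64_t timeout_inside_scope(std::int64_t temporary) {
    ScopedOverride<std::int64_t> guard(fake_read_timeout_ms, temporary);
    return fake_read_timeout_ms;
}
```

Because the restore happens in the destructor, a failing assertion that unwinds the test body would still leave the value as it was for later tests.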
Contributor
Author
run buildall
Contributor
TPC-H: Total hot run time: 32013 ms
Contributor
TPC-DS: Total hot run time: 172326 ms
Contributor
BE Regression && UT Coverage Report
Increment line coverage
Increment coverage report
morningman approved these changes on May 29, 2026
Contributor
PR approved by at least one committer and no changes requested.
Contributor
PR approved by anyone and no changes requested.
github-actions Bot pushed a commit that referenced this pull request on May 29, 2026
yiguolei pushed a commit that referenced this pull request on May 29, 2026