-
Notifications
You must be signed in to change notification settings - Fork 3.5k
[enhancement](cloud) optimize warm up local IO and performance #50275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Check whether the data exists based on the cache metadata. If it exists, there is no need to read from the local file, which enhances warm-up performance and reduces disk I/O load. Signed-off-by: zhengyu <zhangzhengyu@selectdb.com>
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
Signed-off-by: zhengyu <zhangzhengyu@selectdb.com>
run buildall |
@@ -45,6 +45,8 @@ namespace doris::io { | |||
bvar::Adder<uint64_t> s3_read_counter("cached_remote_reader_s3_read"); | |||
bvar::LatencyRecorder g_skip_cache_num("cached_remote_reader_skip_cache_num"); | |||
bvar::Adder<uint64_t> g_skip_cache_sum("cached_remote_reader_skip_cache_sum"); | |||
bvar::Adder<uint64_t> g_skip_local_cache_io_sum_bytes( | |||
"cached_remote_reader_skip_local_cache_io_sum_bytes"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggest naming
cached_remote_reader_skip_local_cache_io_bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add sum will underline its cumulative attribute
@@ -229,6 +230,7 @@ void FileCacheBlockDownloader::download_segment_file(const DownloadFileMeta& met | |||
// TODO(plat1ko): | |||
// 1. Directly append buffer data to file cache | |||
// 2. Provide `FileReader::async_read()` interface | |||
meta.ctx.is_dryrun = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why meta.is_dryrun is not set by the caller?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
too many of them and is safer to set here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll set it by caller and use DCHECK here to ensure safety
@@ -110,6 +112,7 @@ std::pair<size_t, size_t> CachedRemoteFileReader::s_align_size(size_t offset, si | |||
|
|||
Status CachedRemoteFileReader::read_at_impl(size_t offset, Slice result, size_t* bytes_read, | |||
const IOContext* io_ctx) { | |||
bool is_dryrun = io_ctx->is_dryrun; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const bool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression P0 && UT Coverage ReportIncrement line coverage Increment coverage report
|
Signed-off-by: zhengyu <zhangzhengyu@selectdb.com>
run buildall |
TPC-H: Total hot run time: 34089 ms
|
TPC-DS: Total hot run time: 192694 ms
|
ClickBench: Total hot run time: 29.41 s
|
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
run buildall |
TPC-H: Total hot run time: 34030 ms
|
TPC-DS: Total hot run time: 185687 ms
|
ClickBench: Total hot run time: 29.12 s
|
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-DS: Total hot run time: 192550 ms
|
ClickBench: Total hot run time: 29.37 s
|
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
run cloud_p0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
f775355
run buildall |
TPC-H: Total hot run time: 34290 ms
|
TPC-DS: Total hot run time: 192695 ms
|
ClickBench: Total hot run time: 29.59 s
|
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression P0 && UT Coverage ReportIncrement line coverage Increment coverage report
|
run p0 |
BE Regression P0 && UT Coverage ReportIncrement line coverage Increment coverage report
|
1 similar comment
BE Regression P0 && UT Coverage ReportIncrement line coverage Increment coverage report
|
PR approved by at least one committer and no changes requested. |
Check whether the data exists based on the cache metadata. If it exists, there is no need to read from the local file, which enhances warm-up performance and reduces disk I/O load.
…e#50275) Check whether the data exists based on the cache metadata. If it exists, there is no need to read from the local file, which enhances warm-up performance and reduces disk I/O load.
Check whether the data exists based on the cache metadata. If it exists, there is no need to read from the local file, which enhances warm-up performance and reduces disk I/O load.
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)