[Improve] Estimated column reader memory to control segment cache #34526

Merged
merged 3 commits into from
May 8, 2024

Conversation

Lchangliang
Contributor

Proposed changes

Issue Number: close #xxx

A segment with many columns uses much more memory than a segment with few columns, but this difference is currently not observable. This change estimates the memory used by column readers so that the segment cache can be controlled by memory usage.
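
As a rough illustration of the idea, the sketch below adds a fixed per-reader estimate to a running metadata total each time a column reader is created. `SegmentMemEstimate` and its methods are hypothetical names used only for this example; they are not the Doris `Segment` class, and the 1024-byte figure is taken from the default of `estimated_mem_per_column_reader` in this PR.

```cpp
#include <cstdint>

// Hypothetical stand-in for a segment's metadata accounting.
// This is an illustration only, not the Doris Segment class.
class SegmentMemEstimate {
public:
    // per_reader_bytes plays the role of config::estimated_mem_per_column_reader.
    explicit SegmentMemEstimate(int64_t per_reader_bytes)
            : _per_reader_bytes(per_reader_bytes) {}

    // Index memory that was already being tracked before this change.
    void add_index_bytes(int64_t bytes) { _meta_mem_usage += bytes; }

    // Called once per column reader: adds the fixed estimate to the total.
    void add_column_reader() { _meta_mem_usage += _per_reader_bytes; }

    int64_t meta_mem_usage() const { return _meta_mem_usage; }

private:
    int64_t _per_reader_bytes;
    int64_t _meta_mem_usage = 0;
};
```

With a 1024-byte estimate, a 5-column segment would report 5 KiB of reader memory while a 500-column segment would report 500 KiB, which makes the difference between wide and narrow segments visible to the cache.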

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@Lchangliang
Contributor Author

run buildall

Contributor

github-actions bot commented May 8, 2024

clang-tidy review says "All clean, LGTM! 👍"

@xinyiZzz
Contributor

xinyiZzz commented May 8, 2024

auto* lru_handle = LRUCachePolicy::insert(
            key.encode(), &value, 1, value.segment->meta_mem_usage(), CachePriority::NORMAL);

Change this to:

auto* lru_handle = LRUCachePolicy::insert(
            key.encode(), &value, value.segment->meta_mem_usage(), value.segment->meta_mem_usage(), CachePriority::NORMAL);
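
The suggestion above replaces the constant `1` in the third argument (which appears to be the cache charge) with the segment's estimated metadata size, so that eviction would be driven by bytes rather than by entry count. A toy sketch of charge-based eviction follows. `ChargedLruCache` is a hypothetical class written for this example; it is not Doris's `LRUCachePolicy` and makes no claim about how that class behaves internally.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Toy LRU cache that evicts by total charge instead of by entry count.
// Illustrative only; names and behavior are assumptions of this sketch.
class ChargedLruCache {
public:
    explicit ChargedLruCache(std::size_t capacity) : _capacity(capacity) {
        assert(capacity > 0);
    }

    // Inserts (or replaces) `key` with the given charge, then evicts the
    // least recently used entries until the total charge fits the capacity.
    // The newest entry is always kept, even if it alone exceeds the capacity.
    void insert(const std::string& key, std::size_t charge) {
        erase(key);
        _lru.emplace_front(key, charge);
        _index[key] = _lru.begin();
        _usage += charge;
        while (_usage > _capacity && _lru.size() > 1) {
            const std::string victim = _lru.back().first;  // copy before erasing
            erase(victim);
        }
    }

    bool contains(const std::string& key) const { return _index.count(key) != 0; }
    std::size_t usage() const { return _usage; }
    std::size_t size() const { return _lru.size(); }

private:
    using Entry = std::pair<std::string, std::size_t>;  // key, charge

    void erase(const std::string& key) {
        auto it = _index.find(key);
        if (it == _index.end()) return;
        _usage -= it->second->second;
        _lru.erase(it->second);
        _index.erase(it);
    }

    std::size_t _capacity;
    std::size_t _usage = 0;
    std::list<Entry> _lru;  // front = most recently inserted
    std::unordered_map<std::string, std::list<Entry>::iterator> _index;
};
```

With a charge of 1 per entry, the capacity is effectively a segment count and a wide segment costs the same as a narrow one. With the charge set to an estimated byte size, inserting one wide segment can push out several narrow ones.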

@@ -408,6 +408,7 @@ Status Segment::_create_column_readers(const SegmentFooterPB& footer) {
         RETURN_IF_ERROR(ColumnReader::create(opts, footer.columns(iter->second), footer.num_rows(),
                                              _file_reader, &reader));
         _column_readers.emplace(column.unique_id(), std::move(reader));
+        _meta_mem_usage += config::estimated_mem_per_column_reader;
Contributor

In the past, _meta_mem_usage only tracked the memory of some indexes. Now it also includes column reader memory.

Is there any other memory held by a segment?

Contributor Author

The segment still holds some unique_ptr/shared_ptr objects and the SubcolumnColumnReaders object.

@Lchangliang
Contributor Author

run buildall

Contributor

github-actions bot commented May 8, 2024

clang-tidy review says "All clean, LGTM! 👍"

@xiaokang xiaokang self-assigned this May 8, 2024
@doris-robot

TeamCity be ut coverage result:
Function Coverage: 35.68% (8982/25171)
Line Coverage: 27.34% (74210/271412)
Region Coverage: 26.58% (38360/144336)
Branch Coverage: 23.40% (19570/83622)
Coverage Report: http://coverage.selectdb-in.cc/coverage/7e959c8aa367d9f9f750700d12b928c78956e71e_7e959c8aa367d9f9f750700d12b928c78956e71e/report/index.html

Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 8, 2024
Contributor

github-actions bot commented May 8, 2024

PR approved by at least one committer and no changes requested.

Contributor

github-actions bot commented May 8, 2024

PR approved by anyone and no changes requested.

@@ -1053,6 +1053,10 @@ DEFINE_mInt32(schema_cache_sweep_time_sec, "100");

// max number of segment cache, default -1 for backward compatibility fd_number*2/5
DEFINE_mInt32(segment_cache_capacity, "-1");
DEFINE_mInt32(estimated_num_columns_per_segment, "30");
DEFINE_mInt32(estimated_mem_per_column_reader, "1024");
// The value is calculate by storage_page_cache_limit * index_page_cache_percentage
Contributor

Delete this line.
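
To make the two new defaults concrete, here is a small worked calculation. The diff shown here does not reveal how `estimated_num_columns_per_segment` and `estimated_mem_per_column_reader` are combined elsewhere, so the conversion below (count-based capacity times the estimated size of a typical segment) is an assumption of this sketch, and both helper functions are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Values mirror the defaults in the diff; the helper names are hypothetical.
constexpr int64_t kEstimatedNumColumnsPerSegment = 30;
constexpr int64_t kEstimatedMemPerColumnReader = 1024;  // bytes

// Assumed conversion: a count-based capacity becomes a byte budget by
// multiplying by the estimated size of a "typical" 30-column segment.
inline int64_t segment_cache_byte_budget(int64_t max_segments) {
    assert(max_segments >= 0);
    return max_segments * kEstimatedNumColumnsPerSegment * kEstimatedMemPerColumnReader;
}

// Number of segments with `num_columns` columns that fit in `budget_bytes`,
// counting only the estimated column reader memory.
inline int64_t segments_that_fit(int64_t budget_bytes, int64_t num_columns) {
    assert(num_columns > 0);
    return budget_bytes / (num_columns * kEstimatedMemPerColumnReader);
}
```

Under these assumptions, a capacity of 1000 segments corresponds to 1000 × 30 × 1024 = 30,720,000 bytes. That budget holds 1000 segments of 30 columns each, but only 100 segments of 300 columns each.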

Contributor

@xinyiZzz xinyiZzz left a comment

LGTM

@dataroaring dataroaring merged commit 99a6dc0 into apache:master May 8, 2024
26 of 30 checks passed
M1saka2003 pushed a commit to M1saka2003/doris that referenced this pull request May 14, 2024
ByteYue pushed a commit to ByteYue/doris that referenced this pull request May 15, 2024
M1saka2003 pushed a commit to M1saka2003/doris that referenced this pull request May 24, 2024
Labels
approved Indicates a PR has been approved by one committer. dev/2.0.x reviewed
5 participants