[Enhancement] Reduce the number of disk accesses while parsing a segment footer #6652

@sduzh

Search before asking

  • I searched the issues and found no similar issues.

Version

61c9d11

What's Wrong?

In the current implementation, parsing a segment footer needs two disk accesses:
https://github.com/apache/incubator-doris/blob/61c9d11fdb72dccf94876ae3706e7ed492622807/be/src/olap/rowset/segment_v2/segment.cpp#L96
With a large number of segment files, this hurts performance significantly. I wrote a demo program that reduces the number of I/Os by reading more data at once, and the query time dropped from 47 s to 31 s. However, since the footer size can vary over a wide range, it is difficult to decide how much data to read at once.
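
To illustrate the "read more data at once" idea, here is a minimal sketch, not the actual Doris code. It assumes a simplified layout in which the footer bytes are followed by a fixed 8-byte trailer (a little-endian 32-bit footer length, then a 4-byte magic number). `CountingFile`, `read_footer`, `make_segment`, `kMagic` and the `hint` parameter are all hypothetical names introduced for this example. The reader fetches the last `hint` bytes in one read and falls back to a second read only when the footer turns out to be larger than the hint.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>

// Hypothetical in-memory stand-in for a file; counts reads to expose the I/O count.
struct CountingFile {
    std::string data;
    int reads = 0;
    // Reads `len` bytes at `offset`; the caller guarantees the range is valid.
    std::string read_at(size_t offset, size_t len) {
        ++reads;
        return data.substr(offset, len);
    }
};

// Assumed trailer: [footer length: u32 little-endian][magic: 4 bytes].
constexpr size_t kTrailerSize = 8;
constexpr char kMagic[4] = {'S', 'E', 'G', 'F'};  // placeholder magic, an assumption

uint32_t decode_u32_le(const char* p) {
    uint32_t v = 0;
    for (int i = 3; i >= 0; --i) v = (v << 8) | static_cast<unsigned char>(p[i]);
    return v;
}

// Returns the raw footer bytes, or nullopt if the trailer is invalid.
std::optional<std::string> read_footer(CountingFile& file, size_t hint) {
    const size_t file_size = file.data.size();
    if (file_size < kTrailerSize) return std::nullopt;
    // One speculative read: the trailer plus, hopefully, the whole footer.
    const size_t tail_len = std::min(file_size, std::max(hint, kTrailerSize));
    const std::string tail = file.read_at(file_size - tail_len, tail_len);
    const char* trailer = tail.data() + tail_len - kTrailerSize;
    if (std::memcmp(trailer + 4, kMagic, 4) != 0) return std::nullopt;
    const size_t footer_len = decode_u32_le(trailer);
    if (footer_len > file_size - kTrailerSize) return std::nullopt;
    if (footer_len + kTrailerSize <= tail_len) {
        // Fast path: the footer is already in the buffer, no second I/O.
        return tail.substr(tail_len - kTrailerSize - footer_len, footer_len);
    }
    // Slow path: the footer is larger than the hint, so read it separately.
    return file.read_at(file_size - kTrailerSize - footer_len, footer_len);
}

// Test helper: builds body + footer + trailer in the assumed layout.
std::string make_segment(const std::string& body, const std::string& footer) {
    std::string out = body + footer;
    const uint32_t n = static_cast<uint32_t>(footer.size());
    for (int i = 0; i < 4; ++i) out.push_back(static_cast<char>((n >> (8 * i)) & 0xff));
    out.append(kMagic, 4);
    return out;
}
```

The open question from above remains the choice of `hint`: a larger value makes the one-read fast path more likely but transfers more bytes per segment, so the value would need to be tuned against real footer sizes.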

What You Expected?

Parse the segment footer with a single disk access.

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
