
[refactoring](multi-catalog) data_lake_reader_refactoring. #62306

Open
kaka11chen wants to merge 20 commits into master from data_lake_reader_refactoring

Conversation

@kaka11chen (Contributor)

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Copilot AI review requested due to automatic review settings April 10, 2026 02:59
@Thearas (Contributor)

Thearas commented Apr 10, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (ideally include the specific error message) and how it was fixed.
  2. Which behaviors were modified: what the previous behavior was, what it is now, why it was modified, and what the possible impacts are.
  3. What features were added, and why.
  4. Which code was refactored, and why.
  5. Which functions were optimized, and what the difference is before and after the optimization.

@kaka11chen (Contributor, Author)

run buildall


Copilot AI left a comment


Pull request overview

Refactors the data lake reader pipeline to use a unified init_reader(ReaderInitContext*) template-method flow across file/table-format readers, while extending FE/BE scan metadata to carry per-slot column categories and inline default-value expressions.

Changes:

  • Add TColumnCategory and inline default_value_expr into TFileScanSlotInfo (while keeping a legacy default map for backward compatibility).
  • Introduce ReaderInitContext (+ format-specific init contexts) and migrate Parquet/ORC and multiple table-format readers (Hive/Iceberg/Paimon/Transactional Hive, etc.) to the unified init/get_next_block hook model.
  • Update a large set of BE unit tests and scan-node logic to use the new init contexts and column classification.
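The unified flow described above can be sketched as a template method. This is a hypothetical, heavily simplified illustration: the names `ReaderInitContext`, `init_reader`, and the before/after hooks come from this PR's description, but the `Status` type, members, and signatures here are stand-ins, not Doris's real API. Per the PR, hooks are skipped in standalone mode (no column descriptors):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for Doris's Status type.
struct Status {
    bool ok = true;
    static Status OK() { return {}; }
};

// Sketch of the init context; column_descs == nullptr means standalone mode.
struct ReaderInitContext {
    const std::vector<std::string>* column_descs = nullptr;
};

class GenericReader {
public:
    virtual ~GenericReader() = default;

    // Non-virtual interface: hooks bracket the format-specific init and
    // are skipped for standalone callers (column_descs == nullptr).
    Status init_reader(ReaderInitContext* ctx) {
        if (ctx->column_descs != nullptr) {
            on_before_init_reader(ctx);
        }
        Status st = _do_init_reader(ctx);
        if (ctx->column_descs != nullptr) {
            on_after_init_reader(ctx);
        }
        return st;
    }

    int hook_calls = 0; // exposed only so the sketch is observable

protected:
    virtual void on_before_init_reader(ReaderInitContext*) { ++hook_calls; }
    virtual void on_after_init_reader(ReaderInitContext*) { ++hook_calls; }
    virtual Status _do_init_reader(ReaderInitContext*) = 0;
};

// A format reader only implements the inner step.
class FakeParquetReader : public GenericReader {
protected:
    Status _do_init_reader(ReaderInitContext*) override { return Status::OK(); }
};
```

Because `init_reader` is non-virtual, no derived reader can forget (or reorder) the hooks.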

Reviewed changes

Copilot reviewed 88 out of 88 changed files in this pull request and generated 5 comments.

File → Description
gensrc/thrift/PlanNodes.thrift Adds TColumnCategory and embeds default_value_expr into required slot info.
fe/fe-core/src/main/java/org/apache/doris/nereids/load/NereidsLoadPlanInfoCollector.java Sets slot default expressions inline in required slot info while keeping legacy map.
fe/fe-core/src/main/java/org/apache/doris/datasource/tvf/source/TVFScanNode.java Overrides column classification to mark global rowid columns as synthesized.
fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/source/IcebergScanNode.java Adds Iceberg-specific column categorization (rowid/lineage).
fe/fe-core/src/main/java/org/apache/doris/datasource/hive/source/HiveScanNode.java Adds Hive-specific handling for synthesized global rowid columns.
fe/fe-core/src/main/java/org/apache/doris/datasource/FileScanNode.java Inlines default expressions into TFileScanSlotInfo in addition to legacy map.
fe/fe-core/src/main/java/org/apache/doris/datasource/FileQueryScanNode.java Introduces classifyColumn API and uses TColumnCategory to set is_file_slot.
be/test/format/table/table_schema_change_helper_test.cpp Updates tests for reordered helper function parameters and includes new helper header.
be/test/format/table/iceberg/iceberg_reader_test.cpp Migrates tests to ParquetInitContext/OrcInitContext init flow and removes set_fill_columns calls.
be/test/format/table/iceberg/iceberg_reader_create_column_ids_test.cpp Updates test reader construction to new Iceberg reader inheritance/constructors.
be/test/format/table/hive/hive_reader_test.cpp Updates to new Hive reader constructors and init contexts.
be/test/format/table/hive/hive_reader_create_column_ids_test.cpp Updates to new Hive reader constructors and metadata access patterns.
be/test/format/parquet/parquet_reader_test.cpp Updates Parquet tests to use ParquetInitContext + path-partition extraction in scan range.
be/test/format/parquet/parquet_read_lines.cpp Updates standalone init to use ParquetInitContext.
be/test/format/parquet/parquet_expr_test.cpp Updates init to use ParquetInitContext.
be/test/format/orc/orc_reader_test.cpp Updates ORC init to use OrcInitContext and adjusts date type expectations.
be/test/format/orc/orc_reader_init_column_test.cpp Updates expectations based on new ORC init column behavior/state.
be/test/format/orc/orc_read_lines.cpp Updates standalone ORC init to use OrcInitContext.
be/test/format/native/native_reader_writer_test.cpp Updates NativeReader::get_columns API expectations.
be/test/format/condition_cache_test.cpp Refactors condition-cache delete-ops tests to align with new reader inheritance model.
be/src/load/group_commit/wal/wal_reader.h Migrates WAL reader to TableFormatReader with WalInitContext and unified init hooks.
be/src/load/group_commit/wal/wal_reader.cpp Implements _open_file_reader/_do_init_reader/_do_get_next_block for WAL reader.
be/src/load/delta_writer/push_handler.cpp Migrates Parquet reader init to ParquetInitContext and new get_columns signature.
be/src/format/table/trino_connector_jni_reader.h Implements unified _do_init_reader adapter for JNI reader.
be/src/format/table/transactional_hive_reader.h Refactors Transactional Hive reader to inherit OrcReader and use init/read hooks.
be/src/format/table/transactional_hive_reader.cpp Implements hook-based ACID schema mapping and delete-delta reading.
be/src/format/table/transactional_hive_common.h Moves ACID row-id types to common header for reuse without circular deps.
be/src/format/table/table_schema_change_helper.h Adds centralized schema-change helper/types used by multiple readers.
be/src/format/table/remote_doris_reader.h Migrates Remote Doris reader to TableFormatReader and unified hooks.
be/src/format/table/remote_doris_reader.cpp Implements _do_get_next_block and _get_columns_impl for Remote Doris reader.
be/src/format/table/parquet_metadata_reader.h Adapts ParquetMetadataReader to unified init and _do_get_next_block.
be/src/format/table/parquet_metadata_reader.cpp Renames get_next_block to _do_get_next_block.
be/src/format/table/paimon_reader.h Splits Paimon readers into direct Parquet/ORC inheritance with hook-based schema mapping.
be/src/format/table/paimon_reader.cpp Implements hook-based schema mapping and deletion-vector initialization for Paimon readers.
be/src/format/table/paimon_jni_reader.h Adapts JNI paimon reader to unified init hooks and _do_get_next_block.
be/src/format/table/paimon_jni_reader.cpp Renames get_next_block to _do_get_next_block and delegates correctly.
be/src/format/table/paimon_cpp_reader.h Migrates PaimonCppReader to unified hooks and new get_columns impl.
be/src/format/table/paimon_cpp_reader.cpp Renames read/columns methods to unified hook methods.
be/src/format/table/nested_column_access_helper.h Updates includes to new schema-change helper header.
be/src/format/table/max_compute_jni_reader.h Adds unified _do_init_reader adapter.
be/src/format/table/jdbc_jni_reader.h Renames get_next_block to _do_get_next_block and adds unified init adapter.
be/src/format/table/jdbc_jni_reader.cpp Updates calls to JniReader::_do_get_next_block.
be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.h Updates includes to schema-change helper.
be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.cpp Updates includes to schema-change helper.
be/src/format/table/iceberg/iceberg_orc_nested_column_utils.h Updates includes to schema-change helper.
be/src/format/table/iceberg/iceberg_orc_nested_column_utils.cpp Updates includes and dependencies for new helper usage.
be/src/format/table/iceberg_sys_table_jni_reader.h Adds unified _do_init_reader adapter.
be/src/format/table/iceberg_reader.h Refactors Iceberg readers to CRTP mixin over Parquet/ORC with new constructors/hooks.
be/src/format/table/iceberg_delete_file_reader_helper.cpp Migrates delete-reader init to ParquetInitContext/OrcInitContext.
be/src/format/table/hudi_reader.h Refactors Hudi readers to direct Parquet/ORC inheritance with hook-based schema mapping.
be/src/format/table/hudi_reader.cpp Implements on_before_init_reader for field-id based schema mapping for Hudi.
be/src/format/table/hudi_jni_reader.h Adds unified _do_init_reader adapter.
be/src/format/table/hive/hive_parquet_nested_column_utils.h Updates includes to schema-change helper.
be/src/format/table/hive/hive_parquet_nested_column_utils.cpp Updates includes to schema-change helper.
be/src/format/table/hive/hive_orc_nested_column_utils.h Updates includes to schema-change helper.
be/src/format/table/hive/hive_orc_nested_column_utils.cpp Updates includes and dependencies for new helper usage.
be/src/format/table/hive_reader.h Refactors Hive readers to direct Parquet/ORC inheritance with hook-based schema mapping.
be/src/format/table/hive_reader.cpp Implements Hive on_before_init_reader logic for name/index matching and column-id derivation.
be/src/format/table/equality_delete.cpp Adjusts equality-delete block column lookup and adds debug logging.
be/src/format/parquet/vparquet_reader.h Introduces ParquetInitContext, migrates ParquetReader to unified init and hook-based reading.
be/src/format/parquet/vparquet_reader.cpp Implements unified init, inlines fill-column classification, and renames core read method.
be/src/format/parquet/vparquet_group_reader.h Adapts group reader to use TableFormatReader hooks and provide row positions.
be/src/format/parquet/vparquet_group_reader.cpp Replaces local fill logic with TableFormatReader hook calls; removes Iceberg rowid building code.
be/src/format/parquet/vparquet_column_reader.h Updates includes/dependencies consistent with new reader layering.
be/src/format/parquet/schema_desc.cpp Updates includes to avoid dependency on old table_format_reader include path.
be/src/format/orc/vorc_reader.h Introduces OrcInitContext, migrates OrcReader to unified init, and updates delete row types.
be/src/format/native/native_reader.h Migrates NativeReader to TableFormatReader and unified hooks.
be/src/format/native/native_reader.cpp Renames read/columns methods to unified hook methods.
be/src/format/json/new_json_reader.h Introduces JsonInitContext and migrates NewJsonReader to unified init hooks.
be/src/format/json/new_json_reader.cpp Implements unified init and renames block/columns methods.
be/src/format/jni/jni_reader.h Migrates JNI reader to unified _do_get_next_block and _get_columns_impl.
be/src/format/jni/jni_reader.cpp Renames get_next_block to _do_get_next_block.
be/src/format/generic_reader.h Adds ReaderInitContext, template-method init/get_next_block flow, and cached get_columns.
be/src/format/generic_reader.cpp Adds translation unit for GenericReader (currently no out-of-line methods).
be/src/format/csv/csv_reader.h Introduces CsvInitContext and migrates CsvReader to unified init/get_next_block hooks.
be/src/format/csv/csv_reader.cpp Implements unified init and renames read/columns methods.
be/src/format/count_reader.h Adds CountReader decorator to emit row counts without format-specific COUNT logic.
be/src/format/column_descriptor.h Adds ColumnDescriptor and ColumnCategory to drive unified column classification/filling.
be/src/format/arrow/arrow_stream_reader.h Migrates ArrowStreamReader to unified hooks and new get_columns impl.
be/src/format/arrow/arrow_stream_reader.cpp Renames methods to unified hook methods.
be/src/exec/scan/file_scanner.h Introduces shared init-context fill helpers and handler configuration hooks for scan flow.


Comment on lines +75 to +102
```cpp
    /// Delegate to inner reader if available, otherwise return our total.
    int64_t get_total_rows() const override {
        return _inner_reader ? _inner_reader->get_total_rows() : _initial_total_rows();
    }

    Status close() override {
        if (_inner_reader) {
            return _inner_reader->close();
        }
        return Status::OK();
    }

    /// Access the inner reader for profile collection or other lifecycle needs.
    GenericReader* inner_reader() const { return _inner_reader.get(); }

protected:
    void _collect_profile_before_close() override {
        if (_inner_reader) {
            _inner_reader->collect_profile_before_close();
        }
    }

private:
    int64_t _initial_total_rows() const { return _remaining_rows; }

    int64_t _remaining_rows;
    size_t _batch_size;
    std::unique_ptr<GenericReader> _inner_reader;
```

Copilot AI Apr 10, 2026


CountReader::get_total_rows() falls back to _initial_total_rows(), but _initial_total_rows() returns _remaining_rows, which decreases as batches are emitted. This makes get_total_rows() time-dependent and can underreport after reading starts. Store the initial total rows in a separate member (e.g., _total_rows) and return that when there is no inner reader.
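The reviewer's fix can be sketched as follows. This is a hypothetical, minimal model — `emit_batch` stands in for the real batch-emitting path, and only illustrates why caching the initial count in a separate `_total_rows` member keeps `get_total_rows()` stable:

```cpp
#include <cassert>
#include <cstdint>

// Minimal sketch of the suggested fix: cache the initial row count so
// get_total_rows() no longer depends on how much has already been read.
class CountReaderSketch {
public:
    explicit CountReaderSketch(int64_t total_rows)
            : _total_rows(total_rows), _remaining_rows(total_rows) {}

    // Stable: returns the value fixed at construction.
    int64_t get_total_rows() const { return _total_rows; }

    // Emits up to batch_size rows and returns how many were emitted.
    int64_t emit_batch(int64_t batch_size) {
        int64_t n = _remaining_rows < batch_size ? _remaining_rows : batch_size;
        _remaining_rows -= n;
        return n;
    }

private:
    int64_t _total_rows;     // fixed at construction
    int64_t _remaining_rows; // decreases as batches are emitted
};
```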

Comment on lines 22 to 26
```cpp
#include "core/data_type/data_type_factory.hpp"
#include "format/orc/vorc_reader.h"
#include "format/table/table_format_reader.h"
#include "format/table/transactional_hive_common.h"
#include "format/table/table_schema_change_helper.h"
#include "transactional_hive_common.h"
```


Copilot AI Apr 10, 2026


This file now includes the header as "transactional_hive_common.h", but other translation units include it via the full path (e.g. "format/table/transactional_hive_common.h"). Using the short include path is likely to break compilation depending on include directories. Prefer the consistent full include path here as well.

Comment on lines 80 to +87
```cpp
    void set_delete_rows() final {
        auto* parquet_reader = (ParquetReader*)(_file_format_reader.get());
        parquet_reader->set_delete_rows(_iceberg_delete_rows);
        LOG(INFO) << "[PosDeleteDebug] IcebergParquetReader::set_delete_rows: _iceberg_delete_rows="
                  << (_iceberg_delete_rows
                              ? "set(" + std::to_string(_iceberg_delete_rows->size()) + ")"
                              : "null");
        // Call ParquetReader's set_delete_rows(const vector<int64_t>*)
        ParquetReader::set_delete_rows(_iceberg_delete_rows);
    }
```

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added LOG(INFO) debug output in set_delete_rows() will execute in production and can be very noisy (potentially per file/scan), impacting performance and log volume. Please remove these logs or downgrade to a conditional debug log (e.g. VLOG) guarded by a debug flag.

Comment on lines +57 to +60
```cpp
    LOG(INFO) << "[EqDeleteDebug] SimpleEqualityDelete::filter_data_block: field_id="
              << column_field_id << ", block_col_name=" << block_col_name
              << ", block_idx=" << block_idx << ", delete_block_rows=" << _delete_block->rows()
              << ", data_block_rows=" << data_block->rows();
```

Copilot AI Apr 10, 2026


This LOG(INFO) statement inside SimpleEqualityDelete::filter_data_block() will run for every data block filtered, which can significantly increase log volume and slow down scans. Please remove it or change to a low-verbosity debug log (e.g. VLOG) behind a feature flag.

Suggested change (remove the log entirely):

```diff
-    LOG(INFO) << "[EqDeleteDebug] SimpleEqualityDelete::filter_data_block: field_id="
-              << column_field_id << ", block_col_name=" << block_col_name
-              << ", block_idx=" << block_idx << ", delete_block_rows=" << _delete_block->rows()
-              << ", data_block_rows=" << data_block->rows();
```
Comment on lines +830 to +836
```cpp
    LOG(INFO) << "[PosDeleteDebug] _get_position_delete_ctx: _delete_rows="
              << (_delete_rows ? "set(" + std::to_string(_delete_rows->size()) + ")" : "null")
              << " row_group.num_rows=" << row_group.num_rows
              << " first_row=" << row_group_index.first_row;
    if (_delete_rows == nullptr) {
        LOG(INFO) << "[PosDeleteDebug] _get_position_delete_ctx: NO delete rows, returning "
                     "no-filter ctx";
```

Copilot AI Apr 10, 2026


ParquetReader::_get_position_delete_ctx() now logs with LOG(INFO) on every row group. This will be extremely noisy on large files and adds overhead in hot path. Please remove these logs or convert them to VLOG/TRACE guarded by a debug setting.

Suggested change (drop the logging, keep only the null check):

```diff
-    LOG(INFO) << "[PosDeleteDebug] _get_position_delete_ctx: _delete_rows="
-              << (_delete_rows ? "set(" + std::to_string(_delete_rows->size()) + ")" : "null")
-              << " row_group.num_rows=" << row_group.num_rows
-              << " first_row=" << row_group_index.first_row;
-    if (_delete_rows == nullptr) {
-        LOG(INFO) << "[PosDeleteDebug] _get_position_delete_ctx: NO delete rows, returning "
-                     "no-filter ctx";
+    if (_delete_rows == nullptr) {
```
@hello-stephen (Contributor)

FE UT Coverage Report

Increment line coverage 37.21% (16/43) 🎉
Increment coverage report
Complete coverage report

kaka11chen and others added 20 commits April 10, 2026 21:00
2. Load query decoupling.
3. Unify FE default value.
### What problem does this PR solve?

Problem Summary: 8 standalone reader instances (Iceberg position/equality
delete file readers, Hive ACID delete delta reader, push handler) bypassed
the GenericReader NVI template by directly calling the low-level
_do_init_reader(many params) method. This broke encapsulation and prevented
the template from enforcing the standard lifecycle.

### Changes

1. NVI init_reader now skips hooks when column_descs==nullptr (standalone mode):
   standalone readers pre-populate column_names and skip on_before/on_after hooks
2. Renamed low-level _do_init_reader(many params) to _do_init_reader_internal
   and moved from public to private in both ParquetReader and OrcReader
3. Converted all 8 callers to construct ParquetInitContext/OrcInitContext and
   call init_reader(&ctx) through the unified NVI template
4. Added safe defaults to ParquetInitContext (EMPTY_CONJUNCTS, EMPTY_SLOT_PREDICATES)
   and OrcInitContext (EMPTY_CONJUNCTS) so standalone callers don't crash on deref
5. Changed ReaderInitContext::table_info_node default to ConstNode::get_instance()

Converted callers:
- iceberg_reader.cpp: Parquet/ORC position delete readers
- iceberg_reader_mixin.h: Parquet/ORC equality delete readers
- iceberg_delete_file_reader_helper.cpp: Parquet/ORC delete readers
- transactional_hive_reader.cpp: ORC ACID delete delta reader
- push_handler.cpp: Parquet load push reader

### Release note

None

### Check List (For Author)

- Test: No need to test (pure refactoring, no behavioral change)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Problem Summary: GenericReader::get_next_block was a pure virtual that each
reader overrode independently. TransactionalHiveReader and IcebergReaderMixin
manually called on_before_read_block/on_after_read_block hooks, while other
readers skipped them entirely. This inconsistency made it easy to forget
calling hooks in new readers.

Make get_next_block a non-virtual template method in GenericReader:
  on_before_read_block → _do_get_next_block → on_after_read_block

All 15 readers now override _do_get_next_block instead. The hooks (already
declared in GenericReader) are now automatically invoked.

Key changes:
- GenericReader::get_next_block: non-virtual, calls 3 virtuals in order
- _do_get_next_block: new pure virtual replacing the old get_next_block
- TransactionalHiveReader: drops get_next_block override (hooks called by base)
- IcebergReaderMixin: COUNT short-circuit moves to _do_get_next_block
- All other readers: mechanical rename get_next_block → _do_get_next_block
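The template method described above can be sketched as follows. This is a hypothetical simplification — the real Doris method returns a `Status` and takes a `Block*`, batch size, and row count; here plain types stand in so the hook ordering is visible:

```cpp
#include <cassert>

// Sketch: get_next_block is non-virtual and always brackets the
// format-specific _do_get_next_block with the two hooks.
class ReaderBase {
public:
    virtual ~ReaderBase() = default;

    bool get_next_block(int* rows, bool* eof) {
        on_before_read_block();
        bool ok = _do_get_next_block(rows, eof);
        on_after_read_block();
        return ok;
    }

    int before_calls = 0; // counters exposed only for the sketch
    int after_calls = 0;

protected:
    virtual void on_before_read_block() { ++before_calls; }
    virtual void on_after_read_block() { ++after_calls; }
    virtual bool _do_get_next_block(int* rows, bool* eof) = 0;
};

// A concrete reader implements only the inner step; the hooks are
// invoked by the base class automatically, so they cannot be forgotten.
class CsvLikeReader : public ReaderBase {
protected:
    bool _do_get_next_block(int* rows, bool* eof) override {
        *rows = 0;
        *eof = true;
        return true;
    }
};
```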

### Release note

None

### Check List (For Author)

- Test: No need to test (pure refactoring, no behavioral change)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem Summary: Simple readers (CSV, JSON, Native, RemoteDoris, WAL) each
manually call fill_remaining_columns() in their _do_get_next_block. This is
error-prone and duplicative.

Introduce TableFormatReader as an intermediate base class between GenericReader
and table-level readers. Its default on_after_read_block automatically calls
fill_remaining_columns() for partition/missing/synthesized columns.

- Simple readers inherit TableFormatReader and no longer call fill manually
- Parquet/ORC override on_after_read_block to no-op (they fill per-batch internally)
- IcebergReaderMixin and TransactionalHiveReader override chains are unaffected
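The intermediate-base idea above can be sketched like this. It is a hypothetical model, not the real class: `BlockSketch` and `read_block` stand in for Doris's block and read path, and the counter stands in for `fill_remaining_columns()`:

```cpp
#include <cassert>

struct BlockSketch {
    int filled_columns = 0;
};

// Intermediate base: its default on_after_read_block hook fills
// partition/missing columns, so simple readers no longer call the
// fill step manually.
class TableFormatReaderSketch {
public:
    virtual ~TableFormatReaderSketch() = default;

    void read_block(BlockSketch* block) {
        _do_read(block);
        on_after_read_block(block); // default fills remaining columns
    }

protected:
    virtual void _do_read(BlockSketch*) {}
    virtual void on_after_read_block(BlockSketch* block) {
        ++block->filled_columns; // stands in for fill_remaining_columns()
    }
};

// Parquet/ORC-style reader: fills per batch internally, so it
// overrides the hook to a no-op.
class ColumnarReaderSketch : public TableFormatReaderSketch {
protected:
    void on_after_read_block(BlockSketch*) override {}
};
```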

None

- Test: No need to test - pure refactor, existing regression tests cover all paths
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ormatReader

### What problem does this PR solve?

Problem Summary: Column-filling state (partition values, missing defaults,
synthesized column handlers) and methods (on_fill_partition_columns,
on_fill_missing_columns, fill_remaining_columns, etc.) were in GenericReader
but only used by table-level readers. This polluted the base class interface.

Move all column-filling data members, methods, and the default
on_before_init_reader implementation from GenericReader to TableFormatReader:

- set_fill_column_data, missing_cols, on_fill_partition_columns,
  on_fill_missing_columns, fill_synthesized_columns, fill_remaining_columns,
  register_synthesized_column_handler, _extract_partition_values
- Data: _fill_partition_values, _fill_missing_defaults,
  _fill_col_name_to_block_idx, _fill_missing_cols, _synthesized_col_handlers

GenericReader::on_before_init_reader is now a minimal no-op (just sets
_column_descs). RowGroupReader pointer changed from GenericReader* to
TableFormatReader*.

### Release note

None

### Check List (For Author)

- Test: No need to test - pure code movement, no behavior change
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Problem Summary: Parquet/ORC computed missing columns and fill defaults
in _do_init_reader, which mixed fill-column logic with format-specific
initialization. Move this computation to on_before_init_reader so that
by the time _do_init_reader runs, all fill data is already prepared.

Changes:
- Parquet/ORC on_before_init_reader: after building table_info_node,
  compute _fill_missing_cols and _fill_missing_defaults
- Parquet/ORC _do_init_reader: use pre-computed _fill_missing_cols
  instead of re-detecting missing columns
- Remove _missing_cols member from both Parquet and ORC headers
  (replaced by _fill_missing_cols from TableFormatReader base)

### Release note

None

### Check List (For Author)

- Test: No need to test (pure refactoring, behavior-preserving)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Problem Summary: Standalone readers (PushHandler, Iceberg equality delete
file readers) call init_reader with column_descs=nullptr, which skips
on_before_init_reader. This left _read_file_columns/_read_table_columns
empty and caused potential null pointer dereference on ctx->conjuncts.

Changes:
- Guard ctx->conjuncts and ctx->slot_id_to_predicates nullptr access
  in both Parquet and ORC _do_init_reader
- Add fallback 1:1 column mapping for standalone callers: when
  has_column_descs() is false and _read_file_columns is empty,
  populate read columns directly from file schema (Parquet) or
  column_names (ORC)
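The two changes above can be sketched together. This is a hypothetical, simplified model — `InitCtxSketch`, `init`, and the member names are illustrations of the guard-plus-fallback shape, not the real Parquet/ORC reader code:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch: standalone callers leave conjuncts unset (nullptr).
struct InitCtxSketch {
    const std::vector<std::string>* conjuncts = nullptr;
};

class OrcLikeReader {
public:
    void init(const InitCtxSketch& ctx,
              const std::vector<std::string>& column_names) {
        // Guard: dereference ctx.conjuncts only when the caller supplied it,
        // instead of crashing on a null pointer.
        static const std::vector<std::string> kEmpty;
        _conjuncts = ctx.conjuncts ? *ctx.conjuncts : kEmpty;

        // Fallback 1:1 mapping: standalone callers get read columns
        // populated directly from column_names.
        if (_read_columns.empty()) {
            _read_columns = column_names;
        }
    }

    std::vector<std::string> _conjuncts;
    std::vector<std::string> _read_columns;
};
```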

### Release note

None

### Check List (For Author)

- Test: No need to test (fixes latent bug in standalone path)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem Summary: The parquet test files still used the old init_reader/_do_init_reader
signatures with many individual parameters. This commit migrates all call sites in
parquet_reader_test.cpp (9 sites) and parquet_read_lines.cpp (1 site) to use the new
NVI API with ParquetInitContext.

None

- Test: No need to test (refactoring API call pattern only, build verified)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
### What problem does this PR solve?

(Commit message contains only the unfilled PR template, identical to the previous commit's.)
@kaka11chen kaka11chen force-pushed the data_lake_reader_refactoring branch from ae53802 to 29774b4 Compare April 10, 2026 13:10
@kaka11chen (Contributor, Author)

run buildall

@hello-stephen (Contributor)

FE UT Coverage Report

Increment line coverage 37.21% (16/43) 🎉
Increment coverage report
Complete coverage report

@hello-stephen (Contributor)

BE Regression && UT Coverage Report

Increment line coverage 88.00% (2881/3274) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.60% (27452/37301)
Line Coverage 57.26% (295883/516757)
Region Coverage 54.51% (246813/452796)
Branch Coverage 56.10% (106737/190252)
