
[refactoring](multi-catalog) data_lake_reader_refactoring. #62306

Open
kaka11chen wants to merge 20 commits into master from data_lake_reader_refactoring

Conversation

@kaka11chen (Contributor)

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Copilot AI review requested due to automatic review settings April 10, 2026 02:59
@Thearas (Contributor)

Thearas commented Apr 10, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (ideally include the specific error message) and how it was fixed.
  2. Which behaviors were modified: what the previous behavior was, what it is now, why it was modified, and what the possible impacts are.
  3. What features were added, and why.
  4. Which code was refactored, and why.
  5. Which functions were optimized, and what the difference is before and after the optimization.

@kaka11chen (Contributor, Author)

run buildall


Copilot AI left a comment


Pull request overview

Refactors the data lake reader pipeline to use a unified init_reader(ReaderInitContext*) template-method flow across file/table-format readers, while extending FE/BE scan metadata to carry per-slot column categories and inline default-value expressions.

Changes:

  • Add TColumnCategory and inline default_value_expr into TFileScanSlotInfo (while keeping a legacy default map for backward compatibility).
  • Introduce ReaderInitContext (+ format-specific init contexts) and migrate Parquet/ORC and multiple table-format readers (Hive/Iceberg/Paimon/Transactional Hive, etc.) to the unified init/get_next_block hook model.
  • Update a large set of BE unit tests and scan-node logic to use the new init contexts and column classification.
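The unified flow described above can be sketched as a template method. This is a hypothetical, heavily simplified illustration: the names `ReaderInitContext`, `init_reader`, and the before/after hooks come from this PR's description, but the `Status` type, members, and signatures here are stand-ins, not Doris's real API. Per the PR, hooks are skipped in standalone mode (no column descriptors):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for Doris's Status type.
struct Status {
    bool ok = true;
    static Status OK() { return {}; }
};

// Sketch of the init context; column_descs == nullptr means standalone mode.
struct ReaderInitContext {
    const std::vector<std::string>* column_descs = nullptr;
};

class GenericReader {
public:
    virtual ~GenericReader() = default;

    // Non-virtual interface: hooks bracket the format-specific init and
    // are skipped for standalone callers (column_descs == nullptr).
    Status init_reader(ReaderInitContext* ctx) {
        if (ctx->column_descs != nullptr) {
            on_before_init_reader(ctx);
        }
        Status st = _do_init_reader(ctx);
        if (ctx->column_descs != nullptr) {
            on_after_init_reader(ctx);
        }
        return st;
    }

    int hook_calls = 0; // exposed only so the sketch is observable

protected:
    virtual void on_before_init_reader(ReaderInitContext*) { ++hook_calls; }
    virtual void on_after_init_reader(ReaderInitContext*) { ++hook_calls; }
    virtual Status _do_init_reader(ReaderInitContext*) = 0;
};

// A format reader only implements the inner step.
class FakeParquetReader : public GenericReader {
protected:
    Status _do_init_reader(ReaderInitContext*) override { return Status::OK(); }
};
```

Because `init_reader` is non-virtual, no derived reader can forget (or reorder) the hooks.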

Reviewed changes

Copilot reviewed 88 out of 88 changed files in this pull request and generated 5 comments.

File → Description
gensrc/thrift/PlanNodes.thrift Adds TColumnCategory and embeds default_value_expr into required slot info.
fe/fe-core/src/main/java/org/apache/doris/nereids/load/NereidsLoadPlanInfoCollector.java Sets slot default expressions inline in required slot info while keeping legacy map.
fe/fe-core/src/main/java/org/apache/doris/datasource/tvf/source/TVFScanNode.java Overrides column classification to mark global rowid columns as synthesized.
fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/source/IcebergScanNode.java Adds Iceberg-specific column categorization (rowid/lineage).
fe/fe-core/src/main/java/org/apache/doris/datasource/hive/source/HiveScanNode.java Adds Hive-specific handling for synthesized global rowid columns.
fe/fe-core/src/main/java/org/apache/doris/datasource/FileScanNode.java Inlines default expressions into TFileScanSlotInfo in addition to legacy map.
fe/fe-core/src/main/java/org/apache/doris/datasource/FileQueryScanNode.java Introduces classifyColumn API and uses TColumnCategory to set is_file_slot.
be/test/format/table/table_schema_change_helper_test.cpp Updates tests for reordered helper function parameters and includes new helper header.
be/test/format/table/iceberg/iceberg_reader_test.cpp Migrates tests to ParquetInitContext/OrcInitContext init flow and removes set_fill_columns calls.
be/test/format/table/iceberg/iceberg_reader_create_column_ids_test.cpp Updates test reader construction to new Iceberg reader inheritance/constructors.
be/test/format/table/hive/hive_reader_test.cpp Updates to new Hive reader constructors and init contexts.
be/test/format/table/hive/hive_reader_create_column_ids_test.cpp Updates to new Hive reader constructors and metadata access patterns.
be/test/format/parquet/parquet_reader_test.cpp Updates Parquet tests to use ParquetInitContext + path-partition extraction in scan range.
be/test/format/parquet/parquet_read_lines.cpp Updates standalone init to use ParquetInitContext.
be/test/format/parquet/parquet_expr_test.cpp Updates init to use ParquetInitContext.
be/test/format/orc/orc_reader_test.cpp Updates ORC init to use OrcInitContext and adjusts date type expectations.
be/test/format/orc/orc_reader_init_column_test.cpp Updates expectations based on new ORC init column behavior/state.
be/test/format/orc/orc_read_lines.cpp Updates standalone ORC init to use OrcInitContext.
be/test/format/native/native_reader_writer_test.cpp Updates NativeReader::get_columns API expectations.
be/test/format/condition_cache_test.cpp Refactors condition-cache delete-ops tests to align with new reader inheritance model.
be/src/load/group_commit/wal/wal_reader.h Migrates WAL reader to TableFormatReader with WalInitContext and unified init hooks.
be/src/load/group_commit/wal/wal_reader.cpp Implements _open_file_reader/_do_init_reader/_do_get_next_block for WAL reader.
be/src/load/delta_writer/push_handler.cpp Migrates Parquet reader init to ParquetInitContext and new get_columns signature.
be/src/format/table/trino_connector_jni_reader.h Implements unified _do_init_reader adapter for JNI reader.
be/src/format/table/transactional_hive_reader.h Refactors Transactional Hive reader to inherit OrcReader and use init/read hooks.
be/src/format/table/transactional_hive_reader.cpp Implements hook-based ACID schema mapping and delete-delta reading.
be/src/format/table/transactional_hive_common.h Moves ACID row-id types to common header for reuse without circular deps.
be/src/format/table/table_schema_change_helper.h Adds centralized schema-change helper/types used by multiple readers.
be/src/format/table/remote_doris_reader.h Migrates Remote Doris reader to TableFormatReader and unified hooks.
be/src/format/table/remote_doris_reader.cpp Implements _do_get_next_block and _get_columns_impl for Remote Doris reader.
be/src/format/table/parquet_metadata_reader.h Adapts ParquetMetadataReader to unified init and _do_get_next_block.
be/src/format/table/parquet_metadata_reader.cpp Renames get_next_block to _do_get_next_block.
be/src/format/table/paimon_reader.h Splits Paimon readers into direct Parquet/ORC inheritance with hook-based schema mapping.
be/src/format/table/paimon_reader.cpp Implements hook-based schema mapping and deletion-vector initialization for Paimon readers.
be/src/format/table/paimon_jni_reader.h Adapts JNI paimon reader to unified init hooks and _do_get_next_block.
be/src/format/table/paimon_jni_reader.cpp Renames get_next_block to _do_get_next_block and delegates correctly.
be/src/format/table/paimon_cpp_reader.h Migrates PaimonCppReader to unified hooks and new get_columns impl.
be/src/format/table/paimon_cpp_reader.cpp Renames read/columns methods to unified hook methods.
be/src/format/table/nested_column_access_helper.h Updates includes to new schema-change helper header.
be/src/format/table/max_compute_jni_reader.h Adds unified _do_init_reader adapter.
be/src/format/table/jdbc_jni_reader.h Renames get_next_block to _do_get_next_block and adds unified init adapter.
be/src/format/table/jdbc_jni_reader.cpp Updates calls to JniReader::_do_get_next_block.
be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.h Updates includes to schema-change helper.
be/src/format/table/iceberg/iceberg_parquet_nested_column_utils.cpp Updates includes to schema-change helper.
be/src/format/table/iceberg/iceberg_orc_nested_column_utils.h Updates includes to schema-change helper.
be/src/format/table/iceberg/iceberg_orc_nested_column_utils.cpp Updates includes and dependencies for new helper usage.
be/src/format/table/iceberg_sys_table_jni_reader.h Adds unified _do_init_reader adapter.
be/src/format/table/iceberg_reader.h Refactors Iceberg readers to CRTP mixin over Parquet/ORC with new constructors/hooks.
be/src/format/table/iceberg_delete_file_reader_helper.cpp Migrates delete-reader init to ParquetInitContext/OrcInitContext.
be/src/format/table/hudi_reader.h Refactors Hudi readers to direct Parquet/ORC inheritance with hook-based schema mapping.
be/src/format/table/hudi_reader.cpp Implements on_before_init_reader for field-id based schema mapping for Hudi.
be/src/format/table/hudi_jni_reader.h Adds unified _do_init_reader adapter.
be/src/format/table/hive/hive_parquet_nested_column_utils.h Updates includes to schema-change helper.
be/src/format/table/hive/hive_parquet_nested_column_utils.cpp Updates includes to schema-change helper.
be/src/format/table/hive/hive_orc_nested_column_utils.h Updates includes to schema-change helper.
be/src/format/table/hive/hive_orc_nested_column_utils.cpp Updates includes and dependencies for new helper usage.
be/src/format/table/hive_reader.h Refactors Hive readers to direct Parquet/ORC inheritance with hook-based schema mapping.
be/src/format/table/hive_reader.cpp Implements Hive on_before_init_reader logic for name/index matching and column-id derivation.
be/src/format/table/equality_delete.cpp Adjusts equality-delete block column lookup and adds debug logging.
be/src/format/parquet/vparquet_reader.h Introduces ParquetInitContext, migrates ParquetReader to unified init and hook-based reading.
be/src/format/parquet/vparquet_reader.cpp Implements unified init, inlines fill-column classification, and renames core read method.
be/src/format/parquet/vparquet_group_reader.h Adapts group reader to use TableFormatReader hooks and provide row positions.
be/src/format/parquet/vparquet_group_reader.cpp Replaces local fill logic with TableFormatReader hook calls; removes Iceberg rowid building code.
be/src/format/parquet/vparquet_column_reader.h Updates includes/dependencies consistent with new reader layering.
be/src/format/parquet/schema_desc.cpp Updates includes to avoid dependency on old table_format_reader include path.
be/src/format/orc/vorc_reader.h Introduces OrcInitContext, migrates OrcReader to unified init, and updates delete row types.
be/src/format/native/native_reader.h Migrates NativeReader to TableFormatReader and unified hooks.
be/src/format/native/native_reader.cpp Renames read/columns methods to unified hook methods.
be/src/format/json/new_json_reader.h Introduces JsonInitContext and migrates NewJsonReader to unified init hooks.
be/src/format/json/new_json_reader.cpp Implements unified init and renames block/columns methods.
be/src/format/jni/jni_reader.h Migrates JNI reader to unified _do_get_next_block and _get_columns_impl.
be/src/format/jni/jni_reader.cpp Renames get_next_block to _do_get_next_block.
be/src/format/generic_reader.h Adds ReaderInitContext, template-method init/get_next_block flow, and cached get_columns.
be/src/format/generic_reader.cpp Adds translation unit for GenericReader (currently no out-of-line methods).
be/src/format/csv/csv_reader.h Introduces CsvInitContext and migrates CsvReader to unified init/get_next_block hooks.
be/src/format/csv/csv_reader.cpp Implements unified init and renames read/columns methods.
be/src/format/count_reader.h Adds CountReader decorator to emit row counts without format-specific COUNT logic.
be/src/format/column_descriptor.h Adds ColumnDescriptor and ColumnCategory to drive unified column classification/filling.
be/src/format/arrow/arrow_stream_reader.h Migrates ArrowStreamReader to unified hooks and new get_columns impl.
be/src/format/arrow/arrow_stream_reader.cpp Renames methods to unified hook methods.
be/src/exec/scan/file_scanner.h Introduces shared init-context fill helpers and handler configuration hooks for scan flow.


Comment on lines +75 to +102
```cpp
    /// Delegate to inner reader if available, otherwise return our total.
    int64_t get_total_rows() const override {
        return _inner_reader ? _inner_reader->get_total_rows() : _initial_total_rows();
    }

    Status close() override {
        if (_inner_reader) {
            return _inner_reader->close();
        }
        return Status::OK();
    }

    /// Access the inner reader for profile collection or other lifecycle needs.
    GenericReader* inner_reader() const { return _inner_reader.get(); }

protected:
    void _collect_profile_before_close() override {
        if (_inner_reader) {
            _inner_reader->collect_profile_before_close();
        }
    }

private:
    int64_t _initial_total_rows() const { return _remaining_rows; }

    int64_t _remaining_rows;
    size_t _batch_size;
    std::unique_ptr<GenericReader> _inner_reader;
```

Copilot AI Apr 10, 2026


CountReader::get_total_rows() falls back to _initial_total_rows(), but _initial_total_rows() returns _remaining_rows, which decreases as batches are emitted. This makes get_total_rows() time-dependent and can underreport after reading starts. Store the initial total rows in a separate member (e.g., _total_rows) and return that when there is no inner reader.
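The reviewer's fix can be sketched as follows. This is a hypothetical, minimal model — `emit_batch` stands in for the real batch-emitting path, and only illustrates why caching the initial count in a separate `_total_rows` member keeps `get_total_rows()` stable:

```cpp
#include <cassert>
#include <cstdint>

// Minimal sketch of the suggested fix: cache the initial row count so
// get_total_rows() no longer depends on how much has already been read.
class CountReaderSketch {
public:
    explicit CountReaderSketch(int64_t total_rows)
            : _total_rows(total_rows), _remaining_rows(total_rows) {}

    // Stable: returns the value fixed at construction.
    int64_t get_total_rows() const { return _total_rows; }

    // Emits up to batch_size rows and returns how many were emitted.
    int64_t emit_batch(int64_t batch_size) {
        int64_t n = _remaining_rows < batch_size ? _remaining_rows : batch_size;
        _remaining_rows -= n;
        return n;
    }

private:
    int64_t _total_rows;     // fixed at construction
    int64_t _remaining_rows; // decreases as batches are emitted
};
```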

Comment on lines 22 to 26
```cpp
#include "core/data_type/data_type_factory.hpp"
#include "format/orc/vorc_reader.h"
#include "format/table/table_format_reader.h"
#include "format/table/transactional_hive_common.h"
#include "format/table/table_schema_change_helper.h"
#include "transactional_hive_common.h"
```


Copilot AI Apr 10, 2026


This file now includes the header as "transactional_hive_common.h", but other translation units include it via the full path (e.g. "format/table/transactional_hive_common.h"). Using the short include path is likely to break compilation depending on include directories. Prefer the consistent full include path here as well.

Comment on lines 80 to +87
```cpp
    void set_delete_rows() final {
        auto* parquet_reader = (ParquetReader*)(_file_format_reader.get());
        parquet_reader->set_delete_rows(_iceberg_delete_rows);
        LOG(INFO) << "[PosDeleteDebug] IcebergParquetReader::set_delete_rows: _iceberg_delete_rows="
                  << (_iceberg_delete_rows
                              ? "set(" + std::to_string(_iceberg_delete_rows->size()) + ")"
                              : "null");
        // Call ParquetReader's set_delete_rows(const vector<int64_t>*)
        ParquetReader::set_delete_rows(_iceberg_delete_rows);
    }
```

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The added LOG(INFO) debug output in set_delete_rows() will execute in production and can be very noisy (potentially per file/scan), impacting performance and log volume. Please remove these logs or downgrade to a conditional debug log (e.g. VLOG) guarded by a debug flag.

Comment on lines +57 to +60
```cpp
    LOG(INFO) << "[EqDeleteDebug] SimpleEqualityDelete::filter_data_block: field_id="
              << column_field_id << ", block_col_name=" << block_col_name
              << ", block_idx=" << block_idx << ", delete_block_rows=" << _delete_block->rows()
              << ", data_block_rows=" << data_block->rows();
```

Copilot AI Apr 10, 2026


This LOG(INFO) statement inside SimpleEqualityDelete::filter_data_block() will run for every data block filtered, which can significantly increase log volume and slow down scans. Please remove it or change to a low-verbosity debug log (e.g. VLOG) behind a feature flag.

Suggested change (remove the log entirely):

```diff
-    LOG(INFO) << "[EqDeleteDebug] SimpleEqualityDelete::filter_data_block: field_id="
-              << column_field_id << ", block_col_name=" << block_col_name
-              << ", block_idx=" << block_idx << ", delete_block_rows=" << _delete_block->rows()
-              << ", data_block_rows=" << data_block->rows();
```
Comment on lines +830 to +836
```cpp
    LOG(INFO) << "[PosDeleteDebug] _get_position_delete_ctx: _delete_rows="
              << (_delete_rows ? "set(" + std::to_string(_delete_rows->size()) + ")" : "null")
              << " row_group.num_rows=" << row_group.num_rows
              << " first_row=" << row_group_index.first_row;
    if (_delete_rows == nullptr) {
        LOG(INFO) << "[PosDeleteDebug] _get_position_delete_ctx: NO delete rows, returning "
                     "no-filter ctx";
```

Copilot AI Apr 10, 2026


ParquetReader::_get_position_delete_ctx() now logs with LOG(INFO) on every row group. This will be extremely noisy on large files and adds overhead in hot path. Please remove these logs or convert them to VLOG/TRACE guarded by a debug setting.

Suggested change (drop the logging, keep only the null check):

```diff
-    LOG(INFO) << "[PosDeleteDebug] _get_position_delete_ctx: _delete_rows="
-              << (_delete_rows ? "set(" + std::to_string(_delete_rows->size()) + ")" : "null")
-              << " row_group.num_rows=" << row_group.num_rows
-              << " first_row=" << row_group_index.first_row;
-    if (_delete_rows == nullptr) {
-        LOG(INFO) << "[PosDeleteDebug] _get_position_delete_ctx: NO delete rows, returning "
-                     "no-filter ctx";
+    if (_delete_rows == nullptr) {
```
@hello-stephen (Contributor)

FE UT Coverage Report

Increment line coverage 37.21% (16/43) 🎉
Increment coverage report
Complete coverage report

kaka11chen and others added 20 commits April 10, 2026 21:00
2. Load query decoupling.
3. Unify FE default value.
### What problem does this PR solve?

Problem Summary: 8 standalone reader instances (Iceberg position/equality
delete file readers, Hive ACID delete delta reader, push handler) bypassed
the GenericReader NVI template by directly calling the low-level
_do_init_reader(many params) method. This broke encapsulation and prevented
the template from enforcing the standard lifecycle.

### Changes

1. NVI init_reader now skips hooks when column_descs==nullptr (standalone mode):
   standalone readers pre-populate column_names and skip on_before/on_after hooks
2. Renamed low-level _do_init_reader(many params) to _do_init_reader_internal
   and moved from public to private in both ParquetReader and OrcReader
3. Converted all 8 callers to construct ParquetInitContext/OrcInitContext and
   call init_reader(&ctx) through the unified NVI template
4. Added safe defaults to ParquetInitContext (EMPTY_CONJUNCTS, EMPTY_SLOT_PREDICATES)
   and OrcInitContext (EMPTY_CONJUNCTS) so standalone callers don't crash on deref
5. Changed ReaderInitContext::table_info_node default to ConstNode::get_instance()

Converted callers:
- iceberg_reader.cpp: Parquet/ORC position delete readers
- iceberg_reader_mixin.h: Parquet/ORC equality delete readers
- iceberg_delete_file_reader_helper.cpp: Parquet/ORC delete readers
- transactional_hive_reader.cpp: ORC ACID delete delta reader
- push_handler.cpp: Parquet load push reader

### Release note

None

### Check List (For Author)

- Test: No need to test (pure refactoring, no behavioral change)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Problem Summary: GenericReader::get_next_block was a pure virtual that each
reader overrode independently. TransactionalHiveReader and IcebergReaderMixin
manually called on_before_read_block/on_after_read_block hooks, while other
readers skipped them entirely. This inconsistency made it easy to forget
calling hooks in new readers.

Make get_next_block a non-virtual template method in GenericReader:
  on_before_read_block → _do_get_next_block → on_after_read_block

All 15 readers now override _do_get_next_block instead. The hooks (already
declared in GenericReader) are now automatically invoked.

Key changes:
- GenericReader::get_next_block: non-virtual, calls 3 virtuals in order
- _do_get_next_block: new pure virtual replacing the old get_next_block
- TransactionalHiveReader: drops get_next_block override (hooks called by base)
- IcebergReaderMixin: COUNT short-circuit moves to _do_get_next_block
- All other readers: mechanical rename get_next_block → _do_get_next_block
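The template method described above can be sketched as follows. This is a hypothetical simplification — the real Doris method returns a `Status` and takes a `Block*`, batch size, and row count; here plain types stand in so the hook ordering is visible:

```cpp
#include <cassert>

// Sketch: get_next_block is non-virtual and always brackets the
// format-specific _do_get_next_block with the two hooks.
class ReaderBase {
public:
    virtual ~ReaderBase() = default;

    bool get_next_block(int* rows, bool* eof) {
        on_before_read_block();
        bool ok = _do_get_next_block(rows, eof);
        on_after_read_block();
        return ok;
    }

    int before_calls = 0; // counters exposed only for the sketch
    int after_calls = 0;

protected:
    virtual void on_before_read_block() { ++before_calls; }
    virtual void on_after_read_block() { ++after_calls; }
    virtual bool _do_get_next_block(int* rows, bool* eof) = 0;
};

// A concrete reader implements only the inner step; the hooks are
// invoked by the base class automatically, so they cannot be forgotten.
class CsvLikeReader : public ReaderBase {
protected:
    bool _do_get_next_block(int* rows, bool* eof) override {
        *rows = 0;
        *eof = true;
        return true;
    }
};
```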

### Release note

None

### Check List (For Author)

- Test: No need to test (pure refactoring, no behavioral change)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem Summary: Simple readers (CSV, JSON, Native, RemoteDoris, WAL) each
manually call fill_remaining_columns() in their _do_get_next_block. This is
error-prone and duplicative.

Introduce TableFormatReader as an intermediate base class between GenericReader
and table-level readers. Its default on_after_read_block automatically calls
fill_remaining_columns() for partition/missing/synthesized columns.

- Simple readers inherit TableFormatReader and no longer call fill manually
- Parquet/ORC override on_after_read_block to no-op (they fill per-batch internally)
- IcebergReaderMixin and TransactionalHiveReader override chains are unaffected
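The intermediate-base idea above can be sketched like this. It is a hypothetical model, not the real class: `BlockSketch` and `read_block` stand in for Doris's block and read path, and the counter stands in for `fill_remaining_columns()`:

```cpp
#include <cassert>

struct BlockSketch {
    int filled_columns = 0;
};

// Intermediate base: its default on_after_read_block hook fills
// partition/missing columns, so simple readers no longer call the
// fill step manually.
class TableFormatReaderSketch {
public:
    virtual ~TableFormatReaderSketch() = default;

    void read_block(BlockSketch* block) {
        _do_read(block);
        on_after_read_block(block); // default fills remaining columns
    }

protected:
    virtual void _do_read(BlockSketch*) {}
    virtual void on_after_read_block(BlockSketch* block) {
        ++block->filled_columns; // stands in for fill_remaining_columns()
    }
};

// Parquet/ORC-style reader: fills per batch internally, so it
// overrides the hook to a no-op.
class ColumnarReaderSketch : public TableFormatReaderSketch {
protected:
    void on_after_read_block(BlockSketch*) override {}
};
```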

None

- Test: No need to test - pure refactor, existing regression tests cover all paths
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ormatReader

### What problem does this PR solve?

Problem Summary: Column-filling state (partition values, missing defaults,
synthesized column handlers) and methods (on_fill_partition_columns,
on_fill_missing_columns, fill_remaining_columns, etc.) were in GenericReader
but only used by table-level readers. This polluted the base class interface.

Move all column-filling data members, methods, and the default
on_before_init_reader implementation from GenericReader to TableFormatReader:

- set_fill_column_data, missing_cols, on_fill_partition_columns,
  on_fill_missing_columns, fill_synthesized_columns, fill_remaining_columns,
  register_synthesized_column_handler, _extract_partition_values
- Data: _fill_partition_values, _fill_missing_defaults,
  _fill_col_name_to_block_idx, _fill_missing_cols, _synthesized_col_handlers

GenericReader::on_before_init_reader is now a minimal no-op (just sets
_column_descs). RowGroupReader pointer changed from GenericReader* to
TableFormatReader*.

### Release note

None

### Check List (For Author)

- Test: No need to test - pure code movement, no behavior change
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Problem Summary: Parquet/ORC computed missing columns and fill defaults
in _do_init_reader, which mixed fill-column logic with format-specific
initialization. Move this computation to on_before_init_reader so that
by the time _do_init_reader runs, all fill data is already prepared.

Changes:
- Parquet/ORC on_before_init_reader: after building table_info_node,
  compute _fill_missing_cols and _fill_missing_defaults
- Parquet/ORC _do_init_reader: use pre-computed _fill_missing_cols
  instead of re-detecting missing columns
- Remove _missing_cols member from both Parquet and ORC headers
  (replaced by _fill_missing_cols from TableFormatReader base)

### Release note

None

### Check List (For Author)

- Test: No need to test (pure refactoring, behavior-preserving)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Problem Summary: Standalone readers (PushHandler, Iceberg equality delete
file readers) call init_reader with column_descs=nullptr, which skips
on_before_init_reader. This left _read_file_columns/_read_table_columns
empty and caused potential null pointer dereference on ctx->conjuncts.

Changes:
- Guard ctx->conjuncts and ctx->slot_id_to_predicates nullptr access
  in both Parquet and ORC _do_init_reader
- Add fallback 1:1 column mapping for standalone callers: when
  has_column_descs() is false and _read_file_columns is empty,
  populate read columns directly from file schema (Parquet) or
  column_names (ORC)
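The two changes above can be sketched together. This is a hypothetical, simplified model — `InitCtxSketch`, `init`, and the member names are illustrations of the guard-plus-fallback shape, not the real Parquet/ORC reader code:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch: standalone callers leave conjuncts unset (nullptr).
struct InitCtxSketch {
    const std::vector<std::string>* conjuncts = nullptr;
};

class OrcLikeReader {
public:
    void init(const InitCtxSketch& ctx,
              const std::vector<std::string>& column_names) {
        // Guard: dereference ctx.conjuncts only when the caller supplied it,
        // instead of crashing on a null pointer.
        static const std::vector<std::string> kEmpty;
        _conjuncts = ctx.conjuncts ? *ctx.conjuncts : kEmpty;

        // Fallback 1:1 mapping: standalone callers get read columns
        // populated directly from column_names.
        if (_read_columns.empty()) {
            _read_columns = column_names;
        }
    }

    std::vector<std::string> _conjuncts;
    std::vector<std::string> _read_columns;
};
```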

### Release note

None

### Check List (For Author)

- Test: No need to test (fixes latent bug in standalone path)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem Summary: The parquet test files still used the old init_reader/_do_init_reader
signatures with many individual parameters. This commit migrates all call sites in
parquet_reader_test.cpp (9 sites) and parquet_read_lines.cpp (1 site) to use the new
NVI API with ParquetInitContext.

None

- Test: No need to test (refactoring API call pattern only, build verified)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
### What problem does this PR solve?

(Commit message contains only the unfilled PR template, identical to the previous commit's.)
@kaka11chen kaka11chen force-pushed the data_lake_reader_refactoring branch from ae53802 to 29774b4 Compare April 10, 2026 13:10
@kaka11chen (Contributor, Author)

run buildall

@hello-stephen (Contributor)

FE UT Coverage Report

Increment line coverage 37.21% (16/43) 🎉
Increment coverage report
Complete coverage report

@hello-stephen (Contributor)

BE Regression && UT Coverage Report

Increment line coverage 88.00% (2881/3274) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.60% (27452/37301)
Line Coverage 57.26% (295883/516757)
Region Coverage 54.51% (246813/452796)
Branch Coverage 56.10% (106737/190252)
