@eldenmoon
Member

…e#57702)

Reduce `SegmentFooterPB.columns` bloat caused by Variant subcolumns
and enable true on-demand meta loading.
- Introduce a general external column meta (CMO) layout that can be
reused for non-Variant columns.
- Add `storage_format = V3` as “V2 + external column meta” to safely
experiment in fuzzy tests.

Segment tail layout (not to scale):

```text
+------------------------------------------------------------------+
| ... data pages ...                                               |
|                                                                  |
| Column Meta Region (per top-level col, col_id ascending)        |
|   [Meta[col0]] [Meta[col1]] ... [Meta[colN-1]]                   |
|                                                                  |
| Footer (SegmentFooterPB)                                         |
|   col_meta_region_start      : uint64                            |
|   column_meta_entries        : repeated ColumnMetaEntryPB        |
|         - unique_id          : int32                             |
|         - length             : uint32                            |
|   num_columns                : uint32                            |
+------------------------------------------------------------------+
```
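
As a rough illustration of how a reader could turn the footer fields in the diagram into a byte range for one column, here is a minimal sketch. It assumes the metas are stored back to back in `col_id` order, so each offset is the region start plus the lengths of the preceding entries. The `ColumnMetaEntry` and `MetaSlice` structs and the `locate_column_meta` helper are hypothetical names for illustration, not part of the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical in-memory mirror of one ColumnMetaEntryPB (unique_id, length).
struct ColumnMetaEntry {
    int32_t unique_id;
    uint32_t length;
};

// Byte range of one serialized column meta inside the segment file.
struct MetaSlice {
    uint64_t offset;
    uint64_t size;
};

// Assumption: metas are laid out contiguously in col_id order, so the offset
// of column i is region_start plus the lengths of all earlier entries.
std::optional<MetaSlice> locate_column_meta(uint64_t region_start,
                                            const std::vector<ColumnMetaEntry>& entries,
                                            size_t col_id) {
    if (col_id >= entries.size()) {
        return std::nullopt;  // unknown column ordinal
    }
    uint64_t offset = region_start;
    for (size_t i = 0; i < col_id; ++i) {
        offset += entries[i].length;
    }
    return MetaSlice{offset, entries[col_id].length};
}
```

A real reader would likely cache the running offsets instead of recomputing the prefix sum on every lookup; the loop is kept here only to make the layout arithmetic explicit.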

- Introduce `ExternalColMetaUtil` with:
  - `parse_external_meta_pointers()` to read `col_meta_region_start`,
    `column_meta_offsets_start`, `num_columns` from footer and validate
    them.
  - `parse_uid_to_colid_map()` to build a `uid -> col_id` map based on
    `top_level_column_uids[]`.
  - `is_valid_meta_slice()` to sanity-check `(pos, size)` against the
    external meta region with overflow guards.
  - `read_col_meta()` to read a single `ColumnMetaPB` by `col_id` from
    external meta region, using index cache lane.
  - `write_external_column_meta()` to:
    - externalize Variant subcolumns first (via `VariantExtMetaWriter`),
      then
    - write a contiguous Column Meta Region, then
    - write CMO entries, then
    - fill footer pointers and `top_level_column_uids`, finally clear
      `footer.columns()` to enable external-first loading.
- Wire this layout into `SegmentWriter` / `Segment` / `ColumnReader` so
  that:
  - top-level column metas can be loaded lazily via CMO,
  - old inline metas are no longer required once external meta is enabled.
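
The overflow-guarded bounds check described for `is_valid_meta_slice()` can be sketched as follows. This is only an illustration of the technique: the function name, its parameters, and the choice to reject zero-length slices are assumptions, and the real signature in the PR may differ.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when [pos, pos + size) lies entirely inside
// [region_start, region_end). Written so that no addition can wrap around.
bool is_valid_meta_slice_sketch(uint64_t pos, uint64_t size,
                                uint64_t region_start, uint64_t region_end) {
    if (region_start > region_end) return false;  // malformed region
    if (size == 0) return false;                  // assumption: empty metas are rejected
    if (pos < region_start || pos > region_end) return false;
    // pos <= region_end holds here, so the subtraction cannot underflow,
    // and comparing against the remainder avoids computing pos + size.
    return size <= region_end - pos;
}
```

Comparing `size` against `region_end - pos` rather than testing `pos + size <= region_end` is what keeps a corrupted, very large `size` from wrapping around and passing the check.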

Variant subcolumn meta layout per root `uid`:

```text
variant_meta_keys.<uid>    : IndexedColumn<VARCHAR> (path)
  - value index: ON (fast seek_at_or_after by path)
  - ordinal index: OFF
  - encoding: PREFIX_ENCODING

footer.file_meta_datas:
  - key = "variant_meta_keys.<uid>"   , value = IndexedColumnMetaPB bytes
```
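
Because the key column is sorted by path and supports "at or after" seeks, it behaves much like a sorted array. The sketch below uses a sorted `std::vector<std::string>` as a stand-in for the `IndexedColumn` to show the seek and a prefix-existence check built on top of it. The helper names are hypothetical, and the real reader works on pages and ordinals instead of an in-memory vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Ordinal of the first key >= target, or nullopt if every key is smaller.
// `sorted_keys` must be sorted ascending in bytewise order.
std::optional<size_t> seek_at_or_after_sketch(const std::vector<std::string>& sorted_keys,
                                              std::string_view target) {
    auto it = std::lower_bound(sorted_keys.begin(), sorted_keys.end(), target,
                               [](const std::string& key, std::string_view t) {
                                   return std::string_view(key) < t;
                               });
    if (it == sorted_keys.end()) return std::nullopt;
    return static_cast<size_t>(it - sorted_keys.begin());
}

// True if at least one key starts with `prefix`. The first key >= prefix is
// the only candidate that needs checking, because keys sharing a prefix are
// adjacent in sorted order.
bool has_prefix_sketch(const std::vector<std::string>& sorted_keys,
                       std::string_view prefix) {
    auto ordinal = seek_at_or_after_sketch(sorted_keys, prefix);
    if (!ordinal) return false;
    std::string_view key = sorted_keys[*ordinal];
    return key.size() >= prefix.size() && key.compare(0, prefix.size(), prefix) == 0;
}
```

Note that this check is purely bytewise: the prefix `a.b` also matches a key such as `a.bc`. Whether callers need to append a path separator first is not stated in the description and is left open here.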

- Introduce `VariantExtMetaWriter` to preprocess
  `SegmentFooterPB.columns`:
  - For each column:
    - If no `column_path_info` → keep as top-level column.
    - If path relative to root is empty → treat as root Variant column and
      keep.
    - If relative path contains `__DORIS_VARIANT_SPARSE__` (including
      bucketized sparse cols like `.b{i}`) → collect as sparse subcolumns to
      be embedded into the root Variant’s `children_columns`.
    - Otherwise (non-sparse subcolumn) → serialize `ColumnMetaPB` and buffer
      as `(relative_path, meta_bytes)` under the root’s `uid`.
  - For each root `uid`, sort `(relative_path, meta_bytes)` by path and:
    - write keys (paths) into `variant_meta_keys.<uid>`,
    - write values (meta bytes) into `variant_meta_values.<uid>`,
    - persist the two `IndexedColumnMetaPB` as `file_meta_datas` entries.
  - Replace `footer.columns()` with:
    - only top-level columns (including Variant roots),
    - where each root has its sparse subcolumns merged into
      `children_columns`.
- Add `VariantExternalMetaReader` to read externalized Variant subcolumn
  metas:
  - `init_from_footer()`:
    - locate `variant_meta_keys.<uid>` / `variant_meta_values.<uid>` from
      `footer.file_meta_datas()`,
    - support legacy unsuffixed keys (`variant_meta_keys` /
      `variant_meta_values`) as a fallback,
    - open two `IndexedColumnReader`s and load their indexes.
  - `lookup_meta_by_path(rel_path)`:
    - seek on key column via value index,
    - use ordinal to read the matching value (meta bytes),
    - parse `ColumnMetaPB`.
  - `load_all()` / `load_all_once()`:
    - iterate the whole value column, parse each `ColumnMetaPB`,
    - reconstruct `SubcolumnColumnMetaInfo` tree by relative path,
    - accumulate `VariantStatistics` (e.g. `none_null_size`) when available.
  - `has_prefix(prefix)`:
    - perform `seek_at_or_after(prefix)` on keys and check
      `starts_with(prefix)` to answer “does any subcolumn with this path
      prefix exist?”.
- Update Variant `ColumnReader` / schema utilities to:
  - load root meta via CMO,
  - obtain non-sparse subcolumn metas via `VariantExternalMetaReader`,
  - obtain sparse/bucketized subcolumn metas from root’s
    `children_columns`,
  - keep compatibility with existing Variant formats while enabling V3
    behavior when external meta is present.
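
The four-way decision that `VariantExtMetaWriter` makes for each footer column can be summarized in a small classifier. This is a sketch under stated assumptions: the `MetaPlacement` enum and `classify_column` function are hypothetical, and the column's path is reduced to an optional relative-path string, where `nullopt` stands for "no `column_path_info`".

```cpp
#include <cassert>
#include <optional>
#include <string>

// Where a column's meta ends up after preprocessing (hypothetical names).
enum class MetaPlacement {
    kTopLevel,      // no column_path_info: stays in footer.columns()
    kVariantRoot,   // empty relative path: root Variant column, stays
    kSparseChild,   // embedded into the root's children_columns
    kExternalized,  // buffered as (relative_path, meta_bytes) under the root uid
};

// `relative_path` is nullopt when the column has no column_path_info;
// otherwise it is the path relative to the Variant root.
MetaPlacement classify_column(const std::optional<std::string>& relative_path) {
    static const std::string kSparseMarker = "__DORIS_VARIANT_SPARSE__";
    if (!relative_path.has_value()) return MetaPlacement::kTopLevel;
    if (relative_path->empty()) return MetaPlacement::kVariantRoot;
    if (relative_path->find(kSparseMarker) != std::string::npos) {
        // A substring match also covers bucketized sparse columns (".b{i}").
        return MetaPlacement::kSparseChild;
    }
    return MetaPlacement::kExternalized;
}
```

Only columns classified as `kExternalized` would be sorted by path and written into the per-root key and value columns; the other three kinds remain reachable from the footer or from the root's `children_columns`.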

- In `PropertyAnalyzer`, extend `storage_format` parsing:
  - reject `"V1"` as deprecated and keep `"V2"` as default.
  - accept `"V3"` and map it to `TStorageFormat.V3` (semantically “V2 +
    external column meta support”).
- In `Config`:
  - add `random_use_v3_storage_format` (mutable, masterOnly) to allow FE
    to:
    - in fuzzy / regression tests, randomly choose some OLAP tables to use
      `storage_format = V3`,
    - increase coverage of V3 + ext-meta paths without impacting
      user-specified formats.
- In `CreateTableInfo`:
  - when engine is `OLAP` and
    `Config.random_use_v3_storage_format == true`,
    - if user does not explicitly set `storage_format`,
    - randomly set `storage_format = "V3"` for some tables and log this
      choice for observability.
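
The FE side is Java, but the decision rules above are language-neutral, so the following C++ sketch only mirrors their shape. Everything in it is an assumption made for illustration: the function names, the case-sensitive string comparison, and the 50% sampling ratio (the description only says "some tables").

```cpp
#include <cassert>
#include <optional>
#include <random>
#include <stdexcept>
#include <string>

enum class StorageFormat { V2, V3 };

// Parsing rules as listed: V1 is rejected, V2 is the default, V3 is accepted.
StorageFormat parse_storage_format(const std::optional<std::string>& value) {
    if (!value.has_value()) return StorageFormat::V2;  // default
    if (*value == "V2") return StorageFormat::V2;
    if (*value == "V3") return StorageFormat::V3;
    if (*value == "V1") throw std::invalid_argument("storage_format V1 is deprecated");
    throw std::invalid_argument("unknown storage_format: " + *value);
}

// Randomization applies only to OLAP tables whose creator did not set
// storage_format, so an explicit user choice is never overridden.
template <typename Rng>
StorageFormat choose_storage_format(const std::string& engine,
                                    const std::optional<std::string>& user_value,
                                    bool random_use_v3, Rng& rng) {
    if (user_value.has_value() || engine != "OLAP" || !random_use_v3) {
        return parse_storage_format(user_value);
    }
    std::bernoulli_distribution pick_v3(0.5);  // assumed ratio
    return pick_v3(rng) ? StorageFormat::V3 : StorageFormat::V2;
}
```

Checking for an explicit user value before consulting the random switch is the part that matters: it is what keeps fuzzy runs from changing the format of tables that requested one.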

- Existing segments without external meta remain readable:
  - CMO pointers validation will fail gracefully and readers fall back to
    inline footer metas.
  - Variant external meta reader stays in “unavailable” state when no
    `variant_meta_*` entries are present.
- All external meta slices are validated via `is_valid_meta_slice()`
  with overflow checks and region bounds.
- Footer write order ensures crash-consistency: no visible CMO pointers
  in footer are written before the underlying regions are fully flushed.
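
The external-first, inline-fallback behavior can be pictured as a single decision at segment open time. The sketch below is a simplification under assumptions: `FooterView` is a hypothetical reduction of the footer to two fields, and "valid pointers" is approximated as "present, non-empty, and starting before the footer", which follows from the tail layout but is not the full validation.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

enum class MetaSource { kExternal, kInlineFooter };

// Hypothetical, simplified view of the footer fields relevant to the decision.
struct FooterView {
    std::optional<uint64_t> col_meta_region_start;  // absent in segments without external meta
    uint32_t num_external_entries = 0;
};

// Use the external region only if its pointers are present and plausible;
// otherwise fall back to the inline metas in footer.columns().
MetaSource pick_meta_source(const FooterView& footer, uint64_t footer_offset) {
    const bool pointers_ok = footer.col_meta_region_start.has_value() &&
                             *footer.col_meta_region_start <= footer_offset &&
                             footer.num_external_entries > 0;
    return pointers_ok ? MetaSource::kExternal : MetaSource::kInlineFooter;
}
```

Treating a failed pointer check as "use inline metas" instead of as an error is what lets segments written before this change go through the same open path.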

- Add BE unit tests:
  - `external_col_meta_util_test` to exercise pointer parsing, CMO range
    validation and read-back correctness.
  - extend `variant_column_writer_reader_test` to cover V3 external meta
    for Variant, including sparse columns and path resolution.
  - extend `schema_util_rowset_test` to ensure schema evolution and
    rowset-level utilities understand external meta layout.
- Add regression tests under `variant_p0`:
  - `ext_meta/test_storage_format_v2_1` and multiple
    `test_variant_external_meta_*` suites (concurrent, edge cases,
    integration, with sparse),
  - `query_subcolumns` cases to verify subcolumn discovery, prefix checks
    and query correctness on V3 tables.
- Wire FE regression configs (`Config.groovy`, load scripts) to
  optionally enable `random_use_v3_storage_format` so fuzzy runs
  continuously hit V3 + ext-meta paths.

@eldenmoon requested a review from yiguolei as a code owner December 15, 2025 08:22

@eldenmoon
Member Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

| Category | Coverage |
| --- | --- |
| Function Coverage | 82.14% (1573/1915) |
| Line Coverage | 67.17% (28071/41794) |
| Region Coverage | 67.70% (13810/20399) |
| Branch Coverage | 58.13% (7363/12666) |

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 28.57% (8/28) 🎉

@doris-robot

BE UT Coverage Report

Increment line coverage 85.67% (915/1068) 🎉

| Category | Coverage |
| --- | --- |
| Function Coverage | 53.16% (18408/34627) |
| Line Coverage | 38.84% (169709/436894) |
| Region Coverage | 33.61% (131144/390145) |
| Branch Coverage | 34.52% (56630/164027) |

@yiguolei yiguolei merged commit 21fa64b into apache:branch-4.0 Dec 16, 2025
23 of 27 checks passed