storage: prune dt_local_indexes by table_id/index_id before getLocalIndexStats #10880

@dbsid

Enhancement

StorageSystemDTLocalIndexes::read should prune by table_id / index_id before calling getLocalIndexStats().

Problem

The read path for system.dt_local_indexes currently parses keyspace_id from the query, then iterates over DeltaMerge tables and calls getLocalIndexesStatsFromStorage(dm_storage) for each one before applying any table_id / index_id pruning.

At 100k vector-index table scale, TiDB DDL progress polling for ADD VECTOR INDEX repeatedly queries TiFlash local-index status for one target table/index. Without early pruning, TiFlash scans local-index metadata across many tables for every poll.

Recent profiling evidence:

  • Capture 2026-05-31 17:55 CST: about 87,762 vector-index tables were ready and ADD VECTOR INDEX was still running.
  • Capture 2026-06-01 08:33 CST: about 99.9k vector-index tables were ready.
  • TiFlash-1 CPU profile over 30s: StorageSystemDTLocalIndexes::read accounted for about 96.01% cumulative CPU under GetTiFlashSystemTable; DeltaMergeStore::getLocalIndexStats accounted for about 35.97%.
  • TiFlash-2 CPU profile over 30s: StorageSystemDTLocalIndexes::read accounted for about 96.60% cumulative CPU; DeltaMergeStore::getLocalIndexStats accounted for about 37.25%.
  • Related work inside metadata expansion included DMFileMetaV2::getLocalIndexState and generateLocalIndexInfos.

This issue tracks the TiFlash side of a two-layer fix. TiDB should also push table_id / index_id predicates down when querying information_schema.tiflash_indexes.

Proposed change

  • Extend system-table predicate parsing for system.dt_local_indexes to recognize table_id and index_id, at least for equality and IN predicates combined by AND.
  • In StorageSystemDTLocalIndexes::read, skip nonmatching table_info.id before calling getLocalIndexesStatsFromStorage(dm_storage) / getLocalIndexStats().
  • Apply index_id filtering before expanding per-index stats where possible.
  • If current getLocalIndexStats() APIs cannot avoid scanning all local indexes / DMFiles for a table, add a filtered variant or pass a filter into the helper.
  • Preserve existing behavior when no table_id / index_id filter is present.
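
A minimal sketch of the pruning order proposed above, not the actual TiFlash code. `LocalIndexFilter`, `TableEntry`, `IndexStat`, `StatsLoader`, and `readLocalIndexes` are hypothetical names introduced for illustration; `StatsLoader` stands in for the expensive getLocalIndexesStatsFromStorage(dm_storage) / getLocalIndexStats() call.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <unordered_set>
#include <vector>

// Hypothetical filter type; not an existing TiFlash class.
struct LocalIndexFilter
{
    // std::nullopt means "no predicate on this column", so every id matches.
    std::optional<std::unordered_set<int64_t>> table_ids;
    std::optional<std::unordered_set<int64_t>> index_ids;

    bool matchesTable(int64_t id) const { return !table_ids || table_ids->count(id) > 0; }
    bool matchesIndex(int64_t id) const { return !index_ids || index_ids->count(id) > 0; }
};

// Simplified stand-ins for table metadata and per-index stats rows.
struct TableEntry
{
    int64_t table_id;
};

struct IndexStat
{
    int64_t table_id;
    int64_t index_id;
};

// Stand-in for the expensive per-table stats call.
using StatsLoader = std::function<std::vector<IndexStat>(const TableEntry &)>;

std::vector<IndexStat> readLocalIndexes(
    const std::vector<TableEntry> & tables,
    const LocalIndexFilter & filter,
    const StatsLoader & load_stats)
{
    std::vector<IndexStat> rows;
    for (const auto & table : tables)
    {
        // Prune by table_id first, so nonmatching tables never reach the loader.
        if (!filter.matchesTable(table.table_id))
            continue;

        // In this sketch index_id is filtered on the loaded rows; a filtered
        // loader variant could push the check further down.
        for (const auto & stat : load_stats(table))
        {
            if (filter.matchesIndex(stat.index_id))
                rows.push_back(stat);
        }
    }
    return rows;
}
```

With an empty `LocalIndexFilter`, every table is visited and every row is returned, which matches the no-filter behavior the last bullet asks to preserve.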

Acceptance criteria

  • SELECT * FROM system.dt_local_indexes WHERE table_id = X AND index_id = Y avoids scanning other tables and avoids calling getLocalIndexStats() for nonmatching tables.
  • Predicate pruning is covered by unit or integration tests for equality, IN, conjunctions, and no-filter fallback.
  • With the TiDB-side pushdown change, the cost of ADD VECTOR INDEX progress polling at 100k-table scale is proportional to the matched table/index rather than to all local-index tables.
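
The predicate cases listed in the criteria (equality, IN, conjunctions, no-filter fallback) can be modeled as set intersection per column. The following is a rough sketch under that assumption; `IdPredicate` and `combineByAnd` are hypothetical names, and the real system-table predicate parser may represent predicates differently.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_set>
#include <utility>
#include <vector>

using IdSet = std::unordered_set<int64_t>;

// Hypothetical parsed predicate on one column:
// `column = v` carries one value, `column IN (...)` carries several.
struct IdPredicate
{
    std::vector<int64_t> values;
};

// AND-combine all predicates on one column by intersecting their value sets.
// Returns std::nullopt when there are no predicates (no-filter fallback:
// every id matches). An empty set means the conjunction is unsatisfiable,
// so no table needs to be scanned at all.
std::optional<IdSet> combineByAnd(const std::vector<IdPredicate> & predicates)
{
    std::optional<IdSet> allowed;
    for (const auto & pred : predicates)
    {
        IdSet current(pred.values.begin(), pred.values.end());
        if (!allowed)
        {
            // First predicate seen: its values are the initial allowed set.
            allowed = std::move(current);
            continue;
        }
        IdSet intersection;
        for (int64_t id : *allowed)
        {
            if (current.count(id) > 0)
                intersection.insert(id);
        }
        allowed = std::move(intersection);
    }
    return allowed;
}
```

For example, `table_id = 5 AND table_id IN (5, 6, 7)` reduces to the single id 5, while `table_id = 5 AND table_id = 9` reduces to an empty set. Unit tests for the four cases could assert on these results directly.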
