active-active TSO preflight reads PD config instead of TSO microservice config #5091

@King-Dylan

What did you do?

I traced the active-active TSO compatibility check added in TiCDC and compared it with how TiDB exposes cluster config in microservice mode.

Current TiCDC code is hardcoded to read:

  • downstream: SHOW CONFIG WHERE type='pd' AND (name='tso-unique-index' OR name='tso-max-index')
  • upstream: PD HTTP GetConfig()

Relevant code in TiCDC:

  • pkg/check/active_active_tso_indexes.go
  • downstream query is hardcoded in showPDConfigQuery
  • upstream reads PD config through pdhttp.Client.GetConfig()

However, in TiDB microservice mode, SHOW CONFIG already distinguishes pd and tso and routes them to different config endpoints:

  • type='pd' -> PD config endpoint
  • type='tso' -> TSO config endpoint (/tso/api/v1/config)
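
To make the difference concrete, here is a minimal Go sketch of a parameterized version of the downstream query. `buildShowConfigQuery` is a hypothetical helper, not existing TiCDC code; the only things taken from the code paths above are the SQL shape and the two config names.

```go
package main

import "fmt"

// buildShowConfigQuery is a hypothetical helper (not part of TiCDC today).
// It builds the downstream query for a given component type instead of
// hardcoding type='pd'. Only "pd" and "tso" are accepted, so the value can
// be interpolated into the SQL text safely.
func buildShowConfigQuery(componentType string) (string, error) {
	switch componentType {
	case "pd", "tso":
	default:
		return "", fmt.Errorf("unsupported component type %q", componentType)
	}
	return fmt.Sprintf(
		"SHOW CONFIG WHERE type='%s' AND (name='tso-unique-index' OR name='tso-max-index')",
		componentType,
	), nil
}

func main() {
	for _, t := range []string{"pd", "tso"} {
		q, err := buildShowConfigQuery(t)
		if err != nil {
			panic(err)
		}
		fmt.Println(q)
	}
}
```

With `"pd"` this produces the query TiCDC issues today; with `"tso"` it targets the TSO service config instead.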

In the PD release-8.5 code line that introduced active-active TSO index support, tso-unique-index and tso-max-index exist in both:

  • PD server config
  • TSO microservice config

and the TSO allocator consumes cfg.GetTSOIndex() from the TSO service config.

This means that in TSO microservice mode, TiCDC may validate against PD config instead of the config actually used by the running TSO service.

A concrete reproduction recipe would be:

  1. Deploy TiDB/TiCDC with TSO microservice mode enabled.
  2. Configure tso-unique-index / tso-max-index for the TSO service.
  3. Create a changefeed with enable-active-active=true and a MySQL-compatible downstream.
  4. Observe that TiCDC only checks type='pd' downstream and PD config upstream, instead of the TSO service config.

What did you expect to see?

In microservice mode, TiCDC should validate against the TSO service config source, not only the PD config source.

Concretely:

  • downstream: prefer SHOW CONFIG WHERE type='tso' ...
  • upstream: prefer the TSO service config endpoint
  • for backward compatibility, TiCDC can fall back to pd when the cluster does not expose a TSO microservice

What did you see instead?

TiCDC currently has no TSO-microservice-aware compatibility logic here.

It assumes these values are part of PD config and hardcodes both the downstream and upstream checks to PD-only sources. In microservice mode, this can validate against the wrong configuration source and produce either a false acceptance or a false mismatch.

There is also a wording issue in the downstream error messages and comments. SHOW CONFIG returns one row per instance of the target component (PD or TSO), but the current code and comments describe those rows as coming from "TiDB instances".
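
As an illustration of the wording, here is a small Go sketch that names the component type and instance taken from the SHOW CONFIG rows themselves. `configRow` and `describeMismatch` are hypothetical names, the addresses are made-up sample data, and the sketch assumes the check wants to report per-instance values when they disagree.

```go
package main

import (
	"fmt"
	"strings"
)

// configRow mirrors the four columns returned by SHOW CONFIG.
type configRow struct {
	Type, Instance, Name, Value string
}

// describeMismatch is a hypothetical helper. It returns an empty string when
// every instance reports the same value for name, and otherwise a message
// that names the component type from the rows instead of "TiDB instances".
func describeMismatch(rows []configRow, name string) string {
	var parts []string
	distinct := map[string]struct{}{}
	component := ""
	for _, r := range rows {
		if r.Name != name {
			continue
		}
		component = r.Type
		distinct[r.Value] = struct{}{}
		parts = append(parts, fmt.Sprintf("%s=%s", r.Instance, r.Value))
	}
	if len(distinct) <= 1 {
		return ""
	}
	return fmt.Sprintf("%s differs across %s instances: %s",
		name, component, strings.Join(parts, ", "))
}

func main() {
	// Sample rows only; not captured from a live cluster.
	rows := []configRow{
		{"tso", "10.0.1.1:3379", "tso-unique-index", "1"},
		{"tso", "10.0.1.2:3379", "tso-unique-index", "2"},
	}
	fmt.Println(describeMismatch(rows, "tso-unique-index"))
}
```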

Versions of the cluster

Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

Not captured from a live cluster for this report.
The issue was identified by tracing TiCDC code against local TiDB/PD source trees.
TiDB local checkout shows microservice-aware `SHOW CONFIG` routing for `type='tso'`.

Upstream TiKV version (execute tikv-server --version):

Not required for reproducing this code-path issue.

TiCDC version (execute cdc version):

Local TiCDC checkout HEAD: 4f50193ea0a0f1ded33e42c822a01ea97eea4b7d
File introduced by commit: 888bd8cfba3823c0a4197520cdb9e2cbd82be46c

Suggested direction

A compatible implementation would likely be:

  1. Detect / prefer TSO microservice config when available.
  2. Fall back to PD config for legacy / embedded TSO deployments.
  3. Update comments and error messages so they describe PD instances or TSO instances instead of TiDB instances for this check.
