Skip to content

[improve][broker] PIP-473 P5.1: metadata-driven transaction coordinator (NEW_TXN / END_TXN)#25863

Merged
lhotari merged 3 commits into
apache:masterfrom
merlimat:mmerli/pip-473-tc-v5
May 27, 2026
Merged

[improve][broker] PIP-473 P5.1: metadata-driven transaction coordinator (NEW_TXN / END_TXN)#25863
lhotari merged 3 commits into
apache:masterfrom
merlimat:mmerli/pip-473-tc-v5

Conversation

@merlimat
Copy link
Copy Markdown
Contributor

Summary

Adds the PIP-473 metadata-driven v5 transaction coordinator — the broker-side service that serves NEW_TXN / END_TXN against the metadata-store transaction layout (P2's TxnMetadataStore) instead of the legacy TransactionMetadataStoreService.

Gated behind a new transactionCoordinatorScalableTopicsEnabled flag (default off). When on, ServerCnx routes the TC wire commands to TransactionCoordinatorV5; when off, behavior is unchanged. The flag is meant to be flipped together with the scalable-topic transaction buffer and pending-ack store providers (a later phase).

Leader election

Per-partition coordinator, same as the legacy TC: a broker runs the v5 TC for partition N iff it owns partition N of transaction_coordinator_assign. handleClientConnect mirrors the legacy ownership check, so the client-side discovery surface is unchanged. (A pure metadata-store election is a later phase; reusing the assign-topic keeps this PR's scope tight.)

Wire commands

Command Behavior
TC_CLIENT_CONNECT ownership check for the tcId's assign partition
NEW_TXN create /txn/id/<tcId>_<seq> header in OPEN
END_TXN CAS header to COMMITTED/ABORTED, fan out participant events
ADD_PARTITION_TO_TXN, ADD_SUBSCRIPTION_TO_TXN no-op

ADD_PARTITION / ADD_SUBSCRIPTION are no-ops per the PIP: v5 participants advertise themselves by writing /txn/op records when they apply ops, so the pre-registration step is unnecessary.

txnId generation — no reuse

leastSigBits is drawn from a per-tcId monotonic counter at /txn/tc-seq/<tcId> (TcSequence, CAS-incremented with retry on BadVersionException). Monotonic-per-tcId ⟹ txnIds are never reused — the participant-side aborted set is keyed by txnId, and reuse would corrupt it.

endTransaction fan-out

END_TXN CAS-updates the header to the terminal state, then enumerates the txn's participants via the new idx:ops-by-txn secondary index on /txn/op (so it doesn't scan the whole namespace), and publishes one segment-event per affected segment + one subscription-event per affected (segment, subscription) pair. Idempotent on retry — a header already in the requested terminal state short-circuits.

Test plan

  • pulsar-broker:test --tests TransactionCoordinatorV5Test — 8 cases: sequential txnId per tcId, commit/abort fan-out, idempotent retry, mismatched-action failure, unknown-txn failure, add* no-ops, per-tcId scoping.
  • pulsar-broker:test --tests TxnMetadataStoreTest / MetadataTransactionBufferTest / MetadataPendingAckStoreTest — green after the idx:ops-by-txn index addition.
  • Checkstyle clean (main + test).

Deferred / follow-ups

  • Timeout sweep + GC sweep. This PR is happy-path newTxn / endTxn only.
  • Pure metadata-store leader election (replacing the assign-topic reuse).

merlimat added 2 commits May 23, 2026 09:50
… NEW_TXN / END_TXN

Adds the v5 transaction coordinator behind the new `transactionCoordinatorV5Enabled`
flag (default off). When the flag is on, `ServerCnx` routes `TC_CLIENT_CONNECT`,
`NEW_TXN`, `END_TXN`, `ADD_PARTITION_TO_TXN`, and `ADD_SUBSCRIPTION_TO_TXN` to
`TransactionCoordinatorV5` instead of the legacy `TransactionMetadataStoreService`.

The TC writes the txn header at `/txn/id/<tcId>_<seq>`, assigns `seq` from a
per-tcId monotonic counter (`/txn/tc-seq/<tcId>`) so txnIds are never reused, and
at end-txn enumerates the txn's participants via the new `idx:ops-by-txn`
secondary index on `/txn/op`, publishing one segment-event per affected segment
and one subscription-event per affected (segment, subscription) pair.

`ADD_PARTITION_TO_TXN` / `ADD_SUBSCRIPTION_TO_TXN` are no-ops per PIP-473 —
participants advertise themselves by writing `/txn/op` records when they apply
ops, so the pre-registration step is unnecessary.

Leader election reuses the existing `transaction_coordinator_assign` topic
ownership check (same surface the legacy TC uses); a pure metadata-store
election lands in P5.3.
Addresses review feedback:
- Rename `transactionCoordinatorV5Enabled` →
  `transactionCoordinatorScalableTopicsEnabled` and drop PIP / phase references
  from the user-facing config doc.
- Import `TransactionCoordinatorV5` in `PulsarService` instead of using the
  fully qualified name inline.
Comment thread pulsar-broker/src/main/java/org/apache/pulsar/broker/service/ServerCnx.java Outdated
…y fan-out

- Security: the v5 ServerCnx paths bypassed the ownership check the legacy
  handlers run via verifyTxnOwnership. Persist the opening principal in
  TxnHeader.owner (NEW_TXN), add TransactionCoordinatorV5.verifyTxnOwnership
  (null owner ⟹ allowed, mirroring the legacy semantics), route
  ServerCnx.verifyTxnOwnership to the v5 TC when the flag is on, and gate
  END_TXN / ADD_PARTITION / ADD_SUBSCRIPTION on it (superuser fallback unchanged).

- endTransaction idempotent retry now re-drives the fan-out instead of
  short-circuiting: if a prior attempt CAS'd the header but failed before
  publishing, a terminal header gives the reconcile path nothing to act on, so
  the retry must re-publish. Safe — participants key on (txnId, decision) and
  the decision can't change once terminal.

- Replace the hand-rolled ConcurrentHashSet helper with ConcurrentHashMap.newKeySet(),
  and the "\0"-packed (segment, subscription) string with an AckParticipant record.

Tests: owner persistence + verifyTxnOwnership (match / null-allowed / unknown),
idempotent-retry re-publish. Existing P2/P3/P4 TxnHeader constructions updated
for the new owner field.
Copy link
Copy Markdown
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lhotari lhotari merged commit 5991f6e into apache:master May 27, 2026
43 checks passed
@merlimat merlimat deleted the mmerli/pip-473-tc-v5 branch May 27, 2026 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants