Skip to content

Allow config.ReplicationLowPriorityTaskParallelism to take effect.#10051

Merged
robholland merged 5 commits intotemporalio:mainfrom
robholland:rh-low-priority-queue-tidy
Apr 24, 2026
Merged

Allow config.ReplicationLowPriorityTaskParallelism to take effect.#10051
robholland merged 5 commits intotemporalio:mainfrom
robholland:rh-low-priority-queue-tidy

Conversation

@robholland
Copy link
Copy Markdown
Contributor

@robholland robholland commented Apr 24, 2026

What changed?

Use WorkflowKey for the low priority replication scheduler queue ID rather than a string. Deterministically map RunID to a slot number using config.ReplicationLowPriorityTaskParallelism slots.

Why?

Using strings for the queue ID was preventing concurrent tx map from optimising queue operations.

While we had code to allow parallelising tasks for different executions of the same workflow ID, the code was not reachable. Switch the parallelisation to the queue ID level which is subjectively more intuitive.

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

Potential risks

The config.ReplicationLowPriorityTaskParallelism could be set to a non-default number on clusters but would not currently be taking effect. After this change the parallelism would kick in, which may cause surprising behaviour.


Note

Medium Risk
Changes low-priority replication task queueing and hashing, which can increase concurrency and alter task ordering/throughput when ReplicationLowPriorityTaskParallelism is >1. Risk is moderate due to potential load/behavior shifts in replication processing under real traffic.

Overview
Low-priority replication scheduling now honors config.ReplicationLowPriorityTaskParallelism. Instead of using a random slot per task and string queue IDs, the scheduler deterministically maps each RunID to a stable slot (0..P-1) and encodes that slot into a definition.WorkflowKey queue ID, enabling up to P concurrent sequential queues per (namespace, workflow).

Separately, the non-batching sequentialTaskQueueFactoryProvider now uses a definition.WorkflowKey (with empty third field) as the queue ID rather than a concatenated string, aligning queue identity with the hash function and reducing reliance on string IDs.

Reviewed by Cursor Bugbot for commit aa9f9bf. Bugbot is set up for automated code reviews on this repo. Configure here.

@robholland robholland marked this pull request as ready for review April 24, 2026 15:39
@robholland robholland requested a review from a team as a code owner April 24, 2026 15:39
return farm.Fingerprint32(idBytes)
// 0..parallelism-1, stable for a given RunID. Different runs of the same workflow can share
// a slot, so up to P sequential queues (and workers) can progress them concurrently.
slot := int(farm.Fingerprint32([]byte(workflowKey.RunID)) % uint32(parallelism))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏼

@robholland robholland enabled auto-merge (squash) April 24, 2026 18:28
@robholland robholland merged commit 8436aed into temporalio:main Apr 24, 2026
48 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants