feat(taskbroker): Support multiple topics by untitaker · Pull Request #668 · getsentry/taskbroker

untitaker · 2026-06-03T10:36:11Z

Continuing from #663, this is the bare minimum for multi-topic to be useful. We spawn one consumer per topic. Each consumer has its own pipeline and its own activationbatcher, but all consumers share a single activationstore.
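To make the layout described above concrete, here is a minimal sketch using only the standard library. `ActivationStore` and `ActivationBatcher` are in-memory stand-ins invented for illustration (they are not the taskbroker types), and threads stand in for Kafka consumers. The point is only the ownership shape: one batcher per consumer, one store shared by all.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the shared activation store.
#[derive(Default)]
struct ActivationStore {
    rows: Mutex<Vec<(String, String)>>, // (topic, payload)
}

impl ActivationStore {
    fn insert_batch(&self, topic: &str, batch: Vec<String>) {
        let mut rows = self.rows.lock().unwrap();
        rows.extend(batch.into_iter().map(|payload| (topic.to_string(), payload)));
    }

    fn len(&self) -> usize {
        self.rows.lock().unwrap().len()
    }
}

/// Hypothetical per-topic batcher: owned by exactly one consumer.
struct ActivationBatcher {
    topic: String,
    max_batch: usize,
    buffer: Vec<String>,
}

impl ActivationBatcher {
    fn new(topic: &str, max_batch: usize) -> Self {
        Self { topic: topic.to_string(), max_batch, buffer: Vec::new() }
    }

    /// Buffer one payload and flush once the batch is full.
    fn push(&mut self, payload: String, store: &ActivationStore) {
        self.buffer.push(payload);
        if self.buffer.len() >= self.max_batch {
            self.flush(store);
        }
    }

    fn flush(&mut self, store: &ActivationStore) {
        if !self.buffer.is_empty() {
            store.insert_batch(&self.topic, std::mem::take(&mut self.buffer));
        }
    }
}

/// Spawn one "consumer" thread per topic; returns the number of stored rows.
fn run_consumers(topics: &[&str], messages_per_topic: usize) -> usize {
    let store = Arc::new(ActivationStore::default());
    let handles: Vec<_> = topics
        .iter()
        .map(|topic| {
            let topic = topic.to_string();
            let store = Arc::clone(&store); // shared across all consumers
            thread::spawn(move || {
                // Each consumer builds its own batcher (per-topic batching).
                let mut batcher = ActivationBatcher::new(&topic, 4);
                for i in 0..messages_per_topic {
                    batcher.push(format!("{topic}-{i}"), &store);
                }
                batcher.flush(&store);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    store.len()
}

fn main() {
    let total = run_consumers(&["taskworker", "profiles"], 10);
    println!("stored {total} activations");
}
```

Because every batcher flushes independently, the number of store writes grows with the number of topics, which is the cost the follow-up work on shared batching is meant to address.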

What is deliberately left out of this PR and left for follow-up:

Postgres support is entirely unimplemented. We could add a topic column now, but for other reasons (slicing) we're reconsidering whether partition should even be there. The main reason partition exists is to avoid (row-level?) lock contention when many brokers share an alloydb instance. This, however, makes draining and migration of topics more complicated, so we are considering using another "sharding key" in pg entirely.
The batching is still per-topic. An activationbatcher is created per-consumer. This is not what we want in the long term, because postgres perf hinges on cross-topic batching. But without postgres support, it doesn't make sense to do it in this PR.
~~Metrics are not fully fixed yet.~~ I fixed everything but realistically there will be huge churn for our dashboards.

Other things fixed:

bugfix: Raw mode was not properly wired up, the legacy fields were used exclusively.
bugfix: kafka_retry_topic must be set explicitly when a consumed topic uses raw mode. Before, it was too easy to accidentally send retries back to the (raw) main topic (and indeed devenv has that problem right now).

This, to me, is the bare minimum that is useful for consolidating existing low-traffic/low-cost pools. I want to use this PR as-is to achieve that and come back to the other points async.

ref STREAM-1042
ref /STREAM-1096

linear-code · 2026-06-03T10:57:16Z

STREAM-1042

With one consumer per topic, the consumer/pipeline metrics raced (gauges) or merged across topics (counters/histograms). Add a `topic` tag to the rebalance gauges/counters, the activation writer and batcher metrics, and the deserialize payload-size histograms. Store-level metrics are left untagged. Also demote sqlite's per-consumer assign_partitions warn to debug. ref STREAM-1042 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
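The gauge race described in this commit message can be illustrated with a toy registry. This is a sketch, not the `metrics` crate: `Gauges` is an invented in-memory map, and the metric name `consumer.assigned_partitions` is hypothetical. It shows why an untagged gauge ends up reporting whichever consumer wrote last, while a `topic` tag keeps one series per consumer.

```rust
use std::collections::HashMap;

/// Toy in-memory gauge registry; the key is (metric name, optional topic tag).
#[derive(Default)]
struct Gauges {
    values: HashMap<(String, Option<String>), i64>,
}

impl Gauges {
    fn set(&mut self, name: &str, topic: Option<&str>, value: i64) {
        let key = (name.to_string(), topic.map(str::to_string));
        self.values.insert(key, value);
    }

    fn get(&self, name: &str, topic: Option<&str>) -> Option<i64> {
        let key = (name.to_string(), topic.map(str::to_string));
        self.values.get(&key).copied()
    }
}

/// Two consumers report their partition counts under one metric name.
/// Returns the registry after both have written.
fn report(tagged: bool) -> Gauges {
    // Hypothetical metric name, used only for this sketch.
    let name = "consumer.assigned_partitions";
    let mut gauges = Gauges::default();
    for (topic, partitions) in [("taskworker", 8), ("profiles", 2)] {
        let tag = if tagged { Some(topic) } else { None };
        gauges.set(name, tag, partitions);
    }
    gauges
}

fn main() {
    let name = "consumer.assigned_partitions";
    // Untagged: the second consumer overwrites the first.
    println!("untagged: {:?}", report(false).get(name, None));
    // Tagged: each topic keeps its own series.
    let tagged = report(true);
    println!("taskworker: {:?}", tagged.get(name, Some("taskworker")));
    println!("profiles: {:?}", tagged.get(name, Some("profiles")));
}
```

Counters and histograms fail differently (they merge rather than overwrite), but the remedy is the same: include the topic in the series key.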

A topic with raw mode but missing namespace/application/taskname/processing_deadline_duration passed config validation and only panicked later when the consumer built its deserializer (via .expect()). Validate completeness (and that the application is in worker_map, and the retry topic differs from the raw topic) in normalize_and_validate so it's a clean config error instead. ref STREAM-1042 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
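A rough sketch of the kind of up-front check this commit describes, combined with the explicit retry-topic requirement from the PR description. The field names `namespace`, `application`, `taskname`, `processing_deadline_duration`, `kafka_retry_topic`, and `worker_map` come from the text; the struct layout, the `validate_raw_topics` function, the value type of `worker_map`, and all example values are assumptions made for illustration and do not mirror the real `normalize_and_validate`.

```rust
use std::collections::HashMap;

struct RawMode {
    namespace: Option<String>,
    application: Option<String>,
    taskname: Option<String>,
    processing_deadline_duration: Option<u64>,
}

struct TopicConfig {
    topic: String,
    raw: Option<RawMode>, // None: the topic carries regular activations
}

struct Config {
    topics: Vec<TopicConfig>,
    kafka_retry_topic: Option<String>,
    worker_map: HashMap<String, String>, // application -> worker address (assumed shape)
}

/// Reject incomplete raw-mode topics at config time instead of panicking later.
fn validate_raw_topics(config: &Config) -> Result<(), String> {
    for t in &config.topics {
        let raw = match &t.raw {
            Some(raw) => raw,
            None => continue,
        };
        let fields = [
            ("namespace", raw.namespace.is_some()),
            ("application", raw.application.is_some()),
            ("taskname", raw.taskname.is_some()),
            ("processing_deadline_duration", raw.processing_deadline_duration.is_some()),
        ];
        let missing: Vec<&str> = fields
            .iter()
            .filter(|(_, set)| !*set)
            .map(|(name, _)| *name)
            .collect();
        if !missing.is_empty() {
            return Err(format!("topic {}: raw mode is missing {}", t.topic, missing.join(", ")));
        }
        // Present at this point, because the completeness check passed.
        let application = raw.application.as_deref().unwrap_or_default();
        if !config.worker_map.contains_key(application) {
            return Err(format!("topic {}: application {} is not in worker_map", t.topic, application));
        }
        match config.kafka_retry_topic.as_deref() {
            None => return Err(format!("topic {}: raw mode requires kafka_retry_topic", t.topic)),
            Some(retry) if retry == t.topic => {
                return Err(format!("topic {}: retry topic must differ from the raw topic", t.topic));
            }
            Some(_) => {}
        }
    }
    Ok(())
}

/// A valid configuration with made-up values.
fn example_config() -> Config {
    let raw = RawMode {
        namespace: Some("profiles".into()),
        application: Some("sentry".into()),
        taskname: Some("process_profile".into()),
        processing_deadline_duration: Some(30),
    };
    Config {
        topics: vec![TopicConfig { topic: "profiles".into(), raw: Some(raw) }],
        kafka_retry_topic: Some("taskworker".into()),
        worker_map: HashMap::from([("sentry".to_string(), "127.0.0.1:50051".to_string())]),
    }
}

fn main() {
    let mut config = example_config();
    println!("{:?}", validate_raw_topics(&config));
    // Pointing retries back at the raw topic is rejected.
    config.kafka_retry_topic = Some("profiles".into());
    println!("{:?}", validate_raw_topics(&config));
}
```

Returning a `Result` here means a bad config surfaces as a readable startup error rather than a later `.expect()` panic inside the consumer.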

The forwarding producer authenticates against the deadletter cluster and only overrides bootstrap.servers, so the deadletter cluster is the only target where its credentials reliably work. The consumer-side batcher and the upkeep path disagreed on the default forward cluster when demoted_topic_cluster is unset: the batcher used each topic's own cluster while upkeep used the deadletter cluster for multi-topic. Since the demoted topic is a single global topic, default both to the deadletter cluster. Unchanged for legacy single-topic, where the deadletter cluster address defaults to the consumed cluster's address. ref STREAM-1042 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Follow-up to the forward-cluster fix: the upkeep comment overstated that legacy behavior is "unchanged" (it only is when kafka_deadletter_cluster is unset), and the demoted_topic_cluster doc still claimed it defaults to the consumed cluster. Correct both to say it defaults to the deadletter cluster. ref STREAM-1042 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
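The defaulting chain from the two commit messages above can be written as a pair of small functions. This is a sketch under assumptions: `kafka_deadletter_cluster` and `demoted_topic_cluster` are named in the text, while `kafka_cluster` (the consumed cluster address in legacy single-topic mode), the struct, and the example addresses are hypothetical.

```rust
struct Config {
    kafka_cluster: String, // hypothetical name for the consumed cluster address
    kafka_deadletter_cluster: Option<String>,
    demoted_topic_cluster: Option<String>,
}

/// Legacy single-topic: the deadletter cluster falls back to the consumed cluster.
fn deadletter_cluster(config: &Config) -> &str {
    config
        .kafka_deadletter_cluster
        .as_deref()
        .unwrap_or(config.kafka_cluster.as_str())
}

/// The forward (demoted topic) cluster falls back to the deadletter cluster,
/// so the batcher and upkeep paths would resolve the same address.
fn forward_cluster(config: &Config) -> &str {
    config
        .demoted_topic_cluster
        .as_deref()
        .unwrap_or_else(|| deadletter_cluster(config))
}

fn main() {
    let mut config = Config {
        kafka_cluster: "kafka-main:9092".into(),
        kafka_deadletter_cluster: None,
        demoted_topic_cluster: None,
    };
    println!("{}", forward_cluster(&config)); // nothing set: consumed cluster
    config.kafka_deadletter_cluster = Some("kafka-dlq:9092".into());
    println!("{}", forward_cluster(&config)); // deadletter cluster wins
    config.demoted_topic_cluster = Some("kafka-demoted:9092".into());
    println!("{}", forward_cluster(&config)); // explicit setting wins
}
```

The middle case is the one the follow-up commit corrects in the docs: once `kafka_deadletter_cluster` is set, the default forward target is no longer the consumed cluster.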

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 2724130.

cursor · 2026-06-03T14:53:13Z

            };
            metrics::counter!(
                "consumer.inflight_activation_writer.backpressure",
+                "topic" => self.config.topic.clone(),


Shared store backpressure race

Medium Severity

With multiple consumable topics, each topic pipeline has its own ActivationWriter on the same store, but flush still decides backpressure from a single count_depths snapshot with no coordination. Two writers can both pass the pending, delay, processing, or DB size checks and insert together, so shared max_* limits can be exceeded under concurrent consumption.

Additional Locations (1)

src/main.rs#L162-L202

this is the kind of thing i'd like to scope into a followup pr. the current way we manage db connections (proportional to the amount of topics we connect to) is fundamentally leading to bad performance. in my mind the fix is to reuse the activationwriter and -batcher across topics, but this seemed like a more invasive change as it would require me to break them out of the consumer-specific pipeline.

reusing those services across topics would lower the amount of db transactions and also fix this race.

evanh

Agree that in the future we should have ActivationBatcher and ActivationWriter be standalone components that are shared across all consumers.
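One way to picture the proposed follow-up is a single writer shared by all consumers, where the depth check and the insert happen under the same lock. The sketch below is an in-memory illustration only: `SharedWriter`, `try_write`, and `max_pending` are invented names, and the real store is a database whose depth check is a query, so the actual fix may look quite different.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical writer shared by every topic's consumer.
struct SharedWriter {
    max_pending: usize,
    pending: Mutex<Vec<String>>,
}

impl SharedWriter {
    fn new(max_pending: usize) -> Self {
        Self { max_pending, pending: Mutex::new(Vec::new()) }
    }

    /// Check the limit and insert under one lock, so two writers cannot
    /// both pass the check against the same stale depth.
    /// On backpressure the batch is handed back to the caller.
    fn try_write(&self, batch: Vec<String>) -> Result<(), Vec<String>> {
        let mut pending = self.pending.lock().unwrap();
        if pending.len() >= self.max_pending {
            return Err(batch);
        }
        pending.extend(batch);
        Ok(())
    }

    fn depth(&self) -> usize {
        self.pending.lock().unwrap().len()
    }
}

/// Each topic tries to write one batch concurrently.
/// Returns (number of accepted batches, final depth).
fn run_writers(topics: usize, batch_size: usize, max_pending: usize) -> (usize, usize) {
    let writer = Arc::new(SharedWriter::new(max_pending));
    let handles: Vec<_> = (0..topics)
        .map(|t| {
            let writer = Arc::clone(&writer);
            thread::spawn(move || {
                let batch: Vec<String> =
                    (0..batch_size).map(|i| format!("topic{t}-{i}")).collect();
                writer.try_write(batch).is_ok()
            })
        })
        .collect();
    let accepted = handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .filter(|ok| *ok)
        .count();
    (accepted, writer.depth())
}

fn main() {
    let (accepted, depth) = run_writers(8, 5, 10);
    println!("accepted {accepted} batches, depth {depth}");
}
```

The limit stays soft in this sketch: a batch that starts below `max_pending` is accepted whole, so the depth can overshoot by at most one batch, but not by one batch per topic.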

evanh · 2026-06-03T15:41:56Z

I agree with this. Also, there isn't a "hard" cap on the DB size, we just want to make sure the DB can't grow unbounded. So this race might let one extra write in, but then things will still backpressure after that.

…116758) See getsentry/taskbroker#667 for details. Same kafka config migration as the self-hosted change, but for the `ingest-profiles` taskbroker in devservices. Deprecation warnings land as per getsentry/taskbroker#663. `ingest-profiles` runs in raw mode, and raw mode now requires an explicit retry topic: raw payloads aren't activations, so retries can't loop back into the `profiles` topic. Retries now go to the main `taskworker` topic so the existing taskbroker picks them up instead of running another broker. Depends on getsentry/taskbroker#668, which wires up per-topic raw mode and enforces the retry-topic requirement. Once that ships, the current devenv config breaks without this change. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

untitaker added 5 commits June 3, 2026 11:50

Minor refactor for config and admin client creation

48b02e3

more plumbing, spawn many consumers

a0e0351

add test

5a9b77c

add test to ci

8d8975e

add deprecation warning for raw mode too

bdb398b

untitaker changed the title ~~multi topic impl~~ feat(taskbroker): Support multiple topics Jun 3, 2026

fix lints

58af659

untitaker mentioned this pull request Jun 3, 2026

ref(taskbroker): Migrate ingest-profiles devenv to new config format getsentry/sentry#116758

Merged

untitaker marked this pull request as ready for review June 3, 2026 11:19

untitaker requested a review from a team as a code owner June 3, 2026 11:20

sentry Bot reviewed Jun 3, 2026

Comment thread src/kafka/deserialize_raw.rs

untitaker mentioned this pull request Jun 3, 2026

feat(taskbroker): Add drain mode #669

Draft

untitaker and others added 2 commits June 3, 2026 16:34

cursor Bot reviewed Jun 3, 2026

Comment thread src/upkeep.rs Outdated

cursor Bot reviewed Jun 3, 2026

Comment thread src/kafka/activation_batcher.rs

cursor Bot reviewed Jun 3, 2026

evanh approved these changes Jun 3, 2026

untitaker merged commit c58a161 into main Jun 3, 2026
26 checks passed

untitaker deleted the multi-topic-impl branch June 3, 2026 15:53

untitaker merged 10 commits into main from multi-topic-impl
