From 7f673a620e0be6f75a716cb3d8b0efbac80bb6d1 Mon Sep 17 00:00:00 2001
From: Reinaldy Rafli
Date: Sat, 22 Nov 2025 11:11:16 +0700
Subject: [PATCH 1/4] docs(self-hosted): provide a detailed example on how to separate ingest taskbroker

---
 develop-docs/self-hosted/tasks.mdx | 51 ++++++++++++++++++++++++++++--
 1 file changed, 49 insertions(+), 2 deletions(-)

diff --git a/develop-docs/self-hosted/tasks.mdx b/develop-docs/self-hosted/tasks.mdx
index c8f10b6cd81f0..14ca79b08a4bf 100644
--- a/develop-docs/self-hosted/tasks.mdx
+++ b/develop-docs/self-hosted/tasks.mdx
@@ -128,14 +128,14 @@ tb-d --> w-d[default Worker]
 To achieve this work separation we need to make a few changes:
 
 1. Provision any additional topics. Topic names need to come from one of the
-   predefined topics in `src/sentry/conf/types/kafka_definition.py`
+   predefined topics in [`src/sentry/conf/types/kafka_definition.py`](https://github.com/getsentry/sentry/blob/master/src/sentry/conf/types/kafka_definition.py)
 2. Deploy the additional broker replicas. You can use the
    `TASKBROKER_KAFKA_TOPIC` environment variable to define the topic a
    taskbroker consumes from.
 3. Deploy additional workers that use the new brokers in their `rpc-host-list`
    CLI flag.
 4. Find the list of namespaces you want to shift to the new topic. The list of
-   task namespaces can be found in the `sentry.taskworker.namespaces` module.
+   task namespaces can be found in the [`sentry.taskworker.namespaces`](https://github.com/getsentry/sentry/blob/master/src/sentry/taskworker/namespaces.py) module.
 5. Update task routing option, defining the namespace -> topic mappings. e.g.
    ```yaml
    # in sentry/config.yml
@@ -143,3 +143,50 @@ To achieve this work separation we need to make a few changes:
      "ingest.errors": "taskworker-ingest"
      "ingest.transactions": "taskworker-ingest"
    ```
+
+### Separate Ingest Workers
+
+Having separate ingest `taskbroker` and `taskworker` is useful for high-throughput
+installation, therefore you can receive timely alerts and not having to wait for
+ingest-related tasks to finish. As an implementation of the above steps,
+you need to add a few new containers on your `docker-compose.override.yml` file:
+
+```yaml
+# Copy `x-sentry_defaults` and `file_healthcheck_defaults` section from
+# `docker-compose.yml` to `docker-compose.override.yml` first. Put it on the
+# top of the file.
+services:
+  taskbroker-ingest:
+    restart: "unless-stopped"
+    image: "$TASKBROKER_IMAGE"
+    environment:
+      TASKBROKER_KAFKA_CLUSTER: "kafka:9092"
+      TASKBROKER_KAFKA_DEADLETTER_CLUSTER: "kafka:9092"
+      TASKBROKER_DB_PATH: "/opt/sqlite/taskbroker-activations.sqlite"
+    volumes:
+      - sentry-taskbroker-ingest:/opt/sqlite
+    depends_on:
+      - kafka
+  taskworker-ingest:
+    <<: *sentry_defaults
+    command: run taskworker --concurrency=4 --rpc-host-list=taskbroker-ingest:50051 --health-check-file-path=/tmp/health.txt
+    healthcheck:
+      <<: *file_healthcheck_defaults
+
+volumes:
+  sentry-taskbroker-ingest: {}
+```
+
+On your `sentry/config.yml` file, you need to append the following to the
+bottom of the file:
+```yaml
+taskworker.route.overrides:
+  "ingest.errors": "taskworker-ingest"
+  "ingest.transactions": "taskworker-ingest"
+  "ingest.profiling": "taskworker-ingest"
+  "ingest.attachments": "taskworker-ingest"
+  "ingest.errors.postprocess": "taskworker-ingest"
+```
+
+Any other tasks that are not defined on the routes override above will be
+handled by the default `taskbroker` and `taskworker` service.
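
Review note on the routing fallback described at the end of PATCH 1/4: the following is a minimal Python sketch of the lookup that the new paragraph implies, not Sentry's actual implementation. The function name `resolve_topic` and the default topic name `taskworker` are assumptions made for illustration; only the override mapping is taken from the patch.

```python
# Hypothetical model of namespace -> topic routing.
# This only illustrates the fallback behaviour the docs describe.

# Assumed name of the default topic; check your installation's settings.
DEFAULT_TOPIC = "taskworker"

# Mapping copied from the `taskworker.route.overrides` example in the patch.
ROUTE_OVERRIDES = {
    "ingest.errors": "taskworker-ingest",
    "ingest.transactions": "taskworker-ingest",
    "ingest.profiling": "taskworker-ingest",
    "ingest.attachments": "taskworker-ingest",
    "ingest.errors.postprocess": "taskworker-ingest",
}


def resolve_topic(namespace, overrides=ROUTE_OVERRIDES, default=DEFAULT_TOPIC):
    """Return the override topic for a namespace, or the default topic."""
    return overrides.get(namespace, default)


if __name__ == "__main__":
    # A namespace listed in the overrides goes to the ingest topic.
    print(resolve_topic("ingest.errors"))
    # A namespace that is not listed falls back to the default topic.
    print(resolve_topic("some.other.namespace"))
```

Under these assumptions, only the five listed namespaces reach `taskbroker-ingest`; every other namespace keeps flowing through the default broker and worker, which is the behaviour the closing paragraph of the patch states.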

From 3e3570571d7493b125ba034bd22a34b31a47feb1 Mon Sep 17 00:00:00 2001
From: Reinaldy Rafli
Date: Sat, 22 Nov 2025 12:14:58 +0700
Subject: [PATCH 2/4] docs(self-hosted): define specific kafka topic to consume to

---
 develop-docs/self-hosted/tasks.mdx | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/develop-docs/self-hosted/tasks.mdx b/develop-docs/self-hosted/tasks.mdx
index 14ca79b08a4bf..058df85e95d3d 100644
--- a/develop-docs/self-hosted/tasks.mdx
+++ b/develop-docs/self-hosted/tasks.mdx
@@ -128,7 +128,9 @@ tb-d --> w-d[default Worker]
 To achieve this work separation we need to make a few changes:
 
 1. Provision any additional topics. Topic names need to come from one of the
-   predefined topics in [`src/sentry/conf/types/kafka_definition.py`](https://github.com/getsentry/sentry/blob/master/src/sentry/conf/types/kafka_definition.py)
+   predefined topics in [`src/sentry/conf/types/kafka_definition.py`](https://github.com/getsentry/sentry/blob/master/src/sentry/conf/types/kafka_definition.py).
+   By default, any topics will automatically be created during `./install.sh`
+   process.
 2. Deploy the additional broker replicas. You can use the
    `TASKBROKER_KAFKA_TOPIC` environment variable to define the topic a
    taskbroker consumes from.
@@ -160,9 +162,11 @@ services:
     restart: "unless-stopped"
     image: "$TASKBROKER_IMAGE"
     environment:
+      TASKBROKER_KAFKA_TOPIC: "taskworker-ingest"
+      TASKBROKER_KAFKA_CONSUMER_GROUP: "taskworker-ingest"
       TASKBROKER_KAFKA_CLUSTER: "kafka:9092"
       TASKBROKER_KAFKA_DEADLETTER_CLUSTER: "kafka:9092"
-      TASKBROKER_DB_PATH: "/opt/sqlite/taskbroker-activations.sqlite"
+      TASKBROKER_DB_PATH: "/opt/sqlite/taskbroker-activations-ingest.sqlite"
     volumes:
       - sentry-taskbroker-ingest:/opt/sqlite
     depends_on:

From 469d6bc573fccb7cb219c2f7c7bd73df0cdf0834 Mon Sep 17 00:00:00 2001
From: Reinaldy Rafli
Date: Sat, 22 Nov 2025 12:15:43 +0700
Subject: [PATCH 3/4] chore: grammar

---
 develop-docs/self-hosted/tasks.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/develop-docs/self-hosted/tasks.mdx b/develop-docs/self-hosted/tasks.mdx
index 058df85e95d3d..134a50f09025d 100644
--- a/develop-docs/self-hosted/tasks.mdx
+++ b/develop-docs/self-hosted/tasks.mdx
@@ -149,7 +149,7 @@ To achieve this work separation we need to make a few changes:
 ### Separate Ingest Workers
 
 Having separate ingest `taskbroker` and `taskworker` is useful for high-throughput
-installation, therefore you can receive timely alerts and not having to wait for
+installation, therefore you can receive timely alerts and not have to wait for
 ingest-related tasks to finish. As an implementation of the above steps,
 you need to add a few new containers on your `docker-compose.override.yml` file:
 

From 6b28f68c884a3cce4240275a03f8761c4dd6ce30 Mon Sep 17 00:00:00 2001
From: Reinaldy Rafli
Date: Tue, 25 Nov 2025 07:38:02 +0700
Subject: [PATCH 4/4] Update develop-docs/self-hosted/tasks.mdx

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 develop-docs/self-hosted/tasks.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/develop-docs/self-hosted/tasks.mdx b/develop-docs/self-hosted/tasks.mdx
index 134a50f09025d..4e31c390968db 100644
--- a/develop-docs/self-hosted/tasks.mdx
+++ b/develop-docs/self-hosted/tasks.mdx
@@ -149,7 +149,7 @@ To achieve this work separation we need to make a few changes:
 ### Separate Ingest Workers
 
 Having separate ingest `taskbroker` and `taskworker` is useful for high-throughput
-installation, therefore you can receive timely alerts and not have to wait for
+installations, therefore you can receive timely alerts and not have to wait for
 ingest-related tasks to finish. As an implementation of the above steps,
 you need to add a few new containers on your `docker-compose.override.yml` file:
 