fix(ingestion): pre-create combined consumer output topics in kafka-init #60883

Merged
pl merged 1 commit into master from pl/ingestion/kafka-init-missing-topics
Jun 1, 2026

Conversation

@pl pl commented Jun 1, 2026

Problem

On a fresh dev or hobby stack, the combined analytics ingestion consumer (INGESTION-V2-COMBINED) fails its startup health check with repeated errors like:

🔴 Topic check failed for "DEFAULT" topic "clickhouse_groups"
🔴 Topic check failed for "DEFAULT" topic "clickhouse_person_distinct_id"
🔴 Topic check failed for "DEFAULT" topic "log_entries"
🔴 Topic check failed for "DEFAULT" topic "clickhouse_tophog"

The consumer checks topic existence at startup for every output it produces to (the outputs are declared in nodejs/src/ingestion/analytics/outputs/registry.ts and the check runs in nodejs/src/ingestion/outputs/single-ingestion-output.ts). Redpanda's auto_create_topics_enabled only creates a topic when a producer writes to it first, so output topics that no other service has written to yet don't exist when the check runs, and startup fails.
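
The failure mode described above can be sketched as a set difference between the topics the consumer requires and the topics the broker currently has. This is a minimal illustration only: `OutputTopic`, `findMissingTopics`, and `formatFailures` are hypothetical names, not the code in single-ingestion-output.ts, and the topic lists are a small sample.

```typescript
// Hypothetical sketch; these names are illustrative and not taken from the PostHog codebase.
interface OutputTopic {
    producer: string // producer name as it appears in the log line, e.g. "DEFAULT"
    topic: string
}

// Returns every required output whose topic is absent from the broker's topic list.
function findMissingTopics(required: OutputTopic[], existing: ReadonlySet<string>): OutputTopic[] {
    return required.filter((output) => !existing.has(output.topic))
}

// Formats failures in the same shape as the log lines quoted in the problem description.
function formatFailures(missing: OutputTopic[]): string[] {
    return missing.map((m) => `Topic check failed for "${m.producer}" topic "${m.topic}"`)
}

// Sample of required outputs (assumption: a subset chosen for illustration).
const requiredOutputs: OutputTopic[] = [
    { producer: 'DEFAULT', topic: 'clickhouse_events_json' },
    { producer: 'DEFAULT', topic: 'clickhouse_groups' },
    { producer: 'DEFAULT', topic: 'log_entries' },
]

// On a fresh stack only the topics something has already created exist.
const existingOnFreshStack = new Set<string>(['clickhouse_events_json'])

const missingOutputs = findMissingTopics(requiredOutputs, existingOnFreshStack)
for (const line of formatFailures(missingOutputs)) {
    console.log(line)
}
```

Any non-empty result corresponds to a failed startup check, which is why the outcome depends on which producers happened to write first.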

The kafka-init services pre-created only a single topic each (document_embeddings_input in dev, clickhouse_events_json in hobby), which is why most topics happened to exist (created by other producers) but these four did not.

Changes

Expanded the kafka-init pre-create loop in both docker-compose.dev.yml and docker-compose.hobby.yml to cover the full set of output topics the combined consumer checks at startup:

  • clickhouse_events_json, clickhouse_ai_events_json, clickhouse_heatmap_events
  • clickhouse_ingestion_warnings, clickhouse_app_metrics2
  • events_plugin_ingestion_dlq / _overflow / _async
  • clickhouse_groups, clickhouse_person, clickhouse_person_distinct_id
  • log_entries, clickhouse_tophog

The loop is idempotent — existing topics are skipped — so pre-creating topics that other producers normally create first is harmless.
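
The idempotence property can be modelled in a few lines. The actual loop is a shell block inside the compose files and is not reproduced here; the sketch below uses a hypothetical in-memory `FakeBroker` and `ensureTopics` helper purely to show why re-running the loop, or running it after another producer has already created a topic, changes nothing.

```typescript
// Hypothetical in-memory stand-in for a broker's topic list; not a real Kafka or Redpanda client.
class FakeBroker {
    private topics = new Set<string>()
    createCalls = 0

    has(topic: string): boolean {
        return this.topics.has(topic)
    }

    create(topic: string): void {
        this.createCalls += 1
        this.topics.add(topic)
    }
}

// Creates only the topics that do not exist yet and reports which ones it created.
function ensureTopics(broker: FakeBroker, wanted: string[]): string[] {
    const created: string[] = []
    for (const topic of wanted) {
        if (broker.has(topic)) {
            continue // existing topics are skipped
        }
        broker.create(topic)
        created.push(topic)
    }
    return created
}

const broker = new FakeBroker()
broker.create('clickhouse_events_json') // simulate a topic another producer created first

const wantedTopics = ['clickhouse_events_json', 'clickhouse_groups', 'log_entries']
const firstRun = ensureTopics(broker, wantedTopics)
const secondRun = ensureTopics(broker, wantedTopics)

console.log(firstRun)
console.log(secondRun)
```

The second run creates nothing, so listing a topic that some other service would have created anyway costs nothing beyond the existence check.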

How did you test this code?

I'm an agent. I validated that both compose files parse with docker compose -f <file> config after the change, and confirmed that the YAML block-scalar shell loop survived hogli format:yaml. I did not spin up a full stack to observe the consumer starting cleanly.

🤖 Agent context

Authored with Claude Code (Opus 4.8). Traced the failing log line to single-ingestion-output.ts, enumerated the required output topics from analytics/outputs/registry.ts and their default values in nodejs/src/ingestion/config.ts, then mapped them to literal topic names via nodejs/src/config/kafka-topics.ts (no prefix/suffix in dev/hobby). Chose to pre-create the full output set rather than only the four observed failures, since any of them can be missing depending on fresh-environment ordering.
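
The mapping step mentioned above can be illustrated as follows. This sketch assumes a literal topic name is composed as prefix + base name + suffix; that composition rule and the `literalTopicName` helper are assumptions for illustration, not a copy of nodejs/src/config/kafka-topics.ts. With an empty prefix and suffix, as stated for dev and hobby, the literal name equals the base name.

```typescript
// Hypothetical helper; the prefix + base + suffix rule is an assumption made for illustration.
function literalTopicName(base: string, prefix: string = '', suffix: string = ''): string {
    return `${prefix}${base}${suffix}`
}

// Base names taken from the topic list in this PR.
const baseNames = ['clickhouse_groups', 'clickhouse_person_distinct_id', 'log_entries', 'clickhouse_tophog']

// With no prefix or suffix the names to pre-create are the base names themselves.
const namesToPreCreate = baseNames.map((name) => literalTopicName(name))
console.log(namesToPreCreate)
```

In an environment that did set a prefix or suffix, the pre-create list would have to be derived the same way, otherwise the check would look for names the init step never created.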

The combined analytics ingestion consumer checks topic existence at
startup for every output it produces to. Redpanda auto-creates topics
only on producer-first writes, so on a fresh stack the outputs no other
service writes to first (clickhouse_groups, clickhouse_person_distinct_id,
log_entries, clickhouse_tophog, ...) are missing and the check fails.

Pre-create the full output set in kafka-init for both dev and hobby.
@pl pl requested a review from a team June 1, 2026 12:09
greptile-apps Bot commented Jun 1, 2026

Reviews (1): Last reviewed commit: "fix(ingestion): pre-create combined cons..."

github-actions Bot commented Jun 1, 2026

🎭 Playwright report · View test results →

⚠️ 1 flaky test:

  • Save an insight, make changes, discard them, and save a copy (chromium)

These issues are not necessarily caused by your changes.

github-actions Bot commented Jun 1, 2026

❌ Hobby deploy smoke test: FAILED

Failing fast because: Container health check failed

Unhealthy containers:

  • worker: restarted 30x

Run 26753991975 | Consecutive failures: 1

@pl pl merged commit 7e8d6c7 into master Jun 1, 2026
203 of 204 checks passed
@pl pl deleted the pl/ingestion/kafka-init-missing-topics branch June 1, 2026 13:16
deployment-status-posthog Bot commented Jun 1, 2026

Deploy status

| Environment | Status | Deployed At | Workflow |
|---|---|---|---|
| dev | ✅ Deployed | 2026-06-01 13:39 UTC | Run |
| prod-us | ✅ Deployed | 2026-06-01 13:57 UTC | Run |
| prod-eu | ✅ Deployed | 2026-06-01 14:01 UTC | Run |
