
clickhouse user sync #1159

Merged
BilalG1 merged 95 commits into dev from external-db-sync-clickhouse-default
Feb 13, 2026

Conversation

Collaborator

@BilalG1 BilalG1 commented Feb 4, 2026

Summary by CodeRabbit

  • New Features

    • Real-time AI search with project-scoped analytics and dynamic query execution; streaming AI responses replace the placeholder flow.
    • External DB sync adds ClickHouse support: users sync, sync metadata tracking, tenancy-aware status, and per-mapping throttling.
    • AI assistant UI shows expandable tool-invocation results and streams via the real AI pipeline.
  • Chores

    • Dashboard dependencies and workspace exclusions updated; development OpenAI env var added; editor config flag toggled.
  • Tests

    • E2E coverage extended to validate ClickHouse user sync and analytics queries.

Comment thread apps/backend/scripts/clickhouse-migrations.ts Outdated
Comment thread apps/backend/scripts/clickhouse-migrations.ts
Comment thread apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-basics.test.ts Outdated
Comment thread packages/stack-shared/src/config/schema.ts
Comment thread apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-basics.test.ts Outdated
@github-actions github-actions Bot assigned BilalG1 and unassigned N2D4 Feb 10, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@packages/stack-shared/src/config/db-sync-mappings.ts`:
- Around lines 49-51: ReplacingMergeTree only deduplicates rows that sit in the
same partition. With PARTITION BY toYYYYMM(signed_up_at), a deletion marker
(keyed on DeletedRow.deletedAt) can land in a different month than the original
row (keyed on createdAt), so the marker never replaces the live row. Fix the
table definition in db-sync-mappings.ts that contains
ReplacingMergeTree(sync_sequence_id) and PARTITION BY toYYYYMM(signed_up_at)
using one of these options:
  - remove the PARTITION BY clause so ReplacingMergeTree(sync_sequence_id)
    deduplicates across the whole table;
  - partition by a stable expression of the id (e.g., PARTITION BY
    sipHash64(id) % N) so live and deleted rows for the same id land in the
    same partition;
  - change the DeletedRow production so it stores the original signed_up_at
    (instead of deletedAt), so the deletion marker uses the same partitioning
    key as the live row.
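
The failure mode can be illustrated with a toy TypeScript model. This is not ClickHouse code: `mergeParts` and the `Row` shape are hypothetical and only imitate the rule that rows sharing a sorting key are collapsed within one partition, keeping the highest `sync_sequence_id`. The dates are made up for illustration.

```typescript
// Toy model of ReplacingMergeTree merging (an illustration, not the real engine):
// rows with the same id collapse only when they share a partition.
type Row = {
  id: string;
  signedUpAt: Date;       // stands in for the signed_up_at column
  syncSequenceId: number; // stands in for the version column sync_sequence_id
  syncIsDeleted: boolean; // stands in for sync_is_deleted
};

// Same idea as ClickHouse's toYYYYMM: 2026-01-15 -> 202601.
function toYYYYMM(d: Date): number {
  return d.getUTCFullYear() * 100 + (d.getUTCMonth() + 1);
}

// Keep the highest-version row per (partition, id) pair.
function mergeParts(rows: Row[], partitionOf: (row: Row) => string): Row[] {
  const latest = new Map<string, Row>();
  for (const row of rows) {
    const key = `${partitionOf(row)}|${row.id}`;
    const current = latest.get(key);
    if (current === undefined || row.syncSequenceId > current.syncSequenceId) {
      latest.set(key, row);
    }
  }
  return Array.from(latest.values());
}

// Live row written at sign-up time (hypothetical dates).
const liveRow: Row = {
  id: "u1",
  signedUpAt: new Date("2026-01-15T00:00:00Z"),
  syncSequenceId: 1,
  syncIsDeleted: false,
};
// Deletion marker whose signed_up_at carries the deletion time instead.
const deletionMarker: Row = {
  id: "u1",
  signedUpAt: new Date("2026-02-03T00:00:00Z"),
  syncSequenceId: 2,
  syncIsDeleted: true,
};

// Monthly partitions: the two rows never meet, so both survive.
const byMonth = mergeParts([liveRow, deletionMarker], (r) => String(toYYYYMM(r.signedUpAt)));
// Single partition: the marker replaces the live row.
const unpartitioned = mergeParts([liveRow, deletionMarker], () => "all");

console.log(byMonth.length, unpartitioned.length);
```

In this model, any of the three fixes above works for the same reason: each one makes `partitionOf` return the same value for the live row and its deletion marker.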
🧹 Nitpick comments (3)
apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-basics.test.ts (1)

26-75: Consider extracting a shared polling helper to reduce duplication.

waitForClickhouseUser and waitForClickhouseUserDeletion share nearly identical polling structure (timeout, interval, loop, error). A small generic helper (similar to the existing waitForCondition imported from utils) would DRY this up:

♻️ Sketch
+async function waitForClickhouseCondition(
+  email: string,
+  predicate: (response: any) => boolean, // any: response shape from niceBackendFetch is untyped
+  label: string,
+) {
+  const timeoutMs = 120_000;
+  const intervalMs = 500;
+  const start = performance.now();
+
+  while (performance.now() - start < timeoutMs) {
+    const response = await runQueryForCurrentProject({
+      query: "SELECT primary_email, display_name FROM users WHERE primary_email = {email:String}",
+      params: { email },
+    });
+    if (predicate(response)) return response;
+    await wait(intervalMs);
+  }
+  throw new StackAssertionError(`Timed out waiting for ClickHouse ${label} for ${email}.`);
+}
apps/backend/src/lib/external-db-sync.ts (2)

328-330: dbType parameter is unused — function always returns the Postgres query.

getInternalDbFetchQuery accepts a dbType argument but ignores it, always returning mapping.internalDbFetchQuery. Since the ClickHouse path bypasses this function entirely (line 607 accesses mapping.internalDbFetchQueries.clickhouse directly), consider either removing the dbType parameter or centralizing the fetch-query selection here for both backends.
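
A minimal sketch of the centralized variant, assuming the mapping keeps the existing `internalDbFetchQuery` string alongside an optional per-backend `internalDbFetchQueries` record. The `DbType` union and `SyncMapping` type here are hypothetical stand-ins; the real types in external-db-sync.ts may differ.

```typescript
// Hypothetical types for illustration; not the actual definitions in the repo.
type DbType = "postgres" | "clickhouse";

type SyncMapping = {
  internalDbFetchQuery: string; // the Postgres query returned today
  internalDbFetchQueries?: Partial<Record<DbType, string>>; // per-backend overrides
};

// Single place that decides which fetch query a backend uses.
function getInternalDbFetchQuery(mapping: SyncMapping, dbType: DbType): string {
  const specific = mapping.internalDbFetchQueries?.[dbType];
  if (specific !== undefined) return specific;
  // Fall back to the legacy field only for Postgres, so a missing
  // ClickHouse query fails loudly instead of running the wrong SQL.
  if (dbType === "postgres") return mapping.internalDbFetchQuery;
  throw new Error(`No internal DB fetch query configured for ${dbType}`);
}

// Usage with placeholder query strings.
const usersMapping: SyncMapping = {
  internalDbFetchQuery: "SELECT ... /* postgres */",
  internalDbFetchQueries: { clickhouse: "SELECT ... /* clickhouse */" },
};
console.log(getInternalDbFetchQuery(usersMapping, "clickhouse"));
```

With this shape, the ClickHouse path could call the same function instead of reading `mapping.internalDbFetchQueries.clickhouse` directly, and `dbType` would no longer be dead.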


430-437: Hardcoded boolean column names couple this function to the "users" mapping.

Lines 433–436 normalize specific columns (primary_email_verified, is_anonymous, restricted_by_admin, sync_is_deleted) by name, which will silently skip normalization for any future mapping with different boolean fields, or break if column names change. Consider driving this from the mapping config (e.g., a list of boolean columns per mapping) so pushRowsToClickhouse stays generic.
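
One possible shape for a config-driven version, as a rough sketch. The `booleanColumns` field, the `ClickhouseMappingConfig` type, and the 0/1 target encoding are assumptions made for illustration; the actual normalization in pushRowsToClickhouse may accept different inputs or produce a different representation.

```typescript
// Hypothetical per-mapping config: each mapping lists its own boolean columns.
type ClickhouseMappingConfig = {
  booleanColumns: readonly string[];
};

// Assumed encoding: truthy-looking values become 1, everything else 0.
function toUInt8(value: unknown): 0 | 1 {
  return value === true || value === 1 || value === "true" || value === "t" ? 1 : 0;
}

// Normalizes only the columns named by the mapping; other fields pass through.
function normalizeBooleanColumns(
  row: Record<string, unknown>,
  mapping: ClickhouseMappingConfig,
): Record<string, unknown> {
  const normalized: Record<string, unknown> = { ...row };
  for (const column of mapping.booleanColumns) {
    // Leave absent and null values alone so nullable columns keep their meaning.
    if (column in normalized && normalized[column] !== null && normalized[column] !== undefined) {
      normalized[column] = toUInt8(normalized[column]);
    }
  }
  return normalized;
}

// Usage: the "users" mapping declares the columns that are hardcoded today.
const usersClickhouseConfig: ClickhouseMappingConfig = {
  booleanColumns: ["primary_email_verified", "is_anonymous", "restricted_by_admin", "sync_is_deleted"],
};
console.log(
  normalizeBooleanColumns({ primary_email: "a@example.com", is_anonymous: false, sync_is_deleted: true }, usersClickhouseConfig),
);
```

A future mapping would then only need to declare its own `booleanColumns` list rather than require a change to the push function.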

Comment thread packages/stack-shared/src/config/db-sync-mappings.ts
Comment thread packages/stack-shared/src/config/db-sync-mappings.ts
Comment thread apps/backend/src/lib/tokens.tsx
@BilalG1 BilalG1 merged commit d09a180 into dev Feb 13, 2026
22 of 26 checks passed
@BilalG1 BilalG1 deleted the external-db-sync-clickhouse-default branch February 13, 2026 00:52


3 participants