-
-
Notifications
You must be signed in to change notification settings - Fork 899
fix(clickhouse): partition by insertion date to prevent "Too many parts" errors when partitioning by start time #2719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ts" errors when partitioning by start time
|
WalkthroughThis pull request introduces support for a new ClickHouse event repository variant (clickhouse_v2) alongside the existing v1. Changes include: adding "clickhouse_v2" as a valid environment variable and feature flag option; implementing a versioning mechanism in ClickhouseEventRepository with conditional routing to v1/v2 query builders and insert paths; creating new SQL schema definitions for task_events_v2 table with associated materialized views; adding V2 query builders and insert functions to the ClickHouse package; wiring a new clickhouseEventRepositoryV2 singleton; and updating routing logic across the event repository, feature flag, and OTLP exporter modules to support both versions. Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
apps/webapp/app/v3/otlpExporter.server.ts (1)
893-903: Useenvfromenv.server.tsinstead ofprocess.env.Lines 898-901 access
process.envdirectly, which violates the coding guidelines for the webapp. All environment variables should be accessed through theenvexport fromenv.server.ts.As per coding guidelines, apply this diff:
function initializeOTLPExporter() { return new OTLPExporter( eventRepository, clickhouseEventRepository, clickhouseEventRepositoryV2, - process.env.OTLP_EXPORTER_VERBOSE === "1", - process.env.SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT - ? parseInt(process.env.SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT, 10) + env.OTLP_EXPORTER_VERBOSE === "1", + env.SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT + ? parseInt(env.SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT, 10) : 8192 ); }Note: Ensure
OTLP_EXPORTER_VERBOSEandSERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMITare defined in theEnvironmentSchemainenv.server.ts.apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts (1)
184-186:insertManybypasses V2 partition strategy; always creates V1 inputs regardless of version.
insertManyalways callscreateEventToTaskEventV1Input, which returnsTaskEventV1Input[]withoutinserted_at. Whenthis._version === "v2", these V1 inputs are flushed viataskEventsV2.insert, causing events to be partitioned bynow64(3)(insertion time) instead of event-specific timestamps.The V2 schema partitions by
inserted_atto handle late-arriving events correctly, butinsertManydoesn't leverage the optionalinserted_atfield inTaskEventV2Input. Events sent hours or days after occurrence will be misplaced in today's partition instead of their correct event date partition, negating the V2 design benefit.
insertManyshould create version-aware inputs (V1 for"v1"version, V2 for"v2"version) with explicitinserted_attimestamps for V2 events.
🧹 Nitpick comments (2)
internal-packages/clickhouse/schema/010_add_task_events_v2.sql (1)
49-49: Consider ORDER BY precision for start_time queries.
toUnixTimestamp(start_time)reduces DateTime64(9) nanosecond precision to seconds. This may affect query efficiency for narrow time-range filters since ClickHouse relies on ORDER BY for sparse index granule skipping.If your typical queries filter by
start_timewith sub-second precision, consider keeping full precision in the sort key or validating that query patterns are unaffected.apps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.ts (1)
39-70: Reduce code duplication between v1 and v2 initializers.Both
initializeClickhouseRepositoryandinitializeClickhouseRepositoryV2contain nearly identical code (URL parsing, password redaction, console logging). SincegetClickhouseClientalready validatesEVENTS_CLICKHOUSE_URL, the duplicate checks and URL handling in both initializers are redundant.Consider extracting the shared logic:
+function createClickhouseRepository(version: "v1" | "v2") { + if (!env.EVENTS_CLICKHOUSE_URL) { + throw new Error("EVENTS_CLICKHOUSE_URL is not set"); + } + + const url = new URL(env.EVENTS_CLICKHOUSE_URL); + url.searchParams.delete("secure"); + + const safeUrl = new URL(url.toString()); + safeUrl.password = "redacted"; + + console.log(`🗃️ Initializing Clickhouse event repository (${version})`, { url: safeUrl.toString() }); + + const clickhouse = getClickhouseClient(); + + return new ClickhouseEventRepository({ + clickhouse, + batchSize: env.EVENTS_CLICKHOUSE_BATCH_SIZE, + flushInterval: env.EVENTS_CLICKHOUSE_FLUSH_INTERVAL_MS, + maximumTraceSummaryViewCount: env.EVENTS_CLICKHOUSE_MAX_TRACE_SUMMARY_VIEW_COUNT, + maximumTraceDetailedSummaryViewCount: env.EVENTS_CLICKHOUSE_MAX_TRACE_DETAILED_SUMMARY_VIEW_COUNT, + maximumLiveReloadingSetting: env.EVENTS_CLICKHOUSE_MAX_LIVE_RELOADING_SETTING, + insertStrategy: env.EVENTS_CLICKHOUSE_INSERT_STRATEGY, + waitForAsyncInsert: env.EVENTS_CLICKHOUSE_WAIT_FOR_ASYNC_INSERT === "1", + asyncInsertMaxDataSize: env.EVENTS_CLICKHOUSE_ASYNC_INSERT_MAX_DATA_SIZE, + asyncInsertBusyTimeoutMs: env.EVENTS_CLICKHOUSE_ASYNC_INSERT_BUSY_TIMEOUT_MS, + version, + }); +} + -function initializeClickhouseRepository() { ... } -function initializeClickhouseRepositoryV2() { ... } +function initializeClickhouseRepository() { + return createClickhouseRepository("v1"); +} + +function initializeClickhouseRepositoryV2() { + return createClickhouseRepository("v2"); +}Also applies to: 72-103
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
apps/webapp/app/env.server.ts(1 hunks)apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts(14 hunks)apps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.ts(3 hunks)apps/webapp/app/v3/eventRepository/index.server.ts(7 hunks)apps/webapp/app/v3/featureFlags.server.ts(1 hunks)apps/webapp/app/v3/otlpExporter.server.ts(4 hunks)internal-packages/clickhouse/schema/010_add_task_events_v2.sql(1 hunks)internal-packages/clickhouse/schema/011_add_task_events_v2_table_mvs.sql(1 hunks)internal-packages/clickhouse/src/index.ts(2 hunks)internal-packages/clickhouse/src/taskEvents.ts(1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
Files:
apps/webapp/app/env.server.tsinternal-packages/clickhouse/src/taskEvents.tsapps/webapp/app/v3/featureFlags.server.tsapps/webapp/app/v3/eventRepository/index.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.tsapps/webapp/app/v3/otlpExporter.server.tsinternal-packages/clickhouse/src/index.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use zod for validation in packages/core and apps/webapp
Files:
apps/webapp/app/env.server.tsapps/webapp/app/v3/featureFlags.server.tsapps/webapp/app/v3/eventRepository/index.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.tsapps/webapp/app/v3/otlpExporter.server.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Files:
apps/webapp/app/env.server.tsinternal-packages/clickhouse/src/taskEvents.tsapps/webapp/app/v3/featureFlags.server.tsapps/webapp/app/v3/eventRepository/index.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.tsapps/webapp/app/v3/otlpExporter.server.tsinternal-packages/clickhouse/src/index.ts
apps/webapp/app/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Access all environment variables through the
envexport ofenv.server.tsinstead of directly accessingprocess.envin the Trigger.dev webapp
Files:
apps/webapp/app/env.server.tsapps/webapp/app/v3/featureFlags.server.tsapps/webapp/app/v3/eventRepository/index.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.tsapps/webapp/app/v3/otlpExporter.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
apps/webapp/**/*.{ts,tsx}: When importing from@trigger.dev/corein the webapp, use subpath exports from the package.json instead of importing from the root path
Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp
Files:
apps/webapp/app/env.server.tsapps/webapp/app/v3/featureFlags.server.tsapps/webapp/app/v3/eventRepository/index.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.tsapps/webapp/app/v3/otlpExporter.server.ts
**/*.{js,ts,jsx,tsx,json,md,css,scss}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier
Files:
apps/webapp/app/env.server.tsinternal-packages/clickhouse/src/taskEvents.tsapps/webapp/app/v3/featureFlags.server.tsapps/webapp/app/v3/eventRepository/index.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.tsapps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.tsapps/webapp/app/v3/otlpExporter.server.tsinternal-packages/clickhouse/src/index.ts
🧠 Learnings (16)
📚 Learning: 2025-11-27T16:26:58.652Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.652Z
Learning: Applies to apps/webapp/app/**/*.{ts,tsx} : Access all environment variables through the `env` export of `env.server.ts` instead of directly accessing `process.env` in the Trigger.dev webapp
Applied to files:
apps/webapp/app/env.server.ts
📚 Learning: 2025-11-27T16:26:58.652Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-11-27T16:26:58.652Z
Learning: Applies to apps/webapp/**/*.{ts,tsx} : Follow the Remix 2.1.0 and Express server conventions when updating the main trigger.dev webapp
Applied to files:
apps/webapp/app/env.server.ts
📚 Learning: 2025-08-14T18:35:44.370Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2390
File: apps/webapp/app/env.server.ts:764-765
Timestamp: 2025-08-14T18:35:44.370Z
Learning: The BoolEnv helper in apps/webapp/app/utils/boolEnv.ts uses z.preprocess with inconsistent default value types across the codebase - some usages pass boolean defaults (correct) while others pass string defaults (incorrect), leading to type confusion. The helper should enforce boolean-only defaults or have clearer documentation.
Applied to files:
apps/webapp/app/env.server.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Export tasks with unique IDs within the project to enable proper task discovery and execution
Applied to files:
internal-packages/clickhouse/src/taskEvents.tsinternal-packages/clickhouse/src/index.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `schedules.task()` for scheduled/cron tasks instead of regular `task()`
Applied to files:
internal-packages/clickhouse/src/taskEvents.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `schemaTask()` from `trigger.dev/sdk/v3` with Zod schema for payload validation
Applied to files:
internal-packages/clickhouse/src/taskEvents.tsinternal-packages/clickhouse/src/index.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `.withStreams()` to subscribe to realtime streams from task metadata in addition to run changes
Applied to files:
internal-packages/clickhouse/src/taskEvents.tsinternal-packages/clickhouse/src/index.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use metadata methods (set, del, replace, append, remove, increment, decrement, stream, flush) to update metadata during task execution
Applied to files:
internal-packages/clickhouse/src/taskEvents.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `tasks.trigger()` with type-only imports to trigger tasks from backend code without importing the task implementation
Applied to files:
internal-packages/clickhouse/src/taskEvents.tsinternal-packages/clickhouse/src/index.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use the `task()` function from `trigger.dev/sdk/v3` to define tasks with id and run properties
Applied to files:
internal-packages/clickhouse/src/taskEvents.tsinternal-packages/clickhouse/src/index.ts
📚 Learning: 2025-07-12T18:06:04.133Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2264
File: apps/webapp/app/services/runsRepository.server.ts:172-174
Timestamp: 2025-07-12T18:06:04.133Z
Learning: In apps/webapp/app/services/runsRepository.server.ts, the in-memory status filtering after fetching runs from Prisma is intentionally used as a workaround for ClickHouse data delays. This approach is acceptable because the result set is limited to a maximum of 100 runs due to pagination, making the performance impact negligible.
Applied to files:
apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use logger methods (debug, log, info, warn, error) from `trigger.dev/sdk/v3` for structured logging in tasks
Applied to files:
internal-packages/clickhouse/src/index.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `trigger.dev/sdk/v3` for all imports in Trigger.dev tasks
Applied to files:
internal-packages/clickhouse/src/index.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Attach metadata to task runs using the metadata option when triggering, and access/update it inside runs using metadata functions
Applied to files:
internal-packages/clickhouse/src/index.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use `yourTask.batchTrigger()` to trigger multiple runs of a task from inside another task
Applied to files:
internal-packages/clickhouse/src/index.ts
📚 Learning: 2025-11-27T16:27:35.290Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.290Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Subscribe to run updates using `runs.subscribeToRun()` for realtime monitoring of task execution
Applied to files:
internal-packages/clickhouse/src/index.ts
🧬 Code graph analysis (5)
internal-packages/clickhouse/src/taskEvents.ts (1)
internal-packages/clickhouse/src/client/types.ts (2)
ClickhouseWriter(145-160)ClickhouseReader(29-135)
apps/webapp/app/v3/eventRepository/index.server.ts (4)
apps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.ts (2)
clickhouseEventRepositoryV2(11-14)clickhouseEventRepository(6-9)apps/webapp/app/v3/eventRepository/eventRepository.server.ts (1)
eventRepository(1476-1476)apps/webapp/app/v3/taskEventStore.server.ts (1)
getTaskEventStore(53-55)apps/webapp/app/v3/featureFlags.server.ts (2)
flags(81-81)FEATURE_FLAG(4-8)
apps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.ts (2)
apps/webapp/app/env.server.ts (1)
env(1246-1246)internal-packages/clickhouse/src/index.ts (1)
ClickHouse(66-187)
apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts (1)
internal-packages/clickhouse/src/taskEvents.ts (4)
TaskEventV1Input(5-22)TaskEventV1Input(24-24)TaskEventV2Input(139-158)TaskEventV2Input(160-160)
internal-packages/clickhouse/src/index.ts (1)
internal-packages/clickhouse/src/taskEvents.ts (4)
insertTaskEventsV2(162-174)getTraceSummaryQueryBuilderV2(176-196)getTraceDetailedSummaryQueryBuilderV2(198-219)getSpanDetailsQueryBuilderV2(221-232)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (8)
internal-packages/clickhouse/schema/011_add_task_events_v2_table_mvs.sql (1)
1-18: LGTM!The materialized view correctly aggregates v2 events into the shared v1 usage table. Using
start_time(event time) for bucketing rather thaninserted_atis the right choice—usage should be attributed to when events actually occurred.apps/webapp/app/v3/featureFlags.server.ts (1)
13-13: LGTM!The feature flag enum correctly includes the new
clickhouse_v2option, maintaining consistency with the existing values.apps/webapp/app/env.server.ts (1)
1153-1153: LGTM!The environment variable enum correctly mirrors the feature flag expansion, enabling configuration of
clickhouse_v2as the default store.internal-packages/clickhouse/src/index.ts (1)
179-186: LGTM!The
taskEventsV2getter follows the same pattern astaskEvents, cleanly exposing the V2 insert and query builder functions.internal-packages/clickhouse/schema/010_add_task_events_v2.sql (1)
1-51: Well-designed schema for preventing "Too many parts" errors.Partitioning by
inserted_atinstead ofstart_timeis the correct approach for handling late-arriving events, as it ensures all events inserted on the same day go to the same partition regardless of their actual event time.internal-packages/clickhouse/src/taskEvents.ts (1)
135-232: LGTM!The V2 table functions are correctly implemented:
TaskEventV2Inputappropriately extends V1 with the optionalinserted_atfield- Query builders correctly target
trigger_dev.task_events_v2table- Reusing V1 result types (
TaskEventSummaryV1Result, etc.) is appropriate since the output schema is identicalapps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts (1)
1001-1007: Good: Partition pruning for V2 queries.The
inserted_atfilter enables efficient partition pruning for the V2 table. The comment explaining that there's no upper bound is helpful for future maintainers.apps/webapp/app/v3/eventRepository/index.server.ts (1)
78-110: The code is type-safe and correct; no changes needed.The
flags()function uses overloaded signatures where providing adefaultValueparameter (as done here) guarantees a non-optional return type ofPromise<z.infer<typeof FeatureFlagCatalog[T]>>. For thetaskEventRepositoryflag, this resolves toPromise<"clickhouse" | "clickhouse_v2" | "postgres">, which exactly matches the function's return type. ThedefaultValueitself (env.EVENT_REPOSITORY_DEFAULT_STORE) is constrained to the same enum values via Zod validation, and theflags()function further validates all values viaflagSchema.safeParse(). Therefore, the variableflagat line 109 is guaranteed to be one of the three expected strings, making the return type-safe.
Because of the partitioning setup on
task_events_v1it is possible to run into Too many parts clickhouse errors because clickhouse isn't able to merge parts fast enough, caused by many very small parts being created. Because a part cannot span a partition, when we write out task_events_v1 records with "late arriving" start times, we would end up creating a ton of parts with only a few records each.The problem stems from using the start time as the partition key and the way start time works. We create two event records per span, with the first event representing the "start" of the span, and the second event representing the end of the span. Both events have the same start time. But spans can and very frequently have very large durations. So we'd receive the closing end span way after the start time, causing it to land in a different partition and creating the very small parts.
This change keeps almost everything the same about the task_events_v1 table, other than add a new column for "inserted at" and using that new column for the partition key. This way we won't have any late arriving partitions being created, and batch inserts will only create a single part.
We use the same mechanism to migrate new runs to using this new table as we did for migrating between postgresql and task_events_v1 with a new
taskEventStorevalue ofclickhouse_v2.One other nice thing is we are reusing the same "rollup" tables for aggregating event counts for both the task_events_v1 and task_events_v2 tables.