feat: entity-graph pipeline — neural rewrite, entity-oriented schema, backfill gap tracking #454
Conversation
Track per-installation, per-entity-type backfill run history with a new gw_backfill_runs table (unique on installation_id + entity_type). Extends sourceConfig on all 4 provider variants with an optional backfill field for depth and entityTypes configuration.
Service-auth endpoints for the backfill orchestrator and entity workers to read and upsert backfill run records. GET supports optional status filter, POST upserts on (installation_id, entity_type) conflict.
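The upsert-on-conflict semantics of the POST endpoint can be sketched with an in-memory stand-in for the unique `(installation_id, entity_type)` index (hypothetical names; the real route writes through Drizzle against `gw_backfill_runs`):

```typescript
interface RunRecord {
  installationId: string;
  entityType: string;
  depth: number;
  status: string;
}

// One row per (installationId, entityType): a POST for an existing
// pair replaces the prior record instead of inserting a second row,
// mirroring an ON CONFLICT ... DO UPDATE against the unique index.
class BackfillRunStore {
  private rows = new Map<string, RunRecord>();

  upsert(record: RunRecord): RunRecord {
    const key = `${record.installationId}:${record.entityType}`;
    this.rows.set(key, record); // conflict target → update in place
    return record;
  }

  // GET with an optional status filter, as on the real endpoint.
  list(installationId: string, status?: string): RunRecord[] {
    return [...this.rows.values()].filter(
      (r) =>
        r.installationId === installationId &&
        (status === undefined || r.status === status),
    );
  }
}
```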
Add optional depth, entityTypes, and holdForReplay params to notifyBackfill() and forward them through relay → trigger → Inngest.
…nk call sites Forward depth and entityTypes from workspace integration's sourceConfig.backfill to notifyBackfill() at all 4 bulkLink call sites (GitHub, Vercel, Linear, Sentry). Uses type assertion to widen inline sourceConfig literals to the full union type.
…-time tRPC errors Pages under (app)/ with no dynamic URL segments were triggering static generation during build. The void prefetch() pattern dehydrates pending queries, causing client-side useSuspenseQuery to fall back to HTTP fetches against a non-running server (ECONNREFUSED).
Add get-backfill-history step that fetches prior completed runs from the gateway, then filters work units to skip entity types already covered by prior runs at equal or wider depth. Re-triggers with the same depth become near-instant no-ops; depth escalations still fetch the wider range.
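The gap-aware skip logic described above can be sketched as follows (hypothetical names; the real step fetches prior runs through the gateway client):

```typescript
// Sketch of the get-backfill-history filtering step. A work unit is
// skipped when a prior *completed* run for the same entity type
// covered an equal or wider depth window.

interface BackfillRun {
  entityType: string;
  depth: number;  // 7 | 30 | 90 in the real pipeline
  status: string; // only "completed" runs count as coverage
}

interface WorkUnit {
  entityType: string;
  depth: number;
}

function filterWorkUnits(units: WorkUnit[], priorRuns: BackfillRun[]): WorkUnit[] {
  // Index the widest completed depth per entity type.
  const widest = new Map<string, number>();
  for (const run of priorRuns) {
    if (run.status !== "completed") continue;
    const prev = widest.get(run.entityType) ?? 0;
    if (run.depth > prev) widest.set(run.entityType, run.depth);
  }
  // Keep only units whose requested depth exceeds prior coverage, so
  // re-triggers at the same depth become near-instant no-ops while
  // depth escalations still fetch the wider range.
  return units.filter((u) => u.depth > (widest.get(u.entityType) ?? 0));
}
```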
Important: Review skipped. Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configuration. Configuration used: Path: .coderabbit.yaml | Review profile: ASSERTIVE | Plan: Pro
📝 Walkthrough
This PR implements a comprehensive refactoring of webhook processing, backfill orchestration, and event management. It introduces gateway and relay service client abstractions, replaces webhook payload storage with a workspace events model, adds CLI tools for API key management and webhook streaming, consolidates provider integrations into a provider-agnostic system, and migrates type definitions across packages.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
This PR represents a significant architectural refactoring across multiple applications with heterogeneous changes: comprehensive backfill system refactoring with new gateway/relay clients, a complete CLI implementation, workspace events infrastructure, provider abstraction consolidation, and type system migrations. While individual cohorts follow consistent patterns, the breadth of interconnected systems and the number of moving parts (backfill orchestration, service clients, event streaming, provider consolidation, type migrations) require careful understanding of how the changes interact across the codebase.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/relay/src/routes/backfill.ts (1)
11-17: ⚠️ Potential issue | 🟠 Major
Restrict `depth` to supported values at ingress.
`depth` currently accepts any positive integer, but this flow is designed for 7 | 30 | 90. Unsupported values can be accepted here and fail later downstream.
Proposed fix:
 const triggerSchema = z.object({
   installationId: z.string().min(1),
   provider: z.string().min(1),
   orgId: z.string().min(1),
-  depth: z.number().int().positive().default(30),
+  depth: z.union([z.literal(7), z.literal(30), z.literal(90)]).default(30),
   entityTypes: z.array(z.string()).optional(),
   holdForReplay: z.boolean().optional(),
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/relay/src/routes/backfill.ts` around lines 11 - 17, The triggerSchema currently allows any positive integer for depth which can cause downstream failures; update triggerSchema's depth validation (the depth field in triggerSchema) to only accept the supported values 7, 30, or 90 (and keep the default as 30) by replacing the current z.number().int().positive().default(30) with a validator that restricts to the allowed enum/union of literal values (e.g., z.union/z.enum or z.literal-based union) so invalid values are rejected at ingress.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.gitmodules:
- Around line 1-3: The .gitmodules entry for submodule "sub/.github" (path =
sub/.github, url = git@github.com:lightfastai/.github.git) is unrelated to the
backfill depth/entityTypes work and needs clarification: either add a line to
the PR description explaining why this submodule is required for the backfill
(e.g., shared workflows/templates used by functions in the backfill pipeline and
where they are referenced), or remove the .gitmodules change from this branch
and open a separate PR that adds the submodule; update any docs or code comments
that reference sub/.github to point to the new PR or to the specific
workflow/template files if you keep it here.
In `@api/console/src/router/org/workspace.ts`:
- Around line 1263-1266: The constructed sourceConfig for the bulkLink mutations
never includes a backfill property, so the spread
...(integrations[0]?.sourceConfig.backfill && { depth: ..., entityTypes: ... })
never runs and notifyBackfill never receives backfill options; fix by ensuring
the newly built sourceConfig includes backfill (or directly pass
integrations[0]?.sourceConfig.backfill when calling notifyBackfill) so depth and
entityTypes are preserved—update the relevant bulkLink mutation code paths where
sourceConfig is created and where notifyBackfill is invoked (references:
integrations[0]?.sourceConfig.backfill, sourceConfig, and notifyBackfill) to
either copy backfill into sourceConfig or explicitly forward
integrations[0].sourceConfig.backfill to notifyBackfill.
In `@apps/gateway/src/routes/connections.ts`:
- Around line 620-637: The handler registered with
connections.post("/:id/backfill-runs", apiKeyAuth, async (c) => { ... })
currently calls const body: unknown = await c.req.json() without guarding
against malformed JSON; wrap that await in a try-catch, and on a thrown error
return c.json({ error: "invalid_json", details: error.message }, 400) (or
similar 400 payload consistent with the other endpoint) before proceeding to the
zod parse, so the route returns a 400 for client JSON parse errors rather than
an unhandled 500.
In `@apps/relay/src/routes/backfill.ts`:
- Around line 50-53: The deduplication key currently built in the
deduplicationId template (deduplicationId:
`backfill:${provider}:${installationId}:${orgId}:d=${depth}:e=${entityTypes ?
[...entityTypes].sort().join(",") : ""}`) omits holdForReplay so requests that
only differ by that flag collapse; update the template to include a normalized
representation of holdForReplay (for example add `:r=${String(holdForReplay)}`
or use JSON.stringify(holdForReplay)) so the deduplicationId includes the replay
toggle, retaining the existing entityTypes sorting logic.
In `@db/console/src/migrations/0033_red_bug.sql`:
- Around line 6-7: Add domain constraints to the migration so the "depth" and
"status" columns enforce valid values: modify the column definitions for "depth"
and "status" to include CHECK constraints (e.g., ALTER TABLE ... ADD CHECK
("depth" >= 0) to prevent negative depths) and constrain "status" either with a
CHECK ("status" IN (...allowed values...)) or a foreign-key reference to a
canonical status lookup table; update the migration's CREATE/ALTER statements
that define "depth" and "status" to include these checks so history filtering
can't be poisoned by arbitrary values.
- Line 19: The unique index gw_br_installation_entity_idx on
lightfast_gw_backfill_runs (covering installation_id, entity_type) must be
changed so it does not enforce uniqueness; replace it with a non-unique btree
index (or, if uniqueness is required for a narrower purpose, include a
discriminating column such as run_id or created_at instead of making
(installation_id, entity_type) unique). Locate the CREATE UNIQUE INDEX statement
for gw_br_installation_entity_idx and alter it to CREATE INDEX (or adjust the
indexed columns) so historical runs are preserved and not deduplicated.
In `@db/console/src/schema/tables/gw-backfill-runs.ts`:
- Around line 23-27: Add DB-level constraints for the gw-backfill-runs table:
change the depth column (integer("depth").notNull()) to include a CHECK
enforcing its valid domain (e.g. depth >= 0 or > 0 as appropriate for your
logic) and restrict the status column (varchar("status", { length: 50
}).notNull()) to the allowed lifecycle values using either a CHECK that status
IN ('idle','pending','running','completed','failed','cancelled') or convert it
to a proper enum type; update the generated migration to add these constraints
(or create the enum) so the database enforces them on write.
In `@db/console/src/schema/tables/workspace-integrations.ts`:
- Around line 110-113: Extract the repeated inline backfill shape into a single
shared type alias (e.g., Backfill) and replace each inline occurrence with that
alias to prevent drift; locate the four places where a property named backfill
is defined with the inline shape (the objects that use depth: 7 | 30 | 90 and
entityTypes: string[]) in this file (they appear around the
workspace-integrations table definitions) and refactor them to reference the new
Backfill type alias so all providers share the same definition.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (25)
- .gitmodules
- api/console/src/lib/backfill.test.ts
- api/console/src/lib/backfill.ts
- api/console/src/router/org/workspace.ts
- apps/backfill/src/inngest/client.ts
- apps/backfill/src/routes/trigger.ts
- apps/backfill/src/workflows/backfill-orchestrator.test.ts
- apps/backfill/src/workflows/backfill-orchestrator.ts
- apps/backfill/src/workflows/step-replay.test.ts
- apps/backfill/src/workflows/workflow-contracts.test.ts
- apps/console/src/app/(app)/(user)/account/settings/general/page.tsx
- apps/console/src/app/(app)/(user)/account/teams/new/page.tsx
- apps/console/src/app/(app)/(user)/new/page.tsx
- apps/gateway/src/routes/connections.test.ts
- apps/gateway/src/routes/connections.ts
- apps/relay/src/routes/backfill.test.ts
- apps/relay/src/routes/backfill.ts
- db/console/src/migrations/0033_red_bug.sql
- db/console/src/migrations/meta/0033_snapshot.json
- db/console/src/migrations/meta/_journal.json
- db/console/src/schema/index.ts
- db/console/src/schema/tables/gw-backfill-runs.ts
- db/console/src/schema/tables/index.ts
- db/console/src/schema/tables/workspace-integrations.ts
- sub/.github
[submodule "sub/.github"]
	path = sub/.github
	url = git@github.com:lightfastai/.github.git
Clarify the relationship between this submodule and the PR objectives.
This PR is focused on backfill depth/entityTypes pipeline features, but this change adds a .github submodule with no explanation in the PR description. Mixing unrelated infrastructure changes with feature work can make the PR harder to review and complicate the git history.
If this submodule is required for the backfill feature (e.g., shared workflows or templates), please document that dependency in the PR description. Otherwise, consider moving this change to a separate PR.
...(integrations[0]?.sourceConfig.backfill && {
  depth: integrations[0].sourceConfig.backfill.depth,
  entityTypes: integrations[0].sourceConfig.backfill.entityTypes,
}),
Backfill options are never actually propagated to notifyBackfill.
In Line 1263, Line 1419, Line 1566, and Line 1714, the spread reads integrations[0].sourceConfig.backfill, but those newly constructed sourceConfig objects do not set backfill in these flows. Result: depth/entityTypes are always omitted.
✅ Fix pattern (apply to all four bulkLink mutations)
// input schema
.input(
z.object({
workspaceId: z.string(),
gwInstallationId: z.string(),
+ backfill: z.object({
+ depth: z.union([z.literal(7), z.literal(30), z.literal(90)]),
+ entityTypes: z.array(z.string()).min(1),
+ }).optional(),
// ... existing provider-specific array field
}),
)
// integration construction
sourceConfig: {
// ...existing fields
+ ...(input.backfill ? { backfill: input.backfill } : {}),
} as WorkspaceIntegration["sourceConfig"],
// notifyBackfill
void notifyBackfill({
installationId: input.gwInstallationId,
provider: "github", // provider-specific
orgId: ctx.auth.orgId,
- ...(integrations[0]?.sourceConfig.backfill && {
- depth: integrations[0].sourceConfig.backfill.depth,
- entityTypes: integrations[0].sourceConfig.backfill.entityTypes,
- }),
+ ...(input.backfill && {
+ depth: input.backfill.depth,
+ entityTypes: input.backfill.entityTypes,
+ }),
});

Also applies to: 1419-1422, 1566-1569, 1714-1717
| "depth" integer NOT NULL, | ||
| "status" varchar(50) NOT NULL, |
Add DB constraints for depth and status domain integrity.
Line 6 and Line 7 currently accept arbitrary values, which can poison history filtering semantics.
🛡️ Suggested constraints
CREATE TABLE "lightfast_gw_backfill_runs" (
...
"depth" integer NOT NULL,
"status" varchar(50) NOT NULL,
...
);
--> statement-breakpoint
+ALTER TABLE "lightfast_gw_backfill_runs"
+ ADD CONSTRAINT "gw_backfill_runs_depth_chk"
+ CHECK ("depth" IN (7, 30, 90));
+--> statement-breakpoint
+ALTER TABLE "lightfast_gw_backfill_runs"
+ ADD CONSTRAINT "gw_backfill_runs_status_chk"
+ CHECK ("status" IN ('queued','running','completed','failed'));
+--> statement-breakpoint
);
--> statement-breakpoint
ALTER TABLE "lightfast_gw_backfill_runs" ADD CONSTRAINT "lightfast_gw_backfill_runs_installation_id_lightfast_gw_installations_id_fk" FOREIGN KEY ("installation_id") REFERENCES "public"."lightfast_gw_installations"("id") ON DELETE cascade ON UPDATE no action;--> statement-breakpoint
CREATE UNIQUE INDEX "gw_br_installation_entity_idx" ON "lightfast_gw_backfill_runs" USING btree ("installation_id","entity_type");--> statement-breakpoint
Unique (installation_id, entity_type) index can destroy run history fidelity.
Line 19 enforces one row per entity type per installation, which can overwrite prior wider-depth coverage and degrade future gap-aware skipping behavior.
🛠️ Suggested schema direction
-CREATE UNIQUE INDEX "gw_br_installation_entity_idx" ON "lightfast_gw_backfill_runs" USING btree ("installation_id","entity_type");
+-- Preserve multiple runs; keep lookup index non-unique.
+CREATE INDEX "gw_br_installation_entity_idx" ON "lightfast_gw_backfill_runs" USING btree ("installation_id","entity_type");
+-- Optional: enforce idempotency per exact range instead.
+CREATE UNIQUE INDEX "gw_br_installation_entity_since_idx" ON "lightfast_gw_backfill_runs" USING btree ("installation_id","entity_type","since");
| depth: integer("depth").notNull(), | ||
|
|
||
| // Run lifecycle status: idle|pending|running|completed|failed|cancelled | ||
| status: varchar("status", { length: 50 }).notNull(), | ||
|
|
Enforce depth and status domains at the DB layer.
depth and status are currently unconstrained. Invalid values can be written and later corrupt gap-aware run decisions.
Suggested direction
- depth: integer("depth").notNull(),
- status: varchar("status", { length: 50 }).notNull(),
+ depth: integer("depth").notNull(), // add DB CHECK: depth IN (7, 30, 90)
+  status: varchar("status", { length: 50 }).notNull(), // add DB CHECK or enum for allowed lifecycle states

Also ensure the generated migration includes these constraints.
backfill?: {
  depth: 7 | 30 | 90;
  entityTypes: string[];
};
🧹 Nitpick | 🔵 Trivial
Extract a shared backfill type to avoid drift across providers.
The same shape is duplicated four times. A shared alias keeps future changes consistent.
♻️ Suggested refactor
+type BackfillConfig = {
+ depth: 7 | 30 | 90;
+ entityTypes: string[];
+};
sourceConfig: jsonb("source_config").$type<
| {
// ...
- backfill?: {
- depth: 7 | 30 | 90;
- entityTypes: string[];
- };
+ backfill?: BackfillConfig;
}
| {
// ...
- backfill?: {
- depth: 7 | 30 | 90;
- entityTypes: string[];
- };
+ backfill?: BackfillConfig;
}
| {
// ...
- backfill?: {
- depth: 7 | 30 | 90;
- entityTypes: string[];
- };
+ backfill?: BackfillConfig;
}
| {
// ...
- backfill?: {
- depth: 7 | 30 | 90;
- entityTypes: string[];
- };
+ backfill?: BackfillConfig;
}

Also applies to: 128-131, 143-146, 159-162
…sole - Restrict triggerSchema depth to 7 | 30 | 90 (reject arbitrary values) - Include holdForReplay in QStash deduplication key - Guard POST /:id/backfill-runs against malformed JSON (400 vs 500) - Remove dead sourceConfig.backfill spread from bulkLink notifyBackfill calls - Extract shared BackfillConfig type alias to prevent drift
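The deduplication-key change noted above can be sketched as a standalone helper (hypothetical name; the real key is built inline in the relay route before enqueueing to QStash):

```typescript
// Build a stable QStash deduplication ID. Sorting entityTypes keeps
// the key independent of array order, and including holdForReplay
// prevents requests that differ only by the replay flag from
// collapsing into one delivery.
function buildDeduplicationId(params: {
  provider: string;
  installationId: string;
  orgId: string;
  depth: number;
  entityTypes?: string[];
  holdForReplay?: boolean;
}): string {
  const entities = params.entityTypes
    ? [...params.entityTypes].sort().join(",")
    : "";
  return [
    "backfill",
    params.provider,
    params.installationId,
    params.orgId,
    `d=${params.depth}`,
    `e=${entities}`,
    `r=${String(params.holdForReplay ?? false)}`,
  ].join(":");
}
```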
…gw-installations Remove dead BackfillConfig type and backfill? fields from all 4 sourceConfig variants in workspace-integrations (never read or written). Add nullable backfillConfig JSONB column to gw-installations where backfill config logically belongs (installation-scoped). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove private flag and unused workspace dependencies (console-chunking, console-config, console-embed, chalk, ora). Add publish metadata, tsup config with .mjs output, and wire into CI/release workflows.
Entity worker now POSTs a run record to the gateway after pagination completes (success) and in the onFailure handler (failure). Both are best-effort with catch guards. This populates the data that the orchestrator's gap-aware filtering reads to skip covered entity types.
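The best-effort persistence described above can be sketched as follows (hypothetical function and payload shape; the endpoint path is taken from the gateway route added in this PR):

```typescript
// Best-effort: a failed write to the gateway must never fail the
// worker, so both the success path and the onFailure path swallow
// errors after logging. Gap-aware filtering simply sees no coverage.
async function persistRunRecord(
  gatewayUrl: string,
  installationId: string,
  record: { entityType: string; depth: number; status: "completed" | "failed" },
  fetchImpl: typeof fetch = fetch,
): Promise<boolean> {
  try {
    const res = await fetchImpl(
      `${gatewayUrl}/connections/${installationId}/backfill-runs`,
      {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: JSON.stringify(record),
      },
    );
    return res.ok;
  } catch (error) {
    console.warn("backfill run persistence failed", error);
    return false;
  }
}
```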
…delivery When holdForReplay is set, entity workers persist webhooks at the relay without delivering to Console (X-Backfill-Hold header). After all workers complete, the orchestrator drains held webhooks through the admin catchup endpoint so historical events arrive in chronological order as a batch.
…-types Extract backfillTriggerPayload, backfillRunRecord, and backfillDepthSchema into a single source of truth. Fixes inconsistent depth defaults across relay/backfill/gateway and tightens gateway POST /backfill-runs to enforce 7|30|90 (was any positive integer). Adds missing admin catchup replay tests.
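The shared depth validation can be sketched without the real zod schema as (hypothetical names; the repo's `backfillDepthSchema` is the canonical single source of truth):

```typescript
// Shared depth domain: 7 | 30 | 90, defaulting to 30 across
// relay, backfill, and gateway.
const BACKFILL_DEPTHS = [7, 30, 90] as const;
type BackfillDepth = (typeof BACKFILL_DEPTHS)[number];

function parseBackfillDepth(input: unknown): BackfillDepth {
  if (input === undefined) return 30; // shared default
  if (
    typeof input === "number" &&
    (BACKFILL_DEPTHS as readonly number[]).includes(input)
  ) {
    return input as BackfillDepth;
  }
  throw new RangeError(`depth must be one of ${BACKFILL_DEPTHS.join(", ")}`);
}
```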
afk-ralph.sh → scripts/afk-many.sh ralph-once.sh → scripts/afk-once.sh afk-review.sh → scripts/afk-review.sh
Implement browser-based auth flow (reuses existing /cli/auth page), org API key creation, and real-time webhook event streaming via SSE. - CLI: login (browser OAuth + org selection), logout, listen (SSE stream) - Console: /api/cli/login and /api/cli/setup routes with JWT verification - Relay: Redis Stream publishing on webhook receipt, SSE polling endpoint with API key auth, heartbeat, and auto-reconnect support Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…URL, add version injection - Remove `id` from heartbeat SSE events to prevent invalid Last-Event-ID on reconnect (hb-* is not a valid Redis stream ID) - Remove 5 unused deps: chalk, nanoid, @repo/console-chunking, @repo/console-config, @repo/console-embed - Consolidate 3 duplicate getBaseUrl() reads into single export from config.ts - Add LifecycleVariables to cliRouter for observability parity with other relay routers - Replace hardcoded version with build-time injection via tsup define
…ypes-run-tracking # Conflicts: # core/cli/package.json # core/cli/src/bin.ts # core/cli/tsup.config.ts # pnpm-lock.yaml
…validation Add BACKFILL_RUN_STATUSES and BACKFILL_TERMINAL_STATUSES to gateway-types for consistent status handling. DRY up gateway connections route using terminal status set. Add validation to relay estimate proxy endpoint. Extract GITHUB_RATE_LIMIT_BUDGET constant.
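A sketch of the shared status constants (the lifecycle values are assumptions drawn from the schema comment reviewed above; the real constants live in gateway-types):

```typescript
// All run lifecycle states, mirroring the schema comment:
// idle|pending|running|completed|failed|cancelled
const BACKFILL_RUN_STATUSES = [
  "idle", "pending", "running", "completed", "failed", "cancelled",
] as const;
type BackfillRunStatus = (typeof BACKFILL_RUN_STATUSES)[number];

// Terminal states: a run in one of these can be re-triggered.
const BACKFILL_TERMINAL_STATUSES = new Set<BackfillRunStatus>([
  "completed", "failed", "cancelled",
]);

function isTerminal(status: BackfillRunStatus): boolean {
  return BACKFILL_TERMINAL_STATUSES.has(status);
}
```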
….invoke() Extract typed gateway-client and relay-client with centralized auth, timeouts, and correlation IDs. Replace sendEvent/waitForEvent fan-out with step.invoke() for direct orchestrator→worker coordination. Hoist run persistence from entity workers to orchestrator. Remove entity completed event schema, onFailure handler, and ~950 lines of code.
… clients Rewrite orchestrator tests for step.invoke() pattern with gateway/relay client mocks. Simplify entity worker tests by removing onFailure and persist-run assertions. Update step-replay tests with invoke support. Delete workflow-contracts tests (CEL filter contracts no longer needed). Update estimate tests to use gateway client mocks.
🚧 Skipped: PR exceeds review size limit. Please split into smaller PRs and re-run. |
…ntitytypes-run-tracking # Conflicts: # pnpm-lock.yaml
- Replace `pnpm turbo lint typecheck` with `pnpm check` (biome) + `pnpm turbo typecheck` in both ci.yml and ci-core.yml — fixes "Could not find task lint in project" CI failure - Delete orphaned apps/gateway/eslint.config.js (referenced non-existent @repo/eslint-config/hono package) - Remove stale "eslintConfig" fields from console-trpc and console-clerk-m2m package.json files - Remove @repo/eslint-config from .changeset/pre.json initialVersions - Fix client.test.ts: remove includeContext/includeHighlights from default search body assertion — fields were removed from client.ts in 83e8af7 but the test was not updated
…DK type errors - Create @repo/vitest-config package to fix ../../vitest.shared boundary violations - Update vitest shared config for Vitest 4 (poolOptions → maxWorkers) - Add implicit dependency for ~ path alias in turbo.json boundaries - Fix PGlite multi-statement migration with custom migrateWithExec() - Fix client.related() to forward all request fields (depth, types) - Update client.test.ts to include required depth parameter
Summary
This PR represents a complete architectural overhaul of the ingestion and neural pipeline, not just backfill enhancements.
Neural pipeline rewrite
Entity-oriented schema & providers
- Drop `actor` field from `PostTransformEvent`, validation, AI tools, and UI
- `pull_request` and `issues` only for day 1
- `providerConfig` JSONB and drop `connectedBy` column

DB schema conventions
- `gateway-*`/`workspace-*` naming conventions
- `state`/`url`, junction `category`, edge `lastSeenAt`, `ingestLogId` FK
- Drop `cluster` table and `clusterId` column; clean up metric types

Backfill overhaul
- Delete `@repo/console-backfill` package; replace with gateway proxy + provider entity handlers
- `(resource, entityType)` gap filter with `providerResourceId` tracking
- Thread `depth`, `entityTypes`, and `holdForReplay` through the full chain: `notifyBackfill()` → relay → backfill trigger → Inngest
- `gw_backfill_runs` table + GET/POST `/connections/:id/backfill-runs` endpoints

Search system
UI
- `ProviderSourceItem` component replacing 4 individual provider source components

API & tRPC
Observability & testing
Test plan
- `pnpm typecheck` passes (150 tasks)
- `pnpm lint` passes (133 tasks)
- `pnpm --filter @lightfast/backfill test` — 60 tests pass (includes gap-aware filtering tests)
- `pnpm --filter @api/console test` — 6 tests pass
- `pnpm --filter lightfast-gateway test` — gateway backfill-runs endpoint tests pass
- `pnpm --filter lightfast-relay test` — fault injection tests pass
- `get-backfill-history` step

@jeevanpillay