Tasks: add blocked flow retry state#58204
Conversation
Greptile SummaryThis PR wires up blocked-flow state persistence and clean retry plumbing for one-task flows.
Confidence Score: 5/5Safe to merge — no correctness bugs found; all remaining findings are P2 style/performance suggestions. The null/undefined sentinel pattern for clearing blocked fields and endedAt is consistent throughout. SQLite migration is safe with PRAGMA guard. Retry gating double-checks both flow and task status. All manual checklist paths are covered by automated tests. No files require special attention; P2 notes on task-registry.ts and flow-registry.store.sqlite.ts do not block merge. Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/tasks/flow-registry.store.sqlite.ts
Line: 191-202
Comment:
**SQL identifiers interpolated without sanitization in `ensureColumn`**
`tableName`, `columnName`, and `columnDefinition` are all interpolated directly into raw SQL strings via template literals:
```typescript
const rows = db.prepare(`PRAGMA table_info(${tableName})`).all()
db.exec(`ALTER TABLE ${tableName} ADD COLUMN ${columnName} ${columnDefinition};`);
```
DDL statements don't support `?`-style parameterization, so the interpolation itself is unavoidable. However, the function currently has no guard against non-literal arguments. The two current call sites are both hardcoded and safe, but if this utility is reused elsewhere with a runtime-derived string, it becomes a SQL injection surface.
Consider adding a small allowlist check or asserting that the arguments match a safe identifier pattern (e.g. `/^[A-Za-z_][A-Za-z0-9_]*$/`) before executing.
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/tasks/task-registry.ts
Line: 1370-1395
Comment:
**`listTasksForFlowId` does a full O(n) scan with no index**
`listTasksForFlowId` iterates the entire `tasks` Map to find matching entries:
```typescript
return [...tasks.values()]
.map((task, insertionIndex) =>
task.parentFlowId?.trim() === normalizedFlowId ? { ...cloneTaskRecord(task), insertionIndex } : null,
)
...
```
The comparable `listTasksForSessionKey` avoids this by maintaining a dedicated `taskIdsBySessionKey` index (a `Map<string, Set<string>>`). When the task registry grows large, this becomes the only hot per-flow lookup that degrades linearly.
Since `findLatestTaskForFlowId` is on the retry critical path, it's worth introducing a `taskIdsByFlowId` index in the same style as `taskIdsBySessionKey` in a follow-up.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "Tasks: add blocked flow retry state" | Re-trigger Greptile |
| function ensureColumn( | ||
| db: DatabaseSync, | ||
| tableName: string, | ||
| columnName: string, | ||
| columnDefinition: string, | ||
| ) { | ||
| const rows = db.prepare(`PRAGMA table_info(${tableName})`).all() as Array<{ name?: string }>; | ||
| if (rows.some((row) => row.name === columnName)) { | ||
| return; | ||
| } | ||
| db.exec(`ALTER TABLE ${tableName} ADD COLUMN ${columnName} ${columnDefinition};`); | ||
| } |
There was a problem hiding this comment.
SQL identifiers interpolated without sanitization in
ensureColumn
tableName, columnName, and columnDefinition are all interpolated directly into raw SQL strings via template literals:
const rows = db.prepare(`PRAGMA table_info(${tableName})`).all()
db.exec(`ALTER TABLE ${tableName} ADD COLUMN ${columnName} ${columnDefinition};`);DDL statements don't support ?-style parameterization, so the interpolation itself is unavoidable. However, the function currently has no guard against non-literal arguments. The two current call sites are both hardcoded and safe, but if this utility is reused elsewhere with a runtime-derived string, it becomes a SQL injection surface.
Consider adding a small allowlist check or asserting that the arguments match a safe identifier pattern (e.g. /^[A-Za-z_][A-Za-z0-9_]*$/) before executing.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/tasks/flow-registry.store.sqlite.ts
Line: 191-202
Comment:
**SQL identifiers interpolated without sanitization in `ensureColumn`**
`tableName`, `columnName`, and `columnDefinition` are all interpolated directly into raw SQL strings via template literals:
```typescript
const rows = db.prepare(`PRAGMA table_info(${tableName})`).all()
db.exec(`ALTER TABLE ${tableName} ADD COLUMN ${columnName} ${columnDefinition};`);
```
DDL statements don't support `?`-style parameterization, so the interpolation itself is unavoidable. However, the function currently has no guard against non-literal arguments. The two current call sites are both hardcoded and safe, but if this utility is reused elsewhere with a runtime-derived string, it becomes a SQL injection surface.
Consider adding a small allowlist check or asserting that the arguments match a safe identifier pattern (e.g. `/^[A-Za-z_][A-Za-z0-9_]*$/`) before executing.
How can I resolve this? If you propose a fix, please make it concise.| export function listTasksForFlowId(flowId: string): TaskRecord[] { | ||
| ensureTaskRegistryReady(); | ||
| const normalizedFlowId = flowId.trim(); | ||
| if (!normalizedFlowId) { | ||
| return []; | ||
| } | ||
| return [...tasks.values()] | ||
| .map((task, insertionIndex) => | ||
| task.parentFlowId?.trim() === normalizedFlowId | ||
| ? { ...cloneTaskRecord(task), insertionIndex } | ||
| : null, | ||
| ) | ||
| .filter( | ||
| ( | ||
| task, | ||
| ): task is TaskRecord & { | ||
| insertionIndex: number; | ||
| } => Boolean(task), | ||
| ) | ||
| .toSorted(compareTasksNewestFirst) | ||
| .map(({ insertionIndex: _, ...task }) => task); | ||
| } | ||
|
|
||
| export function findLatestTaskForFlowId(flowId: string): TaskRecord | undefined { | ||
| const task = listTasksForFlowId(flowId)[0]; | ||
| return task ? cloneTaskRecord(task) : undefined; |
There was a problem hiding this comment.
listTasksForFlowId does a full O(n) scan with no index
listTasksForFlowId iterates the entire tasks Map to find matching entries:
return [...tasks.values()]
.map((task, insertionIndex) =>
task.parentFlowId?.trim() === normalizedFlowId ? { ...cloneTaskRecord(task), insertionIndex } : null,
)
...The comparable listTasksForSessionKey avoids this by maintaining a dedicated taskIdsBySessionKey index (a Map<string, Set<string>>). When the task registry grows large, this becomes the only hot per-flow lookup that degrades linearly.
Since findLatestTaskForFlowId is on the retry critical path, it's worth introducing a taskIdsByFlowId index in the same style as taskIdsBySessionKey in a follow-up.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/tasks/task-registry.ts
Line: 1370-1395
Comment:
**`listTasksForFlowId` does a full O(n) scan with no index**
`listTasksForFlowId` iterates the entire `tasks` Map to find matching entries:
```typescript
return [...tasks.values()]
.map((task, insertionIndex) =>
task.parentFlowId?.trim() === normalizedFlowId ? { ...cloneTaskRecord(task), insertionIndex } : null,
)
...
```
The comparable `listTasksForSessionKey` avoids this by maintaining a dedicated `taskIdsBySessionKey` index (a `Map<string, Set<string>>`). When the task registry grows large, this becomes the only hot per-flow lookup that degrades linearly.
Since `findLatestTaskForFlowId` is on the retry critical path, it's worth introducing a `taskIdsByFlowId` index in the same style as `taskIdsBySessionKey` in a follow-up.
How can I resolve this? If you propose a fix, please make it concise.
Summary
Testing
pnpm exec oxlint src/tasks/flow-registry.types.ts src/tasks/flow-registry.ts src/tasks/flow-registry.store.sqlite.ts src/tasks/flow-registry.test.ts src/tasks/flow-registry.store.test.ts src/tasks/task-registry.ts src/tasks/task-executor.ts src/tasks/task-executor.test.tspnpm exec vitest run src/tasks/task-executor.test.ts src/tasks/flow-registry.test.ts src/tasks/flow-registry.store.test.ts src/tasks/task-registry.test.ts -t "blocked|retry|flow|preserves endedAt|minimal defaults|auto-creates|fallback" --maxWorkers=1Manual checklist