fix(ops): expose processing/cancelled statuses through API and UI #1231
Merged
nicoloboschi merged 4 commits into main on Apr 23, 2026
Conversation
The API was collapsing 'processing' into 'pending' before returning operation status to clients, and cancel was deleting the operation row instead of preserving it with a 'cancelled' status.

- Stop mapping processing→pending in list/get operation responses
- Add 'processing' to the OperationStatusResponse Literal type
- Change cancel_operation to set status='cancelled' instead of DELETE
- Guard cancel to only accept pending operations (409 otherwise)
- Extend retry to accept both failed and cancelled operations
- Add _check_op_alive support for cancelled status
- Add DB migration for 'cancelled' in the status check constraint
- Add processing/cancelled badges and filters in the operations UI
- Add cancel/retry buttons in the operation detail dialog
- Align stats card status colors and labels with the operations table
- Regenerate the OpenAPI spec and all client SDKs
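A minimal sketch of the new cancel semantics, assuming a hypothetical in-memory store; Operation, CancelConflict, and this cancel_operation signature are illustrative, not the project's actual API:

```python
from dataclasses import dataclass


class CancelConflict(Exception):
    """Raised for non-pending operations; maps to HTTP 409 in the endpoint."""


@dataclass
class Operation:
    id: str
    status: str  # pending | processing | completed | failed | cancelled


def cancel_operation(ops: dict[str, Operation], op_id: str) -> Operation:
    """Cancel a pending operation, preserving the row instead of deleting it."""
    op = ops[op_id]
    if op.status != "pending":
        # Guard: anything past pending can no longer be cancelled.
        raise CancelConflict(f"cannot cancel operation in status {op.status!r}")
    op.status = "cancelled"  # row survives with a terminal status
    return op
```

Preserving the row is what makes the extended retry possible: a 'cancelled' (or 'failed') operation can later be flipped back to 'pending'.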
r266-tech added a commit to r266-tech/hindsight that referenced this pull request on Apr 23, 2026
nicoloboschi added a commit that referenced this pull request on Apr 24, 2026
Investigation and fixes for test failures on latest main:

1. test_per_operation_llm_config (2 tests): defaults were hardcoded to 10, but #1121 reduced DEFAULT_LLM_MAX_RETRIES to 3. Drive assertions from the constant so this tracks future changes automatically.
2. test_sql_schema_safety: #1210 added a docstring on task_backend.py:136 that said "INSERTed into async_operations", which false-positived the unqualified-table regex (INTO+INSERT+bare table). Rephrased the prose.
3. test_memory_engine_execute_task_passes_through_defer_operation: #1231 made execute_task short-circuit when the async_operations row is missing (treated as cancelled). The test created a fresh operation_id without inserting a row, so the handler never ran. Insert a pending row before execute_task.
4. 4 worker claim_batch / scan tests: assertions were counting total claims across the whole DB. test_async_batch_retain.py submits pending async_operations without sharing an xdist group, so parallel xdist workers polluted each other. Put test_async_batch_retain.py in the "worker_tests" group and also scope the worker-test assertions to the banks each test created, as defense in depth.
5. test_refresh_content_respects_max_tokens: observed ~1.9x over cap under Gemini's non-determinism; the 1.5x tolerance was too tight. Bumped to 2.5x, still well under the ~20x that a "cap ignored" regression would produce.
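The bank-scoped assertion pattern in item 4 can be sketched as follows; the claim shape and helper name are illustrative, not the project's real test code:

```python
def count_own_claims(claimed: list[dict], own_bank_id: str) -> int:
    """Count only the claims that belong to this test's bank."""
    return sum(1 for task in claimed if task["bank_id"] == own_bank_id)


# claim_batch() is global across bank_id, so a parallel test file's pending
# row can show up in our claimed batch; the scoped count ignores it.
claimed = [
    {"bank_id": "bank-a", "operation": "retain"},
    {"bank_id": "bank-b", "operation": "retain"},  # leaked from another file
    {"bank_id": "bank-a", "operation": "recall"},
]
assert count_own_claims(claimed, "bank-a") == 2
```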
nicoloboschi added a commit that referenced this pull request on Apr 24, 2026
* fix(tests): repair 9 regressions surfaced on main

Investigation and fixes for test failures on latest main:

1. test_per_operation_llm_config (2 tests): defaults were hardcoded to 10, but #1121 reduced DEFAULT_LLM_MAX_RETRIES to 3. Drive assertions from the constant so this tracks future changes automatically.
2. test_sql_schema_safety: #1210 added a docstring on task_backend.py:136 that said "INSERTed into async_operations", which false-positived the unqualified-table regex (INTO+INSERT+bare table). Rephrased the prose.
3. test_memory_engine_execute_task_passes_through_defer_operation: #1231 made execute_task short-circuit when the async_operations row is missing (treated as cancelled). The test created a fresh operation_id without inserting a row, so the handler never ran. Insert a pending row before execute_task.
4. 4 worker claim_batch / scan tests: assertions were counting total claims across the whole DB. test_async_batch_retain.py submits pending async_operations without sharing an xdist group, so parallel xdist workers polluted each other. Put test_async_batch_retain.py in the "worker_tests" group and also scope the worker-test assertions to the banks each test created, as defense in depth.
5. test_refresh_content_respects_max_tokens: observed ~1.9x over cap under Gemini's non-determinism; the 1.5x tolerance was too tight. Bumped to 2.5x, still well under the ~20x that a "cap ignored" regression would produce.

* fix(tests): extend bank-scoped claim filters to 3 more worker tests

CI on the first fix commit surfaced the same cross-file isolation problem in three additional worker tests.
Apply the same bank-scoped filter pattern so each assertion only counts claims for the bank the test actually created:

- test_claim_batch_claims_pending_tasks
- test_concurrent_workers_claim_different_tasks
- test_worker_slot_limits_enforced (in this one the executor itself ignores leaked tasks, so its slot-limit gating stays on our tasks)

These flake under parallel xdist because claim_batch() is global across bank_id; any pending row from another test file gets scooped up. The per-test filter is defense in depth on top of putting test_async_batch_retain.py in the same xdist_group.

* fix(tests): isolate more slot/executor worker tests from cross-file claims

test-api CI after the previous fix surfaced four more worker tests flaking the same way: they assert on counts that include tasks the poller legitimately claims from other test files running in parallel. The same bank-scoped filter pattern is applied in the executor, and the poller-internal counter assertions are relaxed to >= (our executor returns immediately for tasks outside our bank, but the counter may see them briefly before the slot frees). Covers:

- test_worker_fire_and_forget_nonblocking
- test_consolidation_slots_reserved_when_retain_saturates
- test_per_operation_slot_reservations (multi-bank variant)
- test_shared_pool_usable_by_reserved_types (preemptive)

* fix(ui): remove unnecessary \- escape in parseBucketIso regexes

ESLint's no-useless-escape flags \- inside a character class when the dash is not between two chars. Move the dash to the boundary so it is always a literal without needing an escape. Pre-existing on main (introduced by #1245); surfaced when verify-generated-files started exercising this lint path again after #1248.
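The character-class rule behind that lint fix is language-agnostic: a dash at the start or end of a class is always literal. A small Python illustration (these patterns are made up, not the actual parseBucketIso regexes):

```python
import re

# \- inside a character class is a useless escape: the dash is literal
# whenever it cannot form a range. Moving it to the boundary makes that
# explicit, which satisfies ESLint's no-useless-escape in the JS version.
escaped = re.compile(r"[0-9\-:]")   # escaped dash mid-class
boundary = re.compile(r"[0-9:-]")   # dash at the boundary, no escape needed

sample = "2026-04-23T10:30"
assert escaped.findall(sample) == boundary.findall(sample)
```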
* chore: sync generated files with committed sources

verify-generated-files was failing because main's committed copies of two generated/auto-formatted files had drifted from what the scripts and ruff now produce:

- hindsight-api-slim/hindsight_api/db_url.py: ruff format now collapses a 2-line list comprehension to 1 line (long-line threshold).
- skills/hindsight-docs/references/developer/configuration.md: the doc-skill generator emits the Cohere output_dimensions entry that #1249 added to configuration.md but did not regenerate into the skill copy.

Not functional changes, just aligning the committed outputs with the generators/formatters.

* fix(tests): isolate test_recall_time_range hardcoded-UUID fixture

This file inserts memory_units with three hardcoded UUIDs (00000000-…-000{1,2,3}). memory_units.id is a global primary key, so parallel xdist workers running these tests simultaneously hit pk_memory_units uniqueness violations (seen intermittently in test-api CI as fixture-setup ERRORs). Two defenses:

- Share an xdist_group so the eight tests serialize on the same worker, preventing concurrent workers from inserting the same IDs.
- Defensive pre-DELETE at fixture setup so a previous interrupted run's leftover rows do not poison the next setup.

A flake, not a regression from this branch, but it surfaces here, so fixing it unblocks the PR.

* fix(tests): filter claims in test_poller_without_tenant_extension_uses_public

One more worker test asserted len(claimed) == 3 without scoping to its own bank; scope the assertion to bank_id. Keeps the schema-None invariant on every claim, since no tenant extension is configured.
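The defensive pre-DELETE can be sketched against a toy in-memory table; the dict-backed "db" and the insert helper modeling the pk_memory_units constraint are illustrative only (the xdist_group marker is the other, pytest-level defense):

```python
FIXED_IDS = [
    "00000000-0000-0000-0000-000000000001",
    "00000000-0000-0000-0000-000000000002",
    "00000000-0000-0000-0000-000000000003",
]


def insert(db: dict, uid: str) -> None:
    """Toy stand-in for an INSERT guarded by a primary-key constraint."""
    if uid in db:
        raise KeyError(f"pk_memory_units violation: {uid}")
    db[uid] = {"id": uid}


def seed_memory_units(db: dict) -> list[str]:
    for uid in FIXED_IDS:
        db.pop(uid, None)  # defensive pre-DELETE of interrupted-run leftovers
        insert(db, uid)
    return FIXED_IDS


# Re-seeding over a half-populated table no longer hits the pk violation:
leftover = {FIXED_IDS[0]: {"id": FIXED_IDS[0]}}
seed_memory_units(leftover)
assert sorted(leftover) == sorted(FIXED_IDS)
```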
nicoloboschi pushed a commit that referenced this pull request on Apr 24, 2026
Summary
- Stop collapsing processing into pending in list/get operation API responses; the real DB status is now returned as-is
- Cancel sets status='cancelled' instead of deleting the row, with a guard that only pending operations can be cancelled (409 otherwise)
- Retry accepts both failed and cancelled operations
- Add processing/cancelled status badges and filters, and align stats card colors/labels with the operations table
- DB migration adds cancelled to the async_operations_status_check constraint

Test plan
- New test_operation_status.py covering list, get, filter, cancel→cancelled, retry from cancelled, and status validation guards
- Existing test_op_cancellation.py tests pass (cancel checkpoint, cascade delete, etc.)
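The cancel/retry rules exercised by these tests can be summarized as a small transition guard; the status literals match the PR, while the helper names are illustrative:

```python
from typing import Literal

OperationStatus = Literal[
    "pending", "processing", "completed", "failed", "cancelled"
]

CANCELLABLE = {"pending"}            # cancel guard: 409 for anything else
RETRYABLE = {"failed", "cancelled"}  # retry now also accepts cancelled


def can_cancel(status: str) -> bool:
    return status in CANCELLABLE


def can_retry(status: str) -> bool:
    return status in RETRYABLE
```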