feat(history): inline Retry button + retry-{any,own} RBAC + linkage + threshold + ops-hint (closes #47)#168
Conversation
…hreshold + ops-hint Closes #47. Salvages a previously-stalled agent attempt: the work was implemented but never committed; rebased onto current feat/multicloud-web-frontend tip which now includes PR #145 (cancel button + cancel-{any,own} RBAC + migration 000041_purchase_executions_created_by_user_id). The retry plumbing mirrors the cancel pattern row-for-row: same session-validation gate, same authorize-{any,own} matrix, same legacy-NULL-creator semantics for non-admins. Backend ======= * New POST /api/purchases/retry/{id} (AuthUser). Creates a new pending execution from a *failed* row's stored Recommendations and links the predecessor with retry_execution_id. Three writes happen in one tx (successor INSERT, predecessor UPSERT, suppression INSERT) so a crash mid-flow leaves either everything or nothing. * RBAC: ActionRetryOwn / ActionRetryAny in internal/auth/types.go, retry-own:purchases added to DefaultUserPermissions(). Admins short-circuit, retry-any holders may retry any row, retry-own holders may retry only their own. Test matrix mirrors cancel. * Already-retried guard: a row whose retry_execution_id is already set rejects with 409 + retry_execution_id detail so a stale-cache double-tap can't silently overwrite the lineage pointer. * Persistent-failure block (Q3): handler resolves an ops_hint from a small map of known-persistent failure substrings (FROM_EMAIL not configured, SES sandbox, etc.) and returns 409 with the hint in structured detail fields. The History row also surfaces the same hint inline, replacing the Retry button entirely. * Threshold soft-block (Q2): retry_attempt_n on the new row is predecessor.retry_attempt_n + 1; an attempt that would land at >= 5 is rejected with 409 + retry_attempt_n / threshold details. The frontend confirms-with-warning then re-POSTs with ?force=true to override. * Migration 000042 adds nullable retry_execution_id (self-FK ON DELETE SET NULL) and retry_attempt_n INT NOT NULL DEFAULT 0, plus a partial index on the FK for the rare populated case. store_postgres.go INSERT/UPSERT/SELECT cover the new columns; retry_execution_id is updateable via ON CONFLICT (the retry handler re-saves the predecessor to set the pointer), but created_by_user_id and retry_attempt_n stay INSERT-only so the scheduler's status-only UPSERTs don't rewrite provenance. * clientError grew an optional details map so ops_hint / retry_attempt_n / threshold can reach the frontend as structured JSON fields instead of substring-matched message text. Frontend ======== * History column doubles as the per-row action surface (renderActionCell): Cancel on cancellable pending rows (#46), Retry on retryable failed rows (this change), ops-hint badge on persistent failures, and inline lineage cross-links on retried rows. Status-driven mutually-exclusive decision tree. * canRetryFailedRow gates the button: status=failed AND no ops_hint AND no retry_execution_id (already retried) AND admin OR matching created_by_user_id. UX gate only — backend remains authoritative. * Threshold-reached rows render an "⚠ Retried 5× — click to override" button; the click confirms then POSTs with force=true. * RetryPurchaseResult exported from frontend/src/api so callers can consume execution_id / retry_attempt_n. Slice-aliasing fix ================== The retry handler's new execution borrowed failedExec.Recommendations by slice header. Mutations downstream (internal/purchase/execution.go writes exec.Recommendations[i].{Error,Purchased,PurchaseID}) would have seen through the alias and corrupted the historical failed row's in-memory representation visible to any caller still holding the failedExec pointer. Fixed via defensive copy (append([]RecommendationRecord(nil), failedExec.Recommendations...)) with a comment explaining the in-process aliasing footgun. Tests ===== * 14 new backend table tests covering admin/retry-any/retry-own paths, non-failed rejection, legacy-null-creator non-admin rejection, persistent-failure block + no-match, threshold block + force override, just-under-threshold allow, already-retried 409, missing-session rejection. * 11 new frontend jest tests covering button visibility per role, ops-hint replacement, threshold-button text + force=true wiring, lineage link rendering, retry_attempt_n badge, declined dialog skips API, accepted posts + reloads + toasts, API failure surfaces toast and re-enables button. * Auth permission-count tests bumped 7→8 (default user) and 9→10 (user + 2 groups) to account for the new retry-own grant.
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (7)
📝 WalkthroughWalkthroughIntroduces a complete "retry failed purchase execution" feature across frontend and backend. Adds UI gating and button interaction in the history table, implements a new authenticated retry API endpoint with authorization, detects persistent operator-fixable errors, creates successor executions via transactional logic, extends database schema with retry linkage metadata, and adds authorization rules and comprehensive test coverage. Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant Frontend as Frontend UI
participant API as API Handler
participant DB as Database
participant Email as Email Service
User->>Frontend: Click Retry button on failed row
Frontend->>Frontend: Show confirmation dialog
User->>Frontend: Accept confirmation
Frontend->>Frontend: Disable button, start loading
alt Is force override required?
Frontend->>API: POST /purchases/retry/{id}?force=true
else Normal retry
Frontend->>API: POST /purchases/retry/{id}
end
API->>DB: Lookup original execution
API->>DB: Validate state (failed, not already retried)
API->>DB: Resolve ops_hint from error
alt ops_hint exists (persistent error)
API-->>Frontend: HTTP 409 with ops_hint detail
Frontend->>Frontend: Show error toast, enable button
else force=false & threshold exceeded
API-->>Frontend: HTTP 409 with retry_attempt_n detail
Frontend->>Frontend: Show error toast, enable button
else Authorization denied
API-->>Frontend: HTTP 403
Frontend->>Frontend: Show error toast
else Success path
API->>DB: Start transaction
API->>DB: Create successor execution (pending, incremented RetryAttemptN)
API->>DB: Update original with retry_execution_id → successor id
API->>DB: Insert suppressions for successor
API->>DB: Commit transaction
API->>Email: Send approval email for successor
API-->>Frontend: HTTP 200 with response (new execution id, metadata)
Frontend->>Frontend: Show success toast
Frontend->>Frontend: Call loadHistory() to refresh table
Frontend->>API: GET /history
API->>DB: Fetch updated executions with retry metadata
API-->>Frontend: Return rows with retry linkage
Frontend->>Frontend: Re-render history table with updated rows
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
internal/config/store_postgres_pgxmock_test.go (1)
368-381: Consider asserting the new retry fields directly.The mock rows now match the expanded projection, but explicit checks for
RetryExecutionIDandRetryAttemptNwould make these tests catch scan-order regressions in the new columns too.Also applies to: 399-413
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/config/store_postgres_pgxmock_test.go` around lines 368 - 381, Add explicit assertions in the test that verify the scanned values for the new retry fields to catch scan-order regressions: after the row is added via pgxmock.NewRows(cols).AddRow(...) and the result is scanned into the execution struct (the code paths that currently assert other fields like Status, StepNumber, etc.), add checks that execution.RetryExecutionID (or the field name used in your struct) equals the expected nil/empty value and execution.RetryAttemptN equals the expected numeric value (0 in this mock); do this for the two locations mentioned around the cols/AddRow block and the similar block at lines ~399-413 so the test explicitly validates both RetryExecutionID and RetryAttemptN.internal/api/handler_router.go (1)
47-51: Defensively copydetailsto avoid shared-map mutation leaks.
map[string]anyis reference-typed; returning/storing it directly allows external mutation of error payload state. Cloning in constructor/accessor makes this safer and more predictable.♻️ Suggested hardening
func (e *clientError) Details() map[string]any { return e.details } +func cloneDetails(in map[string]any) map[string]any { + if in == nil { + return nil + } + out := make(map[string]any, len(in)) + for k, v := range in { + out[k] = v + } + return out +} + -func NewClientErrorWithDetails(code int, message string, details map[string]any) error { - return &clientError{message: message, code: code, details: details} +func NewClientErrorWithDetails(code int, message string, details map[string]any) error { + return &clientError{message: message, code: code, details: cloneDetails(details)} }-func (e *clientError) Details() map[string]any { return e.details } +func (e *clientError) Details() map[string]any { return cloneDetails(e.details) }Also applies to: 63-65
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/api/handler_router.go` around lines 47 - 51, Return a defensive copy of the details map instead of the internal reference: in clientError.Details() clone the e.details map into a new map and return that copy, and likewise make a copy when assigning/storing details in the clientError constructor/initializer (any function that constructs or sets clientError.details, e.g., NewClientError or the clientError literal assignment) to avoid external code mutating the internal map; perform a shallow copy of keys/values (map[string]any) since values are interface{}.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@frontend/src/history.ts`:
- Around line 598-602: The catch branch currently only shows err.message which
loses structured retry info; update the catch handling for retryError in
frontend/src/history.ts to cast/parses retryError as any and extract backend
retry fields (ops_hint, retry_attempt_n, threshold) and include them in the
toast (e.g., append ops_hint and attempt/threshold details to the message or
pass them in toast metadata) while still falling back to err.message/'unknown
error'; keep re-enabling the button (btn.disabled = false) and use the same
showToast call to surface the structured retry guidance to the user.
In `@internal/api/handler_purchases.go`:
- Around line 651-663: When creating the retry successor PurchaseExecution (the
newExecution variable of type config.PurchaseExecution), preserve the original
plan-scoped metadata by copying PlanID and StepNumber from the failed execution
(failedExec) into the newExecution struct; update the initializer to include
PlanID: failedExec.PlanID and StepNumber: failedExec.StepNumber so the retried
execution remains associated with the same plan and ramp step.
- Around line 581-593: The guard that returns the retry_execution_id leaks
cross-user metadata because it runs before RBAC; move the call to
authorizeSessionRetry(ctx, session, failedExec) to execute before the
already-retried check (the block that inspects failedExec.RetryExecutionID and
returns NewClientErrorWithDetails with retry_execution_id) so authorization is
enforced prior to revealing RetryExecutionID/descendant info; update the logic
around failedExec.RetryExecutionID, failedExec.ExecutionID, and the
NewClientErrorWithDetails call accordingly to ensure no retry_execution_id is
returned unless authorizeSessionRetry has succeeded.
---
Nitpick comments:
In `@internal/api/handler_router.go`:
- Around line 47-51: Return a defensive copy of the details map instead of the
internal reference: in clientError.Details() clone the e.details map into a new
map and return that copy, and likewise make a copy when assigning/storing
details in the clientError constructor/initializer (any function that constructs
or sets clientError.details, e.g., NewClientError or the clientError literal
assignment) to avoid external code mutating the internal map; perform a shallow
copy of keys/values (map[string]any) since values are interface{}.
In `@internal/config/store_postgres_pgxmock_test.go`:
- Around line 368-381: Add explicit assertions in the test that verify the
scanned values for the new retry fields to catch scan-order regressions: after
the row is added via pgxmock.NewRows(cols).AddRow(...) and the result is scanned
into the execution struct (the code paths that currently assert other fields
like Status, StepNumber, etc.), add checks that execution.RetryExecutionID (or
the field name used in your struct) equals the expected nil/empty value and
execution.RetryAttemptN equals the expected numeric value (0 in this mock); do
this for the two locations mentioned around the cols/AddRow block and the
similar block at lines ~399-413 so the test explicitly validates both
RetryExecutionID and RetryAttemptN.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0a6bfe37-5731-4364-9ce9-b79a8e5edf5e
📒 Files selected for processing (20)
frontend/src/__tests__/history-retry-button.test.tsfrontend/src/api/index.tsfrontend/src/api/purchases.tsfrontend/src/history.tsfrontend/src/types.tsinternal/api/handler.gointernal/api/handler_history.gointernal/api/handler_purchases.gointernal/api/handler_purchases_test.gointernal/api/handler_router.gointernal/api/router.gointernal/auth/service_group_test.gointernal/auth/service_test.gointernal/auth/types.gointernal/auth/types_test.gointernal/config/store_postgres.gointernal/config/store_postgres_pgxmock_test.gointernal/config/types.gointernal/database/postgres/migrations/000042_purchase_executions_retry_linkage.down.sqlinternal/database/postgres/migrations/000042_purchase_executions_retry_linkage.up.sql
Three actionable findings from CR's first review (PR #168): 1. **Major / security**: authorize before the already-retried guard. The `failedExec.RetryExecutionID` 409 ran ahead of `authorizeSessionRetry`, so any signed-in user who could guess a failed-row UUID would learn it had been retried AND get the descendant execution ID — a cross-user info leak. Reordered the gate sequence so RBAC runs first; the descendant UUID is now surfaced only after the caller has proven they're allowed to act on the row. New regression test `TestHandler_retryPurchase_AlreadyRetried_RBACBeforeLeak` asserts the 403 path doesn't carry retry_execution_id in the details map. 2. **Major / correctness**: preserve PlanID + StepNumber on retry successors. Without this, retrying a failed planned execution dropped the new row out of plan-scoped history and lost its ramp-step attribution. Both fields now propagate from the predecessor; for ad-hoc executions PlanID="" and StepNumber=0 so propagation is a no-op. New regression test `TestHandler_retryPurchase_PreservesPlanMetadata` asserts the propagation. 3. **Minor / UX**: surface structured retry hints in the failure toast. The catch path used to discard `ops_hint` / `retry_attempt_n` / `threshold` from a 4xx body and toast only `err.message`, leaving the user with a generic failure instead of the operator-actionable hint or override guidance the backend already returns. To wire this up: - `ApiError` gained an optional `details: Record<string, unknown>` field (frontend/src/api/types.ts). - `apiRequest` now extracts ALL non-`error` keys from a 4xx body into `error.details` (frontend/src/api/client.ts) so callers can branch on machine-readable hints without parsing message text. - The retry catch in history.ts inspects `err.details.ops_hint` first, then falls back to a "already retried n× (threshold k) — confirm the override prompt to force" message synthesised from `retry_attempt_n` + `threshold`, then to `err.message`. Plus two nits from the same review: 4. **Defensive copy of clientError details map**. The previous shape stored the constructor's input map by reference; a caller mutating the map after construction would reach into the error's payload. `cloneDetails` shallow-copies on both construction and `Details()` read so the error is sealed. 5. **Explicit retry-field assertions in pgxmock GetExecutionByID tests**. Two new assertions (one for the NULL/0 legacy-row case, one for a populated chain-position-2 case) guard against scan- order regressions in the new `retry_execution_id` / `retry_attempt_n` columns. A column reorder upstream would scan into the wrong destination and fail at least one of these. All 1337 frontend tests + ~16 backend retry/auth/config tests pass. gocyclo / gofmt / go vet pre-commit hooks pass.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
P2: substantial feature (retry failed purchases with RBAC, linkage, threshold, ops-hint). CI green, mergeable, 1337 frontend tests pass. Good overall but CodeRabbit flagged docstring coverage (28.57% vs 80% threshold) — worth addressing before merge. Backend-side save guard for invalid combos noted as out of scope; worth tracking as a follow-up issue. (triage agent wave2-F) |
…205) (#206) * fix(db): idempotently restore retry_execution_id column dropped by migration drift (closes #204, closes #205) Production /api/purchases/* surfaced 500 errors after PR #161 deployed, with the underlying CloudWatch error: ERROR: column "retry_execution_id" does not exist (SQLSTATE 42703) User-facing symptoms (dashboard "Upcoming Scheduled Purchases" panel): * "Failed to load purchase details: Internal server error" (#205) * "Failed to cancel purchase" (#204) Both buttons hit `Handler.GetExecutionByID` whose SELECT references the retry_execution_id and retry_attempt_n columns introduced by PR #168. Root cause: migration version drift. * PR #189 added 000042_recommendations_add_engine_to_key earlier on the same day. Deployed Lambdas applied it as version 42. * PR #168 then added 000042_purchase_executions_retry_linkage, colliding on version 42. * PR #195 resolved the on-disk duplicate by renaming RECOMMENDATIONS to 000043 — the wrong direction. Deployed DBs already had schema_migrations.version=42 marked as the engine content, so on the next migrate.Up() they: - skipped on-disk 000042 (now retry_linkage) entirely (DB version >= 42 already) - applied on-disk 000043 (now engine again) as a no-op via the ADD COLUMN IF NOT EXISTS guards already in that file Net effect: the retry_execution_id / retry_attempt_n columns were never added on prod, but the Go code expects them. Forward-only corrective. The new 000044 replays 000042's schema additions wrapped in IF NOT EXISTS / CREATE INDEX IF NOT EXISTS so it's a no-op on any DB that already has the columns (fresh deploys, manually-fixed envs) and additive on the broken ones. We don't try to repair schema_migrations history — golang-migrate only cares about the high-water version, and we can't tell from outside whether a given environment skipped 42 or applied 43 twice. The migration up + down sql files document the incident and the edge cases inline so future readers don't have to re-derive the history from the git log. Closes #204 Closes #205 * fix(db): repair partially-fixed DBs in 000044, not just no-op (CR pass 1) CR pass 1 on PR #206 flagged that the original 000044 only added columns when missing — a manually-fixed DB with the column but no FK, or with retry_attempt_n nullable, would still be broken after this migration. Rewrite as five idempotent steps that each check pg_catalog / information_schema for the precise piece they own, so every reachable partial state is repaired: 1. Add retry_execution_id column if missing. 2. Add the FK on retry_execution_id -> execution_id if no FK on that column exists. Match by column relationship (pg_constraint join pg_attribute) NOT by constraint name, so a fresh-deploy DB whose 000042 ran cleanly with the auto-generated `purchase_executions_retry_execution_id_fkey` is recognised as already-FK'd and we don't add a second. 3. Add retry_attempt_n column if missing (as nullable, so step 4 can backfill before flipping to NOT NULL). 4. Backfill any NULLs and SET DEFAULT 0 / SET NOT NULL unconditionally — Postgres treats those as no-ops when the constraint is already as requested. 5. Create the partial index if missing. Down stays simple: DROP COLUMN ... CASCADE removes the FK regardless of its name. Comments inline document each reachable partial state and why each step is safe to re-run. Migration is a no-op on healthy DBs, fully repairing on broken ones.
… RBAC (#286) (#299) * feat(api,history): in-dashboard purchase approval + approve-{any,own} RBAC (closes #286) Adds an inline Approve button on pending Purchase History rows + the session-authed backend dispatch that lets admins (and `approve-own` holders for their own pending rows) approve without round-tripping to the SES email. Mirrors the same dual-auth shape as the Cancel button (PR #145) and Retry button (PR #168). Backend ------- * `internal/auth/types.go`: new constants `ActionApproveOwn` / `ActionApproveAny` next to the existing cancel-/retry- block. `DefaultUserPermissions()` adds `approve-own:purchases` so every authenticated user can approve pending executions they themselves created. `approve-any` exists for future operator roles; no default non-admin grant. * `internal/api/handler_purchases.go::approvePurchase` refactored into the same three-mode dispatch as `cancelPurchase`: 1. Session present + RBAC matches → session-authed approve via `approvePurchaseViaSession` regardless of token. 2. token != "" → legacy email-link path, unchanged. 3. token == "" → session-authed dashboard branch. Permission-denied (403) on the session check falls through to the token branch so a logged-in non-admin who is the per-account contact_email recipient can still approve via the email link. * New `approvePurchaseViaSession` mirrors `cancelPurchaseViaSession` minus the suppression-cleanup tx (approve doesn't drop suppressions — those persist until cancel/expiry). Status flip + `ApprovedBy = session.Email` stamp + SavePurchaseExecution. * New `authorizeSessionApprove` mirror of `authorizeSessionCancel`: admin → ok; else `approve-any` → ok; else `approve-own` AND `created_by_user_id == session.UserID` → ok; else 403. No new migration: existing `purchase_executions.approved_by` (TEXT, migration 000035) and `created_by_user_id` (UUID FK, migration 000041) are sufficient — the email field stamps the actor and the UUID drives the own-ness check. Frontend -------- * `frontend/src/api/purchases.ts`: new `approvePurchase(executionId)` caller wrapping `POST /purchases/approve/{id}` with bearer-session auth (no token in URL — backend's session-first dispatch picks the auth path). * `frontend/src/api/index.ts`: re-export the new function from the api barrel. * `frontend/src/history.ts`: - new `canApprovePendingRow` predicate mirroring `canCancelPendingRow` (admin → yes; non-admin → only own pending rows; legacy null-creator rows out of reach via this UI). - `renderActionCell` now renders Approve + Cancel side-by-side for pending rows where the session qualifies for both. Each predicate is checked independently so a custom role with only one verb renders just that button. - new click handler on `.history-approve-btn` mirroring the cancel-button pattern: confirmDialog (non-destructive) → `api.approvePurchase` → reload history + success toast. Failed approve surfaces an error toast + re-enables the button. RI exchange ----------- `approveRIExchange` has the same shape gap (token-only auth) but mutates a state-machine and triggers downstream cloud-side exchange execution — bigger surface than fits in this PR. Filing as a follow-up issue tracking the symmetric work. Tests ----- Backend * `internal/auth/types_test.go` + `service_test.go` + `service_group_test.go`: bumped permission-count assertions (8→9, 10→11) and added `ActionApproveOwn:ResourcePurchases` membership check in `DefaultUserPermissions returns user access`. * `internal/api/handler_purchases_test.go`: added `.Maybe()` HasPermissionAPI mocks (returning false for both verbs) on five existing approve-purchase tests so the new session-first dispatch falls through to the token branch they exercise. * `internal/api/coverage_extras_test.go`: `TestHandler_approvePurchase_EmptyToken` now asserts the empty-token + zero-handler case returns SOME error (recovered from panic via `defer recover`) — the empty-token path is no longer a pre-flight 400, since after #286 it's the dispatch into the session-authed branch. * `internal/api/handler_test.go::TestHandler_HandleRequest_ApprovePurchase`: same `.Maybe()` mocks for the end-to-end approve route test. Frontend * New `frontend/src/__tests__/history-approve-button.test.ts`: 8 tests covering admin/regular/anonymous render branches, non-pending-row absence, side-by-side Approve+Cancel render, declined confirmDialog, accepted confirm + reload + toast, and API-failure error toast + button re-enable. Verification: - `go test ./...` clean (4265 passed, 0 failed, 6 skipped). - `npm test` clean (1469 passed, 0 failed). - `npm run build` + `npx tsc --noEmit` clean. Token-only email-link path is unchanged — backwards-compat mandatory; existing pre-#286 tests on that path continue to pass. * fix(history,api): disable both row-actions during click + tighten empty-token test (CR pass on PR #299) Addresses both actionable items from CodeRabbit's first review on PR #299: 1. **history.ts:603 — disable BOTH Approve and Cancel during a click** (Minor / Quick win): after #286 the two buttons render side-by-side on pending rows, but the click handlers were disabling only the clicked button. A quick double-click could fire conflicting requests on the same row before the reload completes. Extracted a small `sameRowActions(btn)` helper that returns every row-action button (`.history-approve-btn`, `.history-cancel-btn`) in the same `<td>` as the clicked button (falls back to `[btn]` when no parent cell exists — test fixtures may not wrap in a table). Both Approve and Cancel handlers now disable the full set while the API is in flight and re-enable the full set on failure. Successful requests trigger a full history reload that re-renders the row, so the row-action sibling state doesn't matter on the happy path. 2. **coverage_extras_test.go:52 — fail loudly on panic instead of swallowing** (Major / Quick win): the prior `recover()` swallow could let a panic in any new dispatch branch pass the test without ever asserting on `err`, masking real regressions on the empty-token path. The test now (a) `t.Fatalf`s on any panic from `approvePurchase` so the panic path is itself a failure signal, and (b) wires a minimal `MockConfigStore` returning a clean "execution not found" error from `GetExecutionByID` so the dispatch reaches a proper `NewClientError(404)` rather than nil-deref'ing on `h.config`. `require.Error(...)` then asserts the contract: empty token + no session must NOT silently succeed. Tests (regression coverage for #1): * `history-approve-button.test.ts` gains two new cases: - "approve click disables BOTH Approve and Cancel for the row while in flight (CR pass)" — uses an unresolved promise from the api mock to capture the in-flight state and asserts both buttons are disabled until the promise resolves. - "approve API failure re-enables BOTH Approve and Cancel for the row (CR pass)" — asserts both buttons are re-enabled after the API rejection path runs. Verification: - `go test ./...` clean (4265 passed / 0 failed / 6 skipped). - `npm test` clean (1471 passed / 0 failed). - `npx tsc --noEmit` clean.
Summary
Closes #47.
Adds an inline Retry button to failed Purchase History rows, mirroring the cancel-button infrastructure landed in #145. Three reviewer-confirmed design choices (yes / yes / yes) are implemented:
retry_execution_id(self-FK onpurchase_executions) +retry_attempt_ncounter. The History UI renders inline lineage links so users can navigate predecessor↔successor without scrolling.retry_attempt_n >= 5rejects with a structured 409; the frontend confirms-with-warning then re-POSTs?force=true. Prevents an obviously-stuck deployment from accumulating dozens of dead retry rows.errormatches a known-persistent-misconfig pattern (FROM_EMAIL not configured,SES sandbox, etc.) replace the Retry button entirely with an operator-actionable badge — retrying without changing config is guaranteed to fail again.Salvages a previously-stalled agent attempt: rebased onto current
feat/multicloud-web-frontendtip (which now includes #145 + thecreated_by_user_idmigration000041). The retry RBAC verbs (retry-own/retry-any) are added alongside the cancel verbs, withDefaultUserPermissions()grantingretry-own:purchasesto every authenticated user.Notable details
Recommendationsslice is now a defensive deep copy of the failed row's, so the downstream purchase pipeline (internal/purchase/execution.go) can't mutate the historical row's in-memory representation through the shared backing array. DB rows were already isolated; this closes the in-process aliasing footgun.clientErrorgrew adetailsmap so the API can surfaceops_hint/retry_attempt_n/thresholdas structured JSON fields rather than substring-matched message text. The response writer flattens those alongsideerrorin the body.persistRetryExecutionguarantees the FK constraint: successor INSERT → predecessor UPSERT → suppression INSERTs in one tx; a crash mid-flow leaves either everything or nothing.000042adds nullableretry_execution_id(self-FK ON DELETE SET NULL) +retry_attempt_n INT NOT NULL DEFAULT 0+ a partial index on the FK. Legacy rows without backfill look like fresh first-retry candidates.Test plan
go build ./...cleango test ./internal/api/... ./internal/auth/... ./internal/config/...— 14 new retry tests + bumped permission countsnpx tsc --noEmitcleannpx jest— 1337/1337 frontend tests pass, 11 new retry-button testsnpm run build— webpack production build cleanretry_attempt_n=1)Salvage notes
The previous agent stalled mid-3-pass-review with the work uncommitted in the worktree. This PR:
persistentFailureHints(case-INsensitive, not case-sensitive),retryPurchasecyclomatic complexity from 20 → ≤10 by extractingloadAndValidateRetryRequest,checkRetryRateGates, andpersistRetryExecution,Trigger CodeRabbit:
@coderabbitai reviewSummary by CodeRabbit