fix(security): approval-flow hardening — crypto/rand tokens + per-account contact gating by cristim · Pull Request #101 · LeanerCloud/CUDly

cristim · 2026-04-27T13:07:28Z

Summary

Closes 3 HIGH findings from the security review on the approval flow:

H1 — Approval tokens generated with uuid.New().String() (122 bits entropy in a known format) → swapped all 4 generation sites for a new pkg/common.GenerateApprovalToken() returning 32 random bytes from crypto/rand, hex-encoded (256 bits unpredictability, uniform output space).
H3 — authorizeApprovalAction fell back to the global notification email as a catch-all approver when no recommendation had a per-account contact_email. Anyone holding the global mailbox could approve spend on any account. Closed: resolveApprovalRecipients no longer adds globalNotify to the approver set; authorizeApprovalAction returns 403 with a clear "set the cloud account's contact_email before approving" message when no contact is configured.
H2 — RI exchange single-use enforcement was already in place at the SQL layer (TransitionRIExchangeStatus(pending → processing) is atomic; concurrent second approval gets 409). Audited and documented; no code change needed.
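
For readers who want to see the H1 change concretely, the following is a minimal sketch of what a helper like `GenerateApprovalToken()` could look like, based only on the description above (32 bytes from `crypto/rand`, hex-encoded, error returned to the caller). The constant name and error wording are assumptions, not the PR's actual code.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// approvalTokenBytes is an assumed name; 32 bytes gives 256 bits of randomness.
const approvalTokenBytes = 32

// GenerateApprovalToken returns 32 bytes from crypto/rand, hex-encoded into a
// 64-character string. A read failure is returned to the caller, not panicked on.
func GenerateApprovalToken() (string, error) {
	buf := make([]byte, approvalTokenBytes)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("generate approval token: %w", err)
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	token, err := GenerateApprovalToken()
	if err != nil {
		fmt.Println("token generation failed:", err)
		return
	}
	fmt.Println(len(token)) // prints 64: two hex characters per random byte
}
```

Unlike a UUID v4 string, every character of this output is drawn from the random bytes, so there are no fixed positions for an attacker to discount.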

Why pkg/common

The four token-generation sites span both Go modules — pkg/exchange/auto.go lives in the pkg/ submodule and cannot import internal/. pkg/common is the shared base used by every provider, so the helper lands there.

Behavior change to flag for operators

Any execution whose recommendations all reference cloud accounts WITHOUT a contact_email will fail approval until contact_email is set on the account. Previously the global notification email could approve on those accounts' behalf — that's the exact loophole this PR closes.
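
The gating rule can be illustrated with a small sketch. It assumes a simplified `CloudAccount` type and a lookup callback in place of the real config store; `resolveApprovers`, `authorize`, and both error values are hypothetical names chosen for illustration, and the case-insensitive match is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// CloudAccount is a hypothetical stand-in for the stored account record.
type CloudAccount struct {
	ID           string
	ContactEmail string
}

// ErrNoApprover models the 403 case: no per-account contact_email is configured.
var ErrNoApprover = errors.New("set the cloud account's contact_email before approving")

// ErrNotApprover is returned when the actor is not among the resolved approvers.
var ErrNotApprover = errors.New("actor is not an approver for this execution")

// resolveApprovers collects per-account contact emails only. The global
// notification address is deliberately not an input to this function.
func resolveApprovers(accountIDs []string, lookup func(string) (CloudAccount, error)) (map[string]bool, error) {
	approvers := make(map[string]bool)
	for _, id := range accountIDs {
		acct, err := lookup(id)
		if err != nil {
			return nil, fmt.Errorf("look up cloud account %s: %w", id, err)
		}
		if email := strings.ToLower(strings.TrimSpace(acct.ContactEmail)); email != "" {
			approvers[email] = true
		}
	}
	return approvers, nil
}

// authorize rejects when no contact is configured or the actor does not match one.
func authorize(actorEmail string, accountIDs []string, lookup func(string) (CloudAccount, error)) error {
	approvers, err := resolveApprovers(accountIDs, lookup)
	if err != nil {
		return err
	}
	if len(approvers) == 0 {
		return ErrNoApprover
	}
	if !approvers[strings.ToLower(strings.TrimSpace(actorEmail))] {
		return ErrNotApprover
	}
	return nil
}

func main() {
	accounts := map[string]CloudAccount{
		"acct-1": {ID: "acct-1", ContactEmail: "owner@example.com"},
		"acct-2": {ID: "acct-2"}, // no contact_email set
	}
	lookup := func(id string) (CloudAccount, error) {
		acct, ok := accounts[id]
		if !ok {
			return CloudAccount{}, errors.New("cloud account not found")
		}
		return acct, nil
	}
	fmt.Println(authorize("owner@example.com", []string{"acct-1"}, lookup))
	fmt.Println(authorize("ops@example.com", []string{"acct-2"}, lookup))
}
```

In this sketch an execution that touches only `acct-2` can never be approved until a contact is set, which is the operator-visible behavior change described above.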

Files

pkg/common/tokens.go (NEW) — GenerateApprovalToken() + tests.
pkg/exchange/auto.go — RI exchange uses the helper; removes uuid import.
internal/api/handler_purchases.go — new newPendingExecution + persistExecutionAndSuppressions helpers (gocyclo extraction); tightened resolveApprovalRecipients so global notify is never an approver; authorizeApprovalAction rejects with a clear error when no per-account contact_email exists.
internal/api/handler_plans.go — plan execution token uses the helper.
internal/purchase/notifications.go — scheduled notification token uses the helper.
Test updates: existing approval-flow tests now wire a per-account ContactEmail through MockConfigStore.GetCloudAccountFn (the prior tests relied on the global-fallback this PR removes); 4 new tests cover token uniqueness, the no-contact-email rejection, and the security-hardened recipient policy.

Test plan

go test -short -race -count=1 ./internal/api/... ./internal/purchase/... green.
(cd pkg && go test -short -race -count=1 ./common/... ./exchange/...) green.
gocyclo -over 10 clean.
All pre-commit hooks pass.
Operator action before merge: query purchase_history/purchase_executions for accounts that have approved purchases via the global mailbox (no per-account contact_email). Notify those owners that they need to set contact_email on the account before the next approval cycle, or future approvals will 403.

Follow-up PRs

PR3 (input cap), PR4 (IAM wildcards), PR5 (supply chain) — independent of this PR.

🤖 Generated with claude-flow

Summary by CodeRabbit

Security Improvements
- Approval tokens now use cryptographically-generated tokens for stronger protection.
Validation Changes
- Approvals/cancellations require per-account contact emails; global notification address is not an approver.
- Asynchronous approval messages must include and validate an actor email against per-account approvers.
Transactional Behavior
- Creation of planned executions and plan update now occur atomically so partial writes roll back on failure.
Error Handling
- Token-generation and contact-email lookup failures surface and prevent creation/approval actions.
Tests
- Added coverage for token generation, transactional rollback, recipient-resolution, and async gating.

…t contact gating

Addresses 3 HIGH findings from the security review:

H1: Approval tokens were generated with `uuid.New().String()`. UUID v4 carries 122 bits of entropy in a known format with the version and variant bits fixed, which is fine for identity but a poor primary credential. Switched all four token-generation sites to a new `pkg/common.GenerateApprovalToken()` that returns 32 random bytes from `crypto/rand`, hex-encoded, for a full 256 bits of unpredictability and a uniform output space. Sites updated:

- internal/api/handler_purchases.go (web purchase flow)
- internal/api/handler_plans.go (plan execution creation)
- internal/purchase/notifications.go (scheduled purchase notifications)
- pkg/exchange/auto.go (RI exchange automation, in pkg/ submodule)

The helper lives in pkg/common because pkg/exchange (a separate Go module) cannot import internal/. All four sites now propagate the rand.Read error rather than panicking; auto.go returns an ExchangeOutcome with the error field populated so the surrounding loop continues processing the rest of the recommendations.

H3: `authorizeApprovalAction` previously fell back to the global Settings notification email as a catch-all approver when no recommendation had a per-account contact_email. Anyone holding the global notify mailbox could approve spend on any account under that policy. Closed the loophole: `resolveApprovalRecipients` no longer adds globalNotify to the approver set, so only per-account contact_email values count. authorizeApprovalAction returns 403 with a clear message ("set the cloud account's contact_email before approving") when no contact_email is configured.

Behaviour change to call out for operators: any execution whose recommendations all reference accounts WITHOUT a contact_email will fail approval until contact_email is set on the account. Existing tenants who relied on the global mailbox are affected.

H2: RI exchange single-use enforcement. The audit confirmed it is already in place at the SQL layer: `TransitionRIExchangeStatus(pending → processing)` is atomic, and a concurrent second approval attempt returns 409 ("already processed"). The token comparison uses `subtle.ConstantTimeCompare`. No code change needed; documenting the audit here so future readers don't think H2 was missed.

Tests: 4 new tests covering token uniqueness/length, the no-contact-email rejection path, and updates to existing approval-flow tests so they wire a per-account ContactEmail through MockConfigStore.GetCloudAccountFn (the prior tests relied on the global-fallback that this PR removes).
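
The H2 audit result can be pictured with a small in-memory analogue. This is only a sketch: the real enforcement is described as an atomic SQL status transition, and here a mutex stands in for that conditional update. The `exchange` type and both error values are hypothetical.

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
	"sync"
)

var (
	ErrBadToken         = errors.New("invalid approval token")
	ErrAlreadyProcessed = errors.New("already processed") // the 409 case
)

// exchange is a hypothetical in-memory stand-in for an RI exchange row.
type exchange struct {
	mu     sync.Mutex
	token  string // set once at creation, never modified afterwards
	status string // "pending" or "processing"
}

// approve compares the token in constant time, then moves pending → processing
// at most once. The mutex plays the role of the atomic SQL transition.
func (e *exchange) approve(token string) error {
	if subtle.ConstantTimeCompare([]byte(token), []byte(e.token)) != 1 {
		return ErrBadToken
	}
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.status != "pending" {
		return ErrAlreadyProcessed
	}
	e.status = "processing"
	return nil
}

func main() {
	ex := &exchange{token: "example-token", status: "pending"}
	results := make([]error, 2)
	var wg sync.WaitGroup
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = ex.approve("example-token")
		}(i)
	}
	wg.Wait()
	// One call succeeds; the other reports "already processed".
	fmt.Println(results)
}
```

Which of the two goroutines wins is not deterministic, but only one of them can observe the `pending` state.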

coderabbitai · 2026-04-27T13:07:37Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a02cc89e-eeb2-4ee0-8538-da44237bcfc1

📥 Commits

Reviewing files that changed from the base of the PR and between b0d4d7f and 2017748.

📒 Files selected for processing (2)

internal/api/handler_plans_test.go
internal/scheduler/scheduler_test.go

🚧 Files skipped from review as they are similar to previous changes (2)

internal/scheduler/scheduler_test.go
internal/api/handler_plans_test.go

📝 Walkthrough

Approval tokens are now crypto-random via common.GenerateApprovalToken(). Execution creation and plan next-execution updates can run inside shared DB transactions (*Tx variants). SQS approve/cancel messages must include actor_email validated against per-account contact_email. Tests and mocks updated for tx-aware APIs and recipient-resolution behavior.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Token core & tests | `pkg/common/tokens.go`, `pkg/common/tokens_test.go` | Add `common.GenerateApprovalToken()` producing 32 random bytes (64-char hex) and unit tests validating length, uniqueness, decoding, and non-zero output. |
| Execution creation & plan tx | `internal/api/handler_plans.go`, `internal/api/handler_plans_test.go`, `internal/config/store_postgres.go`, `internal/config/store_postgres_pgxmock_test.go`, `internal/config/interfaces.go`, `internal/mocks/stores.go` | Move execution insert + plan next-execution bump into `WithTx` transactional flow; add `UpdatePurchasePlanTx` interface and Postgres TX implementation; tests assert rollback on mid-loop failures and transaction framing. |
| Purchase flows & approvals | `internal/api/handler_purchases.go`, `internal/api/handler_purchases_test.go`, `internal/api/handler_test.go`, `internal/purchase/notifications.go`, `internal/purchase/approvals.go`, `internal/purchase/messages.go`, `internal/purchase/coverage_extra_test.go` | Switch to `GenerateApprovalToken()` with error propagation; extract persistence helpers; require per-account `contact_email` for approver set (global notify excluded from approvers); SQS handlers require and validate `actor_email`; tests updated for recommendation→cloud-account contact lookup and negative cases. |
| Async message & actor enforcement | `internal/purchase/messages.go`, `internal/purchase/approvals.go`, `internal/purchase/coverage_extra_test.go` | Add `ActorEmail` to `AsyncMessage`; verify actor + token before invoking Approve/Cancel; reject legacy payloads missing `actor_email`; tests added for missing/invalid actor and token short-circuiting. |
| Mocks & test helpers (tx-aware) | `internal/api/mocks_test.go`, `internal/purchase/mocks_test.go`, `internal/analytics/collector_test.go`, `internal/scheduler/scheduler_test.go`, `internal/server/test_helpers_test.go`, `internal/mocks/stores.go` | Introduce `UpdatePurchasePlanTx` to multiple test mocks with fallback-to-non-tx behavior; adjust test stubs (e.g., `GetCloudAccount`) to be expectation-aware so tx-based flows compile and test correctly. |
| Exchange & failure recording | `pkg/exchange/auto.go` | Manual exchanges use `GenerateApprovalToken()`; token-gen failures log and persist failed `ExchangeRecord` with explicit `mode`; `saveFailedRecord` extended to accept mode and call sites updated. |
| Docstrings & minor edits | `internal/purchase/approvals.go`, `internal/purchase/messages.go` | Clarified comments about approval gating and actor attribution; updated Approve/Cancel call sites to pass validated `actor_email`. |

sequenceDiagram
    participant Client
    participant Handler as API/SQS Handler
    participant TokenGen as common.GenerateApprovalToken
    participant DB as Postgres (WithTx)
    participant Resolver as Recipient Resolver
    participant CloudAPI as ConfigStore.GetCloudAccount

    Client->>Handler: submit approve/cancel (HTTP session or SQS msg with actor_email)
    Handler->>DB: Get PurchaseExecution (includes Recommendations)
    DB-->>Handler: execution with CloudAccountIDs

    Handler->>TokenGen: GenerateApprovalToken() (when creating execution)
    TokenGen-->>Handler: token / error

    Handler->>Resolver: resolveApprovalRecipients(recommendations)
    Resolver->>CloudAPI: GetCloudAccount(cloudAccountID)
    CloudAPI-->>Resolver: CloudAccount{ContactEmail}

    alt ContactEmail found
        Resolver-->>Handler: approvers includes ContactEmail
        Handler->>DB: WithTx { SavePurchaseExecutionTx / UpdatePurchasePlanTx / ApproveExecution/CancelExecution }
        DB-->>Handler: commit
        Handler-->>Client: success
    else ContactEmail missing or lookup error
        Resolver-->>Handler: approvers empty or error
        Handler-->>Client: reject (403 or propagate error)
    end
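
The transactional step in the diagram (execution inserts and the plan bump committing together or not at all) can be sketched with an in-memory analogue. This is an illustration under stated assumptions, not the Postgres implementation: `store`, `tx`, `createExecutions`, and `errSimulated` are hypothetical, and a copied struct stands in for a database transaction.

```go
package main

import (
	"errors"
	"fmt"
)

// errSimulated is a hypothetical failure used to trigger a rollback.
var errSimulated = errors.New("simulated failure")

// store is a hypothetical in-memory stand-in for the Postgres-backed config store.
type store struct {
	executions []string
	nextStep   int
}

// tx holds a working copy of the state; changes become visible only on commit.
type tx struct {
	executions []string
	nextStep   int
}

// WithTx runs fn against a copy of the state and commits only if fn returns nil.
func (s *store) WithTx(fn func(t *tx) error) error {
	t := &tx{executions: append([]string(nil), s.executions...), nextStep: s.nextStep}
	if err := fn(t); err != nil {
		return err // rollback: the working copy is discarded
	}
	s.executions, s.nextStep = t.executions, t.nextStep
	return nil
}

// createExecutions saves every execution and bumps the plan step in one transaction.
func createExecutions(s *store, ids []string, save func(t *tx, id string) error) error {
	return s.WithTx(func(t *tx) error {
		for _, id := range ids {
			if err := save(t, id); err != nil {
				return fmt.Errorf("save execution %s: %w", id, err)
			}
		}
		t.nextStep++
		return nil
	})
}

func main() {
	s := &store{}
	failOnSecond := func(t *tx, id string) error {
		if id == "exec-2" {
			return errSimulated
		}
		t.executions = append(t.executions, id)
		return nil
	}
	err := createExecutions(s, []string{"exec-1", "exec-2"}, failOnSecond)
	fmt.Println(err != nil, len(s.executions), s.nextStep) // prints: true 0 0
}
```

The first save succeeded inside the transaction, yet nothing is visible in the store afterwards, which is the rollback-on-mid-loop-failure behavior the walkthrough describes.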

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

feat(history): inline Cancel button + cancel-any/own RBAC verbs #145 — overlaps on cancel/approval token and session handling tests in internal/api/coverage_extras_test.go.

Suggested labels

urgency/this-sprint, type/bug

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 57.89% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main security-focused changes: replacing UUID tokens with cryptographically secure random tokens and enforcing per-account contact email for approvals. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


cristim · 2026-04-27T13:07:40Z

@coderabbitai review

coderabbitai · 2026-04-27T13:07:44Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

internal/api/handler_purchases.go (1)
820-850: ⚠️ Potential issue | 🟠 Major

The SQS approval path remains unprotected by email verification and needs follow-up action.

The authorization gap is confirmed. authorizeApprovalAction correctly enforces session ↔ contact_email matching for the HTTP path, but the asynchronous SQS path (handleApproveMessage / handleCancelMessage) bypasses this entirely. These handlers pass an empty actor directly to ApproveExecution / CancelExecution without ever checking the approver email list. Both methods validate only the token and never verify that the caller matches the approved contacts—a token holder can approve any execution.

The code itself labels this the "legacy token-only path" and includes explicit comments acknowledging the attribution fallback behavior, but no deprecation notice or removal timeline exists. The SQS handlers remain wired into the active message dispatcher.

Either deprecate the SQS flow explicitly or add a comment in messages.go documenting this authorization gap as a known limitation pending future closure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/api/handler_purchases.go` around lines 820 - 850, The SQS approval
path lacks email verification: locate the asynchronous handlers
handleApproveMessage and handleCancelMessage and either (A) deprecate the SQS
flow by adding a clear deprecation comment and TODO with removal timeline and
link to the secure HTTP path (authorizeApprovalAction / ApproveExecution /
CancelExecution), or (B) add an explicit comment in messages.go documenting the
known authorization gap (that SQS messages use token-only attribution and do not
verify session ↔ contact_email), explain the risk, and add a TODO referencing a
follow-up to enforce approver email checks in ApproveExecution/CancelExecution;
include the function names (handleApproveMessage, handleCancelMessage,
ApproveExecution, CancelExecution, authorizeApprovalAction) in the comment so
reviewers can find the exact code paths.

🧹 Nitpick comments (2)

pkg/exchange/auto.go (1)
277-291: LGTM, with a minor consistency observation.

Token-gen failure correctly aborts the manual exchange and surfaces as result.Failed via the Error != "" branch in processRecommendation. Note that unlike processAutoExchange (which calls saveFailedRecord on its failure paths), this branch logs but does not persist a failed-status record — so an operator auditing the DB won't see this failure, only the log line. Given how rare crypto/rand.Read failures are in practice, this is fine; flagging only so the asymmetry is intentional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/exchange/auto.go` around lines 277 - 291, In processManualExchange, after
GenerateApprovalToken fails, persist a failed-status record the same way
processAutoExchange does by calling saveFailedRecord with the relevant ctx,
params (RunAutoExchangeParams), rec (ReshapeRecommendation), offeringID, and the
error string instead of only logging; ensure the resulting ExchangeOutcome still
returns the Error message and other fields as currently done so the DB audit
shows the failure consistently with other failure paths.
internal/api/coverage_extras_test.go (1)
41-52: LGTM — test setup correctly aligned with per-account contact-email gating.

The added accountID, Recommendations with CloudAccountID, and GetCloudAccountFn returning a CloudAccount whose ContactEmail matches the session approver match the new authorizeApprovalAction requirement (403 unless per-account contact_email resolves to the approver). The error path under test (ApproveExecution returning "approval failed") is now reached only after authorization succeeds, which is what this test intends to exercise.

One minor cleanup opportunity: with H3 removing globalNotify from resolveApprovalRecipients, the GetGlobalConfig stub returning NotificationEmail: &approver (lines 54–56) may no longer influence authorization here. If it's still referenced by another code path on this branch it's fine to leave; otherwise consider trimming to keep the test minimal.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/api/coverage_extras_test.go` around lines 41 - 52, The test
currently stubs GetGlobalConfig to return NotificationEmail (lines referenced)
but since authorizeApprovalAction now uses per-account contact_email via
GetCloudAccount (mocked by mockConfig.GetCloudAccountFn) and
resolveApprovalRecipients no longer uses globalNotify, remove or simplify the
GetGlobalConfig stub to keep the test minimal: delete the
mockConfig.GetGlobalConfig assignment or return a zero-value/global config
without NotificationEmail; ensure the rest of the test still sets accountID,
Recommendations (with CloudAccountID) and mockConfig.GetCloudAccountFn so
authorizeApprovalAction and ApproveExecution exercise the intended authorization
and error path.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ec5c4164-ae28-4468-8913-7d247d903bc3

📥 Commits

Reviewing files that changed from the base of the PR and between e9ab8ba and 040de8c.

📒 Files selected for processing (9)

internal/api/coverage_extras_test.go
internal/api/handler_plans.go
internal/api/handler_purchases.go
internal/api/handler_purchases_test.go
internal/api/handler_test.go
internal/purchase/notifications.go
pkg/common/tokens.go
pkg/common/tokens_test.go
pkg/exchange/auto.go

- internal/purchase/messages.go: document the SQS approve/cancel authorization gap explicitly. The async path bypasses the per-account contact_email gating enforced by authorizeApprovalAction in the HTTP handler; record this as a known limitation with a TODO referencing the two viable closures (deprecate SQS, or carry session email + enforce in ApproveExecution/CancelExecution). No behaviour change.
- pkg/exchange/auto.go: persist a failed-status record when GenerateApprovalToken fails in processManualExchange so the DB audit trail mirrors the auto-mode failure paths. Parameterise saveFailedRecord with a `mode` argument ("auto" or "manual") rather than hardcoding "auto", keeping each path's mode accurate. Existing auto-mode callsites pass "auto" verbatim.
- internal/api/coverage_extras_test.go: simplify the GetGlobalConfig stub in the approve/cancel error-path tests to return an empty GlobalConfig. With per-account contact_email gating, the global notify mailbox no longer participates in authorization, so the NotificationEmail value was dead weight in the test fixture.

cristim · 2026-04-27T21:31:25Z

@coderabbitai review

coderabbitai · 2026-04-27T21:31:30Z

✅ Actions performed

Review triggered.

cristim · 2026-04-27T21:48:40Z

@coderabbitai full review

coderabbitai · 2026-04-27T21:48:45Z

✅ Actions performed

Full review triggered.

coderabbitai

🧹 Nitpick comments (2)

pkg/exchange/auto.go (1)

442-460: Consider constraining mode to constants to prevent silent typos.

saveFailedRecord now accepts an arbitrary string; one typo could silently degrade downstream filtering/UI behavior.

🧩 Optional hardening of mode values

+type ExchangeMode string
+
+const (
+	ExchangeModeAuto   ExchangeMode = "auto"
+	ExchangeModeManual ExchangeMode = "manual"
+)
...
-func saveFailedRecord(ctx context.Context, params RunAutoExchangeParams, rec ReshapeRecommendation, offeringID, paymentDueStr, errMsg, mode string) {
+func saveFailedRecord(ctx context.Context, params RunAutoExchangeParams, rec ReshapeRecommendation, offeringID, paymentDueStr, errMsg string, mode ExchangeMode) {
...
-		Mode:               mode,
+		Mode:               string(mode),

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/exchange/auto.go` around lines 442 - 460, The saveFailedRecord function
accepts an arbitrary mode string which risks silent typos; change it to use a
constrained Mode type and constants (e.g., type Mode string with const ModeAuto
= "auto", ModeManual = "manual") and update saveFailedRecord's signature to
accept Mode (or validate the incoming string against those constants at the top
of saveFailedRecord) and set ExchangeRecord.Mode using the constant/validated
value; also update callers of saveFailedRecord (call sites passing
"auto"/"manual") to use the new Mode constants or ensure they pass validated
values so downstream filtering/UI sees only the expected values.

internal/api/handler_plans.go (1)

259-274: Preserve partial progress on loop errors instead of returning 0.

At Line 261 (and similarly Line 273), returning 0 can mask already persisted executions from earlier iterations. Returning created makes retries and incident handling safer.

♻️ Proposed adjustment

-		if err != nil {
-			return 0, fmt.Errorf("failed to generate approval token: %w", err)
-		}
+		if err != nil {
+			return created, fmt.Errorf("failed to generate approval token after creating %d executions: %w", created, err)
+		}
...
-		if err := h.config.SavePurchaseExecution(ctx, execution); err != nil {
-			return 0, fmt.Errorf("failed to save execution: %w", err)
-		}
+		if err := h.config.SavePurchaseExecution(ctx, execution); err != nil {
+			return created, fmt.Errorf("failed to save execution after creating %d executions: %w", created, err)
+		}

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@internal/api/handler_plans.go` around lines 259 - 274, The error paths inside
the loop currently return 0 (e.g., after GenerateApprovalToken failure and after
SavePurchaseExecution failure) which hides any executions already persisted;
update those return statements to return the current "created" count instead of
0 so partial progress is preserved. Specifically, in the block where
GenerateApprovalToken is called and where h.config.SavePurchaseExecution(ctx,
execution) is called, replace the "return 0, fmt.Errorf(...)" with "return
created, fmt.Errorf(...)" (use the existing created variable in the surrounding
function) so callers and retry logic can detect how many executions were
successfully created; ensure this change is made in the same function that
constructs config.PurchaseExecution and references plan.RampSchedule.CurrentStep
and execution.ExecutionID.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5578279b-0e3f-4a4d-830e-8c3b47da33b6

📥 Commits

Reviewing files that changed from the base of the PR and between e9ab8ba and 9f23edb.

📒 Files selected for processing (10)

internal/api/coverage_extras_test.go
internal/api/handler_plans.go
internal/api/handler_purchases.go
internal/api/handler_purchases_test.go
internal/api/handler_test.go
internal/purchase/messages.go
internal/purchase/notifications.go
pkg/common/tokens.go
pkg/common/tokens_test.go
pkg/exchange/auto.go

…token-entropy

PR #145 (`feat(history): inline Cancel button + cancel-any/own RBAC verbs`) landed on the base while #101 was in flight. Both touch `internal/api/handler_purchases.go`'s pending-execution construction in `executePurchase`:

- #101 hardens the path by extracting `newPendingExecution(req, total, savings)` so the approval token comes from `crypto/rand` via `common.GenerateApprovalToken` instead of the previous inline `uuid.New().String()`.
- #145 adds `CreatedByUserID: resolveCreatorUserID(session)` to every pending execution so the cancel-own RBAC path (issue #46) can identify the creator on later cancellation.

The two changes are semantically independent; keeping both. Resolution: keep #101's `newPendingExecution(...)` call, then set `execution.CreatedByUserID = resolveCreatorUserID(session)` immediately after construction. The helper stays signature-stable (no session parameter to plumb) and the cancel-own stamp is in place for every code path that goes through `executePurchase`.

Other files in the merge auto-resolved cleanly (`internal/api/coverage_extras_test.go`, `internal/api/handler_purchases_test.go`, `internal/api/handler_test.go`).

`go build ./...`, `go test ./...`, `go test ./...` from `pkg/` all clean.

…on plan create

Two CodeRabbit nitpicks from review 4184423401:

1. `pkg/exchange/auto.go`: introduce a typed `ExchangeMode` with `ExchangeModeAuto` / `ExchangeModeManual` constants and tighten `saveFailedRecord`'s `mode` parameter from `string` to that type. The previous round of fixes parameterised `mode` as a raw string, which let a caller-side typo silently leak into `ExchangeRecord.Mode` and break downstream filtering. `ExchangeRecord.Mode` itself stays `string` for serialization stability; this is call-site discipline, not a schema change. Updated all five call sites (1 manual, 4 auto) to pass the constant.
2. `internal/api/handler_plans.go`: `createPurchaseExecutions` now returns the running `created` count alongside an error rather than `0`. Callers that retry or audit can see how many executions made it to the DB before the loop bailed; the error message also names the count for log/incident clarity. The current caller drops the count on error, so this is a no-op functionally and a future-proofing fix for retry paths.

`go build ./...`, `go test ./...`, `go test ./...` from `pkg/` all clean.
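
To illustrate the typed-mode change from this commit message, here is a reduced sketch. `ExchangeMode` and its two constants follow the names given above; `failedRecord` is a hypothetical, simplified stand-in for `saveFailedRecord` that builds a record instead of persisting it, and the `ExchangeRecord` fields other than `Mode` are assumptions.

```go
package main

import "fmt"

// ExchangeMode constrains the mode argument at call sites.
type ExchangeMode string

const (
	ExchangeModeAuto   ExchangeMode = "auto"
	ExchangeModeManual ExchangeMode = "manual"
)

// ExchangeRecord keeps Mode as a plain string, so the stored shape is unchanged.
// Status and Error are assumed fields for this sketch.
type ExchangeRecord struct {
	Status string
	Mode   string
	Error  string
}

// failedRecord is a hypothetical stand-in for saveFailedRecord: it takes the
// typed mode and converts it to a string only when filling the record.
func failedRecord(errMsg string, mode ExchangeMode) ExchangeRecord {
	return ExchangeRecord{Status: "failed", Mode: string(mode), Error: errMsg}
}

func main() {
	rec := failedRecord("token generation failed", ExchangeModeManual)
	fmt.Printf("%+v\n", rec)
}
```

One caveat on the design choice: Go rejects a `string` variable passed where an `ExchangeMode` is expected, but it still accepts an untyped string literal such as `"manuel"`, so the protection depends on call sites using the constants.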

cristim · 2026-04-27T23:09:17Z

@coderabbitai review

coderabbitai · 2026-04-27T23:09:22Z

✅ Actions performed

Review triggered.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

internal/api/handler_purchases.go (1)
965-995: ⚠️ Potential issue | 🔴 Critical

The new approver gate is still bypassable through the async token-only path.

This hardens only the HTTP route. internal/purchase/messages.go still forwards approve/cancel SQS messages with just execution_id + token, and internal/purchase/approvals.go accepts those calls after token validation alone. That leaves a live path where a replayed or forwarded async payload can approve/cancel without the session-email check added here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/api/handler_purchases.go` around lines 965 - 995, The async SQS
approve/cancel path is still authorisable by token only; update the message
forwarding and processing so they perform the same approver-email check as
authorizeApprovalAction: when creating/forwarding SQS messages in the message
sender (messages.go), include the initiating actor/session email (or reject if
not present), and in the approval processor in approvals.go, after token
validation call resolveApprovalRecipients(...) and verify the provided actor
email matches one of the approvers (same logic as in authorizeApprovalAction,
including actorLower normalization and NewClientError-style rejection); ensure
missing or non-matching actor info causes the async message to be rejected and
logged.

🧹 Nitpick comments (1)

pkg/exchange/auto.go (1)

442-451: Use the new mode constants for success-path record writes too.

You introduced ExchangeMode to avoid string typos; reusing it in success-path record.Mode assignments would complete that safety improvement.

♻️ Proposed small consistency patch

diff --git a/pkg/exchange/auto.go b/pkg/exchange/auto.go
@@
-        Mode:               "manual",
+        Mode:               string(ExchangeModeManual),
@@
-        Mode:               "auto",
+        Mode:               string(ExchangeModeAuto),

Also applies to: 456-470

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/exchange/auto.go` around lines 442 - 451, Replace raw string assignments
to record.Mode in the success-path record writes with the new ExchangeMode
constants: use ExchangeModeAuto or ExchangeModeManual (convert to string if
record.Mode is a plain string) wherever record.Mode is set (the success-path
writes near the same area as the saveFailedRecord usage); ensure any comparison
or assignment that previously used literal "auto"/"manual" uses ExchangeMode and
the corresponding ExchangeModeAuto/ExchangeModeManual to avoid typos.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/api/handler_plans.go`:
- Around line 259-279: The loop in createPlannedPurchases writes each
PurchaseExecution (constructed in execution and persisted via
h.config.SavePurchaseExecution) one-by-one so partial failures leave persisted
executions and allow retries to duplicate them; change the flow to make the
batch atomic: either wrap all SavePurchaseExecution calls in a single DB
transaction exposed by h.config (e.g., SavePurchaseExecutionTx/BeginTx/Commit)
and only call updatePlanNextExecutionDate after Commit, or if a transaction API
isn't available, make creation idempotent by checking for an existing execution
(by PlanID+ScheduledDate or ApprovalToken) before saving and skip duplicates,
then call updatePlanNextExecutionDate only when all executions are confirmed
saved. Ensure references: createPlannedPurchases, PurchaseExecution,
SavePurchaseExecution, updatePlanNextExecutionDate, and ApprovalToken are
updated accordingly.

In `@internal/api/handler_purchases.go`:
- Around line 858-866: lookupContactEmail currently swallows GetCloudAccount
errors and returns an empty string, which makes transient lookup failures appear
as "no contact_email" in the handler; modify lookupContactEmail to return
(string, error) (or a second boolean/error sentinel) so the caller can
distinguish "no email configured" from "lookup failed", then update the branch
in handler_purchases.go that currently checks if len(contactEmails) == 0 to
instead check the returned error first (if err != nil, propagate/return the
error as a retriable server error) and only fall back to globalNotify when the
call succeeded and truly returned no contact email; update any callers of
lookupContactEmail accordingly (preserve existing logging but surface the error
to the handler rather than converting it to an empty string).


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 43aaec54-8c68-4848-85f2-625e61bfc8ac

📥 Commits

Reviewing files that changed from the base of the PR and between 9f23edb and e3d3ca5.

📒 Files selected for processing (6)

internal/api/coverage_extras_test.go
internal/api/handler_plans.go
internal/api/handler_purchases.go
internal/api/handler_purchases_test.go
internal/api/handler_test.go
pkg/exchange/auto.go

🚧 Files skipped from review as they are similar to previous changes (1)

internal/api/coverage_extras_test.go

cristim · 2026-04-28T14:14:44Z

P1 — fixes two HIGH security findings from the approval flow: (H1) tokens were generated with uuid.New() (122-bit, known format) rather than crypto/rand, and (H3) the global notification email acted as a catch-all approver bypassing per-account contact gating. Operator action required before merge: ensure all accounts with active purchase executions have contact_email set, or future approvals will 403 until configured. CI green, MERGEABLE. (triage agent wave2-E)

The HTTP approve/cancel route enforces a per-account contact_email gate via authorizeApprovalAction, but the SQS message handler accepted any payload after a valid token alone — letting a replayed or forwarded async message approve or cancel a purchase without the email check.

Add an actor_email field to AsyncMessage and require it for approve/cancel. handleApproveMessage and handleCancelMessage now run the new verifyAsyncApprovalActor helper, which:

1. Rejects messages without actor_email (legacy / replayed payloads have no field — they are NOT silently accepted).
2. Validates the approval token using constant-time compare.
3. Resolves the per-account contact_email approver list from the execution's recommendations and asserts actor_email matches one (case-insensitive, trimmed). Empty approver list is rejected with the same policy as authorizeApprovalAction.

Legacy in-flight messages without actor_email are rejected by design (option (a) of the hardening plan): backfilling tokenless approvals is exactly the threat model this fix addresses, and any stranded action can be re-issued via the HTTP route.

The approver-list algorithm duplicates ~15 lines from internal/api/handler_purchases.go gatherAccountContactEmails because internal/api and internal/purchase share no transitive dependency; a shared helper would cost more than keeping the two loops in sync. Both sides cross-reference each other in comments so future edits don't drift.

Tests cover the four scenarios called out in CodeRabbit's review: matching actor → success, non-matching actor → reject, missing actor field → reject, token mismatch → reject (existing behaviour, no regression). MockConfigStore.GetCloudAccount in internal/purchase gains the same isExpected pattern internal/api/mocks_test.go uses so existing tests that don't care about account lookups stay green.

…opagation

Two CodeRabbit findings on PR #101, both touching internal/api purchase flow.

Finding 2 — partial-failure orphaned rows. createPlannedPurchases looped SavePurchaseExecution outside any transaction, then called UpdatePurchasePlan once at the end. A mid-loop failure (e.g. transient DB error on row 4 of 5) committed rows 1-3, skipped the plan's next_execution_date bump, and on retry duplicated rows 1-3. Wrap the loop and the plan update in a single WithTx so both classes of corruption are impossible — the caller can safely retry transient errors knowing nothing was committed.

Approach: WithTx (option (a)). The codebase already exposes WithTx + SavePurchaseExecutionTx and uses both elsewhere (purchase suppressions, RI exchange flows). Adding the matching UpdatePurchasePlanTx variant costs less than introducing idempotency keys for one caller, and TX semantics give the caller a clean retry contract instead of "may-or-may-not-have-rows-from-the-previous-attempt".

UpdatePurchasePlan now delegates to UpdatePurchasePlanTx via a single-call WithTx, sharing SQL between the two variants. UpdatedAt is stamped on the public method before the WithTx call so existing nil-DB tests that inspect UpdatedAt without exercising the DB still see it set. The tx variant also stamps if zero, so direct callers that bundle this with other writes don't have to remember.

A regression test (TestHandler_createPlannedPurchases_MidLoopFailureRollsBack) drives SavePurchaseExecutionTx through the mock to fail on the third call and asserts UpdatePurchasePlan / Tx is never reached — the rollback semantics CodeRabbit asked for.

Finding 3 — silent error swallow in lookupContactEmail. lookupContactEmail's old signature returned bare string and logged DB errors as "". The caller couldn't distinguish "no contact email configured" (legitimate fall-through to globalNotify) from a real transient lookup failure that should surface as retriable. A DB blip could silently change who is allowed to approve, which is exactly the wrong failure mode for an authorisation lookup.

New signature: (string, error).

- Real error → propagated.
- Account not found (pgx.ErrNoRows → nil, nil from the postgres store) → ("", nil), same fall-through as before.
- Account found but empty ContactEmail → ("", nil), unchanged.

The single caller gatherAccountContactEmails wraps and returns; resolveApprovalRecipients already plumbed err through, so callers up the chain work unchanged.

A regression test (TestHandler_resolveApprovalRecipients_LookupErrorPropagates) mocks GetCloudAccount to return a transient error and asserts the error is wrapped (errors.Is preserves the chain) and no fallback approver list is returned.

Mock surface updates: every config.StoreInterface mock in the codebase (internal/api, internal/purchase, internal/analytics, internal/scheduler, internal/server, internal/mocks) gains UpdatePurchasePlanTx, with the same isExpected fall-through pattern used for SavePurchaseExecutionTx so tests that don't care about the Tx variant stay green. The pgxmock-backed UpdatePurchasePlan_NotFound test now scripts the matching Begin / Rollback frame.

Two success-path record.Mode assignments in pkg/exchange/auto.go still used the literal strings "manual" and "auto" while the failed-record path (saveFailedRecord) was already routed through ExchangeMode constants ExchangeModeManual / ExchangeModeAuto. Replace the literals with string(ExchangeModeManual) and string(ExchangeModeAuto) so the constants are the single source of truth for valid values — closes the residual call-site discipline gap CodeRabbit flagged on PR #101. The ExchangeRecord.Mode storage type stays string for serialization stability; this change is purely about preventing typo drift at write sites.

cristim · 2026-04-28T16:50:48Z

All four CodeRabbit findings from the 2026-04-27 review (run id 43aaec54-8c68-4848-85f2-625e61bfc8ac) addressed across three atomic commits.

Finding 1 — CRITICAL: async SQS path bypassed the approver gate

Commit a36e0d557 fix(security): close async-SQS bypass of approver gate (#101 CRITICAL)

Added actor_email field to internal/purchase/messages.go AsyncMessage. Required for approve / cancel messages — empty / missing field is rejected before any state mutation.
New verifyAsyncApprovalActor helper in internal/purchase/messages.go runs the same approver check authorizeApprovalAction runs on the HTTP path: load execution → constant-time token compare → resolve per-account contact_email approver list → match actor_email (case-insensitive, trimmed). Empty approver list rejected with the same policy as the HTTP path.
Legacy in-flight messages — option (a), reject. Decided against backfilling tokenless approvals: that would reopen the exact threat model this fix closes. Stranded actions can be re-issued via the HTTP route gated by authorizeApprovalAction. Documented in the commit body.
Approver-list algorithm duplicates ~15 lines from internal/api/handler_purchases.go gatherAccountContactEmails; both sides cross-reference each other in comments. internal/api and internal/purchase share no transitive dependency, and a shared helper in internal/config would cost more than keeping the two loops in sync. Will hoist if a third caller appears.
Tests in internal/purchase/coverage_extra_test.go: matching actor → success, non-matching actor → reject, missing actor field → reject, token mismatch → reject, no approvers configured → reject. Existing happy-path tests updated for the new field.
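The check order described for Finding 1 can be sketched roughly as follows. This is not the code from `internal/purchase/messages.go`: the `execution` struct, the pre-resolved `Approvers` slice, the JSON tags other than `actor_email`, and the `errUnauthorized` sentinel are all assumptions made to keep the example self-contained. The real helper loads the execution and resolves approvers from the store.

```go
package main

import (
	"crypto/subtle"
	"errors"
	"fmt"
	"strings"
)

// AsyncMessage carries the fields named in the thread; tags other than
// actor_email are assumed.
type AsyncMessage struct {
	ExecutionID string `json:"execution_id"`
	Token       string `json:"token"`
	ActorEmail  string `json:"actor_email"`
}

// execution is a hypothetical, trimmed-down view of the stored record.
type execution struct {
	ApprovalToken string
	Approvers     []string // per-account contact_email values, already resolved
}

var errUnauthorized = errors.New("approval actor not authorized")

func verifyAsyncApprovalActor(msg AsyncMessage, exec execution) error {
	actor := strings.ToLower(strings.TrimSpace(msg.ActorEmail))
	if actor == "" {
		// Legacy or replayed payloads without the field are rejected outright.
		return fmt.Errorf("%w: actor_email is required", errUnauthorized)
	}
	// Guard the empty stored token: two empty byte slices compare as equal.
	if exec.ApprovalToken == "" ||
		subtle.ConstantTimeCompare([]byte(msg.Token), []byte(exec.ApprovalToken)) != 1 {
		return fmt.Errorf("%w: token mismatch", errUnauthorized)
	}
	if len(exec.Approvers) == 0 {
		return fmt.Errorf("%w: no per-account contact email configured", errUnauthorized)
	}
	for _, approver := range exec.Approvers {
		if strings.ToLower(strings.TrimSpace(approver)) == actor {
			return nil
		}
	}
	return fmt.Errorf("%w: %s is not an approver", errUnauthorized, actor)
}

func main() {
	exec := execution{ApprovalToken: "tok", Approvers: []string{"Owner@Example.com"}}
	fmt.Println(verifyAsyncApprovalActor(AsyncMessage{Token: "tok", ActorEmail: " owner@example.com "}, exec))
	fmt.Println(verifyAsyncApprovalActor(AsyncMessage{Token: "tok"}, exec))
}
```

Note that `subtle.ConstantTimeCompare` returns immediately when the two inputs differ in length, so it hides the position of a mismatch but not the token length. With fixed-length tokens that is not a concern.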

Finding 2 — `createPlannedPurchases` partial-failure orphaned rows

Commit b49c49564 fix(api): atomic createPlannedPurchases + lookupContactEmail error propagation

Approach: WithTx (option (a)), not idempotent-keys. The codebase already exposes WithTx + SavePurchaseExecutionTx (internal/config/interfaces.go) and uses both elsewhere (purchase suppressions, RI exchange flows). Adding the matching UpdatePurchasePlanTx cost less than introducing idempotency keys for one caller, and TX semantics give the caller a clean retry contract.
createPlannedPurchases now wraps the per-row execution inserts and the plan's next_execution_date bump in a single WithTx. Mid-loop failure rolls everything back — no orphaned rows, no stale plan pointer, safe to retry on transient errors.
UpdatePurchasePlan delegates to UpdatePurchasePlanTx via a single-call WithTx, sharing SQL between the two variants. UpdatedAt is stamped on the public method so existing nil-DB tests that inspect it without exercising the DB still pass.
Regression test TestHandler_createPlannedPurchases_MidLoopFailureRollsBack: drives SavePurchaseExecutionTx to fail on row 3 of 5, asserts UpdatePurchasePlan / UpdatePurchasePlanTx is never reached.
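The retry contract that `WithTx` provides can be illustrated with an in-memory stand-in. Everything here (`memStore`, `stagedTx`, `failOnCall`, the function signatures) is hypothetical and much simpler than the pgx-backed store; the point is only that staged writes become visible if and only if the callback returns nil.

```go
package main

import (
	"errors"
	"fmt"
)

// stagedTx collects writes; nothing reaches the store until commit.
type stagedTx struct {
	executions []string
	nextDate   string
	planSet    bool
}

// memStore is a hypothetical in-memory stand-in for the config store.
type memStore struct {
	executions []string
	nextDate   string
}

// WithTx applies the staged writes only when fn returns nil.
func (s *memStore) WithTx(fn func(txn *stagedTx) error) error {
	txn := &stagedTx{}
	if err := fn(txn); err != nil {
		return err // rollback: staged writes are simply dropped
	}
	s.executions = append(s.executions, txn.executions...)
	if txn.planSet {
		s.nextDate = txn.nextDate
	}
	return nil
}

var errTransient = errors.New("transient db error")

// failOnCall returns a save function that fails on its n-th call (never if n is 0).
func failOnCall(n int) func(txn *stagedTx, id string) error {
	calls := 0
	return func(txn *stagedTx, id string) error {
		calls++
		if calls == n {
			return errTransient
		}
		txn.executions = append(txn.executions, id)
		return nil
	}
}

// createPlannedPurchases saves every execution and bumps the plan date in one tx.
func createPlannedPurchases(s *memStore, ids []string, next string, save func(*stagedTx, string) error) error {
	return s.WithTx(func(txn *stagedTx) error {
		for _, id := range ids {
			if err := save(txn, id); err != nil {
				return fmt.Errorf("failed to save execution %s: %w", id, err)
			}
		}
		txn.nextDate, txn.planSet = next, true
		return nil
	})
}

func main() {
	s := &memStore{nextDate: "2026-05-01"}
	err := createPlannedPurchases(s, []string{"a", "b", "c", "d", "e"}, "2026-06-01", failOnCall(3))
	// After the failed attempt the store holds no rows and the old plan date.
	fmt.Println(err != nil, len(s.executions), s.nextDate)
}
```

Because a failed attempt leaves the store untouched, the caller can retry the whole batch without checking for rows left behind by the previous attempt.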

Finding 3 — `lookupContactEmail` swallowed lookup errors

Same commit b49c49564.

Signature changed from (string) to (string, error). Real DB error → propagated. Account not found (pgx.ErrNoRows → nil, nil from the postgres store) → ("", nil). Account found but empty ContactEmail → ("", nil). Single caller gatherAccountContactEmails wraps and propagates; resolveApprovalRecipients already plumbed err through, so up-chain callers work unchanged.
Regression test TestHandler_resolveApprovalRecipients_LookupErrorPropagates: mocks GetCloudAccount to return a transient error, asserts the error is wrapped (errors.Is preserves the chain) and no fallback approver list is returned.
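A minimal sketch of the three outcomes listed for Finding 3 follows. The `accountGetter` interface, `fakeStore`, and the context-free signature are assumptions for illustration; the real `lookupContactEmail` is a handler method backed by the postgres store.

```go
package main

import (
	"errors"
	"fmt"
)

type CloudAccount struct{ ContactEmail string }

// accountGetter is a hypothetical narrow interface. Per the thread, the
// postgres store returns (nil, nil) when the account does not exist.
type accountGetter interface {
	GetCloudAccount(id string) (*CloudAccount, error)
}

// lookupContactEmail distinguishes "no email configured" from "lookup failed".
func lookupContactEmail(store accountGetter, accountID string) (string, error) {
	acct, err := store.GetCloudAccount(accountID)
	if err != nil {
		// Real failure: surface it so the caller can treat it as retriable.
		return "", fmt.Errorf("lookup contact email for account %s: %w", accountID, err)
	}
	if acct == nil {
		return "", nil // account not found
	}
	return acct.ContactEmail, nil // may be empty when nothing is configured
}

// fakeStore is a test double used only by this sketch.
type fakeStore struct {
	accounts map[string]*CloudAccount
	err      error
}

func (f fakeStore) GetCloudAccount(id string) (*CloudAccount, error) {
	if f.err != nil {
		return nil, f.err
	}
	return f.accounts[id], nil
}

var errTransient = errors.New("connection reset")

func main() {
	_, err := lookupContactEmail(fakeStore{err: errTransient}, "acct-1")
	// The wrapped error keeps the chain, so errors.Is still matches.
	fmt.Println(errors.Is(err, errTransient))
}
```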

Finding 4 — nitpick: success-path `Mode` literals

Commit b0d4d7f98 chore(exchange): use ExchangeMode constants in success-path writes

pkg/exchange/auto.go lines 316 and 429 now use string(ExchangeModeManual) and string(ExchangeModeAuto) instead of bare string literals, matching the saveFailedRecord path that already routed through the constants. ExchangeRecord.Mode storage type stays string for serialization stability.

Verification

go test ./... from repo root — all packages PASS.
go test ./... from pkg/ submodule — all packages PASS.
gh pr view 101 --json mergeStateStatus reported CLEAN before the push; the commits were pushed as-is, with no rebase.
Pre-commit hooks (gocyclo, AWS secret scanner, gosec, trivy, full Go test suite) passed on every commit.

Mock surface updates

Every config.StoreInterface mock in the codebase gained UpdatePurchasePlanTx (with the same isExpected fall-through pattern used for SavePurchaseExecutionTx): internal/api, internal/purchase, internal/analytics, internal/scheduler, internal/server, internal/mocks. The pgxmock-backed UpdatePurchasePlan_NotFound test now scripts the matching Begin / Rollback frame.

@coderabbitai review

coderabbitai · 2026-04-28T16:51:08Z

@cristim Thank you for the thorough remediation write-up. The three-commit breakdown is clear and well-documented:

a36e0d5 closes the CRITICAL async-SQS bypass with verifyAsyncApprovalActor mirroring the HTTP gate, including the deliberate reject-legacy-in-flight-messages decision and its rationale.
b49c495 addresses both the orphaned-row risk (atomic WithTx wrapping execution inserts + plan bump) and the swallowed lookupContactEmail errors, with regression tests for both.
b0d4d7f tidies up the ExchangeMode literal inconsistency on success paths.

Triggering a review of the new commits now.

✅ Actions performed

Review triggered.


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

internal/api/handler_plans_test.go (1)

274-342: This test still bypasses the transaction boundary.

The shared MockConfigStore.WithTx fallback just invokes the callback, so this case only checks the inner loop. If the rollback guarantee is part of the regression, consider a tx-aware spy or a pgxmock-backed assertion so the test fails when the handler stops using WithTx.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@internal/api/handler_plans_test.go` around lines 274 - 342, The test
currently bypasses transaction semantics because MockConfigStore.WithTx simply
invokes the callback; change the test to enforce a transaction boundary by
making the mock WithTx either (a) expect to be called and run the callback in a
tx-aware spy that records/blocks calls to
UpdatePurchasePlan/UpdatePurchasePlanTx and simulates a rollback when
SavePurchaseExecutionTx returns an error, or (b) replace the mock with a
pgxmock-backed store so real BEGIN/ROLLBACK occur; specifically alter
TestHandler_createPlannedPurchases_MidLoopFailureRollsBack to set an expectation
on MockConfigStore.WithTx (or use a tx-capable fake) and ensure
SavePurchaseExecutionTx failures cause the WithTx to abort/rollback and that
UpdatePurchasePlan / UpdatePurchasePlanTx are not invoked by
Handler.createPlannedPurchases.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/scheduler/scheduler_test.go`:
- Around line 91-101: The mock method UpdatePurchasePlanTx discards the tx
parameter when invoking m.Called, causing expectations to not match; change the
call to include the transaction by calling m.Called(ctx, tx, plan) so registered
expectations for UpdatePurchasePlanTx receive the same (ctx, tx, plan) tuple,
leaving the fallback to m.UpdatePurchasePlan(ctx, plan) unchanged.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c2e8ef5d-4886-4b18-aa48-a127c3940de1

📥 Commits

Reviewing files that changed from the base of the PR and between e3d3ca5 and b0d4d7f.

📒 Files selected for processing (17)

internal/analytics/collector_test.go
internal/api/handler_plans.go
internal/api/handler_plans_test.go
internal/api/handler_purchases.go
internal/api/handler_purchases_test.go
internal/api/mocks_test.go
internal/config/interfaces.go
internal/config/store_postgres.go
internal/config/store_postgres_pgxmock_test.go
internal/mocks/stores.go
internal/purchase/approvals.go
internal/purchase/coverage_extra_test.go
internal/purchase/messages.go
internal/purchase/mocks_test.go
internal/scheduler/scheduler_test.go
internal/server/test_helpers_test.go
pkg/exchange/auto.go

✅ Files skipped from review due to trivial changes (2)

internal/server/test_helpers_test.go
internal/purchase/approvals.go

Address CodeRabbit r3 findings:

- internal/scheduler/scheduler_test.go: MockConfigStore.UpdatePurchasePlanTx now forwards tx to m.Called(ctx, tx, plan) so registered expectations match the same (ctx, tx, plan) tuple as the canonical mock in internal/mocks/stores.go. Previously the discarded tx broke any test that registered an On("UpdatePurchasePlanTx", ...) expectation.
- internal/api/handler_plans_test.go: TestHandler_createPlannedPurchases_MidLoopFailureRollsBack now registers an explicit WithTx expectation and asserts createPlannedPurchases ran the save loop inside it. The default WithTx mock invokes fn(nil) regardless of whether the handler actually uses a transaction, so a future refactor dropping the tx boundary would have passed silently. The inner loop error is now captured via withTxFnErr so the "failed to save execution" wrapping is still asserted on, just one level closer to its source.

No production code changed; test-only hardening.

cristim · 2026-04-29T12:51:19Z

CR round 4 follow-up

Addressed both findings from the latest CodeRabbit review (commit 2017748):

Actionable: internal/scheduler/scheduler_test.go L91-101 — fixed.
MockConfigStore.UpdatePurchasePlanTx now forwards tx to m.Called(ctx, tx, plan) so registered expectations match the same (ctx, tx, plan) tuple as the canonical mock in internal/mocks/stores.go. Previously the discarded tx parameter would silently break any test that registered an On("UpdatePurchasePlanTx", ...) expectation.

Nitpick: internal/api/handler_plans_test.go L274-342 — fixed.
TestHandler_createPlannedPurchases_MidLoopFailureRollsBack now registers an explicit On("WithTx", ...) expectation and asserts (via require.True(t, withTxCalled, ...)) that createPlannedPurchases ran the save loop inside it. The default MockConfigStore.WithTx fallback invokes fn(nil) regardless of whether the handler actually uses a transaction, so a future refactor that drops the tx boundary would have passed the previous version of this test silently. The inner-loop error wrapping ("failed to save execution") is now captured via withTxFnErr and asserted directly, one level closer to its source.
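The idea behind the hardened test, a `WithTx` double that records that it was entered and captures the callback's error, can be shown without testify. The sketch below uses a hand-rolled `spyStore`; the field names `withTxCalled` and `withTxFnErr` echo the variables mentioned above, while the store methods and the `checkMidLoopFailure` helper are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// spyStore is a hand-rolled fake; the PR itself uses testify mocks.
type spyStore struct {
	withTxCalled bool
	withTxFnErr  error
	planUpdated  bool
	saveCalls    int
	failOnSave   int // 1-based call number that fails; 0 means never
}

// WithTx records that the boundary was entered and captures fn's error.
func (s *spyStore) WithTx(fn func() error) error {
	s.withTxCalled = true
	s.withTxFnErr = fn()
	return s.withTxFnErr
}

func (s *spyStore) SaveExecutionTx(id string) error {
	s.saveCalls++
	if s.saveCalls == s.failOnSave {
		return errors.New("transient db error on " + id)
	}
	return nil
}

func (s *spyStore) UpdatePlanTx() error {
	s.planUpdated = true
	return nil
}

// createPlannedPurchases is a simplified stand-in for the handler method.
func createPlannedPurchases(s *spyStore, ids []string) error {
	return s.WithTx(func() error {
		for _, id := range ids {
			if err := s.SaveExecutionTx(id); err != nil {
				return fmt.Errorf("failed to save execution: %w", err)
			}
		}
		return s.UpdatePlanTx()
	})
}

// checkMidLoopFailure returns the first violated expectation, or "" if none.
func checkMidLoopFailure(s *spyStore) string {
	err := createPlannedPurchases(s, []string{"a", "b", "c", "d", "e"})
	switch {
	case err == nil:
		return "expected an error"
	case !s.withTxCalled:
		return "save loop did not run inside WithTx"
	case s.withTxFnErr == nil || !strings.Contains(s.withTxFnErr.Error(), "failed to save execution"):
		return "inner error was not wrapped"
	case s.planUpdated:
		return "plan update reached despite failed save"
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", checkMidLoopFailure(&spyStore{failOnSave: 3}))
}
```

If a refactor moved the save loop out of `WithTx`, `withTxCalled` would stay false and the second case would report it, which is the regression the original fallback-only mock could not catch.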

No production code changed; this is test-only hardening.

Local go test -count=1 ./... passes; pre-commit hooks (gofmt, vet, gosec, trivy, full test suite) green.

@coderabbitai review

coderabbitai · 2026-04-29T12:52:34Z

@cristim: Both fixes look solid — thank you for the detailed write-up.

internal/scheduler/scheduler_test.go — forwarding tx into m.Called(ctx, tx, plan) is the correct fix; silently discarding it would have allowed any On("UpdatePurchasePlanTx", mock.Anything, <specific-tx>, plan) expectation to never match, masking test intent.
internal/api/handler_plans_test.go — the explicit On("WithTx", ...) + require.True(t, withTxCalled) guard is a meaningful improvement; the previous fallback-only approach would have let a future removal of the tx boundary pass silently, defeating the purpose of the regression test.

Triggering a review of commit 20177482 now.

✅ Actions performed

Review triggered.


…n email-link cancel (#216)

* fix(purchases): admin / cancel-* sessions bypass contact_email gate on email-link cancel

The deep-link cancel flow (frontend purchases-deeplink.ts) always POSTs /api/purchases/cancel/:id with both an X-Authorization session AND the URL token from the email link. cancelPurchase took the token branch unconditionally → authorizeApprovalAction → 403 when the execution had no per-account contact_email available (e.g. AWS ambient-credentials recs where CloudAccountID is nil, or any execution whose recommendation accounts simply have an empty contact_email column):

    Failed to cancel purchase: no per-account contact email configured for this execution; set the cloud account's contact_email before approving

The same admin (or any user with cancel-any:purchases / cancel-own matching the creator) could already cancel the same execution from the History page Cancel button — that path goes through cancelPurchaseViaSession → authorizeSessionCancel (RBAC matrix) and never touches contact_email. The deep-link UX was inconsistent.

Fix: pre-check the session in cancelPurchase before falling into the token branch. When the caller carries a valid session AND authorizeSessionCancel approves them, take the session-authed path regardless of whether a token is in the URL. Tokenless / no-session callers (forwarded email, shared inbox, scripted flow without auth) still hit the per-account contact_email gate from PR #101.

Approve flow stays strict — the dashboard has no admin approve override either, so widening it via the email link would change the security policy. Out of scope for this fix.

Helpers:
- New tryGetSession returns *Session or nil silently. tryResolveActorEmail collapses to a one-line wrapper.

Tests:
- New TestHandler_cancelPurchase_DeepLink_AdminBypassesContactEmailGate: admin session + token + ambient-credentials execution → 200, status flips, CancelledBy stamped, GetGlobalConfig (the token branch's signature call) is asserted NOT called.
- New TestHandler_cancelPurchase_DeepLink_CancelOwnBypassesContactEmailGate: non-admin with cancel-own + matching creator → 200.
- New TestHandler_cancelPurchase_DeepLink_NonPrivilegedSessionStillHitsContactGate: pins the security model — a logged-in user without admin / cancel-* MUST still go through authorizeApprovalAction.
- Existing TestHandler_cancelPurchase, TestHandler_cancelPurchase_PurchaseError, and TestHandler_HandleRequest_CancelPurchase get HasPermissionAPI mock stubs returning false so the new pre-check correctly falls through to the token branch (preserving their original assertions).

go test ./... — every package green.

* fix(purchases): propagate transient auth errors instead of silent fallback (CR #216)

CodeRabbit on PR #216 flagged that the new session-authed pre-check in cancelPurchase silently swallowed every error from authorizeSessionCancel and treated them all as "fall through to the contact_email gate". That masks transient auth-service failures (HasPermissionAPI returning a wrapped error, h.auth nil, network blip) behind a misleading "set the contact_email" 403 — exactly the user-facing message #216 was meant to stop conflating with real failures.

Distinguish the two cases via a new local helper isPermissionDenied, which uses the existing IsClientError + .code accessor to detect a 403 ClientError specifically. authorizeSessionCancel returns 403 ClientError on legitimate denials (admin role mismatch, missing cancel-* verb, cancel-own creator mismatch) and a wrapped non-ClientError on transient backend failures.

    case err == nil:           → session-authed cancel
    case isPermissionDenied(): → fall through to contact_email gate
    default:                   → return err (propagate)

Regression test TestHandler_cancelPurchase_DeepLink_TransientAuthErrorPropagates simulates a HasPermissionAPI failure ("auth backend timeout") and asserts:
* the error surfaces with "permission check failed" wrapped,
* the contact_email message is NOT in the error,
* GetGlobalConfig (the token branch's signature call) is NOT reached.

Existing 403-fall-through path covered by TestHandler_cancelPurchase_DeepLink_NonPrivilegedSessionStillHitsContactGate.

* fix(purchases): isPermissionDenied uses strict type assertion (CR pass 2 #216)

CodeRabbit pass 2 on PR #216 flagged that the helper introduced in pass 1 relied on IsClientError, which goes through errors.As — that unwraps the error chain. A wrapped 403 ClientError (e.g. fmt.Errorf("permission check failed: %w", NewClientError(403, ...))) would still be classified as permission denied, even though the wrapper changes the failure's outer category and signals a different intent.

Switch to a direct *clientError type assertion (no unwrapping). Only an exact, un-wrapped 403 ClientError now triggers the fall-through to the contact_email gate; anything wrapping a 403 propagates as the wrapper's own failure mode. This preserves the propagate-vs-fall-through split's original intent against future code that might decide to wrap auth-layer errors for context.

Regression: TestIsPermissionDenied table-driven tests pin all six cases including the new "wrapped 403 is NOT denial" invariant.
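The difference between the pass-1 and pass-2 versions of `isPermissionDenied` comes down to `errors.As` versus a direct type assertion. The sketch below uses a hypothetical `clientError` type, a `wrap` helper, and a `route` function that returns labels instead of performing the cancel; none of these are copied from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// clientError is a hypothetical stand-in for the handler package's error type.
type clientError struct {
	code int
	msg  string
}

func (e *clientError) Error() string { return e.msg }

func NewClientError(code int, msg string) error { return &clientError{code: code, msg: msg} }

// wrap mimics a caller adding context around an auth-layer error.
func wrap(err error) error { return fmt.Errorf("permission check failed: %w", err) }

// isPermissionDenied (pass 2): a direct type assertion, no unwrapping.
func isPermissionDenied(err error) bool {
	ce, ok := err.(*clientError)
	return ok && ce != nil && ce.code == 403
}

// isPermissionDeniedLoose (pass 1 behaviour): errors.As walks the chain.
func isPermissionDeniedLoose(err error) bool {
	var ce *clientError
	return errors.As(err, &ce) && ce.code == 403
}

// route mirrors the three-way switch from the commit message.
func route(err error) string {
	switch {
	case err == nil:
		return "session-authed cancel"
	case isPermissionDenied(err):
		return "contact_email gate"
	default:
		return "propagate"
	}
}

func main() {
	denied := NewClientError(403, "forbidden")
	fmt.Println(route(nil))
	fmt.Println(route(denied))
	fmt.Println(route(wrap(denied)))
	fmt.Println(isPermissionDeniedLoose(wrap(denied)))
}
```

With the strict version, a wrapped 403 lands in the `default` branch and propagates, whereas the `errors.As` version would have sent it to the contact_email gate.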

coderabbitai Bot reviewed Apr 27, 2026


cristim mentioned this pull request Apr 27, 2026

fix(security): supply-chain hardening — Docker SHA pinning + required pre-commit + multi-module govulncheck #105

Merged

5 tasks

coderabbitai Bot reviewed Apr 27, 2026


cristim added 2 commits April 28, 2026 01:07

coderabbitai Bot reviewed Apr 27, 2026


Comment thread internal/api/handler_plans.go Outdated

Comment thread internal/api/handler_purchases.go

cristim added labels Apr 28, 2026: triaged (Item has been triaged), priority/p1 (Next up; this sprint), severity/high (Significant harm), urgency/now (Drop other things), impact/all-users (Affects every user), effort/m (Days), type/security (Security finding)

cristim added 3 commits April 28, 2026 18:35

coderabbitai Bot reviewed Apr 28, 2026


Comment thread internal/scheduler/scheduler_test.go

cristim merged commit eb7bf49 into feat/multicloud-web-frontend Apr 29, 2026
3 checks passed

cristim deleted the fix/approval-token-entropy branch April 30, 2026 14:00

cristim mentioned this pull request Apr 30, 2026

fix(purchases): admin / cancel-* sessions bypass contact_email gate on email-link cancel #216

Merged

3 tasks

coderabbitai Bot mentioned this pull request May 5, 2026

feat(api,history): in-dashboard purchase approval + approve-{any,own} RBAC (#286) #299

Merged

5 tasks

Conversation

cristim commented Apr 28, 2026

Finding 1 — CRITICAL: async SQS path bypassed the approver gate

Finding 2 — createPlannedPurchases partial-failure orphaned rows

Finding 3 — lookupContactEmail swallowed lookup errors

Finding 4 — nitpick: success-path Mode literals

Verification

Mock surface updates

cristim commented Apr 29, 2026

CR round 4 follow-up
