feat(credstore): bundle ops + atomic SetBundle (INT-431) by rianjs · Pull Request #6 · open-cli-collective/cli-common

rianjs · 2026-05-16T13:37:15Z

Closes #5 · Jira INT-431 (child of epic INT-310). Builds on #4.

Third cli-common unit, per §2.1 (bundle API), §1.5.1 (SetBundle atomicity), §1.5.2 (allowlist scope), §1.7 (config clear removes whole bundle).

Surface

ListBundle(profile) ([]string, error) — sorted keys; valid empty profile → (nil,nil); not allowlist-gated.
DeleteBundle(profile) ([]string, error) — removes every key; no fail-fast (attempts all, aggregates failed keys in the error); idempotent; not allowlist-gated (§1.7).
SetBundle(profile, kv, ...SetOpt) (Result, error) — §1.5.1: validate-all → no-overwrite-conflict-gate / overwrite-snapshot → forward write with the real overwrite flag → rollback on mid-bundle failure → best-effort snapshot clear. Result{Written,Restored,Deleted,Untouched}.
backend interface extended with listKeys() (the extension INT-430 review anticipated); memoryBackend gains setHook/deleteHook §2.1 test seams.

Atomicity highlights

Forward writes pass the real overwrite flag → backend's INT-430 atomic no-overwrite guard still defends a racer.
ErrExists at write (racer owns the key) → that key is not touched in rollback; only our own writes undone; key reported Untouched.
Any other (ambiguous) failure → every attempted key incl. the failing one rolled back (restore prior / delete, tolerating ErrNotFound).
Rollback-step failures surfaced explicitly ("keyring may be inconsistent" + affected keys).
Snapshot is call-scoped, best-effort cleared (no false zeroization claim — consistent with INT-430).

Tests (97 funcs, `-race`)

List/Delete (empty/invalid/closed/idempotent/cross-profile/ungated), Delete attempt-all-on-failure, SetBundle happy/profile-errors/no-overwrite-conflict/overwrite, induced rollback (new-only delete, mixed restore+delete, ErrExists-racer, rollback-failure surfacing), allowlist validate-all-first, snapshot hygiene (prior value restored).

Out of scope (later INT-310 units)

Real OS backends + selection (PR4), Linux fail-closed (PR5), redaction (PR6), migration helpers (PR7), config show/clear surfaces, v0.1.0.

Architect review

Codex (reused session) converged after 2 revision rounds (majors: rollback must cover the failing key for ambiguous errors; profile-empty must error not (nil,nil); DeleteBundle attempt-all; forward writes must use the real overwrite flag) to blockers=0 majors=0; final minor (Result/Untouched wording) applied.

Add ListBundle/DeleteBundle and the atomic-ish SetBundle per the Secret-Handling Standard 2.1, with the 1.5.1 contract and 1.7 config-clear semantics. - backend interface gains listKeys(); memoryBackend implements it plus unobtrusive setHook/deleteHook test seams (2.1) for rollback paths. - ListBundle/DeleteBundle: sorted keys under a profile; not allowlist- gated (operate on stored reality). DeleteBundle attempts every key (no fail-fast), aggregates failures. Valid empty profile -> (nil,nil). - SetBundle: validate-all (syntax + allowlist) first; no-overwrite fails on any pre-existing target (nothing written); overwrite snapshots prior values pre-write. Forward writes use the real overwrite flag so the backend's atomic no-overwrite guard holds. On mid-bundle failure, roll back every attempted key (ErrExists racer excepted), surface rollback failures as inconsistent-state errors, best-effort clear the call-scoped snapshot. Result reports Written/Restored/Deleted/Untouched. Closes #5 [INT-431]

rianjs · 2026-05-16T13:39:05Z

Blocker
None.

Major
None.

Minor

SetBundle’s rollback branch treats any errors.Is(writeErr, ErrExists) as the no-overwrite racer case. The converged contract was narrower: only !overwrite && ErrExists means “K is not ours.” Change isExists := errors.Is(writeErr, ErrExists) to something like isRacerExists := !so.overwrite && errors.Is(writeErr, ErrExists). If an overwrite backend ever incorrectly returns ErrExists, the safer behavior is the ambiguous-error rollback path.
Snapshot cleanup is skipped if an overwrite snapshot get fails after earlier prior values were captured. In-memory makes this unlikely, but the stated contract is “snapshot cleared before return.” Use a defer clearSnapshot(snap) once snap exists, or clear before returning from that get error path.

Nit
None.

The core INT-431 decisions are implemented: validate-all-first, pre-write conflict gate writes nothing, forward writes use the real overwrite flag, overwrite snapshots prior values, ambiguous rollback covers the failing key, ErrExists racer handling leaves K alone, DeleteBundle attempts all deletes, profile-empty returns *RefError, and bundle reads/deletes are not allowlist-gated. The remaining items are hardening/contract polish, not architecture blockers.

rianjs · 2026-05-16T13:40:34Z

TDD Coverage Assessment

Overall: coverage is strong. All major paths from the spec are exercised. A few genuine gaps below; the nits are take-or-leave.

Gap — Missing test cases

G1 · DeleteBundle invalid-profile error branch not tested
TestDeleteBundle only checks the empty-string case (ErrRefEmpty). The invalid-char path (DeleteBundle("bad.prof")) is never asserted, unlike ListBundle which checks both. bundle.go:88 delegates to validateProfile, which has two exit paths; only one is covered for DeleteBundle.

G2 · TestSetBundleRollbackNewKeysDeleted — Deleted slice is wrong (and reveals a spec question)
bundle_test.go:190 asserts res.Deleted == ["a", "b", "c"]. But c is the key whose set failed (non-ErrExists error), so it is included in rbKeys and rolled back via delete. The backend never wrote c, so delete will return ErrNotFound, which bundle.go:226 tolerates (no error → appended to res.Deleted). The test passes today because ErrNotFound is swallowed. This is arguably intentional, but it means Deleted conflates "we wrote it and deleted it" with "we never wrote it but tried to delete it (ErrNotFound)". Worth an explicit comment or a separate sentinel — the test assertion is not wrong per the current code, but it is fragile: any future change that treats ErrNotFound differently during rollback will silently break the semantics without the test catching it.

G3 · TestSetBundleRollbackFailureSurfaced — Deleted assertion is questionable
bundle_test.go:271 asserts res.Deleted == ["b"]. Key b was never successfully written (its set failed), yet it appears in Deleted. Same ErrNotFound-tolerance issue as G2. The test logic is consistent with the implementation, but a reader expecting Deleted to mean "we wrote it and then deleted it during rollback" will be confused. A comment anchoring the expected behavior would harden this.

G4 · No test for SetBundle where exists() returns an error during pre-write state collection
bundle.go:157-161 checks s.be.exists() for every key and returns early on error. There is no seam or test driving this path — if exists ever fails (e.g., future backend), the early-return behavior is untested. Minor given memory backend always succeeds, but the code path is real.

G5 · No test for SetBundle where get() fails during snapshot construction
bundle.go:178-183 calls s.be.get() for each pre-existing key when building the overwrite snapshot. A backend error here returns early (Result{}, err). No test covers this. Same caveat as G4 — low risk on memory, but untested branch.

Minor — Assertion quality / sentinel specificity

M1 · TestDeleteBundleAttemptsAllOnFailure — partial return value not fully verified
bundle_test.go:108 checks deleted == ["b"] but does not verify that "a" and "c" are still present in the store after the failed deletes. The attempt-all contract says failed keys stay in the store; asserting via s.Exists or s.ListBundle would make the test a complete contract test rather than just an error-message check.

M2 · TestSetBundleNoOverwriteConflict — only checks one-conflict scenario
A single conflicting key is tested. The no-overwrite path collects all conflicts (bundle.go:166-173) and names them all in the error. Worth a sub-case with two pre-existing keys to verify the error names both and neither is mutated.

M3 · TestSetBundleRollbackRestoresPriorValues — snapshot hygiene not verified
The test checks res.Restored and res.Deleted, and reads a back to confirm restoration. It does not verify that b (a truly new key that was written then deleted) is absent. bundle_test.go:217 checks b — this is present, so actually covered. No issue here. ✓

Nit

N1 · TestSetBundleAllowlistEnforced — only one allowed key passes, one fails
A quick check that two disallowed keys both appear in the error message would round out the validate-all-first proof, matching the plural wording in bundle.go:148-151 (per-key loop, returns on first bad key — so this is actually fail-fast, not validate-all-first for the allowlist). The comment at bundle_test.go:284 says "validate-all-first" but the implementation stops at the first disallowed key. Nit: the comment is misleading; the allowlist check is fail-fast, not batch.

N2 · openMem helper defined in store_test.go, mustSet in bundle_test.go
Fine for a same-package test suite, but if the file count grows, a testhelper_test.go would make helpers easier to find.

Summary table

Area	Status
ListBundle: empty/invalid/closed/cross-profile/ungated	✅ all covered
DeleteBundle: idempotent/closed/attempt-all-on-failure/ungated	✅ covered; invalid-char branch missing (G1)
SetBundle: happy/profile+empty errors/no-overwrite conflict/overwrite	✅ covered
SetBundle rollback: new-keys-deleted	✅ covered (assertion nuance: G2)
SetBundle rollback: prior-values-restored	✅ covered
SetBundle rollback: ErrExists-racer-not-touched	✅ covered
SetBundle rollback: rollback-failure-surfaced	✅ covered (assertion nuance: G3)
SetBundle allowlist validate-all-first	⚠️ comment says validate-all-first but impl is fail-fast (N1)
SetBundle snapshot hygiene (exists/get errors)	❌ untested (G4, G5)

Address PR review: - Rollback now treats only !overwrite && ErrExists as the racer case (isRacerExists); a stray ErrExists on the overwrite path takes the safer ambiguous rollback path (Codex). - defer clearSnapshot(snap) once the snapshot exists, so the §1.5.1 'cleared before return' contract holds on the get-error path too; removed the explicit success/rollback clear calls (Codex). - Document that Result.Deleted is 'keys the rollback guarantees absent' (includes a key whose own write failed), anchoring the semantics. - Tests: DeleteBundle invalid-char profile; assert failed-delete keys remain present; multi-conflict no-overwrite names all conflicts; fix the misleading 'validate-all-first' comment. [INT-431]

rianjs · 2026-05-16T13:42:57Z

Addressed review in 6184a1a:

Codex

Minor — racer condition too broad: now isRacerExists := !so.overwrite && errors.Is(writeErr, ErrExists); a stray ErrExists on the overwrite path takes the safer ambiguous rollback (failedKey rolled back).
Minor — snapshot clear skipped on get-error path: defer clearSnapshot(snap) once the snapshot exists; explicit success/rollback clears removed. Contract now holds on every return.

TDD assessment

G1: added DeleteBundle invalid-char profile case.
G2/G3: documented Result.Deleted semantics ("keys the rollback guarantees absent", incl. a key whose own write failed) on the struct and at the delete branch.
Minor: TestDeleteBundleAttemptsAllOnFailure now asserts the failed-delete keys remain present.
Minor: no-overwrite conflict test now pre-sets two keys and asserts the error names both.
Nit: fixed the misleading "validate-all-first" comment (validation precedes all writes; fail-fast on first disallowed key).
G4/G5 (declined): the exists()/get() early-return error branches in pre-write collection are unreachable with the memory backend (those only error when closed, which Store guards before reaching them); they are defensive forward-compat for real backends. Driving them would require new getHook/existsHook seams — speculative test surface for an unreachable path on the only current backend. Will be covered naturally when real backends land (PR4).

Local: build, go test -race, golangci-lint, tidy all clean.

rianjs · 2026-05-16T13:44:42Z

Verified latest PR #6 head 6184a1a and CI run 25963456009: completed successfully.

Blocker
None.

Major
None.

Minor
None.

Nit
None.

The implementation now matches the converged INT-431 design. SetBundle validates before writes, preserves the backend no-overwrite guard, snapshots pre-write values, distinguishes !overwrite && ErrExists from ambiguous failures, rolls back the right key set, and clears snapshots on all return paths after snapshot allocation. DeleteBundle attempts all deletes and reports failures without allowlist gating. Profile-empty behavior is correctly a *RefError, and the Result semantics are now explicit enough for callers and tests.

monit-reviewer

Automated PR Review

Reviewed commit: 6184a1a

Summary

Reviewer	Findings
harness-engineering:harness-architecture-reviewer	3
harness-engineering:harness-enforcement-reviewer	2
harness-engineering:harness-knowledge-reviewer	3
harness-engineering:harness-self-documenting-code-reviewer	1

harness-engineering:harness-architecture-reviewer (3 findings)

💡 Suggestion - credstore/memory.go:44

setHook fires on both forward writes and rollback restore writes (s.be.set(itemKey[k], prior, true) in the rollback path). A future test that installs a setHook to fail key X's forward write, where X has a prior snapshot value, will inadvertently block the rollback restore for X — silently converting a forward-failure test into a rollback-failure test. The invariant that hooked keys must be new (no prior value) is implicit and unguarded. Consider a separate restoreHook or a boolean isRollback parameter to scope failures precisely.

💡 Suggestion - credstore/bundle.go:238

Result.Deleted includes keys whose forward write failed and were never stored, not only keys physically removed during rollback. The field name implies physical deletion occurred; callers using it for audit or change-tracking will see a key they never wrote reported as deleted. Documentation was added, but the semantic mismatch is a lasting callsite hazard. Consider renaming to 'Absent' or 'RollbackAbsent'.

💡 Suggestion - credstore/bundle_test.go:229

No test covers SetBundle with WithOverwrite() where every key in kv already exists and a mid-bundle failure occurs — the pure-restore rollback path (no delete calls in rollback). The existing rollback tests cover the mixed case (one existing + two new) but not the all-restore case. A bug in snapshot/restore logic for exclusively pre-existing keys would go undetected.

harness-engineering:harness-enforcement-reviewer (2 findings)

💡 Suggestion - credstore/bundle_test.go:282

TestSetBundleRollbackFailureSurfaced only asserts res.Deleted but does not assert res.Restored == nil and res.Untouched == nil. A regression that incorrectly populated those fields would go undetected. The other rollback tests do assert all Result fields.

💡 Suggestion - credstore/bundle.go:188

On the success path, Result.Written is set to append([]string(nil), keys...) — a copy of the sorted input key set — rather than the accumulated 'written' slice. While equivalent today, using 'written' directly would be more correct by construction: it reflects actual successful backend calls rather than re-deriving them from the input.

harness-engineering:harness-knowledge-reviewer (3 findings)

⚠️ Should Fix - credstore/bundle_test.go:295

TestSetBundleRollbackFailureSurfaced does not verify that key 'a' remains present in the store after its rollback (delete) fails. The test claims to surface the 'keyring may be inconsistent' scenario but never asserts s.Exists("p","a")==true, which is the core invariant of the inconsistency it advertises. Without this, a bug that silently cleaned up 'a' would pass the test.

💡 Suggestion - credstore/bundle_test.go:195

TestSetBundleNoOverwriteConflict seeds 'exists' and 'also' as conflicting keys but only verifies that 'exists'=='old' after the conflict gate fires. The analogous assertion for 'also' is omitted, leaving one of the two reported conflicting keys unverified.

💡 Suggestion - credstore/memory.go:67

deleteHook is consulted before the ErrNotFound guard. If a hook returns a non-nil error for a key that does not exist, the caller receives the hook error instead of ErrNotFound. In SetBundle rollback, ErrNotFound is tolerated but arbitrary hook errors trigger rbFailed — so a deleteHook installed on a missing key would incorrectly classify a clean 'already absent' case as a rollback failure.

harness-engineering:harness-self-documenting-code-reviewer (1 findings)

💡 Suggestion - credstore/bundle_test.go:289

TestSetBundleAllowlistEnforced only exercises the rejection path (one disallowed key → ErrKeyNotAllowed). There is no test for the acceptance path where all bundle keys are in the allowlist and the write succeeds. A bug that gates all writes when an allowlist is configured (e.g., an off-by-one in checkAllowed) would go undetected.

5 info-level observations excluded. Run with --verbose to include.

4 PR discussion threads considered.

Completed in 5m 36s | $1.12 | sonnet | daemon 0.2.116 | Glorfindel

Field	Value
Model	sonnet
Reviewers	hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine	claude · sonnet
Reviewed by	pr-review-daemon · monit-pr-reviewer
Duration	5m 36s wall · 15m 27s compute (Reviewers: 3m 14s · Synthesis: 2m 20s)
Cost	$1.12
Tokens	118.0k in / 57.3k out
Turns	6

Per-workstream usage

Workstream	Model	In	Out	Cache read	Cache create	Cost
hybrid-synthesis	sonnet	28.8k	9.0k	13.6k	15.1k (1h)	$0.20
harness-engineering:harness-architecture-reviewer	sonnet	17.8k	9.9k	2.1k	15.6k (1h)	$0.22
harness-engineering:harness-enforcement-reviewer	sonnet	17.8k	8.0k	2.1k	15.6k (1h)	$0.19
harness-engineering:harness-knowledge-reviewer	sonnet	17.8k	8.3k	2.1k	15.6k (1h)	$0.19
harness-engineering:harness-self-documenting-code-reviewer	sonnet	17.8k	10.6k	2.1k	15.7k (1h)	$0.23
security:security-code-auditor	haiku	18.1k	11.5k	0	18.1k (1h)	$0.09

_{Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.}

monit-reviewer · 2026-05-16T13:50:26Z

+		t.Fatalf("nothing written when a key is disallowed: %+v", res)
+	}
+	if ok, _ := s.Exists("default", "a"); ok {
+		t.Fatal("no key may be written when a sibling is disallowed (validation precedes all writes)")


🟡 Medium (harness-engineering:harness-knowledge-reviewer): TestSetBundleRollbackFailureSurfaced does not verify that key 'a' remains present in the store after its rollback (delete) fails. The test claims to surface the 'keyring may be inconsistent' scenario but never asserts s.Exists("p","a")==true, which is the core invariant of the inconsistency it advertises. Without this, a bug that silently cleaned up 'a' would pass the test.

Reply to this thread when addressed.

Fixed in bdc1cb7. TestSetBundleRollbackFailureSurfaced now asserts s.Exists("default","a")==true — 'a' was written then its rollback delete failed, so it is leaked; that residue is the inconsistency the test advertises. Also asserts Restored/Untouched nil.

monit-reviewer · 2026-05-16T13:50:27Z

@@ -38,6 +44,11 @@ func (b *memoryBackend) set(itemKey, value string, overwrite bool) error {
 	if b.m == nil {


🔵 Low (harness-engineering:harness-architecture-reviewer): setHook fires on both forward writes and rollback restore writes (s.be.set(itemKey[k], prior, true) in the rollback path). A future test that installs a setHook to fail key X's forward write, where X has a prior snapshot value, will inadvertently block the rollback restore for X — silently converting a forward-failure test into a rollback-failure test. The invariant that hooked keys must be new (no prior value) is implicit and unguarded. Consider a separate restoreHook or a boolean isRollback parameter to scope failures precisely.

Reply to this thread when addressed.

Addressed in bdc1cb7 via documentation + a demonstrating test. setHook fires on every set incl. rollback restores; documented the invariant on the field and added TestSetBundleAllRestoreRollback which uses the prescribed distinguish-by-call pattern (fail forward write of c, allow its restore). Chose docs+pattern over a second hook/param to avoid widening the test-only seam surface.

monit-reviewer · 2026-05-16T13:50:27Z

+			if err := s.be.delete(itemKey[k]); err != nil && !errors.Is(err, ErrNotFound) {
+				rbFailed = append(rbFailed, k)
+				continue
+			}


🔵 Low (harness-engineering:harness-architecture-reviewer): Result.Deleted includes keys whose forward write failed and were never stored, not only keys physically removed during rollback. The field name implies physical deletion occurred; callers using it for audit or change-tracking will see a key they never wrote reported as deleted. Documentation was added, but the semantic mismatch is a lasting callsite hazard. Consider renaming to 'Absent' or 'RollbackAbsent'.

Reply to this thread when addressed.

Fixed in bdc1cb7. Renamed Result.Deleted -> Result.Absent ("rollback guarantees these keys are not present") with an updated doc comment. Done now because the package is pre-v0.1.0 with no consumers — cheapest point to remove the callsite hazard.

monit-reviewer · 2026-05-16T13:50:27Z

+	}
+	// b was never written (set failed) so it rolls back via delete →
+	// ErrNotFound tolerated → reported Deleted; a's rollback failed.
+	eqStrings(t, "Deleted", res.Deleted, []string{"b"})


🔵 Low (harness-engineering:harness-enforcement-reviewer): TestSetBundleRollbackFailureSurfaced only asserts res.Deleted but does not assert res.Restored == nil and res.Untouched == nil. A regression that incorrectly populated those fields would go undetected. The other rollback tests do assert all Result fields.

Reply to this thread when addressed.

Fixed in bdc1cb7. The test now also asserts res.Restored == nil and res.Untouched == nil.

monit-reviewer · 2026-05-16T13:50:27Z

+		t.Fatalf("a not restored to prior value: %q (snapshot must capture pre-write state)", v)
+	}
+	if ok, _ := s.Exists("default", "b"); ok {
+		t.Fatal("new key b not rolled back")


🔵 Low (harness-engineering:harness-architecture-reviewer): No test covers SetBundle with WithOverwrite() where every key in kv already exists and a mid-bundle failure occurs — the pure-restore rollback path (no delete calls in rollback). The existing rollback tests cover the mixed case (one existing + two new) but not the all-restore case. A bug in snapshot/restore logic for exclusively pre-existing keys would go undetected.

Reply to this thread when addressed.

Fixed in bdc1cb7. Added TestSetBundleAllRestoreRollback: all targets pre-exist, mid-bundle failure → pure-restore rollback (Restored=[a,b,c], Absent/Written nil, prior values verified).

monit-reviewer · 2026-05-16T13:50:27Z

+	s := openMem(t, "a", "b")
+	res, err := s.SetBundle("default", map[string]string{"a": "1", "x": "2"})
+	if !errors.Is(err, ErrKeyNotAllowed) {
+		t.Fatalf("err = %v, want ErrKeyNotAllowed", err)


🔵 Low (harness-engineering:harness-self-documenting-code-reviewer): TestSetBundleAllowlistEnforced only exercises the rejection path (one disallowed key → ErrKeyNotAllowed). There is no test for the acceptance path where all bundle keys are in the allowlist and the write succeeds. A bug that gates all writes when an allowlist is configured (e.g., an off-by-one in checkAllowed) would go undetected.

Reply to this thread when addressed.

Fixed in bdc1cb7. TestSetBundleAllowlistEnforced now also covers the acceptance path (all keys allowed → success, Written=[a,b]), guarding against a bug that gates all writes when an allowlist is set.

monit-reviewer · 2026-05-16T13:50:27Z

+		return nil
+	}
+	res, err := s.SetBundle("default", map[string]string{"a": "1", "b": "2", "c": "3"})
+	if err == nil || !strings.Contains(err.Error(), `"c"`) {


🔵 Low (harness-engineering:harness-knowledge-reviewer): TestSetBundleNoOverwriteConflict seeds 'exists' and 'also' as conflicting keys but only verifies that 'exists'=='old' after the conflict gate fires. The analogous assertion for 'also' is omitted, leaving one of the two reported conflicting keys unverified.

Reply to this thread when addressed.

Fixed in bdc1cb7. TestSetBundleNoOverwriteConflict now asserts the second conflicting key ('also') is also unmutated (=="old2").

monit-reviewer · 2026-05-16T13:50:27Z

+				}
+				snap[k] = v
+			}
+		}


🔵 Low (harness-engineering:harness-enforcement-reviewer): On the success path, Result.Written is set to append([]string(nil), keys...) — a copy of the sorted input key set — rather than the accumulated 'written' slice. While equivalent today, using 'written' directly would be more correct by construction: it reflects actual successful backend calls rather than re-deriving them from the input.

Reply to this thread when addressed.

Fixed in bdc1cb7. Success path returns the accumulated 'written' slice (reflects actual successful backend calls) instead of re-deriving from the input keys.

monit-reviewer · 2026-05-16T13:50:27Z

 	if b.m == nil {
 		return ErrStoreClosed
 	}
+	if b.deleteHook != nil {


🔵 Low (harness-engineering:harness-knowledge-reviewer): deleteHook is consulted before the ErrNotFound guard. If a hook returns a non-nil error for a key that does not exist, the caller receives the hook error instead of ErrNotFound. In SetBundle rollback, ErrNotFound is tolerated but arbitrary hook errors trigger rbFailed — so a deleteHook installed on a missing key would incorrectly classify a clean 'already absent' case as a rollback failure.

Reply to this thread when addressed.

Fixed in bdc1cb7. memoryBackend.delete now consults deleteHook AFTER the not-found check, so a hook on an already-absent key returns ErrNotFound (tolerated by rollback) instead of being misclassified as a rollback failure.

Address daemon review: - Result.Deleted -> Absent: the field includes keys whose own write failed (never stored), so 'Deleted' implied physical removal that did not occur. Renamed pre-v0.1.0 while there are no consumers. - memoryBackend.delete consults deleteHook AFTER the not-found check, so a hook on an already-absent key can't misclassify a clean rollback as a failure. Documented that setHook fires on rollback restores too. - SetBundle success path returns the accumulated 'written' slice. - Tests: assert leaked key 'a' is still present in the rollback-failure-surfaced case (the actual inconsistency) + Restored/ Untouched nil; add all-restore rollback (pure restore, distinguish-by- call hook); allowlist acceptance path; assert both conflicting keys unmutated; DeleteBundle invalid-char profile. [INT-431]

monit-reviewer

Automated PR Review

Reviewed commit: bdc1cb7 | Previous: 6184a1a (incremental)

Summary

Reviewer	Findings
harness-engineering:harness-architecture-reviewer	3
harness-engineering:harness-knowledge-reviewer	1
harness-engineering:harness-self-documenting-code-reviewer	1

harness-engineering:harness-architecture-reviewer (3 findings)

⚠️ Should Fix - credstore/bundle.go:222

Keys whose rollback step fails (rbFailed) are surfaced only in the error string ("keyring may be inconsistent") and not in any Result field. Callers needing to programmatically identify which keys are in an undefined state — e.g., to prompt the user to delete specific keys manually or trigger automated remediation — must parse free-form error text, which is fragile. The existing Result has Written/Restored/Absent/Untouched to classify clean outcomes; consider adding a RollbackFailed []string (or Inconsistent []string) field for the inconsistent-state keys. This is pre-v0.1.0 with no consumers, so the cost of the API change is low.

💡 Suggestion - credstore/bundle_test.go:318

TestSetBundleRollbackFailureSurfaced has two remaining assertion gaps on the rbFailed path: (1) res.Written is not asserted nil — a regression that populates Written on a rollback-failure return would go undetected; (2) res.Absent is not asserted to exclude key 'a' — since 'a' failed its rollback it should not appear in Absent, but no assertion verifies len(res.Absent)==1 or that 'a' is absent from res.Absent. Other rollback tests assert all Result fields; this test should do the same.

💡 Suggestion - credstore/bundle_test.go:254

TestSetBundleAllRestoreRollback uses a call-count closure (calls) to let the first invocation of setHook on 'default/c' fail the forward write while allowing the second invocation (rollback restore) to succeed. This is fragile: it assumes the rollback restore for 'c' is exactly the second call and that no reordering or retry occurs. The test has no assertion on rollback ordering, so any future change to rollback iteration order would silently misbehave rather than fail fast. Consider an 'armed' bool flag reset by the forward-write failure, or at minimum assert calls == 2 at the end of the test to pin the invariant.

harness-engineering:harness-knowledge-reviewer (1 findings)

💡 Suggestion - credstore/bundle.go:183

Result.Untouched merges two semantically distinct cases: (1) keys never attempted because a failure occurred before reaching them, and (2) the racer-owned ErrExists key deliberately left intact. A caller cannot distinguish 'never attempted' from 'owned by a concurrent writer' without also inspecting the error type, fragmenting the decision across both Result and error. Since this is pre-v0.1.0 with no consumers, consider splitting these into separate fields (e.g., NotReached []string and RacerOwned []string), or at minimum documenting the conflation explicitly in the Result struct godoc.

harness-engineering:harness-self-documenting-code-reviewer (1 findings)

💡 Suggestion - credstore/bundle_test.go:1

No test exercises SetBundle with a kv map containing an invalid key (empty string or a key with a disallowed character). The validate-all-first path in SetBundle would catch this, but without a test the 'no writes occur when any key is invalid' guarantee — which mirrors the allowlist path already covered by TestSetBundleAllowlistEnforced — could silently regress in a future refactor.

3 info-level observations excluded. Run with --verbose to include.

13 PR discussion threads considered.

Completed in 5m 03s | $0.99 | sonnet | daemon 0.2.116 | Glorfindel

Field	Value
Model	sonnet
Mode	Re-review · Cycle 2 · Session resumed
Reviewers	hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine	claude · sonnet
Reviewed by	pr-review-daemon · monit-pr-reviewer
Duration	5m 03s wall · 13m 16s compute (Reviewers: 2m 40s · Synthesis: 2m 01s)
Cost	$0.99
Tokens	208.5k in / 47.3k out
Turns	7

Per-workstream usage

Workstream	Model	In	Out	Cache read	Cache create	Cost
hybrid-synthesis	sonnet	49.6k	5.7k	13.6k	36.0k (1h)	$0.22
harness-engineering:harness-architecture-reviewer	sonnet	18.8k	8.8k	18.8k	0	$0.15
harness-engineering:harness-enforcement-reviewer	sonnet	18.8k	7.7k	18.8k	0	$0.13
harness-engineering:harness-knowledge-reviewer	sonnet	18.8k	8.4k	18.8k	0	$0.14
harness-engineering:harness-self-documenting-code-reviewer	sonnet	18.8k	8.8k	18.8k	0	$0.15
security:security-code-auditor	haiku	19.1k	6.5k	19.1k	0	$0.05
discussion-summarizer	—	64.5k	1.3k	51.1k	13.4k (1h)	$0.15

_{Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.}

monit-reviewer · 2026-05-16T14:06:05Z

+	if !isRacerExists {
+		rbKeys = append(rbKeys, failedKey)
+	}
+	sort.Strings(rbKeys)


🟡 Medium (harness-engineering:harness-architecture-reviewer): Keys whose rollback step fails (rbFailed) are surfaced only in the error string ("keyring may be inconsistent") and not in any Result field. Callers needing to programmatically identify which keys are in an undefined state — e.g., to prompt the user to delete specific keys manually or trigger automated remediation — must parse free-form error text, which is fragile. The existing Result has Written/Restored/Absent/Untouched to classify clean outcomes; consider adding a RollbackFailed []string (or Inconsistent []string) field for the inconsistent-state keys. This is pre-v0.1.0 with no consumers, so the cost of the API change is low.

Reply to this thread when addressed.

Fixed in b8695c0. Added Result.RollbackFailed []string (sorted), populated on the rbFailed return path; godoc states it always accompanies a non-nil error and that such a key appears in no other slice. Callers can now remediate without parsing the error text.

monit-reviewer · 2026-05-16T14:06:05Z

+		!strings.Contains(msg, "keyring may be inconsistent") {
+		t.Fatalf("error must surface rollback failure + affected key: %v", err)
+	}
+	// b was never written (set failed) so it rolls back via delete →


🔵 Low (harness-engineering:harness-architecture-reviewer): TestSetBundleRollbackFailureSurfaced has two remaining assertion gaps on the rbFailed path: (1) res.Written is not asserted nil — a regression that populates Written on a rollback-failure return would go undetected; (2) res.Absent is not asserted to exclude key 'a' — since 'a' failed its rollback it should not appear in Absent, but no assertion verifies len(res.Absent)==1 or that 'a' is absent from res.Absent. Other rollback tests assert all Result fields; this test should do the same.

Reply to this thread when addressed.

Fixed in b8695c0. TestSetBundleRollbackFailureSurfaced now asserts res.Written == nil and res.RollbackFailed == [a]; the existing eqStrings("Absent", {"b"}) already pins that a is excluded from Absent.

monit-reviewer · 2026-05-16T14:06:05Z

+			}
+		}
+		return nil
+	}


🔵 Low (harness-engineering:harness-architecture-reviewer): TestSetBundleAllRestoreRollback uses a call-count closure (calls) to let the first invocation of setHook on 'default/c' fail the forward write while allowing the second invocation (rollback restore) to succeed. This is fragile: it assumes the rollback restore for 'c' is exactly the second call and that no reordering or retry occurs. The test has no assertion on rollback ordering, so any future change to rollback iteration order would silently misbehave rather than fail fast. Consider an 'armed' bool flag reset by the forward-write failure, or at minimum assert calls == 2 at the end of the test to pin the invariant.

Reply to this thread when addressed.

Fixed in b8695c0. Added if calls != 2 at the end of TestSetBundleAllRestoreRollback, pinning that exactly the forward-write failure (call 1) and c's rollback restore (call 2) hit the hook — any rollback-order change now fails fast.

monit-reviewer · 2026-05-16T14:06:05Z

+		}
+	} else {
+		for _, k := range keys {
+			if existed[k] {


🔵 Low (harness-engineering:harness-knowledge-reviewer): Result.Untouched merges two semantically distinct cases: (1) keys never attempted because a failure occurred before reaching them, and (2) the racer-owned ErrExists key deliberately left intact. A caller cannot distinguish 'never attempted' from 'owned by a concurrent writer' without also inspecting the error type, fragmenting the decision across both Result and error. Since this is pre-v0.1.0 with no consumers, consider splitting these into separate fields (e.g., NotReached []string and RacerOwned []string), or at minimum documenting the conflation explicitly in the Result struct godoc.

Reply to this thread when addressed.

Declining the split, by design. Untouched deliberately means "target keys this call did not change" — both never-attempted and racer-owned keys satisfy that, and the documented contract is that the racer case is identified via errors.Is(err, ErrExists) (the error names the key). Splitting into NotReached/RacerOwned would fragment a single "unchanged" concept across two fields while the disambiguator (the error sentinel) is already the documented mechanism. The struct godoc already states the conflation explicitly. Adding RollbackFailed (above) is different: those keys are in an undefined state with no other signal, so a dedicated field earns its keep.

monit-reviewer · 2026-05-16T14:06:05Z

@@ -0,0 +1,371 @@
+package credstore


🔵 Low (harness-engineering:harness-self-documenting-code-reviewer): No test exercises SetBundle with a kv map containing an invalid key (empty string or a key with a disallowed character). The validate-all-first path in SetBundle would catch this, but without a test the 'no writes occur when any key is invalid' guarantee — which mirrors the allowlist path already covered by TestSetBundleAllowlistEnforced — could silently regress in a future refactor.

Reply to this thread when addressed.

Fixed in b8695c0. Added TestSetBundleInvalidKeyNoWrites (table: empty key → ErrRefEmpty, bad char → ErrRefInvalidChar) asserting nothing is written and a sibling stays absent — mirroring TestSetBundleAllowlistEnforced for the syntax-validation path.

Surface keys whose SetBundle rollback step itself failed in a machine- readable Result.RollbackFailed slice rather than only in the error string, so callers can remediate (e.g. prompt to delete) without parsing free-form text. A key there appears in no other Result slice. Test hardening: - TestSetBundleRollbackFailureSurfaced asserts Written nil and RollbackFailed == [a], confirming a is excluded from Absent. - TestSetBundleAllRestoreRollback pins the distinguish-by-call invariant with calls == 2 so a rollback-order change fails fast. - New TestSetBundleInvalidKeyNoWrites: a syntactically invalid key in kv fails the whole call with *RefError and writes nothing. [INT-431]

monit-reviewer

Automated PR Review

Reviewed commit: b8695c0 | Previous: bdc1cb7 (incremental)

Approved with 3 non-blocking suggestions below. Address at your discretion.

Summary

Reviewer	Findings
harness-engineering:harness-architecture-reviewer	1
harness-engineering:harness-knowledge-reviewer	1
harness-engineering:harness-self-documenting-code-reviewer	1

harness-engineering:harness-architecture-reviewer (1 findings)

💡 Suggestion - credstore/bundle.go:58

bundleBareKeys calls s.be.listKeys() which returns every stored key across all profiles, then filters by prefix in Go — an O(total-keys) scan per ListBundle, DeleteBundle, and SetBundle call. The backend interface is now locked to a full-scan contract. A future OS-keyring backend cannot scope enumeration to a profile without a separate round-trip or an O(total) scan. Changing listKeys to accept a prefix string (listKeys(prefix string)) before any real backends land would avoid a later breaking change. Not a correctness issue today.

harness-engineering:harness-knowledge-reviewer (1 findings)

💡 Suggestion - credstore/bundle.go:101

DeleteBundle returns (deletedKeys []string, error) where failed-key names appear only in the error string. A caller that receives a partial-failure error cannot programmatically identify which keys were attempted-and-failed vs. skipped without parsing free-form text. This is inconsistent with the pattern just established by SetBundle's Result.RollbackFailed []string. Consider returning a structured error type (or a parallel failedKeys []string) so callers can remediate without error-string parsing.

harness-engineering:harness-self-documenting-code-reviewer (1 findings)

💡 Suggestion - credstore/bundle_test.go:194

SetBundle has no cross-profile isolation test. TestListBundle (line ~52) and TestDeleteBundle (line ~138) both explicitly assert that operations on one profile leave a second profile intact. SetBundle has no analogous assertion. The prefix-key scheme makes cross-profile leakage safe by construction today, but a future refactor to joinItemKey or bundleBareKeys could silently break isolation without a test to catch it.

1 info-level observations excluded. Run with --verbose to include.

18 PR discussion threads considered.

Completed in 5m 12s | $1.38 | sonnet | daemon 0.2.116 | Glorfindel

Field	Value
Model	sonnet
Mode	Re-review · Cycle 3 · Session resumed
Reviewers	hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine	claude · sonnet
Reviewed by	pr-review-daemon · monit-pr-reviewer
Duration	5m 12s wall · 14m 35s compute (Reviewers: 3m 19s · Synthesis: 1m 22s)
Cost	$1.38
Tokens	228.8k in / 51.3k out
Turns	6

Per-workstream usage

Workstream	Model	In	Out	Cache read	Cache create	Cost
hybrid-synthesis	sonnet	62.0k	4.4k	13.6k	48.4k (1h)	$0.25
harness-engineering:harness-architecture-reviewer	sonnet	19.6k	9.2k	2.1k	17.4k (1h)	$0.22
harness-engineering:harness-enforcement-reviewer	sonnet	19.6k	10.2k	2.1k	17.4k (1h)	$0.23
harness-engineering:harness-knowledge-reviewer	sonnet	19.6k	8.0k	2.1k	17.4k (1h)	$0.20
harness-engineering:harness-self-documenting-code-reviewer	sonnet	19.6k	10.5k	2.1k	17.4k (1h)	$0.24
security:security-code-auditor	haiku	19.9k	7.0k	0	19.8k (1h)	$0.07
discussion-summarizer	—	68.7k	2.0k	53.2k	15.5k (1h)	$0.18

_{Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.}

monit-reviewer · 2026-05-16T14:14:46Z

+// "<key>" half of every "<profile>/<key>" Item.Key). Caller holds s.mu.
+func (s *Store) bundleBareKeys(profile string) ([]string, error) {
+	all, err := s.be.listKeys()
+	if err != nil {


🔵 Low (harness-engineering:harness-architecture-reviewer): bundleBareKeys calls s.be.listKeys() which returns every stored key across all profiles, then filters by prefix in Go — an O(total-keys) scan per ListBundle, DeleteBundle, and SetBundle call. The backend interface is now locked to a full-scan contract. A future OS-keyring backend cannot scope enumeration to a profile without a separate round-trip or an O(total) scan. Changing listKeys to accept a prefix string (listKeys(prefix string)) before any real backends land would avoid a later breaking change. Not a correctness issue today.

Reply to this thread when addressed.

monit-reviewer · 2026-05-16T14:14:46Z

+	if err := validateProfile(profile); err != nil {
+		return nil, err
+	}
+	targets, err := s.bundleBareKeys(profile)


🔵 Low (harness-engineering:harness-knowledge-reviewer): DeleteBundle returns (deletedKeys []string, error) where failed-key names appear only in the error string. A caller that receives a partial-failure error cannot programmatically identify which keys were attempted-and-failed vs. skipped without parsing free-form text. This is inconsistent with the pattern just established by SetBundle's Result.RollbackFailed []string. Consider returning a structured error type (or a parallel failedKeys []string) so callers can remediate without error-string parsing.

Reply to this thread when addressed.

monit-reviewer · 2026-05-16T14:14:46Z

+	memOf(t, s).setHook = func(ik string) error {
+		if ik == "default/c" {
+			return errors.New("backend exploded")
+		}


🔵 Low (harness-engineering:harness-self-documenting-code-reviewer): SetBundle has no cross-profile isolation test. TestListBundle (line ~52) and TestDeleteBundle (line ~138) both explicitly assert that operations on one profile leave a second profile intact. SetBundle has no analogous assertion. The prefix-key scheme makes cross-profile leakage safe by construction today, but a future refactor to joinItemKey or bundleBareKeys could silently break isolation without a test to catch it.

Reply to this thread when addressed.

monit-reviewer suggested changes May 16, 2026

View reviewed changes

monit-reviewer approved these changes May 16, 2026

View reviewed changes

rianjs merged commit eaa1d23 into main May 16, 2026
4 checks passed

rianjs deleted the feat/INT-431-bundle-ops branch May 16, 2026 14:15

This was referenced May 16, 2026

credstore: real OS keyring backends + auto/env/config selection (§1.4) #7

Closed

feat(credstore): real OS keyring backends + auto/env/config selection (INT-432) #8

Merged

		@@ -38,6 +44,11 @@ func (b *memoryBackend) set(itemKey, value string, overwrite bool) error {
		if b.m == nil {

Conversation

rianjs commented May 16, 2026

Surface

Atomicity highlights

Tests (97 funcs, -race)

Out of scope (later INT-310 units)

Architect review

Uh oh!

rianjs commented May 16, 2026

Uh oh!

rianjs commented May 16, 2026

TDD Coverage Assessment

Gap — Missing test cases

Minor — Assertion quality / sentinel specificity

Nit

Summary table

Uh oh!

rianjs commented May 16, 2026

Uh oh!

rianjs commented May 16, 2026

Uh oh!

monit-reviewer left a comment

Choose a reason for hiding this comment

Automated PR Review

Summary

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

monit-reviewer left a comment

Choose a reason for hiding this comment

Automated PR Review

Summary

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Tests (97 funcs, `-race`)