Skip to content

fix(oauth): retry secret-lock on Windows ERROR_ACCESS_DENIED (delete-pending race)#445

Open
gnanam1990 wants to merge 1 commit into
mainfrom
fix/oauth-lock-windows-access-denied
Open

fix(oauth): retry secret-lock on Windows ERROR_ACCESS_DENIED (delete-pending race)#445
gnanam1990 wants to merge 1 commit into
mainfrom
fix/oauth-lock-windows-access-denied

Conversation

@gnanam1990

@gnanam1990 gnanam1990 commented Jul 3, 2026

Copy link
Copy Markdown
Collaborator

Summary

createSecretFile (internal/oauth/encrypt.go) takes a <secret>.lock with O_CREATE|O_EXCL and retries on contention — but it only treated os.ErrExist as contention. On Windows, when a concurrent holder's defer os.Remove(lockPath) leaves the lock file in a "delete pending" state, a racer's O_EXCL create returns ERROR_ACCESS_DENIED (os.ErrPermission), not ErrExist — so it fell through to a hard oauth: create token secret lock: Access is denied error instead of retrying.

This treats os.ErrPermission as retryable contention too, mirroring the identical handling already present in acquireFileLock (lock.go, from #261) for the exact same OS quirk. That sibling lock got the fix; this second lock loop was missed.

Impact: fixes the intermittently-failing Windows CI test TestLoadOrCreateSecretConcurrentConverges that has been red-flagging the Windows smoke job on unrelated PRs — and fixes a real Windows bug where concurrent token-secret creation (parallel logins/refreshes) could spuriously fail.

Verification

  • macOS: go build ./..., go vet, gofmt clean; go test ./internal/oauth/ green; the concurrent-converge test passes under -race (5×).
  • ⚠️ The Windows delete-pending ERROR_ACCESS_DENIED path can't be reproduced off Windows — the definitive confirmation is this PR's Windows smoke job going green (it was failing on that exact test before this change).

Linked issue

Team / internal-cycle PR — no separate issue; fixes a CI-blocking Windows flake surfaced across recent PRs.

Checklist

  • go build ./..., go vet ./..., go test ./internal/oauth/ pass locally (non-Windows).
  • gofmt clean.
  • Fix mirrors the existing lock.go handling; explanatory comment added.
  • Windows CI green — the real proof; watch this PR's Windows smoke job.

Summary by CodeRabbit

  • Bug Fixes
    • Improved secret file creation on Windows by handling a permission-related lock contention case correctly.
    • Reduced spurious failures when multiple processes try to create secrets at the same time.

…reateSecretFile

createSecretFile's secret-lock loop only retried on os.ErrExist. On Windows a
concurrent holder's `defer os.Remove(lockPath)` leaves the lock file in a
"delete pending" state, so a racer's O_EXCL create returns ERROR_ACCESS_DENIED
(os.ErrPermission) rather than ErrExist — which fell through to a hard
"create token secret lock: Access is denied" error instead of retrying. Mirrors
the identical handling already present in acquireFileLock (lock.go, from #261).

Fixes the flaky Windows-CI failure TestLoadOrCreateSecretConcurrentConverges
that was intermittently red-flagging every PR's Windows smoke job.

Verified on macOS: build/vet/gofmt clean, oauth package green, concurrent-
converge test passes under -race (5x). The Windows delete-pending path can't be
reproduced off Windows, so the definitive proof is the Windows CI on this PR.
@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Zero automated PR review

Verdict: No blockers found

Blockers

  • None found.

Validation

  • [pass] Diff hygiene: git diff --check
  • [pass] Tests: go test ./...
  • [pass] Build: go run ./cmd/zero-release build
  • [pass] Smoke build: go run ./cmd/zero-release smoke

Scope

Head: 55d1d322c672
Changed files (1): internal/oauth/encrypt.go

This deterministic review checks validation status and basic diff hygiene. A human reviewer still owns product judgment and design quality.

@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6313177c-f857-4d67-ad7d-60161d290bfa

📥 Commits

Reviewing files that changed from the base of the PR and between b63fe69 and 55d1d32.

📒 Files selected for processing (1)
  • internal/oauth/encrypt.go

Walkthrough

The lock-file acquisition retry logic in createSecretFile is updated to retry when the lock create error is os.ErrPermission, in addition to the existing os.ErrExist retry, addressing Windows-specific delete-pending behavior during concurrent secret creation.

Changes

Lock Acquisition Retry Fix

Layer / File(s) Summary
Retry on permission errors during lock creation
internal/oauth/encrypt.go
Lock-file creation retry logic now also retries on os.ErrPermission, not just os.ErrExist, to handle Windows delete-pending semantics as contention.

Estimated code review effort: 1 (Trivial) | ~3 minutes

Suggested labels: bug, windows, oauth

Suggested reviewers: none specifically identified from the provided context

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the Windows lock retry fix for delete-pending ERROR_ACCESS_DENIED contention.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/oauth-lock-windows-access-denied

Comment @coderabbitai help to get the list of available commands.

@Vasanthdev2004 Vasanthdev2004 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve — genuine root-cause fix, and it kills the CI flake for real

Worth it: yes. On Windows a concurrent holder's defer os.Remove(lockPath) leaves the .lock file delete-pending, so a racer's O_EXCL create returns ERROR_ACCESS_DENIED (os.ErrPermission) rather than ErrExist. The old createSecretFile loop only treated ErrExist as contention, so parallel token-secret creation (concurrent logins/refreshes) could spuriously hard-fail with oauth: create token secret lock: Access is denied. Treating ErrPermission as retryable is exactly right — and it mirrors the already-merged, identical handling in acquireFileLock (lock.go, #261); that sibling lock got the fix and this one was simply missed. 6 lines, precedented, well-commented.

Verified on Windows (isolated worktree): go test ./internal/oauth/ -run TestLoadOrCreateSecret -count=5 → 5/5 green (the concurrent-converge test that flakes now passes reliably); full package green; go build/vet/gofmt clean. I also confirmed it against the original racy test at -count=100 → 100/100. CI all green including Smoke (windows-latest) — same-repo branch, so that Windows pass is real.

Security: safe. The new ErrPermission branch is read-only (readSecretFileRetry only); the sole write path (writeNewSecretFile, temp-file + atomic rename) runs only after O_EXCL exclusively acquired the lock, so the broadened retry can't double-write or corrupt the secret. Fail-closed behavior untouched.

Nit (non-blocking): broadening to ErrPermission means a genuine permission failure (dir truly unwritable) now surfaces as a timed out waiting for token secret lock timeout rather than the immediate Access is denied — cosmetic diagnostics, and the same tradeoff already accepted in lock.go #261.

This is the correct fix of the #445/#451 pair — see my note on #451. Thanks @gnanam1990.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants