Conversation
Replace the second `mv` in cmd_put with `ln -f` so the strong key and weak alias share a single inode — concurrent cmd_get observers can no longer see strong-HIT / weak-MISS, which was silently re-triggering the Socratic interview. Add cmd_get_with_fallback (Option C) plus a `get-fallback` dispatch entry to self-heal half-pair states (legacy v1.6.1 entries written before this fix, or selectively-invalidated aliases) by recreating the missing side as a hardlink on first read. Closes #70 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add two fixtures pinning the W1.4 hardlink fix: - test-5way.sh spawns 5 concurrent cmd_put calls and asserts every strong/weak pair shares a single inode and identical content. - test-self-heal.sh seeds half-pair states (strong-only, then weak-only) and asserts get-fallback restores the missing side with a matching inode. Both detect macOS vs Linux at runtime to pick the portable `stat -f %i` / `stat -c %i` invocation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
🔍 동작 개요
📋 변경 사항
🔄 시나리오 다이어그램sequenceDiagram
participant Caller as 호출자
participant GF as get-fallback
participant SK as 강력한<br/>키 조회
participant WK as 약한<br/>별칭 조회
participant SH as Self-heal<br/>로직
participant Disk as 디스크
Caller->>GF: get-fallback(strong_key, weak_key)
GF->>SK: cmd_get(strong_key) 시도
SK->>Disk: 강력한 키 파일 읽기
alt 강력한 키 존재
Disk-->>SK: 콘텐츠 반환
SK-->>GF: 성공 (exit 0)
GF->>SH: 약한 별칭<br/>self-heal?
SH->>Disk: hardlink<br/>생성
GF-->>Caller: 콘텐츠 반환<br/>(exit 0)
else 강력한 키 부재
Disk-->>SK: 실패
SK-->>GF: 실패
GF->>WK: cmd_get(weak_key) 시도
WK->>Disk: 약한 별칭 파일 읽기
alt 약한 별칭 존재
Disk-->>WK: 콘텐츠 반환
WK-->>GF: 소프트 hit (exit 2)
GF-->>Caller: 콘텐츠 반환<br/>(exit 2)
else 약한 별칭도 부재
Disk-->>WK: 실패
WK-->>GF: 실패
GF-->>Caller: 모두 miss<br/>(exit 1)
end
end
⏱️ 예상 코드 리뷰 소요 시간🎯 3 (Moderate) | ⏱️ ~25 분 🔗 관련 가능성 있는 PR
🐰 축제의 시
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request addresses a race condition in the preview cache by using hardlinks for weak aliases, ensuring that strong and weak keys are published atomically. It also introduces a get-fallback command to self-heal incomplete cache entries and includes new concurrency tests. While the hardlink implementation in cmd_put is sound, the new cmd_get_with_fallback function has several issues: it lacks input sanitization for the keys (leading to potential path traversal), strips trailing newlines from the JSON output due to command substitution, and uses a non-atomic cp operation for repairs.
| cmd_get_with_fallback() { | ||
| # I-8 / issue #70 (W1.4) — Option C self-heal. If a v1.6.1 put landed | ||
| # the strong key but the weak alias is missing (e.g. legacy entry | ||
| # written before the hardlink fix, or an alias that was independently | ||
| # invalidated), restore the alias on the fly so subsequent | ||
| # pre-Socratic probes hit immediately. Conversely, if the strong key | ||
| # is missing but the weak alias resolves (rare but possible after a | ||
| # selective `invalidate STRONG`), repair the strong key from the weak | ||
| # entry. Either path returns the cached JSON on stdout, exit 0; full | ||
| # miss returns exit 1 (matches cmd_get). | ||
| local strong="$1" | ||
| local weak="$2" | ||
| local out | ||
| if out=$(cmd_get "$strong"); then | ||
| # Strong hit — opportunistically ensure the weak alias is in place | ||
| # so the NEXT pre-Socratic probe on this idea avoids the cmd_get | ||
| # round-trip on the strong key entirely. | ||
| if [[ -n "$weak" && "$weak" != "$strong" && ! -f "$CACHE_DIR/$weak.json" ]]; then | ||
| ln -f "$CACHE_DIR/$strong.json" "$CACHE_DIR/$weak.json" 2>/dev/null || true | ||
| fi | ||
| printf '%s' "$out" | ||
| return 0 | ||
| fi | ||
| if out=$(cmd_get "$weak"); then | ||
| # Weak hit but strong miss — repair the strong key from the weak | ||
| # file. Hardlink keeps both names pointed at the same inode so | ||
| # later puts and TTL checks stay consistent. | ||
| if [[ -n "$strong" && "$strong" != "$weak" ]]; then | ||
| ln -f "$CACHE_DIR/$weak.json" "$CACHE_DIR/$strong.json" 2>/dev/null || \ | ||
| cp "$CACHE_DIR/$weak.json" "$CACHE_DIR/$strong.json" | ||
| fi | ||
| printf '%s' "$out" | ||
| return 0 | ||
| fi | ||
| return 1 | ||
| } |
There was a problem hiding this comment.
The cmd_get_with_fallback implementation has several issues that should be addressed:
- Security (Path Traversal): The
strongandweakkeys are used directly in file paths without sanitization. This violates the repository's rules. Variables used in file paths should be sanitized with a strict allowlist (e.g.,tr -dc '[:alnum:]._-') and IDs should use strict regex validation to prevent potential injection or traversal issues. - Correctness (Newline Preservation): Capturing the output of
cmd_getusingout=$(...)and then printing it withprintf '%s' "$out"strips all trailing newlines from the cached JSON. This can break tools that expect a trailing newline. Callingcmd_getdirectly avoids this and is more efficient as it streams the output. - Atomicity: The fallback to
cp(line 365) is non-atomic. If a concurrent process reads the file whilecpis in progress, it may receive truncated or corrupted JSON.
I recommend refactoring the function to sanitize inputs using a strict allowlist, call cmd_get directly to preserve output formatting, and leverage cmd_put for the repair logic to ensure atomicity and reuse existing sanitization logic.
References
- Sanitize variables used in file paths with a strict allowlist (e.g.,
tr -dc '[:alnum:]._-') to prevent potential injection issues. - Implement defense-in-depth sanitization for user-provided data, including removing Unicode control characters and applying strict allowlist validation (e.g., regex) for IDs.
There was a problem hiding this comment.
Applied in 6c4b92a — P1/P3 반영, P2는 이미 처리됨.
- P1 (path traversal):
cmd_get_with_fallback도cmd_put과 동일한[:alnum:]._-허용목록으로 strong/weak 키를 정규화합니다 (자체 호출자는 16-hex 해시만 발행하지만 defense-in-depth 차원). - P3 (atomicity): weak alias self-heal을 private tmp(
.heal-${weak}.$$.tmp)에 stage한 뒤mv -frename(2)으로 게시하도록 변경. 동시 reader가 절반만 복사된 alias를 관측할 수 없습니다 (cmd_put의 alias-first 게시와 동일한 atomicity 보장). - P2 (trailing newline): 지적하신
out=$(cmd_get ...); printf '%s' "$out"패턴은 현재 코드에 존재하지 않습니다.cmd_get이 직접cat "$file"로 stream하므로 command-substitution capture 자체가 없고 trailing newline은 보존됩니다 (codex R2 P3에서 이미 처리됨). 후속 리뷰어를 위해 주석으로 invariant를 명시했습니다.
3개 concurrency fixture (race-window / self-heal / 5way) 모두 통과.
… fallback (#70) Codex review followups on PR #81 W1.4: R2 P1: ln -f against existing alias is NOT atomic (link(2) cannot overwrite, so -f does unlink+link with a window where the alias name disappears). Real fix: build inode at primary_tmp, hardlink alias_tmp to it, publish ALIAS first via rename(2), then publish strong via rename(2). Both names share the inode; only weak-HIT/strong-MISS is reachable from a partial write — the strong-HIT/weak-MISS bug is gone. R2 P2: cmd_get_with_fallback no longer rebuilds strong from weak (weak_key omits idea_spec_hash; rebuilding strong from weak could let stale-spec content satisfy a spec-specific lookup). R2 P3: stream via cmd_get directly (cat) instead of out=$() capture — trailing newlines are now preserved byte-for-byte. R3 P2-A: fall back to cp when ln fails (exFAT/SMB/NFS without hardlink support). Publish status echo now indicates "via link" or "via copy" so the mode is visible. R3 P2-B: distinct exit codes for cmd_get_with_fallback — 0 = strong HIT (authoritative) 2 = soft hit via weak (caller MUST regen previews) 1 = both miss Test fixture asserts all three codes. R3 P3: replaced runtime polling probe (which couldn't reliably catch sub-µs rename(2) races) with a STATIC source-level invariant test that asserts alias publish line < strong publish line in the source. Mutation-tested: swapping the publish blocks fails the test. Files: - scripts/preview-cache.sh — restructure cmd_put to alias-first publish; add cp fallback; narrow cmd_get_with_fallback (no strong-from-weak repair); exit-code contract documented in usage. - tests/fixtures/cache-concurrency/test-race-window.sh (new) — static invariant guard. - tests/fixtures/cache-concurrency/test-self-heal.sh — 4 cases (strong-only, weak-only soft-hit, full miss, byte-equivalence). - tests/fixtures/cache-concurrency/README.md — updated to reflect exit-code contract and fixture rationale. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cmd_get_with_fallback now mirrors cmd_put's hardening posture: - Sanitise strong/weak keys via the same `[:alnum:]._-` allowlist (defence-in-depth path-traversal guard; gemini HIGH P1). - Stage weak-alias self-heal into a private tmp + mv -f so a concurrent reader can never observe a half-copied alias (atomicity parity with cmd_put; gemini HIGH P3). - Newline-preservation concern (P2) was already addressed in the initial PR (cmd_get streams via cat — no command-substitution capture), so no code change there; comment expanded to make the invariant explicit for future reviewers. All three concurrency fixtures still pass (race-window / self-heal / 5way) and bash -n is clean. Refs PR #81 review comments (gemini-code-assist) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6c4b92a615
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| strong=$(printf '%s' "$strong_in" | tr -dc '[:alnum:]._-') | ||
| if [[ -z "$strong" ]]; then | ||
| echo "preview-cache.sh: refusing get-fallback with empty/unsafe strong key: '$strong_in'" >&2 | ||
| return 2 |
There was a problem hiding this comment.
Use non-soft-hit status for invalid strong keys
get-fallback documents exit 2 as a soft hit via weak alias, but the input-validation branch also returns 2 when strong_key sanitizes to empty. Any caller that branches on 2 to take the soft-hit path (e.g., skip Socratic and regenerate previews) will mis-handle malformed input as a cache hit even though no weak payload was streamed. Return a distinct error code here so 2 remains unambiguous.
Useful? React with 👍 / 👎.
Summary
scripts/preview-cache.shcmd_putnow publishes the weak alias as a hardlink (ln -f) against the strong-key inode instead of as an independentcp+mv. Strong and weak names flip in lock-step for every concurrentcmd_getobserver, closing the I-8 race that occasionally produced strong-HIT / weak-MISS and silently re-triggered the Socratic interview.cmd_get_with_fallbackhelper (CLI:get-fallback STRONG WEAK) implements Option C self-heal: legacy v1.6.1 entries written before this fix, or any selectively-invalidated half-pair, repair the missing side on first read by hardlinking against the surviving file.tests/fixtures/cache-concurrency/pin the invariant: 5-way concurrent put (all 5 pairs share inode, byte-identical) and self-heal (strong-only, then weak-only seeded → missing side restored, inodes match).Root cause
The previous
cmd_putwrote the strong key and the weak alias as two independentmktemp+mvsequences. Between the twomvcalls, a concurrent runner could observe the strong file on disk while the weak alias did not yet exist, satisfyingcmd_get(strong)but failingcmd_get(weak). No data corruption, but UX degradation: the one-click replay promise broke whenever two runs raced. Hardlink (link(2)on the same FS) is atomic from readers' perspective and produces a single shared inode, so the alias name and content become visible in one step.Test plan
bash tests/fixtures/cache-concurrency/test-5way.sh(5 concurrent puts, all pairs share inode)bash tests/fixtures/cache-concurrency/test-self-heal.sh(strong-only and weak-only half-pair recovery)bash tests/fixtures/security/verify-security.sh— full Phase 1 suite still green (T-9.4 atomic concurrent-put unaffected)bash -n scripts/preview-cache.shsyntax checkCloses #70
Summary by CodeRabbit
릴리스 노트
새 기능
get-fallback하위 명령 추가 (강력한 캐시 키 우선 시도, 약한 별칭 폴백 지원)테스트