security(W2+W4): workflow-perm narrowing + log-bash secret scrub by dackclup · Pull Request #229 · dackclup/quantrank

dackclup · 2026-05-24T04:01:28Z

Summary

Closes the 2 remaining security-reviewer WARNs deferred from PR #226 (W1 + W3 shipped there; W2 + W4 bundled here as one focused security-hardening PR). Both fixes are operational hygiene with no compute / schema / production-code surface.

W2 — `compute-rankings.yml` workflow-perm narrowing

The workflow-level permissions: contents: write was wider than needed — only the final Commit JSON outputs step inside the compute: job is a writer. Fix:

	Before	After
Workflow-level default	`contents: write`	`contents: read`
`compute:` job	inherits write	explicitly `contents: write`
Future job added	would inherit write	inherits read unless opts up

Behavior unchanged for the compute job; only the default surface shrinks. YAML parse-verified clean.

W4 — `log-bash.sh` inline-credential scrub

Previously the hook appended the raw Bash command (including any inline env-var dereference like EDGAR_USER_AGENT=foo python ... or an Authorization: Bearer <tok> header) to gitignored .claude/session.log. Severity is LOW because the file is gitignored + local-only, but an accidental cat .claude/session.log during a screen-share or pasted into a gist could leak the credential.

Fix: sed pre-filter redacts the value half of common secret-prefix tokens before appending; the prefix stays so the log is still readable for "which integration was being called" debugging.

Covered prefixes:

Prefix family	Pattern
GitHub	`ghp_` / `gho_` / `ghu_` / `ghs_` / `ghr_` / `github_pat_`
Anthropic	`sk-ant-api*`
OpenAI generic	`sk-*`
AWS	`AKIA` / `ASIA`
Google API	`AIza*`
Slack	`xox[abprs]-*`
Bare bearer	`Bearer <tok>` / `Authorization: Bearer <tok>`

Manual scrub test (this PR's verification):

in:  curl -H "Authorization: Bearer ghp_abcdef123...890" ...
out: curl -H "Authorization: Bearer ghp_[REDACTED]" ...

Fail-open discipline preserved — if sed errors (unlikely), the existing || true falls back to the original command, never blocks the hook.

Defense-in-depth: the gitignore remains the primary guard; this shrinks the blast radius of an accidental local leak.

Files touched

File	Δ	Reason
`.github/workflows/compute-rankings.yml`	+12 / −1	W2 — workflow-perm narrowing
`.claude/hooks/log-bash.sh`	+19 / −1	W4 — secret scrub
`CLAUDE.md`	+33	§Phase status in-flight note
`AGENTS.md`	+28 / −18	§Phase + version state in-flight note + PR #228 stale-header reword

Doc-only otherwise — no compute / schema / scoring / valuation / frontend / Python / TS production-code change. CLAUDE.md + AGENTS.md lockstep satisfied per §Conventions.

Test plan

python -c "yaml.safe_load(...)" — .github/workflows/compute-rankings.yml parses clean
Manual scrub test — Bearer ghp_abc... correctly redacts to Bearer ghp_[REDACTED]
git diff --stat confirms scope (4 files / +105 / −20)
Frontend (build) CI green (no frontend touched; expected pass)
Python (lint + test) CI green (no Python touched; expected pass)
security-reviewer sonnet spawn (optional — re-verifies both WARNs resolved)
Flip Draft → Ready after CI green + reviewer pass

Out of scope (still deferred)

Fix feat(phase-1): universe + prices + momentum stub #3 (_FOOTNOTES_REQUIRED_ATTRS extension) — still needs @network verify on real edgartools
F2 — <rule10b5_1> docstring precision via edgar-debugger
F3 — form4 citation tightening (pairs with F2)
Item 5 — cron-feat(phase-2): SEC EDGAR fundamentals + per-stock detail pages #4 verify-helper Section A-L (~19h away at Sun 22:00 UTC+)

https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4

Generated by Claude Code

Closes the 2 remaining security-reviewer WARNs deferred from PR #226 (W1 + W3 shipped there; W2 + W4 bundled here as one focused security-hardening PR). Both fixes are operational hygiene with no compute / schema / production-code surface. W2 — .github/workflows/compute-rankings.yml workflow-perm narrowing --- The workflow-level `permissions: contents: write` was wider than needed — only the final `Commit JSON outputs` step inside the `compute:` job is a writer. Fix: - Workflow-level default narrowed to `contents: read` (least privilege baseline; any future job added to this file inherits read unless it explicitly opts up) - The `compute:` job explicitly opts up to `contents: write` so the `git commit + push` step still works Behavior unchanged for the compute job; only the default surface shrinks. YAML parse-verified clean. W4 — .claude/hooks/log-bash.sh inline-credential scrub --- Previously the hook appended the raw Bash command (including any inline env-var dereference like `EDGAR_USER_AGENT=foo python ...` or an `Authorization: Bearer <tok>` header) to gitignored `.claude/session.log`. Severity is LOW because the file is gitignored + local-only, but an accidental `cat .claude/session.log` during a screen-share or pasted into a gist could leak the credential. Fix: `sed` pre-filter redacts the value half of common secret-prefix tokens before appending; the prefix stays so the log is still readable for "which integration was being called" debugging. Covered prefixes: - GitHub: ghp_ / gho_ / ghu_ / ghs_ / ghr_ / github_pat_ - Anthropic: sk-ant-api* - OpenAI generic: sk-* - AWS: AKIA* / ASIA* - Google API: AIza* - Slack: xox[abprs]-* - Bare `Bearer <tok>` / `Authorization: Bearer <tok>` headers Manual scrub test: in: curl -H "Authorization: Bearer ghp_abcdef123...890" ... out: curl -H "Authorization: Bearer ghp_[REDACTED]" ... Fail-open discipline preserved — if `sed` errors (unlikely), the existing `|| true` falls back to the original command, never blocks the hook. Defense-in-depth: the gitignore remains the primary guard; this shrinks the blast radius of an accidental local leak. CLAUDE.md + AGENTS.md lockstep satisfied via §Phase status in-flight notes (no new §Gotchas — these are not invariants future code authors need to remember; they're hardening of existing surfaces). https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4

vercel · 2026-05-24T04:01:33Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
quantrank	Ready	Preview, Comment	May 24, 2026 4:02am

…per edgar-debugger live-XML verify The literature-searcher proof-of-life on 2026-05-23 (session 4) flagged a precision gap: the form4 module docstrings referenced the colloquial <rule10b5_1> XML tag name (a label, not the actual SEC EDGAR Ownership XML element). edgar-debugger re-audit on 2026-05-24 (session 5) fetched 3 live AAPL Form 4 XMLs from SEC EDGAR and confirmed the actual element is <aff10b5One> at ownershipDocument/aff10b5One — a DOCUMENT-LEVEL boolean (one per filing, covering ALL transactions in that Form 4), NOT a per-transaction attribute. Updated 7 references to use the canonical name: 5 code spots: - compute/scoring/form4_insider.py:20 (module docstring header) - compute/scoring/form4_insider.py:82 (§Footnote resolution) - compute/scoring/form4_insider.py:587 (inline _form4_to_transactions comment) - compute/scoring/form4_signals.py:129 (module docstring) - compute/output/schemas.py:200 (extreme_estimate_majority_count comment context) 2 doc spots: - CLAUDE.md §Phase status PR #224 entry "Access-path caveat" block (line 1352-1359) - AGENTS.md §Phase + version state PR #224 entry "Access-path caveat for cross-tool agents" block (line 947-951) Added §Footnote resolution architectural-gap note in compute/scoring/form4_insider.py: a filer who checks <aff10b5One>true at the document level but does NOT include the per-transaction footnote text (valid — the checkbox IS the formal affirmative defense, footnotes are supplemental) will slip past the current footnote-text fallback path and enter the opportunistic cohort incorrectly. Deferred to a follow-up PR that direct-parses the raw Form 4 XML for ownershipDocument/aff10b5One (edgartools doesn't expose it, so the implementation walks the filing text once per Ownership object). Tracked as the architectural follow-up after PR 4-eq. edgartools 5.31.5 verified as PyPI latest by edgar-debugger 2026-05-24 — no newer release adds a parse path for <aff10b5One>. detect_10b5_1_plan regex set (6 patterns) confirmed complete vs real-world footnote text — pattern "10b5-1" alone covers every common variant including Rule 10b5-1(c) subsection cites. No additions needed. Docstring-only PR — no compute / scoring / valuation / behavior change. schema_check clean (comment-only edit on schemas.py), ruff clean, tests unchanged at 1168+. CLAUDE.md + AGENTS.md lockstep satisfied via §Phase status in-flight notes. Companion stale-header reword: marked PR #229 "Security WARN cleanup" → "merged via PR #229" in CLAUDE.md + AGENTS.md. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4

…sion §Gotcha Closes the recurring "merge ไม่ได้ หลังแก้แล้วกลับมาเป็นอีก" pattern surfaced 2026-05-24: PR #230 hit mergeable_state=dirty twice in a single session — first when PR #229 (security WARN cleanup) landed on main mid-iteration, then again when PR #232 + PR #233 (LedgerCraft A1 + A2) landed before Mark-Ready. ROOT CAUSE: every PR adds a "**X in flight (this PR)**" entry at the SAME insertion line in CLAUDE.md §Phase status + AGENTS.md §Phase + version state (just before "**Next deliverables**" / "## Claude-Code-specific tooling"). Two parallel PRs both touch that single line → `git merge` cannot auto-resolve two distinct adds at the same line → `mergeable_state: dirty`. The conflict is BENIGN (both adds are legitimate; resolution is always "keep both in chronological order") but git can't auto-detect that. THREE COMPANION EDITS bundled here: (a) CLAUDE.md §Conventions — new bullet "Rebase onto origin/main before flipping any PR Draft → Ready". Includes the operational `git fetch origin main && git rebase origin/main` recipe + the "keep both entries in chronological order (older PR first, your entry second)" resolution discipline. (b) CLAUDE.md §Gotchas — new "Parallel-PR §Phase status collision pattern" entry recording the recurring symptom (PR #230 ×2 on 2026-05-24 against PR #229 + PR #232 + PR #233) + the local-mitigation workflow + a forward-looking structural follow-up note: consider moving the "in flight" sub-block to a side file (PHASE_STATUS_INFLIGHT.md) that's append-only- per-PR; on merge a housekeeping commit moves the entry into CLAUDE.md proper. Not yet adopted — trade-off is extra discipline at merge time. (c) AGENTS.md mirror — cross-tool equivalent for Copilot / Cursor / Devin contributors, emphasising the rebase-before-Mark-Ready discipline applies to ALL agent runtimes (not Claude-Code- specific). ALSO bundled as a 3rd in-flight entry in the §Phase status block documenting THIS PR's scope expansion (Parts 1+2+3 now). No infrastructure / CI / behavior change — pure doc discipline. Future contributors who hit the same conflict find the §Gotchas entry + the §Conventions recipe + AGENTS.md note and resolve in seconds, instead of re-discovering the pattern. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4

…per edgar-debugger live-XML verify The literature-searcher proof-of-life on 2026-05-23 (session 4) flagged a precision gap: the form4 module docstrings referenced the colloquial <rule10b5_1> XML tag name (a label, not the actual SEC EDGAR Ownership XML element). edgar-debugger re-audit on 2026-05-24 (session 5) fetched 3 live AAPL Form 4 XMLs from SEC EDGAR and confirmed the actual element is <aff10b5One> at ownershipDocument/aff10b5One — a DOCUMENT-LEVEL boolean (one per filing, covering ALL transactions in that Form 4), NOT a per-transaction attribute. Updated 7 references to use the canonical name: 5 code spots: - compute/scoring/form4_insider.py:20 (module docstring header) - compute/scoring/form4_insider.py:82 (§Footnote resolution) - compute/scoring/form4_insider.py:587 (inline _form4_to_transactions comment) - compute/scoring/form4_signals.py:129 (module docstring) - compute/output/schemas.py:200 (extreme_estimate_majority_count comment context) 2 doc spots: - CLAUDE.md §Phase status PR #224 entry "Access-path caveat" block (line 1352-1359) - AGENTS.md §Phase + version state PR #224 entry "Access-path caveat for cross-tool agents" block (line 947-951) Added §Footnote resolution architectural-gap note in compute/scoring/form4_insider.py: a filer who checks <aff10b5One>true at the document level but does NOT include the per-transaction footnote text (valid — the checkbox IS the formal affirmative defense, footnotes are supplemental) will slip past the current footnote-text fallback path and enter the opportunistic cohort incorrectly. Deferred to a follow-up PR that direct-parses the raw Form 4 XML for ownershipDocument/aff10b5One (edgartools doesn't expose it, so the implementation walks the filing text once per Ownership object). Tracked as the architectural follow-up after PR 4-eq. edgartools 5.31.5 verified as PyPI latest by edgar-debugger 2026-05-24 — no newer release adds a parse path for <aff10b5One>. detect_10b5_1_plan regex set (6 patterns) confirmed complete vs real-world footnote text — pattern "10b5-1" alone covers every common variant including Rule 10b5-1(c) subsection cites. No additions needed. Docstring-only PR — no compute / scoring / valuation / behavior change. schema_check clean (comment-only edit on schemas.py), ruff clean, tests unchanged at 1168+. CLAUDE.md + AGENTS.md lockstep satisfied via §Phase status in-flight notes. Companion stale-header reword: marked PR #229 "Security WARN cleanup" → "merged via PR #229" in CLAUDE.md + AGENTS.md. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4

…sion §Gotcha Closes the recurring "merge ไม่ได้ หลังแก้แล้วกลับมาเป็นอีก" pattern surfaced 2026-05-24: PR #230 hit mergeable_state=dirty twice in a single session — first when PR #229 (security WARN cleanup) landed on main mid-iteration, then again when PR #232 + PR #233 (LedgerCraft A1 + A2) landed before Mark-Ready. ROOT CAUSE: every PR adds a "**X in flight (this PR)**" entry at the SAME insertion line in CLAUDE.md §Phase status + AGENTS.md §Phase + version state (just before "**Next deliverables**" / "## Claude-Code-specific tooling"). Two parallel PRs both touch that single line → `git merge` cannot auto-resolve two distinct adds at the same line → `mergeable_state: dirty`. The conflict is BENIGN (both adds are legitimate; resolution is always "keep both in chronological order") but git can't auto-detect that. THREE COMPANION EDITS bundled here: (a) CLAUDE.md §Conventions — new bullet "Rebase onto origin/main before flipping any PR Draft → Ready". Includes the operational `git fetch origin main && git rebase origin/main` recipe + the "keep both entries in chronological order (older PR first, your entry second)" resolution discipline. (b) CLAUDE.md §Gotchas — new "Parallel-PR §Phase status collision pattern" entry recording the recurring symptom (PR #230 ×2 on 2026-05-24 against PR #229 + PR #232 + PR #233) + the local-mitigation workflow + a forward-looking structural follow-up note: consider moving the "in flight" sub-block to a side file (PHASE_STATUS_INFLIGHT.md) that's append-only- per-PR; on merge a housekeeping commit moves the entry into CLAUDE.md proper. Not yet adopted — trade-off is extra discipline at merge time. (c) AGENTS.md mirror — cross-tool equivalent for Copilot / Cursor / Devin contributors, emphasising the rebase-before-Mark-Ready discipline applies to ALL agent runtimes (not Claude-Code- specific). ALSO bundled as a 3rd in-flight entry in the §Phase status block documenting THIS PR's scope expansion (Parts 1+2+3 now). No infrastructure / CI / behavior change — pure doc discipline. Future contributors who hit the same conflict find the §Gotchas entry + the §Conventions recipe + AGENTS.md note and resolve in seconds, instead of re-discovering the pattern. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4

…per edgar-debugger live-XML verify The literature-searcher proof-of-life on 2026-05-23 (session 4) flagged a precision gap: the form4 module docstrings referenced the colloquial <rule10b5_1> XML tag name (a label, not the actual SEC EDGAR Ownership XML element). edgar-debugger re-audit on 2026-05-24 (session 5) fetched 3 live AAPL Form 4 XMLs from SEC EDGAR and confirmed the actual element is <aff10b5One> at ownershipDocument/aff10b5One — a DOCUMENT-LEVEL boolean (one per filing, covering ALL transactions in that Form 4), NOT a per-transaction attribute. Updated 7 references to use the canonical name: 5 code spots: - compute/scoring/form4_insider.py:20 (module docstring header) - compute/scoring/form4_insider.py:82 (§Footnote resolution) - compute/scoring/form4_insider.py:587 (inline _form4_to_transactions comment) - compute/scoring/form4_signals.py:129 (module docstring) - compute/output/schemas.py:200 (extreme_estimate_majority_count comment context) 2 doc spots: - CLAUDE.md §Phase status PR #224 entry "Access-path caveat" block (line 1352-1359) - AGENTS.md §Phase + version state PR #224 entry "Access-path caveat for cross-tool agents" block (line 947-951) Added §Footnote resolution architectural-gap note in compute/scoring/form4_insider.py: a filer who checks <aff10b5One>true at the document level but does NOT include the per-transaction footnote text (valid — the checkbox IS the formal affirmative defense, footnotes are supplemental) will slip past the current footnote-text fallback path and enter the opportunistic cohort incorrectly. Deferred to a follow-up PR that direct-parses the raw Form 4 XML for ownershipDocument/aff10b5One (edgartools doesn't expose it, so the implementation walks the filing text once per Ownership object). Tracked as the architectural follow-up after PR 4-eq. edgartools 5.31.5 verified as PyPI latest by edgar-debugger 2026-05-24 — no newer release adds a parse path for <aff10b5One>. detect_10b5_1_plan regex set (6 patterns) confirmed complete vs real-world footnote text — pattern "10b5-1" alone covers every common variant including Rule 10b5-1(c) subsection cites. No additions needed. Docstring-only PR — no compute / scoring / valuation / behavior change. schema_check clean (comment-only edit on schemas.py), ruff clean, tests unchanged at 1168+. CLAUDE.md + AGENTS.md lockstep satisfied via §Phase status in-flight notes. Companion stale-header reword: marked PR #229 "Security WARN cleanup" → "merged via PR #229" in CLAUDE.md + AGENTS.md. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4

…sion §Gotcha Closes the recurring "merge ไม่ได้ หลังแก้แล้วกลับมาเป็นอีก" pattern surfaced 2026-05-24: PR #230 hit mergeable_state=dirty twice in a single session — first when PR #229 (security WARN cleanup) landed on main mid-iteration, then again when PR #232 + PR #233 (LedgerCraft A1 + A2) landed before Mark-Ready. ROOT CAUSE: every PR adds a "**X in flight (this PR)**" entry at the SAME insertion line in CLAUDE.md §Phase status + AGENTS.md §Phase + version state (just before "**Next deliverables**" / "## Claude-Code-specific tooling"). Two parallel PRs both touch that single line → `git merge` cannot auto-resolve two distinct adds at the same line → `mergeable_state: dirty`. The conflict is BENIGN (both adds are legitimate; resolution is always "keep both in chronological order") but git can't auto-detect that. THREE COMPANION EDITS bundled here: (a) CLAUDE.md §Conventions — new bullet "Rebase onto origin/main before flipping any PR Draft → Ready". Includes the operational `git fetch origin main && git rebase origin/main` recipe + the "keep both entries in chronological order (older PR first, your entry second)" resolution discipline. (b) CLAUDE.md §Gotchas — new "Parallel-PR §Phase status collision pattern" entry recording the recurring symptom (PR #230 ×2 on 2026-05-24 against PR #229 + PR #232 + PR #233) + the local-mitigation workflow + a forward-looking structural follow-up note: consider moving the "in flight" sub-block to a side file (PHASE_STATUS_INFLIGHT.md) that's append-only- per-PR; on merge a housekeeping commit moves the entry into CLAUDE.md proper. Not yet adopted — trade-off is extra discipline at merge time. (c) AGENTS.md mirror — cross-tool equivalent for Copilot / Cursor / Devin contributors, emphasising the rebase-before-Mark-Ready discipline applies to ALL agent runtimes (not Claude-Code- specific). ALSO bundled as a 3rd in-flight entry in the §Phase status block documenting THIS PR's scope expansion (Parts 1+2+3 now). No infrastructure / CI / behavior change — pure doc discipline. Future contributors who hit the same conflict find the §Gotchas entry + the §Conventions recipe + AGENTS.md note and resolve in seconds, instead of re-discovering the pattern. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4

…nent 45-min simulate fix (#230) * docs(form4): correct <rule10b5_1> → <aff10b5One> docstring precision per edgar-debugger live-XML verify The literature-searcher proof-of-life on 2026-05-23 (session 4) flagged a precision gap: the form4 module docstrings referenced the colloquial <rule10b5_1> XML tag name (a label, not the actual SEC EDGAR Ownership XML element). edgar-debugger re-audit on 2026-05-24 (session 5) fetched 3 live AAPL Form 4 XMLs from SEC EDGAR and confirmed the actual element is <aff10b5One> at ownershipDocument/aff10b5One — a DOCUMENT-LEVEL boolean (one per filing, covering ALL transactions in that Form 4), NOT a per-transaction attribute. Updated 7 references to use the canonical name: 5 code spots: - compute/scoring/form4_insider.py:20 (module docstring header) - compute/scoring/form4_insider.py:82 (§Footnote resolution) - compute/scoring/form4_insider.py:587 (inline _form4_to_transactions comment) - compute/scoring/form4_signals.py:129 (module docstring) - compute/output/schemas.py:200 (extreme_estimate_majority_count comment context) 2 doc spots: - CLAUDE.md §Phase status PR #224 entry "Access-path caveat" block (line 1352-1359) - AGENTS.md §Phase + version state PR #224 entry "Access-path caveat for cross-tool agents" block (line 947-951) Added §Footnote resolution architectural-gap note in compute/scoring/form4_insider.py: a filer who checks <aff10b5One>true at the document level but does NOT include the per-transaction footnote text (valid — the checkbox IS the formal affirmative defense, footnotes are supplemental) will slip past the current footnote-text fallback path and enter the opportunistic cohort incorrectly. Deferred to a follow-up PR that direct-parses the raw Form 4 XML for ownershipDocument/aff10b5One (edgartools doesn't expose it, so the implementation walks the filing text once per Ownership object). Tracked as the architectural follow-up after PR 4-eq. edgartools 5.31.5 verified as PyPI latest by edgar-debugger 2026-05-24 — no newer release adds a parse path for <aff10b5One>. detect_10b5_1_plan regex set (6 patterns) confirmed complete vs real-world footnote text — pattern "10b5-1" alone covers every common variant including Rule 10b5-1(c) subsection cites. No additions needed. Docstring-only PR — no compute / scoring / valuation / behavior change. schema_check clean (comment-only edit on schemas.py), ruff clean, tests unchanged at 1168+. CLAUDE.md + AGENTS.md lockstep satisfied via §Phase status in-flight notes. Companion stale-header reword: marked PR #229 "Security WARN cleanup" → "merged via PR #229" in CLAUDE.md + AGENTS.md. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4 * ci(simulate): permanent 45-min cancellation fix — wire QR_SKIP_TIER2 + cache + path filter The docstring fix above triggered simulate (path filter matched compute/scoring/**) which cancelled at 45m02s on the timeout cap. ci-triage-engineer deep-dive (session 5, 2026-05-24) identified the recurring pattern across multiple PRs: the QR_SKIP_TIER2 kill-switch fully wired in compute/scoring/tier2.py:158 was NEVER set in .github/workflows/pre-merge-prod-sim.yml. Past mitigations: - PR #165 (1y non-reliance lookback): addressed a specific 5-year EDGAR backfill spike on one PR but left the universe-wide Tier-2 loop intact - PR-form4-2 (FORM4_FETCH_SKIP=1): saved form4 budget but left the Tier-2 (502 tickers × 10-K text fetch + 8-K fetch) running unconditionally (20-35m cold-cache cost) Recurrence tally on last 5 simulate runs: PR #165 19m01s ✅ warm PR #204 19m37s ✅ warm PR #222 16m56s ✅ warm PR #224 17m08s ✅ warm PR #230 45m15s ❌ CANCELLED (cold) Pattern is structural — when GitHub evicts the warm cache after a 7-day gap without a simulate-triggering PR, the next compute/scoring PR hits full cold. The 4-part permanent fix: (a) QR_SKIP_TIER2: "1" added to pre-merge-prod-sim.yml env block PRIMARY fix — eliminates the 20-35m cold-cache Tier-2 cost. Tradeoff: the non_reliance veto (enabled since PR 4g) is suppressed in simulate's diff output. Acceptable because simulate is informational-only (per workflow comment line 24); non_reliance veto correctness is covered by offline pytest. (b) compute/cache/edgar_form4 added to BOTH workflows' cache paths (compute-rankings.yml save+restore, pre-merge-prod-sim.yml restore only). Future-proofs for Phase 4.5e PR 5 when form4 weight promotion enables FORM4_FETCH_SKIP=0 on simulate. (c) Path filter widened in pre-merge-prod-sim.yml from compute/scoring/** + compute/features/** only to also include: - compute/ingest/** (fundamentals fetcher regressions) - compute/valuation/** (RIM / DCF / ensemble regressions) - compute/output/schemas.py (Pydantic schema bumps) - compute/main.py (orchestrator changes) - pyproject.toml (dep bumps affecting compute) (d) compute/scoring/tier2.py:154-155 docstring corrected The old comment said "_EIGHT_K_DEFENSES_ENABLED=False" but PR 4g (2026-05-17) flipped it to True. The non_reliance veto IS the active veto path now; updated comment explains the veto suppression tradeoff in simulate context. Expected outcomes: - Docstring-only PR in compute/scoring/ → simulate completes in ~12-15m warm or ~17-20m partial-cold (well under 45m cap) - Real scoring PR → same budget; composite-score diff unchanged - Cache eviction (7-day gap) → worst case fundamentals cold ~10-20m + no Tier-2 + no form4 = under cap - Weekly cron UNAFFECTED — QR_SKIP_TIER2 not set there; full run continues to populate the warm cache Companion benefit: this PR's re-pushed simulate is the LIVE VALIDATION of the fix — first sub-45m simulate on this branch proves the fix works. Files touched: - .github/workflows/pre-merge-prod-sim.yml (env + cache + paths) - .github/workflows/compute-rankings.yml (cache path only) - compute/scoring/tier2.py (stale comment) - CLAUDE.md + AGENTS.md (§Phase status note) YAML parse-verified; ruff clean; no compute / schema / scoring / valuation / Python behavior change. CLAUDE.md + AGENTS.md lockstep satisfied via the §Phase status notes. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4 * docs(workflow): rebase-discipline § + parallel-PR §Phase status collision §Gotcha Closes the recurring "merge ไม่ได้ หลังแก้แล้วกลับมาเป็นอีก" pattern surfaced 2026-05-24: PR #230 hit mergeable_state=dirty twice in a single session — first when PR #229 (security WARN cleanup) landed on main mid-iteration, then again when PR #232 + PR #233 (LedgerCraft A1 + A2) landed before Mark-Ready. ROOT CAUSE: every PR adds a "**X in flight (this PR)**" entry at the SAME insertion line in CLAUDE.md §Phase status + AGENTS.md §Phase + version state (just before "**Next deliverables**" / "## Claude-Code-specific tooling"). Two parallel PRs both touch that single line → `git merge` cannot auto-resolve two distinct adds at the same line → `mergeable_state: dirty`. The conflict is BENIGN (both adds are legitimate; resolution is always "keep both in chronological order") but git can't auto-detect that. THREE COMPANION EDITS bundled here: (a) CLAUDE.md §Conventions — new bullet "Rebase onto origin/main before flipping any PR Draft → Ready". Includes the operational `git fetch origin main && git rebase origin/main` recipe + the "keep both entries in chronological order (older PR first, your entry second)" resolution discipline. (b) CLAUDE.md §Gotchas — new "Parallel-PR §Phase status collision pattern" entry recording the recurring symptom (PR #230 ×2 on 2026-05-24 against PR #229 + PR #232 + PR #233) + the local-mitigation workflow + a forward-looking structural follow-up note: consider moving the "in flight" sub-block to a side file (PHASE_STATUS_INFLIGHT.md) that's append-only- per-PR; on merge a housekeeping commit moves the entry into CLAUDE.md proper. Not yet adopted — trade-off is extra discipline at merge time. (c) AGENTS.md mirror — cross-tool equivalent for Copilot / Cursor / Devin contributors, emphasising the rebase-before-Mark-Ready discipline applies to ALL agent runtimes (not Claude-Code- specific). ALSO bundled as a 3rd in-flight entry in the §Phase status block documenting THIS PR's scope expansion (Parts 1+2+3 now). No infrastructure / CI / behavior change — pure doc discipline. Future contributors who hit the same conflict find the §Gotchas entry + the §Conventions recipe + AGENTS.md note and resolve in seconds, instead of re-discovering the pattern. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4 * ci(simulate): Part 4 — wire QR_SKIP_FUNDAMENTALS to actually close the 45m cap Part 2's QR_SKIP_TIER2 fix was necessary but NOT sufficient. Re-pushing the rebased branch on 2026-05-24 07:06 UTC still hit simulate cancelled at 45m15s. ci-triage-engineer deep-dive #2 identified the gap: compute/main.py has THREE independent SEC EDGAR ThreadPoolExecutor loops, and Part 2 only killed ONE (Step 4b — Tier-2): Step 2 (compute/main.py:717) — fundamentals snapshot (502 tickers × SEC companyfacts XBRL); NOT skipped by QR_SKIP_TIER2 Step 3 (compute/main.py:785) — annual fundamentals history; same Step 4b (compute/main.py:972) — Tier-2 / 8-K; killed by Part 2 Cold-cache cost of Steps 2+3 alone is 25-50m per CLAUDE.md §Gotchas — enough to fill the entire 45m simulate budget. Worse, the cache freshness gate at compute/ingest/fundamentals.py:917 (_is_fresh()) checks the filed_date INSIDE each cached parquet against FUNDAMENTALS_REFETCH_DAYS=45. On a partial cache hit, any ticker with a > 45d-old most-recent filing still triggers a live EDGAR round-trip. Part 4 fix — QR_SKIP_FUNDAMENTALS=1 escape hatch: (a) compute/ingest/fundamentals.py:fetch_fundamentals — env-var check added at the top of the function (BEFORE _require_identity()), so the cached snapshot is returned unconditionally when set, without a freshness check OR an EDGAR identity requirement. Falls through to live fetch if no cache exists (cold runner has nothing to read). (b) compute/ingest/fundamentals.py:fetch_fundamentals_history — mirror pattern: returns the cached annual parquet without the 180d age check when QR_SKIP_FUNDAMENTALS=1 AND force_refresh=False. Same fallthrough. (c) .github/workflows/pre-merge-prod-sim.yml — adds QR_SKIP_FUNDAMENTALS: "1" to the env block alongside QR_SKIP_TIER2 and FORM4_FETCH_SKIP. Inline comment explains the root cause + the safety reasoning. SAFE for simulate because: - The workflow's purpose is composite-score-DIFF detection vs main's COMMITTED rankings.json - Both sides (PR branch + main) were produced from the same upstream fundamentals input (the cache the weekly cron wrote) - Using that cache without re-fetch is the CORRECT input for the diff - It's NOT a quality compromise — re-fetching live would give the PR branch newer data than main has, making the diff confusing rather than accurate Weekly cron (compute-rankings.yml) does NOT set QR_SKIP_FUNDAMENTALS — full live fetch still runs there and populates the warm cache for future simulate restores. Expected outcome: simulate completes in 8-12m on a cache-hit restore (no live SEC fetch in any of Steps 2, 3, or 4b). Live validation: re-push after this commit → simulate green under 25m on this PR. Tests verified: tests/test_features/test_fundamentals.py + tests/test_workflow_cache_coverage.py = 24 pass / 6 skipped. ruff clean. YAML parse-verified. schema_check in sync. CLAUDE.md §Phase status + AGENTS.md §Phase + version state both updated with the Part 4 explanation. Future contributors editing pre-merge-prod-sim.yml should now keep all THREE escape-hatch env vars set together: FORM4_FETCH_SKIP + QR_SKIP_TIER2 + QR_SKIP_FUNDAMENTALS. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4 --------- Co-authored-by: Claude <noreply@anthropic.com>

…fix for parallel-PR collision) (#237) Closes the structural follow-up tracked in PR #230's §Gotchas "Parallel-PR §Phase status collision pattern". PR #230 itself hit the collision 3 times in one session while iterating on the simulate-cap fix — empirical proof that the doc-discipline §Convention landed in PR #230 (rebase-before-Mark-Ready) is necessary but NOT sufficient when 5+ PRs land on main in a single hour. Root cause (pre-2026-05-24): every PR was required to add a "**X in flight (this PR)**" bullet to CLAUDE.md §Phase status + AGENTS.md §Phase + version state per §Conventions "ship with every PR" lockstep. Both blocks were append-shaped, with the new bullet inserted at the SAME line (just before "**Next deliverables**" / "## Claude-Code-specific tooling"). Two PRs opened in parallel both touched that single anchor line → `mergeable_state: dirty` → recurring "merge ไม่ได้ หลังแก้แล้วกลับมาเป็นอีก" frustration. The conflict was BENIGN (both PRs added distinct entries at the same insertion line; resolution was always "keep both") but `git merge` cannot auto-detect that. PR #230 hit this 3 times: - vs PR #229 (security WARN cleanup) mid-iteration - vs PR #232 + #233 (LedgerCraft A1 + A2) before Mark-Ready - vs PR #234 + #235 + #236 (LedgerCraft A3 + B1 + B2+B3+B4) during the simulate-fix re-push loop Structural fix: (1) NEW FILE — `PHASE_STATUS_INFLIGHT.md` at the repo root. Each PR appends its in-flight entry to the END of this file (the "In flight (current)" sub-section). Parallel PRs both append at disjoint last-lines → `git merge` auto-resolves → no more `mergeable_state: dirty` from this pattern. (2) CLAUDE.md §Conventions — "ship with every PR" rule updated to point at PHASE_STATUS_INFLIGHT.md as the canonical destination for in-flight entries. Substance updates to CLAUDE.md / AGENTS.md sections still land directly (they're rare + cross-PR-coupled by nature). (3) CLAUDE.md §Gotchas "Parallel-PR §Phase status collision pattern" marked as RESOLVED 2026-05-24 with a back-reference to the side-file adoption. The historical-symptom record stays for future readers' context. (4) AGENTS.md mirror — same updates for cross-tool agents (Copilot / Cursor / Devin / VS Code Agent Mode). The side-file works regardless of which agent runtime authors the PR. Trade-offs: - Housekeeping workflow to drain merged entries from PHASE_STATUS_INFLIGHT.md into CLAUDE.md §Phase status proper DEFERRED — manual cleanup is fine for the first few weeks while the pattern proves itself. `tools/housekeep_phase_status.py` may land later once volume + shape are known. - The rebase-before-Mark-Ready discipline (PR #230 §Conventions bullet) still applies as a backstop for OTHER conflict surfaces (shared code edits, workflow YAML, schema bumps) — it's no longer the recurring drag it used to be, but still good discipline. No compute / schema / scoring / valuation / frontend / Python / TS code change. Doc-only PR. ruff clean. schema_check clean. CLAUDE.md + AGENTS.md lockstep satisfied via §Conventions + §Gotchas substance updates (the "ship with every PR" rule now points at the new file as the canonical destination, and this PR's in-flight entry lives in PHASE_STATUS_INFLIGHT.md per the new convention — dogfooding from commit #1). https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4 Co-authored-by: Claude <noreply@anthropic.com>

vercel Bot deployed to Preview May 24, 2026 04:02 View deployment

dackclup marked this pull request as ready for review May 24, 2026 04:09

dackclup merged commit dacf293 into main May 24, 2026
4 checks passed

dackclup deleted the claude/eager-bohr-12bQi branch May 24, 2026 05:19

dackclup mentioned this pull request May 24, 2026

docs(form4)+ci(simulate): correct <rule10b5_1> → <aff10b5One> + wire QR_SKIP_TIER2 (permanent 45-min cap fix) #230

Merged

9 tasks

dackclup mentioned this pull request May 24, 2026

docs(workflow): adopt PHASE_STATUS_INFLIGHT.md side-file (structural fix for parallel-PR collision) #237

Merged

8 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

security(W2+W4): workflow-perm narrowing + log-bash secret scrub#229

security(W2+W4): workflow-perm narrowing + log-bash secret scrub#229
dackclup merged 1 commit into
mainfrom
claude/eager-bohr-12bQi

dackclup commented May 24, 2026

Uh oh!

vercel Bot commented May 24, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dackclup commented May 24, 2026

Summary

W2 — compute-rankings.yml workflow-perm narrowing

W4 — log-bash.sh inline-credential scrub

Files touched

Test plan

Out of scope (still deferred)

Uh oh!

vercel Bot commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

W2 — `compute-rankings.yml` workflow-perm narrowing

W4 — `log-bash.sh` inline-credential scrub

vercel Bot commented May 24, 2026 •

edited

Loading