security(W2+W4): workflow-perm narrowing + log-bash secret scrub#229
Merged
Conversation
Closes the 2 remaining security-reviewer WARNs deferred from PR #226 (W1 + W3 shipped there; W2 + W4 bundled here as one focused security-hardening PR). Both fixes are operational hygiene with no compute / schema / production-code surface. W2 — .github/workflows/compute-rankings.yml workflow-perm narrowing --- The workflow-level `permissions: contents: write` was wider than needed — only the final `Commit JSON outputs` step inside the `compute:` job is a writer. Fix: - Workflow-level default narrowed to `contents: read` (least privilege baseline; any future job added to this file inherits read unless it explicitly opts up) - The `compute:` job explicitly opts up to `contents: write` so the `git commit + push` step still works Behavior unchanged for the compute job; only the default surface shrinks. YAML parse-verified clean. W4 — .claude/hooks/log-bash.sh inline-credential scrub --- Previously the hook appended the raw Bash command (including any inline env-var dereference like `EDGAR_USER_AGENT=foo python ...` or an `Authorization: Bearer <tok>` header) to gitignored `.claude/session.log`. Severity is LOW because the file is gitignored + local-only, but an accidental `cat .claude/session.log` during a screen-share or pasted into a gist could leak the credential. Fix: `sed` pre-filter redacts the value half of common secret-prefix tokens before appending; the prefix stays so the log is still readable for "which integration was being called" debugging. Covered prefixes: - GitHub: ghp_ / gho_ / ghu_ / ghs_ / ghr_ / github_pat_ - Anthropic: sk-ant-api* - OpenAI generic: sk-* - AWS: AKIA* / ASIA* - Google API: AIza* - Slack: xox[abprs]-* - Bare `Bearer <tok>` / `Authorization: Bearer <tok>` headers Manual scrub test: in: curl -H "Authorization: Bearer ghp_abcdef123...890" ... out: curl -H "Authorization: Bearer ghp_[REDACTED]" ... Fail-open discipline preserved — if `sed` errors (unlikely), the existing `|| true` falls back to the original command, never blocks the hook. Defense-in-depth: the gitignore remains the primary guard; this shrinks the blast radius of an accidental local leak. CLAUDE.md + AGENTS.md lockstep satisfied via §Phase status in-flight notes (no new §Gotchas — these are not invariants future code authors need to remember; they're hardening of existing surfaces). https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
9 tasks
dackclup
pushed a commit
that referenced
this pull request
May 24, 2026
…per edgar-debugger live-XML verify The literature-searcher proof-of-life on 2026-05-23 (session 4) flagged a precision gap: the form4 module docstrings referenced the colloquial <rule10b5_1> XML tag name (a label, not the actual SEC EDGAR Ownership XML element). edgar-debugger re-audit on 2026-05-24 (session 5) fetched 3 live AAPL Form 4 XMLs from SEC EDGAR and confirmed the actual element is <aff10b5One> at ownershipDocument/aff10b5One — a DOCUMENT-LEVEL boolean (one per filing, covering ALL transactions in that Form 4), NOT a per-transaction attribute. Updated 7 references to use the canonical name: 5 code spots: - compute/scoring/form4_insider.py:20 (module docstring header) - compute/scoring/form4_insider.py:82 (§Footnote resolution) - compute/scoring/form4_insider.py:587 (inline _form4_to_transactions comment) - compute/scoring/form4_signals.py:129 (module docstring) - compute/output/schemas.py:200 (extreme_estimate_majority_count comment context) 2 doc spots: - CLAUDE.md §Phase status PR #224 entry "Access-path caveat" block (line 1352-1359) - AGENTS.md §Phase + version state PR #224 entry "Access-path caveat for cross-tool agents" block (line 947-951) Added §Footnote resolution architectural-gap note in compute/scoring/form4_insider.py: a filer who checks <aff10b5One>true at the document level but does NOT include the per-transaction footnote text (valid — the checkbox IS the formal affirmative defense, footnotes are supplemental) will slip past the current footnote-text fallback path and enter the opportunistic cohort incorrectly. Deferred to a follow-up PR that direct-parses the raw Form 4 XML for ownershipDocument/aff10b5One (edgartools doesn't expose it, so the implementation walks the filing text once per Ownership object). Tracked as the architectural follow-up after PR 4-eq. edgartools 5.31.5 verified as PyPI latest by edgar-debugger 2026-05-24 — no newer release adds a parse path for <aff10b5One>. detect_10b5_1_plan regex set (6 patterns) confirmed complete vs real-world footnote text — pattern "10b5-1" alone covers every common variant including Rule 10b5-1(c) subsection cites. No additions needed. Docstring-only PR — no compute / scoring / valuation / behavior change. schema_check clean (comment-only edit on schemas.py), ruff clean, tests unchanged at 1168+. CLAUDE.md + AGENTS.md lockstep satisfied via §Phase status in-flight notes. Companion stale-header reword: marked PR #229 "Security WARN cleanup" → "merged via PR #229" in CLAUDE.md + AGENTS.md. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4
dackclup
pushed a commit
that referenced
this pull request
May 24, 2026
…sion §Gotcha Closes the recurring "merge ไม่ได้ หลังแก้แล้วกลับมาเป็นอีก" pattern surfaced 2026-05-24: PR #230 hit mergeable_state=dirty twice in a single session — first when PR #229 (security WARN cleanup) landed on main mid-iteration, then again when PR #232 + PR #233 (LedgerCraft A1 + A2) landed before Mark-Ready. ROOT CAUSE: every PR adds a "**X in flight (this PR)**" entry at the SAME insertion line in CLAUDE.md §Phase status + AGENTS.md §Phase + version state (just before "**Next deliverables**" / "## Claude-Code-specific tooling"). Two parallel PRs both touch that single line → `git merge` cannot auto-resolve two distinct adds at the same line → `mergeable_state: dirty`. The conflict is BENIGN (both adds are legitimate; resolution is always "keep both in chronological order") but git can't auto-detect that. THREE COMPANION EDITS bundled here: (a) CLAUDE.md §Conventions — new bullet "Rebase onto origin/main before flipping any PR Draft → Ready". Includes the operational `git fetch origin main && git rebase origin/main` recipe + the "keep both entries in chronological order (older PR first, your entry second)" resolution discipline. (b) CLAUDE.md §Gotchas — new "Parallel-PR §Phase status collision pattern" entry recording the recurring symptom (PR #230 ×2 on 2026-05-24 against PR #229 + PR #232 + PR #233) + the local-mitigation workflow + a forward-looking structural follow-up note: consider moving the "in flight" sub-block to a side file (PHASE_STATUS_INFLIGHT.md) that's append-only- per-PR; on merge a housekeeping commit moves the entry into CLAUDE.md proper. Not yet adopted — trade-off is extra discipline at merge time. (c) AGENTS.md mirror — cross-tool equivalent for Copilot / Cursor / Devin contributors, emphasising the rebase-before-Mark-Ready discipline applies to ALL agent runtimes (not Claude-Code- specific). ALSO bundled as a 3rd in-flight entry in the §Phase status block documenting THIS PR's scope expansion (Parts 1+2+3 now). No infrastructure / CI / behavior change — pure doc discipline. Future contributors who hit the same conflict find the §Gotchas entry + the §Conventions recipe + AGENTS.md note and resolve in seconds, instead of re-discovering the pattern. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4
dackclup
pushed a commit
that referenced
this pull request
May 24, 2026
…per edgar-debugger live-XML verify The literature-searcher proof-of-life on 2026-05-23 (session 4) flagged a precision gap: the form4 module docstrings referenced the colloquial <rule10b5_1> XML tag name (a label, not the actual SEC EDGAR Ownership XML element). edgar-debugger re-audit on 2026-05-24 (session 5) fetched 3 live AAPL Form 4 XMLs from SEC EDGAR and confirmed the actual element is <aff10b5One> at ownershipDocument/aff10b5One — a DOCUMENT-LEVEL boolean (one per filing, covering ALL transactions in that Form 4), NOT a per-transaction attribute. Updated 7 references to use the canonical name: 5 code spots: - compute/scoring/form4_insider.py:20 (module docstring header) - compute/scoring/form4_insider.py:82 (§Footnote resolution) - compute/scoring/form4_insider.py:587 (inline _form4_to_transactions comment) - compute/scoring/form4_signals.py:129 (module docstring) - compute/output/schemas.py:200 (extreme_estimate_majority_count comment context) 2 doc spots: - CLAUDE.md §Phase status PR #224 entry "Access-path caveat" block (line 1352-1359) - AGENTS.md §Phase + version state PR #224 entry "Access-path caveat for cross-tool agents" block (line 947-951) Added §Footnote resolution architectural-gap note in compute/scoring/form4_insider.py: a filer who checks <aff10b5One>true at the document level but does NOT include the per-transaction footnote text (valid — the checkbox IS the formal affirmative defense, footnotes are supplemental) will slip past the current footnote-text fallback path and enter the opportunistic cohort incorrectly. Deferred to a follow-up PR that direct-parses the raw Form 4 XML for ownershipDocument/aff10b5One (edgartools doesn't expose it, so the implementation walks the filing text once per Ownership object). Tracked as the architectural follow-up after PR 4-eq. edgartools 5.31.5 verified as PyPI latest by edgar-debugger 2026-05-24 — no newer release adds a parse path for <aff10b5One>. detect_10b5_1_plan regex set (6 patterns) confirmed complete vs real-world footnote text — pattern "10b5-1" alone covers every common variant including Rule 10b5-1(c) subsection cites. No additions needed. Docstring-only PR — no compute / scoring / valuation / behavior change. schema_check clean (comment-only edit on schemas.py), ruff clean, tests unchanged at 1168+. CLAUDE.md + AGENTS.md lockstep satisfied via §Phase status in-flight notes. Companion stale-header reword: marked PR #229 "Security WARN cleanup" → "merged via PR #229" in CLAUDE.md + AGENTS.md. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4
dackclup
pushed a commit
that referenced
this pull request
May 24, 2026
…sion §Gotcha Closes the recurring "merge ไม่ได้ หลังแก้แล้วกลับมาเป็นอีก" pattern surfaced 2026-05-24: PR #230 hit mergeable_state=dirty twice in a single session — first when PR #229 (security WARN cleanup) landed on main mid-iteration, then again when PR #232 + PR #233 (LedgerCraft A1 + A2) landed before Mark-Ready. ROOT CAUSE: every PR adds a "**X in flight (this PR)**" entry at the SAME insertion line in CLAUDE.md §Phase status + AGENTS.md §Phase + version state (just before "**Next deliverables**" / "## Claude-Code-specific tooling"). Two parallel PRs both touch that single line → `git merge` cannot auto-resolve two distinct adds at the same line → `mergeable_state: dirty`. The conflict is BENIGN (both adds are legitimate; resolution is always "keep both in chronological order") but git can't auto-detect that. THREE COMPANION EDITS bundled here: (a) CLAUDE.md §Conventions — new bullet "Rebase onto origin/main before flipping any PR Draft → Ready". Includes the operational `git fetch origin main && git rebase origin/main` recipe + the "keep both entries in chronological order (older PR first, your entry second)" resolution discipline. (b) CLAUDE.md §Gotchas — new "Parallel-PR §Phase status collision pattern" entry recording the recurring symptom (PR #230 ×2 on 2026-05-24 against PR #229 + PR #232 + PR #233) + the local-mitigation workflow + a forward-looking structural follow-up note: consider moving the "in flight" sub-block to a side file (PHASE_STATUS_INFLIGHT.md) that's append-only- per-PR; on merge a housekeeping commit moves the entry into CLAUDE.md proper. Not yet adopted — trade-off is extra discipline at merge time. (c) AGENTS.md mirror — cross-tool equivalent for Copilot / Cursor / Devin contributors, emphasising the rebase-before-Mark-Ready discipline applies to ALL agent runtimes (not Claude-Code- specific). ALSO bundled as a 3rd in-flight entry in the §Phase status block documenting THIS PR's scope expansion (Parts 1+2+3 now). No infrastructure / CI / behavior change — pure doc discipline. Future contributors who hit the same conflict find the §Gotchas entry + the §Conventions recipe + AGENTS.md note and resolve in seconds, instead of re-discovering the pattern. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4
dackclup
pushed a commit
that referenced
this pull request
May 24, 2026
…per edgar-debugger live-XML verify The literature-searcher proof-of-life on 2026-05-23 (session 4) flagged a precision gap: the form4 module docstrings referenced the colloquial <rule10b5_1> XML tag name (a label, not the actual SEC EDGAR Ownership XML element). edgar-debugger re-audit on 2026-05-24 (session 5) fetched 3 live AAPL Form 4 XMLs from SEC EDGAR and confirmed the actual element is <aff10b5One> at ownershipDocument/aff10b5One — a DOCUMENT-LEVEL boolean (one per filing, covering ALL transactions in that Form 4), NOT a per-transaction attribute. Updated 7 references to use the canonical name: 5 code spots: - compute/scoring/form4_insider.py:20 (module docstring header) - compute/scoring/form4_insider.py:82 (§Footnote resolution) - compute/scoring/form4_insider.py:587 (inline _form4_to_transactions comment) - compute/scoring/form4_signals.py:129 (module docstring) - compute/output/schemas.py:200 (extreme_estimate_majority_count comment context) 2 doc spots: - CLAUDE.md §Phase status PR #224 entry "Access-path caveat" block (line 1352-1359) - AGENTS.md §Phase + version state PR #224 entry "Access-path caveat for cross-tool agents" block (line 947-951) Added §Footnote resolution architectural-gap note in compute/scoring/form4_insider.py: a filer who checks <aff10b5One>true at the document level but does NOT include the per-transaction footnote text (valid — the checkbox IS the formal affirmative defense, footnotes are supplemental) will slip past the current footnote-text fallback path and enter the opportunistic cohort incorrectly. Deferred to a follow-up PR that direct-parses the raw Form 4 XML for ownershipDocument/aff10b5One (edgartools doesn't expose it, so the implementation walks the filing text once per Ownership object). Tracked as the architectural follow-up after PR 4-eq. edgartools 5.31.5 verified as PyPI latest by edgar-debugger 2026-05-24 — no newer release adds a parse path for <aff10b5One>. detect_10b5_1_plan regex set (6 patterns) confirmed complete vs real-world footnote text — pattern "10b5-1" alone covers every common variant including Rule 10b5-1(c) subsection cites. No additions needed. Docstring-only PR — no compute / scoring / valuation / behavior change. schema_check clean (comment-only edit on schemas.py), ruff clean, tests unchanged at 1168+. CLAUDE.md + AGENTS.md lockstep satisfied via §Phase status in-flight notes. Companion stale-header reword: marked PR #229 "Security WARN cleanup" → "merged via PR #229" in CLAUDE.md + AGENTS.md. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4
dackclup
pushed a commit
that referenced
this pull request
May 24, 2026
…sion §Gotcha Closes the recurring "merge ไม่ได้ หลังแก้แล้วกลับมาเป็นอีก" pattern surfaced 2026-05-24: PR #230 hit mergeable_state=dirty twice in a single session — first when PR #229 (security WARN cleanup) landed on main mid-iteration, then again when PR #232 + PR #233 (LedgerCraft A1 + A2) landed before Mark-Ready. ROOT CAUSE: every PR adds a "**X in flight (this PR)**" entry at the SAME insertion line in CLAUDE.md §Phase status + AGENTS.md §Phase + version state (just before "**Next deliverables**" / "## Claude-Code-specific tooling"). Two parallel PRs both touch that single line → `git merge` cannot auto-resolve two distinct adds at the same line → `mergeable_state: dirty`. The conflict is BENIGN (both adds are legitimate; resolution is always "keep both in chronological order") but git can't auto-detect that. THREE COMPANION EDITS bundled here: (a) CLAUDE.md §Conventions — new bullet "Rebase onto origin/main before flipping any PR Draft → Ready". Includes the operational `git fetch origin main && git rebase origin/main` recipe + the "keep both entries in chronological order (older PR first, your entry second)" resolution discipline. (b) CLAUDE.md §Gotchas — new "Parallel-PR §Phase status collision pattern" entry recording the recurring symptom (PR #230 ×2 on 2026-05-24 against PR #229 + PR #232 + PR #233) + the local-mitigation workflow + a forward-looking structural follow-up note: consider moving the "in flight" sub-block to a side file (PHASE_STATUS_INFLIGHT.md) that's append-only- per-PR; on merge a housekeeping commit moves the entry into CLAUDE.md proper. Not yet adopted — trade-off is extra discipline at merge time. (c) AGENTS.md mirror — cross-tool equivalent for Copilot / Cursor / Devin contributors, emphasising the rebase-before-Mark-Ready discipline applies to ALL agent runtimes (not Claude-Code- specific). ALSO bundled as a 3rd in-flight entry in the §Phase status block documenting THIS PR's scope expansion (Parts 1+2+3 now). No infrastructure / CI / behavior change — pure doc discipline. Future contributors who hit the same conflict find the §Gotchas entry + the §Conventions recipe + AGENTS.md note and resolve in seconds, instead of re-discovering the pattern. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4
dackclup
added a commit
that referenced
this pull request
May 24, 2026
…nent 45-min simulate fix (#230) * docs(form4): correct <rule10b5_1> → <aff10b5One> docstring precision per edgar-debugger live-XML verify The literature-searcher proof-of-life on 2026-05-23 (session 4) flagged a precision gap: the form4 module docstrings referenced the colloquial <rule10b5_1> XML tag name (a label, not the actual SEC EDGAR Ownership XML element). edgar-debugger re-audit on 2026-05-24 (session 5) fetched 3 live AAPL Form 4 XMLs from SEC EDGAR and confirmed the actual element is <aff10b5One> at ownershipDocument/aff10b5One — a DOCUMENT-LEVEL boolean (one per filing, covering ALL transactions in that Form 4), NOT a per-transaction attribute. Updated 7 references to use the canonical name: 5 code spots: - compute/scoring/form4_insider.py:20 (module docstring header) - compute/scoring/form4_insider.py:82 (§Footnote resolution) - compute/scoring/form4_insider.py:587 (inline _form4_to_transactions comment) - compute/scoring/form4_signals.py:129 (module docstring) - compute/output/schemas.py:200 (extreme_estimate_majority_count comment context) 2 doc spots: - CLAUDE.md §Phase status PR #224 entry "Access-path caveat" block (line 1352-1359) - AGENTS.md §Phase + version state PR #224 entry "Access-path caveat for cross-tool agents" block (line 947-951) Added §Footnote resolution architectural-gap note in compute/scoring/form4_insider.py: a filer who checks <aff10b5One>true at the document level but does NOT include the per-transaction footnote text (valid — the checkbox IS the formal affirmative defense, footnotes are supplemental) will slip past the current footnote-text fallback path and enter the opportunistic cohort incorrectly. Deferred to a follow-up PR that direct-parses the raw Form 4 XML for ownershipDocument/aff10b5One (edgartools doesn't expose it, so the implementation walks the filing text once per Ownership object). Tracked as the architectural follow-up after PR 4-eq. edgartools 5.31.5 verified as PyPI latest by edgar-debugger 2026-05-24 — no newer release adds a parse path for <aff10b5One>. detect_10b5_1_plan regex set (6 patterns) confirmed complete vs real-world footnote text — pattern "10b5-1" alone covers every common variant including Rule 10b5-1(c) subsection cites. No additions needed. Docstring-only PR — no compute / scoring / valuation / behavior change. schema_check clean (comment-only edit on schemas.py), ruff clean, tests unchanged at 1168+. CLAUDE.md + AGENTS.md lockstep satisfied via §Phase status in-flight notes. Companion stale-header reword: marked PR #229 "Security WARN cleanup" → "merged via PR #229" in CLAUDE.md + AGENTS.md. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4 * ci(simulate): permanent 45-min cancellation fix — wire QR_SKIP_TIER2 + cache + path filter The docstring fix above triggered simulate (path filter matched compute/scoring/**) which cancelled at 45m02s on the timeout cap. ci-triage-engineer deep-dive (session 5, 2026-05-24) identified the recurring pattern across multiple PRs: the QR_SKIP_TIER2 kill-switch fully wired in compute/scoring/tier2.py:158 was NEVER set in .github/workflows/pre-merge-prod-sim.yml. Past mitigations: - PR #165 (1y non-reliance lookback): addressed a specific 5-year EDGAR backfill spike on one PR but left the universe-wide Tier-2 loop intact - PR-form4-2 (FORM4_FETCH_SKIP=1): saved form4 budget but left the Tier-2 (502 tickers × 10-K text fetch + 8-K fetch) running unconditionally (20-35m cold-cache cost) Recurrence tally on last 5 simulate runs: PR #165 19m01s ✅ warm PR #204 19m37s ✅ warm PR #222 16m56s ✅ warm PR #224 17m08s ✅ warm PR #230 45m15s ❌ CANCELLED (cold) Pattern is structural — when GitHub evicts the warm cache after a 7-day gap without a simulate-triggering PR, the next compute/scoring PR hits full cold. The 4-part permanent fix: (a) QR_SKIP_TIER2: "1" added to pre-merge-prod-sim.yml env block PRIMARY fix — eliminates the 20-35m cold-cache Tier-2 cost. Tradeoff: the non_reliance veto (enabled since PR 4g) is suppressed in simulate's diff output. Acceptable because simulate is informational-only (per workflow comment line 24); non_reliance veto correctness is covered by offline pytest. (b) compute/cache/edgar_form4 added to BOTH workflows' cache paths (compute-rankings.yml save+restore, pre-merge-prod-sim.yml restore only). Future-proofs for Phase 4.5e PR 5 when form4 weight promotion enables FORM4_FETCH_SKIP=0 on simulate. (c) Path filter widened in pre-merge-prod-sim.yml from compute/scoring/** + compute/features/** only to also include: - compute/ingest/** (fundamentals fetcher regressions) - compute/valuation/** (RIM / DCF / ensemble regressions) - compute/output/schemas.py (Pydantic schema bumps) - compute/main.py (orchestrator changes) - pyproject.toml (dep bumps affecting compute) (d) compute/scoring/tier2.py:154-155 docstring corrected The old comment said "_EIGHT_K_DEFENSES_ENABLED=False" but PR 4g (2026-05-17) flipped it to True. The non_reliance veto IS the active veto path now; updated comment explains the veto suppression tradeoff in simulate context. Expected outcomes: - Docstring-only PR in compute/scoring/ → simulate completes in ~12-15m warm or ~17-20m partial-cold (well under 45m cap) - Real scoring PR → same budget; composite-score diff unchanged - Cache eviction (7-day gap) → worst case fundamentals cold ~10-20m + no Tier-2 + no form4 = under cap - Weekly cron UNAFFECTED — QR_SKIP_TIER2 not set there; full run continues to populate the warm cache Companion benefit: this PR's re-pushed simulate is the LIVE VALIDATION of the fix — first sub-45m simulate on this branch proves the fix works. Files touched: - .github/workflows/pre-merge-prod-sim.yml (env + cache + paths) - .github/workflows/compute-rankings.yml (cache path only) - compute/scoring/tier2.py (stale comment) - CLAUDE.md + AGENTS.md (§Phase status note) YAML parse-verified; ruff clean; no compute / schema / scoring / valuation / Python behavior change. CLAUDE.md + AGENTS.md lockstep satisfied via the §Phase status notes. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4 * docs(workflow): rebase-discipline § + parallel-PR §Phase status collision §Gotcha Closes the recurring "merge ไม่ได้ หลังแก้แล้วกลับมาเป็นอีก" pattern surfaced 2026-05-24: PR #230 hit mergeable_state=dirty twice in a single session — first when PR #229 (security WARN cleanup) landed on main mid-iteration, then again when PR #232 + PR #233 (LedgerCraft A1 + A2) landed before Mark-Ready. ROOT CAUSE: every PR adds a "**X in flight (this PR)**" entry at the SAME insertion line in CLAUDE.md §Phase status + AGENTS.md §Phase + version state (just before "**Next deliverables**" / "## Claude-Code-specific tooling"). Two parallel PRs both touch that single line → `git merge` cannot auto-resolve two distinct adds at the same line → `mergeable_state: dirty`. The conflict is BENIGN (both adds are legitimate; resolution is always "keep both in chronological order") but git can't auto-detect that. THREE COMPANION EDITS bundled here: (a) CLAUDE.md §Conventions — new bullet "Rebase onto origin/main before flipping any PR Draft → Ready". Includes the operational `git fetch origin main && git rebase origin/main` recipe + the "keep both entries in chronological order (older PR first, your entry second)" resolution discipline. (b) CLAUDE.md §Gotchas — new "Parallel-PR §Phase status collision pattern" entry recording the recurring symptom (PR #230 ×2 on 2026-05-24 against PR #229 + PR #232 + PR #233) + the local-mitigation workflow + a forward-looking structural follow-up note: consider moving the "in flight" sub-block to a side file (PHASE_STATUS_INFLIGHT.md) that's append-only- per-PR; on merge a housekeeping commit moves the entry into CLAUDE.md proper. Not yet adopted — trade-off is extra discipline at merge time. (c) AGENTS.md mirror — cross-tool equivalent for Copilot / Cursor / Devin contributors, emphasising the rebase-before-Mark-Ready discipline applies to ALL agent runtimes (not Claude-Code- specific). ALSO bundled as a 3rd in-flight entry in the §Phase status block documenting THIS PR's scope expansion (Parts 1+2+3 now). No infrastructure / CI / behavior change — pure doc discipline. Future contributors who hit the same conflict find the §Gotchas entry + the §Conventions recipe + AGENTS.md note and resolve in seconds, instead of re-discovering the pattern. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4 * ci(simulate): Part 4 — wire QR_SKIP_FUNDAMENTALS to actually close the 45m cap Part 2's QR_SKIP_TIER2 fix was necessary but NOT sufficient. Re-pushing the rebased branch on 2026-05-24 07:06 UTC still hit simulate cancelled at 45m15s. ci-triage-engineer deep-dive #2 identified the gap: compute/main.py has THREE independent SEC EDGAR ThreadPoolExecutor loops, and Part 2 only killed ONE (Step 4b — Tier-2): Step 2 (compute/main.py:717) — fundamentals snapshot (502 tickers × SEC companyfacts XBRL); NOT skipped by QR_SKIP_TIER2 Step 3 (compute/main.py:785) — annual fundamentals history; same Step 4b (compute/main.py:972) — Tier-2 / 8-K; killed by Part 2 Cold-cache cost of Steps 2+3 alone is 25-50m per CLAUDE.md §Gotchas — enough to fill the entire 45m simulate budget. Worse, the cache freshness gate at compute/ingest/fundamentals.py:917 (_is_fresh()) checks the filed_date INSIDE each cached parquet against FUNDAMENTALS_REFETCH_DAYS=45. On a partial cache hit, any ticker with a > 45d-old most-recent filing still triggers a live EDGAR round-trip. Part 4 fix — QR_SKIP_FUNDAMENTALS=1 escape hatch: (a) compute/ingest/fundamentals.py:fetch_fundamentals — env-var check added at the top of the function (BEFORE _require_identity()), so the cached snapshot is returned unconditionally when set, without a freshness check OR an EDGAR identity requirement. Falls through to live fetch if no cache exists (cold runner has nothing to read). (b) compute/ingest/fundamentals.py:fetch_fundamentals_history — mirror pattern: returns the cached annual parquet without the 180d age check when QR_SKIP_FUNDAMENTALS=1 AND force_refresh=False. Same fallthrough. (c) .github/workflows/pre-merge-prod-sim.yml — adds QR_SKIP_FUNDAMENTALS: "1" to the env block alongside QR_SKIP_TIER2 and FORM4_FETCH_SKIP. Inline comment explains the root cause + the safety reasoning. SAFE for simulate because: - The workflow's purpose is composite-score-DIFF detection vs main's COMMITTED rankings.json - Both sides (PR branch + main) were produced from the same upstream fundamentals input (the cache the weekly cron wrote) - Using that cache without re-fetch is the CORRECT input for the diff - It's NOT a quality compromise — re-fetching live would give the PR branch newer data than main has, making the diff confusing rather than accurate Weekly cron (compute-rankings.yml) does NOT set QR_SKIP_FUNDAMENTALS — full live fetch still runs there and populates the warm cache for future simulate restores. Expected outcome: simulate completes in 8-12m on a cache-hit restore (no live SEC fetch in any of Steps 2, 3, or 4b). Live validation: re-push after this commit → simulate green under 25m on this PR. Tests verified: tests/test_features/test_fundamentals.py + tests/test_workflow_cache_coverage.py = 24 pass / 6 skipped. ruff clean. YAML parse-verified. schema_check in sync. CLAUDE.md §Phase status + AGENTS.md §Phase + version state both updated with the Part 4 explanation. Future contributors editing pre-merge-prod-sim.yml should now keep all THREE escape-hatch env vars set together: FORM4_FETCH_SKIP + QR_SKIP_TIER2 + QR_SKIP_FUNDAMENTALS. https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4 --------- Co-authored-by: Claude <noreply@anthropic.com>
8 tasks
dackclup
added a commit
that referenced
this pull request
May 24, 2026
…fix for parallel-PR collision) (#237) Closes the structural follow-up tracked in PR #230's §Gotchas "Parallel-PR §Phase status collision pattern". PR #230 itself hit the collision 3 times in one session while iterating on the simulate-cap fix — empirical proof that the doc-discipline §Convention landed in PR #230 (rebase-before-Mark-Ready) is necessary but NOT sufficient when 5+ PRs land on main in a single hour. Root cause (pre-2026-05-24): every PR was required to add a "**X in flight (this PR)**" bullet to CLAUDE.md §Phase status + AGENTS.md §Phase + version state per §Conventions "ship with every PR" lockstep. Both blocks were append-shaped, with the new bullet inserted at the SAME line (just before "**Next deliverables**" / "## Claude-Code-specific tooling"). Two PRs opened in parallel both touched that single anchor line → `mergeable_state: dirty` → recurring "merge ไม่ได้ หลังแก้แล้วกลับมาเป็นอีก" frustration. The conflict was BENIGN (both PRs added distinct entries at the same insertion line; resolution was always "keep both") but `git merge` cannot auto-detect that. PR #230 hit this 3 times: - vs PR #229 (security WARN cleanup) mid-iteration - vs PR #232 + #233 (LedgerCraft A1 + A2) before Mark-Ready - vs PR #234 + #235 + #236 (LedgerCraft A3 + B1 + B2+B3+B4) during the simulate-fix re-push loop Structural fix: (1) NEW FILE — `PHASE_STATUS_INFLIGHT.md` at the repo root. Each PR appends its in-flight entry to the END of this file (the "In flight (current)" sub-section). Parallel PRs both append at disjoint last-lines → `git merge` auto-resolves → no more `mergeable_state: dirty` from this pattern. (2) CLAUDE.md §Conventions — "ship with every PR" rule updated to point at PHASE_STATUS_INFLIGHT.md as the canonical destination for in-flight entries. Substance updates to CLAUDE.md / AGENTS.md sections still land directly (they're rare + cross-PR-coupled by nature). (3) CLAUDE.md §Gotchas "Parallel-PR §Phase status collision pattern" marked as RESOLVED 2026-05-24 with a back-reference to the side-file adoption. The historical-symptom record stays for future readers' context. (4) AGENTS.md mirror — same updates for cross-tool agents (Copilot / Cursor / Devin / VS Code Agent Mode). The side-file works regardless of which agent runtime authors the PR. Trade-offs: - Housekeeping workflow to drain merged entries from PHASE_STATUS_INFLIGHT.md into CLAUDE.md §Phase status proper DEFERRED — manual cleanup is fine for the first few weeks while the pattern proves itself. `tools/housekeep_phase_status.py` may land later once volume + shape are known. - The rebase-before-Mark-Ready discipline (PR #230 §Conventions bullet) still applies as a backstop for OTHER conflict surfaces (shared code edits, workflow YAML, schema bumps) — it's no longer the recurring drag it used to be, but still good discipline. No compute / schema / scoring / valuation / frontend / Python / TS code change. Doc-only PR. ruff clean. schema_check clean. CLAUDE.md + AGENTS.md lockstep satisfied via §Conventions + §Gotchas substance updates (the "ship with every PR" rule now points at the new file as the canonical destination, and this PR's in-flight entry lives in PHASE_STATUS_INFLIGHT.md per the new convention — dogfooding from commit #1). https://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4 Co-authored-by: Claude <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the 2 remaining
security-reviewerWARNs deferred from PR #226 (W1 + W3 shipped there; W2 + W4 bundled here as one focused security-hardening PR). Both fixes are operational hygiene with no compute / schema / production-code surface.W2 —
compute-rankings.ymlworkflow-perm narrowingThe workflow-level
permissions: contents: writewas wider than needed — only the finalCommit JSON outputsstep inside thecompute:job is a writer. Fix:contents: writecontents: readcompute:jobcontents: writeBehavior unchanged for the
computejob; only the default surface shrinks. YAML parse-verified clean.W4 —
log-bash.shinline-credential scrubPreviously the hook appended the raw Bash command (including any inline env-var dereference like
EDGAR_USER_AGENT=foo python ...or anAuthorization: Bearer <tok>header) to gitignored.claude/session.log. Severity is LOW because the file is gitignored + local-only, but an accidentalcat .claude/session.logduring a screen-share or pasted into a gist could leak the credential.Fix:
sedpre-filter redacts the value half of common secret-prefix tokens before appending; the prefix stays so the log is still readable for "which integration was being called" debugging.Covered prefixes:
ghp_/gho_/ghu_/ghs_/ghr_/github_pat_sk-ant-api*sk-*AKIA*/ASIA*AIza*xox[abprs]-*Bearer <tok>/Authorization: Bearer <tok>Manual scrub test (this PR's verification):
Fail-open discipline preserved — if
sederrors (unlikely), the existing|| truefalls back to the original command, never blocks the hook.Defense-in-depth: the gitignore remains the primary guard; this shrinks the blast radius of an accidental local leak.
Files touched
.github/workflows/compute-rankings.yml.claude/hooks/log-bash.shCLAUDE.mdAGENTS.mdDoc-only otherwise — no compute / schema / scoring / valuation / frontend / Python / TS production-code change. CLAUDE.md + AGENTS.md lockstep satisfied per §Conventions.
Test plan
python -c "yaml.safe_load(...)"—.github/workflows/compute-rankings.ymlparses cleanBearer ghp_abc...correctly redacts toBearer ghp_[REDACTED]git diff --statconfirms scope (4 files / +105 / −20)security-reviewersonnet spawn (optional — re-verifies both WARNs resolved)Out of scope (still deferred)
_FOOTNOTES_REQUIRED_ATTRSextension) — still needs@networkverify on real edgartools<rule10b5_1>docstring precision viaedgar-debuggerhttps://claude.ai/code/session_01JwntEE4PNAXSMkZxRA9BB4
Generated by Claude Code