security: audit remediation Phase B/C (#4,#5,#7,#9,#10 + #8 partial) by AVADSA25 · Pull Request #153 · AVADSA25/codec

AVADSA25 · 2026-05-29T09:40:00Z

Continuation of the top-10 audit remediation. Phase A (#1·#2·#3, C1/C2/C3/C6/C7) already merged via PR #152 (squash → main). This PR carries the Phase B/C work that was pushed after that merge.

All TDD'd; 490 tests green across touched modules; ruff clean; rebased cleanly onto current main.

Commit	Fix	Findings
`run_with_timeout`	#4	C4 — kill the hanging ThreadPoolExecutor in mcp/observer
state locking	#5	C5·H14·M6 — SQLite RLock + cross-process file_lock (grants/notifications)
SSRF guard	#7a	H1 — block internal/metadata URLs in web_fetch + crew fetch
skill-overwrite	#7b	H2·H6 — approve refuses shadowing a pinned built-in
A-12 guard + OAuth scope tests	#10	test coverage
jsonstore core	#9	C8 — atomic_write_json `default=`/`sort_keys=` + `read_modify_write`; notifications RMW lock; `docs/STATE-FILES.md`
chat helper extraction	#8	C9 (start) — `chat_completion` 356→252 lines, behavior-preserving

Partial / deferred (documented)

feat(self_improve): migrate to plugin hook architecture (Phase 1 Step 4) #8 module split → deferred (chat_completion shares prompt helpers with the prompt-override endpoints → circular import; intra-file CC reduction landed instead). See docs/FIX8-….
feat(observer): continuous observation loop (Phase 2 Step 5) #9 Phases 1/3/4 → deferred (mass swaps; helper convergence touches don't-touch pending_questions/grants; don't-touch files need per-file sign-off). See docs/STATE-FILES.md.
fix(tests): skip destructive skills in test_mcp_all_tools + incident report #6 (deps + PM2 interpreter pin) → not started; needs the live venv path + a watched pm2 restart.

Reviewer notes

C5: the new test reproduced a hard segfault from concurrent unserialized sqlite access pre-fix.
C3 sandbox (in Phase A): broad allow file-read* retained + layered (deny …) after it (SBPL last-match-wins) to avoid breaking stdlib imports.

🤖 Generated with Claude Code

The idiom `with ThreadPoolExecutor() as ex: fut.result(timeout=T)` defeats its own timeout: the context-manager __exit__ calls shutdown(wait=True), which blocks until the runaway task finishes. A 50ms timeout on a 5s task took ~5s, so codec_mcp tool dispatch and codec_observer OCR could hang on a slow skill or a stuck screencapture popup. Add codec_concurrency.run_with_timeout (daemon thread + join(timeout), no shutdown(wait=True)): surfaces the timeout promptly, abandons the daemon worker (never blocks shutdown), re-raises fn exceptions, and raises TimeoutError (== concurrent.futures.TimeoutError on py3.11+). Migrate both call sites; mirrors the proven codec_hooks._run_hook_with_timeout pattern. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

CodecMemory shared one sqlite3 connection across threads (check_same_thread=False) with no app-level lock — concurrent save/search on the same connection can corrupt cursor state or even segfault (the new test reproduced a hard crash before the fix). Add a threading.RLock and serialize every runtime connection use (save, search, search_recent, get_sessions, cleanup, rebuild_fts, close). RLock so get_context->search and cleanup->close don't self-deadlock; WAL + busy_timeout stay the cross-process layer. Route the unguarded ~/.codec JSON read-modify-writes through the cross-process file_lock: - grant_permission (routes/agents.py): new codec_agent_plan.grants_lock (mirrors _status_lock) held across load_grants -> modify -> save_grants -> set_grants_hash, so concurrent /grant calls can't clobber grants.json. - both notifications.json writers in codec_ask_user (question-write + mark-read) now also hold file_lock(NOTIFICATIONS_PATH), matching the pairing the pending_questions path already used. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…s (H1) Attacker-influenceable URLs (chat injection, clipboard, crew task) could be fetched against 127.0.0.1, 10/8, 192.168/16, 169.254.169.254 (cloud metadata), etc. — and the body flows back into the LLM transcript, so a read is an exfil path even with egress otherwise denied. Add codec_ssrf.validate_url: enforces http(s), resolves the host, and rejects if ANY resolved address is loopback / private / link-local / multicast / reserved / unspecified (dual-record + IPv4-mapped-IPv6 aware). Wire it BEFORE the HTTP call in skills/web_fetch.py (which clipboard_url_fetch delegates to) and in codec_agents._web_fetch. DNS-rebinding TOCTOU limitation documented. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…(H2/H6) The review-and-approve flow wrote an approved skill to the skills dir by basename with no name-collision check. An approved skill named after a hash-pinned built-in (calculator.py, file_write.py, ...) would take that trusted name — shadowing the real built-in, or overwriting it if the write dir is the repo skills dir. skill_approve now refuses any filename matching a name in the committed skills/.manifest.json (case-insensitive, so the guard also holds on case-insensitive filesystems), emitting skill_approve_blocked. PR-1A's load-time hash/AST gate remains the last line of defense; this refuses at the write point so the trusted file is never displaced. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

#10) - tests/test_a12_invariant.py: fails if a new inline `.post(...chat/completions ...)` appears outside codec_llm. Allowlists the documented vision sites (dashboard, watcher, both bridges, screenshot_text) + codec_core's generated session-script string. Runs in CI via the full pytest suite (no separate workflow step needed). Includes a synthetic-violation test proving the guard is not a no-op. - tests/test_oauth_provider.py: regression tests for the refresh-token scope-escalation defense (requested scopes must be a subset of the original grant) — previously untested. permission_gate mutation cases are already covered by test_agent_runner.py (8 tests incl. D-5 traversal/symlink); not duplicated. The flaky time.sleep -> threading.Event refactor is deferred — no specific flaky test identified, and rewriting passing tests without evidence risks regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…store registry) Design-only, per CLAUDE.md §11 + the SECURITY-REMEDIATION-DESIGN gate for the two remaining structural fixes. No production code. Each note grounds scope, seam, migration phasing, test plan, rollback, and open decisions in the actual HEAD surface (codec_dashboard chat cluster ~lines 2207-3005; ~30 ad-hoc json.dump writers + 2 duplicate atomic helpers). Awaiting sign-off before implementation. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…k + registry (C8) Fix #9, scoped to the high-value, low-risk core (recommended decisions applied): - Phase 0: atomic_write_json gains `default=` (custom serializer, e.g. str for datetime) and `sort_keys=` passthrough; new read_modify_write(path, fn) standardizes lock+load+mutate+atomic-write so a future RMW can't forget the file_lock. - Phase 2: routes/_shared._save_notification now holds the cross-process file_lock across its load->insert->write (it ran under the in-process _notif_lock only, so separate PM2 daemons clobbered each other). - Phase 5: docs/STATE-FILES.md registry of every ~/.codec state file + lock policy + migration status. Deferred to a documented follow-on (see STATE-FILES.md): Phase 1 mass overwrite-durability swaps; Phase 3 helper convergence (entangles the don't-touch pending_questions/grants write paths — byte-preserving, reviewed per-helper); Phase 4 don't-touch files (config.json, OAuth/Google tokens) with per-file sign-off. The broad json.dump CI ratchet was rejected as low-signal (rationale in STATE-FILES.md). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…start) Fix #8, intra-file CC reduction. The routes/chat.py module split is deferred: chat_completion shares _load_prompt_overrides + CHAT_SYSTEM_PROMPT with the prompt-override endpoints, so a clean split needs those relocated first (tracked in docs/FIX8-ROUTES-CHAT-EXTRACTION-DESIGN.md). Extract two of the heaviest blocks out of the 356-line chat_completion into named, independently-testable helpers — behavior-preserving (the 48-test chat/stream/budget/escalation baseline is byte-identical before/after): - _build_chat_system_prompt(config, budget, has_attachment, last_user_text): override + step-budget + attachment / content-rewrite / observer-inject logic. - _chat_vision_response(body, messages): the image -> vision-model branch. chat_completion: 356 -> 252 lines. Adds tests/test_chat_helpers.py (5 unit tests now possible because the logic is no longer trapped inside the handler). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

AVADSA25 · 2026-05-29T10:43:58Z

Superseded by #154, which contains all of this PR's Phase B/C commits plus the full re-audit remediation wave (N1 critical /ws/voice auth, N3 SSRF-redirect, N4/N10 symlink, N5 TOTP ?s=, N7/N8/N9, N19/N20/N21). Merge #154 instead.

Mikarina13 and others added 8 commits May 29, 2026 11:39

AVADSA25 mentioned this pull request May 29, 2026

security: Phase B/C top-10 remediation + full re-audit wave (1 critical + highs) #154

Merged

AVADSA25 closed this May 29, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

security: audit remediation Phase B/C (#4,#5,#7,#9,#10 + #8 partial)#153

security: audit remediation Phase B/C (#4,#5,#7,#9,#10 + #8 partial)#153
AVADSA25 wants to merge 8 commits into
mainfrom
security-phase-bc

AVADSA25 commented May 29, 2026

Uh oh!

AVADSA25 commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

AVADSA25 commented May 29, 2026

Partial / deferred (documented)

Reviewer notes

Uh oh!

AVADSA25 commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants