Skip to content

security: audit remediation Phase B/C (#4,#5,#7,#9,#10 + #8 partial)#153

Closed
AVADSA25 wants to merge 8 commits into
mainfrom
security-phase-bc
Closed

security: audit remediation Phase B/C (#4,#5,#7,#9,#10 + #8 partial)#153
AVADSA25 wants to merge 8 commits into
mainfrom
security-phase-bc

Conversation

@AVADSA25
Copy link
Copy Markdown
Owner

Continuation of the top-10 audit remediation. Phase A (#1·#2·#3, C1/C2/C3/C6/C7) already merged via PR #152 (squash → main). This PR carries the Phase B/C work that was pushed after that merge.

All TDD'd; 490 tests green across touched modules; ruff clean; rebased cleanly onto current main.

Commit Fix Findings
run_with_timeout #4 C4 — kill the hanging ThreadPoolExecutor in mcp/observer
state locking #5 C5·H14·M6 — SQLite RLock + cross-process file_lock (grants/notifications)
SSRF guard #7a H1 — block internal/metadata URLs in web_fetch + crew fetch
skill-overwrite #7b H2·H6 — approve refuses shadowing a pinned built-in
A-12 guard + OAuth scope tests #10 test coverage
jsonstore core #9 C8 — atomic_write_json default=/sort_keys= + read_modify_write; notifications RMW lock; docs/STATE-FILES.md
chat helper extraction #8 C9 (start)chat_completion 356→252 lines, behavior-preserving

Partial / deferred (documented)

Reviewer notes

  • C5: the new test reproduced a hard segfault from concurrent unserialized sqlite access pre-fix.
  • C3 sandbox (in Phase A): broad allow file-read* retained + layered (deny …) after it (SBPL last-match-wins) to avoid breaking stdlib imports.

🤖 Generated with Claude Code

Mikarina13 and others added 8 commits May 29, 2026 11:39
The idiom `with ThreadPoolExecutor() as ex: fut.result(timeout=T)` defeats
its own timeout: the context-manager __exit__ calls shutdown(wait=True),
which blocks until the runaway task finishes. A 50ms timeout on a 5s task
took ~5s, so codec_mcp tool dispatch and codec_observer OCR could hang on a
slow skill or a stuck screencapture popup.

Add codec_concurrency.run_with_timeout (daemon thread + join(timeout), no
shutdown(wait=True)): surfaces the timeout promptly, abandons the daemon
worker (never blocks shutdown), re-raises fn exceptions, and raises
TimeoutError (== concurrent.futures.TimeoutError on py3.11+). Migrate both
call sites; mirrors the proven codec_hooks._run_hook_with_timeout pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CodecMemory shared one sqlite3 connection across threads
(check_same_thread=False) with no app-level lock — concurrent save/search on
the same connection can corrupt cursor state or even segfault (the new test
reproduced a hard crash before the fix). Add a threading.RLock and serialize
every runtime connection use (save, search, search_recent, get_sessions,
cleanup, rebuild_fts, close). RLock so get_context->search and
cleanup->close don't self-deadlock; WAL + busy_timeout stay the
cross-process layer.

Route the unguarded ~/.codec JSON read-modify-writes through the
cross-process file_lock:
- grant_permission (routes/agents.py): new codec_agent_plan.grants_lock
  (mirrors _status_lock) held across load_grants -> modify -> save_grants ->
  set_grants_hash, so concurrent /grant calls can't clobber grants.json.
- both notifications.json writers in codec_ask_user (question-write +
  mark-read) now also hold file_lock(NOTIFICATIONS_PATH), matching the
  pairing the pending_questions path already used.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…s (H1)

Attacker-influenceable URLs (chat injection, clipboard, crew task) could be
fetched against 127.0.0.1, 10/8, 192.168/16, 169.254.169.254 (cloud
metadata), etc. — and the body flows back into the LLM transcript, so a read
is an exfil path even with egress otherwise denied.

Add codec_ssrf.validate_url: enforces http(s), resolves the host, and rejects
if ANY resolved address is loopback / private / link-local / multicast /
reserved / unspecified (dual-record + IPv4-mapped-IPv6 aware). Wire it BEFORE
the HTTP call in skills/web_fetch.py (which clipboard_url_fetch delegates to)
and in codec_agents._web_fetch. DNS-rebinding TOCTOU limitation documented.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…(H2/H6)

The review-and-approve flow wrote an approved skill to the skills dir by
basename with no name-collision check. An approved skill named after a
hash-pinned built-in (calculator.py, file_write.py, ...) would take that
trusted name — shadowing the real built-in, or overwriting it if the write
dir is the repo skills dir.

skill_approve now refuses any filename matching a name in the committed
skills/.manifest.json (case-insensitive, so the guard also holds on
case-insensitive filesystems), emitting skill_approve_blocked. PR-1A's
load-time hash/AST gate remains the last line of defense; this refuses at the
write point so the trusted file is never displaced.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
#10)

- tests/test_a12_invariant.py: fails if a new inline `.post(...chat/completions
  ...)` appears outside codec_llm. Allowlists the documented vision sites
  (dashboard, watcher, both bridges, screenshot_text) + codec_core's generated
  session-script string. Runs in CI via the full pytest suite (no separate
  workflow step needed). Includes a synthetic-violation test proving the guard
  is not a no-op.
- tests/test_oauth_provider.py: regression tests for the refresh-token
  scope-escalation defense (requested scopes must be a subset of the original
  grant) — previously untested.

permission_gate mutation cases are already covered by test_agent_runner.py
(8 tests incl. D-5 traversal/symlink); not duplicated. The flaky time.sleep ->
threading.Event refactor is deferred — no specific flaky test identified, and
rewriting passing tests without evidence risks regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…store registry)

Design-only, per CLAUDE.md §11 + the SECURITY-REMEDIATION-DESIGN gate for the
two remaining structural fixes. No production code. Each note grounds scope,
seam, migration phasing, test plan, rollback, and open decisions in the actual
HEAD surface (codec_dashboard chat cluster ~lines 2207-3005; ~30 ad-hoc
json.dump writers + 2 duplicate atomic helpers). Awaiting sign-off before
implementation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…k + registry (C8)

Fix #9, scoped to the high-value, low-risk core (recommended decisions applied):

- Phase 0: atomic_write_json gains `default=` (custom serializer, e.g. str for
  datetime) and `sort_keys=` passthrough; new read_modify_write(path, fn)
  standardizes lock+load+mutate+atomic-write so a future RMW can't forget the
  file_lock.
- Phase 2: routes/_shared._save_notification now holds the cross-process
  file_lock across its load->insert->write (it ran under the in-process
  _notif_lock only, so separate PM2 daemons clobbered each other).
- Phase 5: docs/STATE-FILES.md registry of every ~/.codec state file + lock
  policy + migration status.

Deferred to a documented follow-on (see STATE-FILES.md): Phase 1 mass
overwrite-durability swaps; Phase 3 helper convergence (entangles the
don't-touch pending_questions/grants write paths — byte-preserving, reviewed
per-helper); Phase 4 don't-touch files (config.json, OAuth/Google tokens) with
per-file sign-off. The broad json.dump CI ratchet was rejected as low-signal
(rationale in STATE-FILES.md).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…start)

Fix #8, intra-file CC reduction. The routes/chat.py module split is deferred:
chat_completion shares _load_prompt_overrides + CHAT_SYSTEM_PROMPT with the
prompt-override endpoints, so a clean split needs those relocated first
(tracked in docs/FIX8-ROUTES-CHAT-EXTRACTION-DESIGN.md).

Extract two of the heaviest blocks out of the 356-line chat_completion into
named, independently-testable helpers — behavior-preserving (the 48-test
chat/stream/budget/escalation baseline is byte-identical before/after):
- _build_chat_system_prompt(config, budget, has_attachment, last_user_text):
  override + step-budget + attachment / content-rewrite / observer-inject logic.
- _chat_vision_response(body, messages): the image -> vision-model branch.

chat_completion: 356 -> 252 lines. Adds tests/test_chat_helpers.py (5 unit
tests now possible because the logic is no longer trapped inside the handler).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@AVADSA25
Copy link
Copy Markdown
Owner Author

Superseded by #154, which contains all of this PR's Phase B/C commits plus the full re-audit remediation wave (N1 critical /ws/voice auth, N3 SSRF-redirect, N4/N10 symlink, N5 TOTP ?s=, N7/N8/N9, N19/N20/N21). Merge #154 instead.

@AVADSA25 AVADSA25 closed this May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants