diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index cc538ca..426e8af 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,91 +1,330 @@ -# Contributing to Sovereign AI Workstation (CODEC engine) +# Contributing to CODEC (Sovereign AI Workstation engine) -Thanks for your interest in contributing! **Sovereign AI Workstation** is the -product brand; **CODEC** is the open-source engine codename — that's what you -see in code paths, file names, and the `~/.codec/` config directory. Both names -appear in this codebase by design. +Thanks for your interest in contributing! **Sovereign AI Workstation** is the product brand; **CODEC** is the open-source engine codename — that's what you see in code paths, file names, and the `~/.codec/` config directory. The repo is MIT licensed and welcomes contributions. -## Quick Start for Contributors +If anything here is unclear, [open a Discussion](https://github.com/AVADSA25/codec/discussions) — we treat doc gaps as real bugs. + +## Code of conduct + +Be kind. Assume good faith. CODEC's project ethos is **local-first, user-sovereign, privacy by default** — keep those principles in mind when proposing features. + +--- + +## Quick start for contributors ```bash git clone https://github.com/AVADSA25/codec.git cd codec -./install.sh -python3 -m pytest # All tests should pass +./install.sh # First-time only — installs PM2 services + permissions +python3 -m venv .venv && source .venv/bin/activate +pip install -r requirements-dev.txt # Runtime deps + pytest + ruff +~/.pyenv/shims/pytest # Full suite (~1m30s) ``` -## How to Contribute +The full test suite must pass before any PR. Current baseline: **1,300+ pytest tests passing, 80 skipped (env-dependent), 0 failed**. -### Report Bugs -Open an issue on GitHub with: -- What you expected -- What happened -- Your setup (macOS version, Python version, LLM provider) +--- -### Write a Skill -Skills are Python files in `~/.codec/skills/`. Use the template: +## Project ground rules -```python -"""My Custom Skill""" -SKILL_NAME = "my_skill" -SKILL_TRIGGERS = ["trigger phrase 1", "trigger phrase 2"] -SKILL_DESCRIPTION = "What this skill does" +1. **Local-first.** Zero data leaves the user's machine unless they explicitly route through a cloud provider. New features that require an outbound API call MUST be opt-in via `~/.codec/config.json`. +2. **No new top-level dependencies without discussion.** CODEC keeps a small dependency surface for security and Mac App Store distribution. Prefer stdlib (`sqlite3`, `pathlib`, `subprocess`) where possible. +3. **Every action that touches the user's filesystem, processes, or external services emits an audit line** to `~/.codec/audit.log` via `codec_audit.audit(...)`. New code paths that bypass the audit log will be asked to add it. +4. **Don't commit user-specific data.** The repo is public; assume any string in a `.py` file is world-readable. Personal context lives in `~/.codec/config.json` and `~/.codec/prompt_overrides.json`. +5. **Security-critical paths require tests.** Anything touching `is_dangerous`, `chat_consent_ok`, the AST safety gate, the audit log, or the OAuth flow needs a regression test before merge. -def run(task, app="", ctx=""): - return "Result spoken back to user" +--- + +## How to add a skill + +A skill is a single Python file that does one thing well. CODEC currently has 76 built-in skills. + +### Step 1: Copy the template + +```bash +cp skills/_template.py skills/my_skill.py ``` -Drop it in `~/.codec/skills/` — CODEC loads it on restart. +### Step 2: Fill in the required exports -### Submit a PR -1. Fork the repo -2. Create a branch: `git checkout -b feature/my-feature` -3. Make changes -4. Run tests: `python3 -m pytest` -5. Push and open a PR +```python +SKILL_NAME = "my_skill" +SKILL_DESCRIPTION = "One-sentence description of what the skill does." +SKILL_TRIGGERS = [ + "my skill", # Voice/wake-word matchers; longest match wins. + "do the thing", # Be specific — generic triggers collide with other skills. +] +SKILL_MCP_EXPOSE = True # True = available over MCP; False = local-only. + +# Optional: declare destructive operations for the §1.7 strict-consent gate. +# SKILL_DESTRUCTIVE = True -### Code Style -- Python 3.10+ -- No external dependencies unless absolutely necessary -- Every skill needs: SKILL_NAME, SKILL_TRIGGERS, SKILL_DESCRIPTION, run() -- Error handling: never bare `except:` — always `except Exception as e:` +# Optional: auto-fire on observer signals (Phase 2 Step 6). +# SKILL_OBSERVATION_TRIGGER = {"type": "window_title_match", "pattern": r"...", "cooldown_seconds": 600} -## Project Structure +def run(task: str, app: str = "", ctx: str = "") -> str: + """Required entry point. Receives the user's task text and returns a string.""" + return f"Did the thing for: {task}" ``` -codec.py — Entry point + inline keyboard listener (wake word, F13/F18, double-tap) -codec_config.py — Configuration and constants -codec_dispatch.py — Skill matching and dispatch -codec_agent.py — LLM agent session builder -codec_overlays.py — Tkinter overlay popups -codec_compaction.py — Context compaction for memory -codec_memory.py — FTS5 memory search -codec_agents.py — Multi-agent crew framework -codec_voice.py — Voice call WebSocket pipeline -codec_dashboard.py — Web dashboard + API -codec_textassist.py — Right-click text services -codec_search.py — Web search (DuckDuckGo/Serper) -codec_mcp.py — MCP server for external tools -codec_heartbeat.py — System health monitoring -skills/ — 76 skill plugins -tests/ — 1,300+ pytest tests (1,386 functions across 99 files) + +### Step 3: Add a test + +Create `tests/test_skill_my_skill.py`: + +```python +def test_my_skill_runs(): + from skills import my_skill + result = my_skill.run("test input") + assert isinstance(result, str) + assert result # non-empty ``` -## Running Tests +### Step 4: Regenerate the trusted manifest (built-in skills only) -First install the dev/test dependencies (macOS): +Built-in skills in `skills/` are hash-pinned in `skills/.manifest.json` for the PR-1A safety gate. After adding or modifying a skill in the repo: ```bash -pip install -r requirements-dev.txt # runtime deps + pytest + ruff +python3 tools/generate_skill_manifest.py --write ``` -Then: +Commit `skills/.manifest.json` alongside your skill file. CI verifies no drift via `tools/generate_skill_manifest.py --check`. + +**User skills** in `~/.codec/skills/` don't need the manifest — they go through the AST safety gate at load instead. + +### Step 5: Run the test suite ```bash -python3 -m pytest # All tests -python3 -m pytest -v # Verbose -python3 -m pytest -k "test_skills" # Specific file -ruff check . # Lint (same gate as CI) +~/.pyenv/shims/pytest tests/test_skill_my_skill.py -v +~/.pyenv/shims/pytest tests/ # Full suite, no regressions +``` + +### Step 6: Make it discoverable + +For voice triggers, no action needed — `codec_skill_registry` AST-parses every `.py` file at startup and indexes the triggers. + +For MCP exposure, no action needed — `codec_mcp._load_skill_tools_into` auto-registers skills with `SKILL_MCP_EXPOSE = True`. + +### Safety boundary checklist + +Before submitting a skill that touches files, processes, or external services: + +- [ ] **Shell:** if you shell out, use `subprocess.run([...], shell=False)` with a list-form argv. **Never** use `shell=True`, `os.system`, or f-string interpolation into shell strings. +- [ ] **Filesystem writes:** confirm target paths are under `$HOME` or `/tmp`. Mirror `skills/file_write.py:_is_safe_target` — no writes to `~/.codec/`, `~/.ssh/`, `/etc/`, etc. +- [ ] **HTTP calls:** read URLs from `codec_config` (not hardcoded). Route LLM-style chat/completions calls through `codec_llm.call/stream/acall/astream`. +- [ ] **Code execution:** if your skill executes user-supplied code, set `SKILL_MCP_EXPOSE = False`. See `skills/python_exec.py` for the sandboxing pattern (AST gate + `sandbox-exec` + RLIMIT_CPU/AS/NOFILE). +- [ ] **Destructive operations:** set `SKILL_DESTRUCTIVE = True` so the §1.7 strict-consent gate engages on the chat path. + +--- + +## How to add a crew (multi-agent workflow) + +Crews are defined in `codec_agents.py` and registered in `CREW_REGISTRY` (currently 12 crews at `codec_agents.py:1696-1709`). + +### Step 1: Write a builder function + +```python +def my_crew(query: str = "default query") -> Crew: + researcher = Agent( + name="Researcher", + role="You research topics thoroughly using web search.", + tools=[web_search_tool, web_fetch_tool], + max_tool_calls=5, + ) + writer = Agent( + name="Writer", + role="You synthesize research into clear summaries.", + tools=[google_docs_create_tool], + max_tool_calls=3, + ) + return Crew( + agents=[researcher, writer], + tasks=[f"Research: {query}", "Write a 3-paragraph summary based on the research"], + mode="sequential", # or "parallel" + max_steps=8, + allowed_tools=["web_search", "web_fetch", "google_docs_create"], + ) +``` + +### Step 2: Register in CREW_REGISTRY + +```python +CREW_REGISTRY = { + # ... + "my_crew": { + "builder": my_crew, + "description": "Researches a topic and writes a summary", + }, +} +``` + +### Step 3: Add a voice trigger (optional) + +In `codec_voice.py` `_CREW_TRIGGERS`: + +```python +_CREW_TRIGGERS = { + # ... + "research and summarize": "my_crew", +} +``` + +### Step 4: Add a runtime test + +```python +def test_my_crew_builds(): + from codec_agents import CREW_REGISTRY + crew = CREW_REGISTRY["my_crew"]["builder"](query="test") + assert crew.mode in ("sequential", "parallel") + assert len(crew.agents) >= 1 + assert crew.allowed_tools # tool allowlist must be set ``` + +### Step 5: Bump the test floor + +If you're adding a crew, update `tests/test_security.py:255` from `assert len(crew_defs) >= 12` to the new minimum so future regressions catch a crew removal. + +--- + +## How to propose a substantive change + +For anything beyond a skill or crew — architectural changes, new endpoints, schema changes, security-relevant code — follow the **design-first workflow**: + +1. **Write a design doc** at `docs/-DESIGN.md` covering: what, why, schema/API changes, migration plan, test plan, rollback plan. +2. **Open a Discussion** with the doc. +3. **Wait for at least one maintainer ack** before implementing. +4. **Implement with tests passing after each file change.** +5. **PR description references the design doc.** + +Examples to model after: `docs/PHASE1-STEP1-DESIGN.md`, `docs/PHASE2-STEP5-DESIGN.md`, `docs/PHASE3-STEP9-DESIGN.md`. + +--- + +## Code conventions + +### Python style + +- **Python 3.11+**, type hints encouraged on new code. +- **Snake_case** everywhere. +- **One concern per commit**, conventional commit messages preferred (`feat:`, `fix:`, `security:`, `refactor:`, `docs:`). +- **No bare `except:`.** Use `except Exception:` only at trust-boundary surfaces (audit emit, JSON parse, network IO). Prefer specific exception types (`OSError`, `ValueError`, `requests.HTTPError`). +- **Use the structured logger,** not `print()`. `log = logging.getLogger("codec_")` at module top. `print()` only in user-facing CLI tools (e.g. `codec_marketplace.py`). + +### File organization + +- **Engine modules:** `codec_.py` at repo root. +- **HTTP routes:** `routes/.py`. +- **Skills:** `skills/.py`, one file per skill. +- **Tests:** `tests/test_.py`, one per module under test. +- **Design docs:** `docs/-DESIGN.md`. + +### Configuration + +- All ports, URLs, timeouts go through `codec_config`. Don't hardcode service URLs in your module — read `codec_config.QWEN_BASE_URL`, `KOKORO_URL`, etc. +- Secrets go through `codec_keychain` (macOS) with a `~/.codec/secrets.enc.json` fallback for headless environments. Never read raw plaintext from `~/.codec/config.json` for secrets. + +### Testing + +- `pytest` from repo root: `~/.pyenv/shims/pytest`. +- Tests live under `tests/`, names start with `test_`. +- Use `pytest.parametrize` for variant coverage. +- Mock network calls (`requests.post`, `httpx.AsyncClient`) — tests should run offline. +- Don't write to `~/.codec/` in tests; use `tmp_path` fixtures. + +### Audit emits + +If your code performs an action that touches the user's filesystem, processes, or external services, emit an audit line: + +```python +from codec_audit import audit + +audit( + event="my_event_name", + source="codec-", + outcome="ok", # or "error", "denied", "timeout", "warning" + level="info", # or "warning", "error" + message="Optional ≤500 char description", + extra={"key": "value"}, +) +``` + +`event` is required. Schema details in `codec_audit.py` and `docs/PHASE1-STEP1-DESIGN.md`. + +--- + +## Pull request workflow + +1. Branch from `main`: `git checkout -b my-change`. +2. Make your changes, commit logically. +3. Run the full test suite: `~/.pyenv/shims/pytest tests/`. +4. Push and open a PR. +5. Fill in the PR template (Summary + Test plan). +6. CI runs `ruff` lint + the full pytest suite. Both must pass. +7. A maintainer reviews. For security-relevant changes, expect a second reviewer. +8. After approval, the maintainer squash-merges. + +### What we look for in reviews + +- **Tests for new behavior.** If you can't write a test, that's a design smell. +- **No silent failure modes.** Errors should be logged or surfaced. +- **Backward compat.** Don't break existing CODEC installations without a clear migration path and a `config_version` bump in `codec_config.CONFIG_SCHEMA_VERSION`. +- **Doc updates.** If you change behavior, update `FEATURES.md` / `README.md` / `CLAUDE.md` accordingly. +- **No new top-level dependencies** unless the maintainer pre-approved. + +### Common PR rejection reasons + +- Touches a security boundary (`is_dangerous`, `chat_consent_ok`, AST gate, audit envelope) without regression tests. +- Adds an outbound HTTP call without an opt-in config flag. +- Hardcodes a service URL or port instead of reading from `codec_config`. +- Adds a built-in skill without regenerating `skills/.manifest.json`. +- Skips the audit emit on an action that touches the filesystem/processes/network. + +--- + +## Architecture overview + +For deeper context, read these files in order: + +1. **`CLAUDE.md`** (also called `AGENTS.md`) — the in-tree architecture doc. ~3000 lines covering every module, every chokepoint, every don't-touch zone. +2. **`docs/PHASE1-STEP1-DESIGN.md`** through **`docs/PHASE3-STEP10-DESIGN.md`** — design docs for the major substrate additions. +3. **`SECURITY.md`** — the threat model and hardening waves (PR-1A through PR-2G). +4. **`PRIVACY.md`** — what CODEC sends off-Mac, when, and to whom. + +### Module map quick reference + +``` +codec.py — Main process (wake word, hotkeys, dispatch) +codec_dashboard.py — FastAPI server (port 8090, PWA frontend) +codec_voice.py — WebSocket voice pipeline +codec_agents.py — Agent + Crew runtime (no CrewAI dependency) +codec_skill_registry — AST-gated skill loader +codec_dispatch — Single skill execution chokepoint +codec_audit — Unified audit envelope (schema:1, HMAC-signed) +codec_config — Config + dangerous-pattern detector + secrets accessors +codec_memory — SQLite + FTS5 +codec_mcp — MCP server (FastMCP, auto-registers all SKILL_MCP_EXPOSE=True) +codec_hooks — Plugin lifecycle hooks (pre/post tool, on_error, etc.) +codec_textassist — Right-click text services +codec_dictate — F5 live-dictation + draft refinement +pilot/ — Browser automation runtime (vendored module) +skills/ — 76 built-in skill plugins (hash-pinned in .manifest.json) +routes/ — HTTP endpoint groups extracted from codec_dashboard.py +tests/ — 156 test files, 1,300+ tests +``` + +--- + +## Reporting security issues + +**Don't open a public issue.** Email `security@avadigital.ai` or use [GitHub Private Vulnerability Reporting](https://github.com/AVADSA25/codec/security/advisories/new). + +--- + +## Getting help + +- **Bug?** [Open an issue](https://github.com/AVADSA25/codec/issues). +- **Question?** [Start a Discussion](https://github.com/AVADSA25/codec/discussions). + +Thanks for contributing to CODEC. diff --git a/codec_agents.py b/codec_agents.py index c291024..9c0f942 100644 --- a/codec_agents.py +++ b/codec_agents.py @@ -8,7 +8,6 @@ import json import os import re -import secrets import time from dataclasses import dataclass, field from datetime import datetime @@ -84,16 +83,18 @@ def _serper_api_key() -> str: # called outside a crew) sets _correlation_id_var; nested emits inherit it # automatically. Using contextvars keeps the ID intact across asyncio task # boundaries and run_in_executor calls. See design §1.4. -_correlation_id_var: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar( - "codec_agents_correlation_id", default=None +# +# A5 / SR-5: the canonical home for this contextvar moved to codec_audit so +# downstream readers (codec_ask_user, codec_observer, codec_triggers) can +# import it without dragging codec_agents into a cycle. Re-exported here for +# back-compat with any external importer that grabbed +# `codec_agents._correlation_id_var` directly. +from codec_audit import ( + _correlation_id_var as _correlation_id_var, # noqa: F401 — re-export + _new_correlation_id as _new_correlation_id, # noqa: F401 — re-export ) -def _new_correlation_id() -> str: - """Generate a 12-char lowercase-hex correlation_id (6 bytes).""" - return secrets.token_hex(6) - - def _audit(event_type: str, **kwargs): """Shim over codec_audit.audit() for crew/agent runtime events. diff --git a/codec_ask_user.py b/codec_ask_user.py index 8fd02ef..aa63a79 100644 --- a/codec_ask_user.py +++ b/codec_ask_user.py @@ -333,20 +333,17 @@ def _deadline_iso(timeout_seconds: int) -> str: # ── Correlation_id discovery from contextvars (Step 1 §1.4) ─────────────────── def _current_correlation_id() -> Optional[str]: """Read the wrapping operation's correlation_id from whichever module's - contextvar happens to be set. ``codec_agents._correlation_id_var`` is - set by Crew.run / Agent.run; ``codec_voice._voice_correlation_id_var`` - is set by VoicePipeline.run. Try both — first non-None wins. + contextvar happens to be set. The general one (Crew/Agent) and the + voice-session one both live in codec_audit (A5 / SR-5) — we import + one-way down to the foundation layer instead of reaching back into + codec_agents / codec_voice (which would create import cycles). """ try: - from codec_agents import _correlation_id_var as _cv1 - v = _cv1.get() + from codec_audit import _correlation_id_var, _voice_correlation_id_var + v = _correlation_id_var.get() if v: return v - except Exception: - pass - try: - from codec_voice import _voice_correlation_id_var as _cv2 - v = _cv2.get() + v = _voice_correlation_id_var.get() if v: return v except Exception: diff --git a/codec_audit.html b/codec_audit.html index 11a0b89..34cd2f5 100644 --- a/codec_audit.html +++ b/codec_audit.html @@ -500,7 +500,7 @@

No events found

html += `
${formatTimestamp(ev.ts)} - ${ev.src || cat} + ${escapeHtml(ev.src || cat)} ${escapeHtml(ev.sum || ev.message || '--')}
diff --git a/codec_audit.py b/codec_audit.py index 27ad2e2..d8a253e 100644 --- a/codec_audit.py +++ b/codec_audit.py @@ -48,22 +48,55 @@ """ from __future__ import annotations +import contextvars import hashlib import hmac as _hmac import json import logging import os import re +import secrets import threading import time from datetime import datetime, timezone from pathlib import Path +from typing import Optional # H-3 (PR-4E): the cross-process flock primitive (stdlib-only; does NOT import # codec_audit, so no cycle with this foundation module). Used in _write to # serialize rotation + append across all PM2 daemons. import codec_jsonstore +# ── Correlation ID contextvars (A5 / SR-5) ───────────────────────────────────── +# Canonical home for the cross-module correlation_id state. Previously these +# lived in codec_agents (general) and codec_voice (voice-scoped), forcing +# codec_ask_user / codec_observer / codec_triggers to import from those +# modules to read the contextvar — creating 3 of the 4 documented import +# cycles. With both vars hosted here in the foundation layer, every reader +# imports one-way down (toward codec_audit) and the cycles vanish. +# +# `_correlation_id_var` is the general 12-hex contextvar set by Crew.run / +# Agent.run / agent_runner. tool_call / tool_result / hook_fired / audit +# emit nested inside a wrapping operation inherit it. +# +# `_voice_correlation_id_var` is a parallel var set by VoicePipeline.run for +# voice-session-scoped operations. Voice sessions can long outlive any one +# crew or agent run; the separate var prevents cross-contamination. +_correlation_id_var: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar( + "codec_correlation_id", default=None +) +_voice_correlation_id_var: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar( + "codec_voice_correlation_id", default=None +) + + +def _new_correlation_id() -> str: + """12-character lowercase-hex correlation_id from secrets.token_hex(6). + Mirrors the Step 1 §1.4 contract: short enough for inline logging, + enough entropy that collisions inside the rotation window are negligible. + """ + return secrets.token_hex(6) + # ── Storage ──────────────────────────────────────────────────────────────────── _AUDIT_DIR = Path(os.path.expanduser("~/.codec")) _AUDIT_DIR.mkdir(parents=True, exist_ok=True) diff --git a/codec_dashboard.html b/codec_dashboard.html index faf5e8e..5f5c4c5 100644 --- a/codec_dashboard.html +++ b/codec_dashboard.html @@ -1699,7 +1699,7 @@

MENU

if (Object.keys(_promptsData).length === 0) throw new Error('No prompts found'); renderPrompts(); } catch(e) { - container.innerHTML = '
Failed to load prompts: ' + e.message + '. Try restarting the dashboard.
'; + container.innerHTML = '
Failed to load prompts: ' + _escHtml(e.message) + '. Try restarting the dashboard.
'; } } function renderPrompts() { @@ -1712,12 +1712,18 @@

MENU

var modClass = p.modified ? ' modified' : ''; var modBadge = p.modified ? 'edited' : ''; var charCount = (p.value || '').length; - html += '
' + - '
' + + // A4 / XSS hardening: prompt metadata (label/description/file) is sourced + // from ~/.codec/prompt_overrides.json. If an attacker ever lands a write to + // that file (via a /api/save_file bypass — see A1 — or a future plugin + // misstep), unescaped metadata becomes stored XSS on this settings panel. + // Escape every interpolated field; the textarea content already routes + // through _escHtml. + html += '
' + + '
' + '
' + - '
' + p.label + ' ' + modBadge + '
' + - '
' + p.description + '
' + - '
' + p.file + ' — ' + charCount + ' chars
' + + '
' + _escHtml(p.label || '') + ' ' + modBadge + '
' + + '
' + _escHtml(p.description || '') + '
' + + '
' + _escHtml(p.file || '') + ' — ' + charCount + ' chars
' + '
' + '' + '
' + diff --git a/codec_dashboard.py b/codec_dashboard.py index 660fa14..7c3bcf9 100644 --- a/codec_dashboard.py +++ b/codec_dashboard.py @@ -1947,24 +1947,108 @@ async def run_code(request: Request): except OSError as e: log.debug(f"Temp file cleanup failed for {_p}: {e}") +# /api/save_file safety check — mirrors PR-1C's `file_write` skill blocklist. +# A1 / SR-8: previously `~/.codec` was in the allowlist, which let any +# authenticated POST drop a malicious plugin into ~/.codec/plugins/ + add its +# hash to plugins.allowlist → RCE on next dispatch tick. The skill-side +# blocklist (skills/file_write.py:62-104) refuses all of: +# - the macOS system tree (/System, /Library, /usr, /bin, /sbin, /etc, …) +# - the entire ~/.codec/ tree (skills, plugins, oauth_state.json, audit.log, +# config.json, memory.db, agents/, plugins.allowlist, …) +# - the repo's built-in skills/ directory +# - sensitive filename patterns (.ssh, .env, credentials, id_rsa, token, …) +# - sensitive extensions (.pem, .key, .p12, .pfx, .keystore) +# This HTTP endpoint must apply the same set. Replicated inline (rather than +# importing from the skill module) so the dashboard's safety surface doesn't +# depend on skill-loader timing. Keep in sync with skills/file_write.py. +_SAVE_FILE_BLOCKED_SYSTEM_ROOTS = [ + "/System", "/Library", "/usr", "/bin", "/sbin", "/etc", + "/var", "/dev", "/Volumes", +] +_SAVE_FILE_BLOCKED_FILENAME_PATTERNS = [ + ".ssh", ".gnupg", ".env", "credentials", "secrets", "secret", + ".aws", ".gcloud", ".kube", "id_rsa", "id_ed25519", "id_dsa", + ".netrc", ".npmrc", ".pypirc", "keychain", "password", "token", + "api_key", "apikey", "private_key", +] +_SAVE_FILE_BLOCKED_EXTS = [".pem", ".key", ".p12", ".pfx", ".keystore"] + +def _save_file_blocked_roots(): + """Realpath-resolved blocklist. Built once on first call (module-load + avoidance — keeps dashboard import side-effect-free).""" + roots = [] + for p in _SAVE_FILE_BLOCKED_SYSTEM_ROOTS: + try: + roots.append(os.path.realpath(p)) + except Exception: + roots.append(p) + # CODEC's own state directory + repo's built-in skills/ tree. + roots.append(os.path.realpath(os.path.expanduser("~/.codec"))) + roots.append(os.path.realpath(os.path.join( + os.path.dirname(os.path.abspath(__file__)), "skills"))) + return roots + + +_SAVE_FILE_BLOCKED_ROOTS_CACHE = None +_SAVE_FILE_TMP_REAL = os.path.realpath("/tmp") +_SAVE_FILE_HOME_REAL = os.path.realpath(os.path.expanduser("~")) + + +def _save_file_is_safe(path): + """Return (ok, reason). Mirrors skills/file_write._is_safe_target.""" + global _SAVE_FILE_BLOCKED_ROOTS_CACHE + if _SAVE_FILE_BLOCKED_ROOTS_CACHE is None: + _SAVE_FILE_BLOCKED_ROOTS_CACHE = _save_file_blocked_roots() + if not path: + return False, "Empty path" + expanded = os.path.expanduser(path) + try: + real_path = os.path.realpath(expanded) + except Exception: + real_path = expanded + base_lower = os.path.basename(real_path).lower() + for pat in _SAVE_FILE_BLOCKED_FILENAME_PATTERNS: + if pat in base_lower: + return False, f"Blocked filename pattern: {pat!r}" + for ext in _SAVE_FILE_BLOCKED_EXTS: + if base_lower.endswith(ext): + return False, f"Blocked extension: {ext}" + for blocked in _SAVE_FILE_BLOCKED_ROOTS_CACHE: + if real_path == blocked or real_path.startswith(blocked + os.sep): + return False, f"Blocked path: {blocked}" + under_home = (real_path == _SAVE_FILE_HOME_REAL or + real_path.startswith(_SAVE_FILE_HOME_REAL + os.sep)) + under_tmp = (real_path == _SAVE_FILE_TMP_REAL or + real_path.startswith(_SAVE_FILE_TMP_REAL + os.sep)) + if not (under_home or under_tmp): + return False, f"Target must live under $HOME or /tmp (got: {real_path})" + return True, "" + + @app.post("/api/save_file") async def save_file(request: Request): body = await request.json() filename = os.path.basename(body.get("filename", "untitled.py")) content = body.get("content", "") - ALLOWED_SAVE_DIRS = [ - os.path.expanduser("~/codec-workspace"), - os.path.expanduser("~/.codec"), - os.path.expanduser("~/Desktop"), - os.path.expanduser("~/Documents"), - ] - directory = os.path.realpath(os.path.expanduser(body.get("directory", "~/codec-workspace"))) - if not any(directory.startswith(allowed) for allowed in ALLOWED_SAVE_DIRS): - return JSONResponse({"error": "Directory not allowed"}, status_code=403) + directory = os.path.realpath(os.path.expanduser( + body.get("directory", "~/codec-workspace"))) + target_path = os.path.join(directory, filename) + ok, reason = _save_file_is_safe(target_path) + if not ok: + try: + log_event("save_file_blocked", "codec-dashboard", + f"/api/save_file refused: {reason}", + extra={"requested_path": target_path, "reason": reason}, + outcome="denied", level="warning") + except Exception: + pass + return JSONResponse( + {"error": "Directory not allowed", "reason": reason}, + status_code=403) os.makedirs(directory, exist_ok=True) - path = os.path.join(directory, filename) - with open(path, "w") as f: f.write(content) - return {"path": path, "size": len(content)} + with open(target_path, "w") as f: + f.write(content) + return {"path": target_path, "size": len(content)} # (Skills endpoints moved to routes/skills.py) @@ -3697,7 +3781,14 @@ async def _bg_heartbeat(): async def _bg_watcher(): - """Poll for draft tasks every 200ms.""" + """Poll for draft tasks every 1s. + + A10 / SR-13: was 200ms — that's 432k stat()+exists() calls/day per + customer for a file that changes <100×/day. 1s drops it 5× to ~86k + while keeping draft-task pickup latency within UX comfort (a draft + overlay closing 0.5-1s after a paste is indistinguishable from instant + to the operator). + """ from codec_watcher import TASK_FILE, handle_draft _bg_status["watcher"]["running"] = True log.info("[WATCHER] Background service started") @@ -3716,7 +3807,7 @@ async def _bg_watcher(): except Exception as e: _bg_status["watcher"]["errors"] += 1 log.error(f"[WATCHER] Error: {e}") - await asyncio.sleep(0.2) + await asyncio.sleep(1.0) _bg_status["watcher"]["running"] = False diff --git a/codec_heartbeat.py b/codec_heartbeat.py index 6a6009c..7192c51 100644 --- a/codec_heartbeat.py +++ b/codec_heartbeat.py @@ -29,7 +29,7 @@ def check_pending_tasks(): """Check memory for tasks that were saved for later""" try: - conn = sqlite3.connect(DB_PATH) + conn = sqlite3.connect(DB_PATH, timeout=5.0); conn.execute("PRAGMA busy_timeout=5000") # Find messages containing "task" or "later" or "remind" from last 24h cutoff = (datetime.now() - timedelta(hours=24)).isoformat() rows = conn.execute(""" @@ -90,7 +90,7 @@ def check_system_health(): def check_memory_stats(): """Report memory database stats + size monitoring""" try: - conn = sqlite3.connect(DB_PATH) + conn = sqlite3.connect(DB_PATH, timeout=5.0); conn.execute("PRAGMA busy_timeout=5000") total = conn.execute("SELECT COUNT(*) FROM conversations").fetchone()[0] sessions = conn.execute("SELECT COUNT(DISTINCT session_id) FROM conversations").fetchone()[0] latest = conn.execute("SELECT timestamp FROM conversations ORDER BY id DESC LIMIT 1").fetchone() @@ -132,8 +132,10 @@ def backup_memory_db(): try: # Use SQLite backup API for safe copy (handles WAL mode) - src = sqlite3.connect(DB_PATH) - dst = sqlite3.connect(backup_path) + src = sqlite3.connect(DB_PATH, timeout=5.0) + src.execute("PRAGMA busy_timeout=5000") + dst = sqlite3.connect(backup_path, timeout=5.0) + dst.execute("PRAGMA busy_timeout=5000") src.backup(dst) dst.close() src.close() @@ -195,7 +197,7 @@ def execute_pending_tasks(): against DANGEROUS_PATTERNS before execution. """ try: - conn = sqlite3.connect(DB_PATH) + conn = sqlite3.connect(DB_PATH, timeout=5.0); conn.execute("PRAGMA busy_timeout=5000") cutoff = (datetime.now() - timedelta(hours=24)).isoformat() # Tightened patterns: require explicit prefix or very specific phrasing diff --git a/codec_jsonstore.py b/codec_jsonstore.py index 8de103b..b60be0e 100644 --- a/codec_jsonstore.py +++ b/codec_jsonstore.py @@ -64,20 +64,31 @@ def atomic_write_json( def file_lock(path: Any) -> Iterator[None]: """Exclusive cross-process lock on `.lock` for the duration of the block. Use around a read-modify-write of `path` so concurrent daemons - serialize instead of clobbering each other.""" + serialize instead of clobbering each other. + + A7 / SR-11: the `open()` is now inside the try-block, paired with a + finally close. Previously, if a path issue caused open() to succeed but + something raised between open() and the body's try-block, the lock- + sidecar file handle leaked in LOCK_EX state until GC. + """ path = str(path) directory = os.path.dirname(path) or "." os.makedirs(directory, exist_ok=True) - lock_file = open(path + ".lock", "w") + lock_file = None try: + lock_file = open(path + ".lock", "w") fcntl.flock(lock_file.fileno(), fcntl.LOCK_EX) yield finally: - try: - fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN) - except Exception: - pass - lock_file.close() + if lock_file is not None: + try: + fcntl.flock(lock_file.fileno(), fcntl.LOCK_UN) + except Exception: + pass + try: + lock_file.close() + except Exception: + pass def read_modify_write( diff --git a/codec_license.py b/codec_license.py index d23a6f1..2cc7bf5 100644 --- a/codec_license.py +++ b/codec_license.py @@ -115,8 +115,52 @@ def _b64url(data: str) -> bytes: return base64.urlsafe_b64decode(data + "=" * (-len(data) % 4)) +# A3 / SR-10: in-memory pubkey + license-state caches. +# +# Before this, `_fetch_pubkey` ran an HTTPS round-trip on EVERY skill +# dispatch via `feature_allowed → license_state → _fetch_pubkey`. At 100 +# paying customers × ~200 skill calls/day each = 20k req/day to the license +# endpoint; peak ~5 req/sec sustained. Worse: the urllib call is sync with +# timeout=4.0, so a flaky network or AVA outage adds ~4s latency to every +# skill call until the disk fallback kicks in. +# +# Public keys are publishable, low-churn (rotation lead time measured in +# weeks). A 1h in-memory TTL drops license-server load by ~99% and removes +# the latency tail. Cache is per-process; module reload (PM2 restart) and +# `_invalidate_caches()` (settings save) flush it. +_PUBKEY_CACHE_VALUE: Optional[bytes] = None +_PUBKEY_CACHE_TS: float = 0.0 +_PUBKEY_CACHE_TTL: float = 3600.0 # 1 hour + +_LICENSE_STATE_CACHE_VALUE: Optional["LicenseState"] = None +_LICENSE_STATE_CACHE_TS: float = 0.0 +_LICENSE_STATE_CACHE_TTL: float = 60.0 # 60 seconds + + +def _invalidate_caches() -> None: + """Drop the in-memory pubkey + license-state caches. Call after the + operator changes their license token, config URL, or to force a + re-fetch on next call. Safe to call from any thread.""" + global _PUBKEY_CACHE_VALUE, _PUBKEY_CACHE_TS + global _LICENSE_STATE_CACHE_VALUE, _LICENSE_STATE_CACHE_TS + _PUBKEY_CACHE_VALUE = None + _PUBKEY_CACHE_TS = 0.0 + _LICENSE_STATE_CACHE_VALUE = None + _LICENSE_STATE_CACHE_TS = 0.0 + + def _fetch_pubkey(cfg: dict, timeout: float = 4.0) -> Optional[bytes]: - """Fetch the server's PEM public key; cache to disk. Falls back to cache offline.""" + """Fetch the server's PEM public key; cache in memory (TTL 1h) + on disk. + + Cache miss path: HTTPS GET → on-success writes both caches. Cache hit + path: returns in-memory bytes (sub-microsecond). On network failure, + falls back to the disk cache (offline-survives-PM2-restart). On total + failure (no memory cache, no disk cache, no network) returns None. + """ + global _PUBKEY_CACHE_VALUE, _PUBKEY_CACHE_TS + now = time.time() + if _PUBKEY_CACHE_VALUE is not None and (now - _PUBKEY_CACHE_TS) < _PUBKEY_CACHE_TTL: + return _PUBKEY_CACHE_VALUE try: with urllib.request.urlopen(_pubkey_url(cfg), timeout=timeout) as r: pem = r.read() @@ -125,10 +169,14 @@ def _fetch_pubkey(cfg: dict, timeout: float = 4.0) -> Optional[bytes]: PUBKEY_CACHE.write_bytes(pem) except Exception: pass + _PUBKEY_CACHE_VALUE = pem + _PUBKEY_CACHE_TS = now return pem except Exception: pass - # Offline: use cached key + # Network failed — try disk fallback. Don't pollute the in-memory cache + # on this path so the next call retries the network instead of holding + # a stale value for an hour. try: return PUBKEY_CACHE.read_bytes() except Exception: @@ -217,35 +265,57 @@ def license_state(cfg: Optional[dict] = None, *, _now: Optional[float] = None) - valid → "ok" (refresh grace timestamp) can't verify (offline) → "grace" if within window, else "readonly" invalid/expired → "readonly" + + A3 / SR-10: result is memoized for 60s (process-level cache) to keep + `feature_allowed` cheap on the hot path. Tests bypass the cache by + passing `_now=` (skips the cache entirely so the test controls time). """ cfg = cfg if cfg is not None else _load_config() now = _now if _now is not None else time.time() + # In-memory state cache (60s TTL). Only honored when the caller didn't + # inject a custom `_now` (tests inject _now to control time, so they + # always see fresh evaluation). Gated solely on time-injection — a + # custom cfg is allowed to reuse the cache because cfg flips are rare + # and the test-injection escape hatch is the simpler invariant. + global _LICENSE_STATE_CACHE_VALUE, _LICENSE_STATE_CACHE_TS + use_cache = (_now is None) + if use_cache and _LICENSE_STATE_CACHE_VALUE is not None and \ + (now - _LICENSE_STATE_CACHE_TS) < _LICENSE_STATE_CACHE_TTL: + return _LICENSE_STATE_CACHE_VALUE + + def _store(result): + global _LICENSE_STATE_CACHE_VALUE, _LICENSE_STATE_CACHE_TS + if use_cache: + _LICENSE_STATE_CACHE_VALUE = result + _LICENSE_STATE_CACHE_TS = now + return result + if not _is_paid_edition(cfg): - return LicenseState(mode="oss", reason="open-source build — no license required") + return _store(LicenseState(mode="oss", reason="open-source build — no license required")) token = _license_token(cfg) if not token: - return LicenseState(mode="readonly", reason="paid build not activated — enter a license key") + return _store(LicenseState(mode="readonly", reason="paid build not activated — enter a license key")) pubkey = _fetch_pubkey(cfg) if pubkey is None: # Can't get the key at all → fall back to grace window - return _grace_or_readonly(now, "license server unreachable and no cached key") + return _store(_grace_or_readonly(now, "license server unreachable and no cached key")) valid, claims, reason = verify_license_token(token, pubkey) tier = str(claims.get("tier", "pro")) expires_at = str(claims.get("expires_at", "")) if valid: _write_grace(now, tier, expires_at) - return LicenseState(mode="ok", reason="licensed", tier=tier, expires_at=expires_at) + return _store(LicenseState(mode="ok", reason="licensed", tier=tier, expires_at=expires_at)) if reason in ("verify error: ", "license server unreachable") or "unreachable" in reason: - return _grace_or_readonly(now, reason) + return _store(_grace_or_readonly(now, reason)) # Hard invalid (bad signature / expired / malformed) → read-only immediately - return LicenseState(mode="readonly", reason=f"license invalid: {reason}", - tier=tier, expires_at=expires_at) + return _store(LicenseState(mode="readonly", reason=f"license invalid: {reason}", + tier=tier, expires_at=expires_at)) def _grace_or_readonly(now: float, reason: str) -> LicenseState: diff --git a/codec_memory_upgrade.py b/codec_memory_upgrade.py index d7f5952..a4c64ad 100644 --- a/codec_memory_upgrade.py +++ b/codec_memory_upgrade.py @@ -94,8 +94,13 @@ def get_boot_context(include_rooms: bool = True) -> str: def _conn() -> sqlite3.Connection: - c = sqlite3.connect(DB_PATH) + # A9 / SR-12: 5s busy_timeout so concurrent chat + observer + facts + # writes wait instead of raising SQLITE_BUSY immediately. WAL is per-DB + # (already enabled by codec_memory's first writer); applying it here is + # idempotent. + c = sqlite3.connect(DB_PATH, timeout=5.0) c.execute("PRAGMA journal_mode=WAL") + c.execute("PRAGMA busy_timeout=5000") c.executescript(_FACTS_SCHEMA) return c diff --git a/codec_vibe.html b/codec_vibe.html index 7a2d9f8..954ee4e 100644 --- a/codec_vibe.html +++ b/codec_vibe.html @@ -499,7 +499,20 @@

CODEC

var el = document.getElementById('cm'); var d = document.createElement('div'); d.className = 'mg ' + (role === 'user' ? 'mu' : 'ma'); - if (raw) { d.innerHTML = typeof DOMPurify !== 'undefined' ? DOMPurify.sanitize(text) : text; } else { d.innerHTML = esc(text).replace(/\n/g, '
'); } + // A2 / SR-9: DOMPurify must FAIL-CLOSED. If the CDN fails to load, a + // sub-resource hijack blocks it, or an extension strips the script tag, + // we previously wrote raw `text` (LLM output) directly into innerHTML — + // an XSS sink for any attacker-influenceable model output. Escape instead. + if (raw) { + if (typeof DOMPurify !== 'undefined') { + d.innerHTML = DOMPurify.sanitize(text); + } else { + console.warn('DOMPurify unavailable — falling back to plain-text escape'); + d.innerHTML = esc(text).replace(/\n/g, '
'); + } + } else { + d.innerHTML = esc(text).replace(/\n/g, '
'); + } el.appendChild(d); el.scrollTop = el.scrollHeight; } diff --git a/codec_voice.py b/codec_voice.py index 7e52991..2dec28a 100644 --- a/codec_voice.py +++ b/codec_voice.py @@ -38,9 +38,11 @@ # Per-session correlation_id contextvar — set at VoicePipeline.run entry, # inherited by every audit emit during the session lifetime (incl. nested # tool calls fired from inside the pipeline). See design §1.4. -_voice_correlation_id_var: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar( - "codec_voice_correlation_id", default=None -) +# +# A5 / SR-5: canonical home moved to codec_audit. Re-exported here for +# back-compat with any external importer (codec_ask_user previously pulled +# from here; now reads from codec_audit directly). +from codec_audit import _voice_correlation_id_var as _voice_correlation_id_var # noqa: F401 — re-export # ── Phase 1 Step 3 §5.3.1 — fuzzy-option-match for AskUserQuestion ──────── # When the question carries `options`, the voice ASR layer maps the spoken @@ -206,8 +208,7 @@ def _clear_voice_session_marker() -> None: WHISPER_URL = _cfg.get("stt_url", WHISPER_URL) WHISPER_MODEL = _cfg.get("stt_model", WHISPER_MODEL) except Exception as _e: - print(f"[Voice] Config load warning: {_e} — using defaults") - + log.warning(f"Config load warning: {_e} — using defaults") # ── Vision config ──────────────────────────────────────────────────────── VISION_URL = "http://localhost:8083/v1/chat/completions" VISION_MODEL = "mlx-community/Qwen2.5-VL-7B-Instruct-4bit" @@ -402,7 +403,7 @@ def __init__(self, websocket, resume_session_id: str | None = None): VoicePipeline._resume_timestamps.pop(resume_session_id, None) # keep the two dicts in sync self.messages = saved self._is_resumed = True - print(f"[Voice] Resumed session {resume_session_id} with {len(saved)} messages") + log.info(f"Resumed session {resume_session_id} with {len(saved)} messages") else: self.messages = [{"role": "system", "content": _build_system_prompt()}] @@ -444,8 +445,7 @@ def _save_for_resume(self): self._resumable_sessions[self.session_id] = list(self.messages) VoicePipeline._resume_timestamps[self.session_id] = time.monotonic() self._prune_resumable() - print(f"[Voice] Session {self.session_id} saved for resume ({len(self.messages)} messages)") - + log.info(f"Session {self.session_id} saved for resume ({len(self.messages)} messages)") # ── Skill loader (lazy via SkillRegistry) ───────────────────────────── def _load_skills(self): @@ -460,8 +460,7 @@ def _load_skills(self): "triggers": [t.lower() for t in triggers], "desc": self._skill_registry.get_description(name), } - print(f"[Voice] {len(self.skills)} skills registered (lazy)") - + log.info(f"{len(self.skills)} skills registered (lazy)") # ── LLM Warmup ──────────────────────────────────────────────────────── async def warmup_llm(self): @@ -482,10 +481,9 @@ async def warmup_llm(self): "role": "system", "content": base + "\n\nRecent memory:\n" + context } - print("[Voice] Warmup: memory context injected into system prompt") + log.debug("Warmup: memory context injected into system prompt") except Exception as e: - print(f"[Voice] Warmup error: {e}") - + log.error(f"Warmup error: {e}") _VOICE_SKIP_SKILLS = {"calculator", "app_switch", "brightness", "clipboard"} def _match_skill(self, text: str) -> Optional[dict]: @@ -559,20 +557,20 @@ async def transcribe(self, pcm: bytes) -> str: text = r.json().get("text", "").strip() clean = text.lower().rstrip(".!?, ") if clean in NOISE_WORDS: - print(f"[Voice] Discarded noise: '{text}'") + log.debug(f"Discarded noise: '{text}'") return "" # Whisper hallucination filter (YouTube outros, annotations, etc.) text_lower = text.strip().lower() if text_lower in WHISPER_HALLUCINATIONS: - print(f"[Voice] Discarded hallucination: '{text}'") + log.debug(f"Discarded hallucination: '{text}'") return "" # Detect repetitive hallucinations like "thank you. thank you. thank you." if re.search(r'(.{4,}?)\1{2,}', text_lower): - print(f"[Voice] Discarded repetitive: '{text}'") + log.debug(f"Discarded repetitive: '{text}'") return "" words = [w for w in clean.split() if w not in {"uh","um","er","hmm","ah"}] if len(words) < 2: - print(f"[Voice] Discarded too short: '{text}'") + log.debug(f"Discarded too short: '{text}'") return "" try: _dash = os.path.dirname(os.path.abspath(__file__)) @@ -581,11 +579,11 @@ async def transcribe(self, pcm: bytes) -> str: from codec_config import clean_transcript as _clean text = _clean(text) or text except Exception as e: - print(f"[Voice] Transcript clean warning: {e}") + log.warning(f"Transcript clean warning: {e}") return text - print(f"[Voice] Whisper {r.status_code}: {r.text[:200]}") + log.error(f"Whisper {r.status_code}: {r.text[:200]}") except Exception as e: - print(f"[Voice] Whisper error: {e}") + log.error(f"Whisper error: {e}") return "" # ── LLM ─────────────────────────────────────────────────────────────── @@ -616,7 +614,7 @@ async def _stream_qwen(self, messages: list, max_tokens: int = 2000): if token: yield token except Exception as e: - print(f"[Voice] Qwen error: {e}") + log.error(f"Qwen error: {e}") yield "Sorry, I had a processing error." finally: await llm_queue.release(Priority.CRITICAL) @@ -656,7 +654,7 @@ def _downscale(): return base64.b64encode(buf.getvalue()).decode() return await loop.run_in_executor(None, _downscale) except Exception as e: - print(f"[Voice] Screenshot failed: {e}") + log.error(f"Screenshot failed: {e}") return None async def _analyze_screenshot(self, image_b64: str, user_text: str) -> str: @@ -707,7 +705,7 @@ async def generate_response(self, user_text: str): # Append after memory block, before next user turn self.messages[0]["content"] += f"\n\n{_obs_summary}" except Exception as _e: - print(f"[Voice] observer injection failed (non-fatal): {_e}") + log.debug(f"observer injection failed (non-fatal): {_e}") full = "" async for chunk in self._stream_qwen(self._trimmed_messages()): full += chunk @@ -728,9 +726,9 @@ async def synthesize(self, text: str) -> Optional[bytes]: ) if r.status_code == 200: return r.content - print(f"[Voice] Kokoro {r.status_code}: {r.text[:200]}") + log.error(f"Kokoro {r.status_code}: {r.text[:200]}") except Exception as e: - print(f"[Voice] TTS error: {e}") + log.error(f"TTS error: {e}") return None # ── Sentence boundary ───────────────────────────────────────────────── @@ -799,7 +797,7 @@ async def _poll_pending_question_for_voice(self) -> Optional[dict]: from codec_ask_user import _load_pending_questions data = _load_pending_questions() except Exception as e: - print(f"[Voice] ask_user poll failed: {e}") + log.error(f"ask_user poll failed: {e}") return None for rec in data.get("pending_questions", []): if rec.get("status") != "pending": @@ -834,8 +832,7 @@ async def _announce_pending_question(self, rec: dict) -> None: ) await self._speak(announcement) except Exception as e: - print(f"[Voice] ask_user announce failed: {e}") - + log.error(f"ask_user announce failed: {e}") async def _handle_voice_ask_user_answer(self, qid: str, user_text: str) -> None: """Route the user's spoken transcript to /api/agents/answer/{qid} @@ -880,7 +877,7 @@ async def _handle_voice_ask_user_answer(self, qid: str, err = result.get("error", "") await self._speak(f"Couldn't record that answer: {err}") except Exception as e: - print(f"[Voice] ask_user answer handler failed: {e}") + log.error(f"ask_user answer handler failed: {e}") try: await self._speak("Something went wrong recording your answer.") except Exception: @@ -888,7 +885,7 @@ async def _handle_voice_ask_user_answer(self, qid: str, async def dispatch_skill(self, skill: dict, user_text: str) -> Optional[str]: try: - print(f"[Voice] → skill: {skill['name']}") + log.info(f"→ skill: {skill['name']}") loop = asyncio.get_event_loop() # Phase 1 Step 2: route the voice skill call through the unified # hook surface. self._cid is set at VoicePipeline.run entry per @@ -916,11 +913,11 @@ def _inner(t, _c): f"'{result.plugin_name}': {result.reason}") result = str(result).strip() if result else "" if not result or result.lower() in ("none", "done, but no output.", ""): - print(f"[Voice] Skill {skill['name']} empty — falling through to Qwen") + log.debug(f"Skill {skill['name']} empty — falling through to Qwen") return None return result except Exception as e: - print(f"[Voice] Skill error: {e}") + log.error(f"Skill error: {e}") return f"There was an error running that: {e}" async def _skill_to_speech(self, result: str) -> str: @@ -1053,10 +1050,9 @@ def save_to_memory(self): continue mem.save(self.session_id, role, str(content)[:2000]) saved += 1 - print(f"[Voice] Saved {saved} messages → {self.session_id}") + log.info(f"Saved {saved} messages → {self.session_id}") except Exception as e: - print(f"[Voice] Memory save error: {e}") - + log.error(f"Memory save error: {e}") # ── Audio receiver task ──────────────────────────────────────────────── async def _audio_receiver(self): @@ -1072,7 +1068,7 @@ async def _audio_receiver(self): msg_type = msg.get("type", "") if msg_type == "websocket.disconnect": - print("[Voice] WebSocket disconnected in receiver") + log.info("WebSocket disconnected in receiver") await self.utterance_queue.put(None) # signal pipeline to stop # If unexpected, save state for possible resume if self._disconnect_reason != "user": @@ -1088,7 +1084,7 @@ async def _audio_receiver(self): if ctrl_type == "interrupt": if self.processing: - print("[Voice] Interrupt received") + log.info("Interrupt received") self.interrupted.set() elif ctrl_type == "your_turn": @@ -1097,39 +1093,36 @@ async def _audio_receiver(self): utterance = bytes(self.audio_buffer) self.audio_buffer = bytearray() self.is_speaking = False - print(f"[Voice] Your-turn: flushing {len(utterance)} bytes") + log.info(f"Your-turn: flushing {len(utterance)} bytes") await self.utterance_queue.put(utterance) elif self.audio_buffer: # Buffer too short — discard and notify self.audio_buffer = bytearray() self.is_speaking = False await self.ws.send_json({"type": "status", "status": "listening"}) - print("[Voice] Your-turn: buffer too short, discarded") + log.debug("Your-turn: buffer too short, discarded") else: - print("[Voice] Your-turn: no audio buffered") - + log.info("Your-turn: no audio buffered") elif ctrl_type == "nudge": # User tapped "still there?" — send reassurance if self.processing: await self.ws.send_json({"type": "transcript", "role": "system", "text": "Still processing — hang on…"}) - print("[Voice] Nudge acknowledged — still processing") - + log.info("Nudge acknowledged — still processing") elif ctrl_type == "ping": await self.ws.send_json({"type": "pong"}) elif ctrl_type == "end_call": # User intentionally ending the call — no resume needed self._disconnect_reason = "user" - print("[Voice] User ended call intentionally") + log.info("User ended call intentionally") await self.utterance_queue.put(None) return elif ctrl_type == "hold_start": # User started hold-to-talk — ensure we're in listening mode - print("[Voice] Hold-to-talk started") - + log.info("Hold-to-talk started") except Exception as e: - print(f"[Voice] WS text parse warning: {e}") + log.warning(f"WS text parse warning: {e}") continue # ── Audio bytes ── @@ -1143,7 +1136,7 @@ async def _audio_receiver(self): rms = self._rms(raw_bytes) if (rms > INTERRUPT_THRESHOLD and time.monotonic() - self.last_tts_end > VAD_ECHO_COOLDOWN): - print(f"[Voice] Interrupt by audio energy (RMS {rms:.0f})") + log.info(f"Interrupt by audio energy (RMS {rms:.0f})") self.interrupted.set() # Feed VAD so utterance is buffered (queued once processing ends) utterance = self.feed_audio(raw_bytes) @@ -1160,7 +1153,7 @@ async def _audio_receiver(self): await self.utterance_queue.put(utterance) except Exception as e: - print(f"[Voice] Receiver error: {type(e).__name__}: {e}") + log.error(f"Receiver error: {type(e).__name__}: {e}") # Try to send reconnect advisory before dying try: await self.ws.send_json({ @@ -1202,7 +1195,7 @@ async def _pipeline(self): await self.ws.send_json({"type": "status", "status": "listening"}) continue - print(f"[Voice] User: {user_text}") + log.info(f"User: {user_text}") await self.ws.send_json({"type": "transcript", "role": "user", "text": user_text}) # Phase 1 Step 3 §5.3 — single-question listen mode. @@ -1234,7 +1227,7 @@ async def _pipeline(self): # 1b. Screenshot + Vision — "look at my screen" etc. if self._is_screen_request(user_text): - print("[Voice] Screen analysis requested") + log.info("Screen analysis requested") await self.ws.send_json({"type": "status", "status": "analyzing_screen"}) # Camera shutter sound + overlay for visual feedback asyncio.get_event_loop().run_in_executor(None, lambda: subprocess.Popen( @@ -1252,16 +1245,16 @@ async def _pipeline(self): "r.after(8000,r.destroy);r.mainloop()"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)) screenshot_b64 = await self._take_screenshot() - print(f"[Voice] Screenshot taken: {'OK' if screenshot_b64 else 'FAILED'}") + log.info(f"Screenshot taken: {'OK' if screenshot_b64 else 'FAILED'}") if screenshot_b64: self.interrupted.clear() await self.ws.send_json({"type": "status", "status": "processing"}) - print("[Voice] Sending to vision model...") + log.info("Sending to vision model...") vision_desc = await self._analyze_screenshot(screenshot_b64, user_text) - print(f"[Voice] Vision result: {'OK (' + str(len(vision_desc)) + ' chars)' if vision_desc else 'EMPTY/FAILED'}") + log.info(f"Vision result: {'OK (' + str(len(vision_desc)) + ' chars)' if vision_desc else 'EMPTY/FAILED'}") # Check for interrupt after vision inference (user may have spoken) if self.interrupted.is_set(): - print("[Voice] Interrupted during vision inference — discarding result") + log.info("Interrupted during vision inference — discarding result") self.processing = False await self.ws.send_json({"type": "status", "status": "listening"}) continue @@ -1269,7 +1262,7 @@ async def _pipeline(self): # Speak the vision description directly — no LLM needed # Clean up vision output for TTS clean_desc = re.sub(r'[*#`\[\]]', '', vision_desc).strip() - print(f"[Voice] Speaking vision result directly: {clean_desc[:100]}...") + log.info(f"Speaking vision result directly: {clean_desc[:100]}...") self.messages.append({"role": "user", "content": user_text}) screen_context = f"I looked at your screen. Here's what I see: {clean_desc}" self.messages.append({"role": "assistant", "content": screen_context}) @@ -1349,11 +1342,9 @@ async def _pipeline(self): }) if interrupted_mid: - print("[Voice] Response interrupted by user") - + log.info("Response interrupted by user") except Exception as e: - print(f"[Voice] Pipeline error: {type(e).__name__}: {e}") - + log.error(f"Pipeline error: {type(e).__name__}: {e}") finally: self.interrupted.clear() self.processing = False @@ -1374,7 +1365,7 @@ async def run(self): # after the answer is routed to /api/agents/answer/{id}. self._awaiting_ask_user = None run_t0 = time.monotonic() - print(f"[Voice] Session {'resumed' if is_resumed else 'started'}: {self.session_id}") + log.info(f"Session {'resumed' if is_resumed else 'started'}: {self.session_id}") # Phase 1 Step 3 §5.3 — touch the active-session marker so # codec_ask_user knows whether to announce-and-listen vs defer # to PWA-only. @@ -1440,12 +1431,12 @@ async def run(self): run_outcome = "error" run_error_type = type(e).__name__ run_error = str(e)[:500] - print(f"[Voice] Session error: {type(e).__name__}: {e}") + log.error(f"Session error: {type(e).__name__}: {e}") finally: receiver.cancel() pipeline.cancel() self.save_to_memory() - print(f"[Voice] Session ended: {self.session_id}") + log.info(f"Session ended: {self.session_id}") try: duration_ms = (time.monotonic() - run_t0) * 1000.0 turns = sum(1 for m in self.messages if m.get("role") == "user") @@ -1486,4 +1477,4 @@ async def close(self): try: await self._http.aclose() except Exception as e: - print(f"[Voice] HTTP client close warning: {e}") + log.warning(f"HTTP client close warning: {e}") \ No newline at end of file diff --git a/tests/test_top10_action_list.py b/tests/test_top10_action_list.py new file mode 100644 index 0000000..967960a --- /dev/null +++ b/tests/test_top10_action_list.py @@ -0,0 +1,268 @@ +"""Regression tests for the 2026-05-30 Top-10 Action List fixes. + +Each Ax test pins a specific finding the final audit surfaced. If any of +these fail in the future, the corresponding fix has regressed. + +- A1 / SR-8: /api/save_file must refuse writes under ~/.codec/, repo skills/, + system roots — mirrors PR-1C's file_write skill blocklist. +- A3 / SR-10: license._fetch_pubkey + license_state must be cached so the + paid edition doesn't hammer the license server on every skill call. +- A5 / SR-5: correlation_id contextvars must live in codec_audit and + codec_agents / codec_voice re-export them. Eliminates 3 import cycles. +- A7 / SR-11: codec_jsonstore.file_lock must not leak the lock-file handle + if open() raises after makedirs. +- A9 / SR-12: codec_memory_upgrade._conn() must set busy_timeout=5000. +- A10 / SR-13: _bg_watcher poll interval must be 1.0s (not 0.2s). +""" + +import os + +import pytest + + +# ── A1: /api/save_file blocklist ──────────────────────────────────────────── +class TestA1SaveFileBlocklist: + """A1: the /api/save_file HTTP endpoint must refuse writes to ~/.codec/ + and other sensitive roots — mirroring PR-1C's file_write skill blocklist. + Before this fix, an authenticated POST could drop a malicious plugin in + ~/.codec/plugins/ + add its hash to plugins.allowlist → RCE on next + dispatch tick. + """ + + def test_codec_home_refused(self): + from codec_dashboard import _save_file_is_safe + ok, reason = _save_file_is_safe( + os.path.expanduser("~/.codec/plugins/evil.py")) + assert ok is False + assert ".codec" in reason + + def test_codec_skills_refused(self): + from codec_dashboard import _save_file_is_safe + ok, reason = _save_file_is_safe( + os.path.expanduser("~/.codec/skills/evil.py")) + assert ok is False + + def test_codec_audit_refused(self): + from codec_dashboard import _save_file_is_safe + ok, reason = _save_file_is_safe( + os.path.expanduser("~/.codec/audit.log")) + assert ok is False + + def test_codec_config_refused(self): + from codec_dashboard import _save_file_is_safe + ok, reason = _save_file_is_safe( + os.path.expanduser("~/.codec/config.json")) + assert ok is False + + def test_plugins_allowlist_refused(self): + from codec_dashboard import _save_file_is_safe + ok, reason = _save_file_is_safe( + os.path.expanduser("~/.codec/plugins.allowlist")) + assert ok is False + + def test_system_root_refused(self): + from codec_dashboard import _save_file_is_safe + ok, reason = _save_file_is_safe("/etc/passwd") + assert ok is False + + def test_repo_skills_refused(self): + from codec_dashboard import _save_file_is_safe + # codec_dashboard.py lives in the repo root; its sibling skills/ dir + # is in the blocklist. + import codec_dashboard + repo_skills = os.path.join( + os.path.dirname(os.path.abspath(codec_dashboard.__file__)), + "skills", "evil.py") + ok, reason = _save_file_is_safe(repo_skills) + assert ok is False + + @pytest.mark.parametrize("path", [ + "~/codec-workspace/output.txt", + "~/Desktop/note.md", + "~/Documents/draft.txt", + "/tmp/scratch.py", + ]) + def test_legitimate_paths_allowed(self, path): + from codec_dashboard import _save_file_is_safe + ok, reason = _save_file_is_safe(os.path.expanduser(path)) + assert ok is True, f"{path} should be allowed (reason: {reason})" + + def test_sensitive_filename_refused(self): + from codec_dashboard import _save_file_is_safe + # Filename pattern blocklist applies globally regardless of dir. + ok, reason = _save_file_is_safe( + os.path.expanduser("~/Documents/id_rsa")) + assert ok is False + + +# ── A3: License pubkey + state caching ────────────────────────────────────── +class TestA3LicenseCaching: + """A3: _fetch_pubkey and license_state must be cached so a paid edition + doesn't hammer the license server on every skill call. + """ + + def test_invalidate_caches_exists(self): + import codec_license + # The escape hatch for operators rotating their license token. + assert hasattr(codec_license, "_invalidate_caches") + codec_license._invalidate_caches() # should not raise + + def test_pubkey_cache_globals_exist(self): + import codec_license + assert hasattr(codec_license, "_PUBKEY_CACHE_VALUE") + assert hasattr(codec_license, "_PUBKEY_CACHE_TS") + assert hasattr(codec_license, "_PUBKEY_CACHE_TTL") + assert codec_license._PUBKEY_CACHE_TTL >= 3600.0, ( + "Pubkey TTL should be ≥1h to absorb the per-customer call rate") + + def test_license_state_cache_globals_exist(self): + import codec_license + assert hasattr(codec_license, "_LICENSE_STATE_CACHE_VALUE") + assert hasattr(codec_license, "_LICENSE_STATE_CACHE_TS") + assert hasattr(codec_license, "_LICENSE_STATE_CACHE_TTL") + + def test_pubkey_in_memory_cache_returns_same_bytes(self, monkeypatch): + import codec_license + codec_license._invalidate_caches() + # Stub urlopen to count hits. + calls = {"count": 0} + + class _StubResponse: + def __init__(self, data): + self._data = data + def __enter__(self): return self + def __exit__(self, *a): pass + def read(self): return self._data + + def _stub_urlopen(url, timeout): + calls["count"] += 1 + return _StubResponse( + b"-----BEGIN PUBLIC KEY-----\nfake\n-----END PUBLIC KEY-----") + + monkeypatch.setattr(codec_license.urllib.request, + "urlopen", _stub_urlopen) + # First call → network hit. + pem1 = codec_license._fetch_pubkey({}) + assert pem1 is not None + assert calls["count"] == 1 + # Second call within TTL → cache hit, no new network call. + pem2 = codec_license._fetch_pubkey({}) + assert pem2 == pem1 + assert calls["count"] == 1, ( + "Second call should hit the in-memory cache, not the network") + + +# ── A5: Correlation ID contextvars in codec_audit ─────────────────────────── +class TestA5CorrelationIdRefactor: + """A5: both correlation_id contextvars live in codec_audit; codec_agents + and codec_voice re-export them. Eliminates 3 of 4 documented import + cycles. + """ + + def test_correlation_id_canonical_home(self): + import codec_audit + assert hasattr(codec_audit, "_correlation_id_var") + assert hasattr(codec_audit, "_voice_correlation_id_var") + assert hasattr(codec_audit, "_new_correlation_id") + + def test_agents_reexport_identity(self): + # codec_agents must re-export the SAME object (not a copy) so any + # external module that imports codec_agents._correlation_id_var + # gets the canonical contextvar. + import codec_agents + import codec_audit + assert codec_agents._correlation_id_var is codec_audit._correlation_id_var + assert codec_agents._new_correlation_id is codec_audit._new_correlation_id + + def test_voice_reexport_identity(self): + import codec_voice + import codec_audit + assert codec_voice._voice_correlation_id_var is codec_audit._voice_correlation_id_var + + def test_new_correlation_id_format(self): + import codec_audit + cid = codec_audit._new_correlation_id() + assert isinstance(cid, str) + assert len(cid) == 12 # 6 bytes → 12 hex chars + # Lowercase hex only + assert all(c in "0123456789abcdef" for c in cid) + + def test_ask_user_reads_from_codec_audit(self): + """codec_ask_user should not import codec_agents or codec_voice for + the correlation_id — it should read from codec_audit only. + """ + from pathlib import Path + text = Path(__file__).parent.parent.joinpath("codec_ask_user.py").read_text() + # Grep for the legacy cycle-causing imports. + assert "from codec_agents import _correlation_id_var" not in text, ( + "codec_ask_user must not import _correlation_id_var from codec_agents") + assert "from codec_voice import _voice_correlation_id_var" not in text, ( + "codec_ask_user must not import _voice_correlation_id_var from codec_voice") + + +# ── A7: codec_jsonstore.file_lock no handle leak ──────────────────────────── +class TestA7JsonstoreFileLock: + """A7: file_lock must close the lock-sidecar file handle even when + flock or makedirs raises. + """ + + def test_file_lock_closes_handle_on_success(self, tmp_path): + from codec_jsonstore import file_lock + path = tmp_path / "test.json" + with file_lock(path): + assert (tmp_path / "test.json.lock").exists() + # Lock file is still on disk but the handle has been closed by the + # finally block. Open it again to confirm no exclusive lock leaked. + with open(path.as_posix() + ".lock", "w") as f: + f.write("") # would block if a prior LOCK_EX leaked + + def test_file_lock_handle_close_on_exception(self, tmp_path): + from codec_jsonstore import file_lock + path = tmp_path / "exc.json" + # An exception inside the with-block must still close the handle. + with pytest.raises(RuntimeError): + with file_lock(path): + raise RuntimeError("boom") + # Confirm we can re-acquire the lock (no leaked LOCK_EX). + with file_lock(path): + pass # if a prior fhandle leaked LOCK_EX, this would block + + +# ── A9: SQLite busy_timeout consistency ───────────────────────────────────── +class TestA9SqliteBusyTimeout: + """A9: codec_memory_upgrade._conn() must set busy_timeout=5000 to + eliminate intermittent SQLITE_BUSY under concurrent writes. + """ + + def test_memory_upgrade_conn_has_busy_timeout(self, monkeypatch, tmp_path): + # Use a temp DB so we don't touch the real ~/.codec/memory.db + import codec_memory_upgrade + monkeypatch.setattr(codec_memory_upgrade, "DB_PATH", + str(tmp_path / "facts.db")) + c = codec_memory_upgrade._conn() + try: + row = c.execute("PRAGMA busy_timeout").fetchone() + assert row is not None + assert row[0] == 5000, ( + f"busy_timeout should be 5000ms, got {row[0]}") + finally: + c.close() + + +# ── A10: _bg_watcher poll interval ────────────────────────────────────────── +class TestA10BgWatcherPollInterval: + """A10: _bg_watcher must poll at 1.0s, not the legacy 200ms (5x reduction + in syscall load per customer per day). + """ + + def test_bg_watcher_poll_is_1s(self): + from pathlib import Path + text = Path(__file__).parent.parent.joinpath("codec_dashboard.py").read_text() + # Find the _bg_watcher function body and check its sleep call. + assert "async def _bg_watcher" in text + # The 200ms legacy value must be gone. + assert "await asyncio.sleep(0.2)" not in text, ( + "_bg_watcher should not poll every 200ms anymore") + # The 1.0s value should be present in the watcher body. + assert "await asyncio.sleep(1.0)" in text, ( + "_bg_watcher should poll every 1.0s")