v0.10.0

Latest

@github-actions released this 29 Jun 03:03
· 3 commits to main since this release

Removed

  • The optional LLM layer is reduced to a single classify-unknown job (nah-1010).
    Removed the LLM ask-refinement / Layer-2 intent relaxer (the cite-or-ask
    ask → allow path, its tiered risk veto, and every per-action relax opt-in),
    the visible inline lang_exec LLM review, the transcript-reading prompt context,
    and the llm.eligible / llm.deny_limit / llm_risks.py machinery. The optional
    LLM (still off by default; llm.mode: on) can no longer relax a known ask,
    review inline code, or read your conversation — it only classifies unknowns
    (see Added). Claude and Codex share this one path.
  • Removed the LLM write content-review gate (nah-997) that inspected
    Write/Edit/MultiEdit/NotebookEdit and Codex apply_patch payloads as data-at-rest
    and could escalate a clean allow to ask. Write-like tools are now guarded by
    the deterministic floor only — sensitive-path block, project-boundary, and
    destructive-patch checks — which is cheap, clear, and unchanged.
  • Removed the session taint tracking and provenance features entirely
    (src/nah/taint.py, src/nah/provenance.py) along with all runtime wiring
    (Claude hook.py, Codex codex_hooks.py/codex_run.py, terminal guard), the
    taint/provenance config surface, the LLM provenance-review path, and the
    log/message rendering and docs (nah-1009). Both were opt-in and off by default,
    so removal is behavior-neutral for current users; the deterministic classifier,
    LLM classify-unknown path, and the 43 action types are unchanged. The non-headless
    Codex PreToolUse hook is now fully observation-inert (its only job was taint
    state); enforcement still happens at PermissionRequest.
  • Removed deterministic secret-looking and credential-path content scanning, along with
    secret redaction on LLM prompt/transcript context and local post-tool error summaries.
    Secret protection now relies on structural controls such as sensitive paths,
    credential-search detection, and explicit secret-store/env reads
    rather than guessing token-shaped text in write payloads (nah-1006).
  • Removed the /nah-demo Claude Code showcase and its curated cases
    (src/nah/demo_cases.py, src/nah/data/nah_demo.json, the .claude/commands/nah-demo.md
    slash command, and tests/test_nah_demo.py). It was a product demo, not part of the
    guard or the regression suite; pytest remains the coverage source and
    nah audit-threat-model the coverage report.

Added

  • Optional LLM classify-unknown (nah-982, nah-994). When the deterministic
    classifier returns unknown for a Bash command, the optional LLM (still off by
    default; llm.mode: on) maps it to a built-in action type and the kind-tagged
    targets it touches. The mapped type re-enters the normal policy machinery and
    each surfaced path/host target is re-checked through the same deterministic
    floor (sensitive paths, project boundary, known hosts): the LLM extracts, the
    floor matches. db/container targets have no faithfully-mirrorable floor (the
    real db/container floors are policy-/cwd-/exec-specific), so they stay
    unverifiable and the mapped type's policy decides — allow-policy safe reads
    clear, context-policy execs ask (nah-994). A read of ~/.ssh is never
    auto-allowed; an unverifiable target falls back to ask; an obfuscated unknown can
    tighten to block. Fail-closed, process-cached, and command-only (no transcript).
    entry["llm"] records the classify pass alongside a top-level action_type_source
    (deterministic|llm_classify), and nah log gains a --classified filter;
    nah test shows the classification and per-target floor verdicts.
  • Flag-aware env_read classification for shell builtins, ps, and caddy fmt (nah-1005).
    Follow-up to nah-1004 covering the cases a static prefix table can't express because the
    safe and unsafe forms are the same command split by flags:
    • bare env (no inner command), bare set, and bare/-p export/declare/typeset →
      env_read (ask), while their assignment, option (set -x), and exec-wrapper
      (env FOO=bar cmd) forms keep their existing classification.
    • ps with the BSD environment modifier (ps e, ps eww, ps auxe) → env_read, while
      SysV ps -e/-ef (all processes) and value-flag forms (ps -u <user>,
      ps -o pid,etime) correctly stay filesystem_read; the classifier is
      value-flag-aware to avoid false positives.
    • caddy fmt --overwrite → filesystem_write; bare caddy fmt → filesystem_read.
    • Removes the now-redundant static export -p/declare -p/typeset -p entries from the
      env_read table (the builtin classifier owns them).
  • service_inspect and env_read action types; service_read narrowed to remote (nah-1004).
    service_read was overloaded: its static table was 100% local daemon inspection
    (systemctl status, journalctl) while every remote API read (curl GET, gRPC,
    GraphQL) was classified dynamically, so its single context policy fit only the
    remote half and the audit label ("remote API state") was wrong for the local half.
    • service_inspect (policy allow) is the honest home for local service/daemon
      inspection — the systemd entries move here, joined by caddy version/list-modules,
      launchctl list/print, sc query/queryex/qc, rc-status/rc-service -l, and
      service --status-all. It is deliberately kept out of the data-egress
      boundary (local inspection is not network egress).
    • env_read (policy ask) is the honest home for commands whose purpose is
      exposing environment or secret values — printenv, caddy environ,
      systemctl show-environment, export -p/declare -p/typeset -p, and secret-store
      reads (vault read/kv get, aws secretsmanager get-secret-value,
      aws ssm get-parameter, gcloud secrets versions access, az keyvault secret show,
      kubectl get/describe secret, pass show, op read/item get, bw get,
      heroku config, doppler secrets, infisical secrets, chamber read/export,
      sops -d). These were previously unknown → ask, which lied in the audit log and
      fired a wasted LLM classify on every invocation. systemctl show-environment moves
      from a silent service_read → allow to an honest env_read → ask. Name-only listers
      (gh secret list, etc.) are intentionally excluded; secret-injecting exec wrappers
      (op run, doppler run, aws-vault exec) stay on the exec path. Flag-dependent
      forms (bare env/set/export, ps env-flags, caddy fmt --overwrite) are
      deferred to a follow-up (nah-1005). Also classifies crontab -l and caddy validate
      as filesystem_read.
  • talosctl global flag stripping before subcommand classification.
    talosctl -n <ip> get routes, talosctl --nodes=<ip> dmesg, and other talosctl
    commands that carry connection global flags (-n/--nodes, -e/--endpoints,
    -c/--cluster, --context, --talosconfig) now have those flags stripped before
    the global-table prefix match instead of falling through to unknown. Mirrors
    the kubectl/flux idiom and fails closed: unknown or malformed pre-subcommand
    flags stay on the unknown ask path, and dangerous subcommands such as
    talosctl reboot/talosctl reset still classify as configured. Closes #86;
    PR #89 by @srgvg.
  • flux global flag stripping before subcommand classification.
    flux -n <ns> get kustomizations, flux --namespace=<ns> list, and other flux
    commands that carry kubeconfig-style global flags (-n/--namespace, --context,
    --kubeconfig, --timeout, --token, ...) now have those flags stripped before
    the global-table prefix match instead of falling through to unknown. Mirrors
    the kubectl/talosctl idiom and fails closed: unknown or malformed
    pre-subcommand flags stay on the unknown ask path, and destructive subcommands
    such as flux delete/flux uninstall still classify as configured. Closes #87;
    PR #90 by @srgvg.
  • Codex hook-timeout probe. nah run codex --probe[=DELAY] arms a
    debug-only stall in nah's Codex hooks (gated behind NAH_HOOK_PROBE, capped
    at 60s, verdict unchanged) so you can observe the timeout Codex actually
    enforces. nah run codex --measure-hook-timeout drives Codex with the probe
    and reports enforced-vs-configured timeouts, defaulting to PostToolUse (the
    only event that both fires and is enforced under headless codex exec).
    Documented in the CLI reference.
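
The talosctl and flux entries above both describe stripping known global flags before the prefix match, and failing closed on anything unexpected. The following is a minimal sketch of that idea, not nah's actual code: the function name strip_global_flags, the TALOSCTL_VALUE_FLAGS table, and the assumption that every listed flag takes a value are all hypothetical; only the flag names come from the notes.

```python
from typing import List, Optional, Set

# Hypothetical table: talosctl global flags named in the notes above,
# all assumed here to take a value.
TALOSCTL_VALUE_FLAGS: Set[str] = {
    "-n", "--nodes", "-e", "--endpoints", "-c", "--cluster",
    "--context", "--talosconfig",
}


def strip_global_flags(tokens: List[str], value_flags: Set[str]) -> Optional[List[str]]:
    """Drop known global flags that precede the subcommand.

    Returns the reduced token list, or None to signal "leave as unknown"
    (fail closed) when a pre-subcommand flag is unknown or malformed.
    """
    if not tokens:
        return None
    i = 1
    while i < len(tokens) and tokens[i].startswith("-"):
        name, eq, value = tokens[i].partition("=")
        if name not in value_flags:
            return None  # unknown flag: do not guess
        if eq:
            # --flag=value form: long flags only, value must be non-empty
            if not name.startswith("--") or not value:
                return None
            i += 1
        else:
            # --flag value form: the next token must exist and not be a flag
            if i + 1 >= len(tokens) or tokens[i + 1].startswith("-"):
                return None
            i += 2
    if i >= len(tokens):
        return None  # flags but no subcommand
    return [tokens[0]] + tokens[i:]


# The stripped tokens would then go through the ordinary prefix table.
print(strip_global_flags(["talosctl", "-n", "10.0.0.1", "get", "routes"], TALOSCTL_VALUE_FLAGS))
# → ['talosctl', 'get', 'routes']
print(strip_global_flags(["talosctl", "--bogus", "x", "get"], TALOSCTL_VALUE_FLAGS))
# → None
```

Returning None rather than a best-effort guess keeps the sketch fail-closed: a dangerous subcommand such as reboot is still seen as-is after stripping, and a malformed prefix never reaches the table at all.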

Changed

  • Terminal Guard is deterministic-only (nah-985). The interactive bash/zsh
    terminal guard has no LLM step. A command you type directly into your shell is
    already your own intent, so there is no agent transcript to mine — the guard
    classifies to allow / ask / block and an ask is confirmed inline at the
    prompt. The shared llm.mode and targets.bash.llm.mode / targets.zsh.llm.mode
    knobs are still accepted for backward compatibility but no longer affect terminal
    decisions.
  • Container write taxonomy split by verifiable risk axis (nah-996).
    container_write is replaced by container_lifecycle and
    container_build. Lifecycle operations that act on named containers
    (docker stop api, podman restart worker) are context policy and use
    trusted_containers: every flag-free identity must be trusted, while flags,
    dynamic names, and compose lifecycle commands ask. Build/image/infra commands
    (docker build, docker compose build, docker network create) are
    container_build with default allow and no cwd gate; autonomous presets can
    tighten it with actions: {container_build: block}. Legacy
    container_write in actions: fans out to both new types; in classify: it maps
    to the conservative container_lifecycle; and the interactive allow/deny/
    classify/forget commands now ask users to choose one of the new types.
  • Database taxonomy gates SQL-exec capability, not SQL intent (nah-995).
    Replaces db_read/db_write with db_safe/db_exec: structurally-safe
    database surfaces such as dolt log/status/diff/branch and Supabase list/get
    tools are db_safe (allow), while tools that can run caller-supplied SQL
    are db_exec (context) and continue to use db_targets for target-scoped
    allow. The old db_read/db_write config names are accepted as deprecated
    aliases and canonicalized with a one-time warning. The previous
    sqlite3 -readonly and PGOPTIONS/psql -X read-only special cases are
    removed; those invocations now classify as db_exec and ask unless
    db_targets allows the target.
  • Layer 1 classifies into built-in types only (nah-992). The classify-unknown
    pass is not offered the user's custom action types — it maps into the built-in
    taxonomy only. This stops the model from collapsing a whole unknown compound into
    a trusted custom allow type (e.g. a cd repo && molds … && molds wontdo …
    block landing on a custom molds_safe → allow). A custom type the model names
    anyway is coerced to unknown, so the deterministic ask stands.
  • Codex lifecycle commands normalized to nah <command> codex (nah-960).
    nah status codex (read-only preflight), a new top-level nah setup codex,
    and nah uninstall codex now match the install/status shape used by every
    other runtime; nah run codex is unchanged. Breaking: the old
    nah codex doctor / nah codex setup / nah codex remove-setup subcommands
    are removed (no aliases) and exit nonzero — use nah status codex /
    nah setup codex / nah uninstall codex instead. nah status codex also
    fixes a silent no-op (it used to parse and exit 0 with no output) and is
    strictly read-only: it reports missing or stale rules and exits nonzero
    without creating them. nah doctor codex and nah doctor claude now point to
    nah status …. The hook-timeout probe moved from nah codex measure-hook-timeout
    to the nah run codex --measure-hook-timeout debug mode.
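
The container_lifecycle rule above (every flag-free identity must be trusted; flags and dynamic names ask) can be illustrated with a small sketch. This is an assumption-laden illustration rather than nah's implementation: lifecycle_decision, the _PLAIN_NAME pattern used to detect "dynamic" names, and the input shape (the tokens that follow something like docker stop) are all hypothetical. Compose lifecycle commands, which the notes say always ask, are out of scope here.

```python
import re
from typing import Iterable, Set

# Hypothetical test for a "plain" container name; anything else
# (e.g. $NAME, $(cmd), globs) is treated as dynamic in this sketch.
_PLAIN_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9_.-]*$")


def lifecycle_decision(identities: Iterable[str], trusted_containers: Set[str]) -> str:
    """Return "allow" only if every identity is a plain, trusted name."""
    identities = list(identities)
    if not identities:
        return "ask"  # nothing verifiable to trust
    for tok in identities:
        if tok.startswith("-"):
            return "ask"  # any flag defeats the flag-free requirement
        if not _PLAIN_NAME.match(tok):
            return "ask"  # dynamic or otherwise unverifiable name
        if tok not in trusted_containers:
            return "ask"  # one untrusted identity is enough to ask
    return "allow"


trusted = {"api", "worker"}
print(lifecycle_decision(["api"], trusted))            # → allow
print(lifecycle_decision(["api", "db"], trusted))      # → ask
print(lifecycle_decision(["-t", "5", "api"], trusted)) # → ask
```

The point of the sketch is the "all or ask" shape: trust is never partial, so a single flag, dynamic name, or untrusted container drops the whole command to ask.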

Fixed

  • nah test dry-runs no longer self-flag on sensitive paths in their arguments
    (nah-qb3). A nah test invocation whose arguments named a sensitive path as a
    bareword or flag value (e.g. nah test --tool Read ~/.ssh/id_rsa) was flagged by
    nah's own hook as a real sensitive access and paused for approval, even though
    nah test is a pure dry-run classifier with no filesystem or execution side
    effects. The _classify_nah_cli classifier now recognizes nah test and allows
    it without scanning its argument tokens for sensitive paths. Output redirections
    (caught by the redirect guard) and command/process substitutions (classified
    independently upstream) stay guarded, and the exemption is exact-match and
    stage-local, so adjacent stages like nah test foo && rm -rf ~/.ssh are unaffected.
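
The "exact-match and stage-local" property of the nah test exemption can be sketched as follows. This is a simplified illustration under stated assumptions, not the real _classify_nah_cli: the function names, the SENSITIVE_PREFIXES tuple, the pre-tokenized stage input, and the use of "ask" as the guarded verdict are all hypothetical. Redirections and command/process substitutions, which the notes say are handled by other guards, are not modeled.

```python
from typing import List

# Hypothetical stand-in for the sensitive-path floor.
SENSITIVE_PREFIXES = ("~/.ssh", "~/.aws")


def touches_sensitive(tokens: List[str]) -> bool:
    return any(tok.startswith(SENSITIVE_PREFIXES) for tok in tokens)


def classify_stage(tokens: List[str]) -> str:
    # Exact match on the first two tokens: only `nah test ...` is exempt,
    # and its arguments are not scanned for sensitive paths.
    if tokens[:2] == ["nah", "test"]:
        return "allow"
    return "ask" if touches_sensitive(tokens) else "allow"


def classify_command(stages: List[List[str]]) -> str:
    # Stage-local: each stage is judged on its own, and the strictest wins,
    # so an exempt stage cannot shield its neighbours.
    verdicts = [classify_stage(stage) for stage in stages]
    return "ask" if "ask" in verdicts else "allow"


print(classify_command([["nah", "test", "--tool", "Read", "~/.ssh/id_rsa"]]))
# → allow
print(classify_command([["nah", "test", "foo"], ["rm", "-rf", "~/.ssh"]]))
# → ask
```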