diff --git a/AGENTS.md b/AGENTS.md
index 7176d3c3..cffa28c9 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -27,14 +27,13 @@ Fix root cause. Unsure: read more code; if stuck, ask w/ short options. Unrecogn
 | 002-parser | Bash syntax parser design |
 | 003-vfs | Virtual filesystem abstraction |
 | 004-testing | Testing strategy and patterns |
-| 005-builtins | Builtin command implementations |
+| 005-builtins | Builtin command design (trait, ShellRef, ExecutionPlan) |
 | 005-security-testing | Fail-point injection for security testing |
 | 006-threat-model | Security threats and mitigations |
 | 007-parallel-execution | Threading model, Arc usage |
 | 008-documentation | Rustdoc guides, embedded markdown |
-| 008-posix-compliance | POSIX design rationale, security exclusions |
-| 008-release-process | Version tagging, crates.io + PyPI publishing |
-| 009-implementation-status | Feature status, test coverage, limitations |
+| 008-release-process | Version tagging, crates.io + PyPI + npm publishing |
+| 009-implementation-status | Feature status, test coverage, limitations, POSIX compliance |
 | 009-tool-contract | Public LLM Tool trait contract |
 | 010-git-support | Sandboxed git operations on VFS |
 | 011-python-builtin | Embedded Python via Monty, security, resource limits |
@@ -42,6 +41,7 @@ Fix root cause. Unsure: read more code; if stuck, ask w/ short options. Unrecogn
 | 012-maintenance | Pre-release maintenance requirements |
 | 013-python-package | Python package, PyPI wheels, platform matrix |
 | 014-scripted-tool-orchestration | Compose ToolDef+callback pairs into OrchestratorTool via bash scripts |
+| 015-ssh-support | Sandboxed SSH/SCP/SFTP operations |
 | 016-zapcode-runtime | Embedded TypeScript via ZapCode, VFS bridging, resource limits |
 | 017-request-signing | Transparent Ed25519 request signing (bot-auth) per RFC 9421 |
 | 018-interactive-shell | Interactive REPL mode with rustyline line editing |
diff --git a/crates/bashkit/docs/compatibility.md b/crates/bashkit/docs/compatibility.md
index 0753ff2e..9a714a57 100644
--- a/crates/bashkit/docs/compatibility.md
+++ b/crates/bashkit/docs/compatibility.md
@@ -12,7 +12,7 @@
 ## POSIX Shell Compliance
 
 Bashkit provides substantial compliance with IEEE Std 1003.1-2024 (POSIX.1-2024)
-Shell Command Language. See [specs/008-posix-compliance.md](../specs/008-posix-compliance.md)
+Shell Command Language. See [specs/009-implementation-status.md](../specs/009-implementation-status.md)
 for detailed compliance status.
 
 | POSIX Category | Status |
diff --git a/docs/security.md b/docs/security.md
index d7f6badd..a9fdd97f 100644
--- a/docs/security.md
+++ b/docs/security.md
@@ -50,7 +50,7 @@ the sandbox. Key exclusions:
 - **`trap`** — conflicts with the stateless execution model
 - **Real process spawning** — all subprocess commands stay within the virtual interpreter (`TM-ESC-015`)
 
-These decisions are documented in [`specs/008-posix-compliance.md`](../specs/008-posix-compliance.md).
+These decisions are documented in [`specs/009-implementation-status.md`](../specs/009-implementation-status.md).
 
 ## Security testing
 
diff --git a/specs/001-architecture.md b/specs/001-architecture.md
index 84d73cc9..2a6f1158 100644
--- a/specs/001-architecture.md
+++ b/specs/001-architecture.md
@@ -9,103 +9,22 @@ Implemented
 
 Bashkit uses a Cargo workspace with multiple crates:
 
-```
-bashkit/
-├── crates/
-│   ├── bashkit/           # Core library
-│   └── bashkit-cli/       # CLI binary (future)
-├── specs/                 # Design specifications
-├── tests/                 # Integration tests (future)
-└── Cargo.toml            # Workspace root
-```
-
-### Core Library Structure (`crates/bashkit/`)
-
-```
-src/
-├── lib.rs                # Public API: Bash struct
-├── error.rs              # Error types
-├── limits.rs             # Execution limits
-├── tool.rs               # LLM tool contract (Tool, ToolBuilder)
-├── parser/               # Lexer + Parser + AST
-│   ├── mod.rs           # Parser implementation
-│   ├── lexer.rs         # Tokenization
-│   ├── tokens.rs        # Token types
-│   └── ast.rs           # AST node types
-├── interpreter/          # Execution engine
-│   ├── mod.rs           # Interpreter implementation
-│   ├── glob.rs          # Glob pattern matching and expansion
-│   ├── state.rs         # ExecResult and state types
-│   └── jobs.rs          # Job table for background execution
-├── fs/                   # Virtual filesystem
-│   ├── mod.rs           # Module exports
-│   ├── traits.rs        # FileSystem trait
-│   └── memory.rs        # InMemoryFs implementation
-├── network/              # Network access (optional)
-│   ├── mod.rs           # Module exports
-│   ├── allowlist.rs     # URL allowlist
-│   └── client.rs        # HTTP client
-└── builtins/            # Built-in commands
-    ├── mod.rs           # Builtin trait + Context
-    ├── echo.rs          # echo, printf
-    ├── flow.rs          # true, false, exit, break, continue, return
-    ├── navigation.rs    # cd, pwd
-    ├── fileops.rs       # mkdir, rm, cp, mv, touch, chmod
-    ├── headtail.rs      # head, tail
-    ├── sortuniq.rs      # sort, uniq
-    ├── cuttr.rs         # cut, tr
-    ├── wc.rs            # wc
-    ├── date.rs          # date
-    ├── sleep.rs         # sleep
-    ├── wait.rs          # wait
-    ├── curl.rs          # curl, wget
-    └── ...              # grep, sed, awk, jq, etc.
-```
+| Crate | Purpose |
+|-------|---------|
+| `crates/bashkit/` | Core library (parser, interpreter, VFS, builtins, tool contract) |
+| `crates/bashkit-cli/` | CLI binary |
+| `crates/bashkit-python/` | Python bindings (PyO3) |
+| `crates/bashkit-js/` | JavaScript bindings (NAPI-RS) |
+| `crates/bashkit-eval/` | LLM evaluation harness |
+
+The core library modules are `parser/`, `interpreter/`, `fs/`, `builtins/`,
+`network/`, `git/`, `ssh/`, and `scripted_tool/`. See the source for the
+current structure; it evolves as features are added.
 
 ### Public API
 
-```rust
-// Main entry point
-pub struct Bash {
-    fs: Arc<dyn FileSystem>,
-    interpreter: Interpreter,
-}
-
-impl Bash {
-    pub fn new() -> Self;
-    pub fn builder() -> BashBuilder;
-    pub async fn exec(&mut self, script: &str) -> Result<ExecResult>;
-}
-
-pub struct ExecResult {
-    pub stdout: String,
-    pub stderr: String,
-    pub exit_code: i32,
-}
-
-// LLM Tool Contract
-pub trait Tool: Send + Sync {
-    fn name(&self) -> &str;
-    fn short_description(&self) -> &str;
-    fn description(&self) -> String;          // Dynamic, includes custom builtins
-    fn help(&self) -> String;              // Full docs for LLMs
-    fn system_prompt(&self) -> String;        // Token-efficient for sysprompt
-    fn input_schema(&self) -> serde_json::Value;
-    fn output_schema(&self) -> serde_json::Value;
-    fn version(&self) -> &str;
-    async fn execute(&mut self, req: ToolRequest) -> ToolResponse;
-    async fn execute_with_status(...) -> ToolResponse;
-}
-
-pub struct BashTool { /* virtual bash implementing Tool */ }
-pub struct BashToolBuilder { /* builder pattern */ }
-pub struct ToolRequest { commands: String }   // Like bash -c
-pub struct ToolResponse { stdout, stderr, exit_code, error }
-
-impl BashTool {
-    pub fn builder() -> BashToolBuilder;
-}
-```
+The main entry points are `Bash` (library) and `BashTool` (LLM tool contract).
+See `crates/bashkit/src/lib.rs` for the full public API surface.
 
 ### Design Principles
 
@@ -119,24 +38,18 @@ impl BashTool {
 ### Single crate vs workspace
 Rejected single crate because:
 - CLI binary would bloat the library
-- Python package needs separate crate
+- Python/JS packages need separate crates
 - Cleaner separation of concerns
 
 ### Sync vs async filesystem
 Rejected sync because:
-- just-bash is fully async
-- Future network operations need async
+- Bashkit is fully async
+- Network operations need async
 - tokio is already a dependency
 
 ## Verification
 
 ```bash
-# Build succeeds
 cargo build
-
-# Tests pass including e2e
 cargo test
-
-# Basic usage works
-cargo test --lib -- tests::test_echo_hello
 ```
diff --git a/specs/002-parser.md b/specs/002-parser.md
index 0be2dfff..64017a54 100644
--- a/specs/002-parser.md
+++ b/specs/002-parser.md
@@ -13,93 +13,8 @@ Bashkit uses a recursive descent parser with a context-aware lexer.
 Input → Lexer → Tokens → Parser → AST
 ```
 
-### Token Types
-
-```rust
-pub enum Token {
-    // Literals
-    Word(String),           // Unquoted or quoted words
-    Number(i64),            // Integer literals
-
-    // Operators
-    Pipe,                   // |
-    And,                    // &&
-    Or,                     // ||
-    Semicolon,              // ;
-    Newline,                // \n
-    Background,             // &
-
-    // Redirections
-    RedirectOut,            // >
-    RedirectAppend,         // >>
-    RedirectIn,             // <
-    HereDoc,                // <<
-    HereString,             // <<<
-    ProcessSubIn,           // <(
-    ProcessSubOut,          // >(
-
-    // File descriptor redirections
-    RedirectBoth,           // &>
-    DupOutput,              // >&
-    RedirectFd(i32),        // 2>
-    RedirectFdAppend(i32),  // 2>>
-    DupFd(i32, i32),        // 2>&1
-
-    // Grouping
-    LeftParen,              // (
-    RightParen,             // )
-    LeftBrace,              // {
-    RightBrace,             // }
-    DoubleLeftBracket,      // [[
-    DoubleRightBracket,     // ]]
-
-    // Keywords (detected after lexing)
-    // if, then, else, elif, fi
-    // for, while, until, do, done
-    // case, esac, in
-    // function
-}
-```
-
-### AST Structure
-
-```rust
-pub struct Script {
-    pub commands: Vec<Command>,
-}
-
-pub enum Command {
-    Simple(SimpleCommand),      // ls -la
-    Pipeline(Pipeline),         // cmd1 | cmd2 | cmd3
-    List(CommandList),          // cmd1 && cmd2 || cmd3
-    Compound(CompoundCommand),  // if, for, while, case, { }, ( )
-    Function(FunctionDef),      // function foo() { }
-}
-
-pub struct SimpleCommand {
-    pub name: Word,
-    pub args: Vec<Word>,
-    pub redirects: Vec<Redirect>,
-    pub assignments: Vec<Assignment>,  // VAR=value cmd
-}
-
-pub struct Word {
-    pub parts: Vec<WordPart>,
-}
-
-pub enum WordPart {
-    Literal(String),
-    Variable(String),           // $VAR
-    CommandSub(Script),         // $(cmd) or `cmd`
-    ArithmeticSub(String),      // $((expr))
-    DoubleQuoted(Vec<WordPart>), // "text $var"
-    Length(String),             // ${#var}
-    ArrayAccess { name, index },// ${arr[i]} or ${arr[@]}
-    ArrayLength(String),        // ${#arr[@]}
-    ArrayIndices(String),       // ${!arr[@]}
-    ParameterExpansion { ... }, // ${var:-default}, ${var#pattern}, etc.
-}
-```
+Token types, AST structures, and parser grammar are defined in
+`crates/bashkit/src/parser/`. They evolve as features are added.
 
 ### Parser Rules (Simplified)
 
@@ -113,12 +28,9 @@ redirect      → ('>' | '>>' | '<' | '<<' | '<<<') word
                | NUMBER ('>' | '<') word
 ```
 
-Note: The `&` operator marks the preceding command for background execution.
-Currently, background commands run synchronously but are parsed correctly.
-
 ### Context-Aware Lexing
 
-The lexer must handle bash's context-sensitivity:
+The lexer handles bash's context-sensitivity:
 - `$var` in double quotes: expand variable
 - `$var` in single quotes: literal text
 - Word splitting after expansion
@@ -128,51 +40,26 @@ The lexer must handle bash's context-sensitivity:
 
 ### Arithmetic Expressions
 
-Arithmetic expansion `$((expr))` supports:
-- Basic operators: `+`, `-`, `*`, `/`, `%`
-- Comparison: `==`, `!=`, `<`, `>`, `<=`, `>=`
-- Logical: `&&`, `||` (with short-circuit evaluation)
-- Bitwise: `&`, `|`, `^`, `~`, `<<`, `>>`
-- Ternary: `cond ? true : false`
-- Variable references with or without `$` prefix
+`$((expr))` supports: `+`, `-`, `*`, `/`, `%`, comparisons, logical `&&`/`||`
+(short-circuit), bitwise operators, ternary `?:`, variable references.
 
 ### Error Recovery
 
-Parser produces errors with:
-- Line and column numbers
-- Expected vs. found token
-- Context (what was being parsed)
+The parser reports errors with line/column numbers, the expected vs. found
+token, and context (what was being parsed).
 
 ## Alternatives Considered
 
 ### PEG parser (pest, pom)
-Rejected because:
-- Bash grammar is context-sensitive
-- PEG can't handle here-docs well
-- Manual parser gives better error messages
+Rejected: Bash grammar is context-sensitive, PEG handles here-docs poorly,
+and a manual parser gives better error messages.
 
 ### Tree-sitter
-Rejected because:
-- Designed for incremental parsing (overkill)
-- Would add large dependency
-- Harder to customize for our needs
+Rejected: designed for incremental parsing (overkill), adds a large dependency,
+and is harder to customize.
 
 ## Verification
 
-```rust
-#[test]
-fn test_parse_pipeline() {
-    let parser = Parser::new("echo hello | cat");
-    let script = parser.parse().unwrap();
-    assert!(matches!(script.commands[0], Command::Pipeline(_)));
-}
-
-#[test]
-fn test_parse_redirect() {
-    let parser = Parser::new("echo hello > /tmp/out");
-    let script = parser.parse().unwrap();
-    if let Command::Simple(cmd) = &script.commands[0] {
-        assert_eq!(cmd.redirects.len(), 1);
-    }
-}
+```bash
+cargo test --lib -- parser
 ```
diff --git a/specs/003-vfs.md b/specs/003-vfs.md
index 52c11c6b..b2c201d4 100644
--- a/specs/003-vfs.md
+++ b/specs/003-vfs.md
@@ -12,269 +12,90 @@ Bashkit uses a two-layer filesystem abstraction:
 | Backend | `FsBackend` | Raw storage operations (minimal contract) |
 | POSIX | `FileSystem` / `PosixFs` | POSIX-like semantics enforcement |
 
-### FsBackend Trait (Raw Storage)
-
-```rust
-#[async_trait]
-pub trait FsBackend: Send + Sync {
-    async fn read(&self, path: &Path) -> Result<Vec<u8>>;
-    async fn write(&self, path: &Path, content: &[u8]) -> Result<()>;
-    async fn append(&self, path: &Path, content: &[u8]) -> Result<()>;
-    async fn mkdir(&self, path: &Path, recursive: bool) -> Result<()>;
-    async fn remove(&self, path: &Path, recursive: bool) -> Result<()>;
-    async fn stat(&self, path: &Path) -> Result<Metadata>;
-    async fn read_dir(&self, path: &Path) -> Result<Vec<DirEntry>>;
-    async fn exists(&self, path: &Path) -> Result<bool>;
-    async fn rename(&self, from: &Path, to: &Path) -> Result<()>;
-    async fn copy(&self, from: &Path, to: &Path) -> Result<()>;
-    async fn symlink(&self, target: &Path, link: &Path) -> Result<()>;
-    async fn read_link(&self, path: &Path) -> Result<PathBuf>;
-    async fn chmod(&self, path: &Path, mode: u32) -> Result<()>;
-}
-```
-
 `FsBackend` handles raw storage without enforcing POSIX semantics.
 Implementations can be wrapped with `PosixFs` to get type-safe behavior.
 
-### FileSystem Trait (POSIX Semantics)
+See `crates/bashkit/src/fs/` for trait definitions and implementations.
 
-```rust
-#[async_trait]
-pub trait FileSystem: Send + Sync {
-    // Read operations
-    async fn read_file(&self, path: &Path) -> Result<Vec<u8>>;
-    async fn stat(&self, path: &Path) -> Result<Metadata>;
-    async fn read_dir(&self, path: &Path) -> Result<Vec<DirEntry>>;
-    async fn exists(&self, path: &Path) -> Result<bool>;
-    async fn read_link(&self, path: &Path) -> Result<PathBuf>;
-
-    // Write operations
-    async fn write_file(&self, path: &Path, content: &[u8]) -> Result<()>;
-    async fn append_file(&self, path: &Path, content: &[u8]) -> Result<()>;
-    async fn mkdir(&self, path: &Path, recursive: bool) -> Result<()>;
-    async fn remove(&self, path: &Path, recursive: bool) -> Result<()>;
-    async fn rename(&self, from: &Path, to: &Path) -> Result<()>;
-    async fn copy(&self, from: &Path, to: &Path) -> Result<()>;
-    async fn symlink(&self, target: &Path, link: &Path) -> Result<()>;
-    async fn chmod(&self, path: &Path, mode: u32) -> Result<()>;
-}
-```
-
-### Public API
+### Which Trait Should I Implement?
 
-All filesystem types are exported from the crate root:
-
-```rust
-pub use bashkit::{
-    // Traits for implementing custom filesystems
-    async_trait,       // Re-exported for convenience
-    FsBackend,         // Low-level storage trait (implement this)
-    FileSystem,        // High-level POSIX trait
-
-    // Types
-    FileType,          // File/Directory/Symlink enum
-    Metadata,          // File metadata struct
-    DirEntry,          // Directory entry struct
-    Result,            // Result type alias
-    Error,             // Error type
-
-    // Built-in implementations
-    InMemoryFs,        // In-memory filesystem (implements FileSystem)
-    OverlayFs,         // Copy-on-write overlay
-    MountableFs,       // Multiple mount points
-    PosixFs,           // POSIX wrapper for any FsBackend
-};
 ```
-
-### Metadata
-
-```rust
-pub struct Metadata {
-    pub file_type: FileType,
-    pub size: u64,
-    pub mode: u32,        // Unix permissions
-    pub modified: SystemTime,
-    pub created: SystemTime,
-}
-
-pub enum FileType {
-    File,
-    Directory,
-    Symlink,
-}
-
-pub struct DirEntry {
-    pub name: String,
-    pub metadata: Metadata,
-}
+Do you need a custom filesystem?
+    │
+    ├─ NO → Use InMemoryFs (default with Bash::new())
+    │
+    └─ YES → Is your storage simple (key-value, database, cloud)?
+              │
+              ├─ YES → Implement FsBackend + wrap with PosixFs
+              │        (POSIX checks are automatic, less code)
+              │
+              └─ NO → Implement FileSystem directly
+                      (full control, you handle all checks)
 ```
 
-#### File Size Reporting
-
-The `size` field in `Metadata` must be set correctly for builtins like `ls -l`, `stat`, and `test -s` to work:
-
-| Entry Type | Expected `size` Value |
-|------------|----------------------|
-| Regular file | Actual content length in bytes |
-| Empty file | `0` |
-| Directory | `0` (not the size of contents) |
-| Symlink | `0` or length of target path |
-
-**Critical for custom implementations**: Both `stat()` and `read_dir()` must return consistent, accurate sizes. The `DirEntry.metadata.size` returned by `read_dir()` is used directly by `ls -l` without additional `stat()` calls.
-
-**Common pitfall**: When implementing `read_dir()`, ensure directory entries report size `0`, not the size of files within them. See `tests/custom_fs_tests.rs` for reference implementation.
+| Approach | Implement | POSIX Checks | Best For |
+|----------|-----------|--------------|----------|
+| `FsBackend` + `PosixFs` | Raw storage only | Automatic | Databases, cloud, key-value stores |
+| `FileSystem` directly | Everything | Manual | Complex caching, custom semantics |
 
 ### Implementations
 
 #### InMemoryFs
-- All files stored in `HashMap<PathBuf, FsEntry>`
-- Thread-safe via `RwLock`
+- All files stored in `HashMap<PathBuf, FsEntry>`, thread-safe via `RwLock`
 - Initial directories: `/`, `/tmp`, `/home`, `/home/user`, `/dev`
 - Special handling for `/dev/null`, `/dev/urandom`, `/dev/random`
-- No persistence - state lost on drop
-- Synchronous `add_file()` method for pre-population during construction
+- No persistence — state lost on drop
 
 ##### Mounting Files with BashBuilder
 
-The `BashBuilder` provides convenience methods to mount files in the virtual filesystem:
-
 ```rust
 let mut bash = Bash::builder()
-    // Writable file (mode 0o644)
     .mount_text("/config/app.conf", "debug=true\nport=8080\n")
-    // Readonly file (mode 0o444)
     .mount_readonly_text("/etc/version", "1.2.3")
     .build();
 ```
 
-- `mount_text(path, content)` - Creates writable file (mode `0o644`)
-- `mount_readonly_text(path, content)` - Creates readonly file (mode `0o444`)
-- Parent directories are created automatically
-- Works with any filesystem - mounted files are added via an OverlayFs layer
-
-When mounted files are specified, they're added to an OverlayFs layer on top of
-the base filesystem. This means:
-- The base filesystem remains unchanged
-- Mounted files take precedence over base filesystem files
-- Works with both default InMemoryFs and custom filesystems
-
-Use cases:
-- Configuration files for scripts to read
-- Reference data that shouldn't be modified
-- Pre-seeding test fixtures
-- Simulating system files like `/etc/passwd`
-
-See `examples/text_files.rs` for comprehensive examples.
-
 #### OverlayFs
 - Copy-on-write layer over another FileSystem
-- Read from base, write to overlay
 - Whiteout tracking for deleted files
 - Useful for: temp modifications, testing, isolation
 
-```rust
-pub struct OverlayFs {
-    lower: Arc<dyn FileSystem>,
-    upper: InMemoryFs,
-    whiteouts: RwLock<HashSet<PathBuf>>,
-}
-```
-
 #### MountableFs
 - Mount multiple filesystems at different paths
-- Like Unix mount points
 - Longest-prefix matching for nested mounts
 - Always used as outermost FS layer for live mount/unmount support
 
-```rust
-pub struct MountableFs {
-    root: Arc<dyn FileSystem>,
-    mounts: RwLock<BTreeMap<PathBuf, Arc<dyn FileSystem>>>,
-}
-```
-
 #### RealFs (Optional, `realfs` feature)
 - Direct access to a host directory as an `FsBackend`
-- Gated behind `realfs` feature flag (breaks sandbox boundary)
 - Two modes: `ReadOnly` (safe) and `ReadWrite` (dangerous)
 - Path traversal prevented via canonicalization + root prefix check
-- Use with `PosixFs` wrapper for POSIX semantics
-
-```rust
-pub struct RealFs {
-    root: PathBuf,      // Canonicalized host directory
-    mode: RealFsMode,   // ReadOnly or ReadWrite
-}
-```
 
 ##### Builder Methods
 
 ```rust
-// Overlay host dir at VFS root (readonly)
 Bash::builder().mount_real_readonly("/path/to/dir")
-// Mount host dir at specific VFS path (readonly)
 Bash::builder().mount_real_readonly_at("/path/to/dir", "/mnt/data")
-// Same with read-write access
 Bash::builder().mount_real_readwrite("/path/to/dir")
-Bash::builder().mount_real_readwrite_at("/path/to/dir", "/mnt/workspace")
 ```
 
 ##### CLI Usage
 
 ```bash
-# Readonly mount
 bashkit --mount-ro /path/to/project -c 'cat /README.md'
-bashkit --mount-ro /path/to/data:/mnt/data -c 'cat /mnt/data/file.txt'
-
-# Read-write mount (WARNING: breaks sandbox)
 bashkit --mount-rw /path/to/output:/mnt/output -c 'echo result > /mnt/output/result.txt'
 ```
 
-##### Layering
-
-When using root overlays (no mount point), the real FS becomes the lower
-layer of an `OverlayFs`. Writes go to the in-memory upper layer:
-
-```text
-┌──────────────────┐
-│  OverlayFs       │
-│  ┌─────────────┐ │  writes go here (in-memory)
-│  │ Upper layer  │ │
-│  └─────────────┘ │
-│  ┌─────────────┐ │  reads fall through here (real FS)
-│  │ RealFs      │ │
-│  └─────────────┘ │
-└──────────────────┘
-```
-
-When using mount points, `MountableFs` routes paths to the appropriate FS:
-
-```text
-┌─────────────────────────────────────┐
-│ MountableFs                         │
-│  /          → InMemoryFs (default)  │
-│  /mnt/data  → PosixFs<RealFs>      │
-│  /mnt/out   → PosixFs<RealFs(rw)>  │
-└─────────────────────────────────────┘
-```
-
-See `crates/bashkit/examples/realfs_readonly.rs`,
-`crates/bashkit/examples/realfs_readwrite.rs`, and
-`examples/realfs_mount.sh` (bash script calling bashkit CLI).
-
 #### Live Mount/Unmount
 
-Every `Bash` instance wraps its filesystem stack in a `MountableFs` as
-the outermost layer. This enables post-build mount/unmount without
-rebuilding the interpreter:
+Every `Bash` instance wraps its filesystem stack in a `MountableFs`. This
+enables post-build mount/unmount without rebuilding the interpreter:
 
 ```rust
-bash.mount("/mnt/data", data_fs)?;   // attach filesystem
-bash.unmount("/mnt/data")?;          // detach filesystem
+bash.mount("/mnt/data", data_fs)?;
+bash.unmount("/mnt/data")?;
 ```
 
-Shell state (env vars, cwd, history) is fully preserved across operations.
-The full layered stack:
+### FS Layering Stack
 
 ```text
 ┌──────────────────────────────────┐
@@ -288,304 +109,55 @@ The full layered stack:
 └──────────────────────────────────┘
 ```
 
-See `crates/bashkit/examples/live_mounts.rs` and
-`crates/bashkit/docs/live_mounts.md` for details.
-
-### Custom FileSystem Implementations
-
-#### Which Trait Should I Use?
-
-```
-Do you need a custom filesystem?
-    │
-    ├─ NO → Use InMemoryFs (default with Bash::new())
-    │
-    └─ YES → Is your storage simple (key-value, database, cloud)?
-              │
-              ├─ YES → Implement FsBackend + wrap with PosixFs
-              │        (POSIX checks are automatic, less code)
-              │
-              └─ NO → Implement FileSystem directly
-                      (full control, you handle all checks)
-```
-
-| Approach | Implement | POSIX Checks | Best For |
-|----------|-----------|--------------|----------|
-| `FsBackend` + `PosixFs` | Raw storage only | Automatic | Databases, cloud, key-value stores |
-| `FileSystem` directly | Everything | Manual | Complex caching, custom semantics |
-
-#### Option 1: FsBackend + PosixFs (Recommended)
-
-Best for simple storage backends. You implement raw read/write/list operations,
-and `PosixFs` handles all POSIX semantics (type checks, parent directories).
-
-```rust
-use bashkit::{async_trait, FsBackend, PosixFs, Bash, Result, Metadata, DirEntry};
-use std::sync::Arc;
-
-struct MyStorage { /* your storage */ }
-
-#[async_trait]
-impl FsBackend for MyStorage {
-    async fn read(&self, path: &Path) -> Result<Vec<u8>> { /* ... */ }
-    async fn write(&self, path: &Path, content: &[u8]) -> Result<()> { /* ... */ }
-    async fn stat(&self, path: &Path) -> Result<Metadata> { /* ... */ }
-    // ... other methods (no POSIX checks needed)
-}
-
-// Wrap with PosixFs - POSIX semantics are automatic
-let fs = Arc::new(PosixFs::new(MyStorage::new()));
-let mut bash = Bash::builder().fs(fs).build();
-```
-
-See `examples/custom_backend.rs` for a complete working example.
-
-#### Size Reporting Checklist
-
-When implementing a custom filesystem, verify:
-
-1. **`stat()` returns correct size** - File size matches content length
-2. **`read_dir()` entries have correct sizes** - Each `DirEntry.metadata.size` is accurate
-3. **Directories always report size 0** - Never inherit child file sizes
-4. **Sizes are consistent** - `stat(path).size == read_dir(parent).find(name).metadata.size`
-
-#### Option 2: FileSystem Directly
-
-Best for complex behavior requiring full control. You must implement all
-POSIX checks manually.
-
-```rust
-use bashkit::{async_trait, FileSystem, fs::fs_errors, Result};
-
-struct MyFs { /* ... */ }
-
-#[async_trait]
-impl FileSystem for MyFs {
-    async fn write_file(&self, path: &Path, content: &[u8]) -> Result<()> {
-        // YOU must check: is path a directory?
-        if self.is_directory(path) {
-            return Err(fs_errors::is_a_directory());
-        }
-        // YOU must check: does parent exist?
-        if !self.parent_exists(path) {
-            return Err(fs_errors::parent_not_found());
-        }
-        // Now write...
-    }
-    // ... other methods with manual POSIX checks
-}
-
-let fs = Arc::new(MyFs::new());
-let mut bash = Bash::builder().fs(fs).build();
-```
-
-See `examples/custom_filesystem_impl.rs` for a complete working example.
-
-#### Use Cases
-
-- **Session file stores** - Bridge to external storage during execution
-- **Database-backed filesystems** - Store files in a database
-- **Remote filesystems** - Access files over network protocols
-- **Cached filesystems** - Add caching layers
-
-### Path Handling
-
-All paths normalized before use:
-- Resolve `.` and `..` components
-- Remove trailing slashes
-- Ensure absolute paths start with `/`
-
-```rust
-fn normalize_path(path: &Path) -> PathBuf {
-    // "/foo/../bar/./baz" -> "/bar/baz"
-}
-```
-
-### Symlink Handling
-
-Symlinks are stored but intentionally not followed for security:
-- **TM-ESC-002**: Prevents symlink escape attacks
-- **TM-DOS-011**: Prevents symlink loop DoS attacks
-
-See `specs/006-threat-model.md` for details.
-
 ### Special Device Files
 
 #### /dev/null
-
-`/dev/null` is handled at the **interpreter level**, not the filesystem level. This is a security-critical design decision:
-
-- **Writes**: All output redirected to `/dev/null` is silently discarded
-- **Reads**: Reading from `/dev/null` returns EOF (empty content)
-- **Path normalization**: Handles bypass attempts like `/dev/../dev/null` or `/dev/./null`
-
-**Security Property**: Custom filesystem implementations, overlays, and mount points **cannot** intercept or modify `/dev/null` behavior. This prevents:
-- Malicious filesystems from capturing sensitive data written to `/dev/null`
-- Unexpected behavior when mounting custom filesystems at `/dev`
-- Path traversal attacks attempting to bypass `/dev/null` handling
-
-```rust
-// These all behave identically - output is discarded at interpreter level
-echo "secret" > /dev/null           // Discarded
-echo "secret" > /dev/../dev/null    // Discarded (normalized)
-echo "secret" > /dev/./null         // Discarded (normalized)
-```
+Handled at the **interpreter level**, not the filesystem level. This is
+security-critical: custom filesystem implementations cannot intercept
+`/dev/null` behavior, and path normalization handles bypass attempts.
 
 #### /dev/urandom and /dev/random
+Handled at the filesystem level: each read returns 8192 bytes of random data
+(bounded to prevent unbounded memory growth).
 
-Both `/dev/urandom` and `/dev/random` are handled at the **filesystem level** in `InMemoryFs::read_file`:
+### File Size Reporting
 
-- **Reads**: Return 8192 bytes of random data per read (THREAT[TM-DOS-003])
-- **Writes**: Accepted and discarded (like writing to the real device)
-- **Bounded size**: Prevents `x=$(cat /dev/urandom)` from causing unbounded memory growth
-
-```bash
-# These work as expected
-od -An -N8 -tx1 /dev/urandom    # 8 random hex bytes
-head -c 16 /dev/urandom | xxd   # 16 random bytes in hex
-```
+`Metadata.size` must be correct for `ls -l`, `stat`, and `test -s` to work:
+- Regular files: actual content length
+- Empty files: 0
+- Directories: always 0
+- Both `stat()` and `read_dir()` must return consistent sizes
 
 ### POSIX Semantics Contract
 
-All `FileSystem` implementations MUST enforce these POSIX-like semantics:
-
-1. **No duplicate names**: A file and directory cannot share the same path.
-   - `write_file` on existing directory → error "is a directory"
-   - `mkdir` on existing file → error "already exists"
-
-2. **Type-safe operations**:
-   - `write_file`/`append_file` MUST fail if path is a directory
-   - `mkdir` MUST fail if path exists (file, dir, or symlink), unless `recursive=true` and existing is a directory
-   - `read_dir` MUST fail if path is not a directory
-
-3. **Parent directory requirement**:
-   - Write operations require parent directory to exist
-   - Exception: `mkdir` with `recursive=true` creates parents
-
-Custom implementations can use the `fs_errors` module for consistent error messages:
-
-```rust
-use bashkit::fs::fs_errors;
-
-// In write_file implementation:
-if path_is_directory {
-    return Err(fs_errors::is_a_directory());
-}
-```
+All `FileSystem` implementations MUST enforce:
+1. No duplicate names (a file and a directory can't share a path)
+2. Type-safe operations (`write_file` on a directory → error)
+3. Parent directory requirement (exception: `mkdir` with `recursive=true`)
 
-### Error Handling
+### Symlink Handling
 
-All operations return `Result<T>` with IO errors:
-- `NotFound` - file/directory doesn't exist
-- `AlreadyExists` - path already exists (mkdir, file exists at path)
-- `Other("is a directory")` - operation not valid for directories (write to dir)
-- `Other("not a directory")` - expected directory (read_dir on file)
-- `Other("directory not empty")` - non-recursive delete of non-empty dir
+Symlinks are stored but intentionally not followed for security:
+- Prevents symlink escape attacks (TM-ESC-002)
+- Prevents symlink loop DoS (TM-DOS-011)
 
 ## Binding API Parity
 
-All language bindings (Rust builder, Node/JS, Python) must expose the same
-conceptual mount API. The Rust builder is the canonical internal API; bindings
-map their idiomatic config shapes onto it.
-
-### Unified config shape
+All language bindings must expose the same mount API:
 
 ```
-files:  { "/path": "content" }                                  # text files (writable, in-memory)
-mounts: [{ host_path, vfs_path?, writable? }]                   # real FS (read-only by default)
+files:  { "/path": "content" }                # text files (writable, in-memory)
+mounts: [{ host_path, vfs_path?, writable? }] # real FS (read-only by default)
 ```
 
-### Runtime methods
-
-All bindings expose:
-
-- `mount(host_path, vfs_path, writable=false)` — mount real FS at runtime
-- `unmount(vfs_path)` — unmount
-
-### Per-binding mapping
+Runtime methods: `mount(host_path, vfs_path, writable=false)`, `unmount(vfs_path)`.
 
-| Config field | Rust builder | Node/JS | Python |
-|---|---|---|---|
-| `files` | `mount_text()` | `files: Record<string, FileValue>` | `files: dict[str, str]` |
-| `mounts` | `mount_real_readonly[_at]()` / `mount_real_readwrite[_at]()` | `mounts: [{ hostPath, vfsPath?, writable? }]` | `mounts: [{ "host_path", "vfs_path"?, "writable"? }]` |
-| runtime mount | `Bash::mount(path, fs)` | `bash.mount(host, vfs, writable?)` | `bash.mount(vfs, FileSystem.real(host, writable))` |
-| runtime unmount | `Bash::unmount(path)` | `bash.unmount(vfs)` | `bash.unmount(vfs)` |
-
-### Safety defaults
-
-- Real mounts are **read-only by default** — `writable` defaults to `false`
-- Text files are writable (in-memory, sandboxed, no security risk)
-- Callers must explicitly opt-in to writable mounts
+Safety: real mounts are **read-only by default**. Text files are writable (sandboxed).
 
 ## Alternatives Considered
 
 ### Real filesystem with chroot
-Rejected because:
-- Requires root privileges
-- Not portable across OSes
-- Doesn't work in WASM
+Rejected: requires root, not portable, doesn't work in WASM.
 
 ### tokio::fs wrapper
-Rejected because:
-- Always hits real filesystem
-- Can't isolate or virtualize
-- No multi-tenant isolation
-
-## Verification
-
-```rust
-#[tokio::test]
-async fn test_write_and_read() {
-    let fs = InMemoryFs::new();
-    fs.write_file(Path::new("/tmp/test"), b"hello").await.unwrap();
-    let content = fs.read_file(Path::new("/tmp/test")).await.unwrap();
-    assert_eq!(content, b"hello");
-}
-
-#[tokio::test]
-async fn test_mkdir_recursive() {
-    let fs = InMemoryFs::new();
-    fs.mkdir(Path::new("/a/b/c"), true).await.unwrap();
-    assert!(fs.exists(Path::new("/a")).await.unwrap());
-    assert!(fs.exists(Path::new("/a/b")).await.unwrap());
-    assert!(fs.exists(Path::new("/a/b/c")).await.unwrap());
-}
-
-#[tokio::test]
-async fn test_custom_filesystem_integration() {
-    // Custom filesystems work with Bash
-    let custom_fs = Arc::new(MyCustomFs::new());
-    let mut bash = Bash::builder().fs(custom_fs).build();
-    let result = bash.exec("echo test > /tmp/file && cat /tmp/file").await.unwrap();
-    assert_eq!(result.stdout, "test\n");
-}
-
-#[tokio::test]
-async fn test_file_size_reporting() {
-    // File sizes must be correctly reported by stat() and read_dir()
-    let fs = InMemoryFs::new();
-    fs.write_file(Path::new("/tmp/file.txt"), b"hello").await.unwrap();
-    fs.mkdir(Path::new("/tmp/subdir"), false).await.unwrap();
-
-    // stat() returns correct sizes
-    let file_meta = fs.stat(Path::new("/tmp/file.txt")).await.unwrap();
-    assert_eq!(file_meta.size, 5); // "hello" = 5 bytes
-
-    let dir_meta = fs.stat(Path::new("/tmp/subdir")).await.unwrap();
-    assert_eq!(dir_meta.size, 0); // Directories always 0
-
-    // read_dir() entries have correct sizes
-    let entries = fs.read_dir(Path::new("/tmp")).await.unwrap();
-    let file_entry = entries.iter().find(|e| e.name == "file.txt").unwrap();
-    assert_eq!(file_entry.metadata.size, 5);
-
-    let dir_entry = entries.iter().find(|e| e.name == "subdir").unwrap();
-    assert_eq!(dir_entry.metadata.size, 0);
-}
-```
-
-### Test Coverage
-
-File size reporting is verified by:
-- `crates/bashkit/src/builtins/ls.rs` - `test_ls_long_format_*` tests
-- `crates/bashkit/tests/custom_fs_tests.rs` - `test_custom_fs_*_size*` tests
+Rejected: always hits real FS, can't isolate or virtualize.
diff --git a/specs/004-testing.md b/specs/004-testing.md
index 704f6e60..3e61f264 100644
--- a/specs/004-testing.md
+++ b/specs/004-testing.md
@@ -13,70 +13,10 @@ Bashkit uses a multi-layer testing strategy:
 4. **Comparison tests** - Direct comparison with real bash
 5. **Differential fuzzing** - Property-based testing against real bash
 
-## CI Test Summary
-
-Tests run automatically on every PR via `cargo test --features http_client`:
-
-| Test Suite | Test Functions | Notes |
-|------------|---------------|-------|
-| Unit tests (bashkit lib) | 286 | Core interpreter tests |
-| limits.rs | 5 | Resource limit tests |
-| spec_tests.rs | 10 (1 ignored) | Spec compatibility tests |
-| threat_model_tests | 39 | Security tests |
-| security_failpoint_tests | 14 | Fault injection tests |
-| Doc tests | 2 | Documentation examples |
-| **Total** | **356** | Plus 4 examples executed |
+For current test counts and pass rates, see `specs/009-implementation-status.md`.
 
 ## Spec Test Framework
 
-### Location
-```
-crates/bashkit/tests/
-├── spec_runner.rs      # Test parser and runner
-├── spec_tests.rs       # Integration test entry point
-├── debug_spec.rs       # Debugging utilities
-├── threat_model_tests.rs    # Security threat model tests
-├── security_failpoint_tests.rs  # Fault injection tests
-├── proptest_differential.rs # Grammar-based differential fuzzing
-└── spec_cases/
-    ├── bash/           # Core bash compatibility (20 files, 471 cases)
-    │   ├── arithmetic.test.sh
-    │   ├── arrays.test.sh
-    │   ├── background.test.sh
-    │   ├── command-subst.test.sh
-    │   ├── control-flow.test.sh
-    │   ├── cuttr.test.sh
-    │   ├── date.test.sh
-    │   ├── echo.test.sh
-    │   ├── fileops.test.sh
-    │   ├── functions.test.sh
-    │   ├── globs.test.sh
-    │   ├── headtail.test.sh
-    │   ├── herestring.test.sh
-    │   ├── path.test.sh
-    │   ├── pipes-redirects.test.sh
-    │   ├── procsub.test.sh
-    │   ├── sleep.test.sh
-    │   ├── sortuniq.test.sh
-    │   ├── variables.test.sh
-    │   └── wc.test.sh
-    ├── awk/            # AWK builtin tests (89 cases)
-    ├── grep/           # Grep builtin tests (70 cases)
-    ├── sed/            # Sed builtin tests (65 cases)
-    └── jq/             # JQ builtin tests (95 cases)
-```
-
-### Spec Test Counts
-
-| Category | Test Cases | In CI | Pass | Skip |
-|----------|------------|-------|------|------|
-| Bash | 471 | Yes | 367 | 104 |
-| AWK | 89 | Yes | 48 | 41 |
-| Grep | 70 | Yes | 65 | 5 |
-| Sed | 65 | Yes | 50 | 15 |
-| JQ | 95 | Yes | 80 | 15 |
-| **Total** | **790** | **790** | 601 | 189 |
-
 ### Test File Format
 
 ```sh
@@ -119,21 +59,11 @@ cargo test --test spec_tests
 # Single category
 cargo test --test spec_tests -- bash_spec_tests
 
-# With output
-cargo test --test spec_tests -- --nocapture
-
 # Check spec tests match real bash
 just check-bash-compat
 
-# Check spec tests match real bash (verbose - shows each test)
-just check-bash-compat-verbose
-
 # Generate comprehensive compatibility report
 just compat-report
-
-# Or directly with cargo:
-cargo test --test spec_tests -- bash_comparison_tests --nocapture
-cargo test --test spec_tests -- compatibility_report --ignored --nocapture
 ```
 
 ## Coverage
@@ -141,45 +71,9 @@ cargo test --test spec_tests -- compatibility_report --ignored --nocapture
 Coverage is tracked with cargo-tarpaulin and uploaded to Codecov.
 
 ```bash
-# Generate local coverage report
 cargo tarpaulin --features http_client --out html --output-dir coverage
-
-# View coverage report
-open coverage/tarpaulin-report.html
 ```
 
-The coverage workflow runs on every PR and push to main. Reports are uploaded
-to Codecov and available as CI artifacts.
-
-### Current Status
-- All spec tests: 76% pass rate (601/790 running in CI, 189 skipped)
-- Text processing tools: 73% pass rate (234/319 running, 85 skipped)
-- Core bash specs: 78% pass rate (367/471 running, 104 skipped)
-
-## Known Testing Gaps
-
-Completed:
-
-- [x] Enable bash_spec_tests in CI — 330/435 tests running
-- [x] Add bash_comparison_tests to CI — 309 tests compared against real bash
-- [x] Fix control-flow.test.sh — 31 tests now running
-- [x] Add coverage tooling — cargo-tarpaulin + Codecov via `.github/workflows/coverage.yml`
-
-Outstanding:
-
-- [ ] **Fix skipped spec tests** (189 total):
-  - Bash: 104 skipped (various implementation gaps)
-  - AWK: 41 skipped (operators, control flow, functions)
-  - Grep: 5 skipped (include/exclude, binary detection)
-  - Sed: 15 skipped (features)
-  - JQ: 15 skipped (functions, flags)
-- [ ] **Fix bash_diff tests** (21 total):
-  - wc: 14 tests (output formatting differs)
-  - background: 2 tests (non-deterministic order)
-  - globs: 2 tests (VFS vs real filesystem glob expansion)
-  - timeout: 1 test (timeout 0 behavior)
-  - brace-expansion: 1 test (empty item handling)
-
 ## Adding New Tests
 
 1. Create or edit `.test.sh` file in appropriate category
@@ -191,8 +85,6 @@ Outstanding:
 
 ### Checking Expected Outputs
 
-The `scripts/update-spec-expected.sh` script helps verify expected outputs:
-
 ```bash
 # Check all tests match real bash
 ./scripts/update-spec-expected.sh
@@ -201,137 +93,29 @@ The `scripts/update-spec-expected.sh` script helps verify expected outputs:
 ./scripts/update-spec-expected.sh --verbose
 ```
 
-If a test fails, either:
-1. Fix the expected output to match real bash, or
-2. Add `### bash_diff: reason` if the difference is intentional
-
 ## Comparison Testing
 
-The `bash_comparison_tests` test runs in CI and compares Bashkit output against real bash:
-
-```rust
-pub fn run_real_bash(script: &str) -> (String, i32) {
-    Command::new("bash")
-        .arg("-c")
-        .arg(script)
-        .output()
-}
-```
-
-Tests marked with `### bash_diff` are excluded from comparison (known intentional differences).
+The `bash_comparison_tests` test runs in CI and compares Bashkit output against
+real bash. Tests marked with `### bash_diff` are excluded from comparison.
 Tests marked with `### skip` are excluded from both spec tests and comparison.
 
-The test fails if any non-excluded test produces different output than real bash.
-
-A verbose version `bash_comparison_tests_verbose` is available (ignored by default) for debugging.
-
-## Compatibility Report
-
-The `compatibility_report` test generates a comprehensive summary of Bashkit's
-compatibility with real bash. Run with:
-
-```bash
-just compat-report
-```
-
-Example output:
-```
-╔══════════════════════════════════════════════════════════════════╗
-║                 Bashkit Compatibility Report                     ║
-╚══════════════════════════════════════════════════════════════════╝
-
-┌─────────────┬───────┬────────┬─────────┬───────────┬─────────────────┐
-│  Category   │ Total │ Passed │ Skipped │ BashDiff  │   Bash Compat   │
-├─────────────┼───────┼────────┼─────────┼───────────┼─────────────────┤
-│    bash     │  404  │ 294/294│   110   │    21     │ 273/273 (100.0%)│
-│     awk     │  89   │  48/48 │   41    │     0     │  48/48  (100.0%)│
-│    grep     │  55   │  34/34 │   21    │     0     │  33/34  ( 97.1%)│
-│     sed     │  65   │  40/40 │   25    │     0     │  40/40  (100.0%)│
-│     jq      │  95   │  58/58 │   37    │     0     │  58/58  (100.0%)│
-└─────────────┴───────┴────────┴─────────┴───────────┴─────────────────┘
-
-Summary:
-  Bash compatibility: 452/453 (99.8%)
-```
-
-The report shows:
-- **Passed**: Tests passing against expected output
-- **Skipped**: Tests for unimplemented features
-- **BashDiff**: Tests with known intentional differences from bash
-- **Bash Compat**: Tests producing identical output to real bash
-
 ## Differential Fuzzing
 
 Grammar-based property testing using proptest generates random valid bash scripts
-and compares Bashkit output against real bash. This helps find edge cases that
-aren't covered by hand-written spec tests.
-
-### Running Differential Fuzzing
+and compares Bashkit output against real bash.
 
 ```bash
-# Run with default 50 cases per test
-cargo test --test proptest_differential
-
-# Run with more cases for deeper testing
-PROPTEST_CASES=1000 cargo test --test proptest_differential
-
-# Run with output to see generated scripts
-cargo test --test proptest_differential -- --nocapture
-
-# Using just commands
-just fuzz-diff
-just fuzz-diff-deep
+just fuzz-diff         # default 50 cases
+just fuzz-diff-deep    # 1000 cases
 ```
 
-### Script Generators
-
-The fuzzer generates scripts in these categories:
-- **Echo commands** - Various quoting styles, flags (-n), multiple args
-- **Arithmetic** - Addition, subtraction, multiplication, division, modulo
-- **Control flow** - if/else, for loops, while loops, case statements
-- **Pipelines** - echo | cat, multi-stage pipes
-- **Logical operators** - &&, ||, combined chains
-- **Command substitution** - $() and backticks
-- **Functions** - Definition and invocation
-
-### Known Limitations
-
-Some features are intentionally excluded from fuzzing:
-- `pwd` - Path differs between Bashkit VFS and real filesystem
-- `wc` - Output formatting differs (column alignment)
-- Filesystem operations - Bashkit uses virtual filesystem
+Known exclusions: `pwd` (path differs), `wc` (formatting), filesystem ops (VFS).
 
 ## JavaScript Runtime Compatibility Tests
 
-### Motivation
-
-The NAPI-RS JS bindings must work across Node.js, Bun, and Deno. The primary
-test suite uses ava (a Node-specific test runner), so it can only validate Node.
-To prove the bindings work under other runtimes, we maintain a separate
-**runtime-compat** test suite using only `node:test` and `node:assert` — APIs
-supported natively by all three runtimes.
-
-### Architecture
-
-```
-crates/bashkit-js/__test__/
-├── *.spec.ts                  # ava tests (Node only, TypeScript)
-└── runtime-compat/
-    ├── _setup.mjs             # Shared: loads native NAPI binding
-    ├── basics.test.mjs        # Constructors, execution, variables, reset, isolation
-    ├── builtins.test.mjs      # grep, sed, awk, sort, uniq, tr, cut, jq, etc.
-    ├── control-flow.test.mjs  # if/elif, for, while, case, functions, subshells
-    ├── error-handling.test.mjs # Exit codes, BashError, recovery, parse errors
-    ├── filesystem.test.mjs    # File I/O, pipes, redirection, heredocs
-    ├── vfs.test.mjs           # VFS API (writeFile, readFile, mkdir, exists, remove)
-    ├── tool-metadata.test.mjs # BashTool name, version, schemas, systemPrompt
-    ├── security.test.mjs      # Resource limits, sandbox escape, path traversal
-    └── scripts.test.mjs       # Real-world patterns: JSON pipelines, large output
-```
-
-### CI Matrix
-
-All runtimes build with npm (napi-rs requires Node tooling). Test execution:
+The NAPI-RS JS bindings must work across Node.js, Bun, and Deno. A separate
+**runtime-compat** test suite using only `node:test` and `node:assert` validates
+cross-runtime compatibility.
 
 | Runtime | Versions | ava tests | runtime-compat | Examples |
 |---------|----------|-----------|----------------|----------|
@@ -339,35 +123,12 @@ All runtimes build with npm (napi-rs requires Node tooling). Test execution:
 | Bun     | latest, canary | No | Yes | Yes |
 | Deno    | 2.x, canary | No | Yes | Yes |
 
-- **Node** runs both ava (full functional suite) and runtime-compat (via `node --test`)
-- **Bun/Deno** run runtime-compat files directly with their native runtimes
-- All runtimes run the example `.mjs` files
-
 ### Maintenance Rules
 
-1. **When adding a new ava test**: consider if it covers a new API surface or
-   behavior that should also be validated across runtimes. If so, add a
-   corresponding test to the appropriate `runtime-compat/*.test.mjs` file.
-2. **runtime-compat tests use only** `node:test`, `node:assert`, and
-   `node:module` — no npm dependencies. This ensures they run under all runtimes.
-3. **Files are plain `.mjs`** (not TypeScript) to avoid transpilation steps.
-4. **Shared setup** lives in `_setup.mjs` — it loads the native binding via
-   `createRequire` which works in Node, Bun, and Deno.
-5. **Keep files focused** — one file per concern area, mirroring the ava test
-   structure. Each file should be independently runnable.
-
-### Running Locally
-
-```bash
-# Node (native test runner)
-node --test crates/bashkit-js/__test__/runtime-compat/*.test.mjs
-
-# Bun
-for f in crates/bashkit-js/__test__/runtime-compat/*.test.mjs; do bun "$f"; done
-
-# Deno
-for f in crates/bashkit-js/__test__/runtime-compat/*.test.mjs; do deno run -A "$f"; done
-```
+1. New ava tests covering new API surface → add runtime-compat counterpart
+2. runtime-compat tests use only `node:test`, `node:assert`, `node:module`
+3. Files are plain `.mjs` (no TypeScript)
+4. Keep files focused — one file per concern area
 
 ## Alternatives Considered
 
@@ -384,12 +145,6 @@ Future consideration: Would help find parser crashes via mutation.
 cargo test --features http_client
 cargo test --features failpoints --test security_failpoint_tests -- --test-threads=1
 
-# Run ALL spec tests including ignored bash tests (manual)
-cargo test --test spec_tests -- --include-ignored --nocapture
-
-# Check pass rates for each category
-cargo test --test spec_tests -- --nocapture 2>&1 | grep "Total:"
-
 # Run differential fuzzing
 cargo test --test proptest_differential -- --nocapture
 ```
diff --git a/specs/005-builtins.md b/specs/005-builtins.md
index 9090c361..b35df4f9 100644
--- a/specs/005-builtins.md
+++ b/specs/005-builtins.md
@@ -5,193 +5,25 @@ Implemented
 
 ## Decision
 
-Bashkit provides a comprehensive set of built-in commands for script execution
-in a virtual environment. All builtins operate on the virtual filesystem.
+Bashkit provides built-in commands for script execution in a virtual environment.
+All builtins operate on the virtual filesystem. For the complete list of 156
+builtins and per-command details, see `specs/009-implementation-status.md`.
 
 ### Standard Flags
 
-All external-style builtins (non-shell-intrinsic commands) support `--help` and
-`--version` flags. The `check_help_version()` helper in `builtins/mod.rs`
-handles `--help` and `--version` (long flags only — short flags `-h`/`-V` are
-not handled by the helper since they have different meanings in many tools like
-`sort -V`, `ls -h`, `grep -h`). Tools where `-h`/`-V` genuinely mean
-help/version (e.g. `jq`) handle them directly in their `execute()` method.
+All external-style builtins support `--help` and `--version` flags via the
+`check_help_version()` helper in `builtins/mod.rs` (long flags only — short
+flags `-h`/`-V` are not handled by the helper since they have different meanings
+in many tools). Tools where `-h`/`-V` genuinely mean help/version handle them
+directly in their `execute()` method.
 
-### Builtin Categories
+### Command Dispatch Order
 
-#### Core Shell Builtins
-- `echo`, `printf` - Output text
-- `true`, `false` - Exit status
-- `exit`, `return` - Control flow
-- `break`, `continue` - Loop control
-- `cd`, `pwd` - Navigation
-- `export`, `local`, `set`, `unset`, `shift` - Variable management
-- `source`, `.` - Script sourcing (functions, variables, PATH search, positional params)
-- `test`, `[` - Conditionals (see Test Operators below)
-- `read` - Input
-- `alias`, `unalias` - Alias management (gated by `shopt -s expand_aliases`, parser-time first-word expansion, trailing-space chaining, recursion guard)
+Dispatch order: functions → special commands → builtins → path execution → `$PATH` search → "command not found".
 
-#### Script Execution by Path
-
-Commands containing `/` (absolute or relative paths) are resolved against the
-VFS. Commands without `/` are searched in `$PATH` directories for executable
-files. The dispatch order is: functions → special commands → builtins → path
-execution → $PATH search → "command not found".
-
-- Absolute: `/path/to/script.sh` — resolved directly
-- Relative: `./script.sh` — resolved relative to cwd
-- $PATH search: `myscript` — searches each `$PATH` directory for executable file
-- Shebang (`#!/bin/bash`) stripped; content executed as bash
-- `$0` = script name, `$1..N` = arguments
-- Exit 127: file not found; Exit 126: not executable or is a directory
-
-#### Test Operators (`test` / `[`)
-
-**String tests:**
-- `-z string` - True if string is empty
-- `-n string` - True if string is non-empty
-- `s1 = s2`, `s1 == s2` - String equality
-- `s1 != s2` - String inequality
-- `s1 < s2`, `s1 > s2` - String comparison (lexicographic)
-
-**Numeric tests:**
-- `-eq`, `-ne`, `-lt`, `-le`, `-gt`, `-ge` - Integer comparisons
-
-**File tests:**
-- `-e file` - File exists
-- `-f file` - Regular file
-- `-d file` - Directory
-- `-r file` - Readable
-- `-w file` - Writable
-- `-x file` - Executable
-- `-s file` - Non-empty file
-- `-L file`, `-h file` - Symbolic link
-
-**Logical:**
-- `! expr` - Negation
-- `expr1 -a expr2` - AND
-- `expr1 -o expr2` - OR
-
-#### Shell Options (`set`)
-- `set -e` / `set -o errexit` - Exit on error
-- `set +e` - Disable errexit
-- errexit respects conditionals (if, while, &&, ||)
-
-#### File Operations
-- `mkdir` - Create directories (`-p` for parents)
-- `rm` - Remove files/directories (`-r`, `-f`)
-- `cp` - Copy files (`-r` for directories)
-- `mv` - Move/rename files
-- `touch` - Create empty files
-- `chmod` - Change permissions (octal mode)
-- `chown` - Change ownership (no-op in VFS, validates file existence)
-- `ln` - Create links (`-s` symbolic, `-f` force)
-- `kill` - Send signals (no-op in VFS, `-l` lists signals)
-
-#### Text Processing
-- `cat` - Concatenate files (`-v`, `-n`, `-e`, `-t`)
-- `nl` - Number lines (`-b`, `-n`, `-s`, `-i`, `-v`, `-w`)
-- `head`, `tail` - First/last N lines
-- `grep` - Pattern matching (`-i`, `-v`, `-c`, `-n`, `-o`, `-l`, `-w`, `-E`, `-F`, `-P`, `-q`, `-m`, `-x`, `-A`, `-B`, `-C`, `-e`, `-f`, `-H`, `-h`, `-b`, `-a`, `-z`, `-r`)
-- `sed` - Stream editing (s/pat/repl/, d, p, a, i; `-E`, `-e`, `-i`, `-n`; nth occurrence, `!` negation)
-- `awk` - Text processing (print, -F, variables, `--csv`/`-k`, `\u` Unicode escapes)
-- `jq` - JSON processing (file arguments, `-s`, `-r`, `-c`, `-n`, `-S`, `-e`, `--tab`, `-j`, `--arg`, `--argjson`, `-h`/`--help`, `-V`/`--version`, combined short flags)
-- `sort` - Sort lines (`-r`, `-n`, `-u`)
-- `uniq` - Filter duplicates (`-c`, `-d`, `-u`)
-- `cut` - Extract fields (`-d`, `-f`)
-- `tr` - Translate characters (`-d` for delete)
-- `wc` - Count lines/words/bytes (`-l`, `-w`, `-c`)
-- `paste` - Merge lines of files (`-d`, `-s`)
-- `column` - Columnate lists (`-t`, `-s`, `-o`)
-- `diff` - Compare files line by line (`-u`, `-q`)
-- `comm` - Compare two sorted files (`-1`, `-2`, `-3`)
-
-#### Byte Inspection
-- `od` - Octal dump (`-A`, `-t`, `-N`, `-j`)
-- `xxd` - Hex dump (`-l`, `-s`, `-c`, `-g`, `-p`, `-r`)
-- `hexdump` - Hex display (`-C`, `-n`, `-s`)
-- `strings` - Extract printable strings (`-n`, `-t`)
-
-#### Utilities
-- `sleep` - Pause execution (max 60s for safety)
-- `date` - Date/time formatting (`+FORMAT`, `-u`)
-- `basename`, `dirname` - Path manipulation
-- `wait` - Wait for background jobs
-- `timeout` - Run command with time limit (stub, max 300s)
-
-#### System Information
-- `hostname` - Display virtual hostname (configurable, default: "bashkit-sandbox")
-- `uname` - System info (`-a`, `-s`, `-n`, `-r`, `-v`, `-m`, `-o`)
-- `whoami` - Display virtual username (configurable, default: "sandbox")
-- `id` - User/group IDs (`-u`, `-g`, `-n`)
-
-These builtins return configurable virtual values to prevent host information disclosure.
-Configure via `BashBuilder`:
-
-```rust
-Bash::builder()
-    .username("deploy")      // Sets whoami, id, and $USER
-    .hostname("my-server")   // Sets hostname, uname -n
-    .build();
-```
-
-#### Directory Listing and Search
-- `ls` - List directory contents (`-l`, `-a`, `-h`, `-1`, `-R`, `-t`)
-- `find` - Search for files (`-name PATTERN`, `-type f|d|l`, `-maxdepth N`, `-mindepth N`, `-print`)
-- `rmdir` - Remove empty directories (`-p` for parents)
-
-#### File Inspection
-- `less` - View file contents (virtual mode: behaves like `cat`, no interactive paging)
-- `file` - Detect file type via magic bytes (text, binary, PNG, JPEG, gzip, etc.)
-- `stat` - Display file metadata (`-c FORMAT` with %n, %s, %F, %a, %U, %G, %Y, %Z)
-
-#### Archive Operations
-- `tar` - Create/extract tar archives (`-c`, `-x`, `-t`, `-v`, `-f`, `-z` for gzip)
-- `gzip` - Compress files (`-d` decompress, `-k` keep, `-f` force)
-- `gunzip` - Decompress files (`-k` keep, `-f` force)
-
-#### Environment
-- `env` - Print environment or run command with modified environment
-- `printenv` - Print environment variable values
-- `history` - Command history (virtual mode: limited, no persistent history)
-
-#### Prefix Environment Assignments
-
-Bash supports `VAR=value command` syntax where the assignment is temporary and
-scoped to the command's environment. Bashkit implements this: prefix assignments
-are injected into `ctx.env` for the command's duration, then both `env` and
-`variables` are restored. Assignment-only commands (`VAR=value` with no command)
-persist in shell variables as usual.
-
-#### Pipeline Control
-- `xargs` - Build commands from stdin (`-I REPL`, `-n MAX`, `-d DELIM`)
-- `tee` - Write to files and stdout (`-a` append)
-- `watch` - Execute command periodically (virtual mode: shows command info, no continuous execution)
-
-#### Network
-- `curl` - HTTP client (requires http_client feature + allowlist)
-  - Options: `-s/--silent`, `-o FILE`, `-X METHOD`, `-d DATA` (supports `@-` for stdin, `@file` for VFS file), `-H HEADER`, `-I/--head`, `-f/--fail`, `-L/--location`, `-w FORMAT`, `--compressed`, `-u/--user`, `-A/--user-agent`, `-e/--referer`, `-v/--verbose`, `-m/--max-time`, `--connect-timeout`
-  - Security: URL allowlist enforced, 10MB response limit, timeouts clamped to [1s, 10min], zip bomb protection via size-limited decompression
-- `wget` - Download files (requires http_client feature + allowlist)
-  - Options: `-q/--quiet`, `-O FILE`, `--spider`, `--header`, `-U/--user-agent`, `--post-data`, `-t/--tries`, `-T/--timeout`, `--connect-timeout`
-  - Security: URL allowlist enforced, 10MB response limit, timeouts clamped to [1s, 10min]
-- `http` - HTTPie-style HTTP client (requires http_client feature + allowlist)
-  - Syntax: `http [OPTIONS] [METHOD] URL [ITEMS...]` where items are `key=value` (JSON string), `key:=value` (JSON raw), `Header:value`, `key==value` (query param)
-  - Options: `--json/-j`, `--form/-f`, `-v/--verbose`, `-h/--headers`, `-b/--body`, `-o FILE`
-  - Security: URL allowlist enforced, JSON/form injection prevention, query parameter encoding
-
-**Request Signing**: When the `bot-auth` feature is enabled and configured, all outbound HTTP requests from curl, wget, and http builtins are transparently signed with Ed25519 per RFC 9421. See `specs/017-request-signing.md`.
-
-**Network Configuration**:
-```rust
-use bashkit::{Bash, NetworkAllowlist};
-
-let bash = Bash::builder()
-    .network(NetworkAllowlist::new()
-        .allow("https://api.example.com")
-        .allow("https://cdn.example.com"))
-    .build();
-```
+Commands containing `/` are resolved against the VFS. Commands without `/` are
+searched in `$PATH` directories. Shebang lines are stripped and the content is
+executed as bash. Exit 127: file not found; exit 126: not executable or a directory.
 
 ### Builtin Trait
 
@@ -227,18 +59,6 @@ pub struct Context<'a> {
 
 Internal builtins that need interpreter state receive it via `Context.shell`:
 
-```rust
-pub(crate) struct ShellRef<'a> {
-    // Direct mutable access (simple HashMap state, no invariants)
-    pub(crate) aliases: &'a mut HashMap<String, String>,
-    pub(crate) traps: &'a mut HashMap<String, String>,
-    // Read-only introspection (accessed via methods)
-    // has_builtin(), has_function(), is_keyword(),
-    // call_stack_depth(), call_stack_frame_name(),
-    // history_entries(), jobs()
-}
-```
-
 **Design rationale:**
 - **Direct mutation** for aliases/traps — simple HashMaps with no invariants
 - **Side effects** for arrays (budget checks), positional params (call stack),
@@ -250,7 +70,6 @@ pub(crate) struct ShellRef<'a> {
 
 **Builtins using ShellRef:**
 - `type`, `which` — read-only: check builtin/function/keyword names
-- `hash` — no-op (no PATH cache in sandbox)
 - `alias`, `unalias` — direct mutation of `shell.aliases`
 - `trap` — direct mutation of `shell.traps`
 - `caller` — read call stack depth/frame names
@@ -278,34 +97,16 @@ before `execute()` — when it returns `Some(plan)`, the interpreter fulfills th
 plan instead of using the `execute()` result.
 
 ```rust
-pub struct SubCommand {
-    pub name: String,
-    pub args: Vec<String>,
-    pub stdin: Option<String>,
-}
-
 pub enum ExecutionPlan {
-    /// Run a single command with a timeout.
-    Timeout {
-        duration: Duration,
-        preserve_status: bool,
-        command: SubCommand,
-    },
-    /// Run a sequence of commands, collecting output.
-    Batch {
-        commands: Vec<SubCommand>,
-    },
+    Timeout { duration: Duration, preserve_status: bool, command: SubCommand },
+    Batch { commands: Vec<SubCommand> },
 }
 ```
 
-**Current users:**
-- `timeout` → `ExecutionPlan::Timeout` — wraps a sub-command with a time limit
-- `xargs` → `ExecutionPlan::Batch` — builds commands from stdin lines
-- `find -exec` → `ExecutionPlan::Batch` — runs commands on matched files
+**Current users:** `timeout` → Timeout, `xargs` → Batch, `find -exec` → Batch.
 
 **Adding new execution plans:** Add a variant to `ExecutionPlan` and handle it
-in the interpreter's plan fulfillment code (`interpreter/mod.rs`). Custom
-builtins can also override `execution_plan()` to request sub-command execution.
+in the interpreter's plan fulfillment code (`interpreter/mod.rs`).
 
 ### Adding Internal Builtins
 
@@ -314,4 +115,16 @@ macro in `interpreter/mod.rs`. To add a new one:
 
 1. Create the builtin module in `crates/bashkit/src/builtins/` (implement `Builtin` trait)
 2. Add `mod mycommand;` and `pub use mycommand::MyCommand;` in `builtins/mod.rs`
-3. Add one line to the `register_builtins!` table i
\ No newline at end of file
+3. Add one line to the `register_builtins!` table in `interpreter/mod.rs`
+4. Add spec tests in `tests/spec_cases/`
+5. Update `specs/009-implementation-status.md`
+
+### Network Builtins
+
+`curl`, `wget`, and `http` require the `http_client` feature and a URL allowlist.
+When the `bot-auth` feature is enabled, all outbound HTTP requests are transparently
+signed with Ed25519 per RFC 9421 (see `specs/017-request-signing.md`).
+
+## Alternatives Considered
+
+Alternatives are discussed inline within the design sections above.
diff --git a/specs/008-posix-compliance.md b/specs/008-posix-compliance.md
deleted file mode 100644
index 1918b1cc..00000000
--- a/specs/008-posix-compliance.md
+++ /dev/null
@@ -1,83 +0,0 @@
-# 008: POSIX Shell Command Language Compliance
-
-## Status
-Implemented (substantial compliance)
-
-## Summary
-
-Bashkit aims for substantial compliance with IEEE Std 1003.1-2024 (POSIX.1-2024)
-Shell Command Language specification. This document explains our compliance
-approach and security-motivated deviations.
-
-For detailed implementation status, see [009-implementation-status.md](009-implementation-status.md).
-
-## Design Philosophy
-
-Bashkit prioritizes:
-1. **Security over completeness** - exclude features that break sandbox containment
-2. **Stateless execution** - no persistent state between command invocations
-3. **Deterministic behavior** - predictable results for AI agent workflows
-
-## Security Exclusions
-
-Two POSIX special builtins are intentionally excluded:
-
-**`exec`**: The POSIX `exec` replaces the current shell process, which would
-break sandbox containment. Scripts requiring `exec` should be refactored to
-use standard command execution.
-
-**`trap`**: Signal handlers require persistent state across commands, conflicting
-with Bashkit's stateless execution model. Additionally, there are no signal
-sources in the virtual environment (no external processes send SIGINT/SIGTERM). Scripts
-should handle errors through exit codes and conditional execution.
-
-## Intentional Deviations
-
-### Security-Motivated
-
-1. **No OS process spawning**: External commands run as builtins or virtual script
-   re-invocations, not OS subprocesses. Scripts can be executed by absolute path,
-   relative path, or `$PATH` search within the VFS.
-2. **No signal handling**: `trap` excluded for sandbox isolation
-3. **No process replacement**: `exec` excluded for containment
-4. **Virtual filesystem**: Real FS access requires explicit configuration
-5. **Network allowlist**: HTTP requires URL allowlist configuration
-
-### Simplification (Stateless Model)
-
-1. **Background execution**: `&` is parsed but runs synchronously
-2. **Job control**: Not implemented (interactive feature)
-3. **Process times**: `times` returns zeros (no CPU tracking)
-
-## Testing Approach
-
-POSIX compliance is verified through:
-
-1. **Unit Tests** - 22 POSIX-specific tests for special builtins and parameters
-2. **Spec Tests** - 435 bash spec test cases in CI
-3. **Bash Comparison** - differential testing against real bash
-
-```bash
-# Run POSIX compliance tests
-cargo test --lib -- interpreter::tests::test_colon
-cargo test --lib -- interpreter::tests::test_readonly
-cargo test --lib -- interpreter::tests::test_times
-cargo test --lib -- interpreter::tests::test_eval
-cargo test --lib -- interpreter::tests::test_special_param
-
-# Compare with real bash
-cargo test --test spec_tests -- bash_comparison_tests --ignored
-```
-
-## Future Work
-
-- [ ] Implement `getopts` builtin for option parsing
-- [ ] Add `command` builtin for command lookup control
-- [ ] Consider `type` builtin for command type detection
-- [ ] Evaluate `hash` builtin for command caching info
-
-## References
-
-- [IEEE Std 1003.1-2024](https://pubs.opengroup.org/onlinepubs/9699919799/)
-- [Shell Command Language](https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html)
-- [Special Built-in Utilities](https://pubs.opengroup.org/onlinepubs/007904975/idx/sbi.html)
diff --git a/specs/009-implementation-status.md b/specs/009-implementation-status.md
index 97dfc29d..8d322fcb 100644
--- a/specs/009-implementation-status.md
+++ b/specs/009-implementation-status.md
@@ -53,8 +53,7 @@ See [006-threat-model.md](006-threat-model.md) threat TM-ESC-015 for security an
 
 ## POSIX Compliance
 
-Bashkit implements IEEE 1003.1-2024 Shell Command Language. See
-[008-posix-compliance.md](008-posix-compliance.md) for design rationale.
+Bashkit implements IEEE 1003.1-2024 Shell Command Language.
 
 ### Compliance Level
 
diff --git a/specs/015-ssh-support.md b/specs/015-ssh-support.md
index c23f1bd6..6f8eabfc 100644
--- a/specs/015-ssh-support.md
+++ b/specs/015-ssh-support.md
@@ -2,7 +2,7 @@
 
 ## Status
 
-Phase 1: In Progress — Handler trait, allowlist, ssh/scp/sftp builtins
+Phase 1: Implemented — Handler trait, allowlist, ssh/scp/sftp builtins
 
 ## Decision