feat(tool): add file and search tools with structured metadata by hakula139 · Pull Request #4 · hakula139/oxide-code

hakula139 · 2026-04-03T05:32:59Z

Summary

Add five file-oriented tools — read, write, edit, glob, grep — giving the agent structured file interaction instead of routing everything through bash. Align tool output formats with Claude Code and OpenCode conventions, and add structured metadata for future TUI rendering.

- Add file tools: read (line-numbered, paginated, byte-budgeted), write (with parent dir creation), edit (exact replacement with CRLF-aware normalize-match-restore)
- Add search tools: glob (ignore + globset, mtime-sorted, relative paths), grep (three output modes, context merging, head limit, include filter, relative paths)
- Add ToolMetadata struct carrying title and exit code for TUI display, separate from model-facing content
- Align bash output: is_error only on infrastructure failures (not nonzero exit), drop STDERR: prefix, append (exit code N) to content, add description parameter for TUI labels
- Extract shared helpers into tool.rs: parse_input, is_binary, display_path, file_name, truncate_line, walk_files, entry_mtime, resolve_base_dir

Design Decisions

- Sync tools on spawn_blocking: glob and grep use filesystem-heavy sync crates (ignore, globset, regex); read / write / edit use tokio::fs directly.
- CRLF normalization in edit: Detect the dominant EOL, normalize to LF for matching, and restore the dominant style on write-back. Files with mixed line endings are rewritten with the dominant style throughout.
- is_error semantics: Only infrastructure failures set is_error: true. Nonzero exit codes are informational; the exit code is appended to content so the model can interpret severity itself. Matches Claude Code and OpenCode behavior.
- Relative paths: Glob and grep return paths relative to the search base directory. Saves tokens and matches developer mental models.
- gitignore-aware walking: The ignore crate respects .gitignore, .ignore, .git/info/exclude, and global ignore rules. Walks stay within the same filesystem.
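
The relative-path decision can be sketched as follows. The name `display_path` comes from this PR, but the signature and the exact fallback behavior below are assumptions for illustration, not the merged code.

```rust
use std::path::Path;

/// Hypothetical sketch: show `path` relative to `base` when it lies inside
/// `base`, otherwise fall back to the path as given.
fn display_path(path: &Path, base: &Path) -> String {
    match path.strip_prefix(base) {
        // An empty remainder means `path == base` (a single-file search);
        // fall back to the file name so the output is never blank.
        Ok(rel) if rel.as_os_str().is_empty() => path
            .file_name()
            .map(|n| n.to_string_lossy().into_owned())
            .unwrap_or_else(|| path.display().to_string()),
        Ok(rel) => rel.display().to_string(),
        // Outside the base directory: keep the original path.
        Err(_) => path.display().to_string(),
    }
}
```

With a base of `/repo`, a match at `/repo/src/main.rs` would be shown as `src/main.rs`, while a path outside the base is left untouched.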

Changes

| File | Description |
| --- | --- |
| crates/oxide-code/src/tool.rs | `ToolMetadata` struct, `ToolOutput::with_title`, shared helpers (`parse_input`, `is_binary`, `display_path`, `file_name`, `truncate_line`, `walk_files`, `entry_mtime`), docstrings |
| crates/oxide-code/src/tool/read.rs | ReadTool: compact line numbers, pagination, byte budget, BOM / binary handling |
| crates/oxide-code/src/tool/write.rs | WriteTool: file writing with parent directory creation |
| crates/oxide-code/src/tool/edit.rs | EditTool: exact string replacement, CRLF handling, uniqueness check, file-size guard |
| crates/oxide-code/src/tool/glob.rs | GlobTool: gitignore-aware pattern matching, mtime sort, relative paths, result cap |
| crates/oxide-code/src/tool/grep.rs | GrepTool: regex search, three output modes, context merging, head limit, relative paths |
| crates/oxide-code/src/tool/bash.rs | `is_error` semantics fix, `description` parameter, drop `STDERR:` prefix, `exit_code` metadata |
| crates/oxide-code/src/main.rs | Register all six tools, display metadata title in output |
| crates/oxide-code/src/client/anthropic.rs | `claude-cli` User-Agent header for OAuth compatibility |
| Cargo.toml, Cargo.lock | Add `ignore`, `globset`, `regex` workspace deps |
| CLAUDE.md | Crate structure diagram, cargo fmt in verification, blank-line convention |
| docs/roadmap.md | Tool metadata, centralized truncation, file-change tracking plans |
| docs/research/anthropic-api.md | macOS Keychain OAuth findings and token divergence |
| README.md | Configuration section, Rust version fix (1.85 → 1.91) |

Test plan

- `cargo fmt --all --check`: clean
- `cargo build`: compiles cleanly
- `cargo clippy --all-targets -- -D warnings`: zero warnings
- `cargo test`: 144 tests pass (101 new)
- `cargo llvm-cov --ignore-filename-regex 'main\.rs'`: 85% line coverage (tool files: bash 94%, edit 91%, glob 85%, grep 93%, read 89%, write 81%, tool.rs 99%)

- Extract shared `MAX_LINE_LENGTH` and `truncate_line` into `tool.rs`, returning `Cow<str>` to avoid allocation on the common short path.
- Remove duplicate implementations from `read.rs` and `grep.rs`.
- Replace raw string matching for grep output mode with a `#[derive(Deserialize)]` `OutputMode` enum; unrecognized values now produce a deserialization error instead of silently falling back.
- Move `HIDDEN_DIRS` to module-level constants and reorder `is_hidden_dir` after its caller for top-down reading order.
- Remove WHAT-not-WHY comments in `grep.rs` and `glob.rs`.

Replace `walkdir` + hardcoded `HIDDEN_DIRS` list with the `ignore` crate (from the ripgrep ecosystem). Both grep and glob now respect `.gitignore`, `.ignore`, `.git/info/exclude`, and global gitignore rules automatically. Replace `glob` crate with `globset` for pattern matching.

In glob tool, patterns without `/` now match against the basename (e.g. `*.rs` matches at any depth), consistent with ripgrep behavior.

Grep no longer silently skips files exceeding the 1 MB size limit. Skipped files are reported at the end of the output so the agent knows they exist and can use the read tool to inspect them.

Dependency changes: +ignore, +globset, -walkdir, -glob.

Extract `parse_input`, `is_binary`, and `resolve_base_dir` into the parent tool module to eliminate duplication across all five tool files.

- `parse_input<T>`: replaces identical 7-line deserialization blocks in bash, read, write, edit, grep, and glob.
- `is_binary`: shared null-byte check, previously duplicated between read.rs (inline) and grep.rs (standalone function).
- `resolve_base_dir`: shared cwd-fallback resolution, previously duplicated between grep.rs and glob.rs.

Additional fixes from code review:

- `pub` → `pub(crate)` on ToolDefinition fields, ToolOutput fields, and ToolRegistry methods to match their type visibility.
- Reorder test sections in all tool files to follow the convention: happy path → variants → edge cases → error cases.
- Move `read_text` / `is_binary` after their callers in grep.rs for top-down reading order.
- Trim WHAT comments to WHY-only in read.rs and edit.rs.
- Fix test import order in tool.rs (`super::*` before `super::bash`).
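
A minimal sketch of the shared null-byte check described above. `is_binary` and `BINARY_CHECK_SIZE` are names from this PR; the sample size of 8192 bytes and the slice-based signature are assumptions.

```rust
/// Assumed sample size; the PR names `BINARY_CHECK_SIZE` but not its value.
const BINARY_CHECK_SIZE: usize = 8192;

/// Treat content as binary if a NUL byte appears in the leading sample.
/// Only the first `BINARY_CHECK_SIZE` bytes are inspected, so a NUL that
/// appears later in a large file is not detected by this sketch.
fn is_binary(bytes: &[u8]) -> bool {
    let sample = &bytes[..bytes.len().min(BINARY_CHECK_SIZE)];
    sample.contains(&0)
}
```

Checking only a prefix keeps the cost constant for large files, at the price of missing files whose first NUL byte appears late.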

- Use String::from_utf8_lossy in read_text for consistency with ReadTool. Previously, files with stray invalid bytes were silently invisible to grep while readable via the read tool.
- Sort collected files by mtime (newest first) in collect_files so all output modes get deterministic ordering. Previously only files_with_matches sorted; content and count modes used walker-dependent order.
- Simplify format_files_with_matches now that input is pre-sorted.
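
The newest-first ordering can be sketched with the standard library alone. The function name `sort_newest_first`, the pair-based input, and the path tie-break are illustrative assumptions; the commit only states that collected files are sorted by mtime, newest first.

```rust
use std::path::PathBuf;
use std::time::SystemTime;

/// Sort (path, mtime) pairs newest first. Ties break on path so the order
/// stays deterministic; the tie-break is an assumption, not from the PR.
fn sort_newest_first(files: &mut [(PathBuf, SystemTime)]) {
    files.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
}
```

Sorting once at collection time means every output mode inherits the same order instead of each formatter sorting on its own.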

…in place

- Add `use std::path::{Path, PathBuf}` and replace inline qualifiers.
- Replace write!+push_str entry construction with single format! call (Cow<str> implements Display, so truncate_line can interpolate directly).
- Rename format_skipped_warnings → append_skipped_warnings, taking &mut String to write directly into the caller's buffer instead of allocating an intermediate String.

- Rename search_dir/path → search_path in glob_files and resolve_base_dir for consistency with GrepParams.
- Simplify entry_mtime: ok().and_then().unwrap_or() instead of nested unwrap_or calls.
- Rewrite truncate_line to use a single char_indices pass instead of floor_char_boundary + separate chars().count().
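
A sketch of the single-pass `truncate_line` described above, returning `Cow<str>` so short lines are borrowed rather than copied. The cap of 500 characters and the `...` suffix are assumptions; the PR names `MAX_LINE_LENGTH` without stating its value or the truncation marker.

```rust
use std::borrow::Cow;

/// Assumed cap; the PR names `MAX_LINE_LENGTH` but not its value.
const MAX_LINE_LENGTH: usize = 500;

/// Truncate `line` to at most `MAX_LINE_LENGTH` characters using one
/// `char_indices` pass, borrowing when no truncation is needed.
fn truncate_line(line: &str) -> Cow<'_, str> {
    // `nth(MAX_LINE_LENGTH)` yields the first character past the cap, if any.
    // Its byte offset is exactly where the kept prefix ends.
    match line.char_indices().nth(MAX_LINE_LENGTH) {
        None => Cow::Borrowed(line),
        // The "..." marker is an assumption for illustration.
        Some((cut, _)) => Cow::Owned(format!("{}...", &line[..cut])),
    }
}
```

Because the cut position comes from `char_indices`, it always falls on a character boundary, so multi-byte text cannot be split mid-character.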

- tool.rs: from_result Ok/Err arms, resolve_base_dir with/without path
- write.rs: parent-is-a-file and path-is-a-directory error paths
- edit.rs: too-large file guard, read-only file write rejection
- glob.rs: invalid glob pattern error
- grep.rs: invalid include pattern, single-file too large, context mode with no matches, context mode head_limit, files_with_matches no matches, count mode singular forms, head_limit zero (unlimited), head_limit across multiple files

…ystem

- bash: use `bash -c` instead of `sh -c` to match the tool name and support bashisms (arrays, [[ ]], process substitution, etc.)
- grep: clarify include parameter description to state it matches filenames only, not full paths
- walk_files: add same_file_system(true) to prevent crossing mount points (Docker volumes, NFS, etc.)

format_files_with_matches used `>` instead of `>=` for the head_limit early-exit check, causing one extra file to be read and regex-matched before breaking. The redundant truncate() masked the bug — observable output was correct but work was wasted. Align with format_content which already uses `>=` for the same pattern. Remove the now-unnecessary truncate() call.
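
The off-by-one can be illustrated with a simplified loop. This is not the actual `format_files_with_matches`; the function `first_matches`, its parameters, and the examined-file counter are hypothetical and exist only to make the wasted work visible.

```rust
/// Collect up to `head_limit` matching files, stopping as soon as the limit
/// is reached. Returns the matches and how many files were examined.
/// A `head_limit` of 0 means unlimited.
fn first_matches<'a>(
    files: &[&'a str],
    is_match: impl Fn(&str) -> bool,
    head_limit: usize,
) -> (Vec<&'a str>, usize) {
    let mut matched = Vec::new();
    let mut examined = 0;
    for &file in files {
        // `>=` exits before touching the next file. With `>`, the loop would
        // examine (and possibly collect) one file too many.
        if head_limit > 0 && matched.len() >= head_limit {
            break;
        }
        examined += 1;
        if is_match(file) {
            matched.push(file);
        }
    }
    (matched, examined)
}
```

With four matching files and a limit of 2, this version examines exactly two files; swapping in `>` would examine a third and then rely on a later truncation to hide it.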

The normalize-then-restore pipeline only normalized file content, leaving replacement strings raw. If new_string contained \r\n on a CRLF-dominant file, apply_eol would double-expand it to \r\r\n. Normalize both strings alongside the file content so matching is line-ending-agnostic. Add a test that verifies \r\n in new_string does not produce \r\r\n. Also document that bare CR (\r without \n) is not detected by dominant_eol.
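
A minimal sketch of the normalize-match-restore pipeline after this fix. `dominant_eol` and `apply_eol` are names from the commit; their signatures, the `normalize` and `edit` helpers, and the tie-breaking toward LF are assumptions.

```rust
/// Return "\r\n" if CRLF line endings outnumber bare LF, else "\n".
/// Bare CR is not detected, as the commit notes.
fn dominant_eol(text: &str) -> &'static str {
    let crlf = text.matches("\r\n").count();
    // Every CRLF contains one '\n', so this subtraction cannot underflow.
    let lf = text.matches('\n').count() - crlf;
    if crlf > lf { "\r\n" } else { "\n" }
}

/// Collapse CRLF to LF so matching is line-ending-agnostic.
fn normalize(text: &str) -> String {
    text.replace("\r\n", "\n")
}

/// Expand LF back to the dominant style.
fn apply_eol(text: &str, eol: &str) -> String {
    if eol == "\n" { text.to_owned() } else { text.replace('\n', eol) }
}

/// Replace the single occurrence of `old` with `new`, preserving the
/// file's dominant line ending. Returns `None` if `old` is empty or
/// does not occur exactly once.
fn edit(content: &str, old: &str, new: &str) -> Option<String> {
    let eol = dominant_eol(content);
    // Normalize all three strings, so `apply_eol` never sees an existing
    // "\r\n" that it would expand to "\r\r\n".
    let (content, old, new) = (normalize(content), normalize(old), normalize(new));
    if old.is_empty() || content.matches(old.as_str()).count() != 1 {
        return None;
    }
    Some(apply_eol(&content.replacen(old.as_str(), &new, 1), eol))
}
```

If `new` were left raw, a replacement of `"x\r\ny"` on a CRLF-dominant file would pass through `apply_eol` and come out as `"x\r\r\ny"`, which is the bug this commit describes.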

Key finding: Claude Code on macOS reads OAuth tokens from the macOS Keychain (service "Claude Code-credentials"), not from ~/.claude/.credentials.json. The file is a fallback. The two can hold different tokens, causing stale-token 401 errors when ox reads only from the file.

- Update anthropic-api.md with Keychain storage details, User-Agent format, and source references
- Add macOS Keychain OAuth to Current Focus in roadmap

Bash:

- Only set is_error on timeout / spawn failure, not nonzero exit codes. Many commands use nonzero exits normally (grep returns 1 for no matches, diff returns 1 for differences). Flagging these as errors caused the model to apologize and retry unnecessarily.
- Append exit code to content when nonzero so the model still sees the failure signal without the behavioral side-effects of is_error.
- Drop the "STDERR:" prefix; join stdout and stderr raw.
- Trim leading blank lines from stdout and trailing whitespace from both streams.

Read:

- Switch to compact unpadded line numbers (N\t instead of right-aligned padding) to save tokens.

Glob / Grep:

- Return relative paths instead of absolute when inside the search base directory. Saves tokens and matches how developers think about file locations.
- Add "Found N file(s)" header to grep files_with_matches output.

Shared:

- Add display_path helper for absolute-to-relative path conversion with single-file fallback.
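
The bash output rules above can be sketched as a pure formatting function. `CommandResult`, its fields, `format_bash_output`, and the newline separator are hypothetical; only the `(exit code N)` suffix and the trimming rules come from this PR. Leading-blank-line trimming is simplified here to stripping leading newline characters.

```rust
/// Hypothetical shape of a finished command; field names are assumptions.
struct CommandResult {
    stdout: String,
    stderr: String,
    exit_code: i32,
}

/// Build the model-facing content: stdout and stderr joined without a
/// "STDERR:" prefix, with "(exit code N)" appended only when N is nonzero.
fn format_bash_output(res: &CommandResult) -> String {
    let stdout = res.stdout.trim_start_matches('\n').trim_end();
    let stderr = res.stderr.trim_end();
    let mut parts: Vec<String> = [stdout, stderr]
        .iter()
        .filter(|s| !s.is_empty())
        .map(|s| s.to_string())
        .collect();
    if res.exit_code != 0 {
        parts.push(format!("(exit code {})", res.exit_code));
    }
    parts.join("\n")
}
```

A nonzero exit such as `grep` finding nothing then reaches the model as ordinary content ending in `(exit code 1)`, while `is_error` stays reserved for timeouts and spawn failures.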

Introduce ToolMetadata alongside ToolOutput to carry structured data for UI display and logging, separate from the model-facing content. Every tool now sets a title field with a concise summary (e.g., "Read Cargo.toml", "Created src/main.rs", "3 matches").

ToolMetadata fields:

- title: short label for TUI rendering (all tools).
- exit_code: process exit code (bash tool only).

Also adds a description parameter to the bash tool schema. When the model provides a short description of what the command does (e.g., "Lists files in current directory"), it becomes the title displayed in the TUI for at-a-glance session history.
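
One possible shape for the metadata described above. `ToolMetadata`, `ToolOutput`, `with_title`, `title`, and `exit_code` are names from this PR; the field types, the `metadata` field name, and the derives are assumptions.

```rust
/// Structured data for UI display, kept apart from model-facing content.
/// Both fields are optional in this sketch; the real types may differ.
#[derive(Debug, Default, Clone, PartialEq)]
struct ToolMetadata {
    /// Short label for TUI rendering.
    title: Option<String>,
    /// Process exit code (bash tool only).
    exit_code: Option<i32>,
}

#[derive(Debug, Clone, PartialEq)]
struct ToolOutput {
    content: String,
    is_error: bool,
    metadata: ToolMetadata,
}

impl ToolOutput {
    /// Builder-style setter so a tool can attach a title in one expression.
    fn with_title(mut self, title: impl Into<String>) -> Self {
        self.metadata.title = Some(title.into());
        self
    }
}
```

Keeping the title out of `content` means the TUI can change how it labels a tool call without altering what the model sees.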

- Extract duplicated file_name() from read, write, and edit into tool.rs as a shared pub(crate) helper.
- Merge "Path Resolution" and "Display Path" sections into a unified "Path Utilities" section in tool.rs.
- Split generic "Path Utilities" test section header into per-function headers (resolve_base_dir, display_path, file_name) per convention.
- Move read_text() into the Search section of grep.rs alongside collect_files (both are file I/O helpers).
- Add "Output Truncation" section divider in bash.rs.
- Use raw strings for tool descriptions containing escaped quotes.
- Add indoc! for multi-line test fixtures in read.rs and grep.rs.

Decision #4 in the design section claimed `truncated_total` would become the single structural signal; the PR ended up splitting into `truncated_total` (view-shape) + `truncated_bytes` (byte cap) after review caught a unit-conflation hazard. Notes now describe the split and the rationale. Source-line list also updated: the bash and read self-cap references are gone with the code; remaining entries point at the constants and helpers without brittle line ranges. Test-name references in decision #2 follow the rename from `truncate_output_*` to `cap_output_*`.

The original design doc described all three reference projects as turn-boundary queues, which is wrong for the default UX of every one of them. Claude Code's keyboard prompts default to `next` priority, meaning mid-turn drain between tool waves; Codex's Enter routes to `steer_input` → `pending_input` drained at sampling boundaries; and OpenCode's default `steer` setting persists user messages so the running `runLoop` sees them on the next `loadTranscript` reload, wrapped in `<system-reminder>`. All three converge on "fold queued text into the same multi-step turn at the round boundary, no abort". Update each reference's Queue section, refresh the comparison table to distinguish drain timing from queue location, rewrite the oxide-code Today section so it reflects the shipped phase 1 (queue exists, drains at turn end only) instead of the pre-shipping prediction, replace decision #4 with the mid-turn-drain design, and refresh the Sources section with current line numbers and reference precedents for the upcoming refactor.

PR #64 (modal infrastructure) shipped Option C: bare /model opens the combined picker, bare /effort errors with a usage hint pointing at /model. The user guide, design notes, and roadmap still described the older "both bare forms open the picker with different initial focus" shape.

Updated:

- docs/guide/slash-commands.md: table description, mid-turn classification paragraph, and the "Switching the Effort" / "Switching the Model" sections.
- docs/design/slash/commands.md: design decision #5, /effort and /model per-command notes, source list (`agent_loop_task` → `agent_turn`).
- docs/design/slash/modals.md: design decisions #4 (`SessionInfo` → `LiveSessionInfo`) and #7 (typed-arg-only contract).
- docs/roadmap.md: moved the combined picker out of "Current Focus" (shipped in PR #64) into Working Today; replaced with the deferred /effort slider.
- CLAUDE.md: `slash/effort.rs` description updated to match the typed-arg contract.

hakula139 added 15 commits April 3, 2026 11:58

chore(deps): add glob, regex, and walkdir

7c0933b

feat(read): add file reading tool with pagination and byte budget

7ff831d

feat(write): add file writing tool with directory creation

81f19ee

feat(edit): add exact string replacement tool with CRLF handling

ece4c7d

feat(glob): add file pattern matching tool

6fe55c1

feat(grep): add regex content search with output modes and context

5f73abd

feat(tool): register file and search tools in agent loop

4e3424f

docs(CLAUDE): add file tool modules to crate structure

a24afd1

docs(roadmap): move file tools to working today

13e22a9

fix(grep): use pre-truncation file count in count-mode summary

53490d4

The summary reported file_count from the truncated list while total_matches spanned all files, producing a misleading message. Also add tests for count-mode truncation and context-range merging.

refactor(write): use async metadata check for consistency

d488ab7

Replace blocking Path::exists() with tokio::fs::metadata().await to match the async filesystem pattern used in read.rs.

fix(edit): reject empty old_string to prevent file corruption

c549b89

Empty old_string matched at every position in the file, causing replace() to insert new_string between every character.
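
The failure mode is easy to reproduce with `str::replace`: an empty pattern matches at every character boundary, so `"abc".replace("", "-")` yields `"-a-b-c-"`. A minimal guard might look like the following; the function name `replace_checked` and the error text are assumptions.

```rust
/// Reject an empty search string before delegating to `str::replace`,
/// which would otherwise insert `new` at every character boundary.
fn replace_checked(content: &str, old: &str, new: &str) -> Result<String, String> {
    if old.is_empty() {
        // Hypothetical message; the real tool's wording may differ.
        return Err("old_string must not be empty".to_owned());
    }
    Ok(content.replace(old, new))
}
```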

test(tool): cover file-size guard, truncation caps, and context separators

439fe98

style(grep): apply rustfmt

5393b0b

docs(CLAUDE): add cargo fmt to verification checklist

3f142cd

hakula139 added the enhancement (New feature or request) label Apr 3, 2026

hakula139 self-assigned this Apr 3, 2026

hakula139 added 12 commits April 3, 2026 13:38

docs(README): add configuration section with auth and env vars

6873194

style: sort Cargo features and align README table

ac3b04d

style(tool): use destructuring assignment for discarded fmt results

f4716c0

Replace `let _ = write!(...)` with `_ = write!(...)` across all tool modules. Both are correct; the shorter form is idiomatic modern Rust (stable since 1.66).

test(tool): use indoc for multiline strings and add gitignore tests

b0bdc55

Convert multiline string literals to indoc! in edit, grep, and read tests for readability. Add gitignore integration tests for grep and glob to verify the ignore crate respects .gitignore rules.

chore(cspell): add new words to dictionary

0ff8097

style(read): use writeln for trailing newlines in output

858af4b

Replace prepend-newline pattern with writeln! so each output line terminates with \n. Produces cleaner indoc! strings in tests.

style(tool): move binary detection to its own section

0422229

BINARY_CHECK_SIZE and is_binary were placed under the "Path Resolution" section divider despite being unrelated.

refactor(tool): add ToolOutput::from_result to eliminate duplication

6d8e02e

The Result<String, String> → ToolOutput conversion was repeated identically in every tool's run() function. Centralizes it as a method on ToolOutput, reducing boilerplate in read, write, edit, glob, and grep tools.
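
A sketch of the centralized conversion, assuming a pared-down `ToolOutput` with only `content` and `is_error`. The method name `from_result` is from the commit; the struct layout is an assumption.

```rust
/// Pared-down output type for illustration; the real struct has more fields.
struct ToolOutput {
    content: String,
    is_error: bool,
}

impl ToolOutput {
    /// Map `Ok` to a normal result and `Err` to an error result, so each
    /// tool's `run()` can return `ToolOutput::from_result(inner(...))`.
    fn from_result(result: Result<String, String>) -> Self {
        match result {
            Ok(content) => Self { content, is_error: false },
            Err(content) => Self { content, is_error: true },
        }
    }
}
```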

hakula139 added 22 commits April 3, 2026 18:12

perf(grep): short-circuit files_with_matches at head_limit

b11d407

Stop scanning files once head_limit + 1 matches are collected, matching the early-exit pattern already used in format_content. Previously all files were read and matched before truncation.

style(read): humanize file size in too-large error message

df1907d

Report MB instead of raw bytes, consistent with grep's format_skipped_warnings style.

refactor(tool): unify MAX_OUTPUT_BYTES as shared constant (128 KB)

a0ab439

Move MAX_OUTPUT_BYTES to tool.rs so bash and read share the same cap. Previously bash used 100 KB and read used 128 KB for the same purpose (preventing context window flooding).
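
A byte cap on text output has to avoid cutting a multi-byte UTF-8 character in half. The sketch below shows one way to do that with the shared constant; `cap_output` is a hypothetical helper name here, and reading 128 KB as 128 × 1024 bytes is an assumption.

```rust
/// Shared cap named in the commit; 128 KB is taken as 128 * 1024 bytes here.
const MAX_OUTPUT_BYTES: usize = 128 * 1024;

/// Cut `text` to at most `max_bytes` bytes without splitting a UTF-8
/// character, by backing up to the nearest character boundary.
fn cap_output(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    let mut end = max_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}
```

Both bash and read could then call `cap_output(text, MAX_OUTPUT_BYTES)` so the two tools agree on the limit.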

docs(README): fix Rust version requirement (1.85 → 1.91)

4db2f27

The workspace sets rust-version = "1.91" but the README claimed 1.85+.

refactor(bash): extract TRUNCATION_OVERHEAD constant from magic numbers

d8db8bf

feat(client): add claude-cli User-Agent header for OAuth compatibility

2dc128f

The Anthropic API gates OAuth access for some organizations on the `claude-cli/` User-Agent prefix. Pin the version to the installed Claude Code release (2.1.87) for maximum compatibility.

style(oxide-code): normalize blank lines around compute + guard pairs

2c1c0bc

Group single-line computations with their immediate validation guards (early-return `if`) without a blank line between them, matching the pattern already used in stream_sse. Document the convention in CLAUDE.md.

test(tool): inline short expected output literals

2d419a1

Replace a handful of short `indoc!` expectations in bash and read tests with plain string literals. This keeps the assertions more compact and removes the now-unused `indoc` import from read.rs.

test(edit): reorder CRLF tests to match production function order

02a045c

docs(client): clarify CLI version comment wording

15d7c6e

style(oxide-code): add trailing periods to inline comments for consistency

67950db

docs(tool): add docstrings to pub items with non-obvious contracts

7dce1fd

Document ToolOutput is_error semantics (infrastructure failures only), is_binary detection strategy, MAX_LINE_LENGTH origin, truncate_line behavior, parse_input error return, and resolve_base_dir fallback.

docs(roadmap): add tool enhancement and TUI metadata plans

f945e5b

Add planned tool improvements: centralized output truncation pipeline and file-change tracking with read-before-write guards. Note ToolMetadata::title usage for TUI inline display.

hakula139 changed the title ~~feat(oxide-code): add file and search tools~~ feat(tool): add file and search tools with structured metadata Apr 4, 2026

hakula139 force-pushed the feat/file-tools branch from 575dd9d to dc3841b Compare April 4, 2026 16:12

hakula139 merged commit c6d23a0 into main Apr 4, 2026
1 check passed

hakula139 deleted the feat/file-tools branch April 4, 2026 16:13

hakula139 merged 58 commits into main from feat/file-tools
