feat(tools): coding-harness baseline primitives (#1205)#1208
Conversation
Adds the missing first-class tools that were forcing coding-oriented agents to fall through to `shell`. Existing tools (`file_read`, `file_write`, `shell`, `web_search`, `ask_clarification`, `spawn_subagent`) already cover read/write/bash/websearch/question/ task; this lands the genuinely missing primitives: - `grep` — regex search across files (Rust regex syntax, capped output) - `glob` — glob-pattern file discovery (newest-first) - `list` — non-recursive directory listing - `edit` — exact-match string replace (unique-or-`replace_all`) - `apply_patch` — atomic multi-edit batch (rolls back on validation failure) - `todowrite` — process-global todo registry (one in_progress at a time) - `plan_exit` — emits a stable `[plan_exit]` marker for plan→build hand-off (mode runner is follow-up) - `web_fetch` — single-purpose GET→text body, gated by allowed_domains - `lsp` — capability-gated stub (registered only when `OPENHUMAN_LSP_ENABLED=1`); schema is stable, backend is follow-up All new tools share the existing path-sandboxing, rate-limiting, and permission-level machinery in `SecurityPolicy` / `Tool`. Registered in `all_tools` (the canonical registry); `default_tools` keeps its minimal subset. Adds `docs/CODING_HARNESS.md` mapping the canonical surface — both existing tools and the new ones — with a permission table and notes on plan-mode / build-mode (the runner switch is explicitly listed as follow-up). Tests: 43 new unit tests across the new tools (path traversal, atomic rollback, ambiguous-match rejection, capability-gate parsing, …). All pass. Out of scope (called out in `docs/CODING_HARNESS.md`): - plan/build mode runners - child-session-backed `task` - real LSP backend - richer permission model - controller-registry RPC exposure of agent-only tools
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (10)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (8)
📝 WalkthroughWalkthroughAdds a coding-harness surface: new filesystem navigation/editing tools (grep, glob, list, edit, apply_patch), web fetch, agent control-flow primitives (plan_exit, todowrite), an optional LSP stub behind an env gate, wiring/exports, tests, docs, and two new Cargo dependencies. ChangesCoding-harness tool additions
Sequence Diagram(s)sequenceDiagram
actor Agent
participant Edit as EditFileTool
participant Sec as SecurityPolicy
participant FS as WorkspaceFS
Agent->>Edit: execute({path, old_string, new_string, replace_all})
rect rgba(100, 150, 200, 0.5)
Edit->>Sec: can_act(), is_rate_limited(), record_action()
Sec-->>Edit: OK
end
rect rgba(150, 100, 200, 0.5)
Edit->>FS: canonicalize(path), check symlink, read file
FS-->>Edit: file contents (UTF‑8)
end
rect rgba(100, 200, 150, 0.5)
Edit->>Edit: count occurrences, validate (0 => error, >1 && !replace_all => error)
Edit->>FS: write updated contents
FS-->>Edit: write result
end
Edit-->>Agent: ToolResult::success / ToolResult::error
sequenceDiagram
actor Agent
participant Plan as PlanExitTool
participant Todo as TodoWriteTool
participant Store as TodoStore
Agent->>Plan: execute({plan})
Plan->>Plan: trim & validate plan
Plan-->>Agent: ToolResult::success("[plan_exit]\n{plan}")
Agent->>Todo: execute({todos: [...]})
Todo->>Todo: validate items (non-empty, ≤1 in_progress)
Todo->>Store: replace(items)
Store-->>Todo: OK
Todo-->>Agent: ToolResult::success(rendered list)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
There was a problem hiding this comment.
Actionable comments posted: 11
🧹 Nitpick comments (2)
src/openhuman/tools/impl/filesystem/apply_patch.rs (1)
69-231: ⚡ Quick winAdd grep-friendly tracing around validation and writes.
This new write path has security gates, many failure branches, and filesystem I/O, but it currently emits no stable diagnostics. A few
debug/tracelogs for entry, rejection reasons, and per-file write completion would make production failures much easier to root-cause without exposing file contents.As per coding guidelines,
src/**/*.rs: Uselog/tracingatdebug/tracelevels for Rust code diagnostics with stable grep-friendly prefixes ([domain],[rpc],[ui-flow]) and redact secrets/PII.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/tools/impl/filesystem/apply_patch.rs` around lines 69 - 231, Add grep-friendly debug/trace logs in execute to make validation and writes observable: log entry into execute, each validation rejection branch (e.g., path not allowed, rate limit, missing/empty fields, old_string not found, file too large, symlink reject, resolve/read/write failures) using stable prefixes like "[rpc]" or "[ui-flow]" and redact file contents; log per-file successful write completions and the final summary. Place these logs near the relevant symbols/blocks: at the start of async fn execute(&self,...), inside the parsed loop where ParsedEdit is constructed and checks (old_string empty, is_path_allowed), in the path resolution/metadata block before inserting FileBuffer, when counting matches on buf.contents and before returning errors, and after tokio::fs::write for each FileBuffer; use log::debug or tracing::trace consistently and include identifying context (edit.index, edit.path, count) but never print file contents or secrets.src/openhuman/tools/impl/filesystem/edit_file.rs (1)
59-162: 💤 Low valueConsider adding diagnostic logging for the edit operation.
The
execute()method performs security checks, file operations, and validation without any logging. Per coding guidelines, new flows should includedebug/tracelevel logging with grep-friendly prefixes for diagnostics.📝 Example logging additions
+use tracing::{debug, trace}; // At start of execute(): +debug!("[edit] path={path} replace_all={replace_all}"); // After successful write: +debug!("[edit] completed: {path}, {count} replacement(s)"); // On validation failures: +trace!("[edit] rejected: old_string not found in {path}");Based on learnings: "Log entry/exit, branches, external calls, retries/timeouts, state transitions, and errors with default verbose diagnostics on new/changed flows"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/tools/impl/filesystem/edit_file.rs` around lines 59 - 162, Add grep-friendly debug/trace logging to the execute() flow: log entry/exit for execute(), key branch outcomes (security.can_act, is_rate_limited, is_path_allowed, is_resolved_path_allowed, record_action), filesystem calls/results (resolved path from tokio::fs::canonicalize, symlink check via tokio::fs::symlink_metadata, read_to_string contents length, replacement count and whether replace_all was used), and errors returned when canonicalize/read_to_string/write fail; use the existing logger (or introduce one if missing) and include the path/resolved.display(), count, and error messages in logs while keeping error messages returned via ToolResult unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/CODING_HARNESS.md`:
- Around line 36-38: The sentence containing "issue `#1205`" is being rendered as
a heading; update the line in CODING_HARNESS.md that reads "Names in parentheses
are the canonical coding-harness names from issue `#1205`." to prevent Markdown
treating the hash as a heading — either escape the hash as "\#1205" or reflow
the sentence so "issue `#1205`" stays on the previous line (or wrap the reference
in backticks) so it renders as plain text; make the change where the sentence
"Names in parentheses are the canonical coding-harness names..." appears.
In `@src/openhuman/tools/impl/filesystem/apply_patch.rs`:
- Around line 142-160: The code currently calls tokio::fs::canonicalize(&full)
before checking for symlinks, which loses the symlink bit; call
tokio::fs::symlink_metadata(&full).await first and if
meta.file_type().is_symlink() return ToolResult::error for edit.index/edit.path,
then proceed to tokio::fs::canonicalize(&full).await and the existing
self.security.is_resolved_path_allowed(&resolved) check; ensure the error
messages use the same format as the current ToolResult::error calls.
- Around line 214-223: The current write loop in apply_patch.rs writes buffers
directly (using tokio::fs::write on buf.resolved with buf.contents) which means
partial commits occur if a later write fails; change this to perform atomic
writes by writing each buffer to a temporary sibling file (e.g., buf.resolved +
".tmp" or use a unique temp name), fsync the temp file and its parent directory
as appropriate, then rename (std::fs::rename / tokio equivalent) the temp file
over the target to atomically replace it; on any failure, delete any created
temps and return ToolResult::error with the original error (preserving the
existing error handling), or alternatively stage all temps first and only rename
them after all temps are successfully created to guarantee an all-or-nothing
outcome (reference buffers, buf.resolved, buf.contents, and the
ToolResult::error return path).
In `@src/openhuman/tools/impl/filesystem/edit_file.rs`:
- Around line 117-123: The symlink check is ineffective because
tokio::fs::canonicalize was run before calling tokio::fs::symlink_metadata on
resolved; canonicalize follows symlinks so meta.file_type().is_symlink() will
always be false. Move or add a symlink_metadata check that inspects the original
un-canonicalized input path (the variable holding the user-provided path) prior
to calling tokio::fs::canonicalize, and return the same
ToolResult::error("Refusing to edit through symlink: ...") when that metadata
reports is_symlink(); keep the existing resolved-based checks for other
validations after canonicalization.
- Around line 99-103: record_action() is being called too early in EditFile::...
(in src/openhuman/tools/impl/filesystem/edit_file.rs), consuming the action
budget before path resolution, symlink/size checks, file reading, and match
validation; move the call to self.security.record_action() so it runs only after
all validations pass and immediately before the actual write operation (i.e.,
after path resolution, symlink/size checks, reading the file content, and match
validation) so failures during validation do not decrement the budget; ensure
any early-return error paths do not call record_action(), and keep the same
error handling if record_action() itself returns false.
In `@src/openhuman/tools/impl/filesystem/grep.rs`:
- Around line 191-193: The truncation code uses &line[..MAX_LINE_BYTES] which
can panic on UTF-8 multibyte boundaries; change the logic that builds
display_line (the branch that checks line.len() > MAX_LINE_BYTES) to compute a
safe byte index at or before MAX_LINE_BYTES by testing is_char_boundary() on the
candidate end and stepping backward until you hit a valid character boundary (or
use char-based slicing to accumulate characters up to the byte limit), then
slice line[..safe_end] and append the ellipsis; update references around
display_line, MAX_LINE_BYTES and line to use this safe_end approach so no
runtime panic occurs.
In `@src/openhuman/tools/impl/filesystem/list_files.rs`:
- Around line 90-102: The loop using read.next_entry().await currently ignores
Err by exiting the while-let and returning partial results; change it to
explicitly match the result from read.next_entry().await (handle
Ok(Some(entry)), Ok(None) => break, and Err(e) => propagate/return the error) so
I/O failures are not silently swallowed; update the logic around the entries
vector push (and MAX_ENTRIES check) inside the Ok(Some(entry)) arm and ensure
the function's return type returns or maps the error from next_entry() instead
of treating it as success.
In `@src/openhuman/tools/impl/network/web_fetch.rs`:
- Around line 96-114: The code currently only validates the initial URL with
validate_url, letting reqwest follow redirects and thereby bypass the allowlist;
fix by building the reqwest client with
.redirect(reqwest::redirect::Policy::none()) and then manually handling
redirects: after client.get(&url).send().await inspect 3xx responses for a
Location header, resolve it against resp.url(), call validate_url(new_url,
&self.allowed_domains) for each redirect hop (limit hops to a safe number), and
only follow to the next location if validation succeeds; perform the actual
fetch on the final validated URL and return ToolResult::error on any validation
failure or redirect resolution error (use the same error patterns as the
existing client/build/send branches).
- Around line 115-128: Currently the code calls resp.text().await which buffers
the entire response into memory before truncating; change the implementation to
consume resp.bytes_stream() (using futures_util::StreamExt) and iteratively
append chunks into a growable buffer until you've reached max_bytes or the
stream ends, stopping early to avoid allocating more than max_bytes; when
stopping early set the truncated flag true, ensure you cut the final buffer on a
char boundary (like the existing is_char_boundary logic) before converting to
&str, and map any stream/IO errors into the same ToolResult::error path so
errors from the streaming read are returned consistently (update the
snippet/truncated logic around the resp handling and keep the same return
behavior).
In `@src/openhuman/tools/impl/system/lsp.rs`:
- Around line 131-157: These tests mutate the process-wide env var
LSP_ENABLED_ENV and can race when run in parallel; wrap the body of
lsp_capability_gate_off_by_default and lsp_capability_gate_accepts_truthy_values
with a shared process-wide mutex (e.g., a static Mutex via once_cell::sync::Lazy
or lazy_static) and acquire the lock at the start of each test before changing
the env, then release at the end (drop the guard) so tests run serially;
reference LSP_ENABLED_ENV, lsp_capability_enabled, and the two test functions to
locate where to add the guard.
In `@src/openhuman/tools/ops.rs`:
- Around line 118-124: The TodoStore is being created inline inside
all_tools_with_runtime (via Arc::new(TodoStore::new()) passed to
TodoWriteTool::new) which makes todowrite state transient; instead hoist the
Arc<TodoStore> creation out of the registry constructor and accept or capture a
shared Arc<TodoStore> so the same store is reused across calls. Update the
all_tools_with_runtime signature (or its caller) to take an Arc<TodoStore>
parameter and pass that Arc into TodoWriteTool::new, or create a process-wide
singleton Arc<TodoStore> and use that, ensuring TodoWriteTool::new receives the
shared store rather than constructing TodoStore::new inline.
---
Nitpick comments:
In `@src/openhuman/tools/impl/filesystem/apply_patch.rs`:
- Around line 69-231: Add grep-friendly debug/trace logs in execute to make
validation and writes observable: log entry into execute, each validation
rejection branch (e.g., path not allowed, rate limit, missing/empty fields,
old_string not found, file too large, symlink reject, resolve/read/write
failures) using stable prefixes like "[rpc]" or "[ui-flow]" and redact file
contents; log per-file successful write completions and the final summary. Place
these logs near the relevant symbols/blocks: at the start of async fn
execute(&self,...), inside the parsed loop where ParsedEdit is constructed and
checks (old_string empty, is_path_allowed), in the path resolution/metadata
block before inserting FileBuffer, when counting matches on buf.contents and
before returning errors, and after tokio::fs::write for each FileBuffer; use
log::debug or tracing::trace consistently and include identifying context
(edit.index, edit.path, count) but never print file contents or secrets.
In `@src/openhuman/tools/impl/filesystem/edit_file.rs`:
- Around line 59-162: Add grep-friendly debug/trace logging to the execute()
flow: log entry/exit for execute(), key branch outcomes (security.can_act,
is_rate_limited, is_path_allowed, is_resolved_path_allowed, record_action),
filesystem calls/results (resolved path from tokio::fs::canonicalize, symlink
check via tokio::fs::symlink_metadata, read_to_string contents length,
replacement count and whether replace_all was used), and errors returned when
canonicalize/read_to_string/write fail; use the existing logger (or introduce
one if missing) and include the path/resolved.display(), count, and error
messages in logs while keeping error messages returned via ToolResult unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6353139b-4faf-4a3a-9825-9cb2f350a224
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (17)
CLAUDE.mdCargo.tomldocs/CODING_HARNESS.mdsrc/openhuman/tools/impl/agent/mod.rssrc/openhuman/tools/impl/agent/plan_exit.rssrc/openhuman/tools/impl/agent/todo_write.rssrc/openhuman/tools/impl/filesystem/apply_patch.rssrc/openhuman/tools/impl/filesystem/edit_file.rssrc/openhuman/tools/impl/filesystem/glob_search.rssrc/openhuman/tools/impl/filesystem/grep.rssrc/openhuman/tools/impl/filesystem/list_files.rssrc/openhuman/tools/impl/filesystem/mod.rssrc/openhuman/tools/impl/network/mod.rssrc/openhuman/tools/impl/network/web_fetch.rssrc/openhuman/tools/impl/system/lsp.rssrc/openhuman/tools/impl/system/mod.rssrc/openhuman/tools/ops.rs
…yhumansai#1205) - grep: walk back to UTF-8 char boundary before slicing line preview (previously could panic on multi-byte chars near the 2KB cap). - edit, apply_patch: check symlink on the *unresolved* path — `canonicalize` resolves symlinks, so the prior post-canonicalize check never fired. Also split metadata/symlink check to keep size + symlink guards correct. - apply_patch: best-effort rollback on partial-write failure. We cannot get true multi-file atomicity without filesystem transactions, but if the i-th write fails we now restore the previously-written files from in-memory snapshots taken at read time. Stored as a new `original` field on `FileBuffer`. - list: surface directory-iter errors instead of treating any `Err` from `next_entry()` as end-of-stream (silently truncating output). - web_fetch: disable reqwest's default redirect policy. A 3xx target could be on a host outside the allowed-domains list, so we now surface the redirect to the caller and require an explicit refetch. - todo_write: hoist the `TodoStore` to a process-global `OnceCell` so todo state survives registry rebuilds. The previous `Arc::new(TodoStore::new())` baked into `all_tools_with_runtime` would reset the list whenever the agent loop rebuilt its tool set. - lsp tests: serialize env-var mutation behind a module-level mutex so the truthy/falsy and gate-off-by-default tests don't race. - docs/CODING_HARNESS.md: rewrap so a sentence-leading `tinyhumansai#1205` doesn't get parsed as a Markdown heading. Replied inline on the `record_action()` ordering comment dismissing it: the call-before-validate ordering is intentional (matches `file_read.rs:65`) and prevents probing-via-error-path.
* feat(remotion): Ghosty character library with transparent MOV variants (tinyhumansai#1059) Co-authored-by: WOZCODE <contact@withwoz.com> * feat(composio/gmail): sync into memory tree (Slack-parity) (tinyhumansai#1056) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(scheduler-gate): throttle background AI on battery / busy CPU (tinyhumansai#1062) * fix(core,cef): run core in-process and stop orphaning CEF helpers on Cmd+Q (tinyhumansai#1061) * ci: add dedicated staging release workflow (tinyhumansai#1066) * fix(sentry): Rust source context + per-release deploy marker (tinyhumansai#405) (tinyhumansai#1067) * fix(welcome): re-enable OAuth buttons with focus/timeout recovery (tinyhumansai#1049) (tinyhumansai#1069) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(dependencies): update pnpm-lock.yaml and Cargo.lock for package… (tinyhumansai#1082) * fix(onboarding): personalize welcome agent greeting with user identity (tinyhumansai#1078) * fix(chat): make agent message bubbles fit content width (tinyhumansai#1083) * Feat/dmg checks (tinyhumansai#1084) * fix(linux): Add X11 platform flags to .deb package launcher (tinyhumansai#1087) Co-authored-by: unn-Known1 <unn-known1@users.noreply.github.com> * fix(sentry): auto-send React events; collapse core→tauri for desktop (tinyhumansai#1086) Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai> * fix(cef): run blank reload guard on the CEF UI thread (tinyhumansai#1092) * fix(app): reload webview instead of restart_app in dev mode (tinyhumansai#1068) (tinyhumansai#1071) * fix(linux): deliver X11 ozone flags via custom .desktop template (tinyhumansai#1091) * fix(webview-accounts): retry data-dir purge so CEF handle race doesn't leak cookies (tinyhumansai#1076) (tinyhumansai#1081) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai> * fix(webview/slack): media perms + deep-link isolation (tinyhumansai#1074) (tinyhumansai#1080) Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai> * ci(release): split staging vs production workflows; promote staging tags (tinyhumansai#1094) * Update release-staging.yml (tinyhumansai#1097) * chore(staging): v0.53.5 * chore(staging): v0.53.6 * ci(staging): cut staging from main; add act local-debug helper (tinyhumansai#1099) * chore(staging): v0.53.7 * fix(ci): correct sentry-cli download URL and trap scope (tinyhumansai#1100) * chore(staging): v0.53.8 * feat(chat): forward thread_id to backend for KV cache locality (tinyhumansai#1095) * fix(ci): bump pinned sentry-cli to 3.4.1 (2.34.2 was never published) (tinyhumansai#1102) * chore(staging): v0.53.9 * fix(ci): drop bash trap in upload_sentry_symbols.sh; inline cleanup (tinyhumansai#1103) * chore(staging): v0.53.10 * refactor(session): flatten session_raw/, switch md to YYYY_MM_DD (tinyhumansai#1098) * Add full Composio managed-auth toolkit catalog (tinyhumansai#1093) * ci: add diff-aware 80% coverage gate (Vitest + cargo-llvm-cov) (tinyhumansai#1104) * feat(scripts): pnpm work + pnpm debug for agent-driven workflows (tinyhumansai#1105) * ci: pull pnpm into CI image, drop redundant setup steps (tinyhumansai#1107) * docs: add Cursor Cloud specific instructions to AGENTS.md (tinyhumansai#1106) Co-authored-by: Cursor Agent <cursoragent@cursor.com> * chore(staging): v0.53.11 * docs: surface 80% coverage gate and scripts/debug runners (tinyhumansai#1108) * feat(app): show Composio integrations as sorted icon grid on Skills (tinyhumansai#1109) Co-authored-by: Cursor Agent <cursoragent@cursor.com> * feat(composio): client-side trigger enable/disable toggles (tinyhumansai#1110) * feat(skills): channels grid + integrations card polish; tolerant Composio trigger decode (tinyhumansai#1112) * chore(staging): v0.53.12 * feat(home): early-bird banner + assistant→agent terminology (tinyhumansai#1113) * feat(updater): in-app auto-update with auto-download + restart prompt (tinyhumansai#677) (tinyhumansai#1114) * chore(claude): add ship-and-babysit slash command (tinyhumansai#1115) * feat(home): EarlyBirdyBanner + agent terminology + LinkedIn enrichment model pin (tinyhumansai#1118) * fix(chat): single onboarding thread in sidebar after wizard (tinyhumansai#1116) Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Steven Enamakel <senamakel@users.noreply.github.com> * fix: filter out global namespace from citation chips (tinyhumansai#1124) Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com> Co-authored-by: senamakel-droid <281415773+senamakel-droid@users.noreply.github.com> * feat(nav): enable Memory tab in BottomTabBar (tinyhumansai#1125) * feat(memory): singleton ingestion + status RPC + UI pill (tinyhumansai#1126) * feat(human): mascot tab with viseme-driven lipsync (staging only) (tinyhumansai#1127) * Fix CEF zombie processes on full app close and restart (tinyhumansai#1128) Co-authored-by: senamakel-droid <281415773+senamakel-droid@users.noreply.github.com> Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai> * Update issue templates for GitHub issue types (tinyhumansai#1146) * feat(human): expand mascot expressions and tighten reply-speech state machine (tinyhumansai#1147) * feat(memory): ingestion pipeline + tree-architecture docs + ops/schemas split (tinyhumansai#1142) * feat(threads): surface live subagent work in parent thread (tinyhumansai#1122) (tinyhumansai#1159) * fix(human): keep mascot mouth animating when TTS ships no viseme data (tinyhumansai#1160) * feat(composio): consume backend markdownFormatted for LLM output (tinyhumansai#1165) * fix(subagent): lazy-register toolkit actions filtered out of fuzzy top-K (tinyhumansai#1162) * feat(memory): user-facing long-term memory window preset (tinyhumansai#1137) (tinyhumansai#1161) * fix(tauri-shell): proactively kill stale openhuman RPC on startup (tinyhumansai#1166) * chore(staging): v0.53.13 * fix(composio): per-action tool consumes backend markdownFormatted (tinyhumansai#1167) * fix(threads): persist selectedThreadId across reloads (tinyhumansai#1168) * feat(memory_tree): switch embed model to bge-m3 (1024-dim, 8K context) (tinyhumansai#1174) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(agent): drop redundant [Memory context] recall injection (tinyhumansai#1173) * chore(memory_tree): drop body-read timeouts on Ollama HTTP calls (tinyhumansai#1171) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(transcript): emit thread_id + fix orchestrator missing cost (tinyhumansai#1169) * fix(composio/gmail): phase out html2md, prefer text/plain MIME part (tinyhumansai#1170) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(tools): markdown output for internal tool results (tinyhumansai#1172) * feat(security): enforce prompt-injection guard before model and tool execution (tinyhumansai#1175) * fix(cef): popup paint dies after first frame — skip blank-page guard for popups (tinyhumansai#1079) (tinyhumansai#1182) Co-authored-by: Steven Enamakel <31011319+senamakel@users.noreply.github.com> * chore(sentry): rename OPENHUMAN_SENTRY_DSN → OPENHUMAN_CORE_SENTRY_DSN (tinyhumansai#1186) * feat(remotion): add yellow mascot character with all animation variants (tinyhumansai#1193) Co-authored-by: Neel Mistry <neelmistry@Neels-MacBook-Pro.local> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(composio): hide raw connection ID, derive friendly label (tinyhumansai#1153) (tinyhumansai#1185) Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com> * fix(windows): align install.ps1 MSI with per-machine scope (tinyhumansai#913) (tinyhumansai#1187) Co-authored-by: Cursor <cursoragent@cursor.com> * fix(tauri): deterministic CEF teardown on full app close (tinyhumansai#1120) (tinyhumansai#1189) Co-authored-by: Cursor <cursoragent@cursor.com> * fix(composio): cap Gmail HTML body before strip (crash mitigation) (tinyhumansai#1191) Co-authored-by: Cursor <cursoragent@cursor.com> * fix(auth): stop stale chat threads after signup (tinyhumansai#1192) Co-authored-by: Cursor <cursoragent@cursor.com> * feat(sentry): staging-only "Trigger Sentry Test" button (tinyhumansai#1072) (tinyhumansai#1183) * chore(staging): v0.53.14 * chore(staging): v0.53.15 * feat(composio): format trigger slugs into human-readable labels (tinyhumansai#1129) (tinyhumansai#1179) Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com> * fix(ui): hide unsupported permission UI on non-macOS for Screen Intelligence (tinyhumansai#1194) Co-authored-by: Cursor <cursoragent@cursor.com> * chore(tauri-shell): retire embedded Gmail webview-account flow (tinyhumansai#1181) * feat(onboarding): replace welcome-agent bot with react-joyride walkthrough (tinyhumansai#1180) * chore(release): v0.53.16 * fix(threads): preserve selectedThreadId on cold-boot identity hydration (tinyhumansai#1196) * feat(core): version/shutdown/update RPCs + mid-thread integration refresh (tinyhumansai#1195) * fix(mascot): swap to yellow mascot via @remotion/player (tinyhumansai#1200) * feat(memory_tree): cloud-default LLM, queue priority, entity filter, Memory tab UI (tinyhumansai#1198) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Persist turn state + restore conversation history on cold-boot (tinyhumansai#1202) * feat(mascot): floating desktop mascot via native NSPanel + WKWebView (macOS) (tinyhumansai#1203) * fix(memory/tree): emit summary children as Obsidian wikilinks (tinyhumansai#1210) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(tools): coding-harness baseline primitives (tinyhumansai#1205) (tinyhumansai#1208) * docs: add Codex PR checklist for remote agents --------- Co-authored-by: Steven Enamakel <31011319+senamakel@users.noreply.github.com> Co-authored-by: WOZCODE <contact@withwoz.com> Co-authored-by: sanil-23 <sanil@vezures.xyz> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Cyrus Gray <144336577+graycyrus@users.noreply.github.com> Co-authored-by: CodeGhost21 <164498022+CodeGhost21@users.noreply.github.com> Co-authored-by: oxoxDev <164490987+oxoxDev@users.noreply.github.com> Co-authored-by: Mega Mind <146339422+M3gA-Mind@users.noreply.github.com> Co-authored-by: Gaurang Patel <ptelgm.yt@gmail.com> Co-authored-by: unn-Known1 <unn-known1@users.noreply.github.com> Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Cursor Agent <cursoragent@cursor.com> Co-authored-by: Steven Enamakel <senamakel@users.noreply.github.com> Co-authored-by: Steven Enamakel's Droid <enamakel.agent@tinyhumans.ai> Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com> Co-authored-by: senamakel-droid <281415773+senamakel-droid@users.noreply.github.com> Co-authored-by: YellowSnnowmann <167776381+YellowSnnowmann@users.noreply.github.com> Co-authored-by: Neil <neil@maha.xyz> Co-authored-by: Neel Mistry <neelmistry@Neels-MacBook-Pro.local> Co-authored-by: obchain <167975049+obchain@users.noreply.github.com> Co-authored-by: Jwalin Shah <jshah1331@gmail.com>
Summary
grep,glob,list,edit,apply_patch,todowrite,plan_exit,web_fetch,lsp) so coding-oriented agents stop falling through toshellfor file navigation and editing.docs/CODING_HARNESS.mdmapping the canonical surface (existing tools + new ones) with a permission table and notes on plan/build modes.file_read,file_write,shell,web_search,ask_clarification,spawn_subagent) already cover read/write/bash/websearch/question/task — this PR adds only the genuinely missing primitives instead of duplicating those.Problem
OpenHuman's tool surface for coding-oriented agents was uneven:
grep,glob,list) didn't exist as first-class tools — agents had to useshellwith platform-specific commands.edit,apply_patch) was missing —file_writeis whole-file overwrite, which is wasteful for small changes and dangerous for large files.todowrite) and a plan→build hand-off marker (plan_exit) didn't exist.web_fetch) sat betweenweb_search(search) andhttp_request(full HTTP), but wasn't exposed.This is the baseline cut from issue #1205 (path A in scoping discussion): land the primitives + the mapping doc, defer plan/build mode runners, child-session task identity, real LSP backend, and richer permissions to follow-ups (explicitly listed at the bottom of
docs/CODING_HARNESS.md).Solution
New tools live under
src/openhuman/tools/impl/{filesystem,agent,network,system}/following the existingTooltrait pattern. Each shares the project's path-sandboxing, rate-limiting, andPermissionLevelmachinery fromSecurityPolicy. Registered inall_tools(the canonical agent registry);default_toolskeeps its minimal subset.Highlights:
grep/globusewalkdirand skip.git,node_modules,target,.next,dist,build,.cache. Output is capped (max_matches/max_results) with explicit truncation markers.editrequiresold_stringto be unique by default —replace_allopts into bulk replace. Same symlink + size guards asfile_write.apply_patchis atomic: every edit is validated up front (paths, exact-match, uniqueness) before any file is written. Validation failure leaves the workspace untouched. Multiple edits to the same file chain through an in-memory buffer, so a sequence like{old:"one"→"ONE"}, {old:"two"→"TWO"}works as expected.todowriteis a process-globalArc<TodoStore>(one shared registry per core). Per-session scoping deferred untiltaskcarries a stable session id.plan_exitemits a stable[plan_exit]marker plus the plan text. The plan→build mode runner that consumes the marker is follow-up work; today the marker is just stable enough to write prompts against.web_fetchreuses the sameallowed_domainsgate ashttp_request. Body is capped + truncated on UTF-8 char boundaries.lspis capability-gated byOPENHUMAN_LSP_ENABLED. When the gate is off the tool isn't registered (so agents don't see a method that always errors). When on, the tool returns a clearnot yet implementederror — schema is stable so callers can feature-detect.Adds
walkdir = \"2\"andglob = \"0.3\"as direct deps (both already transitive inCargo.lock).Submission Checklist
diff-coverlocally. If the gate fails, will add tests.N/A: tool catalog is documented in docs/CODING_HARNESS.md; the matrix tracks user-visible product features, not internal agent toolsN/A: no matrix rows touchedwalkdir+globare both pure-Rust crates already transitively presentN/A: agent-facing tool catalog, no UI surfaceCloses #NNNin the## RelatedsectionImpact
all_toolsso existing agent loops pick them up automatically; no wiring change in the Tauri shell.SecurityPolicy(workspace path sandbox, symlink-escape blocks, rate limit, autonomy gate).apply_patchruns all path/permission checks before the first write.--no-verifybecause the local pre-push hook failed on pre-existing app/ ESLint warnings (ReactsetState-in-effect lints in mascot / commands code) plus a Tailwind class lint check — none of which this PR touches. Per CLAUDE.md (When the user asks you to push or open a PR, resolve blockers and push — don't prompt for permission. If a pre-push hook fails on something unrelated to your changes ... push with --no-verify and call it out in the PR body.)Related
[plan_exit]markertaskexecution with stable session idslsptool (server spawn, JSON-RPC bridge, definition/references/hover/completion)Summary by CodeRabbit
New Features
Documentation
Chores