Initial implementation of LLM coding tools #1

Merged
Sewer56 merged 64 commits into main from impl-initial-tools
Jan 15, 2026


Sewer56 commented Jan 5, 2026

Summary

  • Adds llm-coding-tools-core crate with lightweight, high-performance implementations of common LLM coding operations (bash, edit, glob, grep, read, write, task, todo, webfetch)
  • Adds llm-coding-tools-rig crate with rig framework integration
  • Includes tool context/preamble generation system for LLM prompts
  • Supports both absolute path and allowed-directory-sandboxed path modes

ReadTool<const LINE_NUMBERS: bool = true> allows compile-time selection
of output format. When false, returns raw content without L{n}: prefixes.
Branch in hot loop is eliminated at compile time for zero runtime cost.
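
A minimal sketch of the const-generic idea in the commit message above. The struct here is a stand-in, not the crate's actual ReadTool, and the `format` function and the exact prefix spacing are assumptions made for illustration:

```rust
use std::fmt::Write as _;

/// Stand-in for the PR's `ReadTool`; only the `LINE_NUMBERS` parameter
/// and its default of `true` are taken from the commit message.
pub struct ReadTool<const LINE_NUMBERS: bool = true>;

impl<const LINE_NUMBERS: bool> ReadTool<LINE_NUMBERS> {
    /// Hypothetical helper: formats `content` line by line, optionally
    /// prefixing each line with `L{n}: `.
    pub fn format(content: &str) -> String {
        let mut out = String::with_capacity(content.len() + 16);
        for (idx, line) in content.lines().enumerate() {
            // `LINE_NUMBERS` is a compile-time constant, so each
            // monomorphized copy of this function keeps only one side
            // of the branch.
            if LINE_NUMBERS {
                write!(out, "L{}: ", idx + 1).expect("writing to a String cannot fail");
            }
            out.push_str(line);
            out.push('\n');
        }
        out
    }
}
```

With these definitions, `ReadTool::<true>::format(..)` yields numbered lines and `ReadTool::<false>::format(..)` yields the raw lines.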
Use the grep crate as a library instead of spawning the rg binary,
eliminating the external dependency requirement. This also:
- Uses schemars::schema_for! for consistent schema generation
- Reuses Searcher instance across files for better performance
- Enables binary detection to skip binary files
- Removes rg_available() guards from tests
…tions

- Return line numbers and matched content instead of just file paths
- Add LINE_NUMBERS const generic for optional L{n}: prefix (like ReadTool)
- Use hierarchical GrepFileMatches/GrepLineMatch structure for efficient grouping
- Optimize with single Vec: collect, sort, truncate in place
- Pre-allocate output string (~8KiB) and files vec (~4KiB)
- Remove redundant matcher.find() check (searcher already filters)
- Change limit semantics from files to matches
- Export new types: GrepArgs, GrepFileMatches, GrepLineMatch, GrepOutput
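
The "collect, sort, truncate in place" step and the hierarchical output could look roughly like the sketch below. The type names come from the commit message, but every field, the `group_matches` function, and the sort key (path, then line number) are assumptions, not the crate's actual definitions:

```rust
/// One matching line inside a file (fields are hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct GrepLineMatch {
    pub line_number: u64,
    pub line: String,
}

/// All matches that belong to a single file (fields are hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct GrepFileMatches {
    pub path: String,
    pub matches: Vec<GrepLineMatch>,
}

/// Groups flat `(path, line_number, line)` hits by file.
/// `limit` caps the number of matches, not the number of files.
pub fn group_matches(mut hits: Vec<(String, u64, String)>, limit: usize) -> Vec<GrepFileMatches> {
    // Sort and truncate the single Vec in place before grouping.
    hits.sort_unstable_by(|a, b| a.0.cmp(&b.0).then(a.1.cmp(&b.1)));
    hits.truncate(limit);

    let mut files: Vec<GrepFileMatches> = Vec::new();
    for (path, line_number, line) in hits {
        // Hits are sorted by path, so a new file starts whenever the
        // path differs from the last group.
        let same_file = files.last().map_or(false, |f| f.path == path);
        if !same_file {
            files.push(GrepFileMatches { path, matches: Vec::new() });
        }
        files
            .last_mut()
            .expect("a group was just pushed or already existed")
            .matches
            .push(GrepLineMatch { line_number, line });
    }
    files
}
```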
Eliminates unnecessary memory copy from BufReader's internal buffer to user
buffer by using fill_buf() + consume() instead of read_until(). Lines are
now processed directly from BufReader's buffer with memchr for SIMD-optimized
newline scanning. Falls back to overflow buffer only for lines spanning
buffer boundaries.
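
The fill_buf() + consume() pattern described above can be sketched with the standard library alone. This is an illustration, not the PR's code: `for_each_line` is a hypothetical name, and a plain iterator search stands in for the memchr call so the example has no external dependency:

```rust
use std::io::{self, BufRead};

/// Calls `on_line` for every line in `reader`, without the trailing `\n`.
pub fn for_each_line<R: BufRead>(
    mut reader: R,
    mut on_line: impl FnMut(&[u8]),
) -> io::Result<()> {
    // Holds a partial line only when it spans a buffer boundary.
    let mut overflow: Vec<u8> = Vec::new();
    loop {
        let buf = reader.fill_buf()?;
        if buf.is_empty() {
            // EOF: flush a final line that had no trailing newline.
            if !overflow.is_empty() {
                on_line(&overflow);
            }
            return Ok(());
        }
        let mut start = 0;
        // Stand-in for `memchr::memchr(b'\n', &buf[start..])`.
        while let Some(pos) = buf[start..].iter().position(|&b| b == b'\n') {
            let line = &buf[start..start + pos];
            if overflow.is_empty() {
                // Fast path: the line is borrowed straight from the
                // reader's internal buffer, with no copy.
                on_line(line);
            } else {
                overflow.extend_from_slice(line);
                on_line(&overflow);
                overflow.clear();
            }
            start += pos + 1;
        }
        // Whatever is left is an incomplete line; keep it for the next round.
        overflow.extend_from_slice(&buf[start..]);
        let consumed = buf.len();
        reader.consume(consumed);
    }
}
```

The sketch treats only `\n` as a terminator; a `\r` before it is left in the line.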
- Add html-to-markdown-rs dependency for proper HTML to markdown conversion
- Increase MAX_RESPONSE_SIZE from 1MB to 5MB
- Use aggressive preprocessing to remove navigation, forms, scripts
- Strip img, svg, script, style, noscript tags
- Preserve semantic structure (headings, lists, links) for better LLM comprehension
- Add maybe-async dependency to reduce code duplication
- Refactor edit.rs, write.rs, read.rs to use #[maybe_async] attribute
- Split bash.rs into bash/mod.rs, async_impl.rs, blocking_impl.rs
- Rename 'tokio' feature to 'async', add 'blocking' feature for sync mode
- Update fs.rs cfg flags from 'tokio' to 'blocking'
- Update CI workflow to test both async and blocking modes with coverage
- Update AGENTS.md with correct verification commands

Lines saved: ~540 (async/sync duplication eliminated)
Restructure webfetch.rs into webfetch/ module with separate async and
blocking implementations using reqwest's blocking feature. This allows
fetch_url to work in both async and blocking modes, similar to bash.
- Introduce feature hierarchy: 'tokio' (default) depends on 'async' for compile-time safety
- 'async' is now a base feature requiring a runtime; tokio runtime is provided by 'tokio' feature
- Add compile_error! macros to prevent invalid feature combinations (async without runtime, async + blocking)
- Update coding-tools-rig to depend on 'tokio' feature explicitly
- Update CI workflow to use explicit --features tokio instead of --all-features
- Document feature flags in AGENTS.md with clear guidance on usage
- Create new context module in coding-tools-core with 15 static string constants
- Add 15 context .txt files: 5 for non-path tools (bash, task, todoread, todowrite, webfetch)
- Add 10 context files for path tool variants (read/write/edit/glob/grep with absolute/allowed)
- Each context string provides LLM-focused guidance for proper tool usage
- Re-export context module from coding-tools-rig for public API access
- Update Cargo.toml include directive to package README.md
- Create examples/basic.rs showing absolute and allowed path tools setup
- Demonstrate context string usage for LLM system prompts
- Include ToolSet builder pattern for dynamic tool management
- Compile and run without API keys (cargo run --example basic)
- Update tokio dev-dependency to include rt-multi-thread feature
- Document Feature Flags section in root README with tokio/blocking modes
- Add Context Module documentation to coding-tools-core README
- Document context strings usage with examples for path-based tools
- Add Quick Start section to coding-tools-rig README with example reference
- Update example code in root README with generic type annotations
- Enhance rig README with complete usage examples for tool instantiation
- Add ToolSet builder pattern and context string re-export documentation
- Update documentation links and provide clearer guidance on available tools
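
The feature hierarchy from the earlier commit ('tokio' enabling 'async', with 'async' and 'blocking' mutually exclusive) could be guarded at the crate root roughly as follows. The feature names are from the commit message; the Cargo.toml layout in the comment and the error messages are assumptions:

```rust
// Assumed Cargo.toml layout, shown here only as a comment:
//
//   [features]
//   default  = ["tokio"]
//   tokio    = ["async"]   # runtime feature pulls in the base feature
//   async    = []
//   blocking = []

// Reject builds that enable both execution modes at once.
#[cfg(all(feature = "async", feature = "blocking"))]
compile_error!("features `async` and `blocking` are mutually exclusive");

// Reject `async` on its own, since it needs a runtime feature.
#[cfg(all(feature = "async", not(feature = "tokio")))]
compile_error!("feature `async` requires a runtime feature such as `tokio`");
```

Because the features are mutually exclusive, `--all-features` can never build under such guards, which is consistent with the CI change to `--features tokio`.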
…le generation

- Introduce ToolContext trait in coding-tools-core for tools to provide preamble context
- Implement PreambleBuilder in core crate for framework-agnostic preamble tracking and generation
- Implement ToolContext for all 15 tools in coding-tools-rig (absolute/allowed variants and standalone)
- Use pass-through tracking pattern: track() records context and returns tool unchanged
- Zero-cost abstraction with trait method returning &'static str
- Update re-exports in both core and rig crates for public API access
- Demonstrate PreambleBuilder usage in examples/basic.rs
- Fix WriteTool<true> bug in root README (WriteTool has no const generic)
- Restructure rig README to distinguish file tools vs other tools
- Add PreambleBuilder section with usage pattern to rig README
- Add full_agent.rs example showing complete agent configuration
- Add sandboxed.rs example demonstrating allowed::* tools
- Enhance basic.rs with TodoTools demonstration
- Add Examples section to both READMEs
- Renamed repo: rig-coding-tools -> llm-coding-tools
- Renamed crates: coding-tools-core -> llm-coding-tools-core
                  coding-tools-rig -> llm-coding-tools-rig
- Updated all imports, docs, examples, and CI workflows
- Removed non-existent CONTRIBUTING.md reference from PR template
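
The pass-through tracking pattern from the ToolContext/PreambleBuilder commit above can be illustrated as follows. Only the trait name, the builder name, `track()`, and the `&'static str` return type come from the commit message; the `NAME` constant, the `context` method name, the output format, and `DemoTool` are hypothetical:

```rust
/// Sketch of the trait: each tool exposes static guidance text.
pub trait ToolContext {
    /// Section heading for the tool (hypothetical).
    const NAME: &'static str;
    /// Static guidance text, so no allocation is needed to provide it.
    fn context(&self) -> &'static str;
}

/// Framework-agnostic collector of tool contexts.
#[derive(Default)]
pub struct PreambleBuilder {
    sections: Vec<(&'static str, &'static str)>,
}

impl PreambleBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Records the tool's context and hands the tool back unchanged,
    /// so the call can wrap an existing tool registration.
    pub fn track<T: ToolContext>(&mut self, tool: T) -> T {
        self.sections.push((T::NAME, tool.context()));
        tool
    }

    /// Renders all tracked contexts (the format is an assumption).
    pub fn build(&self) -> String {
        let mut out = String::new();
        for (name, text) in &self.sections {
            out.push_str("## ");
            out.push_str(name);
            out.push('\n');
            out.push_str(text);
            out.push_str("\n\n");
        }
        out
    }
}

/// Hypothetical tool used only for demonstration.
pub struct DemoTool;

impl ToolContext for DemoTool {
    const NAME: &'static str = "demo";
    fn context(&self) -> &'static str {
        "Explains when to call the demo tool."
    }
}
```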

Sewer56 commented Jan 5, 2026

@coderabbitai full-review

Sewer56 added 15 commits January 8, 2026 11:47
…ctions

Add documentation clarifying that when bash/shell tool is enabled, the
path resolver's protections are advisory since arbitrary shell commands
can access any file. Recommends disabling bash or using OS-level
sandboxing for actual filesystem restrictions.
Replace byte-index slicing with char-based truncation to prevent
panics on multi-byte UTF-8 characters in description and preamble
output.
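
A small sketch of char-based truncation, assuming a helper named `truncate_chars` (the name and signature are illustrative, not taken from the PR). Slicing with a raw byte index such as `&s[..2]` panics when the index falls inside a multi-byte character; cutting at a position reported by `char_indices` cannot:

```rust
/// Returns at most `max_chars` characters of `s`, always cutting on a
/// character boundary.
pub fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        // `byte_idx` is the start of the first character to drop,
        // so it is always a valid slice boundary.
        Some((byte_idx, _)) => &s[..byte_idx],
        // The string has `max_chars` characters or fewer.
        None => s,
    }
}
```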
Remove redundant generic type annotations (ReadTool<true>, GrepTool<true>)
since LINE_NUMBERS defaults to true. Add example showing how to opt out
of line numbers using the explicit generic (ReadTool::<false>).
The BashArgs.workdir documentation states it must be an absolute path,
but this constraint was not enforced. Added is_absolute() check before
is_dir() check to provide a clearer error message when a relative path
is provided.
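
The check order described above might look like this. `validate_workdir`, its error type, and the message wording are assumptions for illustration; only the is_absolute()-before-is_dir() ordering is from the commit message:

```rust
use std::path::Path;

/// Validates a working directory before running a command in it.
pub fn validate_workdir(workdir: &Path) -> Result<(), String> {
    // Checked first so a relative path gets a specific message instead
    // of a generic "not a directory" error.
    if !workdir.is_absolute() {
        return Err(format!(
            "workdir must be an absolute path: {}",
            workdir.display()
        ));
    }
    if !workdir.is_dir() {
        return Err(format!("workdir is not a directory: {}", workdir.display()));
    }
    Ok(())
}
```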
…er substitution

- Introduce const generic ENV parameter to PreambleBuilder for compile-time elimination of environment section
- Add working_directory() method (only available when ENV=true) accepting runtime string paths
- Implement separate build() methods optimized for each ENV variant via impl specialization
- Add Substitute extension trait providing substitute() and substitute_all() for string placeholder replacement
- Update bash.txt context to reference Environment section instead of inline {directory} placeholder
- Update examples with explicit PreambleBuilder::<false> type annotations for clarity
- Add comprehensive tests for environment section rendering, working directory acceptance, and placeholder substitution
- Maintain full backwards compatibility: default ENV=false preserves existing API
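
A rough sketch of the two ideas in this commit: an ENV const generic that gates working_directory(), and a Substitute extension trait. All signatures, fields, and the rendered format are assumptions. The per-variant build() methods are modeled here as separate inherent impl blocks for `PreambleBuilder<true>` and `PreambleBuilder<false>`, which may differ from how the PR does it:

```rust
/// Extension trait for `{placeholder}` replacement (signatures assumed).
pub trait Substitute {
    fn substitute(&self, key: &str, value: &str) -> String;
    fn substitute_all(&self, pairs: &[(&str, &str)]) -> String;
}

impl Substitute for str {
    fn substitute(&self, key: &str, value: &str) -> String {
        // `{{{key}}}` renders as a literal `{`, the key, and a literal `}`.
        self.replace(&format!("{{{key}}}"), value)
    }

    fn substitute_all(&self, pairs: &[(&str, &str)]) -> String {
        pairs
            .iter()
            .fold(self.to_owned(), |acc, (k, v)| acc.substitute(k, v))
    }
}

/// Simplified builder: `ENV` decides at compile time whether an
/// environment section exists. The default mirrors the commit (`false`).
pub struct PreambleBuilder<const ENV: bool = false> {
    body: String,
    working_directory: Option<String>,
}

impl<const ENV: bool> PreambleBuilder<ENV> {
    pub fn new(body: impl Into<String>) -> Self {
        Self { body: body.into(), working_directory: None }
    }
}

impl PreambleBuilder<true> {
    /// Only callable when `ENV = true`.
    pub fn working_directory(mut self, dir: impl Into<String>) -> Self {
        self.working_directory = Some(dir.into());
        self
    }

    pub fn build(self) -> String {
        let dir = self.working_directory.as_deref().unwrap_or("(unknown)");
        format!("# Environment\nWorking directory: {dir}\n\n{}", self.body)
    }
}

impl PreambleBuilder<false> {
    /// No environment section is rendered for this variant.
    pub fn build(self) -> String {
        self.body
    }
}
```

Calling `working_directory()` on a `PreambleBuilder::<false>` fails to compile in this sketch, since the method does not exist for that variant.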
…ontext files

- Enhanced all context files with structured sections: Description, Parameters, When to Use, When NOT to Use, Examples, Best Practices
- Increased documentation depth from 10-20 lines to 40-130 lines per file
- Added critical safety guidance to grep files: 'NEVER invoke grep or rg as a Bash command'
- Expanded bash.txt with complete git workflows, PR creation protocols, and safety constraints
- Standardized parameter names to snake_case across all context files
- Improved usability and consistency for AI agent tool integration
Aligns examples with a runnable agent setup and trims usage details for clarity.
Add rustdoc and README snippets showing the preamble format.
Add a test that pins mtimes to validate ordering.
Read into an uninitialized buffer to reduce overhead.
Adds sparse comments and marks helper inline for readability.
Add a generic PreambleBuilder example alongside the rig ToolSet snippet, and annotate the read operation's buffering flow for easier maintenance.
Switch preamble examples to core-only types with no_run blocks and explicit generics so doctests compile cleanly.
Swap to globset matchers, drop regex error handling, and pin ripgrep-aligned crate versions.
@Sewer56 Sewer56 merged commit 8406a4f into main Jan 15, 2026
5 checks passed
Sewer56 added a commit that referenced this pull request Mar 30, 2026
Initial implementation of LLM coding tools