Extract llm-coding-tools-agents crate and CI wiring #33
Conversation
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```
@@           Coverage Diff            @@
##             main      #33    +/-  ##
==========================================
+ Coverage   66.75%   71.03%   +4.28%
==========================================
  Files          41       48       +7
  Lines        1158     1502     +344
==========================================
+ Hits          773     1067     +294
- Misses        385      435      +50
```

Flags with carried forward coverage won't be shown.
Introduce a new crate for agent framework support with catalog-based tool discovery and TOML manifest configuration. This provides a foundational integration point for registering and discovering agent-capable tools within the workspace.

Changes:
- Add llm-coding-tools-agents crate with catalog, config, and loader modules
- Add workspace member entry for llm-coding-tools-agents in Cargo.toml
- Update Cargo.lock with new crate dependencies

Benefits:
- Enables agent framework integration with tool discovery
- Provides structured manifest configuration for agent capabilities
- Establishes foundation for agent orchestration workflows
Walkthrough: Adds a new workspace crate.

Possibly related PRs
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (7)
src/llm-coding-tools-agents/src/config.rs (1)
90-124: Serialization round-trip loses the `prompt` field.

`AgentConfig` derives both `Serialize` and `Deserialize`, but `prompt` is `#[serde(skip)]`. Serializing and deserializing an `AgentConfig` will silently drop the prompt. If round-tripping is intended (e.g., caching configs), this will be a data-loss bug. If serialization is only used for inspection/logging, this is fine — consider documenting the limitation on the struct or the `prompt` field.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/config.rs` around lines 90 - 124, AgentConfig currently uses #[serde(skip)] on the prompt field which prevents serialization/deserialization and causes data loss on round-trips; to fix it, stop skipping prompt and make it serde-friendly (e.g., remove #[serde(skip)] and add #[serde(default)] or otherwise implement Serialize/Deserialize handling) so prompt is preserved across Serialize/Deserialize operations—update the AgentConfig struct's prompt field (named prompt) and ensure any deserialization path supplies a default empty string when the field is missing.

src/llm-coding-tools-agents/src/error.rs (1)
35-57: Repeated path formatting logic across `Display` arms.

Lines 39–41, 45–47, and 51–53 contain identical `path_str` extraction. Consider a small helper to DRY this up.

Suggested refactor

```diff
+fn format_path(path: &Option<PathBuf>) -> &str {
+    path.as_deref()
+        .map_or("<memory>", |p| p.to_str().unwrap_or("<invalid>"))
+}
+
 impl fmt::Display for AgentLoadError {
     fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
         match self {
             AgentLoadError::Io { path, source } => {
-                let path_str = path
-                    .as_deref()
-                    .map_or("<memory>", |p| p.to_str().unwrap_or("<invalid>"));
+                let path_str = format_path(path);
                 write!(f, "I/O error reading {path_str}: {source}")
             }
             AgentLoadError::Parse { path, source } => {
-                let path_str = path
-                    .as_deref()
-                    .map_or("<memory>", |p| p.to_str().unwrap_or("<invalid>"));
+                let path_str = format_path(path);
                 write!(f, "parse error in {path_str}: {source}")
             }
             AgentLoadError::SchemaValidation { path, message } => {
-                let path_str = path
-                    .as_deref()
-                    .map_or("<memory>", |p| p.to_str().unwrap_or("<invalid>"));
+                let path_str = format_path(path);
                 write!(f, "schema validation failed in {path_str}: {message}")
             }
         }
     }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/error.rs` around lines 35 - 57, The Display impl for AgentLoadError repeats the same path extraction in fmt; refactor by moving that logic into a small helper (either a private function or a closure) inside fmt or an impl helper on AgentLoadError (e.g., a method like AgentLoadError::path_display(&self) -> String) and replace the duplicated let path_str = ... blocks in the match arms with a single call to that helper, then use the returned string in each write! invocation.

src/llm-coding-tools-agents/src/loader.rs (2)
237-249: `add_from_bytes` loses potential owned allocation from the caller.

The signature `bytes: impl AsRef<[u8]>` borrows the input, so `from_utf8` returns a `&str` that must be cloned into a `String` for parsing. If callers commonly pass `Vec<u8>`, an overload accepting `Vec<u8>` (using `String::from_utf8`) could save one allocation. Not critical for typical agent-file sizes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/loader.rs` around lines 237 - 249, The add_from_bytes method currently accepts bytes: impl AsRef<[u8]>, forcing a borrowed path that produces a &str via std::str::from_utf8 and causes callers who own a Vec<u8> to incur an extra allocation when they must clone into a String; add a consuming overload that takes Vec<u8> (e.g., add_from_vec or change signature to bytes: impl Into<Vec<u8>>) and parse with String::from_utf8 to reuse the owned buffer, mapping FromUtf8Error into AgentLoadError::schema_validation the same way as the existing UTF-8 error handling, while keeping the existing add_from_bytes (or delegating it to the new consuming implementation) and continuing to call config_from_str_strict and catalog.insert as before.
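The allocation difference behind this suggestion can be shown with a small std-only sketch. The function names below are hypothetical stand-ins for the borrowed and consuming entry points, not the crate's actual API:

```rust
// Illustrative only: why a consuming Vec<u8> entry point can skip a copy.

fn parse_borrowed(bytes: &[u8]) -> Result<String, std::str::Utf8Error> {
    // Borrowed path: validate in place, then clone into an owned String.
    let s = std::str::from_utf8(bytes)?;
    Ok(s.to_owned()) // extra allocation + copy
}

fn parse_owned(bytes: Vec<u8>) -> Result<String, std::string::FromUtf8Error> {
    // Consuming path: String::from_utf8 validates and reuses the buffer.
    String::from_utf8(bytes) // no copy; the Vec's allocation is reused
}

fn main() {
    let data = b"name: reviewer".to_vec();
    assert_eq!(parse_owned(data).unwrap(), "name: reviewer");
    assert_eq!(parse_borrowed(b"name: reviewer").unwrap(), "name: reviewer");
    println!("ok");
}
```

Both paths report invalid UTF-8 as an error; only the consuming variant avoids the second allocation.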
106-126: `add_directory_with_errors`: file-level errors are swallowed when no callback is provided.

When `on_error` is `None`, file-level load failures are silently dropped (Lines 118–121 only invoke the handler if `Some`). This is documented behavior matching `add_directory`, but callers should be aware that partial loads are indistinguishable from full loads at the return-type level. Consider whether a summary (e.g., a count of skipped files) would be valuable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/loader.rs` around lines 106 - 126, add_directory_with_errors currently swallows file-level load errors when on_error is None; modify the function (and AgentLoadError) to accumulate failures during load_directory_with (track a skipped_count and optionally a Vec<Path>) and after iteration, if skipped_count > 0 and on_error.is_none() return an Err(AgentLoadError::DirectoryLoadPartial { skipped: skipped_count, paths: skipped_paths }) instead of Ok(()); keep existing behavior of invoking handler when on_error.is_some(), and ensure to update the load_agent_file/error handling in add_directory_with_errors and the AgentLoadError enum to include a DirectoryLoadPartial variant so callers can distinguish partial loads from full success.

src/llm-coding-tools-agents/src/parser/mod.rs (2)
126-168: `find_frontmatter_offsets`: closing delimiter not verified as line-exclusive.

Line 141 searches for `"\n---"` but does not verify that the closing `---` occupies the rest of the line (or is followed by only whitespace/newline). If the YAML body contains a line like `\n---sometext`, it will be treated as the closing delimiter and `"sometext"` will bleed into the body content.

This matches the behavior of many simple frontmatter parsers, but it's worth noting. If a stricter check is desired, you could verify that the byte after the closing `---` is `\n`, EOF, or whitespace.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/parser/mod.rs` around lines 126 - 168, The closing delimiter detection in find_frontmatter_offsets currently uses tail.find("\n---") without ensuring the three dashes are alone on the line; modify find_frontmatter_offsets to validate that the byte(s) following the matched "\n---" are either EOF, a newline, or only whitespace (e.g., space or tab) up to a newline; if the check fails, continue searching for the next occurrence of "\n---" in tail until a valid line-exclusive closing delimiter is found (then compute closing_newline, yaml_end, closing_start, after_closing, and body_start from that validated match as currently done).
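The stricter check described above can be sketched with std only. `find_closing_delim` is a hypothetical simplification of `find_frontmatter_offsets`, not the real function:

```rust
/// Returns the byte offset of a `\n---` whose dashes are alone on their line
/// (followed only by spaces/tabs up to a newline or EOF).
fn find_closing_delim(tail: &str) -> Option<usize> {
    let mut search_from = 0;
    while let Some(rel) = tail[search_from..].find("\n---") {
        let start = search_from + rel;
        let rest = &tail[start + 4..]; // bytes after "\n---"
        let line_end = rest.find('\n').unwrap_or(rest.len());
        if rest[..line_end].chars().all(|c| c == ' ' || c == '\t') {
            return Some(start); // line-exclusive closing delimiter
        }
        search_from = start + 1; // keep scanning past the false match
    }
    None
}

fn main() {
    // "---sometext" must not be treated as the closing fence; the real
    // fence on its own line later in the input should win instead.
    let yaml = "key: value\n---sometext\nmore: body\n---\nbody";
    assert_eq!(find_closing_delim(yaml), Some(33));
    println!("ok");
}
```

A `\n---` at end of input still matches, since the empty remainder trivially passes the whitespace check.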
87-107: Minor: `validate_headless_contract` allocates two `String`s per call for map lookups.

Lines 91–92 create owned `Value::String("permission".to_string())` and `Value::String("task".to_string())` on every invocation. Since this runs once per parsed file, the cost is negligible, but if you ever call this in a hot loop, consider caching these keys or using a helper that searches by `&str`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/parser/mod.rs` around lines 87 - 107, The current validate_headless_contract creates owned Value::String for "permission" and "task" on each call (permission_key and task_key); replace those allocations by looking up keys via a small helper that compares Value::String contents to a &str (e.g., add fn find_by_str(map: &Mapping, key: &str) -> Option<&Value> that checks if a mapping key is a Value::String and equals key), then use that helper to get permission_map and task_rule instead of creating permission_key/task_key; keep the existing logic and error (AgentParseError::SchemaValidation) and still call task_rule_contains_ask(task_rule).

src/llm-coding-tools-agents/src/parser/preprocessor.rs (1)
131-166: Capacity estimate underestimates for multiple rewrites.

Line 146 allocates `input.len() + 5`, which accounts for one block-scalar expansion (~5 extra chars per rewrite). If multiple values need rewriting (e.g., `model:` and `api_url:` both contain colons), each adds ~5 bytes, but the buffer will need to reallocate. This is fine for correctness—just a minor optimization opportunity.

Suggested improvement

If you wanted a tighter estimate without a counting pass, a simple heuristic like `+ 16` or `+ 32` would cover typical frontmatter with 2–6 rewritten lines.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/parser/preprocessor.rs` around lines 131 - 166, The initial capacity for the output String in convert_block_scalars underestimates growth because it only adds +5 bytes for a single rewrite; when multiple block scalars are rewritten the buffer will reallocate. Fix by improving the capacity estimate before building `out`: either pre-scan `input` for additional block-scalar matches (using find_first_block_scalar/block_scalar_parts) and add ~5 bytes per match to the capacity, or use a simple larger heuristic (e.g. +32) instead of +5 to avoid reallocations for typical inputs.
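The counting-pass variant mentioned in the prompt can be sketched as follows. `needs_rewrite` and `EXTRA_PER_REWRITE` are hypothetical stand-ins for `block_scalar_parts` and the real per-rewrite growth, used only to show the shape of the estimate:

```rust
// Illustrative capacity estimate: pre-scan the input and reserve a few
// extra bytes per line that will be rewritten.

const EXTRA_PER_REWRITE: usize = 5;

fn needs_rewrite(line: &str) -> bool {
    // Toy predicate: an unquoted value that still contains a colon.
    line.split_once(": ")
        .map_or(false, |(_, v)| v.contains(':') && !v.starts_with('"'))
}

fn estimate_capacity(input: &str) -> usize {
    let rewrites = input.lines().filter(|l| needs_rewrite(l)).count();
    input.len() + rewrites * EXTRA_PER_REWRITE
}

fn main() {
    let fm = "model: openai:gpt-4\napi_url: https://example.test\nname: demo";
    // Two lines carry colon-bearing values, so reserve 2 * 5 extra bytes.
    assert_eq!(estimate_capacity(fm), fm.len() + 10);
    println!("ok");
}
```

The trade-off is one extra scan over the input versus the occasional reallocation; for small frontmatter blocks either is cheap, which is why the review also floats a flat `+ 32` heuristic.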
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`src/Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (14)
- .github/workflows/rust.yml
- src/Cargo.toml
- src/llm-coding-tools-agents/Cargo.toml
- src/llm-coding-tools-agents/README.md
- src/llm-coding-tools-agents/benches/fixtures/orchestrator-quality-gate-gpt5.md
- src/llm-coding-tools-agents/benches/parser.rs
- src/llm-coding-tools-agents/src/catalog.rs
- src/llm-coding-tools-agents/src/config.rs
- src/llm-coding-tools-agents/src/error.rs
- src/llm-coding-tools-agents/src/extensions.rs
- src/llm-coding-tools-agents/src/lib.rs
- src/llm-coding-tools-agents/src/loader.rs
- src/llm-coding-tools-agents/src/parser/mod.rs
- src/llm-coding-tools-agents/src/parser/preprocessor.rs
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.rs
📄 CodeRabbit inference engine (src/AGENTS.md)
`src/**/*.rs`:
- Preallocate collections when size is known or estimable using `String::with_capacity(estimated_len)`, `Vec::with_capacity(count)`, or `BufReader::with_capacity(size, reader)`
- Prefer `&str`/`&[T]` returns over owned types when lifetime allows
- Use `Cow<'_, str>` for conditional ownership (e.g., `String::from_utf8_lossy`)
- Use `&'static str` for compile-time constant strings
- Reuse buffers by calling `.clear()` and reusing `Vec`/`String` instead of reallocating
- Use const generics for compile-time branching (e.g., `<const LINE_NUMBERS: bool>`)
- Use `#[inline]` on small, hot-path functions
- Prefer `core` over `std` where possible (e.g., `core::mem` over `std::mem`)
- Use the `memchr` crate for fast byte searching over manual iteration
- Keep modules under 500 lines (excluding tests); split if larger
- Place `use` statements inside functions only for `#[cfg]` conditional compilation
- Document public items with `///` and add examples in docs where helpful
- Use `//!` for module-level documentation
- Use `[TypeName]` rustdoc links instead of backticks in documentation
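Two of these guidelines (preallocation and `Cow<'_, str>` for conditional ownership) can be illustrated with a minimal std-only sketch; the function below is illustrative and not from the crate:

```rust
use std::borrow::Cow;

/// Returns the input unchanged when no tabs are present (no allocation);
/// otherwise builds a preallocated replacement string.
fn expand_tabs(input: &str) -> Cow<'_, str> {
    if !input.contains('\t') {
        return Cow::Borrowed(input); // fast path: zero-copy
    }
    // Each tab becomes 4 spaces, so reserve len + 3 extra bytes per tab.
    let tabs = input.bytes().filter(|&b| b == b'\t').count();
    let mut out = String::with_capacity(input.len() + tabs * 3);
    for ch in input.chars() {
        if ch == '\t' {
            out.push_str("    ");
        } else {
            out.push(ch);
        }
    }
    Cow::Owned(out)
}

fn main() {
    assert!(matches!(expand_tabs("no tabs"), Cow::Borrowed(_)));
    assert_eq!(expand_tabs("a\tb"), "a    b");
    println!("ok");
}
```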
Files:
- src/llm-coding-tools-agents/src/extensions.rs
- src/llm-coding-tools-agents/benches/parser.rs
- src/llm-coding-tools-agents/src/lib.rs
- src/llm-coding-tools-agents/src/catalog.rs
- src/llm-coding-tools-agents/src/parser/mod.rs
- src/llm-coding-tools-agents/src/parser/preprocessor.rs
- src/llm-coding-tools-agents/src/error.rs
- src/llm-coding-tools-agents/src/loader.rs
- src/llm-coding-tools-agents/src/config.rs
src/**/Cargo.toml
📄 CodeRabbit inference engine (src/AGENTS.md)
`src/**/Cargo.toml`: Enable the `tokio` feature flag (default) for async mode with the tokio runtime, or `blocking` for sync/blocking mode. The `async` and `blocking` features are mutually exclusive.
Prefer performance-oriented crates: `parking_lot` over `std::sync`, `memchr` for byte search
Files:
- src/Cargo.toml
- src/llm-coding-tools-agents/Cargo.toml
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-core/` as framework-agnostic core library containing `src/tools/`, `src/path/`, `src/error.rs`, `src/output.rs`, and `src/util.rs`
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-serdesai/` containing `src/absolute/`, `src/allowed/`, `src/schema.rs`, and `src/convert.rs` for serdesAI framework implementations
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-core/` as framework-agnostic core library containing `src/tools/`, `src/path/`, `src/error.rs`, `src/output.rs`, and `src/util.rs`
Applied to files:
- src/llm-coding-tools-agents/README.md
- src/llm-coding-tools-agents/src/lib.rs
- src/Cargo.toml
- src/llm-coding-tools-agents/Cargo.toml
- src/llm-coding-tools-agents/src/error.rs
- .github/workflows/rust.yml
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-serdesai/` containing `src/absolute/`, `src/allowed/`, `src/schema.rs`, and `src/convert.rs` for serdesAI framework implementations
Applied to files:
- src/llm-coding-tools-agents/README.md
- src/llm-coding-tools-agents/src/lib.rs
- src/Cargo.toml
- src/llm-coding-tools-agents/Cargo.toml
- src/llm-coding-tools-agents/src/error.rs
- src/llm-coding-tools-agents/src/config.rs
- .github/workflows/rust.yml
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Document public items with `///` and add examples in docs where helpful
Applied to files:
- src/llm-coding-tools-agents/README.md
- src/llm-coding-tools-agents/src/lib.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Use `//!` for module-level documentation
Applied to files:
src/llm-coding-tools-agents/src/lib.rs
🧬 Code graph analysis (4)
src/llm-coding-tools-agents/benches/parser.rs (2)
- src/llm-coding-tools-agents/src/catalog.rs (1): `new` (24-28)
- src/llm-coding-tools-agents/src/loader.rs (1): `new` (64-66)

src/llm-coding-tools-agents/src/parser/mod.rs (1)
- src/llm-coding-tools-agents/src/parser/preprocessor.rs (1): `preprocess_frontmatter_yaml` (55-64)

src/llm-coding-tools-agents/src/error.rs (1)
- src/llm-coding-tools-agents/src/loader.rs (1): `None` (87-87)

src/llm-coding-tools-agents/src/loader.rs (4)
- src/llm-coding-tools-agents/src/parser/mod.rs (4): `parse_agent` (49-79), `parse_agent` (293-293), `parse_agent` (417-417), `parse_agent` (436-436)
- src/llm-coding-tools-agents/src/catalog.rs (3): `new` (24-28), `by_name` (45-47), `iter` (34-36)
- src/llm-coding-tools-agents/src/error.rs (3): `schema_validation` (82-87), `io` (72-74), `parse` (77-79)
- src/llm-coding-tools-agents/src/config.rs (1): `from_raw` (128-141)
🪛 markdownlint-cli2 (0.20.0)
src/llm-coding-tools-agents/benches/fixtures/orchestrator-quality-gate-gpt5.md
[warning] 95-95: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 155-155: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-and-test (macos-latest, aarch64-apple-darwin, false)
🔇 Additional comments (19)
src/Cargo.toml (1)
4-4: LGTM!

Workspace member addition is correctly placed. The ordering (core → agents → serdesai) respects the dependency graph for publishing.
.github/workflows/rust.yml (1)
68-68: LGTM!

The new crate is consistently added across all CI steps (semver, docs, clippy, publish) following the same patterns as existing crates, with correct dependency ordering.
Also applies to: 93-93, 102-102, 126-126
src/llm-coding-tools-agents/README.md (1)
1-61: LGTM!

Clear documentation covering purpose, quick start, file format, compatibility notes, and integration guidance.
src/llm-coding-tools-agents/benches/parser.rs (1)
1-42: LGTM!

Well-structured benchmark covering both LF and CRLF inputs with throughput tracking. Using `core::hint::black_box` aligns with the coding guidelines preference for `core` over `std`.

src/llm-coding-tools-agents/benches/fixtures/orchestrator-quality-gate-gpt5.md (1)
1-162: LGTM!

Good benchmark fixture with realistic agent configuration. The static analysis MD040 warnings about unlabeled fenced code blocks are false positives here — those blocks define output format templates, not code requiring syntax highlighting.

src/llm-coding-tools-agents/src/config.rs (2)
src/llm-coding-tools-agents/src/config.rs (2)
36-47: LGTM!

Clean enum with appropriate serde rename and default variant.

49-63: LGTM!

`#[serde(untagged)]` is the correct choice here for the YAML format where permission values can be either a plain action string or a nested map.

src/llm-coding-tools-agents/src/catalog.rs (1)
1-68: LGTM!

Clean catalog implementation with good encapsulation (`insert` is `pub(crate)`), proper `#[inline]` on small accessors, and clear documentation with rustdoc links.

src/llm-coding-tools-agents/src/lib.rs (1)
1-16: Clean crate root with well-scoped public API.

The module structure and re-exports are well-organized, exposing only the necessary types at the crate root while keeping internal modules private.
src/llm-coding-tools-agents/src/extensions.rs (2)
1-60: Well-structured extension trait with good documentation and pre-allocation.

The `RulesetExt` trait cleanly separates construction logic from the core `Ruleset` type. Pre-allocation at Line 43 and the doc example are both appreciated. The capacity estimate of `config.len() * 2` is a reasonable heuristic.
62-103: Good test coverage for both rule variants.

Tests verify single-action rules, pattern-map rules, and correct evaluation semantics.
src/llm-coding-tools-agents/src/parser/mod.rs (2)
48-79: Solid parse flow with in-place CRLF normalization and buffer reuse.

The two-phase YAML deserialization (first to `Value` for validation, then to `T`) is a clean approach. Buffer reuse via `extract_body_inplace` is a nice touch. As per coding guidelines, the `String` ownership pattern and in-place mutation align well with the "reuse buffers" guidance.
170-214: Unsafe block is well-justified and correctly uses `ptr::copy` (memmove semantics).

The ASCII-whitespace-trimming safety argument (Lines 184–190) is thorough — since continuation bytes (`0x80..=0xBF`) and leading bytes (`0xC2..=0xF4`) never satisfy `is_ascii_whitespace()`, the byte-level trimming cannot split a multibyte code point. The `debug_assert!` char-boundary checks at Lines 198–200 provide an extra safety net in debug builds.

src/llm-coding-tools-agents/src/parser/preprocessor.rs (2)
1-64: Excellent module documentation and zero-copy fast path.

The module-level docs clearly explain the problem, transformations, and edge cases. The `Cow` return from `preprocess_frontmatter_yaml` avoids allocation when no rewriting is needed, aligning well with the coding guidelines for conditional ownership.
80-129: Thorough guard logic in `block_scalar_parts`.

The function correctly skips comments, indented continuation lines, quoted values, flow mappings/arrays, anchors/aliases/tags, and values without colons. Good defensive layering.
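The guard layering described here can be illustrated with a simplified, std-only predicate. `value_needs_quoting` is a hypothetical reduction of `block_scalar_parts`, not the actual implementation:

```rust
/// Returns false for lines the preprocessor should leave alone:
/// comments, indented continuations, quoted values, flow collections,
/// anchors/aliases/tags, and values without a colon.
fn value_needs_quoting(line: &str) -> bool {
    if line.starts_with('#') || line.starts_with(' ') || line.starts_with('\t') {
        return false; // comment or indented continuation line
    }
    let value = match line.split_once(':') {
        Some((_, v)) => v.trim_start(),
        None => return false, // no key/value separator
    };
    match value.chars().next() {
        Some('"' | '\'') => false,      // already quoted
        Some('{' | '[') => false,       // flow mapping / array
        Some('&' | '*' | '!') => false, // anchor / alias / tag
        _ => value.contains(':'),       // ambiguous only when a colon remains
    }
}

fn main() {
    assert!(value_needs_quoting("model: openai:gpt-4"));
    assert!(!value_needs_quoting("# comment: with colon"));
    assert!(!value_needs_quoting("model: \"openai:gpt-4\""));
    assert!(!value_needs_quoting("tags: [a, b]"));
    println!("ok");
}
```

Each early `false` arm corresponds to one of the guards the review praises; only the fall-through case triggers a rewrite.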
src/llm-coding-tools-agents/src/error.rs (1)
7-91: Clean error hierarchy with optional path context.

The design of `Option<PathBuf>` for both file-based and in-memory sources is practical. Constructor methods keep error creation ergonomic. The `Error::source()` chain is correctly wired for `Io` and `Parse` variants.

src/llm-coding-tools-agents/src/loader.rs (3)
1-60: Well-documented stateless loader with comprehensive examples.

The module docs, struct docs, and method docs are all thorough with `no_run` examples. The unit struct with a `Default` derive is idiomatic for a stateless service.
252-312: Directory scanning logic is clean and platform-aware.

Good use of `ignore::WalkBuilder` for `.gitignore` respect and symlink handling. The `#[cfg(windows)]`/`#[cfg(not(windows))]` path normalization at Lines 294–297 correctly unifies separators. Silently skipping unreadable entries (Line 279) is appropriate for directory-scan semantics.
361-392: `matches_agent_pattern` and `derive_agent_name_from_rel` are clear and well-tested.

Simple prefix/suffix string checks are efficient and correct for the expected path format. The `None` return for empty names (Lines 387–388) prevents inserting nameless agents.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/llm-coding-tools-agents/Cargo.toml`:
- Line 14: The Cargo.toml currently depends on the archived serde_yaml; replace
that dependency entry with an actively maintained fork (either serde_yaml_ng or
the YAML Company fork published as serde-yaml) by changing the dependency name
and version in Cargo.toml (replace the serde_yaml = "0.9" line), then run cargo
update and cargo tree / cargo audit to ensure no vulnerable versions remain;
update any use statements or feature flags in code if the crate name differs
from serde_yaml to serde_yaml_ng or serde_yaml so functions like
serde_yaml::from_str calls still resolve.
Force-pushed 8fb0e0b to 9c5773f
Run semver checks, documentation builds, clippy, and publish-path wiring for the new agents crate. This keeps CI coverage aligned with the expanded workspace so regressions are caught before release.
Force-pushed 9c5773f to 1c1c20c
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/rust.yml (1)
155-167: ⚠️ Potential issue | 🟡 Minor

Crates are in correct publish order, but consider using automatic dependency resolution.
Verified: `llm-coding-tools-agents` and `llm-coding-tools-serdesai` both have path dependencies on `llm-coding-tools-core`. The workflow lists crates in the correct topological order (core → agents → serdesai), and `devops-publish-action@v3` publishes in the exact sequence provided. However, this approach relies on manual ordering maintenance. Consider switching to an action that automatically resolves and publishes in dependency order (e.g., `katyo/publish-crates`), which is more robust and prevents ordering mistakes as dependencies change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/rust.yml around lines 155 - 167, Replace the manual-ordered publish step that uses Reloaded-Project/devops-publish-action@v3 and the explicit rust-cargo-project-paths list (entries: src/llm-coding-tools-core, src/llm-coding-tools-agents, src/llm-coding-tools-serdesai) with an action that auto-resolves crate dependency order (for example katyo/publish-crates) so publishing is performed in topological order automatically; update the step that currently names "Publish Rust Crate and Artifacts" to use the automatic-publish action and pass the appropriate token (CRATES_IO_TOKEN) per that action's inputs instead of hardcoding the ordered paths.
🧹 Nitpick comments (7)
src/llm-coding-tools-agents/benches/parser.rs (1)
25-33: Benchmark includes allocator noise from per-iteration construction.
`AgentLoader::new()` and `AgentCatalog::new()` (which allocates a `HashMap`) are created inside the hot loop. If the intent is to measure pure parse throughput, consider constructing the catalog once and calling `.clear()` (or equivalent) between iterations. If the intent is to measure the full load-from-scratch workflow, this is fine as-is—just worth being explicit about what's being measured.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/benches/parser.rs` around lines 25 - 33, The benchmark currently constructs AgentLoader::new() and AgentCatalog::new() inside the hot loop, adding allocator noise from per-iteration HashMap allocation; to measure pure parse throughput move the creation of AgentLoader and AgentCatalog outside the b.iter closure and call catalog.clear() (or an equivalent reset) inside the loop before loader.add_from_str(&mut catalog, ...) so only parsing is measured, or if you intend to measure full cold-load costs leave construction inside but make the intent explicit in a comment and benchmark name; reference AgentLoader::new, AgentCatalog::new, catalog.clear(), and loader.add_from_str when making the change.

.github/workflows/rust.yml (1)
92-96: Formatter check runs in both async and blocking jobs.

The `rustfmt` step is duplicated across both jobs, meaning it runs 6 times total (3 OS × 2 jobs) on the same workspace. Since formatting is workspace-wide and feature-independent, it could be extracted into a lightweight standalone job to save CI minutes. Not blocking.

Also applies to: 145-149
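A standalone job along these lines might look like the sketch below. The job name, trigger condition, and checkout step are assumptions; the `actions-rust-lang/rustfmt@v1` action and `manifest-path: src/Cargo.toml` come from the existing workflow:

```yaml
# Sketch only: run the workspace-wide formatter check once instead of
# duplicating it across the async and blocking matrix jobs.
jobs:
  format-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run formatter check
        uses: actions-rust-lang/rustfmt@v1
        with:
          manifest-path: src/Cargo.toml
```

Because formatting is feature-independent, a single Ubuntu runner suffices; the matrix jobs then drop their duplicated step.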
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/rust.yml around lines 92 - 96, The rustfmt step ("Run formatter check" using actions-rust-lang/rustfmt@v1 with manifest-path: src/Cargo.toml) is duplicated in both the async and blocking matrix jobs; extract it into a single lightweight standalone job (e.g., a new job named "rustfmt" or "format-check") that runs once for PRs/tags (same if condition) and remove the duplicated "Run formatter check" step from the async and blocking jobs so formatting is only run once for the workspace. Ensure the new job uses the same action and manifest-path and preserves any needed environment or checkout steps so it can run independently.

src/llm-coding-tools-agents/Cargo.toml (1)
28-28: Inconsistent version pinning for `ignore`.

All other deps use relaxed semver ranges (e.g., `"1.0"`, `"2"`, `"0.1"`), but `ignore` pins to `"0.4.25"`. Unless a specific patch is required, prefer `"0.4"` for consistency.

Suggested fix

```diff
-ignore = "0.4.25"
+ignore = "0.4"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/Cargo.toml` at line 28, The dependency entry for ignore is overly pinned to "0.4.25"; update the version string in the Cargo.toml dependency for ignore to the relaxed semver "0.4" so it matches the project's other relaxed ranges and allows patch updates (i.e., replace the exact "0.4.25" specifier for the ignore dependency with "0.4").src/llm-coding-tools-agents/src/parser/preprocessor.rs (1)
55-64: Add `#[inline]` to `preprocess_frontmatter_yaml`.

The function body is trivially small (guard + single call). `is_valid_key` and `block_scalar_parts` are already annotated; the public entry point should be too.

♻️ Proposed fix

```diff
 /// Rewrites ambiguous frontmatter values so YAML parsing stays unambiguous.
+#[inline]
 pub(super) fn preprocess_frontmatter_yaml(input: &str) -> Cow<'_, str> {
```

As per coding guidelines: "Use
`#[inline]` on small, hot-path functions."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/parser/preprocessor.rs` around lines 55 - 64, Add the #[inline] attribute to the preprocess_frontmatter_yaml function by placing #[inline] immediately above its pub(super) fn preprocess_frontmatter_yaml(...) signature; keep the existing body (the empty check and match on convert_block_scalars) unchanged so the function is identical except for the added attribute to match other small helpers like is_valid_key and block_scalar_parts.src/llm-coding-tools-agents/src/loader.rs (3)
144-148: Redundant `.to_path_buf()` calls on already-`PathBuf` values at early-return sites.

In both `add_file` (Line 146) and `add_file_named` (Line 174), `path` has already been converted to `PathBuf` via `.into()` on Lines 139 and 170 respectively. At the early-return branches the value is never used afterwards, so `path` can be moved directly.

♻️ Proposed fix

`add_file` (Line 146):

```diff
-    Some(path.to_path_buf()),
+    Some(path),
```

`add_file_named` (Line 174):

```diff
-    Some(path.to_path_buf()),
+    Some(path),
```

Also applies to: 172-177
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/loader.rs` around lines 144 - 148, The early-return branches in add_file and add_file_named call .to_path_buf() on path even though path is already a PathBuf (created via .into()), which is redundant and prevents moving the value; update the Err(...) returns that call AgentLoadError::schema_validation to pass Some(path) (moving path) instead of Some(path.to_path_buf()), ensuring you consume the existing PathBuf rather than cloning it. Use the same change for the other early-return sites mentioned so path is moved directly into AgentLoadError::schema_validation.
276-280: Walker errors are silently discarded and never reach the `on_error` callback.

`Err(_) => continue` swallows `ignore::Error` walk entries (e.g., permission-denied on a subdirectory). These are distinct from file-load errors and are not forwarded to the `on_error` argument in `add_directory_with_errors`. A caller relying on `on_error` to know about all failures will miss them. At minimum, document this gap explicitly. Optionally, convert walker errors to `AgentLoadError::Io` and invoke `on_error`:

♻️ Proposed change

```diff
 let entry = match entry_result {
     Ok(e) => e,
-    Err(_) => continue, // skip unreadable entries
+    Err(e) => {
+        // Walk-level error (e.g., permission denied on a directory entry).
+        // Not forwarded to on_error; silently skipped.
+        let _ = e;
+        continue;
+    }
 };
```

Or, if surfacing them is desired:

```rust
Err(e) => {
    if let Some(ref mut handler) = on_error {
        let io_err = e.into_io_error().unwrap_or_else(|| {
            std::io::Error::new(std::io::ErrorKind::Other, "walk error")
        });
        handler(entry.path(), &AgentLoadError::io(None, io_err));
    }
    continue;
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/loader.rs` around lines 276 - 280, The loop over walker currently swallows walker errors (entry_result Err) with "Err(_) => continue", so surface those failures through the existing on_error path in add_directory_with_errors: when entry_result is Err(e) convert it into an AgentLoadError::Io (or another appropriate AgentLoadError variant), call the provided on_error callback with that error and the path context, and then continue; update the match handling around walker/entry_result (and any surrounding logging) to ensure walker errors are not silently ignored but are forwarded to on_error.
368-392: `derive_agent_name_from_rel` should return `Option<&str>` to avoid a per-file allocation.

All intermediate values (`without_prefix`, the `strip_suffix` result) are `&str` slices of the `rel_path` input. The function can safely return `Option<&str>` with a lifetime tied to the input, eliminating the `String` allocation on every file match during directory scanning.

As per coding guidelines: "Prefer `&str`/`&[T]` returns over owned types when lifetime allows."

♻️ Proposed refactor

```diff
-fn derive_agent_name_from_rel(rel_path: &str) -> Option<String> {
+fn derive_agent_name_from_rel(rel_path: &str) -> Option<&str> {
     let without_prefix = rel_path
         .strip_prefix("agent/")
         .or_else(|| rel_path.strip_prefix("agents/"))
         .unwrap_or(rel_path);
     let name = without_prefix
         .strip_suffix(".md")
-        .unwrap_or(without_prefix)
-        .to_string();
+        .unwrap_or(without_prefix);
     if name.is_empty() { None } else { Some(name) }
 }
```

Call site update (Line 303–308):

```diff
-    on_match(path, name.as_str())?;
+    on_match(path, name)?;
```

Test assertions:

```diff
-    Some("test".to_string())
+    Some("test")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/loader.rs` around lines 368 - 392, The function derive_agent_name_from_rel currently returns Option<String> and allocates per call; change its signature to fn derive_agent_name_from_rel<'a>(rel_path: &'a str) -> Option<&'a str> and return borrowed slices instead of owned Strings by using the existing without_prefix and strip_suffix results (e.g., let name = without_prefix.strip_suffix(".md").unwrap_or(without_prefix);) and return Some(name) or None accordingly; update any call sites that expect String to accept &str or call .to_string() only where ownership is actually required.
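The borrowed-slice behavior the refactor relies on is easy to confirm in isolation: `strip_prefix` and `strip_suffix` hand back subslices of the input, so no allocation is needed. The `derive_name` helper below mirrors the suggested signature but is a hypothetical reconstruction, not the crate's actual function:

```rust
// Hypothetical mirror of the suggested Option<&str> signature.
fn derive_name(rel_path: &str) -> Option<&str> {
    let without_prefix = rel_path
        .strip_prefix("agent/")
        .or_else(|| rel_path.strip_prefix("agents/"))
        .unwrap_or(rel_path);
    // strip_suffix returns a subslice of rel_path; no String is built.
    let name = without_prefix.strip_suffix(".md").unwrap_or(without_prefix);
    if name.is_empty() { None } else { Some(name) }
}

fn main() {
    assert_eq!(derive_name("agents/test.md"), Some("test"));
    assert_eq!(derive_name("agent/review.md"), Some("review"));
    assert_eq!(derive_name("plain.md"), Some("plain"));
    assert_eq!(derive_name("agents/"), None);
}
```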
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`src/Cargo.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (14)
- .github/workflows/rust.yml
- src/Cargo.toml
- src/llm-coding-tools-agents/Cargo.toml
- src/llm-coding-tools-agents/README.md
- src/llm-coding-tools-agents/benches/fixtures/orchestrator-quality-gate-gpt5.md
- src/llm-coding-tools-agents/benches/parser.rs
- src/llm-coding-tools-agents/src/catalog.rs
- src/llm-coding-tools-agents/src/config.rs
- src/llm-coding-tools-agents/src/error.rs
- src/llm-coding-tools-agents/src/extensions.rs
- src/llm-coding-tools-agents/src/lib.rs
- src/llm-coding-tools-agents/src/loader.rs
- src/llm-coding-tools-agents/src/parser/mod.rs
- src/llm-coding-tools-agents/src/parser/preprocessor.rs
🚧 Files skipped from review as they are similar to previous changes (6)
- src/llm-coding-tools-agents/src/lib.rs
- src/llm-coding-tools-agents/src/error.rs
- src/Cargo.toml
- src/llm-coding-tools-agents/src/config.rs
- src/llm-coding-tools-agents/README.md
- src/llm-coding-tools-agents/src/parser/mod.rs
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.rs
📄 CodeRabbit inference engine (src/AGENTS.md)
src/**/*.rs:

- Preallocate collections when size is known or estimable using `String::with_capacity(estimated_len)`, `Vec::with_capacity(count)`, or `BufReader::with_capacity(size, reader)`
- Prefer `&str`/`&[T]` returns over owned types when lifetime allows
- Use `Cow<'_, str>` for conditional ownership (e.g., `String::from_utf8_lossy`)
- Use `&'static str` for compile-time constant strings
- Reuse buffers by calling `.clear()` and reusing `Vec`/`String` instead of reallocating
- Use const generics for compile-time branching (e.g., `<const LINE_NUMBERS: bool>`)
- Use `#[inline]` on small, hot-path functions
- Prefer `core` over `std` where possible (e.g., `core::mem` over `std::mem`)
- Use `memchr` crate for fast byte searching over manual iteration
- Keep modules under 500 lines (excluding tests); split if larger
- Place `use` statements inside functions only for `#[cfg]` conditional compilation
- Document public items with `///` and add examples in docs where helpful
- Use `//!` for module-level documentation
- Use [TypeName] rustdoc links instead of backticks in documentation
Files:
- src/llm-coding-tools-agents/src/extensions.rs
- src/llm-coding-tools-agents/benches/parser.rs
- src/llm-coding-tools-agents/src/parser/preprocessor.rs
- src/llm-coding-tools-agents/src/catalog.rs
- src/llm-coding-tools-agents/src/loader.rs
src/**/Cargo.toml
📄 CodeRabbit inference engine (src/AGENTS.md)
src/**/Cargo.toml:

- Enable `tokio` feature flag (default) for async mode with tokio runtime, or `blocking` for sync/blocking mode. The `async` and `blocking` features are mutually exclusive.
- Prefer performance-oriented crates: `parking_lot` over `std::sync`, `memchr` for byte search
Files:
src/llm-coding-tools-agents/Cargo.toml
🧠 Learnings (7)
📓 Common learnings
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-core/` as framework-agnostic core library containing `src/tools/`, `src/path/`, `src/error.rs`, `src/output.rs`, and `src/util.rs`
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-serdesai/` containing `src/absolute/`, `src/allowed/`, `src/schema.rs`, and `src/convert.rs` for serdesAI framework implementations
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Use `//!` for module-level documentation
Applied to files:
src/llm-coding-tools-agents/src/parser/preprocessor.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Document public items with `///` and add examples in docs where helpful
Applied to files:
src/llm-coding-tools-agents/src/parser/preprocessor.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/Cargo.toml : Enable `tokio` feature flag (default) for async mode with tokio runtime, or `blocking` for sync/blocking mode. The `async` and `blocking` features are mutually exclusive.
Applied to files:
.github/workflows/rust.yml
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-serdesai/` containing `src/absolute/`, `src/allowed/`, `src/schema.rs`, and `src/convert.rs` for serdesAI framework implementations
Applied to files:
- .github/workflows/rust.yml
- src/llm-coding-tools-agents/Cargo.toml
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-core/` as framework-agnostic core library containing `src/tools/`, `src/path/`, `src/error.rs`, `src/output.rs`, and `src/util.rs`
Applied to files:
src/llm-coding-tools-agents/Cargo.toml
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/Cargo.toml : Prefer performance-oriented crates: `parking_lot` over `std::sync`, `memchr` for byte search
Applied to files:
src/llm-coding-tools-agents/Cargo.toml
🧬 Code graph analysis (4)
src/llm-coding-tools-agents/src/extensions.rs (1)
src/llm-coding-tools-core/src/permissions.rs (5)
`with_capacity` (143-147), `pattern` (112-114), `len` (157-159), `is_allowed` (211-213), `evaluate` (185-204)
src/llm-coding-tools-agents/benches/parser.rs (2)
src/llm-coding-tools-agents/src/catalog.rs (1)
`new` (24-28)
src/llm-coding-tools-agents/src/loader.rs (1)
new(64-66)
src/llm-coding-tools-agents/src/catalog.rs (1)
src/llm-coding-tools-agents/src/loader.rs (2)
`new` (64-66), `None` (87-87)
src/llm-coding-tools-agents/src/loader.rs (4)
src/llm-coding-tools-agents/src/parser/mod.rs (4)
`parse_agent` (49-79), `parse_agent` (293-293), `parse_agent` (417-417), `parse_agent` (436-436)
src/llm-coding-tools-agents/src/catalog.rs (3)
`new` (24-28), `by_name` (45-47), `iter` (34-36)
src/llm-coding-tools-agents/src/error.rs (3)
`schema_validation` (82-87), `io` (72-74), `parse` (77-79)
src/llm-coding-tools-agents/src/config.rs (1)
from_raw(128-141)
🪛 markdownlint-cli2 (0.21.0)
src/llm-coding-tools-agents/benches/fixtures/orchestrator-quality-gate-gpt5.md
[warning] 95-95: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 155-155: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build and Test (Blocking) (ubuntu-latest, x86_64-unknown-linux-gnu, false)
- GitHub Check: Build and Test (Blocking) (windows-latest, x86_64-pc-windows-msvc, false)
- GitHub Check: Build and Test (Blocking) (macos-latest, aarch64-apple-darwin, false)
- GitHub Check: Build and Test (Async/Tokio) (ubuntu-latest, x86_64-unknown-linux-gnu, false)
- GitHub Check: Build and Test (Async/Tokio) (windows-latest, x86_64-pc-windows-msvc, false)
🔇 Additional comments (8)
.github/workflows/rust.yml (2)
13-14: Clean separation of async/blocking test jobs.

The rename to `build-and-test-async` and the new `build-and-test-blocking` job properly isolate the mutually exclusive feature flags. The agents crate is correctly excluded from the blocking job since it doesn't have the `blocking` feature.

Also applies to: 33-33
55-70: Semver-checks loop correctly handles the agents crate.

The `else` branch (no `--only-explicit-features`) is the right path for `llm-coding-tools-agents` since it doesn't have the mutually exclusive `tokio`/`blocking` features that `llm-coding-tools-core` has.

src/llm-coding-tools-agents/benches/parser.rs (1)
3-3: Good use of `core::hint::black_box` over `std`.

Aligns with the guideline to prefer `core` over `std` where possible.

src/llm-coding-tools-agents/benches/fixtures/orchestrator-quality-gate-gpt5.md (2)
1-16: Well-structured benchmark fixture with realistic frontmatter.

The YAML frontmatter exercises multiple field types (strings, booleans, nested permission maps) which gives good coverage for the parser benchmark. The `reasoningEffort` field with its "not yet supported" comment is a nice touch for forward-compatibility testing.
95-95: Static analysis MD040 warnings are false positives for this fixture.

The bare fenced code blocks (without a language specifier) at lines 95 and 155 are part of the agent prompt template content—they define the expected output format. Linting this fixture as documentation is inappropriate; these warnings can be safely ignored or suppressed via a `.markdownlint.yml` config scoped to `benches/fixtures/`.

Also applies to: 155-155
src/llm-coding-tools-agents/Cargo.toml (1)
1-40: Consider adding `memchr` to optimize byte-level delimiter searching in the parser.

The `find_frontmatter_offsets` function searches for `\n---` delimiters and newline boundaries (mod.rs:141, 147). The `memchr` crate would be more efficient for these byte pattern searches compared to `String::find()`. Per coding guidelines, `memchr` is preferred for byte search operations.

src/llm-coding-tools-agents/src/extensions.rs (1)
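For reference, the delimiter scan in question can be expressed over bytes with std alone; `memchr::memmem::find` would perform the same substring search with SIMD acceleration. The `find_delimiter` helper below is a sketch, not the crate's actual call site:

```rust
// Find the byte offset of the first `\n---` delimiter, std-only.
// The memchr crate's memmem::find does this same search much faster.
fn find_delimiter(haystack: &[u8]) -> Option<usize> {
    haystack.windows(4).position(|w| w == b"\n---")
}

fn main() {
    let doc = b"---\nname: demo\n---\nbody";
    // The closing delimiter starts at the newline after "demo".
    assert_eq!(find_delimiter(doc), Some(14));
    assert_eq!(find_delimiter(b"no frontmatter here"), None);
}
```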
1-103: LGTM.

The `RulesetExt` implementation is correct: capacity pre-allocation, insertion order preserved via `IndexMap`, `*action` deref works since `PermissionAction` is `Copy`, and the last-match-wins semantics in the `Pattern`-variant test (`"*"` → `Deny` inserted before `"orchestrator-*"` → `Allow`) align exactly with `Ruleset::evaluate`.

src/llm-coding-tools-agents/src/catalog.rs (1)
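The last-match-wins evaluation being verified can be sketched as follows; the rule shape and `evaluate` helper here are assumptions for illustration, not the actual `Ruleset` API:

```rust
// Ordered (pattern, allow) rules; the LAST matching rule decides.
fn evaluate(rules: &[(&str, bool)], name: &str) -> Option<bool> {
    let mut decision = None;
    for (pattern, allow) in rules {
        let matches = match pattern.strip_suffix('*') {
            Some(prefix) => name.starts_with(prefix), // "orchestrator-*", "*"
            None => *pattern == name,                 // exact match
        };
        if matches {
            decision = Some(*allow); // later rules override earlier ones
        }
    }
    decision
}

fn main() {
    // "*" -> Deny inserted before "orchestrator-*" -> Allow:
    let rules = [("*", false), ("orchestrator-*", true)];
    assert_eq!(evaluate(&rules, "orchestrator-quality-gate"), Some(true));
    assert_eq!(evaluate(&rules, "other-agent"), Some(false));
}
```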
1-109: LGTM.

`#[inline]` is correctly applied on the trivial read methods (`new`, `iter`, `by_name`). The `collect::<HashMap<_, _>>()` path in `from_entries` leverages `size_hint()` for pre-allocation, and the test correctly avoids order-dependent assertions against the non-deterministic HashMap iteration.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/llm-coding-tools-agents/src/parser/preprocessor.rs`:
- Around line 96-128: The code currently returns (key, value) with value still
containing any inline comment suffix (e.g., "foo # bar"), which corrupts block
scalars; update the logic after computing value to strip an inline comment: if
value contains '#', split on the first '#' into (val_part, comment_part), trim
val_part and use it as the returned value, but if comment_part contains a ':'
then treat the line as ambiguous and return None (to avoid false positives);
ensure you still respect the existing guards (use the same variables/key checks
like key, value, first_value, and is_valid_key) and return None if the stripped
val_part becomes empty.
- Around line 131-166: convert_block_scalars currently uses split_terminator
over input[first.rest_start..], which drops a trailing '\n' so the Cow::Owned
rewrite path can lose a final newline; update convert_block_scalars to detect if
the original input ends_with('\n') (or equivalently check input.len() > 0 &&
input.as_bytes().last() == Some(&b'\n')) and, just before returning Some(out),
append a trailing '\n' to out when the original input had one so the rewritten
(Owned) result preserves the original trailing newline parity; keep the rest of
the logic (first, block_scalar_parts) unchanged.
---
Outside diff comments:
In @.github/workflows/rust.yml:
- Around line 155-167: Replace the manual-ordered publish step that uses
Reloaded-Project/devops-publish-action@v3 and the explicit
rust-cargo-project-paths list (entries: src/llm-coding-tools-core,
src/llm-coding-tools-agents, src/llm-coding-tools-serdesai) with an action that
auto-resolves crate dependency order (for example katyo/publish-crates) so
publishing is performed in topological order automatically; update the step that
currently names "Publish Rust Crate and Artifacts" to use the automatic-publish
action and pass the appropriate token (CRATES_IO_TOKEN) per that action's inputs
instead of hardcoding the ordered paths.
---
Duplicate comments:
In `@src/llm-coding-tools-agents/Cargo.toml`:
- Line 14: Replace the deprecated serde_yaml dependency with the actively
maintained fork serde_yaml_ng by changing the Cargo.toml entry (replace the
dependency key serde_yaml with serde_yaml_ng and set the appropriate version
spec, e.g. "0.9" or the latest compatible release), update any Rust use/import
paths (replace references like serde_yaml::... with serde_yaml_ng::...), run
cargo update to refresh Cargo.lock, and verify compilation/tests to catch any
API incompatibilities in functions that previously referenced serde_yaml (search
for serde_yaml in the repo to locate all call sites).
---
Nitpick comments:
In @.github/workflows/rust.yml:
- Around line 92-96: The rustfmt step ("Run formatter check" using
actions-rust-lang/rustfmt@v1 with manifest-path: src/Cargo.toml) is duplicated
in both the async and blocking matrix jobs; extract it into a single lightweight
standalone job (e.g., a new job named "rustfmt" or "format-check") that runs
once for PRs/tags (same if condition) and remove the duplicated "Run formatter
check" step from the async and blocking jobs so formatting is only run once for
the workspace. Ensure the new job uses the same action and manifest-path and
preserves any needed environment or checkout steps so it can run independently.
In `@src/llm-coding-tools-agents/benches/parser.rs`:
- Around line 25-33: The benchmark currently constructs AgentLoader::new() and
AgentCatalog::new() inside the hot loop, adding allocator noise from
per-iteration HashMap allocation; to measure pure parse throughput move the
creation of AgentLoader and AgentCatalog outside the b.iter closure and call
catalog.clear() (or an equivalent reset) inside the loop before
loader.add_from_str(&mut catalog, ...) so only parsing is measured, or if you
intend to measure full cold-load costs leave construction inside but make the
intent explicit in a comment and benchmark name; reference AgentLoader::new,
AgentCatalog::new, catalog.clear(), and loader.add_from_str when making the
change.
In `@src/llm-coding-tools-agents/Cargo.toml`:
- Line 28: The dependency entry for ignore is overly pinned to "0.4.25"; update
the version string in the Cargo.toml dependency for ignore to the relaxed semver
"0.4" so it matches the project's other relaxed ranges and allows patch updates
(i.e., replace the exact "0.4.25" specifier for the ignore dependency with
"0.4").
In `@src/llm-coding-tools-agents/src/loader.rs`:
- Around line 144-148: The early-return branches in add_file and add_file_named
call .to_path_buf() on path even though path is already a PathBuf (created via
.into()), which is redundant and prevents moving the value; update the Err(...)
returns that call AgentLoadError::schema_validation to pass Some(path) (moving
path) instead of Some(path.to_path_buf()), ensuring you consume the existing
PathBuf rather than cloning it. Use the same change for the other early-return
sites mentioned so path is moved directly into
AgentLoadError::schema_validation.
- Around line 276-280: The loop over walker currently swallows walker errors
(entry_result Err) with "Err(_) => continue", so surface those failures through
the existing on_error path in add_directory_with_errors: when entry_result is
Err(e) convert it into an AgentLoadError::Io (or another appropriate
AgentLoadError variant), call the provided on_error callback with that error and
the path context, and then continue; update the match handling around
walker/entry_result (and any surrounding logging) to ensure walker errors are
not silently ignored but are forwarded to on_error.
- Around line 368-392: The function derive_agent_name_from_rel currently returns
Option<String> and allocates per call; change its signature to fn
derive_agent_name_from_rel<'a>(rel_path: &'a str) -> Option<&'a str> and return
borrowed slices instead of owned Strings by using the existing without_prefix
and strip_suffix results (e.g., let name =
without_prefix.strip_suffix(".md").unwrap_or(without_prefix);) and return
Some(name) or None accordingly; update any call sites that expect String to
accept &str or call .to_string() only where ownership is actually required.
In `@src/llm-coding-tools-agents/src/parser/preprocessor.rs`:
- Around line 55-64: Add the #[inline] attribute to the
preprocess_frontmatter_yaml function by placing #[inline] immediately above its
pub(super) fn preprocess_frontmatter_yaml(...) signature; keep the existing body
(the empty check and match on convert_block_scalars) unchanged so the function
is identical except for the added attribute to match other small helpers like
is_valid_key and block_scalar_parts.
```rust
/// Rewrites matching lines and returns `None` when no rewrite is needed.
fn convert_block_scalars(input: &str) -> Option<String> {
    let first = match find_first_block_scalar(input) {
        Some(first) => first,
        _ => return None,
    };

    // Second pass: copy prefix once, then rewrite from the first changed line onward.
    // Typical rewrite is:
    // `model: synthetic/hf:moonshotai/Kimi-K2.5`
    // -> `model: |-\n synthetic/hf:moonshotai/Kimi-K2.5`
    // Another multi-colon example:
    // `api_url: http://localhost:8080`
    // -> `api_url: |-\n http://localhost:8080`
    // This also represents a change of +5 characters.
    let mut out = String::with_capacity(input.len() + 5);
    if first.line_start > 0 {
        out.push_str(&input[..first.line_start]);
    }
    out.push_str(first.key);
    out.push_str(": |-\n ");
    out.push_str(first.value);

    for line in input[first.rest_start..].split_terminator('\n') {
        out.push('\n');
        if let Some((key, value)) = block_scalar_parts(line) {
            out.push_str(key);
            out.push_str(": |-\n ");
            out.push_str(value);
        } else {
            out.push_str(line);
        }
    }

    Some(out)
}
```
`Cow::Owned` path silently drops a trailing newline present in the input.

`split_terminator` is "equivalent to `split`, except that the trailing substring is skipped if empty" — so `"foo\n".split_terminator('\n')` yields only `["foo"]`. When `convert_block_scalars` iterates `input[rest_start..]` with `split_terminator`, a trailing `\n` in the original input is never emitted into `out`.

This creates an inconsistency: the `Cow::Borrowed` path returns the input byte-for-byte (trailing newline preserved), while the `Cow::Owned` path strips it. Downstream code that expects the two branches to preserve the trailing newline parity will see a difference.
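The skipped-trailing-substring behavior is easy to confirm in isolation:

```rust
fn main() {
    // `split` keeps the empty trailing piece; `split_terminator` drops it.
    let with_split: Vec<&str> = "foo\n".split('\n').collect();
    let with_term: Vec<&str> = "foo\n".split_terminator('\n').collect();
    assert_eq!(with_split, vec!["foo", ""]);
    assert_eq!(with_term, vec!["foo"]);

    // Interior separators behave identically in both.
    let interior: Vec<&str> = "a\nb".split_terminator('\n').collect();
    assert_eq!(interior, vec!["a", "b"]);
}
```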
For the current YAML-parser consumer this is benign, but it should be documented or fixed for correctness:
🔧 Proposed fix (append trailing newline when original had one)
```diff
     Some(out)
 }
+
+fn append_trailing_newline_if_needed(out: &mut String, input: &str) {
+    if input.ends_with('\n') && !out.ends_with('\n') {
+        out.push('\n');
+    }
+}
```

Then call it at the end of `convert_block_scalars` before `Some(out)`:

```diff
+    append_trailing_newline_if_needed(&mut out, input);
     Some(out)
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
/// Rewrites matching lines and returns `None` when no rewrite is needed.
fn convert_block_scalars(input: &str) -> Option<String> {
    let first = match find_first_block_scalar(input) {
        Some(first) => first,
        _ => return None,
    };
    // Second pass: copy prefix once, then rewrite from the first changed line onward.
    // Typical rewrite is:
    // `model: synthetic/hf:moonshotai/Kimi-K2.5`
    // -> `model: |-\n synthetic/hf:moonshotai/Kimi-K2.5`
    // Another multi-colon example:
    // `api_url: http://localhost:8080`
    // -> `api_url: |-\n http://localhost:8080`
    // This also represents a change of +5 characters.
    let mut out = String::with_capacity(input.len() + 5);
    if first.line_start > 0 {
        out.push_str(&input[..first.line_start]);
    }
    out.push_str(first.key);
    out.push_str(": |-\n ");
    out.push_str(first.value);
    for line in input[first.rest_start..].split_terminator('\n') {
        out.push('\n');
        if let Some((key, value)) = block_scalar_parts(line) {
            out.push_str(key);
            out.push_str(": |-\n ");
            out.push_str(value);
        } else {
            out.push_str(line);
        }
    }
    append_trailing_newline_if_needed(&mut out, input);
    Some(out)
}

fn append_trailing_newline_if_needed(out: &mut String, input: &str) {
    if input.ends_with('\n') && !out.ends_with('\n') {
        out.push('\n');
    }
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/llm-coding-tools-agents/src/parser/preprocessor.rs` around lines 131 -
166, convert_block_scalars currently uses split_terminator over
input[first.rest_start..], which drops a trailing '\n' so the Cow::Owned rewrite
path can lose a final newline; update convert_block_scalars to detect if the
original input ends_with('\n') (or equivalently check input.len() > 0 &&
input.as_bytes().last() == Some(&b'\n')) and, just before returning Some(out),
append a trailing '\n' to out when the original input had one so the rewritten
(Owned) result preserves the original trailing newline parity; keep the rest of
the logic (first, block_scalar_parts) unchanged.
Reorder functions to match their execution flow, making the code readable from top-to-bottom as a human would trace through the logic.

Changes:
- Move convert_block_scalars after preprocess_frontmatter_yaml (calls it)
- Move find_first_block_scalar with its FirstBlockScalar struct next
- Move block_scalar_parts before its callers
- Move is_valid_key to bottom as leaf function

Benefits:
- Improves code readability by following call order
- Makes function dependencies clear at a glance
- Easier to understand control flow
Rename function to better convey its conditional extraction behavior.

Changes:
- block_scalar_parts → extract_if_needs_block_scalar
- Updated doc comment to clarify "needs transformation" intent
- Updated all call sites (2 occurrences)

Benefits:
- Name clearly indicates conditional behavior (returns Option)
- Explicit about "extraction" purpose
- Makes code intent self-documenting
🧹 Nitpick comments (1)
src/llm-coding-tools-agents/src/parser/preprocessor.rs (1)
81-81: Consider a better capacity heuristic for documents with many rewriteable lines.
`input.len() + 5` only accounts for one rewrite. Since each rewrite of `key: value:...` → `key: |-\n  value:...` adds exactly 5 bytes, a document with N convertible lines grows by `N * 5`. For typical short frontmatter this is negligible, but as a quick improvement you can use a slightly larger constant (e.g., `+ 32`) or pre-count occurrences of `:` in lines at the cost of a single pass.

♻️ Proposed heuristic improvement

```diff
-    let mut out = String::with_capacity(input.len() + 5);
+    // +5 per potential rewrite; rough upper bound without a full pre-scan.
+    let approx_rewrites = input.lines().count().min(16);
+    let mut out = String::with_capacity(input.len() + approx_rewrites * 5);
```
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/parser/preprocessor.rs` at line 81, The initial capacity for `out` (currently String::with_capacity(input.len() + 5)) underestimates growth when many lines are rewritten; update the heuristic by either using a larger constant (e.g., String::with_capacity(input.len() + 32)) or, for a better fit, make a single cheap pass over `input` to count potential rewriteable `:` occurrences (e.g., count bytes matching `':'` or `": "` per line) and allocate with input.len() + count * 5 before building `out`; adjust the allocation where `let mut out = String::with_capacity(...)` is defined in preprocessor.rs.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/llm-coding-tools-agents/src/parser/preprocessor.rs
- src/llm-coding-tools-core/src/permissions.rs
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.rs
📄 CodeRabbit inference engine (src/AGENTS.md)
src/**/*.rs: Preallocate collections when size is known or estimable using `String::with_capacity(estimated_len)`, `Vec::with_capacity(count)`, or `BufReader::with_capacity(size, reader)`
Prefer `&str`/`&[T]` returns over owned types when lifetime allows
Use `Cow<'_, str>` for conditional ownership (e.g., `String::from_utf8_lossy`)
Use `&'static str` for compile-time constant strings
Reuse buffers by calling `.clear()` and reusing `Vec`/`String` instead of reallocating
Use const generics for compile-time branching (e.g., `<const LINE_NUMBERS: bool>`)
Use `#[inline]` on small, hot-path functions
Prefer `core` over `std` where possible (e.g., `core::mem` over `std::mem`)
Use `memchr` crate for fast byte searching over manual iteration
Keep modules under 500 lines (excluding tests); split if larger
Place `use` statements inside functions only for `#[cfg]` conditional compilation
Document public items with `///` and add examples in docs where helpful
Use `//!` for module-level documentation
Use [TypeName] rustdoc links instead of backticks in documentation
Files:
- src/llm-coding-tools-core/src/permissions.rs
- src/llm-coding-tools-agents/src/parser/preprocessor.rs
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-core/` as framework-agnostic core library containing `src/tools/`, `src/path/`, `src/error.rs`, `src/output.rs`, and `src/util.rs`
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-serdesai/` containing `src/absolute/`, `src/allowed/`, `src/schema.rs`, and `src/convert.rs` for serdesAI framework implementations
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Use `//!` for module-level documentation
Applied to files:
src/llm-coding-tools-agents/src/parser/preprocessor.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Document public items with `///` and add examples in docs where helpful
Applied to files:
src/llm-coding-tools-agents/src/parser/preprocessor.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Use `Cow<'_, str>` for conditional ownership (e.g., `String::from_utf8_lossy`)
Applied to files:
src/llm-coding-tools-agents/src/parser/preprocessor.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build and Test (Blocking) (windows-latest, x86_64-pc-windows-msvc, false)
- GitHub Check: Build and Test (Async/Tokio) (windows-latest, x86_64-pc-windows-msvc, false)
- GitHub Check: Build and Test (Async/Tokio) (ubuntu-latest, x86_64-unknown-linux-gnu, false)
- GitHub Check: Build and Test (Async/Tokio) (macos-latest, aarch64-apple-darwin, false)
🔇 Additional comments (3)
src/llm-coding-tools-core/src/permissions.rs (2)
73-75: LGTM — documentation link format is now correct rustdoc syntax.
`<https://...>` renders as a clickable hyperlink in rustdoc, replacing the bare URL.
60-66: `PackedPermission` correctly round-trips both `PermissionAction` variants after the discriminant reordering.

The implementation encodes/decodes using the enum's raw discriminant values directly (line 36: `action as u64 << 63`; line 50: `transmute` back from bit). The round-trip tests confirm both `Allow` and `Deny` encode and decode correctly with the current enum ordering (`Deny=0`, `Allow=1`). No issues.

src/llm-coding-tools-agents/src/parser/preprocessor.rs (1)

54-64: LGTM! Clean `Cow<'_, str>` usage with a zero-copy fast path for empty input, and the `pub(super)` visibility is appropriate for the parser module boundary.
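Returning to the `PackedPermission` note above, the discriminant-in-bit-63 round-trip can be sketched as follows. Names and layout are illustrative only, and the sketch uses a safe match where the reviewed code uses `transmute`:

```rust
// Hedged sketch: action discriminant in bit 63, payload in the low 63 bits.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum PermissionAction {
    Deny = 0,
    Allow = 1,
}

fn pack(action: PermissionAction, payload: u64) -> u64 {
    debug_assert!(payload < (1 << 63));
    ((action as u64) << 63) | payload
}

fn unpack(packed: u64) -> (PermissionAction, u64) {
    // Safe decode of the single-bit discriminant; the crate's transmute-based
    // decode relies on the same Deny=0 / Allow=1 mapping.
    let action = if packed >> 63 == 1 {
        PermissionAction::Allow
    } else {
        PermissionAction::Deny
    };
    (action, packed & ((1 << 63) - 1))
}

fn main() {
    for action in [PermissionAction::Deny, PermissionAction::Allow] {
        let (back, payload) = unpack(pack(action, 42));
        assert_eq!(back, action);
        assert_eq!(payload, 42);
    }
    println!("round-trip ok");
}
```

This is why the discriminant reordering matters: any encode/decode pair that hardcodes the bit-to-variant mapping must agree with the enum's declared order.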
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/llm-coding-tools-agents/src/parser/preprocessor.rs`:
- Around line 89-100: The loop using split_terminator('\n') drops a final empty
token so rewritten Cow::Owned output can lose a trailing newline; after the
existing loop that iterates over input[first.rest_start..] and writes into out
(and uses extract_if_needs_block_scalar), detect if
input[first.rest_start..].ends_with('\n') and if so push a final '\n' onto out
so the rewritten output preserves a trailing newline just like the borrowed
path.
- Around line 181-186: The returned tuple currently returns value with any
trailing inline YAML comment suffix (e.g., " # fast") which corrupts the block
scalar; in the branch that returns Some((key, value)) strip any inline comment
suffix from value by finding the first unescaped '#' (or simply the first '#'
that follows optional whitespace), truncate value at that position, then trim
trailing whitespace/newlines and return Some((key, cleaned_value)) instead of
the raw value so the block scalar content is correct.
---
Nitpick comments:
In `@src/llm-coding-tools-agents/src/parser/preprocessor.rs`:
- Line 81: The initial capacity for `out` (currently
String::with_capacity(input.len() + 5)) underestimates growth when many lines
are rewritten; update the heuristic by either using a larger constant (e.g.,
String::with_capacity(input.len() + 32)) or, for a better fit, make a single
cheap pass over `input` to count potential rewriteable `:` occurrences (e.g.,
count bytes matching `':'` or `": "` per line) and allocate with input.len() +
count * 5 before building `out`; adjust the allocation where `let mut out =
String::with_capacity(...)` is defined in preprocessor.rs.
Rewrite module-level documentation to be more approachable with a concrete example of the input format and clearer description of outputs. Changes: - Added example showing YAML frontmatter format - Simplified description of what the module does - Added intra-doc link to preprocessor module for details - Removed overly verbose implementation details from top-level docs Benefits: - New users can immediately understand the input/output format - Cleaner, more scannable documentation - Implementation details delegated to appropriate modules
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/llm-coding-tools-agents/src/parser/mod.rs (2)
96-107: Avoid unnecessary `String` allocations for static YAML key lookups.

serde_yaml 0.9's `Mapping` implements the `Index` trait for `str` directly (via a `HashLikeValue` wrapper), so the two temporary `Value::String` heap allocations on lines 99–100 are unnecessary. Use `&str` literals instead.

♻️ Proposed refactor

```diff
 fn validate_headless_contract(frontmatter: &Value) -> Result<(), AgentParseError> {
     let Value::Mapping(root) = frontmatter else {
         return Ok(());
     };
-    let permission_key = Value::String("permission".to_string());
-    let task_key = Value::String("task".to_string());
-
-    let Some(Value::Mapping(permission_map)) = root.get(&permission_key) else {
+    let Some(Value::Mapping(permission_map)) = root.get("permission") else {
         return Ok(());
     };
-    let Some(task_rule) = permission_map.get(&task_key) else {
+    let Some(task_rule) = permission_map.get("task") else {
         return Ok(());
     };
```

As per coding guidelines: `src/**/*.rs` — avoid unnecessary allocations; and the serde_yaml 0.9 `Mapping` source confirms `impl Index for str` is available.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/parser/mod.rs` around lines 96 - 107, The code creates temporary Value::String allocations for YAML key lookups; instead, use &str key lookups to avoid heap allocations—replace the Value::String("permission".to_string()) and Value::String("task".to_string()) usage when indexing the Mapping obtained from frontmatter/Value::Mapping and use root.get("permission") and permission_map.get("task") (or equivalent &str indexing) so permission_key/task_key allocations are removed while keeping the permission_map and task_rule extraction logic in functions handling frontmatter/Value::Mapping intact.
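The allocation-free lookup suggested above is the standard `Borrow`-based map indexing pattern. A minimal std-only illustration (std's `HashMap` stands in for serde_yaml's `Mapping`, whose `get("literal")` behaves analogously per the review):

```rust
use std::collections::HashMap;

fn main() {
    // Map keyed by owned Strings.
    let mut root: HashMap<String, &str> = HashMap::new();
    root.insert("permission".to_string(), "ask");

    // No temporary String built per lookup: &str borrows against the
    // String keys via the Borrow<str> impl.
    assert_eq!(root.get("permission"), Some(&"ask"));
    assert_eq!(root.get("task"), None);
    println!("lookups done without allocating keys");
}
```

The same reasoning applies to any map whose key type implements `Borrow<str>`; building an owned key value just to look it up is always avoidable.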
215-219: Add a `// SAFETY:` comment to the `unsafe` block.

The block is sound — ASCII byte trimming cannot split a multibyte UTF-8 code point (as documented by the table comment above) and `core::ptr::copy` (memmove) is used correctly for potentially overlapping regions — but clippy's `undocumented_unsafe_blocks` restriction lint requires an explicit `// SAFETY:` comment. The comment also helps future readers audit the invariants without needing to re-derive them.

♻️ Proposed refactor

```diff
+    // SAFETY: `start_offset` and `end_offset` are guaranteed to sit on UTF-8
+    // char boundaries because the trimming loops only advance past ASCII bytes
+    // (≤ 0x7F), which can never be a continuation byte (0x80–0xBF) or the
+    // interior of a multibyte sequence. `core::ptr::copy` (memmove) handles
+    // the potentially overlapping src/dst regions correctly.
     unsafe {
         let vec = content.as_mut_vec();
         core::ptr::copy(vec.as_ptr().add(start_offset), vec.as_mut_ptr(), body_len);
         vec.set_len(body_len);
     }
```
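A standalone, runnable sketch of the same pattern with the suggested `// SAFETY:` comment. The helper name and trimming predicate are hypothetical; the crate's real function differs:

```rust
// Shift the retained bytes to the front of the String's buffer in place and
// shrink it, documenting the safety invariants the review asks for.
fn trim_ascii_start_in_place(content: &mut String) {
    let start_offset = content
        .bytes()
        .position(|b| !b.is_ascii_whitespace())
        .unwrap_or(content.len());
    let body_len = content.len() - start_offset;
    // SAFETY: `start_offset` lies on a UTF-8 char boundary because only ASCII
    // bytes (<= 0x7F) were skipped, which can never be the interior of a
    // multibyte sequence. `core::ptr::copy` has memmove semantics, so the
    // overlapping src/dst regions are handled correctly. `set_len` only ever
    // shrinks the buffer, so every retained byte is initialized.
    unsafe {
        let vec = content.as_mut_vec();
        core::ptr::copy(vec.as_ptr().add(start_offset), vec.as_mut_ptr(), body_len);
        vec.set_len(body_len);
    }
}

fn main() {
    let mut s = String::from("  \t hello");
    trim_ascii_start_in_place(&mut s);
    assert_eq!(s, "hello");
    println!("{s}");
}
```

The in-place shift avoids a second allocation, which is the only reason to reach for `unsafe` here at all; a `drain(..start_offset)` call would be the safe equivalent.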
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/parser/mod.rs` around lines 215 - 219, Add a `// SAFETY:` comment immediately above the unsafe block that explains the invariants: that trimming ASCII bytes cannot split a multi-byte UTF‑8 code point (so start_offset is at a valid byte boundary), that `content.as_mut_vec()` yields a valid mutable buffer and `body_len` is <= original length, that `core::ptr::copy` (memmove semantics) is used to correctly handle potentially overlapping regions, and that calling `vec.set_len(body_len)` is safe because we only shorten the vector to initialized bytes; reference the `content.as_mut_vec()`, `core::ptr::copy(...)`, and `vec.set_len(body_len)` operations in the comment so future readers can audit these guarantees.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/llm-coding-tools-agents/src/parser/mod.rs
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.rs
📄 CodeRabbit inference engine (src/AGENTS.md)
Files:
src/llm-coding-tools-agents/src/parser/mod.rs
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-core/` as framework-agnostic core library containing `src/tools/`, `src/path/`, `src/error.rs`, `src/output.rs`, and `src/util.rs`
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-serdesai/` containing `src/absolute/`, `src/allowed/`, `src/schema.rs`, and `src/convert.rs` for serdesAI framework implementations
🧬 Code graph analysis (1)
src/llm-coding-tools-agents/src/parser/mod.rs (2)
src/llm-coding-tools-agents/src/parser/preprocessor.rs (1)
- `preprocess_frontmatter_yaml` (55-64)
src/llm-coding-tools-agents/src/loader.rs (2)
- `parse_agent` (319-319)
- `None` (87-87)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build and Test (Async/Tokio) (ubuntu-latest, x86_64-unknown-linux-gnu, false)
- GitHub Check: Build and Test (Blocking) (windows-latest, x86_64-pc-windows-msvc, false)
- GitHub Check: Build and Test (Async/Tokio) (windows-latest, x86_64-pc-windows-msvc, false)
- GitHub Check: Build and Test (Async/Tokio) (macos-latest, aarch64-apple-darwin, false)
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/llm-coding-tools-agents/src/parser/mod.rs`:
- Around line 142-168: The frontmatter delimiter detection is too permissive:
update the opening check (variable start) to require a line-delimited opener
(e.g. "---" followed by EOF or "\n") instead of starts_with("---") so inline
text like "---key: val" is not treated as an opener; compute yaml_start
accordingly (use the first newline or treat content == "---" as empty YAML). For
the closing delimiter, make the match on tail.find("\n---") only accepted when
the matched "---" is followed by EOF or a newline/line boundary: after obtaining
end_offset/closing_newline/closing_start/after_closing, verify that
after_closing >= content.len() OR the bytes at after_closing are a newline (or
skip forward to the next line boundary) before using body_start; if that check
fails, continue searching for the next "\n---" occurrence so trailing text on
the same delimiter line isn’t misinterpreted as body.
---
Nitpick comments:
In `@src/llm-coding-tools-agents/src/parser/mod.rs`:
- Around line 96-107: The code creates temporary Value::String allocations for
YAML key lookups; instead, use &str key lookups to avoid heap
allocations—replace the Value::String("permission".to_string()) and
Value::String("task".to_string()) usage when indexing the Mapping obtained from
frontmatter/Value::Mapping and use root.get("permission") and
permission_map.get("task") (or equivalent &str indexing) so
permission_key/task_key allocations are removed while keeping the permission_map
and task_rule extraction logic in functions handling frontmatter/Value::Mapping
intact.
- Around line 215-219: Add a `// SAFETY:` comment immediately above the unsafe
block that explains the invariants: that trimming ASCII bytes cannot split a
multi-byte UTF‑8 code point (so start_offset is at a valid byte boundary), that
`content.as_mut_vec()` yields a valid mutable buffer and `body_len` is <=
original length, that `core::ptr::copy` (memmove semantics) is used to correctly
handle potentially overlapping regions, and that calling `vec.set_len(body_len)`
is safe because we only shorten the vector to initialized bytes; reference the
`content.as_mut_vec()`, `core::ptr::copy(...)`, and `vec.set_len(body_len)`
operations in the comment so future readers can audit these guarantees.
Replace std::collections::HashMap with ahash::AHashMap across agents crate and sync ahash dependency version with core crate. Changes: - Added ahash = "0.8" with serde feature to llm-coding-tools-agents/Cargo.toml - Added ahash = "0.8" dependency to llm-coding-tools-serdesai/Cargo.toml - Replaced std::collections::HashMap with ahash::AHashMap in catalog.rs - Replaced std::collections::HashMap with ahash::AHashMap in config.rs - Replaced std::collections::HashMap with ahash::AHashMap in loader.rs Benefits: - Consistent hashing implementation across all crates - Improved performance with ahash's optimized hashing algorithm - Aligns with existing ahash usage in llm-coding-tools-core
Update both verify.sh and verify.ps1 to include the new agents crate in all verification steps. Changes: - Added cargo build for llm-coding-tools-agents - Added cargo test for llm-coding-tools-agents - Added cargo clippy for llm-coding-tools-agents - Added cargo publish --dry-run for llm-coding-tools-agents Benefits: - Ensures agents crate is validated alongside core and serdesai - Maintains consistency between Unix and Windows verify scripts
Force-pushed from `6a3e7a0` to `c98826d` (compare)
…ndency The cargo publish dry-run was failing because dependencies must specify a version requirement when publishing to crates.io. Changes: - Added version = "0.2.0" to llm-coding-tools-core dependency in agents crate Benefits: - Fixes publish validation error - Allows crates.io publishing
… intra-doc links Rewrote the module and method-level documentation to be more concise and accurate, while converting raw type mentions to proper rustdoc intra-doc links. Changes: - Reworded struct-level docs to emphasize upstream consumption pattern - Added proper intra-doc links for Option::Some, Option::None, and AgentConfig - Standardized all method docs with explicit Parameters and Returns sections - Simplified language while preserving technical accuracy Benefits: - Easier to understand the catalog's role in the agent loading pipeline - Proper intra-doc links enable IDE navigation and type checking - Consistent documentation structure across all public methods
The ahash crate was added to serdesAI in commit 57ddc18 during a migration to AHashMap for performance, but it was never actually used in this crate. Changes: - Removed ahash = "0.8" from llm-coding-tools-serdesai/Cargo.toml - Cargo.lock automatically updated to reflect the removal Benefits: - Eliminates an unnecessary dependency (ahash is still used by agents and core crates) - Reduces build complexity and dependency tree - Follows the principle of only including dependencies that are actually used
…cs readability Reorganize agent configuration and error types into a top-level types module and rewrite module docs to be clearer, shorter, and easier to scan. Changes: - Moved agent config and load error definitions into `src/types/config.rs` and `src/types/error.rs` - Added `src/types/mod.rs` to centralize re-exports while preserving the public API - Updated internal imports across loader, parser, catalog, and extensions to use the new module layout - Rewrote module-level documentation with headings, bullets, and an accessible YAML frontmatter example Benefits: - Makes type ownership and module structure easier to navigate and maintain - Reduces confusion from outdated descriptions by documenting current parsing and naming behavior
Place the default enum variant first to keep discriminant ordering intuitive and consistent with Rust defaults. Changes: - Moved `AgentMode::All` to the top of the enum - Kept `#[default]` on `All` while preserving serde lowercase behavior Benefits: - Makes implicit default discriminant value zero - Improves enum readability and maintenance consistency
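The effect can be illustrated with a minimal enum; the second variant name is hypothetical, and only the default-first layout mirrors the commit:

```rust
// Placing the #[default] variant first gives it the implicit discriminant 0.
#[derive(Default, Debug, PartialEq, Clone, Copy)]
enum AgentMode {
    #[default]
    All,     // discriminant 0, matches the derived Default
    Subtask, // hypothetical second variant
}

fn main() {
    assert_eq!(AgentMode::default(), AgentMode::All);
    // Fieldless enums can be cast to their discriminant.
    assert_eq!(AgentMode::All as u8, 0);
    assert_eq!(AgentMode::Subtask as u8, 1);
    println!("default discriminant is zero");
}
```

Keeping the default at discriminant zero also matches what zero-initialized memory would decode to in repr-stable contexts, which is the "intuitive ordering" the commit refers to.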
Prevent inline YAML comments from being embedded into generated block scalar values during frontmatter preprocessing. Changes: - Parse value text up to the first `#` and rewrite using the stripped value only - Skip transformation when the comment suffix contains `:` to avoid ambiguous false positives - Add regression tests for comment stripping and ambiguous inline comment handling Benefits: - Preserves intended scalar content without comment contamination - Reduces incorrect rewrites for edge-case frontmatter lines
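The stripping rule described above can be sketched as follows; the helper is hypothetical, not the crate's implementation:

```rust
// Take the value up to the first '#'; skip the rewrite entirely (None) when
// the comment suffix itself contains ':' — the ambiguous case the commit
// deliberately leaves untouched.
fn strip_inline_comment(value: &str) -> Option<&str> {
    match value.split_once('#') {
        Some((before, comment)) if !comment.contains(':') => Some(before.trim_end()),
        Some(_) => None, // ambiguous inline comment: leave the line as-is
        None => Some(value),
    }
}

fn main() {
    assert_eq!(strip_inline_comment("a: b # fast"), Some("a: b"));
    assert_eq!(strip_inline_comment("a: b # note: tricky"), None);
    assert_eq!(strip_inline_comment("a: b"), Some("a: b"));
    println!("comment stripping behaves as described");
}
```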
Clarify why rewritten frontmatter can drop a final newline and why this is acceptable in the current parse flow.
Changes:
- Added an inline note in `convert_block_scalars` about `split_terminator('\n')` dropping trailing empty segments
- Kept behavior unchanged while documenting that trailing newline parity is not semantically relevant to YAML deserialization
Benefits:
- Makes an intentional tradeoff explicit for maintainers and reviewers
- Reduces future churn from re-reporting this behavior as an unknown bug
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/llm-coding-tools-agents/src/loader.rs (1)
138-157: Redundant `.to_path_buf()` on an already-owned `PathBuf`.

On line 150, `path` is already a `PathBuf` (from `path.into()` at line 143), so `path.to_path_buf()` is an unnecessary clone. Use `path.clone()` for clarity, or restructure to pass the owned value directly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/loader.rs` around lines 138 - 157, In add_file, avoid calling path.to_path_buf() on the owned PathBuf; change the AgentLoadError::schema_validation call to use the owned value (move path) or use path.clone() instead of to_path_buf() so you don't perform an extra conversion—update the call site that constructs the error (AgentLoadError::schema_validation(Some(...), "agent file name is empty")) to pass either path.clone() or restructure so path is moved directly into the error.

src/llm-coding-tools-agents/src/parser/mod.rs (1)
99-119: Avoid per-call `String` allocations for static keys in `validate_headless_contract`.

`Value::String("permission".to_string())` and `Value::String("task".to_string())` allocate on every call. These could be `const`-like or use serde_yaml's key lookup more efficiently, but this is a validation path and unlikely to be a hot path.

♻️ Proposed refactor using static values

```diff
 fn validate_headless_contract(frontmatter: &Value) -> Result<(), AgentParseError> {
     let Value::Mapping(root) = frontmatter else {
         return Ok(());
     };
-    let permission_key = Value::String("permission".to_string());
-    let task_key = Value::String("task".to_string());
+    // Thread-local or lazy_static could be used, but for simplicity:
+    let permission_key = Value::String("permission".into());
+    let task_key = Value::String("task".into());
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/parser/mod.rs` around lines 99 - 119, The function validate_headless_contract currently creates Value::String("permission".to_string()) and Value::String("task".to_string()) on every call; change these to static cached Values to avoid repeated allocations by defining static lazily-initialized constants (e.g., with once_cell::sync::Lazy or lazy_static) for permission_key and task_key and then reference those statics inside validate_headless_contract (keep task_rule_contains_ask and the rest of the logic unchanged). Ensure the static symbols are named clearly (e.g., PERMISSION_KEY, TASK_KEY) and used in root.get(&PERMISSION_KEY) and permission_map.get(&TASK_KEY).

src/llm-coding-tools-agents/src/types/error.rs (1)
46-68: Consider extracting the duplicated `path_str` resolution.

Each `Display` arm repeats the same `path.as_deref().map_or("<memory>", |p| p.to_str().unwrap_or("<invalid>"))` logic. A small helper would reduce duplication.

♻️ Proposed refactor

```diff
+fn display_path(path: &Option<PathBuf>) -> &str {
+    path.as_deref()
+        .map_or("<memory>", |p| p.to_str().unwrap_or("<invalid>"))
+}
+
 impl fmt::Display for AgentLoadError {
     fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
         match self {
             AgentLoadError::Io { path, source } => {
-                let path_str = path
-                    .as_deref()
-                    .map_or("<memory>", |p| p.to_str().unwrap_or("<invalid>"));
-                write!(f, "I/O error reading {path_str}: {source}")
+                write!(f, "I/O error reading {}: {source}", display_path(path))
             }
             AgentLoadError::Parse { path, source } => {
-                let path_str = path
-                    .as_deref()
-                    .map_or("<memory>", |p| p.to_str().unwrap_or("<invalid>"));
-                write!(f, "parse error in {path_str}: {source}")
+                write!(f, "parse error in {}: {source}", display_path(path))
             }
             AgentLoadError::SchemaValidation { path, message } => {
-                let path_str = path
-                    .as_deref()
-                    .map_or("<memory>", |p| p.to_str().unwrap_or("<invalid>"));
-                write!(f, "schema validation failed in {path_str}: {message}")
+                write!(f, "schema validation failed in {}: {message}", display_path(path))
             }
         }
     }
 }
```
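A reduced, runnable illustration of the helper extraction — a hypothetical two-variant error type stands in for `AgentLoadError`, and the `source` fields are omitted for brevity:

```rust
use std::fmt;
use std::path::PathBuf;

enum LoadError {
    Io { path: Option<PathBuf> },
    Parse { path: Option<PathBuf> },
}

// Shared resolution: <memory> for in-memory input, <invalid> for non-UTF-8 paths.
fn display_path(path: &Option<PathBuf>) -> &str {
    path.as_deref()
        .map_or("<memory>", |p| p.to_str().unwrap_or("<invalid>"))
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LoadError::Io { path } => write!(f, "I/O error reading {}", display_path(path)),
            LoadError::Parse { path } => write!(f, "parse error in {}", display_path(path)),
        }
    }
}

fn main() {
    let e = LoadError::Io { path: None };
    assert_eq!(e.to_string(), "I/O error reading <memory>");
    let e = LoadError::Parse { path: Some(PathBuf::from("a.md")) };
    assert_eq!(e.to_string(), "parse error in a.md");
    println!("display arms share one helper");
}
```

The returned `&str` borrows either from the caller's `PathBuf` or from a `'static` literal, so no allocation happens per `Display` call.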
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/types/error.rs` around lines 46 - 68, Extract the duplicated path-to-string logic used in fmt::Display for AgentLoadError into a small private helper (e.g. a function like path_str or format_path that takes &Option<PathBuf> or &Option<&Path> and returns a String/Cow<String>) and call it from each match arm in the impl fmt::Display for AgentLoadError; replace the repeated path.as_deref().map_or(...) expression in the Io, Parse and SchemaValidation arms with a single call to that helper so the Display implementation reads the path once via the helper and then writes the message with the returned string.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
src/Cargo.lock is excluded by `!**/*.lock`
📒 Files selected for processing (12)
- src/.cargo/verify.ps1
- src/.cargo/verify.sh
- src/llm-coding-tools-agents/Cargo.toml
- src/llm-coding-tools-agents/src/catalog.rs
- src/llm-coding-tools-agents/src/extensions.rs
- src/llm-coding-tools-agents/src/lib.rs
- src/llm-coding-tools-agents/src/loader.rs
- src/llm-coding-tools-agents/src/parser/mod.rs
- src/llm-coding-tools-agents/src/parser/preprocessor.rs
- src/llm-coding-tools-agents/src/types/config.rs
- src/llm-coding-tools-agents/src/types/error.rs
- src/llm-coding-tools-agents/src/types/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- src/llm-coding-tools-agents/src/extensions.rs
- src/llm-coding-tools-agents/Cargo.toml
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.rs
📄 CodeRabbit inference engine (src/AGENTS.md)
Files:
- src/llm-coding-tools-agents/src/types/mod.rs
- src/llm-coding-tools-agents/src/parser/mod.rs
- src/llm-coding-tools-agents/src/types/config.rs
- src/llm-coding-tools-agents/src/parser/preprocessor.rs
- src/llm-coding-tools-agents/src/catalog.rs
- src/llm-coding-tools-agents/src/types/error.rs
- src/llm-coding-tools-agents/src/lib.rs
- src/llm-coding-tools-agents/src/loader.rs
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-core/` as framework-agnostic core library containing `src/tools/`, `src/path/`, `src/error.rs`, `src/output.rs`, and `src/util.rs`
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-serdesai/` containing `src/absolute/`, `src/allowed/`, `src/schema.rs`, and `src/convert.rs` for serdesAI framework implementations
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-core/` as framework-agnostic core library containing `src/tools/`, `src/path/`, `src/error.rs`, `src/output.rs`, and `src/util.rs`
Applied to files:
src/llm-coding-tools-agents/src/types/mod.rs
src/.cargo/verify.ps1
src/llm-coding-tools-agents/src/parser/mod.rs
src/.cargo/verify.sh
src/llm-coding-tools-agents/src/lib.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-serdesai/` containing `src/absolute/`, `src/allowed/`, `src/schema.rs`, and `src/convert.rs` for serdesAI framework implementations
Applied to files:
src/llm-coding-tools-agents/src/types/mod.rs
src/.cargo/verify.ps1
src/llm-coding-tools-agents/src/parser/mod.rs
src/.cargo/verify.sh
src/llm-coding-tools-agents/src/types/config.rs
src/llm-coding-tools-agents/src/lib.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Use `//!` for module-level documentation
Applied to files:
src/llm-coding-tools-agents/src/types/mod.rs
src/llm-coding-tools-agents/src/parser/mod.rs
src/llm-coding-tools-agents/src/parser/preprocessor.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: After making changes to source code, run `.cargo/verify.sh` (or `.cargo/verify.ps1` on Windows) before returning to the user
Applied to files:
src/.cargo/verify.ps1
src/.cargo/verify.sh
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Document public items with `///` and add examples in docs where helpful
Applied to files:
src/llm-coding-tools-agents/src/parser/mod.rs
src/llm-coding-tools-agents/src/parser/preprocessor.rs
src/llm-coding-tools-agents/src/lib.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Use `Cow<'_, str>` for conditional ownership (e.g., `String::from_utf8_lossy`)
Applied to files:
src/llm-coding-tools-agents/src/parser/preprocessor.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Prefer `core` over `std` where possible (e.g., `core::mem` over `std::mem`)
Applied to files:
src/llm-coding-tools-agents/src/lib.rs
🧬 Code graph analysis (5)
src/llm-coding-tools-agents/src/parser/mod.rs (2)
  src/llm-coding-tools-agents/src/parser/preprocessor.rs (1)
    - `preprocess_frontmatter_yaml` (55-64)
  src/llm-coding-tools-agents/src/loader.rs (2)
    - `parse_agent` (323-323)
    - `None` (91-91)
src/llm-coding-tools-agents/src/types/config.rs (1)
  src/llm-coding-tools-core/src/tools/webfetch/mod.rs (1)
    - `serde_json` (80-80)
src/llm-coding-tools-agents/src/catalog.rs (1)
  src/llm-coding-tools-agents/src/loader.rs (2)
    - `new` (34-38)
    - `None` (91-91)
src/llm-coding-tools-agents/src/types/error.rs (1)
  src/llm-coding-tools-agents/src/loader.rs (1)
    - `None` (91-91)
src/llm-coding-tools-agents/src/loader.rs (4)
  src/llm-coding-tools-agents/src/parser/mod.rs (4)
    - `parse_agent` (61-91), `parse_agent` (305-305), `parse_agent` (429-429), `parse_agent` (448-448)
  src/llm-coding-tools-agents/src/catalog.rs (3)
    - `new` (34-38), `by_name` (61-63), `iter` (48-50)
  src/llm-coding-tools-agents/src/types/error.rs (3)
    - `schema_validation` (93-98), `io` (83-85), `parse` (88-90)
  src/llm-coding-tools-agents/src/types/config.rs (1)
    - `from_raw` (140-153)
🪛 PSScriptAnalyzer (1.24.0)
src/.cargo/verify.ps1
[warning] 15-15: File 'verify.ps1' uses Write-Host. Avoid using Write-Host because it might not work in all hosts, does not work when there is no host, and (prior to PS 5.0) cannot be suppressed, captured, or redirected. Instead, use Write-Output, Write-Verbose, or Write-Information.
(PSAvoidUsingWriteHost)
[warning] 20-20: File 'verify.ps1' uses Write-Host. Avoid using Write-Host because it might not work in all hosts, does not work when there is no host, and (prior to PS 5.0) cannot be suppressed, captured, or redirected. Instead, use Write-Output, Write-Verbose, or Write-Information.
(PSAvoidUsingWriteHost)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Build and Test (Blocking) (windows-latest, x86_64-pc-windows-msvc, false)
- GitHub Check: Build and Test (Async/Tokio) (macos-latest, aarch64-apple-darwin, false)
- GitHub Check: Build and Test (Async/Tokio) (windows-latest, x86_64-pc-windows-msvc, false)
- GitHub Check: Build and Test (Async/Tokio) (ubuntu-latest, x86_64-unknown-linux-gnu, false)
🔇 Additional comments (11)
src/.cargo/verify.sh (1)
13-13: LGTM! New crate steps correctly mirror the existing llm-coding-tools-serdesai pattern (--quiet, -D warnings on clippy, ordering after core). The workspace-level cargo doc on line 30 already picks up the new crate without a dedicated step.
Also applies to: 18-18, 23-23, 37-37
src/.cargo/verify.ps1 (1)
12-12: LGTM! Changes are in full sync with verify.sh: same crate, same flags, same phase ordering. Script parity is maintained as required.
Also applies to: 17-17, 22-22, 37-37
src/llm-coding-tools-agents/src/types/mod.rs (1)
1-15: Clean re-export module with proper documentation. Module docs, rustdoc links, and visibility scoping all look correct.
src/llm-coding-tools-agents/src/catalog.rs (1)
1-90: Well-structured catalog with appropriate API surface. Module docs, visibility, #[inline] on accessors, and &AgentConfig return types are all correct. The Default derive is consistent with the manual new().
src/llm-coding-tools-agents/src/lib.rs (1)
1-13: LGTM! Clean crate root with README-based docs and focused public API surface.
src/llm-coding-tools-agents/src/types/config.rs (1)
1-154: Clean data model with well-documented types. The serde configuration is sound: #[serde(untagged)] on PermissionRule correctly prioritizes Action over Pattern, description is required in RawFrontmatter but defaulted in AgentConfig, and #[serde(skip)] on prompt is appropriate for the parsing flow.
src/llm-coding-tools-agents/src/parser/mod.rs (1)
219-223: Unsafe block in extract_body_inplace is sound. core::ptr::copy correctly handles overlapping regions, char boundary invariants are guarded by debug_assert!, and the ASCII whitespace trimming cannot split multi-byte code points. The reuse of the existing String allocation avoids a second allocation per parse.
src/llm-coding-tools-agents/src/parser/preprocessor.rs (1)
183-218: Inline comment stripping logic is correct; edge cases are handled. The scan-for-#-before-: approach correctly handles values like provider/model:tag # comment (strips the comment), provider/model # note: keep (returns None due to the colon in the comment suffix), and key: # comment (returns None due to the empty stripped value). The fix from the prior review feedback has been properly incorporated.
src/llm-coding-tools-agents/src/loader.rs (3)
257-316: Directory walker setup and filtering logic is sound. The WalkBuilder configuration appropriately follows symlinks (with cycle protection from the ignore crate), respects .gitignore, shows hidden files, and applies pattern-based filtering. Path normalization for Windows is correctly handled via #[cfg(windows)].
168-186: The add_file_named doc says it "always overrides any frontmatter name field"; verify this holds. The implementation passes String::new() as default_name to load_agent_file, then overwrites config.name on line 183. This works correctly: if the frontmatter has a name, from_raw uses it (since raw.name.unwrap_or("") returns the frontmatter name), then line 183 overwrites it. If the frontmatter has no name, from_raw uses "", then line 183 overwrites. The override semantic is correct, though the intermediate empty-name state is a minor smell.
365-396: Pattern matching and name derivation are clean. matches_agent_pattern correctly requires the / after agent/agents (preventing false matches like agentsmith/), and derive_agent_name_from_rel strips prefixes and the .md suffix with an empty-name guard.
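To make the described shape concrete, here is a hedged std-only sketch; the function names, exact prefix set, and behavior are assumptions for illustration, not the crate's actual code:

```rust
// Hypothetical sketch of the matching described above: require a literal
// '/' after the `agent`/`agents` segment so `agentsmith/` cannot match,
// then derive the agent name by stripping the prefix and a `.md` suffix.
fn matches_agent_pattern(rel: &str) -> bool {
    rel.starts_with("agent/")
        || rel.starts_with("agents/")
        || rel.contains("/agent/")
        || rel.contains("/agents/")
}

fn derive_agent_name(rel: &str) -> Option<&str> {
    // Take everything after the last `agents/` (or `agent/`) segment.
    let after = rel
        .rsplit_once("agents/")
        .or_else(|| rel.rsplit_once("agent/"))
        .map(|(_, tail)| tail)?;
    let name = after.strip_suffix(".md").unwrap_or(after);
    // Empty-name guard: `agents/` or `agents/.md` yields no agent name.
    if name.is_empty() { None } else { Some(name) }
}

fn main() {
    assert!(matches_agent_pattern(".opencode/agents/writer.md"));
    assert!(!matches_agent_pattern("agentsmith/writer.md"));
    assert_eq!(derive_agent_name("agents/writer.md"), Some("writer"));
    assert_eq!(derive_agent_name("agents/"), None);
}
```

The `/` requirement is the load-bearing detail: checking only a bare `agents` prefix would accept sibling directories that merely share the spelling.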
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/llm-coding-tools-agents/src/loader.rs`:
- Around line 17-33: The doc example fails because RawFrontmatter.description is
required during serde deserialization; either make it optional or provide a
default. Update the RawFrontmatter definition in loader.rs (the RawFrontmatter
struct) to mark description with #[serde(default)] (or change its type to
Option<String>) so add_from_str / deserialization accepts missing description,
or alternatively modify the doc example string passed to
AgentLoader::add_from_str to include a description field in the frontmatter;
choose the serde default change for backward-compatible behavior.
---
Duplicate comments:
In `@src/llm-coding-tools-agents/src/parser/preprocessor.rs`:
- Around line 66-103: The existing convert_block_scalars function intentionally
uses split_terminator('\n'), and the comment should explicitly state that
split_terminator drops a trailing newline and that this is acceptable for YAML
deserialization; update the comment near the loop in convert_block_scalars to
clearly reference that behavior (and optionally cite split_terminator
documentation) so reviewers understand the trailing-newline drop is deliberate
when using find_first_block_scalar and extract_if_needs_block_scalar in this
rewrite logic.
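The split_terminator behavior referenced above can be shown in isolation (the helper name here is hypothetical, not from the crate):

```rust
// `str::split_terminator('\n')` yields no final empty segment when the
// input ends in '\n', so rejoining with '\n' drops a trailing newline.
// The review above notes this is acceptable for YAML deserialization.
fn rejoin_lines(input: &str) -> String {
    input.split_terminator('\n').collect::<Vec<_>>().join("\n")
}

fn main() {
    assert_eq!(rejoin_lines("a\nb\n"), "a\nb"); // trailing newline dropped
    assert_eq!(rejoin_lines("a\nb"), "a\nb");   // no terminator, unchanged
}
```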
---
Nitpick comments:
In `@src/llm-coding-tools-agents/src/loader.rs`:
- Around line 138-157: In add_file, avoid calling path.to_path_buf() on the
owned PathBuf; change the AgentLoadError::schema_validation call to use the
owned value (move path) or use path.clone() instead of to_path_buf() so you
don't perform an extra conversion—update the call site that constructs the error
(AgentLoadError::schema_validation(Some(...), "agent file name is empty")) to
pass either path.clone() or restructure so path is moved directly into the
error.
In `@src/llm-coding-tools-agents/src/parser/mod.rs`:
- Around line 99-119: The function validate_headless_contract currently creates
Value::String("permission".to_string()) and Value::String("task".to_string()) on
every call; change these to static cached Values to avoid repeated allocations
by defining static lazily-initialized constants (e.g., with
once_cell::sync::Lazy or lazy_static) for permission_key and task_key and then
reference those statics inside validate_headless_contract (keep
task_rule_contains_ask and the rest of the logic unchanged). Ensure the static
symbols are named clearly (e.g., PERMISSION_KEY, TASK_KEY) and used in
root.get(&PERMISSION_KEY) and permission_map.get(&TASK_KEY).
In `@src/llm-coding-tools-agents/src/types/error.rs`:
- Around line 46-68: Extract the duplicated path-to-string logic used in
fmt::Display for AgentLoadError into a small private helper (e.g. a function
like path_str or format_path that takes &Option<PathBuf> or &Option<&Path> and
returns a String/Cow<String>) and call it from each match arm in the impl
fmt::Display for AgentLoadError; replace the repeated
path.as_deref().map_or(...) expression in the Io, Parse and SchemaValidation
arms with a single call to that helper so the Display implementation reads the
path once via the helper and then writes the message with the returned string.
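A minimal sketch of such a helper, assuming the Display arms only need a printable path (the `<unknown>` placeholder is an assumption):

```rust
use std::borrow::Cow;
use std::path::PathBuf;

// Hypothetical helper for the Display impl: render an optional path once
// instead of repeating `as_deref().map_or(...)` in every match arm.
fn path_str(path: &Option<PathBuf>) -> Cow<'_, str> {
    match path {
        Some(p) => p.to_string_lossy(),     // borrows when the path is valid UTF-8
        None => Cow::Borrowed("<unknown>"), // placeholder text is an assumption
    }
}

fn main() {
    assert_eq!(path_str(&Some(PathBuf::from("custom.md"))), "custom.md");
    assert_eq!(path_str(&None), "<unknown>");
}
```

Returning `Cow<'_, str>` keeps the common UTF-8 case allocation-free, which also matches the crate's stated guideline of using `Cow` for conditional ownership.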
//! ```no_run
//! use llm_coding_tools_agents::{AgentLoader, AgentCatalog};
//! use std::path::Path;
//!
//! let loader = AgentLoader::new();
//! let mut catalog = AgentCatalog::new();
//!
//! // Scan directories for agent definitions
//! loader.add_directory(&mut catalog, Path::new("~/.opencode"))?;
//!
//! // Load specific files
//! loader.add_file(&mut catalog, Path::new("custom.md"))?;
//!
//! // Parse from string (useful for embedded configs)
//! loader.add_from_str(&mut catalog, "---\nmode: subagent\n---\nprompt", "agent-name")?;
//! # Ok::<(), llm_coding_tools_agents::AgentLoadError>(())
//! ```
Doc example will fail: description is a required field in RawFrontmatter.
The inline example at line 31 passes markdown without a description field:
"---\nmode: subagent\n---\nprompt"
RawFrontmatter.description is not annotated with #[serde(default)], so this would return a parse error at runtime. Since the example is no_run, it won't be caught by cargo test --doc, but it's misleading.
📝 Proposed fix
-loader.add_from_str(&mut catalog, "---\nmode: subagent\n---\nprompt", "agent-name")?;
+loader.add_from_str(&mut catalog, "---\nmode: subagent\ndescription: Example\n---\nprompt", "agent-name")?;
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
//! ```no_run
//! use llm_coding_tools_agents::{AgentLoader, AgentCatalog};
//! use std::path::Path;
//!
//! let loader = AgentLoader::new();
//! let mut catalog = AgentCatalog::new();
//!
//! // Scan directories for agent definitions
//! loader.add_directory(&mut catalog, Path::new("~/.opencode"))?;
//!
//! // Load specific files
//! loader.add_file(&mut catalog, Path::new("custom.md"))?;
//!
//! // Parse from string (useful for embedded configs)
//! loader.add_from_str(&mut catalog, "---\nmode: subagent\n---\nprompt", "agent-name")?;
//! # Ok::<(), llm_coding_tools_agents::AgentLoadError>(())
//! ```
//!
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/llm-coding-tools-agents/src/loader.rs` around lines 17 - 33, The doc
example fails because RawFrontmatter.description is required during serde
deserialization; either make it optional or provide a default. Update the
RawFrontmatter definition in loader.rs (the RawFrontmatter struct) to mark
description with #[serde(default)] (or change its type to Option<String>) so
add_from_str / deserialization accepts missing description, or alternatively
modify the doc example string passed to AgentLoader::add_from_str to include a
description field in the frontmatter; choose the serde default change for
backward-compatible behavior.
…validation

The validation function `validate_headless_contract` had an unclear name and documentation that didn't explain its actual purpose. This improves clarity for future maintainers.
Changes:
- Renamed `validate_headless_contract` to `validate_headless_compatibility`
- Updated doc comment to explain it rejects interactive "ask" permissions
- Added inline comments explaining YAML extraction logic
- Added YAML code blocks showing scalar and pattern-based configurations
- Added reference to the PermissionRule target type
- Added inline comments to task_rule_contains_ask explaining the two branches
Benefits:
- Clearer function name reflects its actual purpose
- Better documentation helps future contributors understand the code
- Easier to extend for additional headless compatibility checks
🧹 Nitpick comments (3)
src/llm-coding-tools-agents/src/parser/mod.rs (3)
238-242: Add a // SAFETY: comment to the unsafe block. The safety rationale is documented at lines 215–221, but idiomatic Rust (and clippy::undocumented_unsafe_blocks) requires a // SAFETY: annotation directly preceding the unsafe block.
♻️ Proposed addition
+ // SAFETY: `start_offset` and `end_offset` are char boundaries (verified by debug_assert above).
+ // ASCII whitespace trimming cannot bisect a multibyte code point (continuation bytes 0x80–0xBF
+ // and leading bytes 0xC2–0xF4 are never ASCII). `ptr::copy` uses memmove semantics, so the
+ // overlapping forward shift is safe. The resulting bytes `[0..body_len]` are a valid UTF-8
+ // sub-slice of the original string.
  unsafe {
      let vec = content.as_mut_vec();
      core::ptr::copy(vec.as_ptr().add(start_offset), vec.as_mut_ptr(), body_len);
      vec.set_len(body_len);
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/parser/mod.rs` around lines 238 - 242, Add a `// SAFETY:` comment immediately above the `unsafe` block that explains why the raw pointer operations on `vec` are sound: state that `content.as_mut_vec()` yields a valid mutable Vec, that `start_offset` and `body_len` are within the original buffer bounds (so `vec.as_ptr().add(start_offset)` and `vec.as_mut_ptr()` are valid for reads/writes), that the regions copied do not overlap in an undefined way for `core::ptr::copy`, and that `vec.set_len(body_len)` is safe because the copied range has been initialized; reference the local symbols `content`, `vec`, `start_offset`, `body_len`, `core::ptr::copy`, and `vec.set_len` in the comment.
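For readers following along, here is a self-contained sketch of the in-place shift technique under discussion; the function shape, offsets, and example document are assumptions for illustration, not the crate's actual code:

```rust
// Hypothetical sketch: shift the post-frontmatter body to the front of the
// same String, reusing its allocation instead of allocating a second one.
fn extract_body_inplace(mut content: String, body_start: usize) -> String {
    // Trim ASCII whitespace around the body region.
    let trimmed = content[body_start..].trim_matches(|c: char| c.is_ascii_whitespace());
    let start_offset = trimmed.as_ptr() as usize - content.as_ptr() as usize;
    let body_len = trimmed.len();
    debug_assert!(content.is_char_boundary(start_offset));
    debug_assert!(content.is_char_boundary(start_offset + body_len));
    // SAFETY: both offsets sit on char boundaries (ASCII trimming cannot
    // bisect a multibyte code point), `ptr::copy` has memmove semantics so
    // the overlapping forward shift is sound, and the resulting prefix is a
    // valid UTF-8 sub-slice of the original string.
    unsafe {
        let vec = content.as_mut_vec();
        core::ptr::copy(vec.as_ptr().add(start_offset), vec.as_mut_ptr(), body_len);
        vec.set_len(body_len);
    }
    content
}

fn main() {
    // body_start = 23 skips "---\nmode: subagent\n---\n".
    let doc = String::from("---\nmode: subagent\n---\n  prompt body \n");
    assert_eq!(extract_body_inplace(doc, 23), "prompt body");
}
```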
103-127: Avoid heap allocations for static key lookups; use as_str() iteration instead.
Value::String("permission".to_string()) and Value::String("task".to_string()) both allocate on every parse call. Since serde_yaml::Value::as_str() returns Option<&str>, the lookups can be made allocation-free by iterating with .find():
♻️ Proposed refactor
- let permission_key = Value::String("permission".to_string());
- let task_key = Value::String("task".to_string());
- // ...
- let Some(Value::Mapping(permission_map)) = root.get(&permission_key) else {
+ let Some(Value::Mapping(permission_map)) = root
+     .iter()
+     .find(|(k, _)| k.as_str() == Some("permission"))
+     .map(|(_, v)| v)
+ else {
      return Ok(());
  };
- let Some(task_rule) = permission_map.get(&task_key) else {
+ let Some((_, task_rule)) = permission_map
+     .iter()
+     .find(|(k, _)| k.as_str() == Some("task"))
+ else {
      return Ok(());
  };
As per coding guidelines: "Use &'static str for compile-time constant strings."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/parser/mod.rs` around lines 103 - 127, The code currently creates temporary owned Strings for keys (permission_key and task_key as Value::String(...)) and uses root.get(&permission_key) which causes heap allocations on every parse; replace these lookups with allocation-free checks by iterating the mapping keys and using Value::as_str() to find entries named "permission" and then inside that mapping find "task" (e.g. iterate root.as_mapping().iter().find(|(k,_)| k.as_str()==Some("permission")) to get permission_map, then permission_map.as_mapping().iter().find(|(k,_)| k.as_str()==Some("task")) to get task_rule) so you avoid constructing Value::String and eliminate the allocations while preserving the existing permission_map and task_rule logic.
233-236: Unreachable early-return: start_offset == 0 is never true for valid inputs.
start_offset begins at body_start, which is always ≥ 7 (the minimum document, ---\n---, is 7 bytes). The condition start_offset == 0 can never be satisfied, making this fast-path dead code that may mislead future readers.
♻️ Proposed fix
- if start_offset == 0 && body_len == len {
-     return content;
- }
  unsafe {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/llm-coding-tools-agents/src/parser/mod.rs` around lines 233 - 236, The early-return condition `if start_offset == 0 && body_len == len { return content; }` is dead because `start_offset` (from `body_start`) is never 0; remove the redundant `start_offset == 0` check and simplify to `if body_len == len { return content; }` (or, if the original intent was to test the raw document start, replace `start_offset` with the correct variable such as `body_start`). Update the check in the parser function around the variables `start_offset`, `body_len`, `len`, and `content` accordingly to keep the fast-path correct and readable.
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/llm-coding-tools-agents/src/parser/mod.rs
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.rs
📄 CodeRabbit inference engine (src/AGENTS.md)
src/**/*.rs:
- Preallocate collections when size is known or estimable using String::with_capacity(estimated_len), Vec::with_capacity(count), or BufReader::with_capacity(size, reader)
- Prefer &str/&[T] returns over owned types when lifetime allows
- Use Cow<'_, str> for conditional ownership (e.g., String::from_utf8_lossy)
- Use &'static str for compile-time constant strings
- Reuse buffers by calling .clear() and reusing Vec/String instead of reallocating
- Use const generics for compile-time branching (e.g., <const LINE_NUMBERS: bool>)
- Use #[inline] on small, hot-path functions
- Prefer core over std where possible (e.g., core::mem over std::mem)
- Use memchr crate for fast byte searching over manual iteration
- Keep modules under 500 lines (excluding tests); split if larger
- Place use statements inside functions only for #[cfg] conditional compilation
- Document public items with /// and add examples in docs where helpful
- Use //! for module-level documentation
- Use [TypeName] rustdoc links instead of backticks in documentation
Files:
src/llm-coding-tools-agents/src/parser/mod.rs
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-core/` as framework-agnostic core library containing `src/tools/`, `src/path/`, `src/error.rs`, `src/output.rs`, and `src/util.rs`
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-serdesai/` containing `src/absolute/`, `src/allowed/`, `src/schema.rs`, and `src/convert.rs` for serdesAI framework implementations
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Use `//!` for module-level documentation
Applied to files:
src/llm-coding-tools-agents/src/parser/mod.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Applies to src/**/*.rs : Document public items with `///` and add examples in docs where helpful
Applied to files:
src/llm-coding-tools-agents/src/parser/mod.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-core/` as framework-agnostic core library containing `src/tools/`, `src/path/`, `src/error.rs`, `src/output.rs`, and `src/util.rs`
Applied to files:
src/llm-coding-tools-agents/src/parser/mod.rs
📚 Learning: 2026-02-07T22:53:26.067Z
Learnt from: CR
Repo: Sewer56/llm-coding-tools PR: 0
File: src/AGENTS.md:0-0
Timestamp: 2026-02-07T22:53:26.067Z
Learning: Organize code with `llm-coding-tools-serdesai/` containing `src/absolute/`, `src/allowed/`, `src/schema.rs`, and `src/convert.rs` for serdesAI framework implementations
Applied to files:
src/llm-coding-tools-agents/src/parser/mod.rs
🧬 Code graph analysis (1)
src/llm-coding-tools-agents/src/parser/mod.rs (2)
  src/llm-coding-tools-agents/src/parser/preprocessor.rs (1): preprocess_frontmatter_yaml (55-64)
  src/llm-coding-tools-agents/src/loader.rs (2): parse_agent (323-323), None (91-91)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Build and Test (Blocking) (macos-latest, aarch64-apple-darwin, false)
- GitHub Check: Build and Test (Async/Tokio) (ubuntu-latest, x86_64-unknown-linux-gnu, false)
- GitHub Check: Build and Test (Blocking) (windows-latest, x86_64-pc-windows-msvc, false)
- GitHub Check: Build and Test (Async/Tokio) (macos-latest, aarch64-apple-darwin, false)
- GitHub Check: Build and Test (Async/Tokio) (windows-latest, x86_64-pc-windows-msvc, false)
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/llm-coding-tools-agents/src/parser/mod.rs`:
- Around line 238-242: Add a `// SAFETY:` comment immediately above the `unsafe`
block that explains why the raw pointer operations on `vec` are sound: state
that `content.as_mut_vec()` yields a valid mutable Vec, that `start_offset` and
`body_len` are within the original buffer bounds (so
`vec.as_ptr().add(start_offset)` and `vec.as_mut_ptr()` are valid for
reads/writes), that the regions copied do not overlap in an undefined way for
`core::ptr::copy`, and that `vec.set_len(body_len)` is safe because the copied
range has been initialized; reference the local symbols `content`, `vec`,
`start_offset`, `body_len`, `core::ptr::copy`, and `vec.set_len` in the comment.
- Around line 103-127: The code currently creates temporary owned Strings for
keys (permission_key and task_key as Value::String(...)) and uses
root.get(&permission_key) which causes heap allocations on every parse; replace
these lookups with allocation-free checks by iterating the mapping keys and
using Value::as_str() to find entries named "permission" and then inside that
mapping find "task" (e.g. iterate root.as_mapping().iter().find(|(k,_)|
k.as_str()==Some("permission")) to get permission_map, then
permission_map.as_mapping().iter().find(|(k,_)| k.as_str()==Some("task")) to get
task_rule) so you avoid constructing Value::String and eliminate the allocations
while preserving the existing permission_map and task_rule logic.
- Around line 233-236: The early-return condition `if start_offset == 0 &&
body_len == len { return content; }` is dead because `start_offset` (from
`body_start`) is never 0; remove the redundant `start_offset == 0` check and
simplify to `if body_len == len { return content; }` (or, if the original intent
was to test the raw document start, replace `start_offset` with the correct
variable such as `body_start`). Update the check in the parser function around
the variables `start_offset`, `body_len`, `len`, and `content` accordingly to
keep the fast-path correct and readable.
Extract llm-coding-tools-agents crate and CI wiring
Context
This PR is the first extraction from the oversized task-tool branch. It isolates the llm-coding-tools-agents crate work into a focused, reviewable unit based on main.
What This PR Includes
- New crate src/llm-coding-tools-agents for loading OpenCode-style agent markdown files
- Frontmatter parsing modules (parser, preprocessor)
- Typed configuration (AgentConfig, AgentMode, permission rule mapping)
- Catalog and loader types (AgentCatalog, AgentLoader)
- Extension trait (RulesetExt)
- Crate README.md
- Workspace member entry in src/Cargo.toml
- Updated src/Cargo.lock
- CI workflow update (.github/workflows/rust.yml) to include llm-coding-tools-agents in:
- 19d1157 Added: Introduce llm-coding-tools-agents crate and workspace integration
- 9c5773f Changed: Include llm-coding-tools-agents in Rust CI workflow
Intentionally Out of Scope
- llm-coding-tools-models-dev crate introduction and related workflow work
- task-tool runtime integration/refactors outside the agents crate + CI wiring
Validation
- ./.cargo/verify.sh from src/: src/llm-coding-tools-core/src/permissions.rs:75 (bare URL warning)
- cargo test -p llm-coding-tools-agents --quiet ✅
- cargo clippy -p llm-coding-tools-agents --quiet -- -D warnings ✅
- RUSTDOCFLAGS="-D warnings" cargo doc -p llm-coding-tools-agents --no-deps --quiet ✅