Skip to content

feat: enhance Langfuse integration, error correction, and tool system#131

Merged
jexShain merged 1 commit intoAI-Shell-Team:rustfrom
jexShain:crates-update-0423
Apr 23, 2026
Merged

feat: enhance Langfuse integration, error correction, and tool system#131
jexShain merged 1 commit intoAI-Shell-Team:rustfrom
jexShain:crates-update-0423

Conversation

@jexShain
Copy link
Copy Markdown
Collaborator

@jexShain jexShain commented Apr 23, 2026

Summary

  • Refactor langfuse module to use langfuse-ergonomic crate with environment variable override support (LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_BASE_URL)
  • Add structured JSON error correction response parsing (ErrorCorrectionResult) with fallback to code block extraction
  • Defer SystemDiagnoseTool registration until after skill loading to properly wire skill callbacks
  • Improve PTY offload command handling
  • Add i18n translations for exit message across all locales (de-DE, es-ES, fr-FR, ja-JP)
  • Streamline bash tool and tool registry construction

Test plan

  • cargo build --release passes with 0 warnings
  • cargo clippy passes
  • cargo test all pass
  • Verify Langfuse integration works with configured keys
  • Verify error correction suggests commands on failed executions
  • Verify skill tool callbacks work within diagnose agent

Summary by CodeRabbit

  • New Features

    • Optional observability integration with Langfuse and richer, structured error-correction results requiring interactive confirmation.
    • Public helpers to get/set current language and display language names.
  • Localization Updates

    • Expanded DE/ES/FR/JA translations to support the enhanced error-correction UI.
  • Improvements

    • Better command-output capture, offload previews, and condensed terminal previews for tool outputs.
    • Limit AI context history to bound message history and improved diagnose tooling behavior.
  • Chores

    • Added workspace dependencies for observability and UUID support.

@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the pull request. A maintainer will review it when available.

Please keep the PR focused, explain the why in the description, and make sure local checks pass before requesting review.

Contribution guide: https://github.com/AI-Shell-Team/aish/blob/main/CONTRIBUTING.md

@github-actions
Copy link
Copy Markdown
Contributor

This pull request description looks incomplete. Please update the missing sections below before review.

Missing items:

  • User-visible Changes
  • Compatibility
  • Testing
  • Change Type
  • Scope

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 23, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

Adds workspace langfuse-ergonomic, refactors Langfuse integration to use the ergonomic client with env-var precedence and fire-and-forget spans, converts error-correction to JSON-first parsing with interactive confirmation, extends i18n API, enforces LLM context trimming, revises offload/PTY handling, and exposes skill callbacks for diagnose tooling.

Changes

Cohort / File(s) Summary
Workspace & deps
Cargo.toml, crates/aish-llm/Cargo.toml, crates/aish-tools/Cargo.toml
Adds langfuse-ergonomic = "0.6" to workspace dependencies and marks uuid as workspace-managed for aish-tools.
CLI / Shell init
crates/aish-cli/src/main.rs, crates/aish-shell/src/app.rs
CLI now resolves Langfuse creds/URL from env vars (LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_BASE_URL) with fallbacks; shell optionally constructs a Langfuse client and attaches it to LlmSession; tool registration/order changed for diagnose tool.
Langfuse integration
crates/aish-llm/src/langfuse.rs, crates/aish-llm/src/session.rs
Replaces custom ingestion with langfuse_ergonomic::LangfuseClient (Arc-wrapped); config resolution updated (env priority, trim trailing /, default URL); trace methods spawn background tasks; span_generation now accepts serde_json::Value; flush/shutdown are no-ops; session now forwards message JSON and token usage; tests updated.
LLM agents & diagnose
crates/aish-llm/src/agent.rs, crates/aish-llm/src/diagnose_agent.rs, crates/aish-tools/src/system_diagnose.rs
Adds AgentConfig.max_context_messages (default 30) and trims message history; DiagnoseAgent prefers provided system_prompt; system prompt/tool naming updated (bash); ReAct iterations increased and system diagnose supports optional skill callbacks.
Error correction & prompts
crates/aish-prompts/src/manager.rs, crates/aish-shell/src/ai_handler.rs
Prompts changed to require a fixed JSON corrected_command schema; AiHandler now returns ErrorCorrectionResult { command, description } and parses fenced JSON first with fallbacks; auto-execution removed in favor of interactive confirmation; tests updated.
I18n additions & API
crates/aish-i18n/src/lib.rs, crates/aish-i18n/src/manager.rs, crates/aish-i18n/locales/...
Adds current_language() and language_name() helpers, exposes detect_locale() and current_locale() accessor; adds locale strings (de, es, fr, ja) for corrected-command UI and retry hints.
Offload & PTY output
crates/aish-pty/src/lib.rs, crates/aish-pty/src/offload.rs, crates/aish-tools/src/bash.rs, crates/aish-tools/src/registry.rs
Exports additional Bash offload types; BashOutputOffload::render now always returns structured JSON payloads for inline/failed/offloaded cases; PTY capture increased (10MB) and truncation replaced by offloader previews; format_tagged_result now includes <return_code> and embeds serialized offload JSON; public BashTool API simplified.
Shell integration & UI
crates/aish-shell/src/app.rs
Defers diagnose tool instantiation until after skills load (wires skill callbacks), prints plain-text preview for bash outputs, always injects [Shell] memory entry, and updates error-correction UI to require confirmation and show retry hints.
Public re-exports & types
crates/aish-pty/src/lib.rs, crates/aish-shell/src/ai_handler.rs
Adds re-exports for BashOffloadResult, BashOffloadSettings, BashOutputOffload; introduces ErrorCorrectionResult and updates AiHandler signature accordingly.

Sequence Diagram(s)

sequenceDiagram
  participant App as Shell App
  participant Session as LlmSession
  participant Langfuse as Langfuse Ergonomic Client
  participant Remote as Langfuse Service

  App->>Session: initialize (maybe with LangfuseConfig)
  Session->>Langfuse: LangfuseClient::new(...) (env-priority)
  Langfuse->>Langfuse: wrap ergonomic client in Arc
  note over Session,Langfuse: runtime tracing
  Session->>Langfuse: trace/span_generation(json messages, usage)
  Langfuse->>Langfuse: spawn tokio task (fire-and-forget)
  Langfuse->>Remote: async HTTP event (background)
  Remote-->>Langfuse: HTTP response (not awaited)
  Langfuse->>App: warn! on send error (logged)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 I hopped through crates with nimble feet,
Langfuse tucked in, env vars neat,
JSON corrections whisper clear,
Offloads nest when outputs leer,
Skills and context trimmed — a rabbit's treat! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main changes: Langfuse integration refactoring, error correction improvements, and tool system enhancements across multiple crates.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/aish-cli/src/main.rs (1)

465-479: ⚠️ Potential issue | 🟠 Major

CLI Langfuse overrides still lose to environment variables.

check_langfuse resolves CLI > env > config here, but LangfuseConfig::from_parts(...) immediately re-reads LANGFUSE_* and overwrites those resolved values again. In practice, --public-key, --secret-key, and --host still won't win when the env vars are set. Build the config from the already-resolved values here, or add a helper that skips env lookup.

Possible fix
-            let lf_config =
-                aish_llm::LangfuseConfig::from_parts(Some(&pk), Some(&sk), host.as_deref());
-            match lf_config {
-                Some(cfg) => {
-                    let base_url = cfg.base_url.clone();
-                    let _client = aish_llm::LangfuseClient::new(cfg);
+            if pk.is_empty() || sk.is_empty() {
+                eprintln!("Langfuse configuration is incomplete.");
+                return;
+            }
+
+            let cfg = aish_llm::LangfuseConfig {
+                enabled: true,
+                public_key: pk.clone(),
+                secret_key: sk.clone(),
+                base_url: host
+                    .as_deref()
+                    .map(|h| h.trim_end_matches('/').to_string())
+                    .filter(|h| !h.is_empty())
+                    .unwrap_or_else(|| "https://cloud.langfuse.com".to_string()),
+            };
+
+            {
+                let base_url = cfg.base_url.clone();
+                let _client = aish_llm::LangfuseClient::new(cfg);
                     println!("Langfuse configuration found.");
                     println!("  Host: {}", base_url);
                     if pk.len() > 8 {
                         println!("  Public Key: {}...{}", &pk[..4], &pk[pk.len() - 4..]);
                     } else {
@@
-                }
-                None => {
-                    eprintln!("Langfuse configuration is incomplete.");
-                }
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/aish-cli/src/main.rs` around lines 465 - 479, The resolved precedence
(CLI > env > config) is being clobbered because aish-cli calls
aish_llm::LangfuseConfig::from_parts(...) which re-reads environment variables;
instead, build the Langfuse config from the already-resolved values (public_key,
secret_key, host) so CLI flags win: replace the call to
LangfuseConfig::from_parts in the match arm that creates lf_config with a
constructor or direct struct creation that accepts the resolved pk/sk/host (or
add a new LangfuseConfig::from_resolved(...) helper that does not read env),
referencing the variables public_key/secret_key/host and the symbol
LangfuseConfig so the final lf_config uses the values you computed above rather
than re-reading LANGFUSE_* env vars.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/aish-i18n/src/lib.rs`:
- Around line 69-71: current_language() incorrectly re-runs
I18nManager::detect_locale() instead of returning the active runtime locale, so
it ignores set_locale() changes; update current_language() to read the
thread/global manager's active locale (via MANAGER or an accessor on I18nManager
such as current_locale()/get_locale()) rather than calling
I18nManager::detect_locale(), ensuring the returned String comes from the
manager instance used by set_locale().

In `@crates/aish-llm/src/agent.rs`:
- Around line 339-343: The code uses self.config.max_context_messages and
computes max - 1 without validating it's >=1, which can panic for 0; ensure max
is clamped or validated before use (either in AgentConfig builder or at start of
the trimming logic). Modify the trimming in the function containing
messages.drain(...) to compute let max = self.config.max_context_messages.max(1)
(or return early for <=1) and then use messages.drain(1..messages.len() - (max -
1)); also make the same change in the other trimming block around lines 396-400
so both paths treat 0/1 consistently and avoid subtracting from zero.
- Around line 334-337: The user prompt unconditionally instructs the model to
call the final_answer tool; change the logic in ReActAgent where
messages.push(ChatMessage::user(...)) is added (the continuation block) to check
the tool_specs collection for a tool named "final_answer" and only push the
instruction that explicitly references the final_answer tool when that tool is
present; otherwise push a fallback user prompt with plain "Final Answer:"
guidance so the message matches the actual available tools.
- Around line 339-350: The current trimming logic that drains messages by raw
count (using messages.drain(1..messages.len() - (max - 1)) around
self.config.max_context_messages) can break tool-call history invariants by
separating an assistant tool-call message from its subsequent tool result
messages; update the trimming to operate on conversation-turn boundaries or to
detect and preserve assistant "tool" messages together with their following
"tool" result messages (and the originating assistant message), e.g., iterate
backwards trimming whole turns until under max, or if encountering an assistant
message with a tool call include its subsequent tool result entries before
cutting, and apply the same fix to the analogous block around lines 396-407 that
uses the same messages draining logic.

In `@crates/aish-llm/src/langfuse.rs`:
- Around line 239-283: The tests for LangfuseConfig::from_parts are flaky
because they rely on global LANGFUSE_* env vars; update the test-suite so each
test saves, clears, and restores relevant env entries (LANGFUSE_PUBLIC_KEY,
LANGFUSE_SECRET_KEY, LANGFUSE_BASE_URL) around calls to
LangfuseConfig::from_parts, and add one dedicated test that sets those env vars
and verifies they override explicit arguments; ensure env-mutating tests run
serially (e.g., use a serial test attribute or run in a single-threaded test
cfg) so global state cannot leak between tests and assertions remain
deterministic.
- Around line 54-57: Sanitize and validate any LANGFUSE_BASE_URL env override
the same way as the host parameter instead of accepting it verbatim and calling
expect; trim trailing slashes and whitespace, reject empty strings, parse the
result with url::Url::parse (or equivalent) to ensure it's a valid URL, and if
parsing fails return an Err or log and abort client construction instead of
panicking (this affects the base_url assignment where LANGFUSE_BASE_URL is read
and the client construction in new()). Apply the same validation/sanitization to
the other similar block referenced (lines ~95-101) so both places enforce
non-empty, well-formed URLs and propagate failures instead of calling expect.
- Around line 118-129: The spawned trace tasks (tokio::spawn wrapping
client.trace().call()) are detached so flush()/shutdown() cannot guarantee
delivery; modify the implementation to track pending tasks (e.g., store
JoinHandle or use tokio::task::JoinSet) when creating tasks for
client.trace().call() instead of fully-detaching them, expose a draining await
in flush()/shutdown() that awaits all pending handles (or cancels+awaits them)
and apply the same change to every occurrence of tokio::spawn calling
client.trace().call() (the sites shown around the current block and the other
similar blocks noted) so callers can await completion before process exit.

In `@crates/aish-prompts/src/manager.rs`:
- Around line 130-133: The prompt rule references "the shell history context
above" which doesn't exist in the template; update the prompt template in
crates/aish-prompts/src/manager.rs (the prompt string/constant that contains the
JSON-instruction block) to either include a concrete shell history placeholder
section or remove/replace that rule. Specifically, edit the prompt text where
the lines "- Reference the shell history context above for the full error output
and previous commands" appear: either add a clearly named placeholder like
"{shell_history}" with guidance for when it can be empty, or delete that bullet
so the template doesn't require nonexistent context; ensure any downstream code
that fills the template (the function/constant that builds the prompt) is
adjusted to pass an empty string if no history is available.

---

Outside diff comments:
In `@crates/aish-cli/src/main.rs`:
- Around line 465-479: The resolved precedence (CLI > env > config) is being
clobbered because aish-cli calls aish_llm::LangfuseConfig::from_parts(...) which
re-reads environment variables; instead, build the Langfuse config from the
already-resolved values (public_key, secret_key, host) so CLI flags win: replace
the call to LangfuseConfig::from_parts in the match arm that creates lf_config
with a constructor or direct struct creation that accepts the resolved
pk/sk/host (or add a new LangfuseConfig::from_resolved(...) helper that does not
read env), referencing the variables public_key/secret_key/host and the symbol
LangfuseConfig so the final lf_config uses the values you computed above rather
than re-reading LANGFUSE_* env vars.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 62a463f7-8cf6-4141-8730-58195ee5423b

📥 Commits

Reviewing files that changed from the base of the PR and between f7e1777 and 75277f2.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (22)
  • Cargo.toml
  • crates/aish-cli/src/main.rs
  • crates/aish-i18n/locales/de-DE.yaml
  • crates/aish-i18n/locales/es-ES.yaml
  • crates/aish-i18n/locales/fr-FR.yaml
  • crates/aish-i18n/locales/ja-JP.yaml
  • crates/aish-i18n/src/lib.rs
  • crates/aish-i18n/src/manager.rs
  • crates/aish-llm/Cargo.toml
  • crates/aish-llm/src/agent.rs
  • crates/aish-llm/src/diagnose_agent.rs
  • crates/aish-llm/src/langfuse.rs
  • crates/aish-llm/src/session.rs
  • crates/aish-prompts/src/manager.rs
  • crates/aish-pty/src/lib.rs
  • crates/aish-pty/src/offload.rs
  • crates/aish-shell/src/ai_handler.rs
  • crates/aish-shell/src/app.rs
  • crates/aish-tools/Cargo.toml
  • crates/aish-tools/src/bash.rs
  • crates/aish-tools/src/registry.rs
  • crates/aish-tools/src/system_diagnose.rs

Comment thread crates/aish-i18n/src/lib.rs
Comment thread crates/aish-llm/src/agent.rs
Comment thread crates/aish-llm/src/agent.rs
Comment thread crates/aish-llm/src/agent.rs
Comment thread crates/aish-llm/src/langfuse.rs
Comment thread crates/aish-llm/src/langfuse.rs
Comment thread crates/aish-llm/src/langfuse.rs
Comment thread crates/aish-prompts/src/manager.rs
- Refactor langfuse module to use langfuse-ergonomic crate with env var support
- Add structured JSON error correction response parsing with fallback
- Defer SystemDiagnoseTool registration to wire skill callbacks
- Improve PTY offload command handling
- Add i18n translations for exit message across locales
- Streamline bash tool and tool registry construction
@jexShain jexShain force-pushed the crates-update-0423 branch from 25787b4 to 3bdbe62 Compare April 23, 2026 07:07
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/aish-llm/src/session.rs (1)

295-304: ⚠️ Potential issue | 🟠 Major

This now sends full tool/message history to Langfuse without redaction.

serde_json::json!(messages) serializes the complete conversation state, including all accumulated tool execution results. These results can contain command output, file contents, and other sensitive material. The export happens automatically whenever the final response has no tool calls, with no mechanism to redact or filter sensitive fields before transmission to the external observability platform.

Also applies to: 521-528

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/aish-llm/src/session.rs` around lines 295 - 304, The current call to
langfuse.span_generation is serializing the entire conversation via
serde_json::json!(messages), leaking tool outputs and sensitive fields; replace
that with a redacted representation before serialization: implement or call a
sanitizer (e.g., redact_tool_outputs_from_messages or build_redacted_messages)
that strips or masks tool execution results, file contents, and any sensitive
metadata from the messages vector, then pass
serde_json::json!(redacted_messages) to span_generation (still include
client.model_name(), tid, pt, ct, and content as before); apply the same change
for the other occurrence around the span_generation call at the later block
(lines mentioned) so no full tool/message history is sent to Langfuse.
♻️ Duplicate comments (4)
crates/aish-i18n/src/lib.rs (1)

69-71: ⚠️ Potential issue | 🟠 Major

current_language() is out of sync with runtime locale state.

Line 69-71 re-detects locale instead of reading the active thread-local manager state set via set_locale(), so callers can get a different language than the one currently in use.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/aish-i18n/src/lib.rs` around lines 69 - 71, current_language()
incorrectly re-runs I18nManager::detect_locale() instead of returning the active
thread-local manager state set by set_locale(); change it to read the locale
from the thread-local manager (e.g. use the I18nManager current/with_current API
to access the active manager and return its locale, such as
I18nManager::with_current(|mgr| mgr.locale().to_string()) or the equivalent
method on I18nManager) so callers get the actual runtime locale rather than a
fresh detection.
crates/aish-llm/src/langfuse.rs (3)

40-57: ⚠️ Potential issue | 🟠 Major

Reject invalid LANGFUSE_BASE_URL instead of panicking.

LANGFUSE_BASE_URL is accepted verbatim here, but client construction still ends in build().expect(...). A blank or malformed env override now turns best-effort observability into a startup crash. Trim and validate the override before building the client, and fail closed instead of keeping a panic path.

Also applies to: 91-101

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/aish-llm/src/langfuse.rs` around lines 40 - 57, The current
constructor reads LANGFUSE_BASE_URL verbatim into base_url and later calls
build().expect(...), which can panic on blank/malformed overrides; update the
logic around base_url (the LANGFUSE_BASE_URL env handling and the variable
base_url) to trim and validate the URL (reject empty/invalid URLs), and return
None (or an Err) instead of letting build().expect(...) panic; ensure the same
validation is applied in the later block referenced by lines 91-101 so any
invalid env override fails closed rather than causing a startup panic.

239-281: ⚠️ Potential issue | 🟡 Minor

These tests are ambient-env dependent now.

from_parts() now prefers LANGFUSE_*, so any exported variables in a developer shell or CI will change these assertions. Save/clear/restore the relevant env vars around each case, and serialize env-mutating tests so they stay deterministic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/aish-llm/src/langfuse.rs` around lines 239 - 281, Tests calling
LangfuseConfig::from_parts are flaky because from_parts now prefers LANGFUSE_*
env vars; update each test (e.g., test_config_from_parts_with_values,
test_config_from_parts_missing_keys, test_config_from_parts_empty_keys,
test_config_default_base_url, test_config_strips_trailing_slash) to save any
existing LANGFUSE_PUBLIC_KEY/LANGFUSE_SECRET_KEY/LANGFUSE_BASE_URL, clear/unset
them before asserting, set only the env vars needed for the case (or explicitly
unset when testing None), and restore the originals at the end; also serialize
these tests (use a test-serialization mechanism such as the serial_test crate or
a single combined test) so environment mutations cannot run in parallel and make
assertions deterministic.

107-129: ⚠️ Potential issue | 🟠 Major

flush()/shutdown() no longer guarantee telemetry delivery.

All trace/span writes are detached with tokio::spawn, and no pending task state is retained. Short-lived CLI commands can exit before those tasks run, so the current no-op lifecycle methods silently drop spans. Please track pending work and drain it in flush()/shutdown().

Also applies to: 134-181, 184-231

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/aish-llm/src/langfuse.rs` around lines 107 - 129, trace_session (and
similar fire-and-forget blocks) spawns tasks with tokio::spawn and drops their
JoinHandles so flush()/shutdown() cannot await them; modify the struct to hold a
shared task tracker (e.g., an Arc<Mutex/Atomic/JoinSet/Vec<JoinHandle<()>>>
stored on the Langfuse client wrapper), push each spawned task's JoinHandle into
that tracker instead of discarding it in trace_session and the other locations
noted (the similar fire-and-forget calls around lines 134-181 and 184-231), and
implement flush()/shutdown() to drain/await all tracked handles (with
timeout/error handling) so telemetry is guaranteed to be delivered before exit.
🧹 Nitpick comments (1)
crates/aish-shell/src/app.rs (1)

341-377: Use live skill callbacks here instead of a second startup snapshot.

This snapshot is frozen at shell startup, but SkillHotReloader is already running earlier in AishShell::new(). After a reload, the diagnose tool will keep listing/executing stale skill content. Reusing live lookup/list callbacks from the same shared skill source would keep diagnose behavior aligned with the rest of the shell.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/aish-shell/src/app.rs` around lines 341 - 377, Summary: The diagnose
tool currently captures a frozen snapshot of skills at startup; switch it to use
the live skill lookup/list so it reflects SkillHotReloader updates. Fix: instead
of building diag_skills/diag_skill_names from skill_manager.list_skills(),
create diag_lookup and diag_list closures that call the live skill manager
methods (e.g., capture an Arc/clone of skill_manager and have diag_lookup call
its lookup/get method to return Option<aish_tools::SkillInfo> and diag_list call
its list_names/list_skills method to return Vec<String>), then pass those
closures as the SkillCallbacks to SystemDiagnoseTool::with_skill_callbacks (same
spot where diag_tool is created); ensure you clone or Arc::clone the
skill_manager into the closures and adapt return types to match
aish_tools::SkillInfo and the Fn signatures used for SkillCallbacks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/aish-pty/src/offload.rs`:
- Around line 643-658: The current offload path only marks failure if
create_dir_all(&offload_dir) fails but still reports offloaded: true when
subsequent file writes fail; update the write logic in the function that builds
BashOffloadResult so that every write of stdout.txt, stderr.txt, and result.json
is checked for errors (e.g., File::write/rename/flush results) and if any of
those writes fail change offloaded to false and populate offload_payload with a
"failed" status and the specific error(s) (or aggregate error message),
returning the preview fallback like the create_dir_all branch; apply the same
change to the second offload block referenced (the similar code around the other
write sequence) so no file-write failure ever reports offloaded: true and
missing/misleading paths.

---

Outside diff comments:
In `@crates/aish-llm/src/session.rs`:
- Around line 295-304: The current call to langfuse.span_generation is
serializing the entire conversation via serde_json::json!(messages), leaking
tool outputs and sensitive fields; replace that with a redacted representation
before serialization: implement or call a sanitizer (e.g.,
redact_tool_outputs_from_messages or build_redacted_messages) that strips or
masks tool execution results, file contents, and any sensitive metadata from the
messages vector, then pass serde_json::json!(redacted_messages) to
span_generation (still include client.model_name(), tid, pt, ct, and content as
before); apply the same change for the other occurrence around the
span_generation call at the later block (lines mentioned) so no full
tool/message history is sent to Langfuse.

---

Duplicate comments:
In `@crates/aish-i18n/src/lib.rs`:
- Around line 69-71: current_language() incorrectly re-runs
I18nManager::detect_locale() instead of returning the active thread-local
manager state set by set_locale(); change it to read the locale from the
thread-local manager (e.g. use the I18nManager current/with_current API to
access the active manager and return its locale, such as
I18nManager::with_current(|mgr| mgr.locale().to_string()) or the equivalent
method on I18nManager) so callers get the actual runtime locale rather than a
fresh detection.

In `@crates/aish-llm/src/langfuse.rs`:
- Around line 40-57: The current constructor reads LANGFUSE_BASE_URL verbatim
into base_url and later calls build().expect(...), which can panic on
blank/malformed overrides; update the logic around base_url (the
LANGFUSE_BASE_URL env handling and the variable base_url) to trim and validate
the URL (reject empty/invalid URLs), and return None (or an Err) instead of
letting build().expect(...) panic; ensure the same validation is applied in the
later block referenced by lines 91-101 so any invalid env override fails closed
rather than causing a startup panic.
- Around line 239-281: Tests calling LangfuseConfig::from_parts are flaky
because from_parts now prefers LANGFUSE_* env vars; update each test (e.g.,
test_config_from_parts_with_values, test_config_from_parts_missing_keys,
test_config_from_parts_empty_keys, test_config_default_base_url,
test_config_strips_trailing_slash) to save any existing
LANGFUSE_PUBLIC_KEY/LANGFUSE_SECRET_KEY/LANGFUSE_BASE_URL, clear/unset them
before asserting, set only the env vars needed for the case (or explicitly unset
when testing None), and restore the originals at the end; also serialize these
tests (use a test-serialization mechanism such as the serial_test crate or a
single combined test) so environment mutations cannot run in parallel and make
assertions deterministic.
- Around line 107-129: trace_session (and similar fire-and-forget blocks) spawns
tasks with tokio::spawn and drops their JoinHandles so flush()/shutdown() cannot
await them; modify the struct to hold a shared task tracker (e.g., an
Arc<Mutex/Atomic/JoinSet/Vec<JoinHandle<()>>> stored on the Langfuse client
wrapper), push each spawned task's JoinHandle into that tracker instead of
discarding it in trace_session and the other locations noted (the similar
fire-and-forget calls around lines 134-181 and 184-231), and implement
flush()/shutdown() to drain/await all tracked handles (with timeout/error
handling) so telemetry is guaranteed to be delivered before exit.

---

Nitpick comments:
In `@crates/aish-shell/src/app.rs`:
- Around line 341-377: Summary: The diagnose tool currently captures a frozen
snapshot of skills at startup; switch it to use the live skill lookup/list so it
reflects SkillHotReloader updates. Fix: instead of building
diag_skills/diag_skill_names from skill_manager.list_skills(), create
diag_lookup and diag_list closures that call the live skill manager methods
(e.g., capture an Arc/clone of skill_manager and have diag_lookup call its
lookup/get method to return Option<aish_tools::SkillInfo> and diag_list call its
list_names/list_skills method to return Vec<String>), then pass those closures
as the SkillCallbacks to SystemDiagnoseTool::with_skill_callbacks (same spot
where diag_tool is created); ensure you clone or Arc::clone the skill_manager
into the closures and adapt return types to match aish_tools::SkillInfo and the
Fn signatures used for SkillCallbacks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 52576a60-5c0a-493d-9e06-51e52de7524a

📥 Commits

Reviewing files that changed from the base of the PR and between 75277f2 and 25787b4.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (22)
  • Cargo.toml
  • crates/aish-cli/src/main.rs
  • crates/aish-i18n/locales/de-DE.yaml
  • crates/aish-i18n/locales/es-ES.yaml
  • crates/aish-i18n/locales/fr-FR.yaml
  • crates/aish-i18n/locales/ja-JP.yaml
  • crates/aish-i18n/src/lib.rs
  • crates/aish-i18n/src/manager.rs
  • crates/aish-llm/Cargo.toml
  • crates/aish-llm/src/agent.rs
  • crates/aish-llm/src/diagnose_agent.rs
  • crates/aish-llm/src/langfuse.rs
  • crates/aish-llm/src/session.rs
  • crates/aish-prompts/src/manager.rs
  • crates/aish-pty/src/lib.rs
  • crates/aish-pty/src/offload.rs
  • crates/aish-shell/src/ai_handler.rs
  • crates/aish-shell/src/app.rs
  • crates/aish-tools/Cargo.toml
  • crates/aish-tools/src/bash.rs
  • crates/aish-tools/src/registry.rs
  • crates/aish-tools/src/system_diagnose.rs
✅ Files skipped from review due to trivial changes (6)
  • Cargo.toml
  • crates/aish-i18n/src/manager.rs
  • crates/aish-i18n/locales/de-DE.yaml
  • crates/aish-pty/src/lib.rs
  • crates/aish-i18n/locales/ja-JP.yaml
  • crates/aish-tools/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (7)
  • crates/aish-llm/Cargo.toml
  • crates/aish-prompts/src/manager.rs
  • crates/aish-i18n/locales/es-ES.yaml
  • crates/aish-tools/src/system_diagnose.rs
  • crates/aish-llm/src/agent.rs
  • crates/aish-shell/src/ai_handler.rs
  • crates/aish-tools/src/registry.rs

Comment on lines 643 to +658
if let Err(e) = fs::create_dir_all(&offload_dir) {
tracing::warn!("failed to create offload dir: {e}");
// Offload failed: return preview with failed status
let (stdout_preview, _) =
truncate_utf8_safe(stdout.as_bytes(), self.settings.preview_bytes);
let (stderr_preview, _) =
truncate_utf8_safe(stderr.as_bytes(), self.settings.preview_bytes);
return BashOffloadResult {
stdout_text: stdout.to_string(),
stderr_text: stderr.to_string(),
stdout_text: String::from_utf8_lossy(&stdout_preview).to_string(),
stderr_text: String::from_utf8_lossy(&stderr_preview).to_string(),
offloaded: false,
offload_payload: None,
offload_payload: Some(serde_json::json!({
"status": "failed",
"error": e.to_string(),
"hint": "Output shown as preview only"
})),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't report offloaded when the payload files were only written best-effort.

Only directory-creation failure is downgraded to status: "failed". If stdout.txt, stderr.txt, or result.json fails to write later, this still returns status: "offloaded" plus paths/hints that may not exist. That makes downstream tooling trust incomplete output. Please downgrade to a failed payload, or surface per-file write errors, whenever any of the offload writes fails.

Also applies to: 739-755

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/aish-pty/src/offload.rs` around lines 643 - 658, The current offload
path only marks failure if create_dir_all(&offload_dir) fails but still reports
offloaded: true when subsequent file writes fail; update the write logic in the
function that builds BashOffloadResult so that every write of stdout.txt,
stderr.txt, and result.json is checked for errors (e.g.,
File::write/rename/flush results) and if any of those writes fail change
offloaded to false and populate offload_payload with a "failed" status and the
specific error(s) (or aggregate error message), returning the preview fallback
like the create_dir_all branch; apply the same change to the second offload
block referenced (the similar code around the other write sequence) so no
file-write failure ever reports offloaded: true and missing/misleading paths.

@jexShain jexShain merged commit 4d9cf0b into AI-Shell-Team:rust Apr 23, 2026
6 of 7 checks passed
@jexShain jexShain deleted the crates-update-0423 branch April 24, 2026 03:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant