
fix: final answer rendering and timing leaks#129

Merged
F16shen merged 6 commits into AI-Shell-Team:main from F16shen:fix/final-answer-rendering-ttft
Apr 23, 2026

Conversation

@F16shen
Collaborator

@F16shen F16shen commented Apr 23, 2026

Summary

  • Problem: tool-call preview content and nested diagnose events could leak into the shell's final-answer path, so summaries rendered as grey streamed text and the shell printed misleading 思考: Xs ("Thinking: Xs") lines.
  • Changes: only mark shell content as streamed for final content deltas, only record TTFT on final content, and restrict SystemDiagnoseAgent event forwarding to tool/error/cancel/interaction events.
  • Related Issue: #
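
The final-delta gating described above can be sketched roughly as follows. This is a minimal illustration, not the code from `src/aish/shell/runtime/app.py`: `ContentDeltaEvent`, `ShellTimingSketch`, `_request_started_at`, and `_ttft_seconds` are hypothetical names, and treating a missing `is_final` key as non-final is an assumption.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContentDeltaEvent:
    """Hypothetical stand-in for the shell's content-delta event."""
    data: dict = field(default_factory=dict)


class ShellTimingSketch:
    """Illustrative only; not the real PTYAIShell."""

    def __init__(self) -> None:
        self._request_started_at = time.monotonic()
        self._content_streamed_to_terminal = False
        self._ttft_seconds: Optional[float] = None

    def handle_content_delta(self, event: ContentDeltaEvent) -> None:
        # Assumption: a missing "is_final" key counts as preview content.
        if not event.data.get("is_final", False):
            # Preview text (for example, content emitted next to a tool call)
            # affects neither the streamed flag nor the TTFT measurement.
            return
        self._content_streamed_to_terminal = True
        if self._ttft_seconds is None:
            # Only the first final delta sets time-to-first-token.
            self._ttft_seconds = time.monotonic() - self._request_started_at
```

With this shape, a turn that only emits preview deltas leaves `_ttft_seconds` as `None`, so there is nothing for an end-of-operation summary to print.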

Change Type

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Other

Scope

  • Core shell / PTY
  • AI agent / LLM
  • Skills / Tools
  • Security
  • Configuration
  • CLI / Interface
  • Packaging / Installation
  • CI/CD
  • Documentation

User-visible Changes

  • Final answers after tool-call flows no longer get treated as grey streamed preview text.
  • Misleading 思考: Xs ("Thinking: Xs") summaries no longer appear when a turn only emitted non-final preview content.
  • system_diagnose_agent no longer leaks nested lifecycle/content events into the outer shell.
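
One way to picture the nested-event filtering is an allowlist wrapped around the parent callback. The sketch below is an assumption-heavy illustration: the enum members, `FORWARDED_EVENT_TYPES`, and `make_filtered_callback` are hypothetical, and the real `LLMEventType` and `LLMCallbackResult` definitions in aish may look different.

```python
from enum import Enum, auto
from typing import Callable


class LLMEventType(Enum):
    # Hypothetical members chosen for illustration.
    OP_START = auto()
    OP_END = auto()
    CONTENT_DELTA = auto()
    TOOL_EXECUTION_START = auto()
    TOOL_EXECUTION_END = auto()
    ERROR = auto()
    CANCELLED = auto()
    INTERACTION_REQUIRED = auto()


class LLMCallbackResult(Enum):
    CONTINUE = auto()
    CANCEL = auto()


# Only tool, error, cancel, and interaction events reach the outer shell.
FORWARDED_EVENT_TYPES = frozenset({
    LLMEventType.TOOL_EXECUTION_START,
    LLMEventType.TOOL_EXECUTION_END,
    LLMEventType.ERROR,
    LLMEventType.CANCELLED,
    LLMEventType.INTERACTION_REQUIRED,
})

Callback = Callable[[LLMEventType, dict], LLMCallbackResult]


def make_filtered_callback(parent: Callback) -> Callback:
    """Wrap a parent callback so non-allowlisted child events never reach it."""

    def proxy(event_type: LLMEventType, payload: dict) -> LLMCallbackResult:
        if event_type not in FORWARDED_EVENT_TYPES:
            # Lifecycle and content events stay inside the nested agent.
            return LLMCallbackResult.CONTINUE
        return parent(event_type, payload)

    return proxy
```

An allowlist (rather than a blocklist) means any event type added later is hidden from the outer shell by default until someone opts it in.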

Compatibility

  • Backward compatible? Yes
  • Config changes? No

Testing

  • /home/lixin/workspace/aishell/aish/.venv/bin/python -m pytest tests/shell/runtime/test_shell_pty_core.py -k "content_streamed or ttft_timing_records_on_first_content_delta or ttft_timing_preserves_state_across_generations"
  • /home/lixin/workspace/aishell/aish/.venv/bin/python -m pytest tests/tools/test_final_answer.py -k "filters_nested_content_and_lifecycle_events or mocked_llm_with_final_answer_after_bash or end_to_end_nested_agent_execution"
  • /home/lixin/workspace/aishell/aish/.venv/bin/python -m pytest tests/shell/runtime/test_shell_pty_core.py -k "ttft_timing_records_on_first_content_delta or non_final_content_delta_does_not_record_ttft or op_end_does_not_render_ttft_for_non_final_preview_only or final_content_delta_marks_content_streamed"

Checklist

  • Code follows project style
  • Tests added if needed
  • Documentation updated if needed

Summary by CodeRabbit

  • Performance Improvements

    • TTFT now records only on final streamed content for more accurate responsiveness metrics.
  • Reliability Improvements

    • Shell startup uses reported readiness and can inherit backend working directory to avoid unnecessary delays.
    • Event forwarding now strictly filters lifecycle events to reduce spurious callbacks.
  • Robustness

    • Improved PTY protocol decoding and error tracking for more resilient terminal sessions.
  • Tests

    • Expanded tests validating startup handshake reuse, streaming/TTFT behavior, and event filtering.

@github-actions github-actions Bot added the tests label Apr 23, 2026
@github-actions
Contributor

Thanks for the pull request. A maintainer will review it when available.

Please keep the PR focused, explain the why in the description, and make sure local checks pass before requesting review.

Contribution guide: https://github.com/AI-Shell-Team/aish/blob/main/CONTRIBUTING.md

@coderabbitai

coderabbitai Bot commented Apr 23, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 70d84ab4-49db-4a09-879d-7769bce8a106

📥 Commits

Reviewing files that changed from the base of the PR and between dc21d3b and 1483815.

📒 Files selected for processing (2)
  • src/aish/shell/runtime/app.py
  • tests/shell/runtime/test_shell_pty_core.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/shell/runtime/test_shell_pty_core.py

📝 Walkthrough

The PR adds PTY startup handshake state exposure in PTYManager, makes PTYAIShell consume those signals (avoiding unconditional sleeps), restricts parent callback forwarding via an LLM event allowlist, and changes TTFT/streaming to record only on final content deltas.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Event Proxy Filtering: src/aish/llm/agents.py | Adds an explicit allowlist of child LLMEventType values forwarded to a parent callback; non-allowlisted events return LLMCallbackResult.CONTINUE without invoking the parent. |
| Content Delta & PTY-driven TTFT: src/aish/shell/runtime/app.py | handle_content_delta reads event.data["is_final"] to set _content_streamed_to_terminal and record TTFT only on final deltas; _setup_pty prefers PTY-provided startup_* signals and startup_cwd over fixed sleep fallback. |
| PTY Startup Handshake Tracking: src/aish/terminal/pty/manager.py | Introduces properties startup_session_ready, startup_prompt_ready, startup_ready, startup_cwd; _wait_ready initializes from stored values, decodes control events (capturing protocol issues), stores ps2/cwd, and persists observed handshake state. |
| Tests (handshake, timing, event filtering): tests/shell/runtime/test_shell_pty_core.py, tests/terminal/pty/test_pty_control_protocol.py, tests/tools/test_final_answer.py | Adds tests for PTY startup reuse and cwd propagation, final vs non-final delta TTFT behavior, draining/poll-mode readiness assertions, and verifies parent callback only receives allowlisted LLM events from nested sessions. |
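
The handshake tracking summarized for `src/aish/terminal/pty/manager.py` might look something like the sketch below. The property names come from the summary above; everything else is hypothetical, including the event dictionary shape, `observe_control_event`, `protocol_issues`, and the assumption that `startup_ready` requires both readiness signals.

```python
from typing import List, Optional


class PTYManagerSketch:
    """Illustrative handshake bookkeeping; not the real PTYManager."""

    def __init__(self) -> None:
        self._startup_session_ready = False
        self._startup_prompt_ready = False
        self._startup_cwd: Optional[str] = None
        self.protocol_issues: List[str] = []

    @property
    def startup_session_ready(self) -> bool:
        return self._startup_session_ready

    @property
    def startup_prompt_ready(self) -> bool:
        return self._startup_prompt_ready

    @property
    def startup_ready(self) -> bool:
        # Assumption: fully ready means both signals were observed.
        return self._startup_session_ready and self._startup_prompt_ready

    @property
    def startup_cwd(self) -> Optional[str]:
        return self._startup_cwd

    def observe_control_event(self, event: dict) -> None:
        """Record one decoded control event (hypothetical dict shape)."""
        kind = event.get("type")
        if kind == "session_ready":
            self._startup_session_ready = True
            # Keep any previously stored cwd if this event carries none.
            self._startup_cwd = event.get("cwd") or self._startup_cwd
        elif kind == "prompt_ready":
            self._startup_prompt_ready = True
        else:
            # Unknown events are tracked rather than raised.
            self.protocol_issues.append(f"unexpected control event: {kind!r}")
```

Persisting the observed state on the manager lets a later caller read it after `start()` returns instead of re-waiting for signals that were already consumed.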

Sequence Diagram

sequenceDiagram
    participant Shell as PTYAIShell
    participant PTYMgr as PTYManager
    participant Proto as PTY Handshake Protocol

    Shell->>PTYMgr: start()
    activate PTYMgr
    PTYMgr->>PTYMgr: reset startup flags
    PTYMgr->>Proto: initialize PTY
    loop handshake polling
        Proto-->>PTYMgr: control events (session_ready, prompt_ready, cwd, ps2)
        PTYMgr->>PTYMgr: decode events, record readiness, cwd, ps2, protocol issues
    end
    PTYMgr-->>Shell: start() returns with startup_* and startup_cwd set
    deactivate PTYMgr

    Shell->>Shell: _setup_pty()
    Shell->>PTYMgr: query startup_ready, startup_session_ready, startup_cwd
    alt startup signals present
        Shell->>Shell: set backend cwd, _backend_session_ready=True, shell_phase="editing", skip sleep
    else fallback
        Shell->>Shell: perform legacy sleep delay
    end
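
The `alt` branch of the diagram can be sketched as below, assuming the shell reads the manager's `startup_*` attributes once during setup. `ShellSetupSketch`, `LEGACY_STARTUP_DELAY_S`, the `"booting"` phase, and the injectable `sleep` parameter are hypothetical and exist only to keep the example self-contained.

```python
import time
from typing import Any, Callable, Optional

# Hypothetical value; the real fallback delay is not stated in this PR page.
LEGACY_STARTUP_DELAY_S = 0.1


class ShellSetupSketch:
    """Illustrative consumer of PTY startup signals; not the real PTYAIShell."""

    def __init__(self, pty_manager: Any,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._pty_manager = pty_manager
        self._sleep = sleep
        self._backend_session_ready = False
        self._shell_phase = "booting"  # hypothetical initial phase
        self.backend_cwd: Optional[str] = None

    def _setup_pty(self) -> None:
        mgr = self._pty_manager
        cwd = getattr(mgr, "startup_cwd", None)
        if cwd:
            # Inherit the working directory reported during the handshake.
            self.backend_cwd = cwd
        if getattr(mgr, "startup_ready", False):
            # Handshake already observed: mark ready and skip the sleep.
            self._backend_session_ready = True
            self._shell_phase = "editing"
            return
        # Fallback for managers that reported no startup signals.
        self._sleep(LEGACY_STARTUP_DELAY_S)
```

Passing `sleep` in as a parameter is a testing convenience in this sketch, so the fallback path can be asserted without actually waiting.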

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰
Handshake hummed, the prompt awakes,
I filter hops and gentle takes,
Deltas final, TTFT sings—
No idle sleeps, the shell now springs,
A tiny rabbit thumbs new things.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 35.14% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title "fix: final answer rendering and timing leaks" directly addresses the main problem solved by this PR: preventing tool-call preview content and nested events from leaking into final-answer rendering and TTFT timing. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@F16shen F16shen changed the title from "Fix final answer rendering and timing leaks" to "fix: final answer rendering and timing leaks" Apr 23, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/aish/shell/runtime/app.py`:
- Around line 1130-1136: The branch only sets _backend_session_ready when
_pty_manager.startup_ready is true, which misses the case where
PTYManager.start() consumed session_ready but didn't set startup_ready; change
the condition to set self._backend_session_ready = True if either
self._pty_manager.startup_ready or self._pty_manager.session_ready is truthy
(i.e., check both flags/properties), so replace the existing if-block that sets
_backend_session_ready and _shell_phase (or set _backend_session_ready
independently) to preserve session readiness even when startup_ready is false.
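
One possible reading of this suggestion, reduced to a pure function so the three cases are easy to compare. `resolve_startup_state` and its return shape are hypothetical; the sketch takes the "set _backend_session_ready independently" option and leaves the phase change tied to `startup_ready`, which is an assumption. The page does not show whether or how the suggestion was adopted.

```python
from typing import Dict, Optional, Union


def resolve_startup_state(
    startup_ready: bool, session_ready: bool
) -> Dict[str, Union[bool, Optional[str]]]:
    """Derive shell flags from the two PTY readiness signals (illustrative)."""
    state: Dict[str, Union[bool, Optional[str]]] = {
        "backend_session_ready": False,
        "shell_phase": None,  # None means "leave the current phase unchanged"
    }
    if startup_ready or session_ready:
        # Session readiness survives even if the full handshake did not finish.
        state["backend_session_ready"] = True
    if startup_ready:
        # Assumption: only a complete handshake moves the shell to editing.
        state["shell_phase"] = "editing"
    return state
```

The middle case, `startup_ready=False` with `session_ready=True`, is the one the comment is concerned about: under the original single condition it would leave the session marked as not ready.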

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: a4869219-4a1a-47dc-9cec-8924aba13564

📥 Commits

Reviewing files that changed from the base of the PR and between e53da97 and dc21d3b.

📒 Files selected for processing (6)
  • src/aish/llm/agents.py
  • src/aish/shell/runtime/app.py
  • src/aish/terminal/pty/manager.py
  • tests/shell/runtime/test_shell_pty_core.py
  • tests/terminal/pty/test_pty_control_protocol.py
  • tests/tools/test_final_answer.py

Comment thread src/aish/shell/runtime/app.py
@F16shen F16shen merged commit 77638d3 into AI-Shell-Team:main Apr 23, 2026
10 checks passed
@F16shen F16shen deleted the fix/final-answer-rendering-ttft branch April 23, 2026 06:27