Fix interactive YOLO plan review semantics #1836

Open
WeZZard wants to merge 3 commits into MoonshotAI:main from WeZZard:fix-interactive-yolo-review

Conversation


@WeZZard commented Apr 11, 2026

Summary

  • separate interactive user-feedback availability from YOLO auto-approval
  • keep tool approvals auto-approved in YOLO while preserving AskUserQuestion and plan review in interactive shell mode
  • clarify print-mode docs to describe non-interactive auto-resolution behavior
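
The separation described in the summary can be sketched as two independent axes: whether tool calls are auto-approved (YOLO) and whether a human is available to answer (interactive). The names below are hypothetical, not the actual kimi-cli API:

```python
from dataclasses import dataclass


@dataclass
class ApprovalPolicy:
    """Hypothetical model of the PR's semantics: YOLO auto-approval and
    interactive feedback availability are independent flags."""

    yolo: bool         # auto-approve tool calls
    interactive: bool  # a human is attached and can respond

    def resolve(self, request: str) -> str:
        # Tool approvals: YOLO auto-approves regardless of interactivity.
        if request == "tool_approval" and self.yolo:
            return "auto_approve"
        # User-facing requests (AskUserQuestion, plan review) still go to
        # the human whenever one is available, even under YOLO.
        if request in ("ask_user_question", "plan_review"):
            return "ask_user" if self.interactive else "auto_resolve"
        return "ask_user" if self.interactive else "auto_approve"


# Interactive YOLO: tools auto-approved, but plan review stays human-reviewed.
policy = ApprovalPolicy(yolo=True, interactive=True)
assert policy.resolve("tool_approval") == "auto_approve"
assert policy.resolve("plan_review") == "ask_user"
```

In print mode the same policy would run with `interactive=False`, so plan review falls through to auto-resolution, matching the docs clarification in the third bullet.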

Rationale

  • Interactive YOLO is still an interactive workflow. If a user already has a finished plan and wants the agent to run unattended, they can use a non-interactive YOLO mode such as print or automation flows. But when the agent is launched in interactive YOLO mode, the plan should still be reviewed and adjusted, because an AI-generated plan is rarely perfect on the first pass. This matches the expected behavior of major coding agents.
  • Plan review, plan revisions, and explicit human feedback are high-quality supervision signals. Preserving that loop is important not only for usability, but also because those review decisions and comments can become valuable training data for future model improvement.

Verification

  • uv run pytest tests/tools/test_ask_user.py tests/tools/test_plan_yolo.py tests/core/test_yolo_injection.py tests/ui_and_conv/test_shell_prompt_echo.py tests/ui_and_conv/test_replay.py tests/ui_and_conv/test_visualize_running_prompt.py tests/ui_and_conv/test_shell_run_placeholders.py tests/e2e/test_shell_modal_e2e.py -q -k 'ask_user or plan_yolo or yolo_injection or shell_prompt_echo or replay or visualize_running_prompt or shell_run_placeholders or yolo_enter_plan_still_requests_review or yolo_exit_plan_still_requests_review'
  • uv run ruff check src/kimi_cli/ui/shell/echo.py tests/ui_and_conv/test_shell_prompt_echo.py tests/ui_and_conv/test_replay.py tests/ui_and_conv/test_visualize_running_prompt.py tests/ui_and_conv/test_shell_run_placeholders.py

Visual Outcomes

[Screenshot: 2026-04-11, 10:32 PM]

@WeZZard force-pushed the fix-interactive-yolo-review branch from da40dc3 to 9d01429 (April 11, 2026 14:26)
@WeZZard changed the title from "Fix interactive YOLO plan review and user echo" to "Fix interactive YOLO plan review semantics" (Apr 11, 2026)
@WeZZard marked this pull request as ready for review (April 11, 2026 14:45)
Copilot AI review requested due to automatic review settings (April 11, 2026 14:45)

This comment was marked as resolved.

Contributor

@devin-ai-integration (bot) left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.


chatgpt-codex-connector[bot]

This comment was marked as resolved.

@WeZZard force-pushed the fix-interactive-yolo-review branch from 9d01429 to 9e3eaa0 (April 11, 2026 16:06)
chatgpt-codex-connector[bot]

This comment was marked as resolved.


@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 43d913d85b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +64 to +65
if soul.can_request_user_feedback:
    return []


P2 Badge Avoid persisting non-interactive ACP guidance in history

NonInteractiveModeInjectionProvider now injects whenever can_request_user_feedback is false, and ACP sessions are explicitly created with that flag disabled (src/kimi_cli/acp/server.py). Because injections are appended into conversation history (via KimiSoul context writes), this “Do NOT call AskUserQuestion” reminder survives session continuation; if the same session is later resumed in interactive shell/wire mode, the model still sees stale non-interactive instructions and may skip plan review/questions even though feedback is available. This regression newly affects default (non‑YOLO) ACP sessions, which previously did not receive a non-interactive reminder.


Author

Addressed with an explicit regression test and tightened invariants. Dynamic injections are now verified to be step-local only: the reminder is present in the model-facing effective history for the current step, but it is not appended to Context.history, not written into the context file, and not restored on a fresh Context.restore() from disk. See tests/core/test_kimisoul_steer.py::test_step_keeps_runtime_specific_injections_out_of_persisted_history in commit c9460e9c.
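
The step-local invariant described in the reply above — reminder visible to the model for the current step, but never persisted — can be sketched as follows (a minimal sketch with illustrative names, not the actual kimi-cli implementation):

```python
def build_effective_messages(
    history: list[dict], injections: list[str]
) -> list[dict]:
    """Build the model-facing message list for one step without mutating
    the persisted history (step-local injection pattern; illustrative)."""
    effective = list(history)  # shallow copy: persisted history untouched
    if injections:
        combined = "\n".join(
            f"<system-reminder>{text}</system-reminder>" for text in injections
        )
        effective.append({"role": "user", "content": combined})
    return effective


history = [{"role": "user", "content": "hi"}]
step_messages = build_effective_messages(
    history, ["Do NOT call AskUserQuestion."]
)
assert len(step_messages) == 2  # reminder visible in this step's prompt
assert len(history) == 1        # but never appended to persisted history
```

Because the reminder lives only in the step-local copy, a session restored from disk in interactive mode never sees stale non-interactive instructions, which is the regression the P2 finding flagged.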


@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c9460e9c57


Comment on lines +838 to 842
  effective_messages = list(self._context.history)
  if injections:
      combined_reminders = "\n".join(system_reminder(inj.content).text for inj in injections)
-     await self._context.append_message(
+     effective_messages.append(
          Message(


P1 Badge Restore plan reminder throttling in plan mode

PlanModeInjectionProvider.get_injections() throttles by scanning history for previous plan reminders, but this change makes injections step-local by copying self._context.history and appending reminders only to effective_messages. Because reminders are no longer persisted in context, plan-mode turns after the first never contain a prior reminder marker, so the provider treats each step as first-time and re-injects the full plan reminder every step, which can quickly inflate prompt tokens and degrade long plan-mode sessions.


Contributor

@devin-ai-integration (bot) left a comment


Devin Review found 1 new potential issue.

View 10 additional findings in Devin Review.


Comment on lines +838 to 846
  effective_messages = list(self._context.history)
  if injections:
      combined_reminders = "\n".join(system_reminder(inj.content).text for inj in injections)
-     await self._context.append_message(
+     effective_messages.append(
          Message(
              role="user",
              content=[TextPart(text=combined_reminders)],
          )
      )


🔴 PlanModeInjectionProvider throttling broken by non-persisted injections

The _step() method was changed to no longer persist dynamic injections into self._context (kimisoul.py:838-846). However, PlanModeInjectionProvider (plan_mode.py:61-79) relies on scanning self._context.history (passed at kimisoul.py:226) to find its own previous reminders via _has_plan_reminder() for throttling (inject every 5 assistant turns). Since injections are no longer persisted, _has_plan_reminder never finds a match, found_previous is always False, and the provider takes the if not found_previous branch on every step — injecting the full plan mode reminder (a multi-paragraph prompt) on every single LLM call instead of every 5 turns. This also resets _inject_count to 1 each time, breaking the sparse/full alternation logic at plan_mode.py:86-91.

Impact in plan mode

Every step in plan mode now gets the full ~30-line plan reminder injected, significantly increasing token usage and potentially degrading LLM output quality from repeated instructions. The throttle interval (_TURN_INTERVAL = 5) and full/sparse alternation (_FULL_EVERY_N = 5) are completely bypassed.

Prompt for agents
The root cause is that PlanModeInjectionProvider (src/kimi_cli/soul/dynamic_injections/plan_mode.py lines 61-79) scans the history passed to get_injections() to find its own previous reminders for throttling. Since _step() in kimisoul.py now builds effective_messages as a local copy (not persisting injections), the provider never finds its previous reminders in self._context.history.

Two possible approaches:

1. Change PlanModeInjectionProvider to track its own state internally (e.g. counting assistant messages or steps since last injection) rather than relying on scanning persisted history. This aligns with the new non-persisting design and matches how YoloModeInjectionProvider and NonInteractiveModeInjectionProvider already use internal _injected flags.

2. Alternatively, pass effective_messages (which includes the current step's injection) to _collect_injections on subsequent calls within the same turn, or maintain a separate ephemeral history that includes injections for the provider to scan. This is more complex and less clean.

Approach 1 is recommended. The provider could maintain an internal counter of assistant messages seen since the last injection, incrementing when it sees new assistant messages and resetting when it injects. This removes the dependency on persisted history entirely.
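
Approach 1 could be sketched as below. This is an illustrative simplification of the recommendation (hypothetical class and method names, and the full/sparse alternation reduced to a first-injection/subsequent-injection split), not the actual `PlanModeInjectionProvider`:

```python
class CounterThrottledInjectionProvider:
    """Throttle reminders using internal state instead of scanning
    persisted history (sketch of the recommended approach 1)."""

    TURN_INTERVAL = 5  # re-inject at most once every 5 assistant turns

    def __init__(self) -> None:
        self._turns_since_injection: int | None = None  # None = never injected

    def get_injections(self, new_assistant_turns: int) -> list[str]:
        # First call in a session: always inject the full reminder.
        if self._turns_since_injection is None:
            self._turns_since_injection = 0
            return ["<full plan-mode reminder>"]
        # Afterwards, count assistant turns internally; no history scan.
        self._turns_since_injection += new_assistant_turns
        if self._turns_since_injection >= self.TURN_INTERVAL:
            self._turns_since_injection = 0
            return ["<sparse plan-mode reminder>"]
        return []


provider = CounterThrottledInjectionProvider()
assert provider.get_injections(0) == ["<full plan-mode reminder>"]  # first step
assert provider.get_injections(2) == []                             # throttled
assert provider.get_injections(3) == ["<sparse plan-mode reminder>"]
```

Since the counter lives on the provider, throttling survives even though reminders are no longer written to `Context.history`, removing the dependency on persisted history that broke `_has_plan_reminder`.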

