feature: Add "!cmd" user shell execution#2471
Conversation
|
All contributors have signed the CLA ✍️ ✅ |
There was a problem hiding this comment.
Codex Review: Here are some suggestions.
Reply with @codex fix comments to fix any unresolved comments.
About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request that is ready for review, or mark a draft as ready for review. You can also ask for a review by commenting "@codex review".
Codex can also answer questions or update the PR. Try commenting "@codex fix this CI failure" or "@codex address that feedback".
|
We need to make sure cancellation works, particularly ctr-c. Right now it doesn't actually cancel the command. If possible, we should show the command we just ran + any output and delineate it as a user command instead of using the shortening UI we do for the model. Otherwise nice work! |
|
This also doesn't support windows, which isn't necessarily a blocker but we might need to look at. |
Happy to send a patch with both! Wanted to get something out to get buy-in first. Thanks for the prompt feedback. |
|
This is great! Would be also fantastic to have a way to run background commands like with Claude Code and have the output continuously attached to any new messages.
Here is also a good description of the feature: anomalyco/opencode#1970 |
7318b7e to
31a20a2
Compare
|
I have read the CLA Document and I hereby sign the CLA |
Alright both those things should work. Also, the Tui still remains like the "frontend" and forwards the request to the "core". This is in line with the existing archiecture. |
|
Summary
Notes
Review
Follow-up (outside this PR scope)
|
|
@codex review in depth |
There was a problem hiding this comment.
Codex Review: Here are some suggestions.
Reply with @codex fix comments to fix any unresolved comments.
About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review". If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex fix this CI failure" or "@codex address that feedback".
|
Seems worth addressing the codex review notes above ^ |
18d2af7 to
fec03bc
Compare
Fixed. |
|
@codex review in depth |
There was a problem hiding this comment.
Codex Review: Here are some suggestions.
Reply with @codex fix comments to fix any unresolved comments.
About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review". If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex fix this CI failure" or "@codex address that feedback".
771ae55 to
8aaf9f0
Compare
Also done |
bolinfest
left a comment
There was a problem hiding this comment.
This is generally good, but there are some things I'd like you to look at!
| assert_eq!(deserialized, event); | ||
| } | ||
|
|
||
| #[test] |
There was a problem hiding this comment.
I think we can do without this test: it doesn't provide much in the way of additional code correctness.
| /// True when this exec was initiated directly by the user (e.g. bang command), | ||
| /// not by the agent/model. Defaults to false for backwards compatibility. | ||
| #[serde(default)] | ||
| pub user_initiated_shell_command: bool, |
There was a problem hiding this comment.
How about: is_user_shell_command to mirror the name of the Op?
| struct RunningCommand { | ||
| command: Vec<String>, | ||
| parsed_cmd: Vec<ParsedCommand>, | ||
| user_initiated_shell_command: bool, |
| // Special-case: "!cmd" executes a local shell command instead of sending to the model. | ||
| if let Some(stripped) = text.strip_prefix('!') { | ||
| let cmd = stripped.trim().to_string(); | ||
| if !cmd.is_empty() { |
There was a problem hiding this comment.
So if the user types ! by itself, we'll send the message "!" to the agent, correct?
I guess this is fine, though we could take this opportunity to tell the user about the ! feature instead?
| body_lines.extend(wrapped.into_iter().map(|l| Line::from(l.to_string().dim()))); | ||
|
|
||
| if let Some(output) = call.output.as_ref() { | ||
| let is_shell_exec_cell = self.calls.iter().any(|c| c.user_initiated_shell_command); |
There was a problem hiding this comment.
Do we believe we could have a mix of exec calls in flight, only some of which are user initiated shell commands?
There was a problem hiding this comment.
I just check for that call itself now.
| sub_id: String, | ||
| command: String, | ||
| ) { | ||
| let spawn_sub_id = sub_id.clone(); |
There was a problem hiding this comment.
Do this inside let handle, as well?
| unsafe { | ||
| std::env::set_var("CODEX_HERMETIC_TEST_SHELL", "1"); | ||
| } |
There was a problem hiding this comment.
Please find another way to do this: this is unsafe for a reason.
| use std::path::PathBuf; | ||
| use tempfile::TempDir; | ||
|
|
||
| #[tokio::test(flavor = "multi_thread", worker_threads = 2)] |
There was a problem hiding this comment.
Please omit worker_threads unless you have a reason to believe you need them.
Empirically, tests that only pass with more than one worker thread imply bugs in our code, so I would prefer to try to catch them.
| @@ -0,0 +1,119 @@ | |||
| #![cfg(unix)] | |||
There was a problem hiding this comment.
Given these tests are limited to unix for now (which I would prefer not to do), do we really need CODEX_HERMETIC_TEST_SHELL?
| // 1) ls should list the file | ||
| codex | ||
| .submit(Op::RunUserShellCommand { | ||
| command: "ls".to_string(), |
There was a problem hiding this comment.
I often use python3 -c as an example for cross-platform shell calls.
0bc6e90 to
2f0bc51
Compare
|
Any update on the merge timeline for this PR? |
5578dbe to
45dc4f1
Compare
abhishek-oai
left a comment
There was a problem hiding this comment.
Pushing old comments that I had replied to
| @@ -0,0 +1,119 @@ | |||
| #![cfg(unix)] | |||
| use std::path::PathBuf; | ||
| use tempfile::TempDir; | ||
|
|
||
| #[tokio::test(flavor = "multi_thread", worker_threads = 2)] |
| unsafe { | ||
| std::env::set_var("CODEX_HERMETIC_TEST_SHELL", "1"); | ||
| } |
| sub_id: String, | ||
| command: String, | ||
| ) { | ||
| let spawn_sub_id = sub_id.clone(); |
| // Special-case: "!cmd" executes a local shell command instead of sending to the model. | ||
| if let Some(stripped) = text.strip_prefix('!') { | ||
| let cmd = stripped.trim().to_string(); | ||
| if !cmd.is_empty() { |
| body_lines.extend(wrapped.into_iter().map(|l| Line::from(l.to_string().dim()))); | ||
|
|
||
| if let Some(output) = call.output.as_ref() { | ||
| let is_shell_exec_cell = self.calls.iter().any(|c| c.user_initiated_shell_command); |
There was a problem hiding this comment.
I just check for that call itself now.
| Op::RunUserShellCommand { command } => { | ||
| spawn_user_shell_command_task( | ||
| sess.clone(), | ||
| Arc::clone(&turn_context), |
| ) { | ||
| let spawn_sub_id = sub_id.clone(); | ||
| let handle = { | ||
| let sess = Arc::clone(&sess); |
| command, | ||
| cwd, | ||
| parsed_cmd: _, | ||
| .. |
f0ad4bf to
1ee69cf
Compare
jif-oai
left a comment
There was a problem hiding this comment.
After my comments, lgtm
| turn: Arc<TurnContext>, | ||
| tracker: crate::tools::context::SharedTurnDiffTracker, | ||
| call_id: String, | ||
| is_user_shell_command: bool, |
There was a problem hiding this comment.
I'm not a huge fan of this being added everywhere... this just create some low-value wiring everywhere
Out of scope for this PR but can you add a todo to get rid of it?
| }); | ||
| session.send_event(turn_context.as_ref(), event).await; | ||
|
|
||
| let shell_invocation = session |
There was a problem hiding this comment.
Why do you need this? The use should just answer a valid command for his shell. This should not be our responsibility to format it
|
|
||
| let shell_invocation = session | ||
| .user_shell() | ||
| .format_user_shell_script(&command) |
There was a problem hiding this comment.
- this just got dropped because it was actually useless. And I don't think this PR justify adding this back
| Bash(BashShell), | ||
| PowerShell(PowerShellConfig), | ||
| Unknown, | ||
| } |
There was a problem hiding this comment.
IMO we should revert all this file (per my previous comments)
- This feature lets users execute a command using their shell when they type "!cmd" in the TUI for e.g. "!ls"
- It takes care to bound the output line limit to not flood the TUI
- Being able to signal (Ctrl-C) commands spawned. For e.g. "!sleep 10000" killed via SIGINT (Ctrl-C)
- E2E flow of the feature is detailed below
1. The composer hands the submitted text to `ChatWidget::submit_user_message` (`codex-rs/tui/src/chatwidget.rs`).
2. If the message starts with `!`, the prefix is stripped. A non-empty command triggers `ChatWidget::submit_op(Op::RunUserShellCommand { command })`, while empty input shows a help hint instead of contacting the model.
3. Because no `UserInput` items are emitted, the command never joins the conversation payload that the model sees.
4. The submission loop in `codex-rs/core/src/codex.rs` routes the op to `handlers::run_user_shell_command`.
5. The handler creates a new `TurnContext` (with default per-turn settings) and calls `Session::spawn_user_shell_command`, which enqueues a `UserShellCommandTask` without any model-facing input.
6. `UserShellCommandTask::run` (defined in `codex-rs/core/src/tasks/user_shell.rs`) emits a `TaskStartedEvent`, formats the command using the session’s shell wrapper, and creates a `ToolCall` for the built-in `local_shell` tool.
7. `ToolCallRuntime::handle_tool_call` dispatches this call to the `ShellHandler` implementation in `codex-rs/core/src/tools/handlers/shell.rs`.
8. `ShellHandler::run_exec_like` launches the command through the unified exec runtime, honoring sandbox policy and shell environment settings. It uses `ToolEmitter::shell` to publish `ExecCommandBegin` and `ExecCommandEnd` events so the TUI can render the command, streaming output, exit code, and duration.
9. The command output is captured locally for the UI; the handler returns a `ToolOutput::Function`, but the owning task does not turn that into a follow-up assistant message.
- Neither the command nor its stdout/stderr are appended to the conversation history supplied to the model. The bang command is purely a local exec flow; the model receives no additional context from it.
- The only protocol traffic is the pair of exec events and task lifecycle notifications needed by the UI.
10. After `ExecCommandEnd` fires, `UserShellCommandTask` finishes with `None` as the final agent message. The session marks the task complete, and the UI shows the finished exec cell result.
In short: `!cmd` spins up a local shell execution pipeline, streams the output to the user interface, and keeps the model prompt untouched.
1ee69cf to
0f39260
Compare
abhishek-oai
left a comment
There was a problem hiding this comment.
Fixed 2 comments
…rmat helpers in shell.rs and drop related test per review.
…PowerShell with -Command); fallback to shlex split only when shell is unknown.
Merged! |
feature: Add "!cmd" user shell execution
This change lets users run local shell commands directly from the TUI by prefixing their input with ! (e.g. !ls). Output is truncated to keep the exec cell usable, and Ctrl-C cleanly
interrupts long-running commands (e.g. !sleep 10000).
Summary of changes
End-to-end flow
TUI handling
Core submission loop
4. The submission loop routes the op to handlers::run_user_shell_command (core/src/codex.rs).
5. A fresh TurnContext is created and Session::spawn_user_shell_command enqueues UserShellCommandTask.
Task execution
6. UserShellCommandTask::run emits TaskStartedEvent, formats the command, and prepares a ToolCall targeting local_shell.
7. ToolCallRuntime::handle_tool_call dispatches to ShellHandler.
Shell tool runtime
8. ShellHandler::run_exec_like launches the process via the unified exec runtime, honoring sandbox and shell policies, and emits ExecCommandBegin/End.
9. Stdout/stderr are captured for the UI, but the task does not turn the resulting ToolOutput into a model response.
Completion
10. After ExecCommandEnd, the task finishes without an assistant message; the session marks it complete and the exec cell displays the final output.
Conversation context
Demo video
bang-command-demo-video-1080p.mp4