Increase Codex’s usefulness as a real-world workbench by defaulting to separate shell commands #18349

@dpritchett

What variant of Codex are you using?

CLI

What feature would you like to see?

Codex often feels optimized for an "intern" usage model: package up a task, let it go work inside a bounded sandbox, and judge success by autonomous completion.

A lot of real workstation-based engineering uses Codex more like a workbench: tight feedback loops, operator-visible control points, host-faithful probing, and incremental interaction with real tools and systems.

From the workbench perspective, Codex seems to over-optimize for fewer shell/tool calls. That makes compound commands look efficient, but in practice it often makes approvals, debugging, and failure recovery worse.

Compact one-shot behavior is often great for softer chat tasks like drafting, summarizing, or best-effort advice. But Codex operates in a more executable environment, where the foundation has to be simpler, more observable command primitives.

This also seems aligned with Codex’s own documented rules and approval model. Prefix rules, approval prompts, and command segmentation all get much easier to reason about when the model emits simpler command primitives. In that sense, defaulting to separate commands is not just better UX for users; it also reduces complexity for Codex’s own safety and policy surface.
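To make the point about rules concrete, here is a minimal sketch of why a simple command is easier to match than a compound one. The `classify_command` function and its hard-coded allowlist are hypothetical and only illustrate the idea; this is not Codex's actual rule engine or rule syntax.

```shell
# Minimal sketch with a hypothetical, hard-coded allowlist.
# This is NOT Codex's actual rule engine or rule syntax.
classify_command() {
  cmd=$1
  # Naive check: any chaining, pipe, or redirect operator means the string
  # is more than one simple command (it also flags operators inside quotes).
  case "$cmd" in
    *'&&'* | *'|'* | *';'* | *'>'*)
      echo "needs-review"
      return 0
      ;;
  esac
  # Hypothetical read-only prefixes that could be approved automatically.
  case "$cmd" in
    'git status' | 'git status '* | 'git diff' | 'git diff '* | 'git log' | 'git log '*)
      echo "auto-approvable"
      ;;
    *)
      echo "needs-review"
      ;;
  esac
}

classify_command 'git status --short'
classify_command 'git diff --stat origin/main...HEAD && git status --short'
```

The first call prints `auto-approvable` and the second prints `needs-review`, even though both halves of the chained command would pass on their own. A real implementation would need to parse the shell grammar rather than scan for operator characters, which is exactly the extra complexity that simpler command primitives avoid.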

I realize this may be a mix of:

  • Codex-side behavior in this repo, like prompt text, tool defaults, and product UX
  • OpenAI-side behavior behind the gateway, like model tendencies or server-side prompting

I’m filing it here because the Codex product surface can still shape the default behavior, even if some of the root cause lives upstream.

What I’d like:

  • default to separate shell commands more often
  • especially for tools like git, kubectl, helm, gh, and similar operational CLIs
  • use &&, pipes, and output-trimming shell cleverness more sparingly
  • preserve visible stop points between inspection, local mutation, and remote mutation
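One way to picture the requested shape outside of Codex itself is a small wrapper that runs exactly one command per call and labels it with its phase. The `step` helper and the phase names below are hypothetical, and the file path and commit message in the commented example are placeholders; inside Codex each line would simply be its own tool call.

```shell
# Hypothetical helper: announce the phase, then run exactly one command.
# The exit status of the wrapped command is returned unchanged, so a
# failure is always attributable to a single primitive.
step() {
  phase=$1
  shift
  printf '[%s] %s\n' "$phase" "$*"
  "$@"
}

# Example sequence (run inside a repository; path and message are placeholders):
#   step inspect git status --short
#   step inspect git diff --stat
#   step local   git add path/to/file
#   step local   git commit -m "message"
#   step remote  git push
```

Because each call is a single command, an operator can approve the inspection steps freely, look at the result, and only then decide whether the local and remote mutations should go ahead.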

This feels like a small behavioral change with outsized leverage for making Codex more useful in real-world, operator-driven workflows.

Related, but not a dupe

Additional information

Examples of command shapes that seem attractive to the model but make approvals/rules harder to apply cleanly:

  • git diff --stat origin/main...HEAD && git log --oneline --decorate -5 && git status --short
  • kubectl get pod -n example-ns example-pod -o jsonpath='{.status.phase}{"\n"}' && kubectl logs -n example-ns example-pod --tail=80 | tail -n 30
  • helm test example-release -n example-ns --timeout 10m 2>&1 | tail -n 40
  • git add ... && git commit -m ... && git push ...
  • kubectl get deploy -n example-ns -o wide && kubectl get pods -n example-ns && kubectl get svc -n example-ns

In each case, the issue is not just readability. A chained or piped one-liner is harder to match against a simple prefix rule than the individual commands it contains, so Codex is more likely to get stuck behind its own approval/rules model, with a too-clever one-liner sitting and waiting for human intervention.
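For the `&&` chains above, the preferred shape is simply the same commands issued one at a time. The sketch below uses a hypothetical `split_chain` helper to show that decomposition; it splits naively on ` && ` and would mishandle an `&&` inside a quoted argument, so treat it as an illustration rather than a parser.

```shell
# Hypothetical helper: print each command of an '&&' chain on its own line.
# Naive split on the literal text ' && '; quoting is not understood.
split_chain() {
  printf '%s\n' "$1" | awk -F ' && ' '{ for (i = 1; i <= NF; i++) print $i }'
}

split_chain 'git diff --stat origin/main...HEAD && git log --oneline --decorate -5 && git status --short'
# prints:
#   git diff --stat origin/main...HEAD
#   git log --oneline --decorate -5
#   git status --short
```

Each resulting line is a plain `git` invocation that a prefix rule or an approval prompt can evaluate on its own.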

Labels: enhancement (New feature or request), tool-calls (Issues related to tool calling)