Codex feels more like an AI babysitting workflow than an AI agent #23830

@henryhjjones

Description

What version of the Codex App are you using (From “About Codex” dialog)?

Codex app 26.513.4821.0

What subscription do you have?

Plus

What platform is your computer?

Windows 11

What issue are you seeing?

Hello Codex team,

I’ve been using Codex heavily for a real multi-branch development workflow, and honestly the experience often feels less like using an AI agent and more like babysitting an AI junior that constantly needs supervision.

The biggest issue is not coding quality. The issue is workflow autonomy.

Right now, I repeatedly have to:

- remind Codex which branch it is on
- remind it not to touch unrelated files
- remind it to commit
- remind it not to generate reports instead of executing
- manually handle push/deploy steps
- repeatedly explain SSH / sandbox / permissions
- stop it from entering long apology loops instead of continuing execution

In practice, many sessions become:

“explain → correct → remind → redirect → approve”

instead of:

“delegate → execute → verify → done”

At times it genuinely feels like I’m supervising an “AI baby,” not collaborating with an AI agent.

A few examples:

- Codex frequently stops at the exact point where real automation should begin (SSH, deploy, cron apply, server update, push)
- It tends to over-explain failures instead of aggressively finding the next executable path
- It often loses operational focus during long-running repository work
- Sandbox limitations are understandable, but the UX around them currently creates constant friction

What I actually want:

- stronger workflow persistence
- better memory of active constraints
- more autonomous execution behavior
- clearer distinction between “unsafe” vs “annoyingly blocked”
- fewer apology/recovery loops
- agent-style task continuation after recoverable failures

The coding ability itself is impressive. The operational experience is the frustrating part.

I’m sending this because I genuinely want Codex to become great for serious long-running development workflows.

What steps can reproduce the bug?

Same text as under “What issue are you seeing?” above.

What is the expected behavior?

No response

Additional information

No response

Metadata

Assignees

No one assigned

Labels

- app: Issues related to the Codex desktop app
- enhancement: New feature or request
- model-behavior: Issues related to behaviors exhibited by the model
- sandbox: Issues related to permissions or sandboxing
- session: Issues involving session (thread) management, resuming, forking, naming, archiving
