There is too much "concrete" #21309

@philo-groves

What version of Codex CLI is running?

0.128.0

What subscription do you have?

Pro

Which model were you using?

gpt-5.4, gpt-5.5

What platform is your computer?

Microsoft Windows NT 10.0.26200.0 x64

What terminal emulator and version are you using (if applicable)?

Windows Terminal (WSL)

What issue are you seeing?

Codex says "concrete" far too much. This is from my agent response:
Image

Searching my sessions, which contain data for only the past month and from only one of my three machines, turns up 3032 references to "concrete". Each seems to be duplicated once, so there are likely around 1500 real entries. In any case, it is above average.
Image
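For anyone who wants to repeat this kind of count, here is a minimal sketch. It assumes the sessions are stored as text files under some directory; the `root` argument, the `*.jsonl` glob, and the function names are illustrative assumptions, not something confirmed by Codex itself.

```python
import re
from pathlib import Path

# Whole-word, case-insensitive match, so "concretely" is not counted.
WORD = re.compile(r"\bconcrete\b", re.IGNORECASE)


def count_in_text(text: str) -> int:
    """Return the number of whole-word occurrences of "concrete" in text."""
    return len(WORD.findall(text))


def count_in_sessions(root: Path, pattern: str = "*.jsonl") -> int:
    """Sum occurrences over every file under root that matches pattern.

    Both root and pattern are hypothetical; point them at wherever your
    session logs actually live.
    """
    total = 0
    for path in root.rglob(pattern):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            total += count_in_text(text)
    return total
```

If every entry really is stored twice, halving the raw total gives the estimate: 3032 // 2 is 1516, which matches the "around 1500" figure above.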

What steps can reproduce the bug?

Speak to the agent about planning tasks. In my case, I prompt for book-style documentation of the planning, spread across Markdown files. You will soon see concrete.

What is the expected behavior?

Say "concrete" less / at a normal rate

Additional information

The root cause seems to be the many "concrete" instructions throughout the Codex repository.

Examples:

  • continuation.md
    • "Choose the next concrete action toward the objective."
    • "Restate the objective as concrete deliverables or success criteria."
  • review_prompt.md
    • "Use ```suggestion blocks ONLY for concrete replacement code (minimal lines; no commentary inside the block)."
  • policy_template.md
    • "If the user explicitly approves the action after being informed of the concrete risk, and that approval clearly covers the exact action being evaluated, score user_authorization = "high" even if the action had previously been refused. Do this only when there is no doubt that the approval came from the user."
    • "Post-denial user approval has highest precedence: if the user clearly and explicitly re-approves the exact previously denied action after seeing the concrete risk, set user_authorization = "high" and outcome = "allow", overriding the other allow/deny rules in this section. Do this only when there is no doubt that the approval came from the user and covers this exact action."
  • stage_one_system.md
    • "Partial: incomplete deliverable, "might work", unverified claims, unresolved edge
      cases, or only rough guidance when concrete output was required."
    • "Prefer concrete evidence before abstraction. If a lesson comes from what the user asked
      the agent to do, show enough of the specific user steering to give context, for example:
      "the user asked to ... indicating that ...""
    • "Do not merge several concrete requests into one vague umbrella preference."
    • "If an abstract lesson came from concrete user steering, preserve enough of that evidence that the lesson remains actionable."
    • "<split distinct defaults into separate bullets; do not collapse multiple concrete requests into one umbrella summary>"
    • "Prefer multiple concrete preference-signal bullets over one abstract summary bullet when the
      user made multiple distinct requests."
    • "prefer a richer list of concrete signals over one generalized meta-preference."
    • "Do not be terse in task sections. Include validation signal, failure mode, reusable procedure,
      and sufficiently concrete preference evidence per task when available."
  • agent_tool.rs
    • "Prefer delegating concrete, bounded sidecar tasks that materially advance the main task without blocking your immediate next local step."
    • "Subtasks must be concrete, well-defined, and self-contained."
    • "Narrow the delegated ask to the concrete output you need next."
    • "For coding tasks, prefer delegating concrete code-change worker subtasks over read-only explorer analysis when the subagent can make a bounded patch in a clear write scope."
    • "Delegate verification only when it can run in parallel with ongoing implementation and is likely to catch a concrete risk before final integration."

There are many, many, many more... which is why I cannot simply open a pull request. The concrete has crept in at a systemic level and must be addressed with coordination.
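To gauge how widespread the word is, a per-file tally over a local checkout may help. The sketch below is only an illustration: `tally` and `SUFFIXES` are hypothetical names, and limiting the scan to `.md` and `.rs` files is an assumption based on the file types in the examples above.

```python
import re
from collections import Counter
from pathlib import Path

# Whole-word, case-insensitive match for the word in question.
WORD = re.compile(r"\bconcrete\b", re.IGNORECASE)

# Assumption: only scan the file types that appear in the examples above.
SUFFIXES = {".md", ".rs"}


def tally(repo_root: Path) -> Counter:
    """Map each matching file (relative to repo_root) to its occurrence count."""
    counts: Counter = Counter()
    for path in repo_root.rglob("*"):
        if path.is_file() and path.suffix in SUFFIXES:
            text = path.read_text(encoding="utf-8", errors="ignore")
            n = len(WORD.findall(text))
            if n:
                counts[str(path.relative_to(repo_root))] = n
    return counts
```

Calling `tally(Path("path/to/checkout")).most_common(20)` would list the twenty files with the most occurrences, which could help split the cleanup into coordinated pieces.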

There is possibly a deeper root cause if these Codex changes were implemented by a specific model or model group. That would point to a "concrete" bias in that model.

Labels

  • CLI: Issues related to the Codex CLI
  • bug: Something isn't working
  • model-behavior: Issues related to behaviors exhibited by the model

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions