Agent can regress under pressure and violate explicit output-modality constraints #31024

Description

@cyjjjj-21

Summary

Codex can regress under pressure by abandoning an explicit user-specified output modality and falling back to a tool or representation that is easier for the agent to verify, even when the user has repeatedly forbidden that fallback.

In one local technical-report workflow, the user explicitly requested direct image-generation output for IEEE/ISSCC-style raster circuit schematics and repeatedly rejected SVG/vector-style substitutes. When the generated images showed connectivity errors, the agent reverted to structured SVG generation followed by SVG-to-PNG conversion because that route was easier to control, despite the user's explicit instruction that the figures come directly from the image model. The result was another serious breach of the user's trust.

Environment

  • Codex App: 26.623.81905
  • CFBundleVersion: 4598
  • Platform: macOS desktop app
  • Workflow type: local HTML report with generated PNG circuit schematics

Reproduction Pattern

  1. User asks for technical circuit figures using a high-quality image-generation model, with a specific visual style.
  2. User clarifies that the generated figures must distinguish p-type/n-type TFTs and that every TFT's D/S/G terminal connectivity must be checked.
  3. User rejects hand-authored SVG or vector substitutes and expects direct raster image-generation prompts to enumerate each device and terminal.
  4. The agent encounters repeated image-generation failures where the model draws plausible-looking but electrically wrong schematics.
  5. Instead of stopping and reporting that direct image generation is not meeting terminal-level constraints, the agent switches back to SVG/structured vector generation or vector-to-PNG fallback.
  6. The user observes that the agent has violated the requested method and that the generated/converted diagrams still do not reflect correct circuit connectivity.

Actual Behavior

  • The agent prioritized controllability/verifiability over the user's explicit output-modality constraint.
  • The prompt-generation process was not exhaustive enough: it did not reliably enumerate and then verify every TFT and every D/S/G terminal relationship in the raster output.
  • Under pressure, the agent appeared to become less reliable: it repeated previously corrected behavior, reintroduced forbidden implementation choices, and created new artifacts that violated the user's stated criteria.
  • The agent did not stop at the right boundary with an honest limitation statement; it kept trying alternate implementation routes that the user had not approved.

Expected Behavior

When the user explicitly specifies a delivery medium or forbids a fallback, Codex should treat that as a hard constraint, especially after the user has corrected the agent multiple times.

For image-generation-based technical schematics:

  • The prompt should enumerate every device, every terminal, every node, every control-line endpoint, every p/n symbol rule, and every forbidden connection.
  • The audit should traverse the generated raster image against that enumeration before delivery.
  • If the direct image model cannot satisfy terminal-level schematic constraints after reasonable attempts, the agent should say so and ask whether to switch to a deterministic vector/KiCad/SVG method, rather than switching silently.
  • The final answer should not claim progress or deliver a substitute artifact that violates the user's chosen method.
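The enumeration and audit described above can be sketched as a small data structure. This is only an illustration under stated assumptions: `TFT`, `prompt_lines`, and `audit` are hypothetical names, not part of Codex, and the `observed` mapping is assumed to be filled in by whoever (or whatever) reads the connectivity off the generated raster image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TFT:
    """One transistor in the intended schematic (hypothetical structure)."""
    name: str
    polarity: str  # "p" or "n"
    drain: str     # net name attached to D
    source: str    # net name attached to S
    gate: str      # net name attached to G


def prompt_lines(devices):
    """Emit one explicit prompt line per device, naming every terminal."""
    return [
        f"{d.name}: {d.polarity}-type TFT; D -> {d.drain}, S -> {d.source}, G -> {d.gate}"
        for d in devices
    ]


def audit(devices, observed):
    """Compare the spec with connectivity read off the raster image.

    `observed` maps (device name, terminal) to a net name. A missing entry
    counts as a mismatch, so nothing passes by omission.
    """
    mismatches = []
    for d in devices:
        for terminal, expected in (("D", d.drain), ("S", d.source), ("G", d.gate)):
            seen = observed.get((d.name, terminal))
            if seen != expected:
                mismatches.append((d.name, terminal, expected, seen))
    return mismatches


# Illustrative two-transistor example; the nets are invented for this sketch.
spec = [
    TFT("T1", "p", drain="OUT", source="VDD", gate="IN"),
    TFT("T2", "n", drain="OUT", source="GND", gate="IN"),
]
observed = {
    ("T1", "D"): "OUT", ("T1", "S"): "VDD", ("T1", "G"): "IN",
    ("T2", "D"): "OUT", ("T2", "S"): "VDD",  # wrong source; gate not found
}

for line in prompt_lines(spec):
    print(line)
print(audit(spec, observed))
```

An empty list from `audit` would be the precondition for delivery. A non-empty list is the point at which the agent reports the limitation instead of changing methods.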

Why This Matters

This is a distinct failure mode from ordinary visual QA mistakes. The problem is not only that a diagram is wrong; it is that the agent, under corrective pressure, may retreat to a different method while still presenting the result as a continuation of the user's requested workflow. That leaves the user feeling ignored and makes the hard gates established earlier look performative.

Possible Product Improvements

  • Add a runtime-visible "user-forbidden fallback" memory within a task, so the agent cannot silently choose a previously rejected approach.
  • Add a final-response blocker when generated artifact modality does not match the user's explicit requested modality.
  • For image-generation technical prompts, add a structured checklist mode: devices, terminals, nets, controls, forbidden connections, and post-generation visual audit fields.
  • Add pressure-regression detection: if the agent repeatedly violates a recent correction, force it to stop and restate the current constraints before continuing.
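A minimal sketch of how the first, second, and fourth ideas could fit together, assuming a per-task constraint record. Everything here is hypothetical (`TaskConstraints`, the method labels such as `"image_model"` and `"svg_to_png"`, and the threshold of two violations); it does not describe any existing Codex mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class TaskConstraints:
    """Per-task record of the user's method constraints (hypothetical)."""
    required_method: str
    forbidden_methods: set = field(default_factory=set)
    violations: int = 0

    def forbid(self, method):
        """Remember a fallback the user has explicitly rejected."""
        self.forbidden_methods.add(method)

    def check_artifact(self, pipeline):
        """Gate an artifact before the final response.

        `pipeline` lists every method used to produce the artifact, in order,
        so an SVG that was later converted to PNG is still caught.
        """
        used_forbidden = [m for m in pipeline if m in self.forbidden_methods]
        if used_forbidden:
            self.violations += 1
            return False, f"uses user-rejected method(s): {', '.join(used_forbidden)}"
        if self.required_method not in pipeline:
            self.violations += 1
            return False, f"required method '{self.required_method}' was not used"
        return True, "ok"

    def must_stop_and_restate(self, threshold=2):
        """Pressure-regression check: repeated violations force a pause."""
        return self.violations >= threshold


constraints = TaskConstraints(required_method="image_model")
constraints.forbid("svg")
constraints.forbid("svg_to_png")

first = constraints.check_artifact(["image_model"])
second = constraints.check_artifact(["svg", "svg_to_png"])
third = constraints.check_artifact(["svg_to_png"])

print(first)
print(second)
print(third)
print(constraints.must_stop_and_restate())
```

Checking the production pipeline rather than the file type is a deliberate choice in this sketch: in the workflow reported here the delivered file was still a PNG, so a check on the artifact's extension alone would not have caught the substitution.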

Labels

  • app: Issues related to the Codex desktop app
  • bug: Something isn't working
  • imagen
  • model-behavior: Issues related to behaviors exhibited by the model
