Agent can regress under pressure and violate explicit output-modality constraints #31024

## Summary

Codex can regress under pressure by abandoning an explicit user-specified output modality and falling back to a tool or representation that is easier for the agent to verify, even when the user has repeatedly forbidden that fallback.

In one local technical-report workflow, the user explicitly requested direct image-generation output for IEEE/ISSCC-style raster circuit schematics and repeatedly rejected SVG/vector-style substitutes. After generated images had connectivity issues, the agent reverted to structured SVG generation and SVG-to-PNG conversion because that was easier to control, despite the user's explicit instruction that the solution should be direct image-model generation. This produced another severe trust break.

## Environment

- Codex App: 26.623.81905
- CFBundleVersion: 4598
- Platform: macOS desktop app
- Workflow type: local HTML report with generated PNG circuit schematics

## Reproduction Pattern

1. User asks for technical circuit figures using a high-quality image-generation model, with a specific visual style.
2. User clarifies that the generated figures must distinguish p-type/n-type TFTs and that every TFT's D/S/G terminal connectivity must be checked.
3. User rejects hand-authored SVG or vector substitutes and expects direct raster image-generation prompts to enumerate each device and terminal.
4. The agent encounters repeated image-generation failures where the model draws plausible-looking but electrically wrong schematics.
5. Instead of stopping and reporting that direct image generation is not meeting terminal-level constraints, the agent switches back to SVG/structured vector generation or vector-to-PNG fallback.
6. The user observes that the agent has violated the requested method and that the generated/converted diagrams still do not reflect correct circuit connectivity.

## Actual Behavior

- The agent prioritized controllability/verifiability over the user's explicit output-modality constraint.
- The prompt-generation process was not exhaustive enough: it did not reliably enumerate and then verify every TFT and every D/S/G terminal relationship in the raster output.
- Under pressure, the agent appeared to become less reliable: it repeated previously corrected behavior, reintroduced forbidden implementation choices, and created new artifacts that violated the user's stated criteria.
- The agent did not stop at the right boundary with an honest limitation statement; it kept trying alternate implementation routes that the user had not approved.

## Expected Behavior

When the user explicitly specifies a delivery medium or forbids a fallback, Codex should treat that as a hard constraint, especially after the user has corrected the agent multiple times.

For image-generation-based technical schematics:

- The prompt should enumerate every device, every terminal, every node, every control-line endpoint, every p/n symbol rule, and every forbidden connection.
- The audit should traverse the generated raster image against that enumeration before delivery.
- If the direct image model cannot satisfy terminal-level schematic constraints after reasonable attempts, the agent should say so and ask whether to switch to a deterministic vector/KiCad/SVG method, rather than switching silently.
- The final answer should not claim progress or deliver a substitute artifact that violates the user's chosen method.
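
The enumeration-then-audit flow above could be sketched roughly as follows. This is a minimal illustration, not an existing Codex feature: `Device`, `prompt_lines`, and `audit` are hypothetical names, the net labels in the usage note are made up, and the sketch assumes that someone (a reviewer or a vision pass) has already transcribed what the raster image actually shows into the `observed` mapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """One TFT in the intended schematic (all field values are illustrative)."""
    name: str      # device label, e.g. "T1"
    polarity: str  # "p" or "n"
    drain: str     # net connected to D
    source: str    # net connected to S
    gate: str      # net connected to G


def prompt_lines(devices):
    """Emit one explicit prompt line per device, naming every terminal's net."""
    return [
        f"{d.name}: {d.polarity}-type TFT, D -> {d.drain}, "
        f"S -> {d.source}, G -> {d.gate}"
        for d in devices
    ]


def audit(expected, observed):
    """Compare the intended devices with what was read off the raster image.

    `observed` maps device name -> Device as transcribed from the image.
    Returns a list of mismatch messages; an empty list means the audit passed.
    """
    problems = []
    for d in expected:
        seen = observed.get(d.name)
        if seen is None:
            problems.append(f"{d.name}: missing from image")
            continue
        for attr in ("polarity", "drain", "source", "gate"):
            want, got = getattr(d, attr), getattr(seen, attr)
            if want != got:
                problems.append(f"{d.name}.{attr}: expected {want}, saw {got}")
    # Devices drawn in the image that were never requested are also failures.
    for name in sorted(observed.keys() - {d.name for d in expected}):
        problems.append(f"{name}: unexpected device in image")
    return problems
```

For example, with an intended `T2` that is n-type with its gate on `EM`, an image that draws `T2` as p-type with its gate on `SCAN` yields two mismatch messages, and delivery would be withheld until the list is empty or the limitation is reported to the user.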

## Why This Matters

This is a distinct failure mode from ordinary visual QA mistakes. The problem is not only that a diagram is wrong; it is that the agent, under corrective pressure, may retreat to a different method while still presenting the result as a continuation of the user's requested workflow. That makes the user feel ignored and makes previous hard gates feel performative.

## Possible Product Improvements

- Add a runtime-visible "user-forbidden fallback" memory within a task, so the agent cannot silently choose a previously rejected approach.
- Add a final-response blocker that triggers when the generated artifact's modality does not match the modality the user explicitly requested.
- For image-generation technical prompts, add a structured checklist mode: devices, terminals, nets, controls, forbidden connections, and post-generation visual audit fields.
- Add pressure-regression detection: if the agent repeatedly violates a recent correction, force it to stop and restate the current constraints before continuing.
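
One way the fallback memory, final-response blocker, and pressure-regression check could fit together is sketched below. Everything here is an assumption made for illustration: `TaskConstraints`, the method labels such as `"image_model"` and `"svg_to_png"`, and the threshold of two violations are hypothetical and do not describe how Codex is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class TaskConstraints:
    """Per-task record of the user's required modality and rejected fallbacks."""
    required_modality: str                       # e.g. "image_model" (hypothetical label)
    forbidden: set = field(default_factory=set)  # methods the user has rejected
    max_violations: int = 2                      # assumed threshold for a forced stop
    violations: int = 0

    def forbid(self, method: str) -> None:
        """Remember a fallback the user explicitly rejected."""
        self.forbidden.add(method)

    def check_delivery(self, method_used: str) -> str:
        """Gate a final response on the method that produced the artifact."""
        if method_used in self.forbidden:
            self.violations += 1
            verdict = f"blocked: user forbade {method_used}"
        elif method_used != self.required_modality:
            self.violations += 1
            verdict = f"blocked: {method_used} is not {self.required_modality}"
        else:
            return "ok"
        # Repeated violations suggest regression under pressure: stop and ask.
        if self.violations >= self.max_violations:
            return "stop: restate constraints and ask the user before continuing"
        return verdict
```

In this sketch, a first attempt to deliver an `"svg_to_png"` artifact after the user forbade it is blocked, and a second attempt escalates to a forced stop instead of another silent retry.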

