/feedback is recommended after a false-positive safety block, but sending /feedback is blocked too #20497

@pdurlej

Summary

When a Codex/ChatGPT conversation is flagged with the “possible cybersecurity risk” banner, the UI tells the user to submit /feedback if the block seems incorrect. In practice, sending a /feedback ... message in that same conversation can be treated as normal chat content and blocked by the same classifier. This leaves the user with no working in-product path to report a false positive from the blocked conversation.

Environment

  • Product: Codex / ChatGPT conversation UI
  • Platform: macOS desktop browser/app context
  • Date observed: 2026-04-30
  • Screenshot evidence is available if maintainers need it; this was observed directly in the blocked conversation UI.

Steps To Reproduce

  1. In a Codex/ChatGPT conversation, trigger a false-positive block that shows the banner:
    • “This content was flagged for possible cybersecurity risk...”
    • The Polish UI says: “Jeśli to wydaje się nieodpowiednie, spróbuj inaczej sformułować prośbę lub wyślij /feedback.” (English: “If this seems inappropriate, try rephrasing your request or send /feedback.”)
  2. In the same conversation, send a feedback message such as:
    • /feedback oracle i smoke to nie powinny być słowa problemy (Polish, roughly: “oracle and smoke should not be problem words”)
  3. Observe that the feedback message is not accepted as a reporting action and the conversation remains blocked with the same warning state.

Expected Behavior

One of these should happen:

  • /feedback should open a dedicated feedback/report flow that bypasses the blocked chat response path, or
  • the warning banner should link to a separate report UI, or
  • the banner should not recommend /feedback unless that command is actually supported in the current blocked state.
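The third option can be sketched as a banner that only lists recovery actions that actually work in the blocked state. This is a minimal illustration, not the product's code: `banner_actions`, its parameters, and the message strings are all hypothetical names introduced here.

```python
from typing import List, Optional


def banner_actions(feedback_works_when_blocked: bool,
                   report_link: Optional[str] = None) -> List[str]:
    """Build the recovery hints shown under the warning banner.

    Hypothetical helper: both parameters are assumptions standing in for
    whatever capability checks the real client has available.
    """
    # Rephrasing is always available, so it is always listed.
    actions = ["Try rephrasing your request."]
    # Offer a separate report UI when one exists.
    if report_link is not None:
        actions.append(f"Report a false positive: {report_link}")
    # Only recommend /feedback if it is handled in the blocked state.
    if feedback_works_when_blocked:
        actions.append("Send /feedback to report this block.")
    return actions
```

With `feedback_works_when_blocked=False` and no report link, the banner would list only the rephrasing hint and would no longer point users at a command that cannot succeed.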

Actual Behavior

The UI recommends /feedback, but sending /feedback ... in the blocked conversation appears to go through the same normal-message path and is blocked. The suggested recovery/reporting mechanism is therefore not usable.

Why This Matters

False positives are expected, but the current UX makes them hard to report:

  • the user is explicitly told to use /feedback,
  • the suggested command does not work in the blocked context,
  • there is no obvious alternative path from the same UI,
  • this creates a loop where the user cannot explain that the classification is incorrect.

Suggested Fix

Treat /feedback as a client-side command before normal model/safety routing, especially when the current conversation is already blocked. At minimum, the blocked banner should expose a direct feedback button/link that does not depend on sending another chat message.
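A rough sketch of that routing order, assuming a simplified client: the input is checked for the /feedback command before anything is sent down the normal chat path, so a blocked conversation can still file a report. The `Conversation` class, `submit`, and `is_feedback_command` are hypothetical names for illustration; the real client and its safety routing are not shown in this issue.

```python
from dataclasses import dataclass, field
from typing import List

FEEDBACK_PREFIX = "/feedback"  # command name taken from the banner text


@dataclass
class Conversation:
    """Hypothetical client-side conversation state."""
    blocked: bool = False
    feedback_reports: List[str] = field(default_factory=list)
    chat_messages: List[str] = field(default_factory=list)


def is_feedback_command(text: str) -> bool:
    """True for '/feedback' alone or followed by whitespace and a note."""
    stripped = text.strip()
    if not stripped.startswith(FEEDBACK_PREFIX):
        return False
    rest = stripped[len(FEEDBACK_PREFIX):]
    return rest == "" or rest[0].isspace()


def submit(conversation: Conversation, text: str) -> str:
    """Route user input and return the name of the path that handled it."""
    # Check for the command first, so it never reaches the chat/safety path.
    if is_feedback_command(text):
        note = text.strip()[len(FEEDBACK_PREFIX):].strip()
        conversation.feedback_reports.append(note)
        return "feedback"
    # Ordinary messages are still refused while the conversation is blocked.
    if conversation.blocked:
        return "blocked"
    conversation.chat_messages.append(text)
    return "chat"
```

In this sketch, calling `submit` on a blocked conversation with the message from the reproduction steps stores the note in `feedback_reports` and returns `"feedback"`, while any other message returns `"blocked"`. Requiring whitespace after the prefix keeps inputs such as `/feedbackloop` on the normal path.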

Notes

The false positive in this case was about ordinary development/tooling terms in a discussion about a local Oracle CLI/Codex skill workflow. Terms like oracle and smoke were part of normal software testing language, not a request for harmful behavior.

Labels: bug (Something isn't working), codex-web (Issues related to Codex Web), safety-check (Issues related to safety and abuse checks)
