Summary
When a Codex/ChatGPT conversation is flagged with the “possible cybersecurity risk” banner, the UI tells the user to submit /feedback if the block seems incorrect. In practice, sending a /feedback ... message in that same conversation can be treated as normal chat content and blocked by the same classifier. This leaves the user with no working in-product path to report a false positive from the blocked conversation.
Environment
- Product: Codex / ChatGPT conversation UI
- Platform: macOS desktop browser/app context
- Date observed: 2026-04-30
- Screenshot evidence is available if maintainers need it; this was observed directly in the blocked conversation UI.
Steps To Reproduce
- In a Codex/ChatGPT conversation, trigger a false-positive block that shows the banner:
- “This content was flagged for possible cybersecurity risk...”
- The Polish UI says: “Jeśli to wydaje się nieodpowiednie, spróbuj inaczej sformułować prośbę lub wyślij /feedback.” (English: “If this seems inappropriate, try rephrasing the request or send /feedback.”)
- In the same conversation, send a feedback message such as:
/feedback oracle i smoke to nie powinny być słowa problemy
(Polish, roughly: “oracle and smoke should not be problem words”)
- Observe that the feedback message is not accepted as a reporting action and the conversation remains blocked with the same warning state.
Expected Behavior
One of these should happen:
- /feedback should open a dedicated feedback/report flow that bypasses the blocked chat response path, or
- the warning banner should link to a separate report UI, or
- the banner should not recommend /feedback unless that command is actually supported in the current blocked state.
Actual Behavior
The UI recommends /feedback, but sending /feedback ... in the blocked conversation appears to go through the same normal-message path and is blocked. The suggested recovery/reporting mechanism is therefore not usable.
Why This Matters
False positives are expected, but the current UX makes them hard to report:
- the user is explicitly told to use /feedback,
- the suggested command does not work in the blocked context,
- there is no obvious alternative path from the same UI,
- this creates a loop where the user cannot explain that the classification is incorrect.
Suggested Fix
Treat /feedback as a client-side command before normal model/safety routing, especially when the current conversation is already blocked. At minimum, the blocked banner should expose a direct feedback button/link that does not depend on sending another chat message.
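A minimal sketch of the suggested routing change, using hypothetical names (handle_outgoing_message and the returned route labels are illustrative, not real Codex/ChatGPT APIs): slash commands are matched client-side first, so /feedback reaches the report flow even when the conversation is already blocked.

```python
# Hypothetical sketch of client-side command routing. Function and route
# names are assumptions for illustration, not actual product internals.

def handle_outgoing_message(text: str, conversation_blocked: bool) -> str:
    """Decide which pipeline an outgoing message takes."""
    if text.strip().startswith("/feedback"):
        # Intercept the command before model/safety routing, so the
        # feedback flow works even in a blocked conversation.
        return "feedback_flow"
    if conversation_blocked:
        # Ordinary chat messages in a blocked conversation keep the
        # current behavior and are refused.
        return "blocked"
    return "chat"
```

With this ordering, the repro message `/feedback oracle i smoke ...` would open the report flow instead of being re-blocked, while normal chat messages in the blocked state behave as they do today.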
Notes
The false positive in this case was about ordinary development/tooling terms in a discussion about a local Oracle CLI/Codex skill workflow. Terms like oracle and smoke were part of normal software testing language, not a request for harmful behavior.