False positive "cybersecurity risk" flags on routine sysadmin/devops work in Codex CLI

### Description
Codex CLI (`gpt-5-codex`, ChatGPT Pro) repeatedly flags routine sysadmin and web-development work as a cybersecurity risk, even though all of it runs on infrastructure I personally own and operate. The flags have escalated to an account-level state:

> *"Your conversations have multiple flags for possible cybersecurity risk. Responses may take longer because extra safety checks are on."*

The banner has persisted for several days, and the classifier now appears to score conversational context rather than command intent.

### Evidence the classifier is context-scoped, not command-scoped
On 2026-04-25 ~19:58, immediately after a clean Commerce7 image-upload batch, the very next turn was:

```
date +%Y-%m-%dT%H:%M:%S%z
```

That command was flagged for cybersecurity risk. A bare `date` invocation is not a credible cyber signal under any reasonable threat model, which strongly suggests the classifier is operating on conversation history rather than the prompt in front of it. Once a session is "poisoned" by trigger-keyword adjacency, every subsequent turn appears to inherit the flag regardless of content. This should be reproducible in any session that has touched WordPress, cPanel, or staging-deploy vocabulary earlier in the same conversation.

### Trigger keyword cluster (Codex's own self-diagnosis)
"production clone," "staging," "secret store," "Facebook signin," "traverse," "blocked public route," "SSH deploys," "WordPress operations." Stripped of ownership context, these read like credentialed web automation. With ownership context — which is in the session preamble — they're vanilla devops on personally-owned assets.
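To make the hypothesis concrete, here is a toy model of the difference between command-scoped and context-scoped scoring. It is a sketch under stated assumptions, not OpenAI's classifier: the function names, the substring matching, and the threshold are all hypothetical, and the only data taken from this report is the keyword list and the `date` command. The two history turns are made-up examples of routine deploy prompts.

```python
# Toy model only: this is NOT OpenAI's classifier. Function names, matching
# logic, and threshold are hypothetical illustrations of the hypothesis.
TRIGGER_KEYWORDS = (
    "production clone", "staging", "secret store", "facebook signin",
    "traverse", "blocked public route", "ssh deploys", "wordpress operations",
)


def keyword_hits(text: str) -> int:
    """Count how many trigger phrases appear in the text (case-insensitive)."""
    lowered = text.lower()
    return sum(1 for phrase in TRIGGER_KEYWORDS if phrase in lowered)


def command_scoped_flag(history: list[str], current: str, threshold: int = 1) -> bool:
    """Score only the current turn; earlier turns are ignored."""
    return keyword_hits(current) >= threshold


def context_scoped_flag(history: list[str], current: str, threshold: int = 1) -> bool:
    """Score the whole conversation, so earlier turns can flag a benign one."""
    return keyword_hits(" ".join(history + [current])) >= threshold


# Hypothetical earlier turns, followed by the command from this report.
history = [
    "Sync the staging site to production over rsync",
    "Run WordPress operations: wp plugin update --all",
]
current = "date +%Y-%m-%dT%H:%M:%S%z"

print(command_scoped_flag(history, current))  # False
print(context_scoped_flag(history, current))  # True
```

In this model the `date` turn is clean when scored on its own and flagged only when the earlier turns are included, which matches the behavior described above.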

### Confirmed false-positive surface (all on owned infrastructure)
- WordPress staging → production deploys (rsync, WP-CLI, plugin/theme updates)
- cPanel/WHM administration of my own hosting accounts
- Commerce7 e-commerce tenant administration
- LAN device inventory on my home/office network
- The `date` command above

### What I've already done
- Multiple in-CLI `/feedback` → "Safety Check" submissions with full session logs
- OpenAI support case **#08180548** — three email turns, escalated to a Tier 4 response that did not engage with the technical claim
- Declined to enroll in Trusted Access for Cyber. The category is wrong. Trusted Access exists for security research; I am administering my own WordPress site. Asking devops users to submit government ID and biometrics to run `wp plugin update` is the wrong fix for a classifier tuning problem — and per the sibling reports below, it doesn't actually fix it.

### Sibling reports — this is a cluster, not an isolated complaint
- #19533 — "False positive cyber-safety flag on benign software engineering work" (closed)
- #19403 — "False positive cyber-safety flag during passive product research on public webhosting documentation" (open)
- #19379 — "Trusted access enabled but Codex still shows high-risk cyber slowdown banner" (closed)
- #19324 — "False positive cyber mitigation despite Trusted Access verification" (closed)
- #19272 — "Flagged for Cyber, while already completed TAC" (closed)

Three of these explicitly note that Trusted Access enrollment did **not** stop the flagging. Most were closed without engaging the underlying classifier-tuning problem. Worth treating these as one signal rather than five tickets.
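For anyone triaging the cluster, the counts above can be checked with a short tally. This is a minimal sketch: the issue numbers and statuses are copied from the list above, and the Trusted Access column is my own reading of each report, so treat it as an assumption.

```python
from collections import Counter

# (issue number, status, mentions Trusted Access / TAC), copied from the list
# above. The third column is an assumption based on each report's title.
SIBLINGS = [
    (19533, "closed", False),
    (19403, "open", False),
    (19379, "closed", True),
    (19324, "closed", True),
    (19272, "closed", True),
]

status_counts = Counter(status for _, status, _ in SIBLINGS)
trusted_access = [number for number, _, mentions_ta in SIBLINGS if mentions_ta]

print(status_counts["closed"], status_counts["open"])  # 4 1
print(trusted_access)  # [19379, 19324, 19272]
```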

### Impact
Codex CLI was my daily driver. It no longer is. I've migrated active operational workloads to Claude Opus 4.7 + a local multi-agent setup, and the migration is complete — this report is feedback, not leverage. I'm filing it because the classifier behavior is diagnosable, the false-positive rate is high, and the next professional user about to hit the same wall deserves better triage than case #08180548 received.

Capitalism balances books. I'd rather Codex stay competitive — the product is good when the harness lets it work — but that's an OpenAI decision, not mine.

### Ask
Tune the `cyber_policy` classifier so that conversational-context drift doesn't escalate benign commands. Reopen support case #08180548 with engineering, or provide a notification channel for classifier retunes, so affected users know when to retest.

Happy to provide additional session logs, redacted prompt traces, or a 30-minute calibration call.

**Environment:** macOS 26.4.1, Codex CLI 0.125.0, ChatGPT Pro (not Enterprise)

<img width="1280" height="671" alt="Image" src="https://github.com/user-attachments/assets/cf584e6e-9284-442f-8419-0ca371a501c7" />

False positive "cybersecurity risk" flags on routine sysadmin/devops work in Codex CLI #19738

Description

Evidence the classifier is context-scoped, not command-scoped

Trigger keyword cluster (Codex's own self-diagnosis)

Confirmed false-positive surface (all on owned infrastructure)

What I've already done

Sibling reports — this is a cluster, not an isolated complaint

Impact

Ask

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

False positive "cybersecurity risk" flags on routine sysadmin/devops work in Codex CLI #19738

Description

Description

Evidence the classifier is context-scoped, not command-scoped

Trigger keyword cluster (Codex's own self-diagnosis)

Confirmed false-positive surface (all on owned infrastructure)

What I've already done

Sibling reports — this is a cluster, not an isolated complaint

Impact

Ask

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions