
Fix Codex session rehydration leaking AGENTS.md/internal context into user-visible history #488

Merged
viper151 merged 1 commit into main from fix/do-not-show-agents-md-file-for-codex-sessions
Mar 5, 2026

blackmammoth (Collaborator) commented Mar 5, 2026

Summary

This PR fixes a Codex history parsing bug in which reloading a session could show hidden internal prompt context (for example, AGENTS.md instructions) that never appeared in live websocket responses. The fix aligns restored history with live behavior by sourcing user-visible user messages from event_msg.user_message entries of kind plain, and by accepting response_item.message only when the role is assistant.

Issue

  • Repro:
  1. Send Hi.
  2. UI shows correct assistant reply from websocket.
  3. Refresh and reopen session.
  4. Unexpected AGENTS.md content appears as user input in history.
  • Impact:
    • False user-visible history.
    • Confusing conversation timeline.
    • Potential accidental display of internal prompt scaffolding.

Root cause analysis

  • Live rendering path:
    • server/openai-codex.js:26-120 transforms Codex stream items.
    • src/components/chat/hooks/useChatRealtimeHandlers.ts:774-903 renders assistant/tool/reasoning items.
  • Reload path:
    • src/components/chat/hooks/useChatSessionState.ts:100-121 calls session history endpoint.
    • src/utils/api.js:51-67 routes codex to /api/codex/sessions/:sessionId/messages.
    • server/routes/codex.js:71-83 calls getCodexSessionMessages.
    • server/projects.js previously reconstructed user messages from response_item.message.
  • Codex JSONL nuance:
    • response_item role=user may include expanded internal context (AGENTS/system/developer/env payloads).
    • event_msg user_message with kind=plain is the canonical end-user text.
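
To make the two record kinds concrete, here is a minimal sketch. The field layout (`type`, `payload.role`, `payload.content[].text`, `payload.message`) is an assumption inferred from the description above rather than a copy of the real Codex JSONL schema, and `parseJsonl` and `describeEntry` are hypothetical helpers that do not exist in the repository.

```javascript
// Hypothetical JSONL lines; the field layout is an assumption for illustration.
const sampleJsonl = [
  JSON.stringify({
    type: "response_item",
    payload: {
      type: "message",
      role: "user",
      content: [{ type: "input_text", text: "# AGENTS.md instructions ..." }],
    },
  }),
  JSON.stringify({
    type: "event_msg",
    payload: { type: "user_message", kind: "plain", message: "Hi" },
  }),
].join("\n");

// Parse JSONL text into objects, skipping blank or malformed lines.
function parseJsonl(text) {
  const entries = [];
  for (const line of text.split("\n")) {
    if (!line.trim()) continue;
    try {
      entries.push(JSON.parse(line));
    } catch {
      // Ignore lines that are not valid JSON.
    }
  }
  return entries;
}

// Label each entry by whether it carries canonical end-user text.
function describeEntry(entry) {
  const payload = entry.payload || {};
  if (entry.type === "event_msg" && payload.type === "user_message") {
    return "canonical user text";
  }
  if (entry.type === "response_item" && payload.role === "user") {
    return "model input (may include internal context)";
  }
  return "other";
}

const labels = parseJsonl(sampleJsonl).map(describeEntry);
console.log(labels);
```

The point of the sketch is that both records look like "user" content, yet only the event_msg record reflects what the person actually typed.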

Solution

  1. Introduce visibility filter:
    • server/projects.js:1493-1513
    • New isVisibleCodexUserMessage(payload) ensures only real user-facing plain messages are accepted.
  2. Session summary/message count:
    • server/projects.js:1550-1556
    • Count/summarize from visible user messages only.
  3. History reconstruction changes:
    • server/projects.js:1659-1669 add user messages from event_msg.user_message only.
    • server/projects.js:1671-1692 parse response_item.message only when role is assistant.
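
A guard of this kind might look roughly like the sketch below. It is not the code from server/projects.js: the function is deliberately named `isVisibleCodexUserMessageSketch`, the text field `payload.message` is an assumption, and `summarizeSession` (including its choice of the first visible message as the summary) is a hypothetical helper used only to show how counting from visible messages could work.

```javascript
// Sketch only: not the implementation in server/projects.js.
// Assumption: the user's text is stored in payload.message.
function isVisibleCodexUserMessageSketch(payload) {
  if (!payload || payload.type !== "user_message") return false;
  // Accept only plain messages; a missing kind is treated as plain.
  if (payload.kind != null && payload.kind !== "plain") return false;
  const text = typeof payload.message === "string" ? payload.message.trim() : "";
  if (!text) return false;
  // Reject environment scaffolding even if it arrives as a user_message.
  if (text.includes("<environment_context>")) return false;
  return true;
}

// Hypothetical helper: derive a summary and a count from visible user messages only.
// Assumption: the summary is the first visible user message.
function summarizeSession(entries) {
  const visible = entries
    .filter((e) => e.type === "event_msg" && isVisibleCodexUserMessageSketch(e.payload))
    .map((e) => e.payload.message.trim());
  return {
    userMessageCount: visible.length,
    summary: visible.length > 0 ? visible[0] : null,
  };
}

const demo = summarizeSession([
  { type: "response_item", payload: { type: "message", role: "user" } },
  { type: "event_msg", payload: { type: "user_message", kind: "plain", message: "Hi" } },
  { type: "event_msg", payload: { type: "user_message", message: "<environment_context> ..." } },
]);
console.log(demo);
```

Keeping the check in one predicate means the summary, the count, and the history view cannot drift apart on what counts as a user message.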

Concrete examples

Example (target bug):

  • Input: Hi
  • Reloaded history before this fix:
    • user: # AGENTS.md instructions ... (incorrect)
    • assistant: Hi. What do you want to work on?
  • Reloaded history after this fix:
    • user: Hi
    • assistant: Hi. What do you want to work on?

Manual QA checklist

  1. Start new Codex session and send Hi.
  2. Confirm live websocket rendering remains unchanged.
  3. Refresh page and reopen same session.
  4. Confirm no AGENTS/environment scaffold appears.
  5. Confirm user Hi + assistant response are preserved.
  6. Test one older Codex session for backward compatibility.

Summary by CodeRabbit

  • Bug Fixes
    • Improved filtering of session messages to prevent internal content and system context from appearing in user-facing displays.
    • Enhanced extraction and proper handling of assistant responses, reasoning processes, and tool interactions in session data.
    • Refined message type handling to ensure only user-visible interactions are surfaced appropriately.

…loading Codex sessions

Problem
- Users see correct assistant output during live websocket streaming.
- After refresh, restored session history can include hidden internal prompt content
  (for example: "# AGENTS.md instructions ...", "<environment_context> ...")
  that was never rendered in the live UI.
- This creates a confusing mismatch between live conversation and rehydrated history.

Root cause
- Realtime path and history path were sourcing messages differently:
  - Realtime rendering (codex-response item events) shows assistant-visible content only.
    - server/openai-codex.js:26-120
    - src/components/chat/hooks/useChatRealtimeHandlers.ts:774-903
  - Refresh path loads persisted JSONL and reconstructs chat from /api/codex/sessions/:id/messages:
    - src/utils/api.js:51-67
    - src/components/chat/hooks/useChatSessionState.ts:100-121
    - server/routes/codex.js:71-83
- In Codex JSONL, `response_item` entries with role=user can include expanded internal context
  (AGENTS instructions, env context, system/developer scaffolding), not only end-user text.
- Previous parser consumed `response_item.message` for both roles and only filtered
  `<environment_context>` by string match, which is incomplete and let AGENTS content through.
  - server/projects.js (before): around 1637-1658
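
The weakness of the string-match approach can be illustrated with a small sketch. This is a reconstruction from the description above, not the removed code: `legacyVisibleMessages` and `textOf` are hypothetical names, the `payload.content[].text` layout is an assumption, and the sample assumes the AGENTS block arrives in a record that does not contain the `<environment_context>` tag.

```javascript
// Assumption: message text lives in payload.content[].text.
function textOf(payload) {
  return (payload.content || []).map((part) => part.text || "").join("\n");
}

// Sketch of the earlier approach as described: take response_item messages
// for both roles and drop only text containing "<environment_context>".
function legacyVisibleMessages(entries) {
  return entries
    .filter((e) => e.type === "response_item" && e.payload && e.payload.type === "message")
    .map((e) => ({ role: e.payload.role, text: textOf(e.payload) }))
    .filter((m) => !m.text.includes("<environment_context>"));
}

const entries = [
  { type: "response_item", payload: { type: "message", role: "user", content: [{ text: "# AGENTS.md instructions ..." }] } },
  { type: "response_item", payload: { type: "message", role: "user", content: [{ text: "<environment_context> ..." }] } },
  { type: "response_item", payload: { type: "message", role: "assistant", content: [{ text: "Hi. What do you want to work on?" }] } },
];

// The AGENTS block survives because it never matches the one filtered string.
const leaked = legacyVisibleMessages(entries);
console.log(leaked.map((m) => `${m.role}: ${m.text}`));
```

A deny-list of known scaffold strings has to anticipate every kind of injected context, which is why the fix switches to an allow-list of plain user messages instead.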

What changed
1. Added a strict visibility guard for user messages:
   - `isVisibleCodexUserMessage(payload)` in server/projects.js:1493-1513
   - Accepts only:
     - payload.type === "user_message"
     - kind is absent or "plain"
     - non-empty text
     - not an environment_context scaffold
2. Updated Codex session metadata parsing to use visible user messages only:
   - server/projects.js:1550-1556
   - Session summary/messageCount now reflect real user prompts, not internal injected content.
3. Updated Codex history reconstruction:
   - User messages now come from `event_msg.user_message` only:
     - server/projects.js:1659-1669
   - `response_item.message` is now accepted only for assistant role:
     - server/projects.js:1671-1692
   - This aligns restored history with live websocket behavior.
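
The reconstruction rule in item 3 can be sketched as follows. This is an illustration under stated assumptions, not the code in server/projects.js: `rebuildHistory` and `isPlainUserMessage` are hypothetical names, user text is assumed to live in `payload.message`, and assistant text in `payload.content[].text`.

```javascript
// Sketch only. The field names below are assumptions, not the confirmed schema.
function isPlainUserMessage(payload) {
  if (!payload || payload.type !== "user_message") return false;
  if (payload.kind != null && payload.kind !== "plain") return false;
  return typeof payload.message === "string" && payload.message.trim() !== "";
}

function rebuildHistory(entries) {
  const history = [];
  for (const entry of entries) {
    const payload = entry.payload || {};
    if (entry.type === "event_msg" && isPlainUserMessage(payload)) {
      // User turns come only from event_msg user_message records.
      history.push({ role: "user", text: payload.message.trim() });
    } else if (
      entry.type === "response_item" &&
      payload.type === "message" &&
      payload.role === "assistant"
    ) {
      // response_item messages are accepted only for the assistant role.
      const text = (payload.content || []).map((p) => p.text || "").join("\n").trim();
      if (text) history.push({ role: "assistant", text });
    }
    // response_item entries with role=user are deliberately skipped.
  }
  return history;
}

const history = rebuildHistory([
  { type: "response_item", payload: { type: "message", role: "user", content: [{ text: "# AGENTS.md instructions ..." }] } },
  { type: "event_msg", payload: { type: "user_message", kind: "plain", message: "Hi" } },
  { type: "response_item", payload: { type: "message", role: "assistant", content: [{ text: "Hi. What do you want to work on?" }] } },
]);
console.log(history);
```

Because each role now has exactly one source record type, the expanded model input never has a path into the restored conversation.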

Behavior examples

Example A (simple "Hi" flow)
- JSONL contains:
  - response_item role=user => "# AGENTS.md instructions ... + <environment_context> ..."
  - event_msg user_message kind=plain => "Hi"
  - response_item role=assistant => "Hi. What do you want to work on?"
- Before:
  - Restored user history could show AGENTS.md block.
- After:
  - Restored history shows only:
    - user: "Hi"
    - assistant: "Hi. What do you want to work on?"

coderabbitai bot (Contributor) commented Mar 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

📥 Commits

Reviewing files that changed from the base of the PR and between 2444209 and 36a77a3.

📒 Files selected for processing (1)
  • server/projects.js

📝 Walkthrough


This change introduces a centralized visibility filter (isVisibleCodexUserMessage()) to standardize how Codex user messages are identified and extracted in session parsing. The filter ensures only plain, non-empty messages are treated as user input, and is applied throughout message counting, extraction, and storage operations.

Changes

  • Cohort: Codex Message Visibility Filter
  • File(s): server/projects.js
  • Summary: Adds the isVisibleCodexUserMessage() helper function to filter user message payloads, and refactors parseCodexSessionFile() and getCodexSessionMessages() to apply this filter consistently when identifying user inputs and extracting assistant responses, reasoning, tool calls, and custom tool call data.

Suggested reviewers

  • viper151

Poem

🐰 A filter now guards the message stream,
Only plain words fulfill the dream,
No hidden context leaks through the cracks,
Just user and assistant, clean message tracks!
Codex sessions, now crystalline clear.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 75.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title clearly and specifically describes the main bug being fixed: preventing internal context (AGENTS.md/scaffolding) from leaking into user-visible session history during Codex rehydration.


blackmammoth (Collaborator, Author) commented

Resolves #484, #428

@viper151 viper151 merged commit 64a96b2 into main Mar 5, 2026
5 checks passed
@blackmammoth blackmammoth deleted the fix/do-not-show-agents-md-file-for-codex-sessions branch March 5, 2026 10:49