Skip to content

Save and upload codex conversation transcript#9697

Merged
liliwilson merged 3 commits intomasterfrom
lili/remote-1504-save-and-upload-codex-conversation-transcript
May 1, 2026
Merged

Save and upload codex conversation transcript#9697
liliwilson merged 3 commits intomasterfrom
lili/remote-1504-save-and-upload-codex-conversation-transcript

Conversation

@liliwilson
Copy link
Copy Markdown
Contributor

@liliwilson liliwilson commented May 1, 2026

Description

Add transcript upload for codex cloud agent runs, mirroring what we already have for claude transcripts.

Of note:

  • codex nests session JSONL files within YYYY/MM/DD folders, we walk the dirs here to find the session JSONL before the first snapshot upload. This first walk shouldn't be expensive in practice—we are running these in a context where there shouldn't be any other sessions (cloud agent).
  • We cache some metadata that we parse from the transcript on the harness runner like the codex version and cwd since we don't expect that to change over the course of the session.

Testing

Tested by running locally and confirming that the codex transcripts do get uploaded to GCS in the format that we expect. Also added unit tests for parsing metadata.

Agent Mode

  • Warp Agent Mode - This PR was created via Warp's AI Agent Mode

Capture the codex session UUID via the codex-cli-sidecar SessionStart event
and upload the on-disk rollout to the server alongside the block snapshot
during periodic and final saves.

REMOTE-1504

Co-Authored-By: Oz <oz-agent@warp.dev>
@cla-bot cla-bot Bot added the cla-signed label May 1, 2026
Copy link
Copy Markdown
Contributor Author

liliwilson commented May 1, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@liliwilson liliwilson requested a review from oz-agent May 1, 2026 00:51
@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 1, 2026

@liliwilson

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and posted feedback on this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds Codex transcript discovery, envelope serialization, and upload alongside existing block snapshot saves, plus unit coverage for the transcript helper module.

Concerns

  • parse_session_meta returns metadata for any first entry with a payload, so an early read where the first parsed entry is not session_meta can cache default cwd/version forever and make all later uploads keep incorrect metadata.

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread app/src/ai/agent_sdk/driver/harness/codex_transcript.rs Outdated
@liliwilson liliwilson mentioned this pull request May 1, 2026
1 task
@liliwilson liliwilson requested a review from bnavetta May 1, 2026 05:34
@liliwilson liliwilson marked this pull request as ready for review May 1, 2026 05:34
@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 1, 2026

@liliwilson

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and posted feedback on this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds Codex transcript discovery, envelope serialization, and upload alongside the existing block snapshot save path, with unit coverage for root resolution, rollout discovery, and metadata parsing.

Concerns

  • Rollout discovery returns matching directory entries without rejecting symlinks, so a writable Codex sessions tree can redirect the privileged upload path to another JSON/JSONL file.

Security

  • The transcript file finder should only accept regular files before returning a path for upload.

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread specs/REMOTE-1504/TECH.md Outdated
Three `OnceLock` fields added for lazy, set-once caching:
- `session_id: OnceLock<Uuid>` — captured from `CLIAgentSessionsModel` when hooks emit `SessionStart`
- `transcript_path: OnceLock<PathBuf>` — resolved by `find_session_file` on first save, cached thereafter
- `session_metadata: OnceLock<CodexSessionMetadata>` — parsed from JSONL first line, cached thereafter
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're reading the JSONL file already, I'm not sure we gain as much from caching this?

Caching the transcript path makes sense though

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's fair—I can remove that

Ok(None)
}

fn read_subdirs(parent: &Path) -> io::Result<Vec<PathBuf>> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: thoughts on returning an io::Result<impl Iterator<Item = PathBuf>>> so we can wrap the fs::read_dir iterator instead of buffering into a Vec?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh nice! Will do

@Qodo-Free-For-OSS
Copy link
Copy Markdown

Hi, CodexHarnessRunner::resolve_transcript_path converts both JoinError and filesystem errors into None, so transcript upload failures become indistinguishable from “rollout file not yet written” and silently skip uploads.

Severity: action required | Category: reliability

How to fix: Preserve/log resolution errors

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

resolve_transcript_path currently swallows both JoinError and underlying IO/errors and returns None, which downstream is treated as “no rollout yet”. This can permanently prevent transcript uploads without any actionable logs.

Issue Context

Transcript uploads are best-effort early in a run, but real errors (permissions, home dir resolution, read_dir errors, spawn_blocking panic) should be observable and ideally retriable.

Fix Focus Areas

  • app/src/ai/agent_sdk/driver/harness/codex.rs[154-171]
  • app/src/ai/agent_sdk/driver/harness/codex.rs[316-323]

What to change

  • Replace .await.ok().and_then(|r| r.ok()) with explicit error handling:
    • If spawn_blocking join fails, log a warn (include session_id) and return None.
    • If the inner Result is Err, log a warn with context and return None.
  • Optionally, return Result<Option<PathBuf>> from resolve_transcript_path and propagate non-NotFound-ish errors up so the save path can report them (matching Claude’s with_context behavior).

Found by Qodo. Free code review for open-source maintainers.

@liliwilson liliwilson merged commit 2bdbb61 into master May 1, 2026
25 checks passed
@liliwilson liliwilson deleted the lili/remote-1504-save-and-upload-codex-conversation-transcript branch May 1, 2026 20:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants