checkpoint: into wallentx/termux-target from release/0.129.0 @ 1a78ed0ef81a by unemployabot[bot] · Pull Request #88 · wallentx/codex-termux

unemployabot · 2026-05-06T01:32:57Z

Termux release checkpoint

Source branch: release/0.129.0
Source hash: 1a78ed0ef81a93de28e1e0b0e7063c063e057166
Destination branch: wallentx/termux-target
Remaining first-parent commits on source: 0

This PR carries release-train conflict fixes and follow-up changes back into the reusable Termux patch branch.

Release-only workflow files and metadata under .github were restored to the destination branch versions before opening this PR.

## Summary - Add a testable DNS lookup helper for the local or private host precheck while preserving production `lookup_host` behavior. - Add deterministic coverage for DNS timeout, lookup error, private resolution, and public resolution decisions. - Keep BUGB 15982 guarded without relying on ambient DNS timing or resolver behavior. ## Why BUGB 15982 was fixed by failing closed on DNS lookup errors and timeouts. The existing regression covered lookup failure through real DNS, but did not deterministically exercise the timeout branch. This PR adds a small injection point so CI can cover that branch without standing up slow authoritative DNS. ## Validation - `cargo test -p codex-network-proxy host_resolves_to_non_public_ip -- --nocapture` - `cargo test -p codex-network-proxy host_blocked_rejects_allowlisted_hostname_when_dns_lookup_fails -- --nocapture` - `cargo test -p codex-network-proxy` - `just fmt` - `just fix -p codex-network-proxy` - `git diff --check` ## Tickets - BUGB 15982 - https://linear.app/openai/issue/BUGB-15982/codex-dns-timeout-fail-open-in-codex-network-proxy-bypasses - Bugcrowd: https://tracker.bugcrowd.com/openai/submissions/b2bf131d-db04-478f-85aa-cdd17ca8f604

## Why The external startup/login surface for this auth path should talk about an access token instead of exposing the internal Agent Identity terminology. Users should pass `CODEX_ACCESS_TOKEN` or pipe a token into `codex login --with-access-token`; the old external env/flag spellings are removed so there is only one supported user-facing path. ## What Changed - Added `CODEX_ACCESS_TOKEN` as the supported environment variable for this auth path. - Added `codex login --with-access-token` as the supported stdin-based login command. - Removed the legacy `CODEX_AGENT_IDENTITY` env-var fallback and hidden `--with-agent-identity` CLI alias. - Updated CLI error, status, and stdin prompts to use access-token language. - Added coverage for access-token env loading, CLI login failure behavior, and renamed login status text. ## Validation - `cargo test -p codex-login` - `cargo test -p codex-cli` - `just fix -p codex-login` - `just fix -p codex-cli`

- Route `thread/metadata/update` through `ThreadStore::update_thread_metadata`. - Add `LocalThreadStore` git metadata patch support for set, partial update, and clear semantics. - Add some unit tests for the new thread store code - Remove a lot of dead code/tests!

## Summary - thread plugin skill roots through the skills loader with their plugin ID - store plugin ID on loaded skill metadata for plugin-provided skills - include plugin ID on skill invocation analytics events ## Test plan - cargo check -p codex-core-skills - cargo check -p codex-core -p codex-core-plugins -p codex-analytics - cargo check -p codex-tui - cargo check -p codex-plugin -p codex-core -p codex-core-plugins -p codex-analytics - cargo check -p codex-app-server - cargo test -p codex-analytics - HOME=/private/tmp/codex-empty-home cargo test -p codex-core-skills - just fix -p codex-core-skills - just fix -p codex-analytics - just fix -p codex-core-plugins - just fix -p codex-core - just fmt - git diff --check

…penai#20575) Migrate token usage replay, rollback responses, and detached review setup (a special case of forking) to be served from ThreadStore reads rather direct rollout files. - replay restored token usage from already-loaded `RolloutItem` history instead of reopening `Thread.path` - rebuild rollback responses from loaded `ThreadStore` snapshots and history - start detached reviews from store-backed parent history and stored review-thread metadata - remove obsolete app-server rollout-summary helper code that became dead after the store-backed migration - preserve response/notification ordering for resume, fork, rollback, and detached review flows - add integration test coverage for the affected paths

## Why Large hook outputs can enter model-visible context through hook-specific paths such as `additionalContext` and `Stop` continuation prompts. Without a dedicated cap, one hook can inject a large blob directly into conversation history instead of leaving a bounded preview for the model and preserving the full text elsewhere. ## What - spill hook text once it exceeds a fixed `2_500`-token budget, preserving the full output on disk and leaving a head/tail preview plus saved path in context - add shared hook-output spilling under `CODEX_HOME/hook_outputs/<thread_id>/<uuid>.txt` - apply the cap to both `additionalContext`, `feedback_message`, and `Stop` continuation fragments

## Why The model list needs to carry display-ready service tier metadata so clients can render tier choices with stable IDs, names, and descriptions. A raw speed-tier string list is not enough for richer UI copy or future tier labels. ## What changed - Added `ModelServiceTier` to shared model metadata with string `id`, `name`, and `description` fields. - Added `service_tiers` to `ModelInfo` and `ModelPreset`, preserving empty defaults for older cached model payloads. - Exposed `serviceTiers` on app-server v2 `Model` responses and threaded it through TUI app-server model conversion. - Marked legacy `additional_speed_tiers` / `additionalSpeedTiers` metadata as deprecated in source and generated schema output. - Regenerated app-server protocol JSON schema and TypeScript fixtures, including `ModelServiceTier.ts`. ## Verification - Ran `just write-app-server-schema`. - Did not run local tests per repo instruction; relying on PR CI. --------- Co-authored-by: Codex <noreply@openai.com>

## Why `list_dir` still carries a full spec/handler/test path, but nothing in the current model catalog advertises it via `experimental_supported_tools`. That leaves us maintaining an environment-backed tool surface that is effectively unused. ## What changed - delete the `list_dir` handler and its tests from `codex-core` - remove the `list_dir` spec builder, handler kind, and registry wiring from `codex-tools` - clean up the remaining internal README and registry tests so they no longer mention the removed tool

## Why The local memories root can contain implementation details such as `.git` plus incidental OS metadata like `.DS_Store`. Those entries are not authored memory content, so the memories MCP should keep them invisible instead of exposing them through normal discovery or direct lookup. Only for local implementation ofc ## What changed - Return `NotFound` for scoped `list`, `read`, and `search` requests that include a hidden path component. - Skip hidden files and directories while listing a directory or recursively searching the memories tree. - Add regression coverage for hidden files, hidden directories, and hidden scoped requests across `list`, `read`, and `search`. ## Testing - Added focused regression tests in `memories/mcp/src/local_tests.rs` covering hidden-path behavior across the affected APIs.

## Why Memory search currently supports either independent substring matches or requiring every query to appear on the same line. That is too restrictive for memory files where related terms often land on nearby lines in the same note or bullet block. ## What changed - Replace the old `all` match mode with explicit tagged modes: `all_on_same_line` and `all_within_lines { line_count }`. - Add windowed matching in `codex-rs/memories/mcp/src/local.rs` so callers can require every query to appear within a bounded line range while returning only the minimal qualifying windows. - Reject invalid zero-width windows and update the MCP tool description plus argument parsing to expose the new mode. - Add coverage for same-line matching, windowed matching, and invalid `line_count` input. ## Verification - Added targeted coverage in `codex-rs/memories/mcp/src/local_tests.rs` for `search_supports_all_within_lines_match_mode` and `search_rejects_zero_line_window`. - Added server-side parsing coverage in `codex-rs/memories/mcp/src/server.rs` for `search_args_accept_windowed_all_match_mode`.

## Why Memory search currently treats separators literally, so callers need to know whether a stored term uses spaces, hyphens, or no separators at all. That makes recall brittle for terms such as `MultiAgentV2` vs. `multi agent v2` and `cold-resume` vs. `cold resume`. ## What changed - Add an opt-in `normalized` mode to memory search that removes non-alphanumeric separators after any requested case folding. - Thread the new flag through the MCP `search` tool into the local backend while keeping existing literal matching as the default. - Reject queries that normalize to an empty string, and add regression coverage for both normalized matching and that validation path. ## Testing - `cargo test -p codex-memories-mcp`

## Summary - prefer tmux's native clipboard integration for `/copy` when running inside tmux - fall back to OSC 52 when tmux clipboard copy is unavailable - add coverage for tmux-preferred, fallback, and combined-failure paths ## Why Inside tmux, `/copy` previously relied on DCS-wrapped OSC 52 when `TMUX` was set. That only reaches the outer terminal when tmux passthrough is enabled, so Codex could report success even though the system clipboard never changed. ## User impact `/copy` now works inside tmux even when `allow-passthrough` is off, as long as tmux clipboard integration is available. If tmux cannot handle the copy, Codex still keeps the existing OSC 52 fallback path. ## Validation - `cargo test -p codex-tui` - `just fmt` - `just fix -p codex-tui` - `just argument-comment-lint` - manually verified `/copy` inside tmux with `allow-passthrough off` Fixes openai#19926

## Why Adding goal metrics makes it possible to track how often goals are created, completed, and stopped by budget limits, plus the final token and wall-clock usage for terminal outcomes. ## What Changed - Added OpenTelemetry metric constants for goal lifecycle tracking: - `codex.goal.created`: increments each time a new persisted goal is created or an existing goal is replaced with a new objective. - `codex.goal.completed`: increments when a goal transitions to `complete`. - `codex.goal.budget_limited`: increments when a goal transitions to `budget_limited` because its token budget has been reached. - `codex.goal.token_count`: records the final persisted token count when a goal transitions to `complete` or `budget_limited`. - `codex.goal.duration_s`: records the final persisted elapsed wall-clock time, in seconds, when a goal transitions to `complete` or `budget_limited`. - Emitted creation metrics when a goal is created or replaced. - Emitted terminal outcome counters and final usage histograms when a goal transitions to `complete` or `budget_limited`, avoiding double-counting later in-flight accounting for already budget-limited goals. - Added focused `codex-core` tests for create/complete metrics and one-time budget-limit metrics.

## Why Long `/goal` definitions currently reach lower-level goal validation and can produce an opaque failure. This bug was reported by a user. Pasted instruction blocks are especially confusing because the composer can still contain a paste placeholder before expansion, which may otherwise fall into the generic prompt-size error path. There was also a related paste edge case where `/goal ` followed by a multiline block whose first pasted line was blank looked like a bare `/goal` command. That showed the goal usage/summary instead of setting the pasted objective. ## What Changed This adds TUI-side preflight validation for `/goal <objective>` using the shared `MAX_THREAD_GOAL_OBJECTIVE_CHARS` limit. Oversized typed, queued, and pasted goal objectives now fail locally with a goal-specific message that recommends putting longer instructions in a file and referencing that file from the goal. The TUI now also lets inline-argument slash commands consume later-line arguments before treating the first line as a bare command, so `/goal ` followed by blank lines and then objective text sets the goal instead of opening the bare `/goal` flow. ## Manual Testing 1. Start the TUI with goals enabled and an active session. 2. Submit `/goal ` followed by exactly 4,000 objective characters. It should continue through the normal goal-setting path. 3. Submit `/goal ` followed by 4,001 objective characters. It should not set a goal, and should show `Goal objective is too long: 4,001 characters. Limit: 4,000 characters.` followed by the guidance to put longer instructions in a file and reference that file from the goal. 4. Type `/goal `, paste a large block that becomes a `[Pasted Content ... chars]` placeholder, then submit. It should validate the expanded pasted text and show the goal-specific file guidance rather than the generic prompt-size error. 5. Type `/goal `, paste a multiline block whose first line is blank, then submit. It should set the objective from the non-blank pasted content instead of showing `Usage: /goal <objective>` or the bare goal summary. 6. While a turn is running, queue an oversized `/goal` command. When the queue drains, it should show the same goal-specific error and should not emit a goal-setting request.

## Why The desktop app on Windows needs a read-only way to tell, before the next tool call, whether the local Windows sandbox setup is in a state that should block the user and ask for setup again. The main case we want to cover is the elevated sandbox setup version bump. Today, if the app is configured for elevated Windows sandboxing and the installed setup is stale, the next sandboxed shell/exec path can end up triggering the elevated setup flow directly. That means the user can see an unexpected UAC prompt with no UI explanation. This change adds a small app-server preflight so the desktop app can ask “is Windows sandbox ready, not configured, or update-required?” during startup and show the appropriate blocking UI before the user hits a tool call. ## What changed - Added a new read-only app-server RPC: `windowsSandbox/readiness` - Added a new protocol enum and response type: - `WindowsSandboxReadiness` - `WindowsSandboxReadinessResponse` - Added core readiness logic in `core/src/windows_sandbox.rs`: - `ready` - `notConfigured` - `updateRequired` - Wired the new request through `codex_message_processor` - Regenerated the vendored app-server schema fixtures ## Readiness semantics This is intentionally a coarse startup/version-bump readiness check, not a full predictor of every runtime repair case. For now, readiness is determined from: - the configured Windows sandbox level - `sandbox_setup_is_complete()` for elevated mode That means: - `disabled` maps to `notConfigured` - `restricted token` maps to `ready` - `elevated` maps to `ready` or `updateRequired` depending on `sandbox_setup_is_complete()` This is deliberate for the first UI integration because the common case we want to catch is “the app updated, the elevated setup version bumped, and the user should see an update-required blocker instead of a surprise UAC prompt”. It does not attempt to model every case where the deeper runtime path might decide to repair or re-run setup. ## Testing - Ran `cargo fmt --all -- app-server-protocol/src/protocol/common.rs app-server-protocol/src/protocol/v2.rs app-server/src/codex_message_processor.rs core/src/windows_sandbox.rs core/src/windows_sandbox_tests.rs` - Added unit tests for the pure readiness mapping in `core/src/windows_sandbox_tests.rs` - Regenerated vendored schema fixtures with `cargo run -p codex-app-server-protocol --bin write_schema_fixtures -- --schema-root app-server-protocol/schema` - Did not run the full cargo test suite

# Why `PreToolUse` already exposes `hookSpecificOutput.additionalContext` in the generated hook schema, but the runtime still rejected it as unsupported. That leaves `PreToolUse` out of step with the other context-injecting hooks and prevents hook authors from attaching model-visible guidance to a pending tool call before it runs. # What - Parse `PreToolUse.additionalContext` and carry it through the hook event pipeline. - Record `PreToolUse` context at the hook boundary so successful context is preserved for both allowed and blocked calls without widening the tool registry surface. - Preserve existing deny behavior when context is combined with either `permissionDecision: "deny"` or the legacy `decision: "block"` shape.

@Sungyoun-Kim

…i#21091) Fixes openai#19940. Large-paste placeholder numbering was backed by a per-size counter, so clearing a draft with `Ctrl+C` left numbering state behind even though the active pending paste state was gone. This updates the composer to derive the next placeholder suffix from active pending pastes instead, which keeps simultaneous same-size pastes distinct while letting fresh drafts reuse the base label. This is also a small code cleanup: pending paste state is now the source of truth instead of maintaining a separate counter. Credit to @Sungyoun-Kim for the issue report, root-cause notes, and fork with the proposed fix, and to @charley-oai for the earlier related openai#10032 proposal. Changes: - Remove the monotonic large-paste counter from the composer. - Compute suffixes from currently active pending paste placeholders. - Document large-paste placeholder behavior in the composer module docs. - Add regression coverage for `Ctrl+C` clearing and deletion/reset behavior. Testing: - `just fmt` - `git diff --check`

@chanwooyang1

Fixes openai#20945. This keeps `codex fork --last` aligned with the neighboring latest-session lookup flows. The local fork path now uses the same cwd-scope helper as `resume --last`, which is also a small code cleanup around how this selection logic is shared. Credit to @chanwooyang1 for the report and for pointing out the narrow fix direction. What changed: - Route `fork --last` through the shared latest-session cwd filter. - Preserve `--all` as the explicit opt-in for global latest-session selection. - Keep remote cwd override behavior unchanged. - Add focused coverage for local default, `--all`, and remote override filter semantics. Validation: - Ran `just fmt`. - Ran `git diff --check`. - Reviewed the `fork --last`, `resume --last`, and fork picker selection paths against the issue report.

# Why Revert openai#20524 for now because the computer use plugin has not migrated off legacy `notify` yet. Keeping the deprecation in place today would show users a warning before the plugin path is ready to move, so this rolls the change back until that migration is complete. # What - revert the legacy `notify` deprecation change from openai#20524 - restore the prior `notify` behavior and remove the temporary deprecation metrics/docs from that change Once the computer use plugin has migrated, we can land the same deprecation again.

…i#21190) ## Why We found this while reviewing openai#21091, but confirmed it is not introduced by that PR: the order-sensitive `current_text_with_pending()` replacement loop already existed, and `main` already allowed active same-size large pastes to use prefix-overlapping labels such as `[Pasted Content N chars]` and `[Pasted Content N chars] #2`. openai#21091 fixes placeholder numbering after a draft is cleared, so a fresh same-size paste can reuse the base label. This PR fixes a different path: when a draft already contains multiple active same-size large pastes, the placeholders can overlap by prefix, for example `[Pasted Content N chars]` and `[Pasted Content N chars] #2`. That overlap breaks `current_text_with_pending()` when the composer materializes the draft text for the external editor. Replacing the base placeholder first can partially rewrite the `#2` placeholder, leaving the external editor seeded with corrupted text instead of both paste payloads. | Before | After | |---|---| | <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 18 09" src="https://github.com/user-attachments/assets/88a2936c-cf00-4adc-8567-8fd8f398b4a8" /> | <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 20 31" src="https://github.com/user-attachments/assets/119cff52-43c8-432a-9367-418d82f4ed82" /> | | <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 18 57" src="https://github.com/user-attachments/assets/026031bb-839b-4252-a0fd-9ba9616435fe" /> | <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 21 31" src="https://github.com/user-attachments/assets/8cb6f2c8-3a5d-411b-8623-dca666ee3c08" /> | ## What Changed - Changed `current_text_with_pending()` to expand pending pastes through the existing element-range based `expand_pending_pastes()` helper instead of global string replacement. - Added a regression test with two different same-length large pastes to ensure both overlapping placeholders expand to their original payloads. ## How to Test 1. Start Codex TUI. 2. Paste a large string, for example 1004 `A` characters. ```shell perl -e 'print "A" x 1004' | pbcopy ``` 3. Paste a second large string with the same length, for example 1004 `B` characters. ```shell perl -e 'print "B" x 1004' | pbcopy ``` 4. Open the external editor from the composer. 5. Confirm the editor is seeded with the full `A...` payload followed by the full `B...` payload, with no literal `#2` left behind. Targeted tests: - `cargo test -p codex-tui current_text_with_pending_expands_overlapping_placeholders` - `just argument-comment-lint-from-source -p codex-tui` I also ran `cargo test -p codex-tui`; it reached the full crate suite but failed two unrelated local status tests because this machine's `/etc/codex/requirements.toml` rejects `DangerFullAccess`.

) ## Why Linux startup runs an advisory system `bwrap` warning probe on each launch. On hosts with NFS or autofs mounts, its `--ro-bind / /` probe can take tens of seconds before Codex prints anything, matching openai#19828. Because this probe only decides whether to surface a warning, it should not be allowed to stall startup. Relevant pre-change path: [`codex-rs/sandboxing/src/bwrap.rs`](https://github.com/openai/codex/blob/de2ccf94735a3d8a2a7077e6a5292026413867cf/codex-rs/sandboxing/src/bwrap.rs#L64-L80) ## What changed - Bound the advisory system `bwrap` probe to 500 ms. - Preserve the existing warning behavior when `bwrap` promptly reports a known user-namespace failure. - Kill and reap the probe child on timeout, then suppress the advisory warning instead of blocking startup. - Read probe stderr with a bounded nonblocking drain so descendants that inherit the pipe cannot extend startup after the probe child exits. - Add regression coverage for both a deliberately slow fake `bwrap` process and a fake probe whose descendant keeps stderr open. ## Security This only bounds the advisory startup probe. It does not change the command execution path or add a fail-open sandbox fallback. The related command-side hang in openai#20017 remains separate from this PR. ## Verification - Added `system_bwrap_probe_times_out_without_reporting_a_warning`. - Added `system_bwrap_probe_does_not_wait_for_descendants_holding_stderr_open`. - `cargo test -p codex-sandboxing` - `cargo clippy -p codex-sandboxing --all-targets -- -D warnings` Fixes openai#19828 Related: openai#20017

Termux rust-v0.129.0-alpha.7

…nt/wallentx_termux-target_from_release_0.129.0_1a78ed0ef81a # Conflicts: # .github/workflows/rust-release.yml # .github/workflows/shell-tool-mcp.yml # codex-rs/Cargo.toml # scripts/termux-create-checkpoint-pr.sh

evawong-oai and others added 27 commits May 4, 2026 19:03

Add turn_id to Codex skill invocation analytics (openai#21122)

7e71d02

Release 0.129.0-alpha.7

407cefb

Seed Termux release automation

032cc07

Prepare Termux rust-v0.129.0-alpha.7

cdf527e

Merge pull request #86 from wallentx/upstream/rust-v0.129.0-alpha.7

1a78ed0

Termux rust-v0.129.0-alpha.7

unemployabot Bot requested a review from wallentx May 6, 2026 01:32

unemployabot Bot added checkpoint Checkpoint merge termux-release Termux release automation labels May 6, 2026

wallentx merged commit 79db502 into wallentx/termux-target May 6, 2026
1 check passed

wallentx deleted the checkpoint/wallentx_termux-target_from_release_0.129.0_1a78ed0ef81a branch May 6, 2026 01:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

checkpoint: into wallentx/termux-target from release/0.129.0 @ 1a78ed0ef81a#88

checkpoint: into wallentx/termux-target from release/0.129.0 @ 1a78ed0ef81a#88
wallentx merged 27 commits intowallentx/termux-targetfrom
checkpoint/wallentx_termux-target_from_release_0.129.0_1a78ed0ef81a

unemployabot Bot commented May 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

14 participants

Conversation

unemployabot Bot commented May 6, 2026

Termux release checkpoint

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

14 participants