Remote SSH Desktop reuses stale app-server and masks model_not_found as Reconnecting #19300

@hans43564334-pixel

What happened?

When using Codex Desktop with a remote SSH project, the UI got stuck showing reconnect retries after the first message:

Reconnecting... 2/5
Reconnecting... 3/5
Reconnecting... 4/5

From the UI, this looks like a transport/WebSocket problem. However, the remote Codex logs showed the real error was a model/catalog/access error:

stream disconnected - retrying sampling request (1/5 ... 5/5)
Turn error: stream disconnected before completion: The model `gpt-5.5` does not exist or you do not have access to it.

The app-server WebSocket event also contained:

{"type":"error","error":{"type":"invalid_request_error","code":"model_not_found","message":"The model `gpt-5.5` does not exist or you do not have access to it."}}

The confusing part: running the remote CLI directly with the same model worked:

codex exec -C <remote-project> --skip-git-repo-check -m gpt-5.5 -s read-only 'reply OK'

Result:

OpenAI Codex v0.124.0
model: gpt-5.5
...
OK

While debugging, I found an old remote app-server process still listening on the remote host:

codex app-server --listen ws://127.0.0.1:9234  # started Apr 21

After killing the old app-server process, Codex Desktop spawned a fresh remote app-server:

codex app-server --listen ws://127.0.0.1:9234  # started Apr 24

This suggests the Desktop remote-SSH path may be reusing a stale remote app-server whose runtime, model catalog, or auth state no longer matches the current CLI/Desktop configuration. It also suggests that model_not_found is being presented to the user as a generic Reconnecting... status, which makes the root cause hard to diagnose.
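For anyone checking a remote host for the same condition, here is a minimal sketch that lists codex app-server processes with their start times. It assumes a Linux host with procps ps (the lstart column prints a five-token timestamp); the function names are my own and not part of Codex.

```python
import subprocess


def find_app_servers(ps_output: str) -> list[tuple[int, str, str]]:
    """Return (pid, start_time, command) for each `codex app-server` line.

    Expects the output of `ps -eo pid,lstart,args`, where lstart looks like
    "Mon Apr 21 10:00:00 2025" (five whitespace-separated tokens).
    """
    matches = []
    for line in ps_output.splitlines():
        # pid + 5 lstart tokens + the full command line as one field
        fields = line.split(None, 6)
        if len(fields) < 7 or not fields[0].isdigit():
            continue  # header row or a line that does not fit the layout
        pid = int(fields[0])
        started = " ".join(fields[1:6])
        command = fields[6].strip()
        if "codex app-server" in command:
            matches.append((pid, started, command))
    return matches


def list_app_servers() -> list[tuple[int, str, str]]:
    """Run ps on the current host and filter for app-server processes."""
    out = subprocess.run(
        ["ps", "-eo", "pid,lstart,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_app_servers(out)
```

Running list_app_servers() on the remote host and comparing the start times against the last CLI upgrade or login would show whether an older process is still holding the port.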

Environment

  • Codex Desktop client: 26.422.21637 (taken from logs)
  • Remote host: Ubuntu 24.04.3 LTS
  • Remote Codex CLI: codex-cli 0.124.0
  • Auth mode: ChatGPT login
  • Remote connection type: Codex Desktop remote SSH project
  • Model configured/used: gpt-5.5
  • Remote sandbox note: Ubuntu AppArmor/bubblewrap issue was already fixed; bwrap --ro-bind / / true succeeds and codex app-server </dev/null exits successfully.

Expected behavior

  1. If an existing remote app-server is stale, Codex Desktop should detect that during handshake and restart it or refuse it with a clear diagnostic.
  2. If the backend returns model_not_found, the Desktop UI should show a model/catalog/access error instead of Reconnecting....
  3. The remote app-server handshake should probably include enough version/model-catalog/auth metadata for Desktop to know whether it can safely reuse the process.
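To make points 1 and 3 concrete, here is a rough sketch of a reuse check. The ServerInfo payload and its fields are hypothetical: I do not know what the real handshake exposes, so treat this as an illustration of the idea rather than the actual protocol.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ServerInfo:
    """Hypothetical handshake payload reported by a running app-server."""
    version: str
    auth_mode: str
    models: frozenset[str] = field(default_factory=frozenset)


def reuse_problems(
    server: ServerInfo,
    expected_version: str,
    expected_auth_mode: str,
    requested_model: str,
) -> list[str]:
    """Return reasons the existing server should not be reused (empty = OK)."""
    problems = []
    if server.version != expected_version:
        problems.append(
            f"version mismatch: server {server.version}, expected {expected_version}"
        )
    if server.auth_mode != expected_auth_mode:
        problems.append(
            f"auth mode mismatch: server {server.auth_mode}, expected {expected_auth_mode}"
        )
    if requested_model not in server.models:
        problems.append(f"model {requested_model} not in server catalog")
    return problems


# Illustrative values only; the stale server's real version and catalog are unknown.
stale = ServerInfo(version="0.0.0-old", auth_mode="chatgpt", models=frozenset({"some-older-model"}))
for reason in reuse_problems(stale, "0.124.0", "chatgpt", "gpt-5.5"):
    print(reason)
```

A non-empty result would be the signal for Desktop to restart the remote process or show the reasons to the user instead of connecting.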

Actual behavior

  • Desktop UI only showed reconnect attempts.
  • Remote logs showed model_not_found.
  • Direct remote CLI with gpt-5.5 succeeded.
  • A stale codex app-server --listen ws://127.0.0.1:9234 process was still running and appeared to be reused until it was manually killed.

Suggested fix direction

  • Add remote app-server preflight/handshake checks: server version, start time or generation, auth mode, and model catalog compatibility.
  • Restart or invalidate an existing remote app-server when the current Desktop/CLI configuration no longer matches.
  • Map backend model_not_found to a clear UI message instead of retrying as if it were only a connection problem.
  • Optionally add a diagnostic command or log line for the exact remote app-server PID Desktop connected to.
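The error-mapping suggestion above could look roughly like the sketch below. It takes the error event shape from the WebSocket payload quoted earlier in this report; the function name, the return labels, and the decision to treat model_not_found as non-retryable are my assumptions, not existing Desktop behavior.

```python
import json

# Codes that a retry cannot fix. `model_not_found` comes from the event in this
# report; other entries would need to be confirmed against the backend.
NON_RETRYABLE_CODES = {"model_not_found"}


def classify_event(raw: str) -> tuple[str, str]:
    """Classify one JSON event as ("fatal" | "retry" | "ignore", message)."""
    event = json.loads(raw)
    if event.get("type") != "error":
        return ("ignore", "")
    error = event.get("error") or {}
    message = error.get("message", "unknown error")
    if error.get("code") in NON_RETRYABLE_CODES:
        # Show the backend message to the user instead of Reconnecting...
        return ("fatal", message)
    return ("retry", message)
```

With this split, the retry loop would only run for the "retry" case, and the "fatal" message would be shown to the user directly.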

Metadata

    Labels

    app (Issues related to the Codex desktop app), app-server (Issues involving app server protocol or interfaces), bug (Something isn't working)
