
feat(model-server): support Windows via TCP loopback transport #389

Open
Huntehhh wants to merge 1 commit into buildingjoshbetter:main from Huntehhh:fix/windows-model-server-tcp


@Huntehhh
Contributor

The shared model server talks over a Unix domain socket (socket.AF_UNIX), which is unavailable in CPython on Windows — so the server cannot run there at all, and every process falls back to loading its own embedding + reranker models (a 149M-param cross-encoder per process). On a multi-client setup that means heavy GPU/CPU contention and multi-minute cold loads.

Fix

Platform-branch the transport:

  • POSIX keeps AF_UNIX on ~/.truememory/model.sock byte-for-byte.
  • Windows binds AF_INET on 127.0.0.1 (loopback only) to an OS-assigned ephemeral port, written to ~/.truememory/model_server.port for the client to read.
  • The length-prefixed pickle wire protocol is unchanged.
  • Server liveness / auto-start / cleanup made cross-platform (port-file + PID check; headless CREATE_NO_WINDOW spawn on Windows).
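A minimal sketch of the platform branch described in the list above, not the code from this PR. The function names `bind_server` and `connect_client` and the `state_dir` parameter are hypothetical; only the file names `model.sock` and `model_server.port` and the loopback address come from the description. It assumes the port file holds the port number as plain decimal text, which the description does not specify.

```python
import os
import socket
from pathlib import Path

IS_WINDOWS = os.name == "nt"


def bind_server(state_dir: Path) -> socket.socket:
    """Bind a listening socket using the transport for this platform."""
    state_dir.mkdir(parents=True, exist_ok=True)
    if not IS_WINDOWS:
        # POSIX: Unix domain socket at a fixed path.
        sock_path = state_dir / "model.sock"
        if sock_path.exists():
            sock_path.unlink()  # clear a stale socket file from a previous run
        srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        srv.bind(str(sock_path))
    else:
        # Windows: loopback-only TCP; port 0 lets the OS pick a free port.
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.bind(("127.0.0.1", 0))
        port = srv.getsockname()[1]
        # Assumed format: the port number as decimal text.
        (state_dir / "model_server.port").write_text(str(port))
    srv.listen()
    return srv


def connect_client(state_dir: Path) -> socket.socket:
    """Connect to the server using the same platform branch."""
    if not IS_WINDOWS:
        cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        cli.connect(str(state_dir / "model.sock"))
        return cli
    port = int((state_dir / "model_server.port").read_text().strip())
    return socket.create_connection(("127.0.0.1", port))
```

In the real layout `state_dir` would be `Path.home() / ".truememory"`. Binding to `127.0.0.1` rather than `0.0.0.0` is what keeps the Windows listener off external interfaces.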

Verified on Windows

Server auto-started; an embedding round-trip returned a finite (1, N) vector; rerank correctly ranked the on-topic doc above an off-topic one — all routed through the loopback server. Bound to 127.0.0.1 only (confirmed via Get-NetTCPConnection).
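For readers who want to reproduce a round-trip like the one above in isolation, here is a hedged sketch of length-prefixed pickle framing over a loopback TCP connection. The 4-byte big-endian length header and the helper names (`send_msg`, `recv_exact`, `recv_msg`, `loopback_round_trip`) are assumptions for illustration; the PR states only that the protocol is length-prefixed pickle, not its exact header layout.

```python
import pickle
import socket
import struct

# Assumed header: 4-byte unsigned big-endian payload length.
HEADER = struct.Struct(">I")


def send_msg(sock: socket.socket, obj) -> None:
    """Pickle obj and send it with a length prefix."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since recv() may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock: socket.socket):
    """Read one length-prefixed message and unpickle it."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    # pickle.loads executes arbitrary code: only use with trusted local peers.
    return pickle.loads(recv_exact(sock, length))


def loopback_round_trip(obj):
    """Send obj through a 127.0.0.1 listener and return what arrives."""
    with socket.create_server(("127.0.0.1", 0)) as listener:
        host, port = listener.getsockname()
        with socket.create_connection((host, port)) as client:
            conn, _ = listener.accept()
            with conn:
                send_msg(client, obj)
                return recv_msg(conn)
```

Calling `loopback_round_trip({"op": "embed", "texts": ["hello"]})` returns an equal dictionary; the request shape here is made up and does not reflect the server's actual message schema.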

Security note

Loopback TCP is reachable by other local processes — the same trust boundary as the POSIX Unix socket (both carry pickle). Appropriate for single-user machines; flagged in code for multi-user hosts.

Test plan

  • Existing suite green; manual Windows round-trip (embed + rerank). POSIX path byte-for-byte unchanged.

The shared model server communicates over a Unix domain socket
(socket.AF_UNIX), which is unavailable in CPython on Windows -- so the server
cannot run there and every process falls back to loading its own embedding +
reranker models (a 149M-param cross-encoder per process), causing heavy
contention and multi-minute cold loads on multi-client setups.

Port the transport to platform-branched sockets: POSIX keeps AF_UNIX on
~/.truememory/model.sock byte-for-byte; Windows binds AF_INET on 127.0.0.1
(loopback only) to an OS-assigned ephemeral port, written to
~/.truememory/model_server.port for the client to read. The length-prefixed
pickle wire protocol is unchanged. Server liveness, auto-start and cleanup are
made cross-platform (port-file + PID check; headless CREATE_NO_WINDOW spawn on
Windows). Verified end-to-end on Windows (embedding + rerank round-trip over
the loopback server).

Co-Authored-By: claude-opus-4-7 <wontreply@getfucked.ai>
Huntehhh pushed a commit to Huntehhh/TrueMemory that referenced this pull request May 28, 2026
…ildingjoshbetter#389 merge

The WINDOWS-MODEL-SERVER.md doc (144 lines — enable/rollback steps + localhost-TCP
security note) was authored in local/win-fixes@10585e4 but the upstream PR buildingjoshbetter#389
commit (67c6ab3) didn't include it, so the merge into local/instrumentation-diag
(8ca7986) silently dropped it.  Cherry-picking the file only — the model-server
code itself already landed correctly via the PR.

Closes the only gap between local/win-fixes and local/instrumentation-diag;
win-fixes worktree + branch are now safe to delete.

Co-Authored-By: claude-sonnet-4-6 <wontreply@getfucked.ai>