feat: add mlx-server runtime to the metal-agent by Defilan · Pull Request #471 · defilantech/LLMKube

Defilan · 2026-05-15T19:23:28Z

What

Add mlx-server as a metal-agent inference runtime, alongside llama-server,
omlx, ollama, and vllm-swift.

Why

mlx-server (github.com/Defilan/mlx-server) is a native
OpenAI-compatible MLX inference server for Apple Silicon. Adding it lets the
metal-agent supervise it as a first-class InferenceService runtime, with the
same lifecycle the agent already gives the other runtimes.

No linked issue.

How

- MLXServerExecutor (pkg/agent/executor_mlx_server.go) implements
  ProcessExecutor: it resolves the model directory, spawns mlx-server, polls
  /health, and stops it with SIGTERM plus a grace period. Unlike the
  vllm-swift executor's per-spawn ephemeral port, it binds a fixed port
  (default 8080) so clients keep a stable base URL across respawns.
- runtimeMLXServer is wired into the executor-selection switch and into
  validateRuntimeFormat (MLX directories and HF safetensors accepted, gguf
  rejected, same as vllm-swift).
- --mlx-server-bin, --mlx-server-port, and --mlx-server-startup-timeout
  flags are added to the metal-agent, with binary auto-detection.
- mlx-server-specific flags such as --tool-call-format and --reasoning
  ride through InferenceService.spec.extraArgs; no new CRD fields.
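The lifecycle described above (poll /health until ready, then stop with SIGTERM plus a grace period) can be sketched roughly as follows. This is not the code from executor_mlx_server.go: `waitHealthy`, `stopWithGrace`, and `demo` are hypothetical names, treating a 200 OK from /health as "ready" is an assumption, and `sleep` stands in for the mlx-server process.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"os/exec"
	"sync/atomic"
	"syscall"
	"time"
)

// waitHealthy polls url until it answers 200 OK or ctx is done.
// Assumption: a 200 response means the server is ready.
func waitHealthy(ctx context.Context, url string, interval time.Duration) error {
	for {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			return err
		}
		if resp, err := http.DefaultClient.Do(req); err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil
			}
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("not healthy before deadline: %w", ctx.Err())
		case <-time.After(interval):
		}
	}
}

// stopWithGrace sends SIGTERM and escalates to SIGKILL if the process has
// not exited when the grace period ends. It reports whether the process
// exited within the grace period.
func stopWithGrace(cmd *exec.Cmd, grace time.Duration) (bool, error) {
	if err := cmd.Process.Signal(syscall.SIGTERM); err != nil {
		return false, err
	}
	done := make(chan struct{})
	go func() {
		_ = cmd.Wait() // a signal-terminated process reports a non-nil error here
		close(done)
	}()
	select {
	case <-done:
		return true, nil
	case <-time.After(grace):
		_ = cmd.Process.Kill()
		<-done
		return false, nil
	}
}

// demo exercises both helpers against stand-ins: a test HTTP server that
// turns healthy on its third request, and `sleep` in place of mlx-server.
func demo() error {
	var hits atomic.Int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if hits.Add(1) < 3 {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := waitHealthy(ctx, srv.URL+"/health", 20*time.Millisecond); err != nil {
		return err
	}

	cmd := exec.Command("sleep", "30")
	if err := cmd.Start(); err != nil {
		return err
	}
	graceful, err := stopWithGrace(cmd, 2*time.Second)
	if err != nil {
		return err
	}
	if !graceful {
		return fmt.Errorf("process needed SIGKILL")
	}
	return nil
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("healthy, then stopped within the grace period")
}
```

Bounding the poll with a context deadline is one way to express a startup timeout such as --mlx-server-startup-timeout; the actual executor may structure this differently.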

Checklist

Tests added/updated — executor_mlx_server_test.go covers arg construction, the fixed-port executor, health polling, and process teardown
make test passes locally
make lint passes locally
Commit messages follow conventional commits
All commits are signed off (git commit -s) per DCO
Documentation updated (if user-facing change) — the macOS Metal guide's runtime list is a small follow-up
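For readers unfamiliar with what "arg construction" and format validation involve here, the following is a minimal sketch under stated assumptions. `buildArgs` and `validateFormat` are hypothetical helpers, the format strings are assumed spellings, and `--model` and `--port` are placeholder flag names that are not confirmed mlx-server options. Only the extraArgs pass-through and the accept/reject rule come from the PR description.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// validateFormat mirrors the rule described in the PR: MLX directories and
// HF safetensors are accepted, GGUF is rejected. The format strings are
// assumed spellings, not taken from the LLMKube source.
func validateFormat(format string) error {
	switch strings.ToLower(format) {
	case "mlx", "safetensors":
		return nil
	case "gguf":
		return fmt.Errorf("format %q is not supported by the mlx-server runtime", format)
	default:
		return fmt.Errorf("unknown model format %q", format)
	}
}

// buildArgs assembles a command line for the server process.
// "--model" and "--port" are placeholder flag names (assumptions);
// extraArgs are appended unchanged, as spec.extraArgs is described to be.
func buildArgs(modelDir string, port int, extraArgs []string) []string {
	args := []string{"--model", modelDir, "--port", strconv.Itoa(port)}
	return append(args, extraArgs...)
}

func main() {
	// The value passed to --tool-call-format is a placeholder.
	extra := []string{"--tool-call-format", "example"}
	if err := validateFormat("safetensors"); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println(buildArgs("/models/demo", 8080, extra))
}
```

Passing runtime-specific options through a generic list keeps the CRD schema unchanged, which matches the "no new CRD fields" note in the description.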

Add mlx-server as a metal-agent inference runtime alongside llama-server, omlx, ollama, and vllm-swift. mlx-server (github.com/Defilan/mlx-server) is a native OpenAI-compatible MLX inference server; this lets the agent supervise it as a first-class InferenceService runtime.

- MLXServerExecutor (executor_mlx_server.go) implements ProcessExecutor: resolves the model directory, spawns mlx-server, polls /health, and stops it with SIGTERM plus a grace period. Unlike the vllm-swift executor's per-spawn ephemeral port it binds a fixed port (default 8080) so clients keep a stable base URL across respawns.
- runtimeMLXServer wired into the executor-selection switch and into validateRuntimeFormat (MLX directories and HF safetensors accepted, gguf rejected, same as vllm-swift).
- --mlx-server-bin, --mlx-server-port, and --mlx-server-startup-timeout flags on the metal-agent, with binary auto-detection.
- mlx-server-specific flags such as --tool-call-format and --reasoning ride through InferenceService.spec.extraArgs; no new CRD fields.

Unit tests cover argument construction, the fixed-port executor, health polling, and process teardown.

Signed-off-by: Christopher Maher <chris@mahercode.io>
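The three agent flags and the binary auto-detection mentioned in the commit message could be wired up along these lines. This is a hedged sketch, not the code in cmd/metal-agent/main.go: `mlxServerOptions` and `parseOptions` are hypothetical names, the 2-minute default timeout is an assumption, and a PATH lookup is only one plausible form of auto-detection. The flag names and the default port 8080 come from the PR.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"os/exec"
	"time"
)

// mlxServerOptions groups the settings behind the three flags named in the PR.
type mlxServerOptions struct {
	Bin            string
	Port           int
	StartupTimeout time.Duration
}

// parseOptions parses the flags from args and fills in the binary path
// when it was not given explicitly.
func parseOptions(args []string) (mlxServerOptions, error) {
	var o mlxServerOptions
	fs := flag.NewFlagSet("metal-agent", flag.ContinueOnError)
	fs.StringVar(&o.Bin, "mlx-server-bin", "", "path to mlx-server (auto-detected when empty)")
	fs.IntVar(&o.Port, "mlx-server-port", 8080, "fixed port for mlx-server")
	// The default below is an assumption made for this sketch.
	fs.DurationVar(&o.StartupTimeout, "mlx-server-startup-timeout", 2*time.Minute, "time to wait for /health")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if o.Bin == "" {
		// Assumed auto-detection strategy: look the binary up on PATH.
		// Bin stays empty when nothing is found.
		if path, err := exec.LookPath("mlx-server"); err == nil {
			o.Bin = path
		}
	}
	return o, nil
}

func main() {
	opts, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("bin=%q port=%d startup-timeout=%s\n", opts.Bin, opts.Port, opts.StartupTimeout)
}
```

An explicit --mlx-server-bin takes precedence over detection in this sketch, so an operator can always pin the binary.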

codecov · 2026-05-15T19:27:15Z

Codecov Report

❌ Patch coverage is 28.02548% with 113 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| pkg/agent/executor_mlx_server.go | 40.74% | 63 Missing and 1 partial ⚠️ |
| cmd/metal-agent/main.go | 0.00% | 33 Missing ⚠️ |
| pkg/agent/agent.go | 0.00% | 16 Missing ⚠️ |

Defilan merged commit 8bf9808 into defilantech:main May 17, 2026
21 checks passed

github-actions Bot mentioned this pull request May 17, 2026

chore: release 0.7.9 #470

Merged

Defilan mentioned this pull request May 17, 2026

fix: point metal-agent mlx-server install hint at the Homebrew formula #477

Merged

5 tasks

Defilan merged 1 commit into defilantech:main from Defilan:feat/metal-agent-mlx-server-runtime
