English · 简体中文
Native macOS LLM inference, powered by Apple MLX.
macMLX brings local LLM inference to Apple Silicon with a first-class native macOS experience. No cloud, no telemetry, no Electron — just your Mac running models at full speed.
macMLX is for everyone: a polished SwiftUI app for newcomers, and a proper CLI for developers.
| | macMLX | LM Studio | Ollama | oMLX |
|---|---|---|---|---|
| Native macOS GUI | ✅ SwiftUI | ❌ Electron | ❌ | ❌ Web UI |
| MLX-native inference | ✅ | ❌ GGUF | ❌ GGUF | ✅ |
| CLI | ✅ | ❌ | ✅ | ✅ |
| Resumable downloads + mirrors | ✅ | ⚠ partial | ⚠ partial | ❌ |
| OpenAI-compatible API | ✅ always-on | ✅ | ✅ | ✅ |
| Zero Python required | ✅ | ✅ | ✅ | ❌ |
- macOS 14.0 (Sonoma) or later
- Apple Silicon (M1 / M2 / M3 / M4)
- No Python required
Download `macMLX-vX.X.X.dmg` from Releases, mount it, and drag macMLX.app to /Applications.
The DMG is not notarized (no paid Apple Developer account yet — #19), so Gatekeeper blocks it on first launch. Pick one of the two unblocks:
Option A — terminal (recommended, always works):

```sh
xattr -cr /Applications/macMLX.app   # clear quarantine attribute
open /Applications/macMLX.app        # first launch
```

Option B — right-click: right-click macMLX.app → Open, then click Open again in the dialog. On newer macOS versions this fallback dialog sometimes doesn't appear — if so, use Option A.
Want to see what Gatekeeper thinks of the app?

```sh
spctl --assess --verbose /Applications/macMLX.app
```

Twelve-ish shipped releases since the v0.1 MVP. Pick the ones that matter:
Downloads
- Resumable downloads survive cancels AND app quits (background URLSession + persisted resume data) — #5/#6/#8
- Live speed (MB/s) + ETA + per-file progress bar — #7
- Configurable Hugging Face endpoint for mirrors like https://hf-mirror.com (GUI + CLI, both) — #21
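The GUI's resume support is built on URLSession's persisted resume data; the same idea, stated language-agnostically, is an HTTP `Range` request starting from the current file size. A minimal Python sketch of that pattern (illustrative only, not macMLX's actual Swift code):

```python
import os
import urllib.request

def resume_download(url, dest):
    """Resume a partial download by requesting only the missing bytes.

    Sketch of the Range-request idea behind resumable downloads;
    macMLX itself uses a background URLSession with resume data.
    """
    # If a partial file exists, ask the server to start from its end
    start = os.path.getsize(dest) if os.path.exists(dest) else 0
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-"})
    with urllib.request.urlopen(req) as resp, open(dest, "ab") as out:
        while chunk := resp.read(1 << 16):  # append in 64 KiB chunks
            out.write(chunk)
    return os.path.getsize(dest)
```

A server that honours `Range` replies with `206 Partial Content`; one that doesn't simply restarts from byte zero.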
Chat
- Conversation sidebar: switch between saved chats, rename, delete, rewind to here (truncate after any message) — v0.3.2
- Streaming Markdown rendering with paragraph breaks preserved — #10 (+ v0.3.1 fix)
- Right-click any message: Copy / Edit / Regenerate / Delete — #11
- Per-model Parameters Inspector (⌘⌥I) — temperature, top_p, max tokens, system prompt persist to disk — #15
- Chat model switcher in toolbar loads on tap — v0.3.1
Benchmark — v0.3.0 tab for local tok/s, TTFT, peak memory, and history, with Share to Community to a GitHub-issue leaderboard — #22
Logs — v0.3.4 tab reads Pulse's store directly: search, level filter, live tail, clear
API (OpenAI-compat)
- Cold-swap: `/v1/chat/completions` auto-loads any locally-downloaded model by ID, serialises concurrent swaps — v0.3.3
- `/x/status` reports real RSS (resident set size)
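"Serialises concurrent swaps" means simultaneous requests naming a not-yet-loaded model queue behind a single load instead of racing. A toy Python sketch of that pattern (macMLX's actual implementation is Swift; all names here are hypothetical):

```python
import threading

class ModelSlot:
    """One loaded model at a time; concurrent swap requests are serialised."""

    def __init__(self, load_fn):
        self._lock = threading.Lock()
        self._load = load_fn
        self.current_id = None
        self.model = None

    def get(self, model_id):
        # The lock serialises swaps: while one thread loads, others wait,
        # then find the model already resident and skip the reload.
        with self._lock:
            if model_id != self.current_id:
                self.model = self._load(model_id)
                self.current_id = model_id
            return self.model
```

The same shape works with an async actor or serial queue; the invariant is simply "at most one load in flight".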
CLI — native ANSI dashboards (macmlx pull, serve, run), honours preferredEngine + per-model ModelParameters + HF mirror settings — v0.3.1 / v0.3.3 / v0.3.5
Stability / polish — chat survives sidebar tab switches (#1), single-instance enforcement (#2), Quit in menu bar (#17), macmlx list segfault fix (v0.3.1), ConversationStore date-precision fix (v0.3.3), and a 3-commit independent code-review sweep in v0.3.0
Full per-tag breakdown: CHANGELOG.md.
- Launch macMLX — the setup wizard points you at `~/.mac-mlx/models` and picks the MLX Swift engine
- Download a model from the built-in HuggingFace browser (resumable, works through mirrors)
- Load it and start chatting
```sh
macmlx pull mlx-community/Qwen3-8B-4bit   # download
macmlx list                               # local models
macmlx run Qwen3-8B-4bit "Hello, world"   # single prompt
macmlx run Qwen3-8B-4bit                  # interactive
macmlx serve                              # start API on :8000
macmlx ps                                 # is serve running?
macmlx stop                               # graceful SIGTERM
```

macMLX's OpenAI-compatible server runs on http://localhost:8000/v1 whenever you load a model (or whenever `macmlx serve` is running).
```sh
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"Qwen3-8B-4bit","messages":[{"role":"user","content":"Hi"}],"stream":true}'
```

Any OpenAI-compatible client works — point it at http://localhost:8000/v1 with any key:
- Cursor / Continue / Cline: set the custom base URL in settings
- Open WebUI: add as an OpenAI provider
- Raycast, Zed, etc.: same pattern
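With `"stream": true`, the response arrives as server-sent events whose `data:` payloads carry content deltas, in the standard OpenAI streaming format. A minimal parser sketch (hypothetical helper name, no third-party client needed):

```python
import json

def collect_stream(lines):
    """Join content deltas from OpenAI-style SSE lines into one string."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue                      # skip keep-alives / blank lines
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":   # stream terminator sentinel
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)
```

Feed it the lines of a streaming response body (e.g. from the curl above) and it reassembles the assistant's reply.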
| Engine | Status | Notes |
|---|---|---|
| MLX Swift (default) | ✅ Shipping | Apple's mlx-swift-lm, in-process. Supports models up to ~70B on 64 GB+ Macs. |
| SwiftLM (100B+ MoE) | 🕒 Deferred to v0.3 | Subprocess launch blocked by App Sandbox policy; revisit when there's a concrete user ask (#12). |
| Python mlx-lm | 🕒 Deferred to v0.3 | Same sandbox blocker (#13). |
Settings → Engine shows Install Guide links for the deferred engines; selecting them today surfaces a graceful "engine not available" state.
```
macMLX.app (SwiftUI)      macmlx (CLI)
        │                      │
        └────── MacMLXCore ────┘      (Swift SPM package)
                   │
            InferenceEngine
                   │
    MLXSwiftEngine (in-process, mlx-swift-lm 3.31.x)
                   │
    HummingbirdServer → http://localhost:8000/v1
                   │
      Apple Silicon (Metal / ANE)
```
Data lives under ~/.mac-mlx/:
```
~/.mac-mlx/
├── models/          # weights (default, changeable in Settings)
├── conversations/   # chat history JSON
├── model-params/    # per-model parameter overrides
├── downloads/       # resume-data for interrupted downloads
├── logs/            # Pulse logs
├── settings.json    # user preferences
└── macmlx.pid       # CLI daemon coordination
```
This path is deliberately a dotfile under the real `$HOME`: the macOS App Sandbox's dotfile exemption lets a sandboxed app read/write here without `user-selected.read-write` entitlements or security-scoped bookmarks, while staying visible to power users.
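Because the layout is plain directories and JSON, outside tooling can inspect it directly. A rough Python sketch that enumerates downloaded models by folder name (the one-folder-per-model layout is an assumption here; `macmlx list` is the supported interface):

```python
from pathlib import Path

def local_models(root=Path.home() / ".mac-mlx" / "models"):
    """List model IDs by directory name under the macMLX models folder.

    Assumes one subdirectory per downloaded model, which is an
    assumption about the on-disk layout, not a documented contract.
    """
    if not root.exists():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```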
```sh
git clone https://github.com/magicnight/mac-mlx
cd mac-mlx
brew bundle                            # dev tools

# GUI app
open macMLX/macMLX.xcodeproj           # or: xcodebuild -scheme macMLX build

# CLI
swift build --package-path macmlx-cli

# Core + tests
swift test --package-path MacMLXCore   # 90 tests, runs in ~3s
```

- v0.1.0 — native SwiftUI GUI, menu bar, CLI (`serve`/`pull`/`run`/`list`/`ps`/`stop`), HuggingFace downloader, OpenAI-compatible API, Sparkle auto-update, memory-aware onboarding.
- v0.2.0 — download + chat polish (10 issues): resumable downloads, HF mirrors, Markdown rendering, message edit/regenerate, Parameters Inspector.
- v0.3.x — six patch releases: Benchmark feature, cross-cutting gap fixes, UX patches, chat history sidebar, API cold-swap, Logs tab, native ANSI CLI dashboards. See CHANGELOG.md for the per-tag breakdown.
- `macmlx --version` auto-bumped from the release tag
- `macmlx search <query>` command (queries `mlx-community` by default)
- Release binary slim-down via `strip -S` + dynamic Swift stdlib
- CLI `--log-level` + `--log-stderr` flags so Pulse logging surfaces from the terminal
- #23 Vision-Language Model support — `MLXVLM` already in the dependency tree, 16 architectures (Qwen2.5-VL, SmolVLM, Gemma-3, Paligemma, …). Full plan in `.omc/plans/v0.4-vlm-plan.md`.
- v0.5 — LoRA adapter loading (drop in existing HF adapters, no training) + conversation/dataset export
- v0.6 — Speech I/O: WhisperKit for ASR (mic input in chat) + AVSpeechSynthesizer for TTS (play assistant replies)
- #20 Homebrew tap for the CLI (scheduled around v0.3.6–v0.4 once the CLI tarball lands as a release asset)
- #19 Signed + notarized DMG — needs a paid Apple Developer account
- Full native-MLX Whisper in Swift — upstream `mlx-swift-lm` doesn't ship audio models yet; WhisperKit (Core ML) covers the UX in the meantime
- #12 / #13 Subprocess-based engines (SwiftLM, Python mlx-lm) — closed as not planned because App Sandbox blocks spawning external binaries. Reopenable if sandbox policy is revisited or a Swift-native 100B+ MoE inference path appears.
See CONTRIBUTING.md. Issues and PRs welcome.
Apache 2.0 — see LICENSE
- MLX and mlx-swift-lm by Apple
- Swama — Swift inference architecture inspiration
- SwiftLM — 100B+ MoE engine (future integration)
- oMLX — feature depth reference
- Hummingbird — Swift HTTP server
- Sparkle — auto-update framework
- Pulse — logging framework
- SwiftTUI — TUI framework
Full BibTeX citations: CITATIONS.bib