⚡ FlashMLX

Chinese documentation (README_CN.md)

A lightweight macOS menubar app for managing local MLX model inference servers.

~2MB binary — menubar-resident, one-click start/stop.

Features

🔍 Auto-scan MLX models from HuggingFace cache (~/.cache/huggingface/hub/)
🚀 One-click start/stop MLX inference server (mlx_lm.server / mlx-openai-server)
🧩 Embedding model support — auto-detect & serve embedding models via /v1/embeddings
⚙️ Configure context length (2K–128K), port, model type, Python path
📊 Real-time monitoring — server status, uptime, memory RSS
📋 Quick Actions — copy API URL / cURL command with one click
Python Verification — validate mlx-lm installation from Settings
🚀 Launch at Login — optional auto-start via SMAppService
Detach to Window — popover can detach to a resizable floating window
🔔 Notifications — system notifications on server start/stop/error
🌐 i18n — English + Chinese, follows system language
🛡️ Process management — multi-instance prevention, orphan cleanup, port conflict auto-resolve
🪶 Lightweight — ~2MB, no Electron, pure Swift + SwiftUI
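
The list above does not say how port conflict auto-resolve works. One plausible approach, shown here as an assumption rather than FlashMLX's actual logic, is to probe upward from the configured port until no process is listening. The helper name `first_free_port` and the limit of 50 candidates are hypothetical; the check relies on `lsof`, which ships with macOS.

```shell
# Hypothetical helper: print the first TCP port >= $1 that has no listener.
# Assumes lsof is installed (true on stock macOS). If lsof is missing, every
# port looks free, so treat the result as a hint only.
first_free_port() {
  local port="${1:-8000}"
  local limit=$((port + 50))   # give up after 50 candidates
  while [ "$port" -lt "$limit" ]; do
    # lsof exits 0 when some process is listening on the port.
    if ! lsof -nP -iTCP:"$port" -sTCP:LISTEN >/dev/null 2>&1; then
      echo "$port"
      return 0
    fi
    port=$((port + 1))
  done
  return 1
}
```

For example, `first_free_port 8000` prints `8000` when nothing is bound to it, and the next unused port otherwise.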

Requirements

macOS 14.0+ (Sonoma)
Apple Silicon Mac (M1/M2/M3/M4)
Python 3.x with mlx-lm installed
For embedding models: mlx-openai-server (pip install mlx-openai-server)

Quick Start

1. Python Environment

python3 -m venv ~/mlx-env
~/mlx-env/bin/pip install mlx-lm

# Optional: for embedding model support
~/mlx-env/bin/pip install mlx-openai-server

2. Download an MLX Model

# Language model
~/mlx-env/bin/huggingface-cli download mlx-community/Qwen2.5-7B-Instruct-4bit

# Embedding model (optional)
~/mlx-env/bin/huggingface-cli download mlx-community/nomic-embed-text-v1.5-bf16
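
To confirm that a download landed in the cache FlashMLX scans, you can list the cache yourself. This is a minimal sketch: `list_cached_models` is a hypothetical helper that relies on the Hugging Face cache naming convention `models--<org>--<name>`. How FlashMLX decides that a cached repo is an MLX model is not described here, so the helper lists every cached model repo.

```shell
# Hypothetical helper: print the repo id of every model in the HF cache.
# Usage: list_cached_models [hub_dir]
list_cached_models() {
  local hub_dir="${1:-$HOME/.cache/huggingface/hub}"
  local dir name
  for dir in "$hub_dir"/models--*; do
    [ -d "$dir" ] || continue          # skip when the glob matched nothing
    name="${dir##*/}"                  # models--org--repo
    name="${name#models--}"            # org--repo
    # The first "--" separates the org from the repo name.
    printf '%s\n' "$name" | sed 's|--|/|'
  done
}
```

Piping the output through `grep '^mlx-community/'` narrows it to repos from that organization.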

3. Build & Run

# Install XcodeGen (one-time)
brew install xcodegen

# Generate Xcode project
xcodegen generate

# Build via command line
xcodebuild -project FlashMLX.xcodeproj -scheme FlashMLX -configuration Release build

# Or open in Xcode and press ⌘R
open FlashMLX.xcodeproj

4. Use

Click the ⚡ icon in the menubar
Select a model from the sidebar
Adjust config if needed (context length, port)
Click Start — server runs at http://localhost:8000/v1

# Chat completion (language model)
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "default", "messages": [{"role": "user", "content": "Hello"}]}'

# Embeddings (embedding model)
curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "default", "input": "Hello world"}'
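
Loading a model can take a while after clicking Start, so scripted requests may hit the port before the server is ready. The sketch below polls until something answers HTTP on the port. `wait_for_server` is a hypothetical helper, not part of FlashMLX, and the probed path is an arbitrary choice because any HTTP response counts as "up".

```shell
# Hypothetical helper: poll until the server answers HTTP on the given port.
# Usage: wait_for_server [port] [attempts] [delay_seconds]
wait_for_server() {
  local port="${1:-8000}" attempts="${2:-30}" delay="${3:-1}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Without -f, curl exits 0 on any HTTP response (even 404), which is
    # enough to tell that the server is accepting connections.
    if curl -s -o /dev/null --noproxy '*' --max-time 2 \
        "http://localhost:${port}/v1/models"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

A typical use is `wait_for_server 8000 && curl http://localhost:8000/v1/chat/completions ...`, which sends the request only once the port responds.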

Architecture

FlashMLX/
├── FlashMLXApp.swift               # @main entry point
├── AppDelegate.swift                # NSStatusBar + Popover + icon state
├── Models/
│   ├── MLXModel.swift               # Local model data struct
│   └── ServerConfig.swift           # Server config (Codable)
├── Services/
│   ├── ModelScanner.swift           # Scan ~/.cache/huggingface/hub/
│   ├── ServerManager.swift          # Process start/stop + logs + memory + health
│   └── ConfigManager.swift          # UserDefaults persistence
├── Views/
│   ├── PopoverView.swift            # Main container (header + sidebar + tabs)
│   ├── ModelListView.swift          # Sidebar model list with badges
│   ├── ConfigView.swift             # Config panel (context, port, type, python)
│   ├── StatusView.swift             # Status cards + quick actions
│   ├── LogView.swift                # Real-time log viewer
│   └── SettingsView.swift           # Launch at login, python verify, reset
├── en.lproj/Localizable.strings     # English
└── zh-Hans.lproj/Localizable.strings # Chinese

Key Decisions

Decision	Choice	Reason
Window mode	Menubar Popover	Launcher doesn't need a full window
Process mgmt	Foundation.Process	Swift native subprocess lifecycle control
LM Backend	`mlx_lm.server` CLI	Reuse existing inference engine, UI shell only
Embedding Backend	`mlx-openai-server`	Dedicated embedding server with `/v1/embeddings`
Process safety	PID lock + orphan cleanup	Prevent duplicates, auto-clean crashed processes
Model discovery	Scan HF cache	Users' existing models work immediately
Config storage	UserDefaults	macOS native, simple, reliable
Launch at login	SMAppService	Modern macOS API, no LaunchAgent plist needed
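
The app does its process management in Swift with Foundation.Process, and its exact logic is not shown here. As a rough shell analogue of the "PID lock + orphan cleanup" idea, the sketch below wraps the `mlx_lm.server` CLI with a PID file. The helper name, the PID file location, and the way the executable is passed in are all assumptions made for illustration.

```shell
# Hypothetical sketch of a PID-file guard around the mlx_lm.server CLI.
# Usage: start_lm_server <server_bin> <model> [port] [pidfile]
start_lm_server() {
  local server_bin="$1" model="$2" port="${3:-8000}"
  local pidfile="${4:-${TMPDIR:-/tmp}/flashmlx-demo.pid}"
  local old_pid

  if [ -f "$pidfile" ]; then
    old_pid="$(cat "$pidfile")"
    # kill -0 sends no signal; it only checks that the process exists.
    if [ -n "$old_pid" ] && kill -0 "$old_pid" 2>/dev/null; then
      echo "already running (pid $old_pid)" >&2
      return 1
    fi
    rm -f "$pidfile"                   # stale file left by a crashed run
  fi

  # Launch in the background and remember the child's PID.
  "$server_bin" --model "$model" --port "$port" &
  echo "$!" > "$pidfile"
}
```

With the environment from Quick Start this could be called as `start_lm_server ~/mlx-env/bin/mlx_lm.server mlx-community/Qwen2.5-7B-Instruct-4bit 8000`, and the server stopped later with `kill "$(cat "${TMPDIR:-/tmp}/flashmlx-demo.pid")"`.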

Menubar Icon States

Color	Meaning
🟢 Green	Server running
🟠 Orange	Server starting
🔴 Red	Error
⚪ Gray	Stopped

License

MIT
