GuLu9527/FlashMLX

⚡ FlashMLX


Chinese documentation

A lightweight macOS menubar app for managing local MLX model inference servers.

~2MB binary — menubar-resident, one-click start/stop.

Features

  • 🔍 Auto-scan MLX models from HuggingFace cache (~/.cache/huggingface/hub/)
  • 🚀 One-click start/stop MLX inference server (mlx_lm.server / mlx-openai-server)
  • 🧩 Embedding model support — auto-detect & serve embedding models via /v1/embeddings
  • ⚙️ Configure context length (2K–128K), port, model type, Python path
  • 📊 Real-time monitoring — server status, uptime, memory RSS
  • 📋 Quick Actions — copy API URL / cURL command with one click
  • Python Verification — validate mlx-lm installation from Settings
  • 🚀 Launch at Login — optional auto-start via SMAppService
  • Detach to Window — popover can detach to a resizable floating window
  • 🔔 Notifications — system notifications on server start/stop/error
  • 🌐 i18n — English + Chinese, follows system language
  • 🛡️ Process management — multi-instance prevention, orphan cleanup, port conflict auto-resolve
  • 🪶 Lightweight — ~2MB, no Electron, pure Swift + SwiftUI
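
As a rough illustration of the auto-scan feature, the sketch below lists model repositories found in a Hugging Face hub cache. It relies on the cache layout used by `huggingface_hub`, where each model lives in a `models--<org>--<name>` directory with a `snapshots/` subdirectory. The function name `scan_hf_cache` is hypothetical, and the way FlashMLX itself decides whether a repository is an MLX model is not described here, so no such filter is applied.

```python
from pathlib import Path


def scan_hf_cache(hub_dir: Path) -> list[str]:
    """Return repo ids such as "org/name" for model repos in a Hugging Face hub cache."""
    repo_ids: list[str] = []
    if not hub_dir.is_dir():
        return repo_ids
    for entry in sorted(hub_dir.iterdir()):
        # Cached model repos live in directories named models--<org>--<name>.
        if not entry.is_dir() or not entry.name.startswith("models--"):
            continue
        # Skip repos that have no downloaded snapshot yet.
        snapshots = entry / "snapshots"
        if not snapshots.is_dir() or not any(snapshots.iterdir()):
            continue
        # Assumption: "--" only appears as the org/name separator.
        repo_ids.append(entry.name[len("models--"):].replace("--", "/"))
    return repo_ids
```

Calling `scan_hf_cache(Path.home() / ".cache" / "huggingface" / "hub")` returns the repo ids of every cached model, for example `mlx-community/Qwen2.5-7B-Instruct-4bit` after the download step in Quick Start.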

Requirements

  • macOS 14.0+ (Sonoma)
  • Apple Silicon Mac (M1/M2/M3/M4)
  • Python 3.x with mlx-lm installed
  • For embedding models: mlx-openai-server (pip install mlx-openai-server)

Quick Start

1. Python Environment

python3 -m venv ~/mlx-env
~/mlx-env/bin/pip install mlx-lm

# Optional: for embedding model support
~/mlx-env/bin/pip install mlx-openai-server

2. Download an MLX Model

# Language model
~/mlx-env/bin/huggingface-cli download mlx-community/Qwen2.5-7B-Instruct-4bit

# Embedding model (optional)
~/mlx-env/bin/huggingface-cli download mlx-community/nomic-embed-text-v1.5-bf16

3. Build & Run

# Install XcodeGen (one-time)
brew install xcodegen

# Generate Xcode project
xcodegen generate

# Build via command line
xcodebuild -project FlashMLX.xcodeproj -scheme FlashMLX -configuration Release build

# Or open in Xcode and press ⌘R
open FlashMLX.xcodeproj

4. Use

  1. Click the ⚡ icon in the menubar
  2. Select a model from the sidebar
  3. Adjust config if needed (context length, port)
  4. Click Start. The server runs at http://localhost:8000/v1

# Chat completion (language model)
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "default", "messages": [{"role": "user", "content": "Hello"}]}'

# Embeddings (embedding model)
curl http://localhost:8000/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "default", "input": "Hello world"}'
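
The chat completion call above can also be made from Python with only the standard library. This is a minimal sketch: the helper names `build_chat_request` and `chat` are hypothetical, and the response parsing assumes the server returns the usual OpenAI-style `choices[0].message.content` shape.

```python
import json
import urllib.request


def build_chat_request(prompt: str, base_url: str = "http://localhost:8000/v1") -> urllib.request.Request:
    """Build the same POST request as the chat completion cURL example."""
    body = json.dumps(
        {"model": "default", "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, base_url: str = "http://localhost:8000/v1") -> str:
    """Send the prompt to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url), timeout=60) as resp:
        payload = json.load(resp)
    # Assumption: OpenAI-compatible response layout.
    return payload["choices"][0]["message"]["content"]
```

With a language model server started from the app, `print(chat("Hello"))` prints the model's reply.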

Architecture

FlashMLX/
├── FlashMLXApp.swift               # @main entry point
├── AppDelegate.swift                # NSStatusBar + Popover + icon state
├── Models/
│   ├── MLXModel.swift               # Local model data struct
│   └── ServerConfig.swift           # Server config (Codable)
├── Services/
│   ├── ModelScanner.swift           # Scan ~/.cache/huggingface/hub/
│   ├── ServerManager.swift          # Process start/stop + logs + memory + health
│   └── ConfigManager.swift          # UserDefaults persistence
├── Views/
│   ├── PopoverView.swift            # Main container (header + sidebar + tabs)
│   ├── ModelListView.swift          # Sidebar model list with badges
│   ├── ConfigView.swift             # Config panel (context, port, type, python)
│   ├── StatusView.swift             # Status cards + quick actions
│   ├── LogView.swift                # Real-time log viewer
│   └── SettingsView.swift           # Launch at login, python verify, reset
├── en.lproj/Localizable.strings     # English
└── zh-Hans.lproj/Localizable.strings # Chinese
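
To make the role of `ServerManager.swift` more concrete, here is a sketch in Python of the subprocess lifecycle it is described as handling: build a command line, start the process, and stop it cleanly. It is not the app's Swift implementation. The function names are hypothetical, the command assumes the `mlx_lm.server` console script installed into the virtual environment along with its `--model`, `--host`, and `--port` options, and context length is left out because how the app passes it is not stated.

```python
import subprocess
from pathlib import Path


def build_server_command(venv_bin: Path, model: str, port: int = 8000,
                         host: str = "127.0.0.1") -> list[str]:
    """Assemble the command line for the mlx_lm.server console script."""
    return [
        str(Path(venv_bin) / "mlx_lm.server"),
        "--model", model,
        "--host", host,
        "--port", str(port),
    ]


def start_server(command: list[str]) -> subprocess.Popen:
    """Start the server as a child process, merging stderr into stdout for logging."""
    return subprocess.Popen(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )


def stop_server(process: subprocess.Popen, timeout: float = 5.0) -> None:
    """Ask the process to terminate, then force-kill it if it does not exit in time."""
    if process.poll() is None:
        process.terminate()
        try:
            process.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            process.kill()
            process.wait()
```

Terminating first and killing only after a timeout gives the server a chance to release its port before the next start.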

Key Decisions

| Decision | Choice | Reason |
| --- | --- | --- |
| Window mode | Menubar Popover | Launcher doesn't need a full window |
| Process mgmt | Foundation.Process | Swift native subprocess lifecycle control |
| LM Backend | mlx_lm.server CLI | Reuse existing inference engine, UI shell only |
| Embedding Backend | mlx-openai-server | Dedicated embedding server with /v1/embeddings |
| Process safety | PID lock + orphan cleanup | Prevent duplicates, auto-clean crashed processes |
| Model discovery | Scan HF cache | Users' existing models work immediately |
| Config storage | UserDefaults | macOS native, simple, reliable |
| Launch at login | SMAppService | Modern macOS API, no LaunchAgent plist needed |
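
The "PID lock + orphan cleanup" decision can be illustrated with a small Python sketch. It shows the general technique only, not the app's Swift code: the lock file path, the function names, and the file format (a single PID as text) are all assumptions. A lock whose recorded process no longer exists is treated as stale and replaced.

```python
import os
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # Signal 0 performs the existence check without sending a signal.
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # The process exists but belongs to another user.
    return True


def acquire_pid_lock(lock_path: Path) -> bool:
    """Claim the lock for this process; return False if a live process already holds it."""
    if lock_path.exists():
        try:
            old_pid = int(lock_path.read_text().strip())
        except ValueError:
            old_pid = -1  # Unreadable content is treated as a stale lock.
        if pid_is_alive(old_pid):
            return False
        lock_path.unlink()  # Stale lock left behind by a crashed process.
    lock_path.write_text(str(os.getpid()))
    return True
```

This check-then-write sequence is not atomic, so two processes starting at the same instant could both succeed. Creating the file with `os.open` and `O_CREAT | O_EXCL` would close that gap at the cost of a little more code.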

Menubar Icon States

| Color | Meaning |
| --- | --- |
| 🟢 Green | Server running |
| 🟠 Orange | Server starting |
| 🔴 Red | Error |
| ⚪ Gray | Stopped |

License

MIT
