AppleML

On-device speech-to-text and OCR as a native macOS app. Uses Apple's Speech and Vision frameworks with Neural Engine acceleration. All processing runs locally -- no cloud dependencies.

Follows the Ollama pattern: one app, two ways to run it.

Features

Speech-to-Text: Transcription via SFSpeechRecognizer with word-level timestamps
OCR: Text recognition via VNRecognizeTextRequest with bounding boxes
Automatic Language Detection: Detects language from audio when not specified
On-Device: All processing on Apple Neural Engine, no data leaves your machine
REST API: JSON API with OpenAPI 3.0 spec, backward-compatible with CLI/SDK
GUI: Drag-drop file upload, history browser, settings
Menu Bar: Always-visible server status indicator

Requirements

macOS 14+ (Sonoma)
Apple Silicon (M1/M2/M3/M4)
Xcode 15+ (to build)
Rust 1.75+ (for CLI client, optional)

Quick Start

Run as app (GUI + server)

open AppleML.app

Menu bar icon appears, API server starts on port 8080, GUI window available for file uploads and history.

Run as headless server (like `ollama serve`)

./AppleML.app/Contents/MacOS/AppleML --serve

Server starts on port 8080 with no GUI. Ideal for remote machines or automation.

Use the CLI client

# Health check
apple-ml health

# Transcribe (language auto-detected)
apple-ml transcribe -f audio.mp3

# Transcribe with explicit language
apple-ml transcribe -f audio.m4a -l zh-CN --timestamps

# OCR
apple-ml ocr -f screenshot.png

API Endpoints

Endpoint	Method	Description
`/health`	GET	Health check
`/version`	GET	Server version
`/openapi.yaml`	GET	OpenAPI spec
`/transcribe`	POST	Speech-to-text
`/ocr`	POST	Image OCR

See docs/API.md for full API documentation.

macOS Permissions

As a native .app bundle, macOS handles permissions automatically:

Speech Recognition: Prompted automatically on app launch. Click Allow in the system dialog.
Incoming Network: Prompted on first remote connection.
OCR: No special permissions required.

No manual TCC configuration or tccutil needed.

Project Structure

apple-ml-server/
├── AppleML/                     # Xcode project (macOS app)
│   ├── AppleML.xcodeproj
│   └── AppleML/
│       ├── App/                 # SwiftUI entry, menu bar, lifecycle
│       ├── Views/               # Transcribe, OCR, History, Settings
│       ├── Server/              # Embedded Vapor server + routes
│       ├── Core/                # ML services (shared by GUI and API)
│       └── Storage/             # SwiftData history
├── cli/                         # Rust CLI client
│   ├── Cargo.toml
│   └── src/main.rs
├── sdk/                         # Rust SDK library
│   ├── Cargo.toml
│   └── src/
├── docs/                        # Architecture & design docs
└── README.md

Building

App (Xcode)

# Open in Xcode
open AppleML/AppleML.xcodeproj

# Or build from command line
xcodebuild -project AppleML/AppleML.xcodeproj -scheme AppleML -configuration Release build

CLI (Rust)

cargo build --release
# Binary at ./target/release/apple-ml

Configuration

Setting	Default	Description
Port	8080	API server port
Bind address	0.0.0.0	Network interface to listen on
Launch at login	Off	Start app automatically on login
Default language	Auto-detect	Default transcription language

Settings configurable via the GUI (Settings tab) or environment variables (PORT, HOST) in headless mode.

CLI Client

# Set endpoint (default: localhost:8080)
export APPLE_ML_ENDPOINT=http://mac-mini:8080

# Transcribe
apple-ml transcribe -f audio.m4a -l zh-CN --timestamps -o json

# OCR
apple-ml ocr -f image.png -o json

# Health
apple-ml health

Example: Transcribe with cURL

curl -X POST http://localhost:8080/transcribe \
  -H "Content-Type: application/json" \
  -d "{\"audio\": \"$(base64 -i audio.m4a)\", \"format\": \"m4a\"}"

Language is auto-detected when omitted.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.claude/rules		.claude/rules
.github/workflows		.github/workflows
AppleML		AppleML
archived/server		archived/server
cli		cli
docs		docs
sdk		sdk
.gitignore		.gitignore
.gitmodules		.gitmodules
.pre-commit-config.yaml		.pre-commit-config.yaml
Cargo.toml		Cargo.toml
Makefile		Makefile
README.md		README.md
SPEC.md		SPEC.md
ecosystem.config.js		ecosystem.config.js
openapi.yaml		openapi.yaml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AppleML

Features

Requirements

Quick Start

Run as app (GUI + server)

Run as headless server (like `ollama serve`)

Use the CLI client

API Endpoints

macOS Permissions

Project Structure

Building

App (Xcode)

CLI (Rust)

Configuration

CLI Client

Example: Transcribe with cURL

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

AppleML

Features

Requirements

Quick Start

Run as app (GUI + server)

Run as headless server (like ollama serve)

Use the CLI client

API Endpoints

macOS Permissions

Project Structure

Building

App (Xcode)

CLI (Rust)

Configuration

CLI Client

Example: Transcribe with cURL

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Run as headless server (like `ollama serve`)

Packages