Local speech, vision OCR, and Ollama — with macOS tool calling
One window, menu bar status, FN push-to-talk.
CI badge updates after the first successful run on main.
- About
- Screenshots
- Features
- Requirements
- Getting started
- Configuration
- Built-in tools
- MCP (experimental)
- Roadmap · detailed roadmap
- Documentation
- Contributing
- Security
- License
RLeon is an open-source macOS app for people who want a local-first workflow: dictate with your voice, pull text from the screen or images with Vision, and chat with a local Ollama model — optionally with function calling into native macOS actions (pasteboard, open apps/URLs, Terminal, type into the focused field, and more).
- Privacy by default: speech and OCR run on-device; Ollama typically runs on localhost (you control the model).
- English UI and default prompts (all user-visible strings are English).
- Recognition: speech defaults to en-US; Vision OCR uses en-US then tr-TR for mixed-language screen text.
- Safety-first: high-risk tools (shell, typing into other apps) are off by default and gated in Settings.
Illustrative preview of the main window layout. For a real capture from your machine, build Release and run scripts/capture_screenshot.sh (saves docs/images/main-window-real.png).
| Area | Details |
|---|---|
| Speech | On-device dictation (en-US default) via `SFSpeechRecognizer` |
| OCR | Main display capture + image picker + Vision `VNRecognizeTextRequest` |
| LLM | Ollama `/api/chat` with optional tool calling |
| Tools | Curated Swift tools + optional `mcp_*` bridge hook for future MCP |
| FN + menu bar | Short tap vs hold for dictation vs full speech+OCR→LLM pipeline |
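The OCR pipeline above can be sketched in a few lines of Vision code. This is an illustrative, simplified version (function and parameter names are assumptions, not the app's actual API), showing the en-US-then-tr-TR language preference the app uses for mixed-language screen text:

```swift
import Vision

// Sketch: recognize text in an image, preferring en-US then tr-TR,
// mirroring the app's mixed-language OCR setup.
func recognizeText(in cgImage: CGImage, completion: @escaping ([String]) -> Void) {
    let request = VNRecognizeTextRequest { request, _ in
        let observations = request.results as? [VNRecognizedTextObservation] ?? []
        // Take the single best candidate per detected line.
        let lines = observations.compactMap { $0.topCandidates(1).first?.string }
        completion(lines)
    }
    request.recognitionLevel = .accurate
    request.recognitionLanguages = ["en-US", "tr-TR"]
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```

The recognized lines are then joined and handed to the LLM step of the pipeline.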
| Requirement | Details |
|---|---|
| OS | macOS 14+ |
| Build | Xcode 15+ |
| LLM (optional) | Ollama — use a tool-capable model if you want function calling |
| Permissions | Microphone, Speech Recognition, Screen Recording (for capture), Accessibility (for typing into other apps) |
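For reference, the microphone and speech permissions require usage-description keys in the app's Info.plist (the strings below are illustrative, not the app's actual copy). Screen Recording and Accessibility have no Info.plist keys; macOS prompts for them via System Settings → Privacy & Security:

```xml
<key>NSMicrophoneUsageDescription</key>
<string>RLeon uses the microphone for on-device dictation.</string>
<key>NSSpeechRecognitionUsageDescription</key>
<string>RLeon transcribes speech locally with SFSpeechRecognizer.</string>
```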
```sh
git clone https://github.com/efekurucay/RLeon.git
cd RLeon
open RLeon.xcodeproj
```

- Select scheme RLeon, destination My Mac.
- ⌘B to build; product is `RLeon.app`.
Release build (CLI):

```sh
xcodebuild -scheme RLeon -configuration Release -destination 'platform=macOS' build
```

Typical output path:

```
~/Library/Developer/Xcode/DerivedData/RLeon-*/Build/Products/Release/RLeon.app
```
| Topic | What to do |
|---|---|
| Ollama | Run `ollama serve`; in the app, set base URL (default `http://127.0.0.1:11434`) and model name. |
| Tool calling | Enable in the LLM panel; configure which tools are listed under Settings → Local tools. |
| Dangerous tools | Under Settings → Dangerous tools & MCP, allow Terminal and/or type into focused field (off until you confirm once). Each shell command and insertion can show an extra Run / Type dialog (default on; can be disabled there for trusted setups only). |
| MCP (`mcp_*`) | Experimental toggle in the same section; full swift-sdk wiring is still in progress. |
Tools are sent to Ollama as OpenAI-compatible function definitions when enabled in Local tools and safety rules allow.
| Tool ID | Description | Risk |
|---|---|---|
| `copy_to_clipboard` | Copy text to the pasteboard | Low |
| `get_app_info` | RLeon name / version | Low |
| `open_application` | Launch app by name or bundle ID | Medium |
| `open_url` | Open URL (Safari or default browser) | Medium |
| `whatsapp_compose` | Open WhatsApp desktop via `whatsapp://` | Medium |
| `run_terminal_command` | Run shell in a new Terminal window | High |
| `type_into_focused_field` | Type into focused UI via Accessibility / events | High |
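When tool calling is enabled, each entry above is serialized as an OpenAI-compatible function definition in the `tools` array of the `/api/chat` request. An illustrative payload (the exact `copy_to_clipboard` parameter schema here is an assumption):

```json
{
  "model": "llama3.1",
  "messages": [
    { "role": "user", "content": "Copy hello to the clipboard" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "copy_to_clipboard",
        "description": "Copy text to the pasteboard",
        "parameters": {
          "type": "object",
          "properties": {
            "text": { "type": "string", "description": "Text to copy" }
          },
          "required": ["text"]
        }
      }
    }
  ]
}
```

If the model decides to call a tool, the response carries a `tool_calls` entry rather than plain content, and the app executes the matching Swift tool.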
Dangerous tools (`run_terminal_command`, `type_into_focused_field`) are not visible to the model until you enable them under Settings → Dangerous tools & MCP (with confirmation). You still need to enable each tool in Local tools if you want it in the list. When a dangerous tool runs, the exact shell command is shown in a modal before Terminal executes it (unless you turn off “ask before each”); typing can likewise ask for confirmation before inserting into another app.
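The per-call confirmation gate described above can be sketched with a plain AppKit alert. This is a minimal illustration (function name and wording are assumptions, not the app's actual implementation):

```swift
import AppKit

// Sketch: show the exact command in a modal and only run it on explicit approval,
// mirroring the "ask before each" behavior for dangerous tools.
func confirmAndRun(command: String, execute: (String) -> Void) {
    let alert = NSAlert()
    alert.messageText = "Run shell command?"
    alert.informativeText = command          // the exact command, unabridged
    alert.addButton(withTitle: "Run")        // first button → .alertFirstButtonReturn
    alert.addButton(withTitle: "Cancel")
    if alert.runModal() == .alertFirstButtonReturn {
        execute(command)
    }
}
```

Showing the verbatim command (rather than a summary) is the important design point: the user approves exactly what will run.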
- Model Context Protocol — standard way to expose external tools.
- This repo includes `MCPToolBridge` as a stub (no live `tools/list` / `tools/call` yet). Naming: `mcp_<serverSlug>_<toolName>`.
- To wire it: add MCP from swift-sdk in Xcode (Swift 6 / Xcode 16+ per upstream), then implement transport + mapping in `MCPToolBridge`.
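The naming convention above maps an MCP server's tools into the app's flat tool namespace. A trivial sketch (helper name is an assumption):

```swift
// Sketch: derive the bridged tool ID from a server slug and the server-side
// tool name, per the mcp_<serverSlug>_<toolName> convention.
func bridgedToolID(serverSlug: String, toolName: String) -> String {
    "mcp_\(serverSlug)_\(toolName)"
}

// bridgedToolID(serverSlug: "files", toolName: "read") → "mcp_files_read"
```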
High level: shipped — screenshots, v0.1.x releases, notarization notes, per-call confirmation for Terminal/typing. Next: tests & CI hardening, stronger shell validation, then full MCP (swift-sdk, “Add MCP server” in Settings).
The detailed, maintained plan — what’s done, what’s next, priorities — lives in ROADMAP.md.
| Doc | Purpose |
|---|---|
| CONTRIBUTING.md | How to build, PRs, GitHub publishing |
| SECURITY.md | Vulnerability reporting, threat model |
| CODE_OF_CONDUCT.md | Community expectations |
| docs/ARCHITECTURE.md | Data flow and key components |
| docs/NOTARIZATION.md | Optional: signing & notarization for distributing .app |
| ROADMAP.md | Done vs planned work, near/mid/long term |
We welcome small, focused PRs — bugfixes, docs, UX polish, or MCP/tooling improvements. See CONTRIBUTING.md and CODE_OF_CONDUCT.md. Before large changes, opening an issue to discuss helps avoid rework.
RLeon can run local models and tools, but shell and cross-app typing are sensitive. Use trusted models only; read SECURITY.md for reporting issues and scope.
MIT — see LICENSE.
If RLeon helps you, consider starring the repo and sharing feedback.
