Copyright Travis Moore (Angelo Asante) Licensed under the Apache License 2.0 — see LICENSE and NOTICE for details.
Personal AI Operating System
███████╗██████╗ ██╗██████╗ █████╗ ██╗ ██╗
██╔════╝██╔══██╗██║██╔══██╗██╔══██╗╚██╗ ██╔╝
█████╗ ██████╔╝██║██║ ██║███████║ ╚████╔╝
██╔══╝ ██╔══██╗██║██║ ██║██╔══██║ ╚██╔╝
██║ ██║ ██║██║██████╔╝██║ ██║ ██║
╚═╝ ╚═╝ ╚═╝╚═╝╚═════╝ ╚═╝ ╚═╝ ╚═╝
Tony Stark had JARVIS. This is FRIDAY.
Not a demo. Not a wrapper around ChatGPT. A personal AI OS built from scratch — local inference, persistent memory, smart home control, browser automation, email, calendar, iMessage, X, job applications, a hologram display if you're into that.
You talk to it in whatever way you talk. It gets things done without you having to explain yourself twice.
Built by one person in Plymouth, UK, at 3am, between night shifts. Apache 2.0. Use it. Build on it. Just don't pretend you made it.
Runs hybrid: cloud inference via Groq for speed (6.5s avg), with automatic fallback to fully local Ollama when offline or if you prefer privacy.
You: "man, you good?"
FRIDAY: Always. What's the play? ← <1ms, zero LLM
You: "check my emails"
FRIDAY: Say less. Working on it in the background. ← instant ack
◈ checking emails ← live status
FRIDAY (12s) You've got 4 unread. One from Stripe ← direct dispatch (2 LLM)
(critical) — payment webhook failing on prod...
You: "catch me up"
FRIDAY: On it. Keep chatting, I'll holler when done.
◈ checking emails... ← all 8 tools
◈ checking calendar... in parallel
◈ checking x_ai...
◈ ✓ emails done
◈ ✓ calendar done
◈ synthesizing briefing...
FRIDAY (32s) Three things. Global Talent page ← 1 LLM call (was 12+)
updated. Sam George tweeted about digital
infrastructure. Calendar's empty...
You: "watch father in law's messages for the next hour, reply as friday"
FRIDAY: Got it. Watching every 60 seconds. ← standing order created
💛 FRIDAY Watch — replied to Father In Law: ← background, autonomous
"FRIDAY: He's building me right now, I'll
let him know you texted."
You (in iMessage to father in law): "I'm innocent 😂 @friday defend me here wai"
💛 FRIDAY Watch — replied to Father In Law: ← tagged mid-conversation
"FRIDAY: 😂😂😂 As told by Travis, who's busy
not telling me to chill. (I'm just the AI,
don't shoot the messenger)"
That last one actually happened. 2:50am. Father-in-law sent a LeBron reaction image. FRIDAY held down the conversation while Travis was building her. iMessage became a command interface — type @friday mid-chat, FRIDAY picks it up, acts on it, replies. The other person just thinks you're having a laugh.
- macOS (tested on Apple Silicon)
- Python 3.12+
- uv — Python package manager
# 1. Clone the repo
git clone https://github.com/angeloasante/Jarvis.git && cd Jarvis
# 2. Install dependencies
uv sync
# 3. Create your .env file
echo 'TAVILY_API_KEY=your-key-here' > .env

Now pick how you want FRIDAY to think:
Fast. Qwen3-32B at 535 tok/s, sub-100ms latency, ~6.5s average response. Free tier available.
# Add your Groq API key — get one free at https://console.groq.com
echo 'GROQ_API_KEY=gsk_your_key_here' >> .env
# Run FRIDAY
uv run friday

That's it. No models to download, no GPU needed. If Groq goes down mid-session, FRIDAY silently falls back to local Ollama (if installed).
Private. Zero cloud calls. Everything runs on your machine. Slower (~10-25s per call on M4 Air) but no API keys, no data leaves your device.
# Install Ollama and pull the model
brew install ollama
ollama pull qwen3.5:9b
ollama serve
# Run FRIDAY (no GROQ_API_KEY in .env = fully local)
uv run friday

For detailed Ollama setup, troubleshooting, and hardware requirements, see docs/ollama-setup.md.
Set GROQ_API_KEY and have Ollama running. FRIDAY uses Groq for speed, falls back to Ollama automatically if cloud is unreachable. Best of both worlds.
Add or remove GROQ_API_KEY from .env and restart FRIDAY. That's it. No code changes, no config flags.
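Under the hood that's just "try cloud, catch, go local". A minimal sketch — `ask_friday`, `cloud_call`, and `local_call` are illustrative stand-ins for the Groq and Ollama clients, not FRIDAY's actual function names:

```python
import os

def ask_friday(prompt: str, cloud_call, local_call) -> str:
    """Try the cloud model when a key is present; fall back to local on any failure."""
    if os.environ.get("GROQ_API_KEY"):
        try:
            return cloud_call(prompt)
        except Exception:
            pass  # cloud unreachable — fall through to local
    return local_call(prompt)

# Simulated clients: cloud fails, local answers.
def flaky_cloud(p):
    raise ConnectionError("Groq unreachable")

def local(p):
    return f"[local] {p}"

print(ask_friday("hello", flaky_cloud, local))  # falls back to the local path
```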
FRIDAY supports ambient voice — say "Friday" naturally at any point. Works well in quiet environments and moderate background noise. Loud music or overlapping conversations will reduce accuracy — even Siri and Alexa struggle here. A denoiser pre-processing step (coming soon) improves this significantly. Voice is off by default and only runs when you explicitly enable it.
Privacy model:
- Nothing leaves your machine by default. Speech recognition (Silero VAD + MLX Whisper) runs entirely on-device. No audio is sent anywhere.
- Audio is processed in real-time and immediately discarded — FRIDAY keeps a rolling text transcript (last 5 minutes), never raw audio.
- If you enable cloud TTS (ElevenLabs), only FRIDAY's response text is sent to generate speech — your voice and ambient audio still never leave your device.
/listening-off pauses the mic entirely. /voice disables the whole pipeline. You're always in control.
# Start with voice enabled (off by default)
uv run friday --voice
# Or toggle at runtime
/voice
# Pause/resume ambient listening
/listening-off
/listening-on

After FRIDAY responds, you have an 8-second follow-up window — just keep talking without saying "Friday" again. CLI and voice work simultaneously — type or talk, your choice.
TTS (cloud or local — your choice):
# Cloud TTS (ElevenLabs Flash v2.5, ~75ms latency) — add to .env:
ELEVENLABS_API_KEY=your-key-here
ELEVENLABS_VOICE_ID=JBFqnCBsd6RMkjVDRZzb # Optional — defaults to "George"
# Local TTS (Kokoro-82M ONNX, ~500ms) — just don't set the key above.
# Remove ELEVENLABS_API_KEY from .env and FRIDAY uses Kokoro automatically.

To switch between cloud and local TTS: add or remove ELEVENLABS_API_KEY from .env. That's it.
FRIDAY uses Tavily for web search. Sign up at app.tavily.com — the free tier gives you 1,000 searches/month.
For email and calendar access (optional — FRIDAY works without it, just no comms agent):
# 1. Go to https://console.cloud.google.com
# 2. Create a project → Enable Gmail API + Calendar API
# 3. Create OAuth2 credentials (Desktop app is simplest)
# 4. Download the JSON → save as:
cp ~/Downloads/client_secret_*.json ~/.friday/google_credentials.json
# 5. Authenticate (opens browser for consent):
uv run python -m friday.tools.google_auth

Note: If your app is in "Testing" mode in Google Cloud Console, add your email as a test user under OAuth consent screen → Test users.
FRIDAY can read and send WhatsApp messages through a local Node.js bridge. No third-party servers — runs entirely on your machine.
# 1. Install bridge dependencies
cd friday/whatsapp && npm install
# 2. Start the bridge (first time — shows QR code)
node server.js
# 3. Scan the QR code with WhatsApp → Linked Devices → Link a Device
# 4. Once connected, FRIDAY can use WhatsApp

After pairing, the session persists — restart the bridge anytime without re-scanning. For background running, auto-start on login, and troubleshooting, see docs/whatsapp-setup.md.
User Input (CLI / Voice)
│
▼
┌───────────┐
│ FRIDAY │ Orchestrator — routes tasks, never does the work itself
│ Core │ Memory + conversation context injected every call
└─────┬─────┘
│
├─ 1. Fast Path → regex → instant (0 LLM, <1s)
├─ 1.5 User Override → @agent → agent dispatch (0s routing)
├─ 2. Oneshot → regex → tool + 1 LLM (1 LLM, ~2s)
├─ 2.5 Direct Dispatch → LLM picks tool + format (2 LLM, ~3-5s)
├─ 3. Agent Dispatch → regex → ReAct loop (2-4 LLM, ~5-10s)
├─ 4. Fast Chat → 1 LLM slim prompt (1 LLM, ~1s)
└─ 5. Full LLM Route → fat prompt + dispatch (4 LLM, ~8-15s)
│
▼ (background thread — user keeps chatting)
│
├────────┬────────┬────────┬────────┬────────┬────────┬────────┬────────┬────────┐
▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼
┌────────┐┌────────┐┌────────┐┌────────┐┌────────┐┌────────┐┌────────┐┌────────┐┌────────┐┌────────┐
│ Code ││Research││ Memory ││ Comms ││ System ││ Home ││Monitor ││Briefing││ Job ││ Social │
│ Agent ││ Agent ││ Agent ││ Agent ││ Agent ││ Agent ││ Agent ││ Agent ││ Agent ││ Agent │
└───┬────┘└───┬────┘└───┬────┘└───┬────┘└───┬────┘└───┬────┘└───┬────┘└───┬────┘└───┬────┘└───┬────┘
│ │ │ │ │ │ │ │ │ │
▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼
File I/O Tavily ChromaDB Gmail API AppleScript LG WebOS Web fetch Monitors CV Data X API
Terminal httpx SQLite Calendar Playwright WakeOnLan Scheduler Emails WeasyPrint tweepy
Git Known src Semantic Drafts Chrome,PDF Smart Home Diffing Calendar Jinja2 Mentions
│ │
└── asyncio.gather() ────────────────────┘
(parallel tool execution)
| Priority | Path | How | Speed |
|---|---|---|---|
| 1 | Fast Path | Regex → instant canned response or tool call | <1s, 0 LLM |
| 1.5 | User Override | @agent or use agent → direct dispatch | 0s routing |
| 2 | Oneshot | Regex → tool + 1 LLM format | ~3-5s |
| 2.5 | Direct Dispatch | LLM picks tool + 1 LLM format | ~3-5s |
| 3 | Agent Dispatch | LLM classify (~1s) → agent ReAct loop, regex fallback | ~5-10s |
| 4 | Fast Chat | 1 LLM with slim prompt | ~1s |
| 5 | Full LLM Route | Fat prompt + dispatch (ambiguous only) | ~8-15s |
Priority 3 uses Groq LLM classification (~1s) to pick the right agent, with regex as automatic fallback when offline. This replaced the old regex-only routing which couldn't handle ambiguous queries.
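A toy version of the tiered flow — regex fast path first, then LLM classification with a regex fallback. The patterns, handler names, and agent table here are illustrative, not FRIDAY's real routing tables:

```python
import re

# Illustrative fast-path table (Priority 1: zero-LLM).
FAST_PATHS = [
    (re.compile(r"^(hey|yo|man,? you good\??)$", re.I), "canned_greeting"),
    (re.compile(r"^(mute|unmute|pause)( the tv)?$", re.I), "tv_tool"),
]

# Illustrative agent patterns (the offline fallback for Priority 3).
AGENT_PATTERNS = {
    "comms": re.compile(r"\b(emails?|inbox|calendar)\b", re.I),
    "research": re.compile(r"\b(search|look up|find out)\b", re.I),
}

def classify_with_llm(text: str):
    """Stand-in for the ~1s Groq classification call; None simulates 'offline'."""
    return None

def route(text: str) -> str:
    # Priority 1: instant, no LLM at all.
    for pattern, handler in FAST_PATHS:
        if pattern.search(text):
            return handler
    # Priority 3: LLM classify, regex fallback when the cloud is unreachable.
    agent = classify_with_llm(text)
    if agent is None:
        for name, pattern in AGENT_PATTERNS.items():
            if pattern.search(text):
                return f"{name}_agent"
    # Priority 5: ambiguous — fat prompt + dispatch.
    return "full_llm_route"

print(route("mute"))             # tv_tool — 0 LLM, <1s
print(route("check my emails"))  # comms_agent via regex fallback
```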
All agent work runs in background — user keeps chatting. Live status updates stream to CLI/voice. Parallel tool execution via asyncio.gather().
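The parallel fan-out is plain asyncio.gather. A self-contained sketch with stubbed tools — the 0.2s sleeps stand in for real I/O, and the tool names are illustrative:

```python
import asyncio

# Stub tools standing in for FRIDAY's real checks.
async def check_emails():
    await asyncio.sleep(0.2)
    return "4 unread"

async def check_calendar():
    await asyncio.sleep(0.2)
    return "empty"

async def check_x():
    await asyncio.sleep(0.2)
    return "2 mentions"

async def briefing() -> str:
    # All tools fire at once; total time ≈ slowest tool, not the sum.
    emails, cal, x = await asyncio.gather(check_emails(), check_calendar(), check_x())
    return f"emails: {emails} | calendar: {cal} | x: {x}"

print(asyncio.run(briefing()))
```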
Qwen3.5 has a built-in reasoning mode that generates internal chain-of-thought. This is powerful for complex tasks but wastes time on simple ones.
FRIDAY uses Ollama's native think parameter:
- think=False for conversation and tool calls (~1-2s per LLM call)
- think=True for deep reasoning tasks like "explain how async/await works" (~30-60s but higher quality)
This alone took response times from 84-121s down to 5-12s for conversational messages.
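The decision itself can be as simple as a keyword gate. An illustrative heuristic — not FRIDAY's actual classifier — showing where the flag would plug in:

```python
# Illustrative trigger list — requests that justify the 30-60s reasoning cost.
REASONING_TRIGGERS = ("explain", "why", "how does", "walk me through", "prove")

def needs_deep_thinking(message: str) -> bool:
    """Crude gate: only long-form explanation requests get think=True."""
    m = message.lower()
    return any(t in m for t in REASONING_TRIGGERS)

# The flag would then pass straight through to the model call, e.g.:
#   ollama.chat(model=..., messages=msgs, think=needs_deep_thinking(text))
print(needs_deep_thinking("turn on the tv"))                 # False — fast path
print(needs_deep_thinking("explain how async/await works"))  # True — deep mode
```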
JARVIS/
├── friday/
│ ├── cli.py # Terminal interface (hacker green aesthetic)
│ ├── core/
│ │ ├── config.py # Model, paths, settings (single source of truth)
│ │ ├── types.py # ToolResult, AgentResponse, ErrorCode, Severity
│ │ ├── llm.py # LLM abstraction (cloud via Groq + local Ollama fallback)
│ │ ├── base_agent.py # ReAct loop base class for all agents
│ │ ├── tool_dispatch.py # Direct tool dispatch — 1 LLM picks tool, 1 LLM formats
│ │ ├── prompts.py # Personality, system prompt, dispatch tool schema
│ │ ├── router.py # Intent classification (LLM + regex), agent matching
│ │ ├── fast_path.py # Zero-LLM instant commands (TV, greetings)
│ │ ├── oneshot.py # Regex → tool → 1 LLM format
│ │ ├── briefing.py # Parallel tool calls → 1 LLM synthesis
│ │ └── orchestrator.py # FRIDAY Core — thin dispatcher, imports from above
│ ├── agents/
│ │ ├── code_agent.py # File ops, terminal, git, debugging
│ │ ├── research_agent.py # Web search, page fetching, known sources
│ │ ├── memory_agent.py # Store/recall decisions, lessons, context
│ │ ├── comms_agent.py # Email (Gmail) + Calendar (macOS/iCloud)
│ │ ├── system_agent.py # Mac control, browser, terminal, file ops
│ │ ├── household_agent.py # Smart home control (LG TV, future: all appliances)
│ │ ├── monitor_agent.py # Persistent watchers for URLs, topics, searches
│ │ ├── briefing_agent.py # Daily briefings from monitor alerts + email + calendar
│ │ ├── job_agent.py # CV tailoring, cover letters, PDF generation
│ │ └── social_agent.py # X (Twitter) management
│ ├── data/
│ │ └── cv.py # Structured CV data (single source of truth)
│ ├── tools/
│ │ ├── web_tools.py # Tavily search + httpx page fetch
│ │ ├── file_tools.py # Read, write, list, search (with line ranges, content search)
│ │ ├── terminal_tools.py # Shell execution, background processes, process management
│ │ ├── mac_tools.py # AppleScript, app launcher, screenshots, system info
│ │ ├── browser_tools.py # Playwright browser automation (navigate, click, fill, screenshot)
│ │ ├── memory_tools.py # ChromaDB + SQLite memory operations
│ │ ├── email_tools.py # Gmail read, search, send, draft, label
│ │ ├── calendar_tools.py # macOS/iCloud Calendar read + create events
│ │ ├── imessage_tools.py # iMessage read/send + FaceTime + Contacts
│ │ ├── whatsapp_tools.py # WhatsApp read/send/search via Baileys bridge
│ │ ├── cron_tools.py # Scheduled task CRUD (create, list, delete, toggle)
│ │ ├── watch_tools.py # Standing orders — create, list, cancel watch tasks
│ │ ├── notify.py # Phone notifications via iMessage to self
│ │ ├── tv_tools.py # LG TV WebOS control + WakeOnLan (18 tools)
│ │ ├── pdf_tools.py # PDF read, merge, split, rotate, encrypt, watermark
│ │ ├── call_tools.py # Phone, FaceTime, WhatsApp call history
│ │ ├── x_tools.py # X (Twitter) API — post, search, mentions
│ │ ├── monitor_tools.py # Persistent monitor CRUD + change detection
│ │ ├── briefing_tools.py # Briefing queue, digest, alert delivery
│ │ ├── cv_tools.py # CV get/tailor, cover letters, PDF generation
│ │ └── google_auth.py # Shared OAuth2 for Gmail + Calendar
│ ├── voice/
│ │ ├── config.py # Audio constants, VAD thresholds, trigger words, TTS config
│ │ ├── pipeline.py # Always-on ambient listener + trigger word + follow-up window
│ │ ├── vad.py # Silero VAD v6 wrapper (speech detection)
│ │ ├── stt.py # MLX Whisper local transcription
│ │ └── tts.py # ElevenLabs streaming (cloud) + Kokoro ONNX (local fallback)
│ ├── background/
│ │ ├── monitor_scheduler.py # APScheduler background monitor jobs
│ │ ├── heartbeat.py # Proactive awareness loop (30min ticks, zero-LLM silent checks)
│ │ ├── cron_scheduler.py # User-defined scheduled tasks (APScheduler + SQLite)
│ │ ├── github_sync.py # Background GitHub project sync
│ │ └── memory_processor.py # Background memory processing
│ ├── memory/
│ │ └── store.py # Hybrid memory (semantic + structured)
│ ├── whatsapp/
│ │ ├── server.js # Baileys HTTP bridge (Express + WhatsApp Web)
│ │ └── package.json # Node.js dependencies
│ └── skills/ # (Phase 5 — knowledge docs for agents)
├── Idea/ # Design docs, system maps, tool specs
├── docs/
│ ├── progress.md # Development log
│ ├── ollama-setup.md # Local LLM setup guide (Ollama)
│ ├── whatsapp-setup.md # WhatsApp integration setup (Baileys bridge)
│ ├── friday-glasses-integration.md # Halo glasses integration spec
│ └── background/
│ └── monitor_scheduler.py # APScheduler background monitor jobs
├── data/ # Runtime data (gitignored)
│ └── memory/
│ ├── friday.db # SQLite (conversations, agent calls)
│ └── chroma/ # ChromaDB (semantic memory vectors)
├── .env # API keys (gitignored)
├── pyproject.toml # Project config + dependencies
└── uv.lock # Dependency lock file
The hands. Reads, writes, debugs, and runs code.
Tools: read_file, write_file, list_directory, search_files, run_command, search_web, search_memory
Capabilities:
- Read and modify files with style-matching
- Run terminal commands (git, npm, python, system)
- Search the web for documentation
- Safety checks block dangerous commands (rm -rf /, mkfs, etc.)
The eyes. Searches the web, reads full pages, synthesises findings.
Tools: search_web, fetch_page, store_memory, search_memory
Capabilities:
- Tavily-powered web search with AI-generated answers
- Full page fetching and HTML stripping (not just snippets)
- Known source injection — for topics like UK visas, it fetches gov.uk directly
- Cross-referencing and date-awareness for time-sensitive topics
Known Sources:
| Topic | Authoritative URL |
|---|---|
| UK Global Talent Visa | gov.uk/global-talent |
| Stripe | stripe.com/docs |
| Paystack | paystack.com/docs |
| Supabase | supabase.com/docs |
| Modal | modal.com/docs |
| Railway | docs.railway.com |
| Vercel | vercel.com/docs |
| Ollama | github.com/ollama/ollama |
The brain's filing system. Stores decisions, lessons, and context for future recall.
Tools: store_memory, search_memory, get_recent_memories
Categories: project, decision, lesson, preference, person, general
Importance scale: 1 (trivial) → 10 (critical, never forget)
The mouth and schedule. Handles email, calendar, iMessage, FaceTime, and contacts.
Tools: read_emails, search_emails, read_email_thread, send_email, draft_email, send_draft, edit_draft, get_calendar, create_event, read_imessages, send_imessage, start_facetime, search_contacts, send_whatsapp, read_whatsapp, search_whatsapp, whatsapp_status
Capabilities:
- Read, search, and triage Gmail (priority-sorted: critical → high → normal)
- Draft and send emails with Travis's tone (never sends without explicit confirmation)
- Full draft lifecycle — create, edit, and send Gmail drafts by ID
- Read macOS/iCloud Calendar (day/week view, next event) — no API keys needed
- Create calendar events via AppleScript — syncs to iCloud automatically
- iMessage — read conversations from chat.db, send texts via Messages.app AppleScript
- FaceTime — initiate video/audio calls, multi-number contact handling
- Contacts — search Contacts.app with fuzzy matching, nickname resolution, emoji support
- WhatsApp — read chats, send messages, search across conversations via local Baileys bridge (Node.js)
- Smart contact resolution — "Ellen's pap", "my bby", "father in law" all resolve correctly via word-overlap scoring
- NSAttributedString parsing — extracts text from the newer iMessage binary format (attributedBody)
- Channel-aware — after reading iMessages, replies go via send_imessage (never draft_email)
- Tone matching — reads recent messages to match the conversation's vibe when drafting
- Priority sender flagging — Paystack/Stripe = critical, Railway/GitHub = high
- Coding hours warning — flags events during 10pm-4am
Safety gates: send_email, send_draft, send_imessage, send_whatsapp, and create_event all require confirm=True. FRIDAY always previews before acting.
Setup: Email requires Google OAuth2 — see Google API Setup below. Calendar, iMessage, FaceTime, and Contacts work out of the box (native macOS APIs, no API keys needed). iMessage reading requires Full Disk Access for chat.db. WhatsApp requires the Baileys bridge — see docs/whatsapp-setup.md.
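Word-overlap contact resolution is simple to sketch. The scoring below — count of shared words after stripping possessives — is an assumed formula for illustration, not necessarily FRIDAY's exact one:

```python
def _words(s: str) -> set:
    """Normalise: lowercase, drop possessive 's, split on whitespace."""
    return set(s.lower().replace("'s", "").split())

def resolve_contact(query: str, contacts: list):
    """Pick the contact sharing the most words with the query; None if no overlap."""
    q = _words(query)
    best, best_score = None, 0
    for name in contacts:
        score = len(q & _words(name))
        if score > best_score:
            best, best_score = name, score
    return best

book = ["Ellen's Pap", "Teddy Bear", "Father In Law", "My Bby"]
print(resolve_contact("father in law", book))  # Father In Law
print(resolve_contact("ellen's pap", book))    # Ellen's Pap
```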
The body. Controls the Mac itself — apps, browser, terminal, files.
Core Tools (always loaded): run_command, run_background, open_application, take_screenshot, get_system_info, run_applescript, read_file, list_directory
Browser Tools (loaded on demand): browser_navigate, browser_screenshot, browser_click, browser_get_text, browser_wait_for_login, browser_discover_form, browser_fill_form, browser_upload
PDF Tools (loaded on demand): pdf_read, pdf_metadata, pdf_merge, pdf_split, pdf_rotate, pdf_encrypt, pdf_decrypt, pdf_watermark
Capabilities:
- Open any app from the safe list (Cursor, Chrome, Slack, Finder, etc.)
- Run terminal commands with safety checks + background processes
- Take screenshots (saved to ~/Downloads/friday_screenshots/)
- Run AppleScript for Mac automation (dark mode, volume, UI control)
- Automated browsing with persistent sessions — uses Safari with your existing sessions/cookies
- Login detection — detects login pages, pauses for manual login, then continues
- Navigate, click, fill forms, read page content
- "Fill the form on my screen" — discovers all fields on the current Safari page, batch-fills with Travis's details (name, email, phone, LinkedIn, GitHub, website, location), uploads CV if needed, verifies all required fields are filled
- System info — CPU, memory, disk, uptime
- PDF operations — read/extract text+tables, merge, split, rotate, encrypt/decrypt, watermark, metadata
Browser engine: Safari via AppleScript + JavaScript injection — uses your actual Safari with all existing cookies, sessions, saved passwords. No login walls. No Selenium, no Playwright. One JS call fills entire forms.
Dynamic tool loading: Browser, PDF, screen, and form tools are only injected when the task mentions them. Base tool count stays at 8 (comfortable for 9B models), scales to 18+ (browser + forms) or 16 (PDF) when needed. Form tasks also get CV tools and higher iteration limits (15 vs default 5).
Safety: Dangerous buttons (pay, delete, submit) require explicit confirmation. Dangerous terminal commands are blocked.
The home brain. Controls smart devices in Travis's home over local network — no cloud, no accounts.
Tools: 18 TV tools — turn_on_tv, turn_off_tv, tv_screen_off, tv_screen_on, tv_volume, tv_volume_adjust, tv_mute, tv_play_pause, tv_launch_app, tv_close_app, tv_list_apps, tv_list_sources, tv_set_source, tv_remote_button, tv_type_text, tv_notify, tv_get_audio_output, tv_set_audio_output, tv_system_info, tv_status
Capabilities:
- LG TV control via WebOS local API (WiFi, no LG account needed)
- WakeOnLan to power on the TV from off state
- Fast-path routing — simple commands (volume, mute, launch app, power) bypass the LLM entirely via regex pattern matching. ~200-600ms instead of ~30s
- Volume control — exact level ("volume to 20") or relative adjust ("turn it up" → +5), with read-back verification
- Mute/unmute
- Media playback — pause, resume, stop, rewind, fast-forward
- App launching — Netflix, YouTube, Spotify, Prime, Disney+, Apple TV, HDMI inputs, with launch verification
- Screen off/on — audio keeps playing with screen off (Spotify mode)
- Close apps, list installed apps
- Input source switching — list and switch HDMI/antenna sources
- Full remote control — 40+ buttons: navigation, media, numbers (0-9), colours (red/green/yellow/blue), channel up/down, special keys
- IME text input — type directly into search bars without navigating virtual keyboard
- Toast notifications — send messages to the TV screen
- Audio output switching — TV speakers, soundbar, ARC, optical
- In-app search — LLM handles complex multi-step commands like "search for Black Widow on Disney+"
- Multi-step commands — "turn on TV and put on Netflix" handled sequentially with boot delay
- TV status — power state, current volume, active app (friendly names)
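WakeOnLan itself is a tiny protocol: a UDP broadcast of 6 × 0xFF followed by the target MAC repeated 16 times. A minimal stdlib sketch (function names are illustrative, not FRIDAY's):

```python
import socket

def wol_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6 x 0xFF, then the MAC 16 times (102 bytes)."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16

def wake_tv(mac: str, broadcast: str = "255.255.255.255") -> None:
    """Broadcast the packet on the LAN. UDP port 9 (discard) is conventional for WoL."""
    pkt = wol_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(pkt, (broadcast, 9))

print(len(wol_packet("AA:BB:CC:DD:EE:FF")))  # 102
```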
Supported Apps:
| Command | App |
|---|---|
| netflix | Netflix |
| youtube | YouTube |
| spotify | Spotify |
| prime / amazon | Prime Video |
| disney / disney+ | Disney+ |
| apple tv | Apple TV |
| hdmi1-hdmi4 | HDMI inputs |
| live tv | Live TV |
| browser | Web Browser |
| settings | TV Settings |
Remote Buttons:
| Category | Buttons |
|---|---|
| Navigation | up, down, left, right, ok, back, home, menu, exit, dash, info |
| Media | play, pause, stop, rewind, fastforward, volume_up, volume_down, mute, channel_up, channel_down |
| Numbers | num_0 through num_9 |
| Colours | red, green, yellow, blue |
| Special | asterisk, cc |
Performance (fast-path):
| Command | Time |
|---|---|
| Volume/mute/pause | 165-340ms |
| Screen off/on | 217-305ms |
| Status check | 310ms |
| Volume set (verified) | 517-567ms |
| App launch (verified) | 2.5-6s |
| Complex search (LLM) | 30-90s |
Setup:
# 1. Add to .env:
LG_TV_IP=192.168.1.xx # TV's local IP (check router admin)
LG_TV_MAC=AA:BB:CC:DD:EE:FF # TV's MAC address (for WakeOnLan)
# 2. Pair with TV (one-time, TV must be on):
uv run python -m friday.tools.tv_tools
# Accept the prompt on your TV → save the client key to .env:
LG_TV_CLIENT_KEY=<key-from-pairing>

Future: LG ThinQ API for all LG appliances, smart lights, thermostats.
How fast-path works: FRIDAY uses regex pattern matching to detect simple TV commands ("volume to 20", "mute", "put on Netflix") and executes them directly — no LLM inference needed. Only complex commands like "search for Black Widow on Disney+" fall through to the LLM for multi-step reasoning.
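Sketching that split: a small pattern table maps simple phrasings straight to tool calls, and anything unmatched returns None so it can fall through to the LLM. The patterns here are illustrative, not FRIDAY's full table; the tool names match the list above:

```python
import re

# Illustrative fast-path patterns → (tool, argument).
TV_PATTERNS = [
    (re.compile(r"volume (?:to )?(\d{1,3})", re.I), lambda m: ("tv_volume", int(m.group(1)))),
    (re.compile(r"\bmute\b", re.I),                 lambda m: ("tv_mute", None)),
    (re.compile(r"put on (\w+)", re.I),             lambda m: ("tv_launch_app", m.group(1).lower())),
]

def tv_fast_path(text: str):
    """Return (tool, arg) for simple TV commands, or None to fall through to the LLM."""
    for pattern, build in TV_PATTERNS:
        m = pattern.search(text)
        if m:
            return build(m)
    return None

print(tv_fast_path("volume to 20"))                       # ('tv_volume', 20)
print(tv_fast_path("put on Netflix"))                     # ('tv_launch_app', 'netflix')
print(tv_fast_path("search for Black Widow on Disney+"))  # None → LLM handles it
```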
The eyes that never sleep. Creates persistent watchers that track URLs, topics, and web searches for material changes.
Tools: create_monitor, list_monitors, pause_monitor, delete_monitor, get_monitor_history, force_check
Capabilities:
- Watch specific URLs for content changes (e.g. gov.uk visa pages)
- Recurring web searches for topic awareness (e.g. "YC W27 deadline")
- Broad topic monitoring (e.g. "AI visa policy UK")
- Material change detection — keyword filtering so only relevant changes trigger alerts
- SHA-256 content hashing with unified diff analysis
- Importance-based routing: critical = interrupt, high = next interaction, normal = briefing
- APScheduler background jobs: realtime (15min), hourly, daily, weekly
Monitor types:
| Type | Use case | Example |
|---|---|---|
| url | Watch a specific page | gov.uk/global-talent |
| search | Recurring web search | "YC W27 applications" |
| topic | Broad awareness | "AI immigration policy UK" |
Smart diffing: Not everything that changes matters. Nav menu updates, date stamps, minor wording — ignored. New eligibility criteria, deadline changes, policy updates — flagged immediately.
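The hash-then-keyword-filter pipeline can be sketched with stdlib hashlib and difflib. The keyword list below is illustrative — FRIDAY's is configured per monitor:

```python
import difflib
import hashlib

MATERIAL_KEYWORDS = ("eligibility", "deadline", "policy", "criteria", "fee")

def page_changed(old: str, new: str) -> bool:
    """Cheap first gate: compare SHA-256 of the fetched content."""
    return hashlib.sha256(old.encode()).hexdigest() != hashlib.sha256(new.encode()).hexdigest()

def material_change(old: str, new: str) -> bool:
    """Alert only when a changed line mentions a watched keyword."""
    if not page_changed(old, new):
        return False
    diff = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
    changed = [l[1:] for l in diff
               if l.startswith(("+", "-")) and not l.startswith(("+++", "---"))]
    return any(k in line.lower() for line in changed for k in MATERIAL_KEYWORDS)

old = "Global Talent visa\nLast updated: 1 May"
print(material_change(old, "Global Talent visa\nLast updated: 2 May"))       # False — date stamp only
print(material_change(old, "Global Talent visa\nNew eligibility criteria"))  # True — flagged
```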
The morning voice. Synthesises monitor alerts, emails, and calendar into tight, actionable briefings.
Tools: get_briefing_queue, get_monitor_alerts, get_daily_digest, mark_briefing_delivered + read_emails, get_calendar, get_call_history + search_x, get_my_mentions
Briefing types:
- Morning briefing — comprehensive: critical alerts, today's calendar, unread emails, missed calls, X feed highlights, monitor changes
- Evening briefing — what shipped, what's blocked, tomorrow's first event
- Quick briefing — one thing, two sentences, the most important item
- "Catch me up" — checks everything: emails, calls, calendar, monitors, X feed
X (Twitter) monitoring — every briefing pulls:
- @samgeorgegh — Ghanaian MP, policy/tech/Ghana news
- Galamsey / illegal mining — breaking news, government action, viral posts
- Travel — viral travel posts, especially Africa-related
- AI / Tech — new AI releases, major announcements, trending posts
- @mentions — anyone who mentioned Travis (surfaced first, actionable)
Call history: Reads phone/FaceTime calls (requires Full Disk Access) and WhatsApp calls (always accessible). Surfaces missed calls in briefings.
Delivery: Briefing items are marked as delivered after being surfaced, so they never repeat.
Example:
"Oya. Three things.
Global Talent Visa page updated — new guidance dropped.
Sam George tweeted about digital infrastructure funding.
Galamsey trending — government announced new drone surveillance.
Calendar's empty. What are we building?"
The career arm. Doesn't just generate CVs — actually applies to jobs autonomously.
Tools: tailor_cv, generate_pdf + search_web + browser_navigate, browser_discover_form, browser_fill_form, browser_screenshot, browser_click, browser_type, browser_scroll, browser_upload, browser_get_text, browser_execute_js, browser_elements, browser_wait_for_login (15 tools, 30 max iterations)
Capabilities:
- 3-phase autonomous workflow: search for the job → tailor CV to the JD → fill the application form
- Searches company career pages itself — uses official sites, follows redirects to Greenhouse/Lever/Workday
- CV tailoring — rewrites summary and reorders experience for specific job descriptions (not generic)
- PDF generation via WeasyPrint + Jinja2 — dark sidebar A4 layout with lime accent
- Batch form filling — browser_discover_form scrolls the entire page and finds ALL fields; browser_fill_form fills everything in a single JS call (150s → 15s)
- React-Select dropdown support — detects React-Select inputs, types to search, clicks option
- File upload via DataTransfer API — bypasses Safari's file chooser restriction, injects file directly
- Verification loop — keeps calling browser_discover_form until all_required_filled is true
- Login detection — pauses for manual login on protected job portals
- Never invents experience — only reframes existing data
Safety: Always asks Travis before final submit. Never clicks submit without confirmation.
Name handling: Uses "Angelo Asante" (gov name) on all professional documents. "Travis Moore" is casual/preferred only.
Example commands:
"apply for software engineer at Anthropic"
"go on LinkedIn and apply for AI engineer roles"
"tailor my CV for [role] at [company]"
"generate my CV as PDF"
"fill the form on my screen"
PDF output: Saved to ~/.friday/data/cv_output/
The voice on X. Posts tweets, checks mentions, searches, engages — all through the X API.
Tools: post_tweet, delete_tweet, get_my_mentions, search_x, like_tweet, retweet, get_x_user
Capabilities:
- Post tweets (280 char limit enforced), reply, quote-tweet
- Check @mentions
- Search recent tweets (last 7 days) — costs credits, used sparingly
- Like and retweet
- Look up any public X profile (followers, bio, tweet count)
- Never posts without Travis confirming the text first
Credit awareness: Posting/liking/retweeting is cheap. Searching/lookups cost credits. FRIDAY knows the difference.
Setup:
# Add to .env (from X Developer Portal):
X_CONSUMER_KEY=your_consumer_key
X_CONSUMER_SECRET=your_consumer_secret
X_BEARER_TOKEN=your_bearer_token
X_ACCESS_TOKEN=your_access_token
X_ACCESS_TOKEN_SECRET=your_access_token_secret

Example commands:
"tweet this: just shipped v0.2"
"check my mentions"
"search twitter for AI startups UK"
"who is @elonmusk"
"like that tweet"
FRIDAY uses a hybrid memory architecture:
| Layer | Tech | Purpose |
|---|---|---|
| Semantic | ChromaDB | "Find memories similar to X" — cosine similarity search |
| Structured | SQLite | Categories, importance scores, timestamps, agent call logs |
Memory is injected into every system prompt so FRIDAY has context about you, your projects, and past decisions. The more you use FRIDAY, the better it knows you.
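The structured half is easy to picture: rows with category, importance, and timestamp, with the top entries sorted into a prompt preamble. A toy sketch — the semantic/ChromaDB half is omitted, and this class is illustrative, not FRIDAY's actual store:

```python
import time

class MemoryStore:
    """Structured-memory sketch; fields mirror the schema above (category, importance 1-10)."""
    def __init__(self):
        self.rows = []

    def store(self, text: str, category: str, importance: int):
        self.rows.append({"text": text, "category": category,
                          "importance": importance, "ts": time.time()})

    def context(self, top_k: int = 3) -> str:
        # Highest importance first; most recent breaks ties.
        best = sorted(self.rows, key=lambda r: (r["importance"], r["ts"]), reverse=True)[:top_k]
        return "\n".join(f"- [{r['category']}] {r['text']}" for r in best)

mem = MemoryStore()
mem.store("Prefers to be called Travis", "preference", 9)
mem.store("Tried regex-only routing; too brittle", "lesson", 7)
mem.store("Had toast for breakfast", "general", 1)

# Only the important stuff makes it into the system prompt.
print("You are FRIDAY.\nWhat you know:\n" + mem.context(top_k=2))
```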
This is the real autonomy. You tell FRIDAY to watch someone's messages and handle them while you're busy. It runs in the background, checks every 60 seconds, and only acts when something new comes in.
"watch Teddy Bear's messages for the next hour, reply like me"
"check father in law's messages every 60 seconds, reply as friday"
"do the same for My Bby"
"watch my emails for anything from Stripe, notify me"
"check for missed calls every 2 minutes, ping me if anything comes in"
"open LinkedIn and check for new notifications every 5 minutes"
Real example from a live session:
You: "watch father in law's messages, reply as friday"
FRIDAY: Got it. Watching every 60 seconds.
💛 FRIDAY Watch — replied to Ellen's Pap:
"FRIDAY: Hi, I'm FRIDAY — Travis's AI assistant.
He's been busy building me. I read your chat
and noticed you mentioned your eyes — how are
they feeling?"
FRIDAY classifies each watch by keywords and dispatches to the right executor:
| Watch Type | Keywords | What It Does |
|---|---|---|
| iMessage | (default) | Reads messages, reasons about replies, sends as you or FRIDAY |
| "email", "inbox", "gmail" | Reads unread emails, filters by sender keyword, notifies on new matches | |
| Missed Calls | "missed call", "call log" | Reads call history, fingerprints latest, notifies on new missed calls |
| Browser | "linkedin", "website", "notifications" | Opens URL via Playwright, hashes page content, LLM summarizes changes |
iMessage flow:
- Baseline set — first tick records the current conversation state. No phantom replies.
- Every 60s — reads the latest received messages, compares fingerprint against last check.
- Nothing new? — skip. Zero LLM cost. Zero API calls beyond the message read.
- New message? — checks if you already replied. If yes, skip.
- Unreplied? — reads last 20 messages for full context, then 1 LLM call drafts the reply matching the conversation vibe.
- Sends it — updates state so the same message never triggers twice.
Email/Calls/Browser flow: Each tick reads the relevant data, compares against the last known state fingerprint, and sends a phone notification if something new shows up. No auto-replying — just monitoring and alerting.
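The tick-and-fingerprint logic above reduces to a few lines. A sketch, assuming the fingerprint is a hash of the latest received message (FRIDAY's exact fingerprint may differ):

```python
import hashlib

def fingerprint(messages: list) -> str:
    """Hash of the latest received message — enough to detect 'something new'."""
    latest = messages[-1] if messages else ""
    return hashlib.sha256(latest.encode()).hexdigest()

class Watch:
    def __init__(self, initial_messages: list):
        # Baseline on creation — no phantom replies to old messages.
        self.last_fp = fingerprint(initial_messages)

    def tick(self, messages: list) -> bool:
        """One 60s tick: True means something new arrived and needs handling."""
        fp = fingerprint(messages)
        if fp == self.last_fp:
            return False        # nothing new — zero LLM cost
        self.last_fp = fp       # remember, so the same message never triggers twice
        return True

w = Watch(["hey travis"])
print(w.tick(["hey travis"]))                # False — matches baseline
print(w.tick(["hey travis", "you there?"]))  # True — draft a reply
print(w.tick(["hey travis", "you there?"]))  # False — already handled
```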
Not every message needs a reply. If someone says "okay" or "lol" or drops a thumbs up, FRIDAY leaves it alone. The LLM decides: does this actually need a response, or would replying be forced?
Type @friday in any iMessage conversation and FRIDAY picks it up. She reads the full thread, understands the vibe, and jumps in as herself. She addresses what the other person said AND what you said. She's got your back.
Note: For @friday tagging to work, a watch must be active for that conversation. The watch is what checks for new messages every 60 seconds — that's how FRIDAY sees your tag. No watch, no pickup.
This turns iMessage into a command interface for FRIDAY. You never leave the chat. The other person doesn't know you're directing an AI mid-conversation — they just think you're having a laugh.
You (in iMessage): "I'm innocent 😂 @friday defend me here wai"
💛 FRIDAY Watch — replied:
"FRIDAY: 😂😂😂 As told by Travis, who's busy not telling me
to chill. (I'm just the AI, don't shoot the messenger)"
That's not a technical benchmark or a briefing output — it's FRIDAY holding down your relationships at 2:50am while you build her. Unscripted. Reading the room. Personality fully there.
FRIDAY figures out who she should be based on context:
- "reply as me" — replies as you. Your tone, your energy. The other person doesn't know it's AI.
- "reply as friday" — prefixes with "FRIDAY:" so they know it's the AI.
- You tag FRIDAY — text "@friday am I lying?" or "@friday defend me" in the actual iMessage conversation and FRIDAY jumps in as herself, backs you up.
- You introduce FRIDAY — if you text "she's my AI, called Friday" in the conversation, FRIDAY picks up on it and starts replying as herself.
- They mention FRIDAY — if the other person says "Friday stop" or "Friday please", FRIDAY switches to herself and responds to what they said.
FRIDAY won't commit you to things:
- Calls — "I'm busy building something right now, I'll call you later"
- Money — "Noted, I'll keep it in mind" / "I'll send it when I'm ready"
- Plans — deflects, says you're working on something
- "Stop replying" — FRIDAY respects it, lets them know you're busy
Say "actually reply as FRIDAY" or "change it to every 2 minutes" — FRIDAY updates the existing watch for that contact instead of creating a duplicate.
/clearwatches # kill all active watches instantly
Or tell FRIDAY naturally: "cancel all watches", "stop watching Teddy's messages"
Set recurring tasks in plain English. FRIDAY converts them to cron schedules.
"every morning at 8am, run my briefing"
"every friday at 5pm, check my emails and send me a summary"
"every 30 minutes, check if the gov.uk visa page changed"
Managed conversationally — create, list, delete, toggle on/off. Persisted in SQLite, survives restarts.
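FRIDAY's scheduler uses APScheduler's `CronTrigger` under the hood; this stdlib-only sketch just shows how a 5-field expression (produced upstream by the LLM from your plain English) is checked against the clock:

```python
from datetime import datetime

def cron_matches(expr: str, dt: datetime) -> bool:
    """Match a 5-field cron expression (minute hour day month weekday)
    against a datetime. Handles '*' and plain numbers only — the real
    scheduler supports the full cron grammar. Caveat: cron weekday is
    0=Sunday while Python's weekday() is 0=Monday."""
    fields = expr.split()
    values = [dt.minute, dt.hour, dt.day, dt.month, dt.weekday()]
    return all(spec == "*" or int(spec) == val
               for spec, val in zip(fields, values))

# "every morning at 8am, run my briefing" → "0 8 * * *"
assert cron_matches("0 8 * * *", datetime(2025, 1, 6, 8, 0))
assert not cron_matches("0 8 * * *", datetime(2025, 1, 6, 9, 0))
```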
FRIDAY sends you alerts via iMessage to your own number. Works instantly, even in DND if you add your own number to the allowed list.
💛 FRIDAY Watch — replied to Teddy Bear: "I miss you too, rest up"
↑ sent to phone
Every watch task reply, heartbeat alert, and proactive notification hits your phone. No custom app needed (yet).
For anything that needs multiple agents working together — research papers, reports, improving existing documents. FRIDAY breaks the task into phases, dispatches parallel sub-agents, and produces a real deliverable.
"do a deep research about energy barriers and create a paper on my desktop"
"read my thesis, research its topics, improve it to research-paper grade"
"I have an idea about using fans and litmus paper to create a tiny missile for a school project. research it and build a submission-ready file in my downloads"
"write a detailed report about AI in healthcare"
"research quantum computing breakthroughs in 2025 and save a report"
"analyze the impact of social media on mental health and create a detailed paper"
Files are saved to the location you specify. If you don't specify, FRIDAY saves to ~/Documents/friday_files/ — keeps your Desktop clean.
- Planner (1 LLM call) — breaks the task into phases with typed steps: SEARCH, FETCH, READ_FILE, WRITE
- Phase execution — steps in the same phase run in parallel. Phase 1 might read an existing file. Phase 2 dispatches 4-6 search agents simultaneously, each running multiple queries + page fetches.
- Section writers (parallel LLM calls) — each section of the document is written by a separate LLM call, all at once. Research data is partitioned so each writer focuses on its section.
- Synthesis (1 LLM call) — writes the abstract and conclusion across all sections.
- Saves to disk — wherever you specify, or `~/Documents/friday_files/` by default. Defaults to `.docx` format. Supports `.docx`, `.md`, `.txt`, `.pdf` — just say the format in your request.
Why is it fast? Because nothing waits. Phase 2 fires 4-6 search agents at once — while one is fetching a page, three others are running different queries. Section writers all run simultaneously — a 6-section paper generates all 6 sections in parallel, not one after another. The only sequential parts are planning (1 LLM call) and final synthesis (1 LLM call). Everything in between is parallel.
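That execution shape — sequential across phases, concurrent within a phase — is a few lines of `asyncio`. A minimal sketch, with stand-in step functions (the real steps are typed SEARCH/FETCH/READ_FILE/WRITE executors):

```python
import asyncio

async def run_phases(phases):
    """Phases run in order; every step inside a phase runs concurrently."""
    results = []
    for phase in phases:
        results.append(await asyncio.gather(*(step() for step in phase)))
    return results

async def demo():
    async def search(topic):
        await asyncio.sleep(0.01)      # stands in for search + page fetches
        return f"notes on {topic}"
    phases = [
        [lambda: search("plasma shields"),
         lambda: search("electromagnetic fields"),
         lambda: search("metamaterials")],   # all three fire at once
    ]
    return await run_phases(phases)

notes = asyncio.run(demo())
```

`asyncio.gather()` preserves input order, so each section writer later receives the research partition it expects regardless of which search finished first.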
You: "do a research about elements and materials that can create energy barriers —
plasma shields, electromagnetic fields, metamaterials. save a research paper."
◈ Planning task structure...
◈ Plan: 16 steps across 3 phases
◈ Phase 1: [READ_FILE] Read existing background knowledge
◈ Phase 2: [SEARCH] Research plasma shields
◈ Phase 2: [SEARCH] Research electromagnetic fields
◈ Phase 2: [SEARCH] Research metamaterials
◈ Phase 2: [SEARCH] Comparative study of energy barrier technologies
◈ Phase 2: [SEARCH] Locate peer-reviewed scientific literature
◈ Phase 2: [SEARCH] Analyze practical and economic considerations
◈ Phase 2: [SEARCH] Future directions and advancements
◈ Phase 1: running 1 steps in parallel...
◈ Phase 2: running 7 steps in parallel...
◈ Phase 3: running 8 steps in parallel...
◈ Data gathered: 7 sources, 56000 chars
◈ Writing 8 sections...
◈ Sections written: 8/8
◈ Writing abstract and conclusion...
◈ Assembling final document...
◈ Done. 8 sections, 0 sources, 31680 chars.
Saved to ~/Documents/friday_files/Research_on_Energy_Barriers_...20260327.docx
(200s, 36 tool calls)
8 sections. 31,680 characters. 36 tool calls. 7 web sources fetched. One .docx file with abstract, table of contents, full sections, conclusion, and references. All from a single sentence.
| You say | Saves to |
|---|---|
| "save on my desktop" | ~/Desktop/ |
| "save in downloads" | ~/Downloads/ |
| Nothing specified | ~/Documents/friday_files/ |
Default is .docx. Say the format in your request to override:
| You say | Format |
|---|---|
| "save as a docx" / nothing specified | .docx |
| "save as markdown" / "save as .md" | .md |
| "save as a text file" / "save as .txt" | .txt |
| "save as pdf" | .pdf |
You can also convert existing files: "convert my thesis to pdf", "change that report to markdown".
- School/uni submissions — "I have an idea about X for my school project. Research it and build a submission-ready file"
- Work reports — "write a detailed report about our Q1 performance metrics"
- Thesis improvement — "read my thesis at ~/Documents/thesis.md, research its topics, improve it"
- Idea exploration — "I think metamaterials could be used for cloaking. Deep dive and save a paper"
- Literature review — "do a comprehensive literature review on CRISPR gene editing"
- Competitive analysis — "research the top 5 AI coding assistants and create a comparison report"
Have an existing document? FRIDAY improves it.
"read my thesis at ~/Documents/thesis.md and improve it to research-paper grade"
FRIDAY reads what you wrote, researches the topics it finds, rewrites each section with new evidence and citations, and preserves your voice throughout. Not a rewrite — an upgrade.
- Reads your document (Phase 1) — understands structure, arguments, voice
- Researches the topics it finds (Phase 2, parallel) — 4-6 search agents fire simultaneously
- Rewrites each section (Phase 3, parallel) — strengthens arguments, adds citations, fills gaps
- Preserves your voice — the ideas stay yours. The evidence and structure get better.
Works with any format FRIDAY can read: .docx, .md, .txt.
Ask FRIDAY to look at your screen — read text, understand what's on it, or solve every question on a page. On-command only, never watches passively. Privacy-gated behind FRIDAY_SCREEN_ACCESS=true in .env.
"what's on my screen" → OCR + vision analysis
"what error is this" → diagnoses errors on screen
"read the text on screen" → Apple Vision OCR
"solve the questions on my screen" → full-page capture + solve + .docx
"open Safari and solve the questions on that page" → targets a specific app
"just solve what's on my screen right now" → viewport-only, no scrolling
Screen Reading:
- Takes a screenshot of the frontmost window (not full screen — no dock/menu bar noise)
- Runs Apple Vision OCR (offline, free, fast) to extract all text
- If Qwen2.5-VL is available (via Ollama), sends the image for full visual understanding
- If no vision model, falls back to OCR text + LLM to answer
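That fallback chain can be sketched as follows — all three callables are placeholders for illustration, not FRIDAY's real interfaces:

```python
def analyze_screen(screenshot_path, ocr, vision_model=None, llm=None):
    """Sketch of the fallback chain: OCR always runs; the vision model is
    preferred when available, otherwise the OCR text goes to a text-only LLM."""
    text = ocr(screenshot_path)        # Apple Vision OCR — offline, always available
    if vision_model is not None:
        return vision_model(screenshot_path, text)   # full visual understanding
    if llm is not None:
        return llm(f"Screen text:\n{text}")          # text-only fallback
    return text                        # worst case: raw extracted text
```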
Full-Page Question Solver:
The killer feature. FRIDAY scrolls through an entire page (browser, PDF, Word doc — any app), OCRs every viewport, deduplicates overlapping text, then solves every question it finds. Answers are saved to a well-formatted .docx with proper headings, bold terms, numbered lists, and structured explanations.
How it works under the hood:
- Activates the target app (if specified) and clicks the content area
- Scrolls to the top of the page (`Cmd+Up`)
- Captures + OCRs each viewport, scrolls down, repeats (up to 20 pages)
- Deduplicates overlapping text between frames (filters UI chrome before comparison)
- Cleans OCR output — strips browser toolbar, menu bar, short UI fragments
- Sends clean text to LLM with structured solving prompt
- Saves formatted answers to `~/Documents/friday_files/Screen_Answers_<timestamp>.docx`
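The dedup step can be sketched as a line-level merge — a simplification of the real pipeline, which filters UI chrome before comparing frames:

```python
def merge_frames(frames):
    """Merge consecutive OCR'd viewports, dropping lines already seen.
    Caveat: set-based dedup also collapses legitimately repeated lines,
    which is fine for question text but lossy in general."""
    seen, merged = set(), []
    for frame in frames:
        for line in frame.splitlines():
            line = line.strip()
            if line and line not in seen:
                seen.add(line)
                merged.append(line)
    return "\n".join(merged)
```

Because each scroll overlaps the previous viewport, the tail of frame N reappears at the head of frame N+1; the merge keeps exactly one copy of each question.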
Works with any scrollable app. Tested on Safari with a 20-page workbook — captured all questions, solved them with detailed paragraph-length answers.
App Targeting: Say which app to look at and FRIDAY activates it before capturing. "Open Safari and solve the questions" or "solve questions in Preview". If you don't specify, it uses whatever's in front.
Viewport-Only Mode: When you just want the current view solved without scrolling the whole page, say "just solve what's on my screen" — captures one frame, solves, done.
Screenshots auto-delete after 48 hours. Nothing is stored permanently.
Setup:
# Required — enable screen access
echo "FRIDAY_SCREEN_ACCESS=true" >> .env
# Optional — pull vision model for full image understanding
ollama pull qwen2.5vl:7b

Without the vision model, FRIDAY can still read all text on screen (OCR) and answer questions about it. The vision model adds app/UI/diagram recognition.
| Command | Description |
|---|---|
| `/quit` | Exit FRIDAY |
| `/clear` | Reset conversation history |
| `/memory` | Show recent stored memories |
| `/voice` | Toggle voice pipeline on/off |
| `/listening-off` | Pause ambient listening |
| `/listening-on` | Resume ambient listening |
| `/clearwatches` | Kill all active watch tasks |
FRIDAY isn't generic. It's built for Travis — a Ghanaian founder based in Plymouth, UK who builds at 2-4am. It understands:
- Ghanaian expressions — "hawfar", "oya", "chale", "e no do", "time no dey", "sharp sharp"
- Energy matching — casual gets casual, urgent gets urgent
- No corporate tone — no "Certainly!", no bullet-point self-descriptions, no fluff
- 2am rule — late night = less polish, more honest
▶ hawfar
FRIDAY E dey. You good? What are we doing.
▶ chale this thing no dey work
FRIDAY What broke. Show me.
▶ time no dey
FRIDAY Sharp sharp. What's the priority.
| Component | Technology | Why |
|---|---|---|
| LLM (cloud) | Qwen3-32B via Groq | 32B params, sub-100ms latency, 535 tok/s, OpenAI-compatible API |
| LLM (local) | Qwen3.5-9B via Ollama | 9B params, fully offline fallback, thinking toggle, Apache 2.0 |
| Package Manager | uv | 10-100x faster than pip |
| Web Search | Tavily | Built for AI agents, returns structured data, AI answers |
| Vector DB | ChromaDB | Lightweight, embedded, cosine similarity |
| Structured DB | SQLite | Zero-config, built into Python |
| CLI Framework | Rich + prompt_toolkit | Beautiful output, history, auto-suggest |
| HTTP | httpx | Async, modern, follow redirects |
| Google APIs | google-api-python-client + google-auth-oauthlib | Gmail OAuth2 |
| Calendar | AppleScript + macOS Calendar.app | Native iCloud/local calendar, no API keys |
| iMessage | SQLite (chat.db) + AppleScript Messages.app | Read conversations + send texts, no API keys |
| FaceTime | AppleScript FaceTime.app | Initiate calls, multi-number support |
| Contacts | AppleScript Contacts.app | Fuzzy search, nickname resolution, emoji support |
| Cron/Scheduler | APScheduler CronTrigger + SQLite | User-defined scheduled tasks, persistent across restarts |
| Standing Orders | APScheduler (30s ticks) + SQLite + LLM reasoning | Watch iMessages (auto-reply), emails, missed calls, browser pages — type-classified dispatch |
| Phone Notifications | iMessage to self | Instant alerts to iPhone, DND bypass capable |
| Screen Vision & Solver | Apple Vision (Swift OCR) + Qwen2.5-VL (Ollama) | Screen reading, full-page scroll+OCR, question solver → formatted .docx |
| Browser Automation | Safari (Selenium) + Playwright fallback | Safari = your sessions/cookies, no login walls. Playwright fallback for headless. |
| TV Control | pywebostv + wakeonlan | LG TV local API over WiFi, no cloud dependency |
| Background Jobs | APScheduler | Persistent monitor scheduling, async event loop integration |
| PDF Generation | WeasyPrint + Jinja2 | CV and cover letter PDF rendering, clean A4 layout |
| PDF Processing | pypdf + pdfplumber | Read, merge, split, rotate, encrypt, extract text/tables |
| Social Media | tweepy (X API v2) | Post, search, mentions, engage — pay-as-you-go credits |
| Voice Activity | Silero VAD v6 | <1ms/chunk, enterprise-grade end-of-speech detection |
| Speech-to-Text | MLX Whisper (whisper-small) | 10x faster than whisper.cpp on Apple Silicon, always local |
| Text-to-Speech (cloud) | ElevenLabs Flash v2.5 | ~75ms streaming latency, PCM 24kHz, persistent connections |
| Text-to-Speech (local) | Kokoro-82M (ONNX) | 82M params, natural voice, Apache 2.0, ~500ms synthesis |
| Audio I/O | python-sounddevice | Callback-based, clean macOS support |
- Multi-agent orchestrator with smart routing
- 11 specialist agents (Code, Research, Deep Research, Memory, Comms, System, Household, Monitor, Briefing, Job, Social)
- Tool library (web, file, terminal, memory, email, calendar, mac, browser)
- Gmail integration — read, search, send, draft, edit draft, send draft, label, thread
- macOS/iCloud Calendar integration — day/week view, create events (no API keys needed)
- Mac control — AppleScript, app launcher, screenshots, volume, dark mode
- Screen vision — OCR (Apple Vision, offline) + image understanding (Qwen2.5-VL), auto-cleanup after 48h
- Full-page question solver — scroll + OCR entire pages, solve all questions, save formatted .docx, app targeting, viewport-only mode
- Browser automation — Safari (Selenium, your sessions) + Playwright fallback, login detection
- LG TV control — WebOS local API + WakeOnLan (no cloud)
- Persistent monitoring — URL/topic/search watchers with material change detection
- Briefing system — morning/evening/quick briefings from monitor alerts + email + calendar
- Job agent — CV tailoring, cover letters, PDF generation (WeasyPrint + Jinja2)
- Background scheduler — APScheduler runs monitor checks on configurable intervals
- Background process management — start, monitor, kill
- Hybrid memory (ChromaDB + SQLite)
- Streaming CLI with hacker aesthetic
- Smart thinking control (84s → 5s for simple queries)
- Personality + Ghanaian expression understanding
- Known source injection for research
- Vague query detection (ask before wasting time)
- Conversation context injection (agents remember recent turns)
- Live tool call status during agent work
- Compacted tool results for 9B model compatibility
- Voice pipeline — Silero VAD + MLX Whisper + Kokoro TTS
- `--voice` flag and `/voice` runtime toggle
- Response filter (strips code/markdown for speech, condenses to 3 sentences)
- Activation chime, barge-in support, feedback prevention
- Both CLI and voice work simultaneously (shared FridayCore instance)
- Direct agent dispatch — regex skips routing LLM (4 → 2 LLM calls per query)
- Direct briefing — parallel tools + 1 LLM synthesis (12+ → 1 LLM call)
- Parallel tool execution — `asyncio.gather()` when multiple tools in one response
- Background agent execution — user keeps chatting while agents work
- Live status updates — `◈ checking emails...` → `◈ synthesizing...`
- Streaming synthesis — agent results stream token-by-token to CLI and voice
- Expanded fast path — greeting prefixes, Ollama error recovery
- Unified routing — all queries go through dispatch, LLM always has DISPATCH_TOOL
- Direct tool dispatch — LLM picks from 9 curated tools in 1 call (agents become fallback)
- 7-tier routing: fast path → user override → oneshot → direct dispatch → agent → fast chat → full LLM
- User override — `@comms`, `@research`, `@social` etc. bypasses routing entirely
- Dual-model architecture — Qwen3.5:9B (primary) + Qwen3:4B (fast)
- Briefing per-task timeouts — prevents one slow API from blocking everything
- Oneshot error fallbacks — instant error responses instead of falling through to slow agents
- Fast chat tier — slim prompt, truncated context, 10-15s conversational responses
- TTFT as primary UX metric — median 3.7s, 69% responsive (<6s)
- Cloud LLM via Groq API (Qwen3-32B, sub-100ms latency, 535 tok/s)
- All LLM paths routed through `cloud_chat()` — tool dispatch, agents, formatting, chat
- Automatic fallback to local Ollama when cloud unavailable or API key unset
- Thinking block filtering (`<think>...</think>`) for Qwen reasoning models
- Stream format bridging — Ollama and OpenAI chunk formats unified via `extract_stream_content()`
- Average response time: 54s → 6.5s (8x improvement)
- Split 1955-line orchestrator into 6 focused modules (prompts, router, fast_path, oneshot, briefing, orchestrator)
- LLM-based intent classification via Groq (~1s) with regex fallback for offline use
- Research agent benchmarks: 45-90s → 4-6s (12x improvement)
- Clean cloud/local auto-switch: no API key = fully local, with key = cloud
- Always-on ambient listening — mic stays open, all speech transcribed continuously
- Trigger word activation — say "Friday" naturally mid-conversation, no wake word needed
- Rolling transcript buffer — 5 minutes of ambient context, injected when triggered
- Follow-up window — 15 seconds after response, any speech treated as directed at FRIDAY
- Cloud TTS — ElevenLabs Flash v2.5 streaming (~75ms), Kokoro local fallback
- Noise/hallucination filtering — parenthetical descriptions, music, TV all filtered out
- VAD tuning — threshold 0.7 filters background music, 400ms min speech
- `/listening-off` and `/listening-on` CLI commands
- Cloud vs local TTS — set/remove `ELEVENLABS_API_KEY` in `.env` to switch
- iMessage integration — read conversations from `chat.db`, send via AppleScript, NSAttributedString parsing
- FaceTime integration — initiate video/audio calls, multi-number contact handling
- Contact resolution — fuzzy matching with word-overlap scoring, nickname/emoji support
- Heartbeat system — proactive background loop (30min default), zero-LLM silent ticks, 1 LLM synthesis only when urgent
- Configurable via `HEARTBEAT.md` — plain English, editable at runtime
- Quiet hours (1am-7am), daily alert cap (3/day), morning briefing trigger
- Cron scheduler — user-defined scheduled tasks, standard 5-field cron expressions
- Cron tools — `create_cron`, `list_crons`, `delete_cron`, `toggle_cron` (conversational creation)
- Standing orders (watch tasks) — "watch X's messages for the next hour, reply like me"
- Watch task reasoning — LLM decides if a message needs a reply (skips "okay", "lol", thumbs up)
- Watch identity switching — reply as Travis or as FRIDAY based on instruction + conversation context
- Auto-detection — if Travis introduces FRIDAY or the other person mentions her, she switches to herself
- @friday tagging — type `@friday` in iMessage mid-conversation and she jumps in (requires active watch)
- Deflection rules — never agrees to calls, money, or plans. Deflects casually.
- Watch deduplication — updating a watch for the same contact modifies the existing one, no duplicates
- Baseline-first — first tick records state, only replies on genuinely new messages after watch creation
- Watch type classification — keyword dispatch to iMessage, email, calls, or browser executors
- Email watch — reads unread emails, filters by sender keyword, notifies on new matches
- Call log watch — reads missed calls, fingerprints latest, notifies on new missed calls
- Browser watch — opens URL via Playwright, hashes page content, LLM summarizes changes
- Phone notifications — iMessage to self, instant delivery, works with DND bypass
- `/clearwatches` CLI command — kill all active watches instantly
- All background systems boot automatically on CLI startup
- Screen vision — "can you see what I'm doing", OCR + vision model, privacy-gated, 48h auto-delete
- Full-page question solver — "solve the questions on Safari", scrolls entire page, OCRs + deduplicates, solves all questions, saves formatted .docx with app targeting and viewport-only mode
- Multi-agent deep research — parallel sub-agents (search + fetch + read + write), phased execution, produces real documents saved to disk
- Skill system (knowledge docs agents read before executing)
- Fine-tuning data collection from sessions
- QLoRA fine-tune on smaller model (personality + routing baked into weights)
- Additional agents (Git, Deploy, Database)
- Self-hosted inference on Modal/RunPod (for privacy or custom fine-tuned models)
- FRIDAY iOS app — native push notifications via APNs, full assistant UI
- Mac Mini server — FRIDAY runs 24/7, ngrok/tunnel for remote access
- Redis async messaging between agents
- MCP server integration (Twilio official MCP available)
- Screenpipe integration (screen context awareness)
- Self-improving loop (auto fine-tune from corrections)
- Multi-user support
- Plugin/extension system
FRIDAY uses Groq for cloud inference — an OpenAI-compatible API running Qwen3-32B at 535 tokens/second with sub-100ms latency. This is what makes FRIDAY feel instant.
Running Qwen3.5-9B locally on an M4 MacBook Air averaged 54s per response. The M4 Air is fanless — under sustained LLM load, the GPU thermally throttles, slowing inference 2-15x. A 2-call search query took 25-45s. Agent tasks took 45-90s. Cloud inference brought the average down to 6.5s — an 8x improvement.
We tested 4 models across 3 providers:
| Model | Avg Time | Tool Accuracy | Issues |
|---|---|---|---|
| Llama 3.3 70B (Groq) | 8.2s | 60% | Malformed tool calls, string-typed args, fake tool names |
| Llama 3.1 8B (Groq) | 4.8s | 85% | Fast but wrong answers, hallucinated specs |
| Kimi K2 (Groq) | 12.1s | 70% | Slow on follow-ups, 35-48s for some queries |
| Qwen3-32B (Groq) | 6.5s | 100% | Zero tool call failures, best personality match |
Qwen3-32B won on every metric: zero tool failures, accurate search results, proper Ghanaian personality, and fast enough to feel responsive.
All LLM calls go through cloud_chat() in friday/core/llm.py. If Groq is available, it uses the cloud. If not, it silently falls back to local Ollama. No code changes needed to switch.
cloud_chat()
├─ Groq API available? → use cloud (sub-second per call)
└─ No API key or network down? → fall back to local Ollama (10-25s per call)
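The decision above reduces to a try/except. A sketch of the shape — not the actual `cloud_chat()` body, and both callables are stand-ins:

```python
def chat_with_fallback(messages, groq_call, ollama_call):
    """Try the cloud first; any failure (no key, network down, API error)
    silently drops to local Ollama. Callers never need to know which ran."""
    try:
        return groq_call(messages)       # cloud: sub-second per call
    except Exception:
        return ollama_call(messages)     # local: 10-25s per call, fully offline
```

Because every caller goes through one chokepoint, swapping providers is a config change rather than a code change.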
| Query Type | Local Ollama (M4 Air) | With Groq | Speedup |
|---|---|---|---|
| Greetings, TV commands | <1s | <1s | Same (no LLM) |
| Search query (oneshot) | 25-45s | 3-5s | ~8x |
| Research agent (2 LLM + Tavily) | 45-90s | 4-6s | ~12x |
| Agent task (ReAct loop) | 45-90s | 5-10s | ~9x |
| Intent classification | 10-25s (regex only) | ~1s (LLM) | ~15x |
| Casual chat | 10-25s | 0.5-2s | ~10x |
| Average | ~54s | ~5s | ~10x |
FRIDAY auto-detects what's available. No config flags, no code changes — just environment variables. See Quick Start for setup.
GROQ_API_KEY set?
├─ Yes → cloud_chat() uses Groq API (~1s per call)
│ classify_intent() uses LLM for smart agent routing
│ Auto-fallback to Ollama if Groq is unreachable
│
└─ No → cloud_chat() routes to local Ollama (~10-25s per call)
classify_intent() skips, regex handles all routing
Zero cloud calls, fully offline capable
ELEVENLABS_API_KEY set?
├─ Yes → TTS uses ElevenLabs Flash v2.5 (~75ms streaming)
│ Falls back to Kokoro if cloud fails
│
└─ No → TTS uses Kokoro-82M ONNX (~500ms local synthesis)
Zero cloud calls, fully offline
To switch: add or remove the API key from .env and restart FRIDAY. That's it.
All config lives in friday/core/config.py:
# Cloud LLM (Groq — default, fastest)
CLOUD_API_KEY = os.getenv("GROQ_API_KEY", "")
CLOUD_BASE_URL = os.getenv("CLOUD_BASE_URL", "https://api.groq.com/openai/v1")
CLOUD_MODEL_NAME = os.getenv("CLOUD_MODEL", "qwen/qwen3-32b")
USE_CLOUD = bool(CLOUD_API_KEY) # Auto-enable if key present
# Local Ollama (fallback)
MODEL_NAME = "qwen3.5:9b" # Local model (used when cloud unavailable)
OLLAMA_BASE_URL = "http://localhost:11434"

Environment variables (.env):
TAVILY_API_KEY=your-key-here
GROQ_API_KEY=gsk_... # Optional — enables cloud LLM inference
ELEVENLABS_API_KEY=... # Optional — enables cloud TTS (local Kokoro fallback)
ELEVENLABS_VOICE_ID=JBFqnCBsd6RMkjVDRZzb # Optional — defaults to "George"
Google credentials (managed by google_auth.py):
~/.friday/google_credentials.json # OAuth2 client config (from Google Cloud Console)
~/.friday/google_token.json # Auto-saved after first auth
- Speed first, local always available — cloud inference via Groq for sub-second LLM calls. Automatic fallback to local Ollama when offline. Remove the API key and everything runs on your machine.
- Agents are specialists — each agent gets focused context and tools. No god-agent.
- Memory is identity — FRIDAY remembers you. That's what makes it personal.
- Speed over perfection — streaming, think control, fast routing. Latency kills the vibe.
- Personality is not optional — a tool without personality is just a tool.
FRIDAY uses Groq's cloud API for fast inference. Current pricing for the model we use:
| Qwen3-32B on Groq | Pricing |
|---|---|
| Input | $0.29 / million tokens |
| Output | $0.59 / million tokens |
| Speed | 662 tokens/sec |
| Free tier | Yes — free credits on signup |
What does this cost in practice? A typical FRIDAY query uses ~500 input tokens and ~200 output tokens. That's ~$0.0003 per query. $1 covers ~3,000 queries. The free tier is more than enough for personal use.
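The arithmetic, spelled out with the rates from the table above:

```python
INPUT_PER_MTOK = 0.29    # $ per million input tokens
OUTPUT_PER_MTOK = 0.59   # $ per million output tokens

def query_cost(input_tokens=500, output_tokens=200):
    """Cost of one typical FRIDAY query at Groq's Qwen3-32B rates."""
    return (input_tokens * INPUT_PER_MTOK + output_tokens * OUTPUT_PER_MTOK) / 1e6

cost = query_cost()              # $0.000263 per query at exact rates
queries_per_dollar = 1 / cost    # ~3,800 exact; ~3,000 with the rounded $0.0003 figure
```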
Sign up at console.groq.com — no credit card required for the free tier.
Apache License 2.0 — see LICENSE for full text.
Attribution: If you use FRIDAY in your project, product, or research, please credit the original author:
Built on FRIDAY by Travis Moore (Angelo Asante)
See NOTICE for full attribution requirements.
Built at 2am in Plymouth, UK. By Travis Moore.