Your real-life J.A.R.V.I.S. — an autonomous AI operating system for solo founders.
Voice-controlled. Self-improving. ADHD-native.
One founder + Friday = a 5-person engineering team.
"Will that be all, sir?" — No. Friday does more. Friday is what comes after Jarvis.
In the MCU, Tony Stark replaced J.A.R.V.I.S. with F.R.I.D.A.Y. — an AI that didn't just answer questions, but ran the suit, managed the lab, and made the calls when Tony couldn't. This project is the real-world version of that idea: an always-on AI chief of staff that perceives your workflow, remembers everything, writes production code, and gets smarter every week.
Friday is not a chatbot with a microphone. It's an always-on operating system that wraps around your entire workflow — perceiving what you're doing, remembering everything across projects, generating production-quality code through a validation factory, and getting measurably better every week without manual tuning.
The thesis: most of what slows a solo founder down isn't coding ability — it's context-switching overhead, decision fatigue, forgotten context, and the constant tax of re-explaining what you're working on. Friday eliminates all of it.
You: "Friday, add a health check endpoint to the API that returns uptime and DB status"
↓ intent classified (MEDIUM) → memory retrieved → spec generated
↓ routed to Claude Code → code generated → syntax + lint + tests validated
↓ checkpoint committed → result delivered
Friday: "Done. Health endpoint at /health, returns uptime and pg connection status. Tests pass."
| Problem | How Friday solves it |
|---|---|
| "What was I working on?" | Persistent memory + morning briefings reconstruct your context automatically |
| Context-switching kills momentum | ADHD-native focus tracking detects drift, parks side-quests, nudges you back |
| AI coding tools produce inconsistent output | Code Factory pipeline: spec → decompose → generate → validate → fix — not hope |
| Every AI tool starts from zero | Friday remembers your architecture decisions, coding patterns, and past mistakes forever |
| Token costs spiral with dumb context injection | Complexity classification routes 70% of tasks with near-zero token cost |
| AI assistants require constant hand-holding | Screen awareness means Friday already knows your project before you speak |
┌─────────────────────────────────────────────────────────────┐
│ YOUR MACHINE │
│ ┌─────────────┐ ┌────────────┐ ┌─────────────────────┐ │
│ │ Wake Word │ │ Screen │ │ TTS Playback │ │
│ │ (Porcupine) │ │ Awareness │ │ (streaming) │ │
│ └──────┬───────┘ └─────┬──────┘ └──────▲──────────────┘ │
│ │ Mic capture │ Window/focus │ │
│ ▼ + VAD ▼ tracking │ │
│ ┌──────────────────────────────────────────┐ │
│ │ WebSocket Client (WSS) │ │
│ └──────────────────┬───────────────────────┘ │
└─────────────────────┼───────────────────────────────────────┘
│ Persistent connection
┌─────────────────────┼───────────────────────────────────────┐
│ SERVER ▼ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ 7-Step Orchestrator │ │
│ │ 1. Transcribe 2. Classify 3. Retrieve memory │ │
│ │ 4. Brain call 5. Dispatch 6. Write memory │ │
│ │ 7. Log outcome │ │
│ └───────┬──────────┬──────────┬──────────┬─────────────┘ │
│ │ │ │ │ │
│ ┌───────▼───┐ ┌────▼────┐ ┌──▼───┐ ┌───▼────────────┐ │
│ │ Memory │ │ Brain │ │ Code │ │ Self-Improve │ │
│ │ SQLite + │ │ OpenAI │ │ Fac- │ │ Saturday Review │ │
│ │ ChromaDB │ │ Claude │ │ tory │ │ Night Shift │ │
│ │ Vector │ │ Ollama │ │ │ │ Consolidation │ │
│ └───────────┘ └─────────┘ └──────┘ └────────────────┘ │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │
│ │ ADHD │ │ Frontend │ │ Research │ │ War Room │ │
│ │ Tracker │ │ Pipeline │ │ Engine │ │ State Doc │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────────┘ │
└──────────────────────────────────────────────────────────────┘
- Wake word detection via Porcupine — say "Friday" and talk naturally
- Screen awareness — tracks your active window, detects which project you're in, captures context every 5 seconds
- Voice Activity Detection — knows when you've stopped speaking, handles interruptions
- Sub-2-second end-to-end voice response latency
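The end-of-speech part of VAD can be sketched as an energy-threshold check over recent audio frames (the threshold, frame size, and silence window below are illustrative, not the client's actual tuning):

```python
def utterance_complete(frames, threshold=500.0, silence_frames=30):
    """Return True once the tail of `frames` holds `silence_frames`
    consecutive low-energy frames, i.e. the speaker has stopped talking."""
    if len(frames) < silence_frames:
        return False

    def energy(frame):
        # Mean squared amplitude as a crude loudness measure
        return sum(s * s for s in frame) / len(frame)

    return all(energy(f) < threshold for f in frames[-silence_frames:])

# Loud speech alone is not complete; speech followed by silence is.
speech = [[1000, -1000] * 80] * 20
silence = [[3, -2] * 80] * 30
print(utterance_complete(speech))            # False — still talking
print(utterance_complete(speech + silence))  # True — utterance done
```

A production VAD also handles barge-in (the user interrupting TTS), which this sketch omits.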
- Vector search (ChromaDB + sentence-transformers) across all your projects, decisions, and corrections
- Token-budgeted retrieval — trivial tasks get zero context, complex tasks get full memory
- Contradiction detection — new facts automatically supersede old ones
- Weekly consolidation — deduplicates, archives stale entries, resolves conflicts
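Token-budgeted retrieval reduces to a greedy packing problem: take the best-ranked hits that fit. A minimal sketch, assuming hits arrive sorted best-first and tokens are approximated as characters divided by four:

```python
def budgeted_context(hits, budget_tokens):
    """Greedily pack the highest-ranked memory hits into the token budget.
    `hits` are (score, text) pairs sorted best-first, as a vector search
    would return them."""
    picked, used = [], 0
    for score, text in hits:
        cost = max(1, len(text) // 4)  # rough chars-to-tokens estimate
        if used + cost > budget_tokens:
            continue  # this hit doesn't fit; smaller ones still might
        picked.append(text)
        used += cost
    return picked

hits = [(0.92, "Project uses FastAPI" * 10),  # large entry
        (0.85, "DB is Postgres 16"),
        (0.40, "User prefers tabs")]
print(budgeted_context(hits, budget_tokens=20))  # large entry skipped
```

With a budget of 0 (a TRIVIAL task), nothing is retrieved and the brain call carries no memory context at all.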
Not a single LLM call — a full production pipeline:
- Natural language → structured TaskSpec
- Spec → atomic TaskUnits with a dependency graph
- Route to the best tool (Claude Code / Codex / Cursor) based on empirical performance
- Generate → validate (syntax + lint + tests) → diagnose failures → retry
- Git checkpoint before every change, automatic rollback on failure
- Max 3 retry cycles per step — fail with diagnosis, never silently ship broken code
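The generate → validate → retry loop with checkpointing can be sketched as follows (the callback interface is hypothetical; the real pipeline drives external coding tools):

```python
def run_step(generate, validate, checkpoint, rollback, max_retries=3):
    """Checkpoint before any change; retry with failure feedback; roll
    back and surface a diagnosis instead of shipping broken code."""
    checkpoint()  # git checkpoint before touching anything
    feedback = None
    for attempt in range(1, max_retries + 1):
        code = generate(feedback)
        ok, diagnosis = validate(code)  # syntax + lint + tests
        if ok:
            return {"status": "ok", "code": code, "attempts": attempt}
        feedback = diagnosis  # feed the failure into the next attempt
    rollback()  # automatic rollback after exhausting retries
    return {"status": "failed", "diagnosis": feedback, "attempts": max_retries}

# Toy example: first attempt fails lint, second passes.
attempts = iter(["x=1", "x = 1"])
result = run_step(
    generate=lambda fb: next(attempts),
    validate=lambda c: (c == "x = 1", "E225 missing whitespace"),
    checkpoint=lambda: None,
    rollback=lambda: None,
)
print(result["status"], result["attempts"])
```

The key property is the failure path: after three attempts the step fails loudly with the last diagnosis attached, and the checkpoint guarantees the working tree is restorable.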
Built for brains that context-switch:
- Focus stack — ordered list of what you're working on, always visible
- Drift detection — notices when you're bouncing between projects, nudges you back
- Scope parking — "noted, parking that for after the current task"
- Decision compression — picks the best option, gives you 3 seconds to override
- Energy gating — pushes back on deep work requests after 11pm
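Drift detection can be sketched as a sliding window over window-focus events: too many distinct projects in a short span means you are bouncing. The window size and threshold below are illustrative stand-ins for the calibrated values:

```python
from collections import deque

def detect_drift(window_events, span=10, distinct_limit=3):
    """Flag drift when the last `span` window switches touch more than
    `distinct_limit` distinct projects."""
    recent = deque(window_events, maxlen=span)  # keep only the tail
    return len(set(recent)) > distinct_limit

print(detect_drift(["api", "api", "docs", "api"]))                  # False
print(detect_drift(["api", "docs", "blog", "infra", "api", "ml"]))  # True
```

When drift is flagged, the intervention is a nudge back to the top of the focus stack, not a hard block.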
Friday gets better every week without you doing anything:
- Saturday Review — 6 automated jobs: prompt refinement, memory consolidation, routing optimization, design system evolution, correction mining, ADHD calibration
- Night Shift — runs test suites, health checks, and micro-improvements while you sleep
- Monday briefing — what Friday learned last week, delivered with your morning coffee
Specialized UI generation that avoids AI slop:
- Design system enforcement (HSL tokens, typography scale, spacing scale)
- Component-level isolation — each component gets its own brain call with only design tokens
- Visual QA via headless Playwright + vision model review at 3 viewport widths
- Anti-AI-aesthetic rules hardcoded as constraints
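Design-system enforcement amounts to a lint pass over generated CSS: reject literal colors, allow only approved tokens. A sketch, with hypothetical token names standing in for the real design system:

```python
import re

def uses_only_tokens(css, tokens=("--color-bg", "--color-fg", "--color-accent")):
    """Reject any literal color value; components may only reference
    design tokens via var(...)."""
    literals = re.findall(r"#[0-9a-fA-F]{3,8}\b|rgb\(|hsl\(", css)
    vars_used = re.findall(r"var\((--[\w-]+)\)", css)
    return not literals and all(v in tokens for v in vars_used)

print(uses_only_tokens("color: var(--color-fg); background: var(--color-bg);"))  # True
print(uses_only_tokens("color: #fff;"))                                          # False
```

Because each component is generated in isolation with only the token list as context, a check like this is enough to keep output on-system without re-reviewing the whole page.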
Your operational command center:
- Active projects with status, open commitments, approaching deadlines
"Friday, status"→ voice summary of everything that matters right now- Priority triage: what should you work on next, with reasoning
- Python 3.11+
- A brain provider API key (OpenAI, Anthropic, or local Ollama)
```bash
git clone https://github.com/user/friday.git
cd friday
python -m venv .venv && source .venv/bin/activate   # or .venv\Scripts\activate on Windows
pip install -e ".[dev]"
cp .env.example .env
```

```bash
# Edit .env — at minimum set:
# BRAIN_PROVIDER=openai
# BRAIN_API_KEY=sk-...
# BRAIN_MODEL_NAME=gpt-4o-mini
```

```bash
# Terminal 1 — Server
make run-server

# Terminal 2 — Client (requires mic + speakers)
make run-client

# Say "Friday, hello" and start talking
```

```bash
make test   # 142 tests
make lint   # ruff check
```

Start the server and open http://localhost:8765/dashboard — a dark, editorial control panel for monitoring memory, outcomes, focus state, and system health.
friday/
├── client/ # Local PC daemon
│ ├── audio/ # Wake word, mic capture, VAD, TTS playback
│ ├── screen/ # Window tracking, focus stack, project detection
│ ├── connection/ # WebSocket client with auto-reconnect
│ └── main.py # Entry point
├── server/ # Remote server
│ ├── api/ # WebSocket handler, dashboard API, auth, health
│ ├── pipeline/ # 7-step orchestrator, intent, transcription, TTS
│ ├── brain/ # Multi-provider LLM router (OpenAI, Anthropic, Ollama)
│ ├── memory/ # SQLite + ChromaDB, retrieval, writer, consolidation
│ ├── code_factory/ # Spec → decompose → generate → validate pipeline
│ ├── frontend/ # Design system, component gen, visual QA pipeline
│ ├── actions/ # Shell, filesystem, comms, research dispatchers
│ ├── adhd/ # Focus tracker, interventions, calibration
│ ├── war_room/ # Persistent operational state
│ └── improvement/ # Saturday review, night shift, outcome logging, backup
├── shared/ # Protocol, models, constants
├── dashboard/ # Static HTML/CSS/JS control panel
├── deploy/ # Systemd service files
├── scripts/ # Setup, seeding, stress test
└── tests/ # 142 tests covering all modules
All configuration via environment variables (loaded by Pydantic Settings):
| Variable | Default | Description |
|---|---|---|
| `BRAIN_PROVIDER` | `openai` | LLM backend: `openai`, `anthropic`, `local` (Ollama) |
| `BRAIN_API_KEY` | — | API key for the configured brain provider |
| `BRAIN_MODEL_NAME` | `gpt-4o-mini` | Model name passed to the provider |
| `OPENAI_API_KEY` | — | OpenAI key for Codex tool routing |
| `ANTHROPIC_API_KEY` | — | Anthropic key for Claude models |
| `FRIDAY_DATA_DIR` | `./data` | Root directory for SQLite, ChromaDB, logs |
| `FRIDAY_TOKEN_BUDGET_DAILY` | `500000` | Hard daily token cap |
| `FRIDAY_IMPROVEMENT_BUDGET_WEEKLY` | `100000` | Token budget for Saturday Review |
| `FRIDAY_LOG_LEVEL` | `INFO` | Log level (DEBUG, INFO, WARNING) |
| `FRIDAY_WS_SHARED_SECRET` | — | WebSocket auth token (simple mode) |
| `FRIDAY_WS_JWT_SECRET` | — | JWT signing key (advanced mode) |
| `SARVAM_API_KEY` | — | Sarvam AI key for STT/TTS |
| `PORCUPINE_ACCESS_KEY` | — | Picovoice key for wake word detection |
See .env.example for the complete list.
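The env-driven pattern looks roughly like this. The project uses Pydantic Settings; this stdlib sketch just shows the shape, with defaults mirroring the table above:

```python
import os
from dataclasses import dataclass, field

@dataclass
class Settings:
    """Sketch of env-first configuration with typed defaults."""
    brain_provider: str = field(
        default_factory=lambda: os.environ.get("BRAIN_PROVIDER", "openai"))
    brain_model_name: str = field(
        default_factory=lambda: os.environ.get("BRAIN_MODEL_NAME", "gpt-4o-mini"))
    token_budget_daily: int = field(
        default_factory=lambda: int(os.environ.get("FRIDAY_TOKEN_BUDGET_DAILY", "500000")))

settings = Settings()
print(settings.brain_model_name, settings.token_budget_daily)
```

Pydantic Settings adds validation and error messages on top of this (e.g. rejecting a non-integer token budget at startup instead of at first use).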
Every interaction flows through the same pipeline:
- Transcribe — Audio → text via Sarvam AI (primary) or local Whisper (fallback)
- Classify — Deterministic regex for built-in commands, heuristic complexity scoring for everything else
- Retrieve — Token-budgeted memory retrieval: project knowledge, identity facts, correction rules
- Think — Structured JSON response from the brain (model-agnostic, schema-enforced)
- Dispatch — Route to handler: talk, code factory, shell, file system, research, frontend pipeline
- Remember — Write new facts to memory, detect contradictions, deduplicate
- Log — Every interaction → JSONL outcome record for self-improvement analysis
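The seven steps above can be modeled as an ordered pipeline threading a shared context dict; the step bodies here are stand-in stubs, not the real handlers:

```python
def orchestrate(audio, pipeline):
    """Run the ordered pipeline steps; each step reads what earlier
    steps wrote into the shared context."""
    ctx = {"audio": audio}
    for name, step in pipeline:
        ctx[name] = step(ctx)
    return ctx

pipeline = [
    ("text",       lambda c: "add a health check"),        # 1. Transcribe
    ("complexity", lambda c: "MEDIUM"),                    # 2. Classify
    ("memory",     lambda c: ["project uses FastAPI"]),    # 3. Retrieve
    ("plan",       lambda c: {"action": "code_factory"}),  # 4. Think
    ("result",     lambda c: "endpoint added"),            # 5. Dispatch
    ("remembered", lambda c: True),                        # 6. Remember
    ("logged",     lambda c: True),                        # 7. Log
]
print(orchestrate(b"raw-audio", pipeline)["result"])
```

Keeping every interaction on one pipeline is what makes the self-improvement loop possible: step 7 logs a uniform outcome record regardless of which dispatcher handled the request.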
Not every request needs GPT-4. Friday classifies complexity before any model call:
| Complexity | Context Budget | Example |
|---|---|---|
| TRIVIAL | 0 tokens | "rename this file", "fix the typo" |
| SIMPLE | 1,000 tokens | "add an import", "what's this function do?" |
| MEDIUM | 2,000 tokens | "build a login page", "refactor this module" |
| COMPLEX | 3,500 tokens | "redesign the auth system", "plan the migration" |
70% of daily interactions hit TRIVIAL/SIMPLE — near-zero cost.
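A heuristic classifier in this spirit needs no model call at all. The keywords and thresholds below are illustrative, not the shipped scoring rules:

```python
def classify(utterance):
    """Keyword-and-length heuristic mapping an utterance to a
    complexity tier (illustrative thresholds)."""
    words = utterance.lower().split()
    heavy = {"redesign", "migrate", "migration", "architecture", "plan"}
    medium = {"build", "implement", "create", "refactor"}
    if any(w in heavy for w in words):
        return "COMPLEX"
    if any(w in medium for w in words):
        return "MEDIUM"
    return "SIMPLE" if len(words) > 4 else "TRIVIAL"

print(classify("fix the typo"))                       # TRIVIAL
print(classify("build a login page"))                 # MEDIUM
print(classify("plan the migration to Postgres 16"))  # COMPLEX
```

Because the classifier runs before retrieval, a TRIVIAL verdict short-circuits memory lookup entirely, which is where the near-zero cost comes from.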
┌─────────────────────┐
│ User Interaction │
└──────────┬──────────┘
│
┌───────────▼───────────┐
│ Memory Writer │
│ • Deduplication │
│ • Contradiction check │
└───┬─────────┬─────────┘
│ │
┌──────────▼──┐ ┌──▼─────────────┐
│ SQLite │ │ ChromaDB │
│ • Sessions │ │ • Projects │
│ • Projects │ │ • Identity │
│ • Outcomes │ │ • Corrections │
└─────────────┘ └────────────────┘
│ │
┌───▼─────────▼─────────┐
│ Context Retriever │
│ • Token budgeting │
│ • Vector similarity │
│ • Priority ranking │
└───────────────────────┘
```bash
# Server (system service)
sudo cp deploy/friday-server.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now friday-server
```

```bash
# Client (user service)
cp deploy/friday-client.service ~/.config/systemd/user/
systemctl --user enable --now friday-client
```

See deploy/README.md for full instructions.
- Multi-device sync (phone + desktop)
- Plugin system for custom tools
- Real-time pair programming mode with screen sharing
- Native mobile companion app
- Calendar integration for proactive scheduling
- Slack/Discord bot mode
Contributions welcome. Please:
- Fork the repo and create a feature branch
- Write tests for new functionality
- Run make test && make lint before submitting
- Keep PRs focused — one feature per PR
MIT — see LICENSE for details.
Friday — Jarvis was the butler. Friday is the chief of staff.