Shadow — Your AI Coworker for macOS

Shadow is a floating macOS AI agent, powered by the Gemini Live API, that helps you control your computer with voice and vision. Talk to Shadow and it sees your screen, opens apps, organizes files, searches the web, creates documents, and more.

Demo

Watch the 4-minute demo video →

Features

Voice Control — Tap the character to talk. Shadow listens via Gemini's native audio streaming.
Screen Vision — Shares your screen in real time, so Shadow can see and interact with any app.
30 Tools — File management, app control, mouse/keyboard automation, Google Search, image generation, PDF/DOCX creation & editing, and more.
Two Modes:
- Guided — Asks before every action, highlights targets on screen
- Auto — Acts immediately, narrates what it's doing
Multilingual — Switch languages on the fly (English, Spanish, French, German, Portuguese, Arabic, Chinese, Japanese, Korean, Hindi)
Document Creation — Creates and edits Word documents (.docx) with AI-generated images, and fills PDF forms
Persistent Memory — Remembers your preferences across sessions
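
The difference between the two modes can be pictured as a small gate in front of every action. The following is an illustrative sketch only, not Shadow's actual code: `Mode`, `run_action`, and the `confirm` and `narrate` callbacks are hypothetical names chosen for this example.

```python
from enum import Enum
from typing import Callable, Optional


class Mode(Enum):
    GUIDED = "guided"  # ask before every action
    AUTO = "auto"      # act immediately and narrate


def run_action(
    mode: Mode,
    description: str,
    action: Callable[[], str],
    confirm: Callable[[str], bool],
    narrate: Callable[[str], None] = print,
) -> Optional[str]:
    """Run `action` according to the mode; return None if the user declines."""
    if mode is Mode.GUIDED:
        # Guided: nothing happens until the user approves.
        if not confirm(f"OK to {description}?"):
            narrate(f"Skipped: {description}")
            return None
    else:
        # Auto: no prompt, just say what is being done.
        narrate(f"Doing: {description}")
    return action()
```

In a real agent the `confirm` callback would be a spoken question plus an on-screen highlight, and `narrate` would be voice output; here both are plain callables so the control flow is easy to see.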

Architecture

Tech Stack

Component	Technology
Live Voice + Vision	`gemini-2.5-flash-native-audio` via Gemini Live API
Tool Planning + Search	`gemini-3-flash-preview` with Google Search grounding
Image Generation	`gemini-3.1-flash-image-preview`
Screen Vision	`gemini-3.1-flash-lite-preview`
macOS App	Swift, AppKit, AVFoundation, ScreenCaptureKit
Backend	Python, FastAPI, WebSocket, Google GenAI SDK
Deployment	Google Cloud Run + Artifact Registry

Quick Start

Prerequisites

macOS 14+ (Sonoma or later)
Xcode 15+
Python 3.11+
Google AI API key

1. Clone & Setup Backend

git clone https://github.com/AnassKartit/shadow.git
cd shadow/backend

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Set your API key
echo "GOOGLE_API_KEY=your-key-here" > .env

# Start the backend
python main.py

The backend runs at ws://localhost:8000/ws. Health check: curl http://localhost:8000/health
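
If you would rather script the health check than run curl by hand, a minimal sketch using only the Python standard library might look like this. The function name `backend_is_healthy` is made up for this example, and it assumes only that `/health` answers with HTTP 200 when the backend is up; the response body is not inspected.

```python
import urllib.error
import urllib.request


def backend_is_healthy(base_url: str = "http://localhost:8000", timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx response.
        return False


if __name__ == "__main__":
    print("backend up" if backend_is_healthy() else "backend not reachable")
```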

2. Build & Run the Swift App

# Open in Xcode
open Shadow/Shadow.xcodeproj

# Or build from command line
cd Shadow && xcodebuild -scheme Shadow -configuration Debug build

Grant Microphone and Screen Recording permissions when prompted
The Shadow character appears as a floating panel on the right side of your screen
Click the character to start talking

3. Deploy Backend to Google Cloud (One Click)

To deploy manually with the gcloud CLI:

cd backend
gcloud run deploy shadow-backend \
  --source . \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars "GOOGLE_API_KEY=your-key-here,SHADOW_AUTH_TOKEN=your-secret-token"
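
Cloud Run reports an `https://` service URL after a deploy, while the app talks to the backend over WebSocket. The sketch below shows one way to derive the WebSocket address from that URL. It assumes the deployed backend exposes the same `/ws` path as the local one; `to_ws_url` is a hypothetical helper, and the hostname in the usage line is a placeholder, not a real deployment.

```python
from urllib.parse import urlsplit, urlunsplit


def to_ws_url(service_url: str, path: str = "/ws") -> str:
    """Map an http(s) service URL to the matching ws(s) endpoint."""
    parts = urlsplit(service_url)
    scheme = {"https": "wss", "http": "ws"}.get(parts.scheme)
    if scheme is None:
        raise ValueError(f"expected an http(s) URL, got {service_url!r}")
    # Keep host and port, replace the path, drop query and fragment.
    return urlunsplit((scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    # Placeholder hostname for illustration only.
    print(to_ws_url("https://shadow-backend-example.a.run.app"))
```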

4. Configure the App

Click the gear icon in Shadow to open settings:

Gemini API Key — Optional if your backend already has GOOGLE_API_KEY set
Auth Token — Must match the SHADOW_AUTH_TOKEN on your backend (protects your endpoint from unauthorized use)
Backend — Choose Cloud (default), Local, or Custom URL for your own deployment
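
To make the "must match" requirement concrete, here is a sketch of how a backend could compare the token sent by the app with its `SHADOW_AUTH_TOKEN` environment variable. This is an assumption about the general approach, not a description of Shadow's actual check; `token_is_valid` is a hypothetical name.

```python
import hmac
import os
from typing import Optional


def token_is_valid(presented: Optional[str], expected: Optional[str] = None) -> bool:
    """Compare a client-supplied token with the configured secret."""
    if expected is None:
        expected = os.environ.get("SHADOW_AUTH_TOKEN", "")
    # Reject when no secret is configured or no token was sent.
    if not expected or presented is None:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(presented.encode(), expected.encode())
```

Rejecting requests when no secret is configured is a deliberate choice in this sketch: an endpoint deployed with `--allow-unauthenticated` is publicly reachable, so an empty token should not count as a match.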

Project Structure

shadow/
├── Shadow/Shadow/           # Swift macOS app
│   ├── ShadowApp.swift      # App delegate, setup, message routing
│   ├── FloatingPanel.swift  # Floating character UI
│   ├── OverlayWindow.swift  # Transparent highlight overlay
│   ├── AudioManager.swift   # Mic capture + audio playback
│   ├── ScreenCapture.swift  # Screen frame capture
│   ├── BackendClient.swift  # WebSocket client
│   └── HotkeyManager.swift  # Global keyboard shortcuts
├── backend/
│   ├── main.py              # FastAPI WebSocket + Gemini Live session
│   ├── shadow_agent/
│   │   ├── agent.py         # System prompt + tool registration
│   │   └── tools.py         # 30 computer-use tools
│   ├── Dockerfile           # Cloud Run deployment
│   └── requirements.txt
└── README.md

Tools (30)

Category	Tools
Files (9)	`list_files`, `search_files`, `read_file`, `create_folder`, `move_files`, `organize_desktop_files`, `find_duplicates`, `find_large_files`, `find_recent_file`
Apps & System (6)	`open_app`, `open_file`, `open_url`, `focus_app`, `applescript`, `run_command`
Screen & Input (5)	`computer_action`, `click_mouse`, `type_text`, `press_key`, `analyze_screen`
Documents (7)	`read_pdf`, `read_docx`, `create_docx`, `edit_docx`, `fill_pdf`, `get_pdf_fields`, `create_note_with_images`
Search (1)	`google_search_and_open` (with grounding)
Images (2)	`generate_explainer`, `generate_and_paste`
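
A common way to expose a set of tools like this to a model is a name-to-function registry with a single dispatch point. The sketch below illustrates that pattern using two tool names from the table. The signatures, the `tool` decorator, and `dispatch` are assumptions made for this example and may not match `tools.py`.

```python
from pathlib import Path
from typing import Any, Callable, Dict, List

TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def list_files(directory: str) -> List[str]:
    """Return the sorted entry names in `directory`."""
    return sorted(p.name for p in Path(directory).iterdir())


@tool
def create_folder(path: str) -> str:
    """Create `path` (and any missing parents) and return it."""
    Path(path).mkdir(parents=True, exist_ok=True)
    return path


def dispatch(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Run a tool call by name and wrap the outcome in a plain dict."""
    fn = TOOLS.get(name)
    if fn is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        return {"ok": True, "result": fn(**args)}
    except Exception as exc:  # report failures to the caller instead of crashing
        return {"ok": False, "error": str(exc)}
```

Returning a dict for both success and failure means an unknown tool name or a bad argument can be reported back to the model as data rather than ending the session.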

Hackathon Requirements

Requirement	How Shadow Meets It
Gemini Live API	`gemini-2.5-flash-native-audio` for real-time voice + screen streaming
Gemini model	`gemini-3-flash-preview` for tool planning and search
Google GenAI SDK	Direct SDK integration for Live API + all tool calls
Google Cloud service	Backend deployed on Cloud Run (us-central1)
Multimodal	Voice in + screen vision + voice out + file/app actions
Beyond text box	Floating character, screen highlights, direct computer control

License

MIT
