Echo is a multimodal AI agent that lives on your desktop.
It doesn't just chat—it acts. Using advanced speech-to-speech models and the Model Context Protocol (MCP), Echo listens to your voice, understands your intent, and controls your computer to get things done.
Echo bridges the gap between conversational AI and OS-level control. Most assistants are trapped in a browser tab. Echo integrates with your operating system, allowing it to:
See your screen and understand context.
Hear your voice with sub-second latency (via the Gemini Live API).
Act on your apps, files, and workflows using specialized tools.
Whether you're managing a Google Classroom, automating a complex workflow, or building a presentation, Echo acts as your intelligent co-pilot.
Speak naturally. Echo uses Gemini 2.0 Flash's native audio capabilities for fluid, interruptible, human-like conversation. No "wake words" or robotic pauses.
Echo isn't limited to APIs. It can use your computer like a human:
- App Launching: "Open VS Code and Spotify."
- UI Interaction: Click, type, scroll, and navigate GUI applications.
- Screen Perception: It "sees" what you see to provide context-aware help.
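As an illustration of how a spoken request like "Open VS Code and Spotify" could become concrete actions, here is a minimal sketch of a launch-command parser. The `APP_COMMANDS` mapping and `parse_open_command` helper are hypothetical, not Echo's actual tool registry.

```python
# Hypothetical mapping from spoken app names to executables; Echo's real
# tool registry and command grammar may differ.
APP_COMMANDS = {
    "vs code": ["code"],
    "spotify": ["spotify"],
    "notepad": ["notepad.exe"],
}

def parse_open_command(utterance: str) -> list[list[str]]:
    """Turn 'Open VS Code and Spotify' into a list of launch commands."""
    text = utterance.lower().removeprefix("open ").strip()
    apps = [part.strip() for part in text.split(" and ")]
    return [APP_COMMANDS[app] for app in apps if app in APP_COMMANDS]

# Each returned command could then be started with subprocess.Popen(cmd).
```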
Built on the open standard for AI tools. Echo connects to any MCP server:
- FileSystem: Read/Write files safely.
- Terminal: Execute commands and analyze output.
- Browser: Automate web research and tasks.
- Custom: Add your own tools easily.
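A typical MCP client declares its servers in a JSON config. The file name and top-level key below follow the common `mcpServers` convention (as used by other MCP clients) and are assumptions — check Echo's own loader for the exact shape. The filesystem entry uses the real `@modelcontextprotocol/server-filesystem` package:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/workspace"]
    },
    "custom-tools": {
      "command": "python",
      "args": ["-m", "my_custom_mcp_server"]
    }
  }
}
```

Each entry tells the client how to spawn a server process; the client then discovers that server's tools over the MCP handshake.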
A dedicated module for educators:
- Course Management: Create courses, invite students, and manage rosters.
- Assignment Automation: Draft and publish assignments with attachments.
- Smart Forms: Generate quizzes and feedback forms automatically.
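Echo's Classroom module itself is not shown here, but drafting and publishing an assignment ultimately reduces to a `courses.courseWork.create` request against the Google Classroom API. The field names below (`title`, `workType`, `state`, `materials`) come from that public API; the helper function is an illustrative sketch, not Echo's code.

```python
def build_assignment_body(title: str, description: str,
                          attachment_urls: list[str],
                          max_points: int = 100) -> dict:
    """Build a request body in the shape courses.courseWork.create expects."""
    return {
        "title": title,
        "description": description,
        "workType": "ASSIGNMENT",
        "state": "PUBLISHED",  # use "DRAFT" to stage without publishing
        "maxPoints": max_points,
        # Each attachment becomes a link material on the assignment.
        "materials": [{"link": {"url": url}} for url in attachment_urls],
    }
```

The resulting dict would be passed as `body=` to the authenticated Classroom service client.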
Watch Echo "think" in real-time. The UI visualizes the Chain of Thought (CoT), showing you exactly how the agent plans and executes complex tasks step-by-step.
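Streaming a plan step-by-step can be as simple as emitting one JSON event per reasoning step for the UI to render as it arrives. This is a minimal sketch of that idea; the event fields (`index`, `kind`, `content`) are illustrative, not Echo's actual wire format.

```python
import json
from typing import Iterator

def stream_chain_of_thought(steps: list[tuple[str, str]]) -> Iterator[str]:
    """Yield each reasoning step as a JSON event the UI can render live."""
    for index, (kind, content) in enumerate(steps, start=1):
        yield json.dumps({"index": index, "kind": kind, "content": content})

# Example plan for "Open Notepad and type Hello World":
events = list(stream_chain_of_thought([
    ("plan", "Launch Notepad, then type the greeting"),
    ("tool_call", "app_launcher.open('notepad')"),
    ("observation", "Notepad window is focused"),
]))
```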
Echo uses a hybrid architecture to combine the best of web technologies and native performance.
```mermaid
graph TD
    User((User)) -->|Voice/Text| ElectronUI[Electron App / TUI]

    subgraph Frontend
        ElectronUI -->|WebSocket| Backend
        ElectronUI -->|Render| React[React UI]
    end

    subgraph Core ["Echo Backend (Python)"]
        Backend[FastAPI Server] -->|Orchestrate| Agent[Planner Agent]
        Backend -->|Stream| Voice[Gemini Live API]
        Agent -->|Think| CoT[Chain of Thought]
        Agent -->|Execute| Tools[Tool Manager]
    end

    subgraph Ecosystem ["MCP & APIs"]
        Tools <-->|Connect| MCP[MCP Servers]
        MCP -->|Control| Windows[Windows OS]
        MCP -->|Manage| Classroom[Google Classroom API]
        MCP -->|Access| Files[FileSystem]
    end
```
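The Electron UI and the FastAPI backend exchange messages over a WebSocket. A hedged sketch of what such an envelope might look like (the `type`/`payload` schema and the `WsMessage` name are assumptions, not Echo's documented wire format):

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical envelope for UI <-> backend WebSocket traffic.
@dataclass
class WsMessage:
    type: str       # e.g. "user_text", "agent_audio", "cot_step"
    payload: dict

def encode(msg: WsMessage) -> str:
    """Serialize a message for the wire."""
    return json.dumps(asdict(msg))

def decode(raw: str) -> WsMessage:
    """Parse a wire message back into a WsMessage."""
    data = json.loads(raw)
    return WsMessage(type=data["type"], payload=data["payload"])

raw = encode(WsMessage(type="user_text", payload={"text": "open spotify"}))
```

A tagged envelope like this lets one socket carry voice chunks, text, and Chain-of-Thought events side by side, with the UI dispatching on `type`.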
- Python 3.10+
- Node.js 18+
- Google Gemini API Key (with Live API access)
- Clone the repository

  ```bash
  git clone https://github.com/your-org/echo-desktop-agent.git
  cd echo-desktop-agent
  ```

- Set up the environment

  Echo uses `uv` for fast Python package management (optional but recommended).

  ```bash
  # Install dependencies
  pip install -r requirements.txt

  # Or with uv
  uv sync
  ```

- Configure Credentials

  Create a `.env` file in the root directory:

  ```env
  GEMINI_API_KEY=your_api_key_here

  # Optional: For Classroom features
  GOOGLE_CLIENT_ID=...
  GOOGLE_CLIENT_SECRET=...
  ```
The full visual experience with Voice UI and Chain-of-Thought visualization.

```bash
# Terminal 1: Start the backend
python src/backend/main.py

# Terminal 2: Start the UI
cd electron-app
npm install
npm start
```

A lightweight, hacker-friendly interface for the terminal.
```bash
# Voice Mode (Interactive)
python TUI.py --mode voice

# Fast Command Mode
python TUI.py --command "Open Notepad and type Hello World"
```

- Gemini Live MCP: Detailed guide for the web frontend and Google Classroom integration.
- Quickstart Guide: Extended setup instructions.
- MCP Configuration: Configure connected tool servers.
```text
Echo/
├── TUI.py            # Terminal User Interface entry point
├── electron-app/     # Desktop UI (Node.js/React)
├── gemini_live_mcp/  # Next.js Web Frontend & Classroom Module
├── src/
│   ├── agent/        # Core Agent Logic (Planner, Executor)
│   ├── tools/        # Native Tool Implementations
│   └── utils/        # Helpers for Audio, MCP, Logging
└── tests/            # Unit and Integration Tests
```
Echo is currently a private research project. Contributions are limited to the core team.
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request