ARIA

Autonomous Responsive Interactive Assistant

ARIA is an AI-powered personal assistant with a 3D avatar that speaks to you. Built on a multi-agent architecture powered by Claude, ARIA combines intelligent conversation routing, real-time voice synthesis, and an expressive avatar to create a natural, interactive AI experience.

    █████╗ ██████╗ ██╗ █████╗
   ██╔══██╗██╔══██╗██║██╔══██╗
   ███████║██████╔╝██║███████║
   ██╔══██║██╔══██╗██║██╔══██║
   ██║  ██║██║  ██║██║██║  ██║
   ╚═╝  ╚═╝╚═╝  ╚═╝╚═╝╚═╝  ╚═╝

What is ARIA?

ARIA is a voice-first AI assistant designed to feel like a conversation with a real person. When you speak to ARIA:

1. Your voice is transcribed in real time
2. An intelligent Orchestrator routes your request to the right specialized agent
3. A specialized Agent (Code, Teach, Research, etc.) generates a response
4. ARIA speaks the response through a 3D animated avatar

Unlike text-only chatbots, ARIA is designed for spoken interaction. Responses are concise (2-4 sentences), conversational, and optimized for speech synthesis.

Key Features

Multi-Agent Architecture - Specialized agents for coding, teaching, research, and more
3D Animated Avatar - TalkingHead WebGL avatar with mood-based expressions
Browser-Based TTS - HeadTTS/Kokoro voice synthesis runs locally in your browser via WebGPU
Conversation Memory - Cache Augmented Generation (CAG) maintains context across turns
Conversation Mixer - Adjust verbosity, creativity, and response style in real time
Cost-Optimized - Smart model tiering (Haiku for routing, Sonnet for tasks, Opus for complex reasoning)
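
To make the Conversation Memory feature more concrete, here is a minimal sketch of a bounded turn cache that keeps recent context available for each new request. The class name ConversationCache, the max_turns limit, and the message shape are illustrative assumptions, not ARIA's actual CAG implementation.

```python
from collections import deque


class ConversationCache:
    """Hypothetical bounded cache of recent conversation messages."""

    def __init__(self, max_turns: int = 10) -> None:
        # One turn = one user message + one assistant message.
        # When the deque is full, the oldest message is dropped automatically.
        self._messages = deque(maxlen=max_turns * 2)

    def add(self, role: str, content: str) -> None:
        if role not in ("user", "assistant"):
            raise ValueError(f"unknown role: {role}")
        self._messages.append({"role": role, "content": content})

    def context(self) -> list:
        """Return cached messages, oldest first, ready to prepend to a request."""
        return list(self._messages)

    def reset(self) -> None:
        """Forget everything, as a "reset" request would."""
        self._messages.clear()
```

A fixed-size deque keeps memory use predictable; a real cache might instead trim by token count or summarize older turns.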

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                                 BROWSER                                 │
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐                    │
│  │ TalkingHead │   │   HeadTTS   │   │   Whisper   │                    │
│  │  3D Avatar  │   │   Kokoro    │   │     STT     │                    │
│  │   (WebGL)   │   │  (WebGPU)   │   │  (OpenAI)   │                    │
│  └──────┬──────┘   └──────┬──────┘   └──────┬──────┘                    │
│         │                 │                 │                           │
│         └─────────────────┼─────────────────┘                           │
│                           │ WebSocket                                   │
└───────────────────────────┼─────────────────────────────────────────────┘
                            │
┌───────────────────────────┼─────────────────────────────────────────────┐
│                                 SERVER                                  │
│                           ▼                                             │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │                        ORCHESTRATOR (Haiku)                         │ │
│ │                Routes requests to specialized agents                │ │
│ └─────────────────────────────────┬───────────────────────────────────┘ │
│                                   │                                     │
│     ┌───────────┬───────────┬─────┴─────┬───────────┬───────────┐       │
│     ▼           ▼           ▼           ▼           ▼           ▼       │
│ ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐  ┌────────┐  │
│ │CONVERSE│  │ TEACH  │  │  CODE  │  │RESEARCH│  │ ASSIST │  │  DEMO  │  │
│ │ Sonnet │  │ Sonnet │  │Sonnet/O│  │Sonnet/O│  │ Sonnet │  │ Sonnet │  │
│ └────────┘  └────────┘  └────────┘  └────────┘  └────────┘  └────────┘  │
│                                   │                                     │
│                      ┌────────────┴────────────┐                        │
│                      │ CAG Conversation Cache  │                        │
│                      └────────────┬────────────┘                        │
│                                   │                                     │
│                      ┌────────────┴────────────┐                        │
│                      │       Claude API        │                        │
│                      │       (Anthropic)       │                        │
│                      └─────────────────────────┘                        │
└─────────────────────────────────────────────────────────────────────────┘

Agents

Agent	Model	Purpose
Orchestrator	Haiku	Fast routing, intent classification
Converse	Sonnet	General conversation, chitchat
Teach	Sonnet	Educational explanations with analogies
Code	Sonnet → Opus	Programming help, code review
Research	Sonnet → Opus	Deep analysis, fact-finding
Assist	Sonnet	Task planning, productivity
Demo	Sonnet	Interactive demonstrations

The Code and Research agents automatically escalate from Sonnet to Opus for complex tasks.
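
As a rough illustration of tiering and escalation, the sketch below picks a model per agent and upgrades Code and Research to Opus when a task is flagged as complex. The pick_model function and the boolean is_complex flag are assumptions made for this example; how ARIA actually judges complexity is not described in this README.

```python
from enum import Enum


class ModelTier(Enum):
    HAIKU = "claude-3-5-haiku-latest"
    SONNET = "claude-sonnet-4-20250514"
    OPUS = "claude-opus-4-20250514"


# Agents that may escalate from Sonnet to Opus, per the Agents table.
ESCALATING_AGENTS = {"Code", "Research"}
SONNET_AGENTS = {"Converse", "Teach", "Assist", "Demo"} | ESCALATING_AGENTS


def pick_model(agent: str, is_complex: bool = False) -> ModelTier:
    """Hypothetical helper: choose a model tier for an agent."""
    if agent == "Orchestrator":
        return ModelTier.HAIKU
    if agent not in SONNET_AGENTS:
        raise ValueError(f"unknown agent: {agent}")
    if is_complex and agent in ESCALATING_AGENTS:
        return ModelTier.OPUS
    return ModelTier.SONNET
```

For example, pick_model("Code", is_complex=True) returns ModelTier.OPUS, while pick_model("Teach", is_complex=True) stays on ModelTier.SONNET because Teach is not listed as an escalating agent.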

Requirements

macOS on Apple Silicon (M1/M2/M3/M4) - optimized for native performance
Python 3.11+
Anthropic API Key - Get one at console.anthropic.com
Modern Browser - Chrome, Edge, or Safari with WebGPU support

Note: TTS runs entirely in your browser via WebGPU. No additional audio dependencies required on the server.

Installation

1. Clone the Repository

git clone https://github.com/yourusername/aria.git
cd aria

2. Create Virtual Environment

python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

3. Configure Environment

cp .env.example .env

Edit .env and add your Anthropic API key:

ANTHROPIC_API_KEY=sk-ant-api03-your-key-here
ARIA_HOST=0.0.0.0
ARIA_PORT=8080
LOG_LEVEL=INFO

4. (Optional) Generate SSL Certificates

For microphone access, browsers require HTTPS. Generate self-signed certificates:

mkdir -p certs
openssl req -x509 -newkey rsa:4096 -keyout certs/key.pem -out certs/cert.pem -days 365 -nodes -subj "/CN=localhost"

Running ARIA

Quick Start

./run.sh

Then open your browser to: http://localhost:8080

Startup Options

./run.sh              # Standard startup
./run.sh --debug      # Enable debug logging
./run.sh --reload     # Auto-reload on code changes
./run.sh --ssl        # HTTPS mode (required for microphone)

Manual Startup

source venv/bin/activate
python -m src.main --port 8080

First Run Notes

TTS Model Download: The first time you load ARIA, the browser will download the Kokoro TTS model (~80MB). This may take 30-60 seconds.
Microphone Access: To use voice input, you must run with --ssl and accept the self-signed certificate in your browser.

Usage

Speaking to ARIA

Click the microphone button or press Space to start speaking
Speak naturally - ARIA waits for a 1.2-second pause before responding
ARIA's response appears in the chat and is spoken by the avatar
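
The pause-based turn taking described above can be sketched as a small end-of-utterance check. This assumes the client has timestamps (in seconds) for the moments when speech was detected; the function name and input format are hypothetical, and only the 1.2-second threshold comes from this README.

```python
PAUSE_SECONDS = 1.2  # silence required before the utterance counts as finished


def utterance_finished(speech_times, now, pause=PAUSE_SECONDS):
    """Return True once the speaker has been silent for at least `pause` seconds.

    `speech_times` is a list of timestamps at which speech was detected
    (an assumed input format). An empty list means the user has not
    started speaking yet, so there is nothing to finish.
    """
    if not speech_times:
        return False
    return now - max(speech_times) >= pause
```

The Pause control in the Conversation Mixer would map naturally onto the pause argument.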

Conversation Mixer

The equalizer-style mixer panel lets you adjust ARIA's responses in real time:

Control	Range	Effect
Speed	0.5x - 2x	TTS playback speed
Pause	0 - 4s	Delay before responding
Verbosity	1 - 5	Response length (brief ↔ detailed)
Creativity	1 - 5	Response variation (precise ↔ creative)
Formality	1 - 5	Tone (casual ↔ professional)
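
One way to keep mixer values inside the ranges in the table is to clamp them before they are sent or applied. The sketch below is an assumption about how such validation could look; the key names mirror the mixer_settings message in the API Reference, and clamp_mixer_settings is a hypothetical helper.

```python
# Allowed ranges, taken from the mixer table.
MIXER_RANGES = {
    "speed": (0.5, 2.0),
    "pauseTime": (0.0, 4.0),
    "verbosity": (1, 5),
    "creativity": (1, 5),
    "formality": (1, 5),
}


def clamp_mixer_settings(settings: dict) -> dict:
    """Return a copy of `settings` with each known control clamped to its range.

    Keys without a numeric range (such as "mode") are passed through unchanged.
    """
    clamped = dict(settings)
    for key, (low, high) in MIXER_RANGES.items():
        if key in clamped:
            clamped[key] = min(max(clamped[key], low), high)
    return clamped
```

Clamping silently corrects out-of-range input; rejecting it with an error is an equally reasonable design if you would rather surface client bugs.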

Mode Presets

Quick - Brief, to-the-point answers
Explain - Balanced educational responses
Ideate - Creative brainstorming mode
Deep - Thorough, detailed analysis

Agent Routing

ARIA automatically routes your requests:

"Hey, how's it going?" → Converse (chitchat)
"Explain how neural networks work" → Teach (education)
"Write a Python function to..." → Code (programming)
"What are the pros and cons of..." → Research (analysis)
"Help me plan my project" → Assist (task planning)
"Show me what you can do" → Demo (demonstration)

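The routing examples above can be mimicked with a toy keyword heuristic. In ARIA the Orchestrator uses Haiku for intent classification, so the sketch below is only a stand-in that shows the shape of the decision; the keyword lists and the route function are invented for illustration.

```python
# Hypothetical keyword hints per agent, checked in order; the first match wins.
ROUTING_HINTS = [
    ("Code", ("function", "python", "bug", "code")),
    ("Teach", ("explain", "how does", "teach me")),
    ("Research", ("pros and cons", "compare", "analyze")),
    ("Assist", ("plan", "schedule", "organize")),
    ("Demo", ("what you can do", "demo")),
]


def route(message: str) -> str:
    """Return the agent name for a message, defaulting to Converse."""
    text = message.lower()
    for agent, hints in ROUTING_HINTS:
        if any(hint in text for hint in hints):
            return agent
    return "Converse"
```

Substring matching like this misfires easily (any message containing "plan" goes to Assist), which is one reason to prefer a model-based classifier for real routing.
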
Project Structure

aria/
├── src/
│   ├── main.py              # Entry point
│   ├── agents/              # Agent implementations
│   │   ├── base.py          # Base agent class
│   │   ├── orchestrator.py  # Routing agent
│   │   ├── converse.py      # General conversation
│   │   ├── teach.py         # Educational explanations
│   │   ├── code.py          # Programming assistance
│   │   └── specialized.py   # Research, Assist, Demo
│   ├── core/
│   │   ├── aria.py          # Main orchestrator
│   │   └── cache/           # CAG conversation cache
│   └── api/
│       └── websocket.py     # WebSocket & REST API
│
├── frontend/
│   └── index.html           # Single-page web application
│
├── certs/                   # SSL certificates (optional)
├── docs/                    # Documentation & presentations
├── run.sh                   # Startup script
├── requirements.txt         # Python dependencies
├── .env.example             # Environment template
└── CLAUDE.md                # Project instructions

API Reference

WebSocket Endpoint

ws://localhost:8080/ws/chat

Client → Server Messages

// Send a message
{ "type": "message", "content": "Hello ARIA" }

// Reset conversation
{ "type": "reset" }

// Get statistics
{ "type": "stats" }

// Update mixer settings
{
  "type": "mixer_settings",
  "settings": {
    "speed": 1.0,
    "pauseTime": 1,
    "verbosity": 3,
    "creativity": 3,
    "formality": 2,
    "mode": "quick"
  }
}
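
For scripted clients, the messages above can be produced with small helpers. This is a sketch using only the standard library; the helper names are hypothetical, and sending the resulting strings over the socket is left to whichever WebSocket client you use.

```python
import json


def chat_message(content: str) -> str:
    """Serialize a user message for the /ws/chat endpoint."""
    return json.dumps({"type": "message", "content": content})


def control_message(kind: str) -> str:
    """Serialize a payload-free control message ("reset" or "stats")."""
    if kind not in ("reset", "stats"):
        raise ValueError(f"unsupported control message: {kind}")
    return json.dumps({"type": kind})


def mixer_message(**settings) -> str:
    """Serialize a mixer update, e.g. mixer_message(verbosity=2, mode="quick")."""
    return json.dumps({"type": "mixer_settings", "settings": settings})
```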

Server → Client Messages

// Thinking indicator
{ "type": "thinking", "agent": "Orchestrator" }

// Response
{
  "type": "response",
  "content": "Hello! How can I help you today?",
  "agent": "Converse",
  "mood": "happy",
  "tokens": 42
}

// Error
{ "type": "error", "message": "..." }
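
On the receiving side, a client typically dispatches on the type field. A minimal sketch, assuming each frame is a JSON object shaped like the examples above; describe_server_message is a hypothetical helper, and unknown types are reported rather than rejected.

```python
import json


def describe_server_message(raw: str) -> str:
    """Turn one server frame into a short human-readable line."""
    msg = json.loads(raw)
    kind = msg.get("type")
    if kind == "thinking":
        return f"[{msg.get('agent', '?')}] thinking..."
    if kind == "response":
        mood = msg.get("mood", "neutral")
        return f"[{msg.get('agent', '?')}/{mood}] {msg.get('content', '')}"
    if kind == "error":
        return f"error: {msg.get('message', 'unknown error')}"
    return f"unhandled message type: {kind}"
```

A real client would hand the content to the TTS engine and the mood to the avatar instead of formatting a string.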

REST Endpoints

Endpoint	Method	Description
`/`	GET	Serve main interface
`/health`	GET	Health check
`/api/stats`	GET	Global statistics
`/api/presentations`	GET	List available presentations
`/api/presentation/{filename}`	GET	Serve presentation PDF

Configuration

Environment Variables

Variable	Default	Description
`ANTHROPIC_API_KEY`	required	Your Anthropic API key
`ARIA_HOST`	`0.0.0.0`	Server bind address
`ARIA_PORT`	`8080`	Server port
`LOG_LEVEL`	`INFO`	Logging level (DEBUG, INFO, WARNING, ERROR)
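
A loader for these variables might look like the following sketch. It reads from a mapping (defaulting to os.environ) so it can be exercised without touching the real environment; the Settings dataclass and load_settings function are assumptions for illustration, not ARIA's actual startup code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: str
    host: str = "0.0.0.0"
    port: int = 8080
    log_level: str = "INFO"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY not set")
    log_level = env.get("LOG_LEVEL", "INFO").upper()
    if log_level not in {"DEBUG", "INFO", "WARNING", "ERROR"}:
        raise ValueError(f"invalid LOG_LEVEL: {log_level}")
    return Settings(
        api_key=api_key,
        host=env.get("ARIA_HOST", "0.0.0.0"),
        port=int(env.get("ARIA_PORT", "8080")),
        log_level=log_level,
    )
```

Failing fast on a missing key gives a clearer error at startup than a rejected API call later.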

Model Configuration

Models are configured in the agent classes:

from enum import Enum


class ModelTier(Enum):
    HAIKU = "claude-3-5-haiku-latest"    # Fast, cheap - routing
    SONNET = "claude-sonnet-4-20250514"  # Balanced - most tasks
    OPUS = "claude-opus-4-20250514"      # Powerful - complex reasoning

Troubleshooting

"ANTHROPIC_API_KEY not set"

Ensure your .env file exists and contains a valid API key:

grep ANTHROPIC_API_KEY .env

No Sound / TTS Not Working

Ensure you're using a WebGPU-compatible browser (Chrome 113+, Edge 113+)
Check the browser console for WebGPU errors
The first load downloads a ~80MB model - wait for it to complete

Microphone Not Working

Run ARIA with the --ssl flag: ./run.sh --ssl
Accept the self-signed certificate warning in your browser
Grant microphone permissions when prompted

Avatar Not Displaying

Check browser console for WebGL errors
Ensure hardware acceleration is enabled in your browser
Try refreshing the page

High Latency

Check your internet connection to Anthropic's API
Use the "Quick" mode preset in the mixer for faster responses
Lower the Verbosity slider to get shorter responses

Development

Running Tests

pytest tests/ -v

Debug Mode

./run.sh --debug

This enables:

Verbose logging
Detailed API request/response logs
WebSocket message tracing

Auto-Reload

./run.sh --reload

Automatically restarts the server when Python files change.

Technology Stack

Component	Technology	Purpose
Backend	FastAPI + Uvicorn	WebSocket API server
AI Brain	Claude API (Anthropic)	Multi-agent conversation & reasoning
Image Gen	Gemini API (Google)	AI image generation (optional)
Image Gen	Local Stable Diffusion	Local image generation (optional)
Avatar	TalkingHead	WebGL 3D avatar
TTS	HeadTTS / Kokoro	Browser-based voice synthesis (WebGPU)
STT	OpenAI Whisper	High-accuracy transcription (default)
STT	Web Speech API	Browser-based speech recognition (fallback)
Frontend	Vanilla JS	Single-page application

API Keys

API Key	Purpose	Get it at
`ANTHROPIC_API_KEY`	Required - Claude AI	console.anthropic.com
`OPENAI_API_KEY`	Recommended - Whisper STT (default, more accurate)	platform.openai.com/api-keys
`GEMINI_API_KEY`	AI image generation (optional)	aistudio.google.com/apikey

License

This software and associated documentation files (the "Software") are the exclusive property of [Your Name].

NO PERMISSION is granted to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software without explicit written permission from the copyright holder.

Commercial use requires a separate licensing agreement and royalty payments. Contact: sealmindset@gmail.com, ravance@gmail.com

Acknowledgments

Anthropic - Claude API and multi-agent architecture inspiration
TalkingHead - WebGL avatar rendering
Kokoro - Browser-based TTS via WebGPU
Original PULSE project - Foundation and design patterns

Built with Claude on Apple Silicon
