"Let your models work together: relay, debate, verify."
TandemAI orchestrates multiple LLMs (local + API) in structured collaborative modes for better AI responses.
- Conversation: Natural chat with 3 models responding sequentially
- Answer: Seed → Refine → Polish for best final answer
- Argumentative: Two models debate → Arbiter decides
- Research: Outline → Suggestions → Rewrite for structured research
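All four modes are variants of a sequential relay, where each model's output becomes the next model's input. A minimal sketch of the Answer mode (the `Step` and `runAnswerMode` names are illustrative, not TandemAI's actual API):

```typescript
// Minimal sketch of a Seed -> Refine -> Polish relay.
// `Step` and `runAnswerMode` are illustrative names, not TandemAI's real API.
type Step = (prompt: string, draft: string) => string;

function runAnswerMode(prompt: string, steps: Step[]): string {
  // Each model in the chain sees the user prompt plus the current draft.
  return steps.reduce((draft, step) => step(prompt, draft), "");
}

// Stub "models" stand in for real provider calls.
const seed: Step = (p) => `Draft answer to: ${p}`;
const refine: Step = (_p, d) => `${d} (refined)`;
const polish: Step = (_p, d) => `${d} (polished)`;

const answer = runAnswerMode("What is 2+2?", [seed, refine, polish]);
// answer === "Draft answer to: What is 2+2? (refined) (polished)"
```

The same pipeline shape covers the other modes by swapping the step roles (e.g. outline/suggest/rewrite for Research).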
Local LLMs (Privacy-First):
- ✅ Ollama (http://localhost:11434/v1)
- ✅ LM Studio (http://localhost:1234/v1)
- ✅ vLLM (http://localhost:8000/v1)
- ✅ llama.cpp (http://localhost:8080/v1)
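All four local servers expose the same OpenAI-compatible /v1/chat/completions endpoint, so one request builder covers them all. A sketch (`buildChatRequest` is a hypothetical helper; the model name is just an example):

```typescript
// Builds a fetch-ready request for any OpenAI-compatible server.
// Works unchanged against Ollama, LM Studio, vLLM, or llama.cpp,
// because they all accept the same /v1/chat/completions payload.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildChatRequest(baseUrl: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${baseUrl}/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}

const req = buildChatRequest("http://localhost:11434/v1", "llama3.1:8b", [
  { role: "user", content: "Hello" },
]);
// req.url === "http://localhost:11434/v1/chat/completions"
```

Switching providers is then just a matter of swapping the base URL.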
API Providers:
- ✅ OpenAI (GPT-4, GPT-3.5)
- ✅ Anthropic (Claude 3.5 Sonnet)
- ✅ Groq (Ultra-fast inference)
- ✅ Together AI, Fireworks AI
- ✅ OpenRouter (Unified access)
- ✅ DeepSeek (Coding specialist)
- ✅ Kimi/Moonshot (Long context)
- Offline-capable - Works without internet using local models
- Mix & match - Combine local and API providers
- Real-time monitoring - Watch models collaborate as they respond
- Export capabilities - Save conversations with full orchestration details
```bash
cd app
yarn install
yarn dev
# Open http://localhost:3000
```

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull a lightweight model
ollama pull llama3.1:8b
# Or install LM Studio from https://lmstudio.ai
```

- Go to Providers tab
- Enable local providers or add API keys
- Test connections to verify they work
- Go to Orchestration tab
- Choose a Quick Preset or customize your own:
- Local Privacy - Only local models
- Speed First - Fast lightweight responses
- Max Quality - Premium models for best results
- Research Deep - Multi-round comprehensive research
- Go to Chat tab
- Ask questions and watch models collaborate!
- Enable Thinking Mode to see intermediate steps
Perfect for sensitive work - uses only local models:
Ollama (Seed) → LM Studio (Refine) → Local Judge
Best of both worlds - API seed, local refine:
GPT-4 (Seed) → Claude (Analyze) → Local Ollama (Polish)
Fast collaborative research:
Groq (Outline) → Together AI (Suggest) → Claude (Rewrite)
Get multiple perspectives:
GPT-4 vs Claude → Local Arbiter decides
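Recipes like these can be written down as plain provider-sequence configs. A sketch of the Local Privacy chain (field names and model choices are illustrative, not TandemAI's actual schema):

```typescript
// Hypothetical shape for an orchestration recipe; field names and
// model names are examples, not TandemAI's real config format.
interface Stage {
  provider: string;
  model: string;
  role: string;
}
type Recipe = Stage[];

const localPrivacy: Recipe = [
  { provider: "openai_compat", model: "llama3.1:8b", role: "seed" },
  { provider: "openai_compat", model: "llama3.1:8b", role: "refine" },
  { provider: "openai_compat", model: "llama3.1:8b", role: "judge" },
];

// Every stage uses a local OpenAI-compatible server,
// so the whole chain can run offline.
const fullyLocal = localPrivacy.every((s) => s.provider === "openai_compat");
// fullyLocal === true
```

A hybrid recipe would simply mix provider values (e.g. "openai" for the seed stage, "openai_compat" for the polish stage).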
- Early Stopping - Stop when changes are minimal
- Temperature Control - Different creativity for seed vs refine
- Token Tracking - Monitor usage across all providers
- Diff Visualization - See what changed between rounds
- Drag-and-drop - Reorder provider sequences
- Export - Save chat history as Markdown
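Early stopping needs a cheap measure of how much a refinement round actually changed. One plausible implementation (not necessarily what TandemAI uses) compares word sets between consecutive drafts and stops once they barely differ:

```typescript
// Jaccard similarity over word sets: 1.0 means identical vocabulary.
function similarity(a: string, b: string): number {
  const wa = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const wb = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  if (wa.size === 0 && wb.size === 0) return 1;
  let shared = 0;
  for (const w of wa) if (wb.has(w)) shared++;
  const unionSize = wa.size + wb.size - shared;
  return shared / unionSize;
}

// Stop refining once two consecutive rounds are nearly identical.
function shouldStop(prev: string, next: string, threshold = 0.95): boolean {
  return similarity(prev, next) >= threshold;
}

const keepGoing = shouldStop("cats are great", "dogs are great"); // false
const stop = shouldStop("the answer is four", "the answer is four"); // true
```

The same pairwise comparison also feeds diff visualization: the words outside the intersection are exactly what changed between rounds.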
- openai_compat - Local servers (Ollama, LM Studio, vLLM, llama.cpp)
- openai - OpenAI API
- anthropic - Anthropic Claude
- groq - Groq API
- together - Together AI
- fireworks - Fireworks AI
- openrouter - OpenRouter
- deepseek - DeepSeek API
- kimi - Kimi/Moonshot
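A registry mapping each provider type to a default base URL makes the list above concrete. A sketch covering a few entries (the local default is Ollama's, per the list earlier; the hosted URLs shown are those providers' public endpoints, and the registry shape itself is hypothetical):

```typescript
// Maps provider type ids to default base URLs. The registry shape is
// illustrative; only a few providers are filled in here.
const defaultBaseUrls: Record<string, string> = {
  openai_compat: "http://localhost:11434/v1", // Ollama default; override for LM Studio etc.
  openai: "https://api.openai.com/v1",
  groq: "https://api.groq.com/openai/v1",
  openrouter: "https://openrouter.ai/api/v1",
};

function baseUrlFor(type: string, override?: string): string {
  const url = override ?? defaultBaseUrls[type];
  if (!url) throw new Error(`Unknown provider type: ${type}`);
  return url;
}

// Same type id, different local server: just pass an override.
const lmStudio = baseUrlFor("openai_compat", "http://localhost:1234/v1");
```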
Each mode has different collaboration patterns optimized for specific use cases.
Quick-start configurations for common scenarios with recommended model combinations.
TandemAI is designed to be packaged as a desktop application:
```bash
# Install Tauri CLI
cargo install tauri-cli
# Package as desktop app
tauri build
```

```bash
# Install Electron
npm install -g electron
# Package app
electron-builder
```

```bash
# Build static export
yarn build
yarn export
# Can be served locally or packaged with any web wrapper
```

Instead of just switching between different AI models, TandemAI makes them collaborate:
- Seed models generate initial responses
- Refiner models improve and correct
- Specialists add domain expertise
- Arbiters resolve conflicts and merge perspectives
This creates more thoughtful, accurate, and comprehensive responses than any single model alone.
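The arbiter role from the list above can be sketched as a judge that picks between competing drafts. Here a toy scoring function stands in for an actual judge model, just to show the shape of the decision step:

```typescript
// Toy arbiter: in a real run a judge model would score each draft;
// a stand-in heuristic is used here to show the decision step's shape.
interface Candidate {
  model: string;
  draft: string;
}

function arbitrate(candidates: Candidate[], score: (d: string) => number): Candidate {
  // Keep whichever candidate scores highest.
  return candidates.reduce((best, c) => (score(c.draft) > score(best.draft) ? c : best));
}

const winner = arbitrate(
  [
    { model: "gpt-4", draft: "Paris." },
    { model: "claude", draft: "Paris, the capital of France." },
  ],
  (d) => d.length, // stand-in for a judge model's score
);
// winner.model === "claude"
```

Merging perspectives instead of picking a winner is the same step with a combine function in place of the score comparison.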
Current (MVP):
- ✅ All 4 orchestration modes
- ✅ Universal provider support
- ✅ Real-time streaming
- ✅ Offline capability
Next:
- Fact-checker integration (NLI models)
- Judge scoring system
- RAG/document integration
- Voice roundtable mode
- Mobile app
TandemAI - Where AI models work together for better responses!