TandemAI - Local-First LLM Ensemble Platform

"Let your models work together: relay, debate, verify."

TandemAI orchestrates multiple LLMs (local + API) in structured collaborative modes for better AI responses.

✨ Features

🧠 4 Orchestration Modes

  • Conversation: Natural chat with 3 models responding sequentially
  • Answer: Seed β†’ Refine β†’ Polish for best final answer
  • Argumentative: Two models debate β†’ Arbiter decides
  • Research: Outline β†’ Suggestions β†’ Rewrite for structured research

πŸ”Œ Universal Provider Support

Local LLMs (Privacy-First):

  • βœ… Ollama, LM Studio, vLLM, llama.cpp - any OpenAI-compatible server

API Providers:

  • βœ… OpenAI (GPT-4, GPT-3.5)
  • βœ… Anthropic (Claude 3.5 Sonnet)
  • βœ… Groq (Ultra-fast inference)
  • βœ… Together AI, Fireworks AI
  • βœ… OpenRouter (Unified access)
  • βœ… DeepSeek (Coding specialist)
  • βœ… Kimi/Moonshot (Long context)

πŸ›‘οΈ Privacy & Control

  • Offline-capable - Works without internet using local models
  • Mix & match - Combine local and API providers
  • Real-time monitoring - Watch models collaborate in real-time
  • Export capabilities - Save conversations with full orchestration details

πŸš€ Quick Start

1. Install & Run

cd app
yarn install
yarn dev
# Open http://localhost:3000

2. Local Setup (Recommended)

# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a lightweight model
ollama pull llama3.1:8b

# Or install LM Studio from https://lmstudio.ai

3. Configure Providers

  1. Go to Providers tab
  2. Enable local providers or add API keys
  3. Test connections to verify they work
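Under the hood, a connection test against any OpenAI-compatible server is just a standard chat completions request. A rough sketch (the helper name is ours, not the app's; the request shape follows the OpenAI chat API, which Ollama also serves at `http://localhost:11434/v1`):

```typescript
// Sketch: build a minimal "test connection" request for an
// OpenAI-compatible endpoint. Illustrative, not part of TandemAI.
function buildTestRequest(baseUrl: string, model: string, apiKey?: string) {
  return {
    url: `${baseUrl}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Local servers typically accept requests without a key
        ...(apiKey ? { Authorization: `Bearer ${apiKey}` } : {}),
      } as Record<string, string>,
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: "ping" }],
        max_tokens: 1,
      }),
    },
  };
}
```

Passing the result to `fetch(r.url, r.init)` and checking for a 200 response is enough to confirm a provider is reachable.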

4. Setup Orchestration

  1. Go to Orchestration tab
  2. Choose a Quick Preset or customize your own:
    • πŸ›‘οΈ Local Privacy - Only local models
    • ⚑ Speed First - Fast lightweight responses
    • ⭐ Max Quality - Premium models for best results
    • 🧠 Research Deep - Multi-round comprehensive research

5. Start Chatting

  1. Go to Chat tab
  2. Ask questions and watch models collaborate!
  3. Enable Thinking Mode to see intermediate steps

🎯 Use Cases

Local Privacy Mode

Perfect for sensitive work - uses only local models:

Ollama (Seed) β†’ LM Studio (Refine) β†’ Local Judge

Hybrid Quality Mode

Best of both worlds - API models draft and analyze, a local model polishes:

GPT-4 (Seed) β†’ Claude (Analyze) β†’ Local Ollama (Polish)

Speed Research Mode

Fast collaborative research:

Groq (Outline) β†’ Together AI (Suggest) β†’ Claude (Rewrite)

Debate Mode

Get multiple perspectives:

GPT-4 vs Claude β†’ Local Arbiter decides

βš™οΈ Advanced Features

  • Early Stopping - Stop when changes are minimal
  • Temperature Control - Different creativity for seed vs refine
  • Token Tracking - Monitor usage across all providers
  • Diff Visualization - See what changed between rounds
  • Drag-and-drop - Reorder provider sequences
  • Export - Save chat history as Markdown

πŸ”§ Configuration

Provider Types Supported:

  • openai_compat - Local servers (Ollama, LM Studio, vLLM, llama.cpp)
  • openai - OpenAI API
  • anthropic - Anthropic Claude
  • groq - Groq API
  • together - Together AI
  • fireworks - Fireworks AI
  • openrouter - OpenRouter
  • deepseek - DeepSeek API
  • kimi - Kimi/Moonshot

Orchestration Modes:

Each mode has different collaboration patterns optimized for specific use cases.

Presets:

Quick-start configurations for common scenarios with recommended model combinations.

πŸ“¦ Packaging for Desktop

TandemAI is designed to be packaged as a desktop application:

Option 1: Tauri (Recommended)

# Install Tauri CLI
cargo install tauri-cli

# Package as desktop app
cargo tauri build

Option 2: Electron

# Install Electron and electron-builder as dev dependencies
npm install --save-dev electron electron-builder

# Package app
npx electron-builder

Option 3: Standalone Web App

# Build static export
yarn build
yarn export

# Can be served locally or packaged with any web wrapper

🎯 Philosophy

Instead of just switching between different AI models, TandemAI makes them collaborate:

  • Seed models generate initial responses
  • Refiner models improve and correct
  • Specialists add domain expertise
  • Arbiters resolve conflicts and merge perspectives

This creates more thoughtful, accurate, and comprehensive responses than any single model alone.

πŸ›£οΈ Roadmap

Current (MVP):

  • βœ… All 4 orchestration modes
  • βœ… Universal provider support
  • βœ… Real-time streaming
  • βœ… Offline capability

Next:

  • πŸ”„ Fact-checker integration (NLI models)
  • πŸ“Š Judge scoring system
  • πŸ” RAG/document integration
  • πŸŽ™οΈ Voice roundtable mode
  • πŸ“± Mobile app

TandemAI - Where AI models work together for better responses! πŸ€–βœ¨
