Compare multiple AI models in parallel — answers, differences, and costs in one place.
Aidiff is a local web app for side-by-side LLM comparison. Send the same task to two or three models (or test multiple prompt variants on one model) and get structured output: raw answers, an AI-powered difference analysis, and performance metrics (latency, tokens, estimated cost).
Great for prompt engineering, model selection, quality checks, and quick A/B tests — without juggling chat tabs.
Pick two or three models (OpenAI, Anthropic, Google), write one prompt, and send it to every column in parallel.
Keep a single model and A/B test two or three prompt variants on the same task — ideal for tone, length, and constraint tweaks.
Answers side by side with latency and estimated cost per column — switch tabs without losing the run.
AI summary: keywords per answer, a six-row mini comparison (tone, length, structure…), and a short assessment in your prompt’s language.
Latency, output tokens, tokens per second, and cost per request — spot the fastest or cheapest model at a glance.
On first launch, enter at least one provider key in the app (saved to your local .env), or add keys to .env before starting. Later, use Manage API Keys in the header to update them.
| Area | Description |
|---|---|
| Model compare | 2–3 columns: GPT (OpenAI), Claude (Anthropic), Gemini (Google) — mix freely |
| Prompt compare | Same model setup, different prompt variants side by side |
| Difference analysis | Auto summary: keywords, mini comparison (tone, length, structure…), assessment — in your prompt’s language |
| Performance tab | Latency, output tokens, tokens/s, cost per request (estimate from built-in pricing table) |
| File attachment | Attach a local file as extra context in the prompt |
| Settings | Manage API keys in the UI; saved to your local .env |
| UI | Glass design, light/dark mode, searchable model picker with provider catalog |
| i18n-ready | Extend UI and analysis prompts via locale files (default: English) |
- Node.js 18+ (LTS recommended)
- At least one API key: OpenAI, Anthropic, or Google AI Studio
```bash
git clone https://github.com/<your-user>/aidiff.git
cd aidiff
npm install
cp .env.example .env
```

Add at least one key to .env:

```bash
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=AIza...
```

Start the dev server:

```bash
npm run dev
```

Open http://localhost:5173. On first launch you can also enter keys in Settings; they are written to your local .env.
Open Manage API Keys (gear in the header). At least one provider is required. Restart the dev server after saving if the proxy does not pick up new keys.
- Compare Models: one prompt, different models in 2–3 columns
- Compare Prompts: one model, two (optionally three) prompt variants
Type your prompt, optionally attach a file, pick models/variants in the slots, press Send.
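
A rough sketch of what sending one prompt to every column in parallel could look like. This is not taken from Aidiff's source: `runColumns`, the `callers` map, and the stand-in callers are hypothetical names, and using `Promise.allSettled` is an assumption chosen so that one failing provider does not discard the other columns.

```javascript
// Hypothetical sketch: `callers` maps a column label to an async function
// that takes the prompt and resolves to the answer text.
async function runColumns(prompt, callers) {
  const entries = Object.entries(callers);
  const settled = await Promise.allSettled(
    entries.map(async ([label, call]) => {
      const started = performance.now();
      const answer = await call(prompt);
      return { label, answer, latencyMs: performance.now() - started };
    })
  );
  // Keep failed columns in the result so one provider error does not hide the others.
  return settled.map((result, i) =>
    result.status === "fulfilled"
      ? { ok: true, ...result.value }
      : { ok: false, label: entries[i][0], error: String(result.reason) }
  );
}

// Stand-in callers; a real one would POST to a provider proxy route.
const fakeCallers = {
  "model-a": async (p) => `A: ${p}`,
  "model-b": async () => {
    throw new Error("401 Unauthorized");
  },
};

runColumns("Summarize this text", fakeCallers).then((columns) => {
  for (const col of columns) {
    console.log(col.label, col.ok ? col.answer : col.error);
  }
});
```

Each column carries its own latency, so a slow model does not distort the timing reported for a fast one.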
| Tab | Content |
|---|---|
| Results | Model/variant answers side by side |
| Differences | AI analysis: keywords, mini comparison, assessment |
| Performance | Latency, cost, and token metrics per column |
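
As an illustration of the Performance metrics, tokens per second can be derived from output tokens and latency. The function names and the column shape below are hypothetical, and how Aidiff itself measures latency (for example, whether it includes time to first token) is not stated here.

```javascript
// Output tokens divided by elapsed seconds; returns 0 for a non-positive latency.
function tokensPerSecond(outputTokens, latencyMs) {
  if (latencyMs <= 0) return 0;
  return outputTokens / (latencyMs / 1000);
}

// Pick the column with the lowest latency from a non-empty list.
function fastestColumn(columns) {
  return columns.reduce((best, col) => (col.latencyMs < best.latencyMs ? col : best));
}

const exampleColumns = [
  { label: "model-a", outputTokens: 300, latencyMs: 1500 },
  { label: "model-b", outputTokens: 300, latencyMs: 900 },
];

console.log(tokensPerSecond(300, 1500)); // 200
console.log(fastestColumn(exampleColumns).label); // model-b
```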
Older runs can be collapsed; new comparisons stack below.
```mermaid
flowchart LR
  subgraph Browser["Browser (React)"]
    UI[App + Composer]
    Tabs[Results / Diff / Perf]
  end
  subgraph Vite["Vite Dev / Preview"]
    Proxy["API proxy\n/api/openai · anthropic · google"]
    Keys["/api/settings/keys\n→ .env"]
  end
  subgraph Providers["LLM APIs"]
    OAI[OpenAI]
    ANT[Anthropic]
    GEM[Google Gemini]
  end
  UI --> Proxy
  UI --> Keys
  Proxy --> OAI
  Proxy --> ANT
  Proxy --> GEM
  Tabs --> UI
```
- Frontend: React 18, Vite 5, JSX (no separate backend repo)
- API access: In dev and preview, Vite proxies provider requests and injects keys from `.env` server-side, which avoids CORS and browser key restrictions
- Difference analysis: After a parallel run, Aidiff calls a fixed model (`gemini-2.5-flash`) with structured system prompts; the response is parsed and rendered in the UI
- Cost estimates: From `MODEL_PRICING` in `src/constants/appConfig.js` (input/output per 1M tokens)
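
The cost estimate can be pictured as simple arithmetic over a per-1M-token price table. The sketch below assumes a table shape of `{ input, output }` per model; the real `MODEL_PRICING` may be structured differently, and `EXAMPLE_PRICING`, `estimateCost`, and the prices shown are placeholders, not actual provider pricing.

```javascript
// Placeholder prices in USD per 1M tokens; NOT real provider pricing.
const EXAMPLE_PRICING = {
  "example-model": { input: 1.0, output: 4.0 },
};

// Returns the estimated cost in USD, or null when the model has no price entry.
function estimateCost(pricing, model, inputTokens, outputTokens) {
  const price = pricing[model];
  if (!price) return null; // unknown model: no estimate rather than a wrong one
  return (
    (inputTokens / 1_000_000) * price.input +
    (outputTokens / 1_000_000) * price.output
  );
}

console.log(estimateCost(EXAMPLE_PRICING, "example-model", 500_000, 250_000)); // 1.5
console.log(estimateCost(EXAMPLE_PRICING, "unlisted-model", 100, 100)); // null
```

Returning `null` for an unlisted model keeps a missing price visible instead of silently reporting a cost of zero.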
```text
aidiff/
├── docs/screenshots/     # README images (add your PNGs here)
├── public/               # Logo, favicon, static assets
├── src/
│   ├── App.jsx           # Main UI, run orchestration
│   ├── components/       # Composer, tabs, diff cards, settings…
│   ├── constants/        # Providers, models, pricing, tabs
│   ├── i18n/             # Locale catalog, diff-analysis prompts
│   ├── lib/              # API clients, diff parser, model catalog
│   ├── locales/          # UI strings (e.g. en.js)
│   └── theme/            # Design tokens, glass CSS
├── vite.config.js        # Proxy + /api/settings/keys
├── .env.example
└── package.json
```
| Command | Description |
|---|---|
| `npm run dev` | Dev server with HMR |
| `npm run build` | Production build to `dist/` |
| `npm run preview` | Preview build locally (proxy active) |
| `npm run lint` | ESLint |
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | OpenAI / GPT |
| `ANTHROPIC_API_KEY` | Anthropic / Claude |
| `GOOGLE_API_KEY` | Google Gemini |
| `VITE_AIDIFF_LOCALE` | Optional UI locale (must exist in `src/i18n/catalog.js`, e.g. `en`) |
.env is gitignored — never commit API keys.
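
To illustrate the "at least one provider key" requirement, here is a small sketch that reports which providers have a non-empty key in an environment object. `PROVIDER_KEYS` and `configuredProviders` are hypothetical helpers written for this example, not part of Aidiff; only the variable names come from the table above.

```javascript
// Maps a provider id (hypothetical) to the environment variable from the table above.
const PROVIDER_KEYS = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  google: "GOOGLE_API_KEY",
};

// Returns the providers whose key is present and not just whitespace.
function configuredProviders(env) {
  return Object.entries(PROVIDER_KEYS)
    .filter(([, name]) => typeof env[name] === "string" && env[name].trim() !== "")
    .map(([provider]) => provider);
}

const available = configuredProviders(process.env);
if (available.length === 0) {
  console.warn("No provider key found; add at least one key to .env");
}
```

Treating whitespace-only values as missing catches the stray-space problem early, before a provider answers with a 401.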
- Add `src/locales/de.js` (same shape as `en.js`)
- Register it in `src/i18n/catalog.js`
- Set `VITE_AIDIFF_LOCALE=de` or call `setLocale('de')` at runtime
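
The locale steps above can be pictured with a minimal lookup. The catalog shape, the `resolveLocale` helper, and the fallback to `en` are all assumptions made for illustration; the actual structure of `src/i18n/catalog.js` and the behavior of `setLocale` may differ.

```javascript
// Hypothetical catalog shape: locale code -> UI strings.
const exampleCatalog = {
  en: { send: "Send" },
  de: { send: "Senden" },
};

// Use the requested locale only if it is registered; otherwise fall back.
function resolveLocale(requested, available, fallback = "en") {
  const registered =
    typeof requested === "string" &&
    Object.prototype.hasOwnProperty.call(available, requested);
  return registered ? requested : fallback;
}

console.log(resolveLocale("de", exampleCatalog)); // de
console.log(resolveLocale("fr", exampleCatalog)); // en
console.log(exampleCatalog[resolveLocale("de", exampleCatalog)].send); // Senden
```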
- Local-first: Keys stay on your machine (`.env`). Aidiff is not a hosted SaaS.
- Proxy requires Vite: `/api/settings/keys` and provider proxies run under `npm run dev` and `npm run preview`. Static hosting of `dist/` alone does not forward API calls.
- Costs: Every run and the difference analysis use API credits on your providers.
- Meta analysis: Prepared in code; UI is currently disabled (`META_ANALYSIS_ENABLED`).
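
To make the proxy note more concrete, the sketch below builds a Vite-style proxy table that strips a local prefix and attaches the provider key server-side. It is an assumption about how such a proxy could be wired, not a copy of Aidiff's `vite.config.js`: `buildProxy` is a hypothetical helper, the `/api/anthropic` and `/api/google` prefixes are inferred from the architecture diagram, and a provider may need further headers (Anthropic, for instance, also expects an API version header).

```javascript
// Builds an object that could be passed as `server.proxy` in a Vite config.
// `target`, `changeOrigin`, `rewrite`, and `headers` are standard proxy options.
function buildProxy(env) {
  const route = (prefix, target, headers) => ({
    target,
    changeOrigin: true,
    // Drop the local prefix so the provider sees its own path.
    rewrite: (path) => (path.startsWith(prefix) ? path.slice(prefix.length) : path),
    // Added to the outgoing request, so the key never reaches the browser.
    headers,
  });

  return {
    "/api/openai": route("/api/openai", "https://api.openai.com", {
      Authorization: `Bearer ${env.OPENAI_API_KEY ?? ""}`,
    }),
    "/api/anthropic": route("/api/anthropic", "https://api.anthropic.com", {
      "x-api-key": env.ANTHROPIC_API_KEY ?? "",
    }),
    "/api/google": route("/api/google", "https://generativelanguage.googleapis.com", {
      "x-goog-api-key": env.GOOGLE_API_KEY ?? "",
    }),
  };
}

const proxy = buildProxy({ OPENAI_API_KEY: "sk-example" });
console.log(proxy["/api/openai"].rewrite("/api/openai/v1/chat/completions"));
// /v1/chat/completions
```

Because this table only exists while a Vite server is running, a statically hosted `dist/` has nothing to forward requests or attach keys.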
```bash
npm install
npm run dev
```

On 401/CORS errors: restart the dev server and check keys in .env (no stray spaces). For Google: enable the Generative Language API and billing on your cloud project.
MIT — Copyright (c) 2026 Sebastian Breuer
Built with React + Vite · See what actually makes models different.








