Pennywise

LLM Cost Optimizer — Reduce AI spending by up to 70%

Pennywise analyzes your LLM API usage and intelligently routes, caches, and optimizes requests to cut costs without sacrificing output quality. Built with a Python backend and a Next.js frontend dashboard.

⚠️ Work in progress — core optimization engine is functional, UI and docs are evolving.

The Problem

LLM API costs add up fast. Most applications send every request to the same expensive model, even when a cheaper one would produce comparable results. Prompts are sent without any length optimization, duplicate queries aren't cached, and there's no visibility into what's actually driving the bill.

What Pennywise Does

Smart model routing — Classifies incoming requests by complexity and routes simple queries to cheaper models (e.g., GPT-4o-mini, Haiku) while reserving expensive models for tasks that need them.
Response caching — Caches semantically similar queries to avoid redundant API calls.
Prompt optimization — Analyzes and compresses prompts to reduce token count without losing intent.
Cost dashboard — Tracks spending per model, per endpoint, and over time so you can see exactly where your budget goes.
Usage analytics — Identifies patterns in your API usage: which prompts are expensive, which are redundant, and where optimization has the highest ROI.
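
The routing idea in the first item above can be sketched roughly as follows. This is a minimal illustration, not Pennywise's actual classifier: the `estimate_complexity` heuristic, the hint keywords, the threshold, and the two model names used as tiers are all assumptions chosen for the example.

```python
import re

# Hypothetical model tiers; the names are illustrative labels only.
CHEAP_MODEL = "gpt-4o-mini"
EXPENSIVE_MODEL = "gpt-4o"

# Assumed signal phrases that hint at multi-step reasoning.
REASONING_HINTS = ("prove", "analyze", "step by step", "refactor", "compare")


def estimate_complexity(prompt: str) -> float:
    """Return a rough 0..1 complexity score from prompt length and keyword hints."""
    text = prompt.lower()
    words = re.findall(r"\w+", text)
    length_score = min(len(words) / 200, 1.0)  # longer prompts score higher
    hint_score = 0.5 if any(hint in text for hint in REASONING_HINTS) else 0.0
    return min(length_score + hint_score, 1.0)


def choose_model(prompt: str, threshold: float = 0.4) -> str:
    """Route to the cheap tier unless the score reaches the threshold."""
    if estimate_complexity(prompt) >= threshold:
        return EXPENSIVE_MODEL
    return CHEAP_MODEL


if __name__ == "__main__":
    print(choose_model("What is the capital of France?"))
    print(choose_model("Analyze this contract step by step and compare the clauses."))
```

A keyword heuristic like this is cheap to run on every request; a production router would more plausibly use a trained classifier, but the decision shape (score, then threshold, then tier) stays the same.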

Architecture

┌─────────────────────────────────────────────────┐
│                  Frontend (Next.js)              │
│         Cost dashboard · Usage analytics         │
└──────────────────────┬──────────────────────────┘
                       │ API
┌──────────────────────┴──────────────────────────┐
│                  Backend (Python)                 │
│                                                  │
│  ┌────────────┐  ┌───────────┐  ┌────────────┐  │
│  │  Request    │  │  Prompt   │  │  Response   │  │
│  │  Router     │  │  Optimizer│  │  Cache      │  │
│  └────────────┘  └───────────┘  └────────────┘  │
│                                                  │
│  ┌────────────┐  ┌───────────────────────────┐  │
│  │  Cost      │  │  LLM Provider Integrations │  │
│  │  Tracker   │  │  (OpenAI, Anthropic, etc.) │  │
│  └────────────┘  └───────────────────────────┘  │
└──────────────────────────────────────────────────┘
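
One way the backend components in the diagram could fit together is sketched below. The diagram does not specify an order of operations, so the sequence used here (optimize, check cache, route, call provider, record usage) is an assumption, and `Pipeline` and all of its field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Pipeline:
    """Hypothetical glue object tying the backend components together."""

    optimize: Callable[[str], str]            # Prompt Optimizer: prompt -> shorter prompt
    route: Callable[[str], str]               # Request Router: prompt -> model name
    call_provider: Callable[[str, str], str]  # Provider integration: (model, prompt) -> text
    cache: Dict[str, str] = field(default_factory=dict)            # Response Cache (exact match here)
    calls_per_model: Dict[str, int] = field(default_factory=dict)  # Cost Tracker stand-in

    def handle(self, prompt: str) -> str:
        compact = self.optimize(prompt)
        if compact in self.cache:
            # Cache hit: no provider call, so nothing is added to the tracker.
            return self.cache[compact]
        model = self.route(compact)
        response = self.call_provider(model, compact)
        self.calls_per_model[model] = self.calls_per_model.get(model, 0) + 1
        self.cache[compact] = response
        return response


if __name__ == "__main__":
    pipeline = Pipeline(
        optimize=lambda p: " ".join(p.split()),  # collapse repeated whitespace
        route=lambda p: "small-model" if len(p) < 80 else "large-model",
        call_provider=lambda model, p: f"[{model}] echo: {p}",  # fake provider
    )
    print(pipeline.handle("Hello   world"))
    print(pipeline.handle("Hello world"))  # served from the cache
    print(pipeline.calls_per_model)
```

Passing the components in as plain callables keeps the sketch independent of any provider SDK; the fake provider simply echoes its input.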

Tech Stack

Layer	Technology
Backend	Python, FastAPI
Frontend	Next.js, TypeScript
Database	SQLite (dev) / PostgreSQL (prod)
Caching	Semantic similarity-based dedup
Deployment	Docker-ready
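
To illustrate the "semantic similarity-based dedup" entry in the Caching row, here is a small sketch of a similarity-keyed cache. It is not the project's implementation: word-count vectors with cosine similarity stand in for real embeddings, and the `SimilarityCache` class and its 0.8 threshold are assumptions made for the example.

```python
import math
import re
from collections import Counter
from typing import List, Optional, Tuple


def _vectorize(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SimilarityCache:
    """Return a stored response when a new query is close enough to an old one."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold
        self._entries: List[Tuple[Counter, str]] = []

    def put(self, query: str, response: str) -> None:
        self._entries.append((_vectorize(query), response))

    def get(self, query: str) -> Optional[str]:
        vec = _vectorize(query)
        best_score, best_response = 0.0, None
        for stored_vec, response in self._entries:
            score = _cosine(vec, stored_vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


if __name__ == "__main__":
    cache = SimilarityCache()
    cache.put("What is the capital of France?", "Paris")
    print(cache.get("what is the capital of france"))  # same words, different casing
    print(cache.get("How do I bake bread?"))           # unrelated query
```

The linear scan is fine for a sketch; with many entries a vector index would be the usual replacement. The threshold trades hit rate against the risk of returning an answer to a subtly different question.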

Quick Start

# Clone
git clone https://github.com/Subh24ai/pennywise.git
cd pennywise

# Backend
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env  # Add your API keys
python main.py

# Frontend (separate terminal, from the repository root)
cd frontend
npm install
npm run dev

Open http://localhost:3000 to access the dashboard.

Configuration

Create a .env file in the backend/ directory:

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
DATABASE_URL=sqlite:///pennywise.db
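
A backend could read these three variables along the following lines. This is a sketch under stated assumptions, not the code in `main.py`: it assumes the values are already present in the process environment (how the `.env` file gets loaded is not described here), and the `Settings` class, `load_settings` function, and the "at least one provider key" check are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: Optional[str]
    anthropic_api_key: Optional[str]
    database_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, defaulting to the SQLite URL."""
    settings = Settings(
        openai_api_key=env.get("OPENAI_API_KEY"),
        anthropic_api_key=env.get("ANTHROPIC_API_KEY"),
        database_url=env.get("DATABASE_URL", "sqlite:///pennywise.db"),
    )
    # Assumed rule: at least one provider key is needed to make any API call.
    if not (settings.openai_api_key or settings.anthropic_api_key):
        raise RuntimeError("Set OPENAI_API_KEY or ANTHROPIC_API_KEY in backend/.env")
    return settings


if __name__ == "__main__":
    demo = load_settings({"OPENAI_API_KEY": "sk-example"})
    print(demo.database_url)
```

Accepting the environment as a parameter makes the loader easy to exercise with a plain dictionary instead of real keys.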

How It Saves Money

Strategy	Typical Savings	How
Model downgrading	30–50%	Route simple tasks to cheaper models
Response caching	15–25%	Skip API calls for semantically similar queries
Prompt compression	10–20%	Reduce token count per request
Combined	Up to 70%	All strategies applied together
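
The per-strategy figures do not simply add up, since each strategy only reduces whatever spend is left after the previous one. The table does not state how the combined figure was derived; the sketch below assumes the strategies act independently and compound multiplicatively, and `combined_savings` is a hypothetical helper written for this illustration.

```python
from typing import Iterable


def combined_savings(rates: Iterable[float]) -> float:
    """Combine independent savings rates: each applies to the remaining spend."""
    remaining = 1.0
    for rate in rates:
        remaining *= 1.0 - rate
    return 1.0 - remaining


if __name__ == "__main__":
    # Lower and upper ends of the three ranges in the table above.
    low = combined_savings([0.30, 0.15, 0.10])
    high = combined_savings([0.50, 0.25, 0.20])
    print(f"low end: {low:.2f}, high end: {high:.2f}")
```

Under that assumption the upper bounds compound to 1 - (0.50 × 0.75 × 0.80) = 0.70, which matches the "Up to 70%" row, while the lower bounds compound to roughly 46%.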

Project Structure

pennywise/
├── backend/           # Python API server + optimization engine
├── frontend/          # Next.js cost dashboard
├── .gitignore
└── README.md

Roadmap

Multi-provider cost comparison (OpenAI vs Anthropic vs local)
Prompt A/B testing with cost tracking
Webhook alerts for budget thresholds
Export usage reports (CSV/PDF)
Team/org-level usage tracking

License

MIT

Built by Subhash Gupta
