Subh24ai/pennywise

Pennywise

LLM Cost Optimizer — Reduce AI spending by up to 70%

Pennywise analyzes your LLM API usage and intelligently routes, caches, and optimizes requests to cut costs without sacrificing output quality. Built with a Python backend and a Next.js frontend dashboard.

⚠️ Work in progress — core optimization engine is functional, UI and docs are evolving.


The Problem

LLM API costs add up fast. Most applications send every request to the same expensive model, even when a cheaper one would produce comparable results. Prompts are longer than they need to be, duplicate queries aren't cached, and there's no visibility into what's actually driving the bill.

What Pennywise Does

  • Smart model routing — Classifies incoming requests by complexity and routes simple queries to cheaper models (e.g., GPT-4o-mini, Haiku) while reserving expensive models for tasks that need them.
  • Response caching — Caches semantically similar queries to avoid redundant API calls.
  • Prompt optimization — Analyzes and compresses prompts to reduce token count without losing intent.
  • Cost dashboard — Tracks spending per model, per endpoint, and over time so you can see exactly where your budget goes.
  • Usage analytics — Identifies patterns in your API usage: which prompts are expensive, which are redundant, and where optimization has the highest ROI.
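As an illustration of the smart model routing idea in the list above, here is a minimal sketch. It is not Pennywise's actual classifier, which this README does not describe: the keyword heuristic, the `route` function, the `RoutingDecision` type, the word threshold, and the choice of `gpt-4o` as the expensive tier are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical model tiers. "gpt-4o-mini" is one of the cheaper models named
# above; using "gpt-4o" as the expensive tier is an assumption of this sketch.
CHEAP_MODEL = "gpt-4o-mini"
EXPENSIVE_MODEL = "gpt-4o"

# Phrases this sketch treats as signs of a harder task (assumed heuristic).
COMPLEX_HINTS = ("prove", "refactor", "analyze", "step by step", "architecture")


@dataclass
class RoutingDecision:
    model: str
    reason: str


def route(prompt: str, max_simple_words: int = 60) -> RoutingDecision:
    """Pick a model tier from a crude complexity estimate of the prompt."""
    text = prompt.lower()
    if any(hint in text for hint in COMPLEX_HINTS):
        return RoutingDecision(EXPENSIVE_MODEL, "complexity keyword found")
    if len(text.split()) > max_simple_words:
        return RoutingDecision(EXPENSIVE_MODEL, "long prompt")
    return RoutingDecision(CHEAP_MODEL, "short prompt without complexity keywords")


if __name__ == "__main__":
    print(route("Translate 'hello' to French").model)
    print(route("Refactor this module and explain the architecture").model)
```

A production router would more likely use a trained classifier or a small model to score complexity; the keyword list only keeps the example self-contained.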

Architecture

┌──────────────────────────────────────────────────┐
│                Frontend (Next.js)                │
│         Cost dashboard · Usage analytics         │
└──────────────────────┬───────────────────────────┘
                       │ API
┌──────────────────────┴───────────────────────────┐
│                 Backend (Python)                 │
│                                                  │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  │
│  │  Request   │  │  Prompt    │  │  Response  │  │
│  │  Router    │  │  Optimizer │  │  Cache     │  │
│  └────────────┘  └────────────┘  └────────────┘  │
│                                                  │
│  ┌────────────┐  ┌────────────────────────────┐  │
│  │  Cost      │  │ LLM Provider Integrations  │  │
│  │  Tracker   │  │ (OpenAI, Anthropic, etc.)  │  │
│  └────────────┘  └────────────────────────────┘  │
└──────────────────────────────────────────────────┘
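
To make the Cost Tracker box in the diagram more concrete, the sketch below shows one way per-model spend could be accumulated from token counts. The `CostTracker` class, its method names, the model names, and the prices are all hypothetical; the prices in particular are placeholders, not real provider rates.

```python
from collections import defaultdict

# Placeholder prices in USD per 1M tokens as (input, output). These are NOT
# real provider prices; substitute current values from each provider.
PRICES = {
    "cheap-model": (0.10, 0.40),
    "expensive-model": (2.00, 8.00),
}


class CostTracker:
    """Accumulates estimated spend per model from token counts."""

    def __init__(self, prices):
        self.prices = prices
        self.spend = defaultdict(float)

    def record(self, model, input_tokens, output_tokens):
        """Add one request to the running total and return its cost in USD."""
        in_price, out_price = self.prices[model]
        cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        self.spend[model] += cost
        return cost

    def total(self):
        """Total estimated spend across all models."""
        return sum(self.spend.values())


if __name__ == "__main__":
    tracker = CostTracker(PRICES)
    tracker.record("cheap-model", input_tokens=1_200, output_tokens=300)
    tracker.record("expensive-model", input_tokens=1_200, output_tokens=300)
    for model, usd in tracker.spend.items():
        print(f"{model}: ${usd:.6f}")
```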

Tech Stack

| Layer      | Technology                        |
|------------|-----------------------------------|
| Backend    | Python, FastAPI                   |
| Frontend   | Next.js, TypeScript               |
| Database   | SQLite (dev) / PostgreSQL (prod)  |
| Caching    | Semantic similarity-based dedup   |
| Deployment | Docker-ready                      |
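
The "semantic similarity-based dedup" entry can be illustrated with a small cache that returns a stored response when a new prompt is close enough to an earlier one. This is only a sketch under stated assumptions: the `SemanticCache` class and the 0.8 threshold are hypothetical, and the word-count `embed` function is a stand-in for a real embedding model, which this README does not name.

```python
import math
import re
from collections import Counter
from typing import Optional


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold  # assumed similarity cutoff
        self.entries: list[tuple[Counter, str]] = []

    def get(self, prompt: str) -> Optional[str]:
        """Return the response of the most similar stored prompt, or None on a miss."""
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((embed(prompt), response))
```

A linear scan is fine for a demonstration; with many entries, a vector index would be the more usual choice.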

Quick Start

# Clone
git clone https://github.com/Subh24ai/pennywise.git
cd pennywise

# Backend
cd backend
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env  # Add your API keys
python main.py

# Frontend (separate terminal, from the repository root)
cd frontend
npm install
npm run dev

Open http://localhost:3000 to access the dashboard.


Configuration

Create a .env file in the backend/ directory:

OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
DATABASE_URL=sqlite:///pennywise.db
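
For reference, a backend could read these three variables roughly as follows. This is a sketch, not the project's actual settings code: the `Settings` class and `load_settings` function are hypothetical, and the example assumes the values from .env have already been exported into the process environment (for example by a dotenv loader).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: Optional[str]
    anthropic_api_key: Optional[str]
    database_url: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables shown above; only DATABASE_URL has a default here."""
    return Settings(
        openai_api_key=env.get("OPENAI_API_KEY"),
        anthropic_api_key=env.get("ANTHROPIC_API_KEY"),
        # Assumed fallback: the SQLite URL from the example .env above.
        database_url=env.get("DATABASE_URL", "sqlite:///pennywise.db"),
    )


if __name__ == "__main__":
    settings = load_settings()
    print("OpenAI key set:", settings.openai_api_key is not None)
    print("Database:", settings.database_url)
```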

How It Saves Money

| Strategy           | Typical Savings | How                                             |
|--------------------|-----------------|-------------------------------------------------|
| Model downgrading  | 30–50%          | Route simple tasks to cheaper models            |
| Response caching   | 15–25%          | Skip API calls for semantically similar queries |
| Prompt compression | 10–20%          | Reduce token count per request                  |
| Combined           | Up to 70%       | All strategies applied together                 |
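
The README does not say how the combined figure is derived. One reading that is consistent with the table is that each strategy trims the spend left over by the previous one, so the savings compound rather than add. The sketch below works through that assumption; `combined_savings` is a hypothetical helper.

```python
from math import prod


def combined_savings(*fractions: float) -> float:
    """Overall saving when each strategy applies to the spend left by the others."""
    return 1.0 - prod(1.0 - f for f in fractions)


# Upper and lower ends of the three per-strategy ranges in the table above.
upper = combined_savings(0.50, 0.25, 0.20)
lower = combined_savings(0.30, 0.15, 0.10)

if __name__ == "__main__":
    print(f"lower bound: {lower:.2%}")
    print(f"upper bound: {upper:.2%}")
```

Under this assumption the upper ends multiply out to 1 − 0.50 × 0.75 × 0.80 = 0.70, which matches the "up to 70%" figure, while the lower ends give roughly 46%. Simply adding the percentages would overstate the result.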

Project Structure

pennywise/
├── backend/           # Python API server + optimization engine
├── frontend/          # Next.js cost dashboard
├── .gitignore
└── README.md

Roadmap

  • Multi-provider cost comparison (OpenAI vs Anthropic vs local)
  • Prompt A/B testing with cost tracking
  • Webhook alerts for budget thresholds
  • Export usage reports (CSV/PDF)
  • Team/org-level usage tracking

License

MIT


Built by Subhash Gupta
