Stop guessing what your ChatGPT and Claude API calls cost. Track them in 60 seconds with 3 lines of code.
After running docker-compose up -d:
- Dashboard: http://localhost:8081 - Cost tracking interface
- API: http://localhost:8000 - Backend API endpoints
- API Docs: http://localhost:8000/docs - Interactive API documentation
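To verify both services are reachable from Python (a convenience check, not part of LLMscope):

```python
import requests

# Quick reachability check for the two services listed above
for name, url in [('dashboard', 'http://localhost:8081'),
                  ('API docs', 'http://localhost:8000/docs')]:
    status = requests.get(url, timeout=5).status_code
    print(f'{name}: HTTP {status}')
```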
LLM API costs are spiraling out of control. You don't know:
- How much you're spending on each model
- Which providers are most expensive
- Whether there are cheaper alternatives
- Your usage patterns over time
LLMscope gives you complete visibility and control over your LLM costs:
- ✅ Real-time cost tracking - See costs as they happen
- ✅ Cost breakdown - By provider, model, and time period
- ✅ Smart recommendations - Get suggestions for cheaper models
- ✅ Usage analytics - Track token usage and patterns
- ✅ Self-hosted - Keep your data private
Using Docker (Recommended):
```bash
git clone https://github.com/Blb3D/LLMscope.git
cd LLMscope
docker-compose up -d
```

Visit http://localhost:8081 - you'll see the dashboard (empty until you track your first API call).
After making any OpenAI, Anthropic, or other LLM API call, add 3 lines to log the cost:
Example: Track OpenAI GPT-4 Call
```python
from openai import OpenAI
import requests

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Your normal OpenAI API call
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Add these 3 lines to track cost:
requests.post('http://localhost:8000/api/usage', json={
    'provider': 'openai',
    'model': 'gpt-4',
    'prompt_tokens': response.usage.prompt_tokens,
    'completion_tokens': response.usage.completion_tokens
})
```

That's it! Refresh the dashboard to see your real costs.
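If the tracker is down, you don't want your application to fail with it. A small, optional wrapper (not part of LLMscope) that makes logging best-effort:

```python
import requests

def log_usage(payload: dict) -> None:
    """Fire-and-forget logging to LLMscope; never raises into the caller."""
    try:
        requests.post('http://localhost:8000/api/usage', json=payload, timeout=2)
    except requests.RequestException:
        pass  # tracking is best-effort; don't break the main request path
```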
Want to test the dashboard before integrating your real API calls?
```bash
cd backend
python generate_demo_data.py
```

This creates 100 sample API calls to preview the dashboard features.
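The bundled script is the easy path; if you want to shape your own sample data, a sketch that seeds through the same endpoint (the providers, models, and token ranges below are made up for illustration):

```python
import random
import requests

# Illustrative sample set - use providers/models present in the pricing table
SAMPLES = [
    ('openai', 'gpt-4'),
    ('openai', 'gpt-3.5-turbo'),
    ('anthropic', 'claude-3-sonnet'),
]

for _ in range(100):
    provider, model = random.choice(SAMPLES)
    requests.post('http://localhost:8000/api/usage', json={
        'provider': provider,
        'model': model,
        'prompt_tokens': random.randint(50, 2000),
        'completion_tokens': random.randint(20, 800),
    })
```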
Track every OpenAI API call:
```python
from openai import OpenAI
import requests

client = OpenAI()

def track_openai_usage(response):
    """Helper function to track OpenAI costs."""
    requests.post('http://localhost:8000/api/usage', json={
        'provider': 'openai',
        'model': response.model,
        'prompt_tokens': response.usage.prompt_tokens,
        'completion_tokens': response.usage.completion_tokens
    })

# Use it after any OpenAI call:
response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)
track_openai_usage(response)
```

Track Claude API calls:
```python
import anthropic
import requests

def track_anthropic_usage(model, response):
    """Helper function to track Anthropic costs."""
    requests.post('http://localhost:8000/api/usage', json={
        'provider': 'anthropic',
        'model': model,
        'prompt_tokens': response.usage.input_tokens,
        'completion_tokens': response.usage.output_tokens
    })

# Use it after Claude API calls:
client = anthropic.Anthropic(api_key="your-key")
message = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude!"}]
)
track_anthropic_usage("claude-3-sonnet", message)
```

Automatic tracking with LangChain callback:
```python
from langchain.callbacks.base import BaseCallbackHandler
from langchain.chat_models import ChatOpenAI
import requests

class LLMscopeCallback(BaseCallbackHandler):
    def on_llm_end(self, response, **kwargs):
        if hasattr(response, 'llm_output') and response.llm_output:
            usage = response.llm_output.get('token_usage', {})
            requests.post('http://localhost:8000/api/usage', json={
                'provider': 'openai',
                'model': response.llm_output.get('model_name', 'gpt-3.5-turbo'),
                'prompt_tokens': usage.get('prompt_tokens', 0),
                'completion_tokens': usage.get('completion_tokens', 0)
            })

# Use with LangChain:
llm = ChatOpenAI(callbacks=[LLMscopeCallback()])
result = llm.predict("What is the capital of France?")
```

Track Gemini API calls:
```python
import google.generativeai as genai
import requests

def track_gemini_usage(model_name, response):
    """Helper function to track Google Gemini costs."""
    requests.post('http://localhost:8000/api/usage', json={
        'provider': 'google',
        'model': model_name,
        'prompt_tokens': response.usage_metadata.prompt_token_count,
        'completion_tokens': response.usage_metadata.candidates_token_count
    })

# Use it after Gemini calls:
genai.configure(api_key="your-key")
model = genai.GenerativeModel('gemini-pro')
response = model.generate_content("Write a poem about AI")
track_gemini_usage('gemini-pro', response)
```

Backend:
```bash
cd backend
pip install -r requirements.txt

# Seed the database with LLM pricing data
python seed_pricing.py

# Start the backend
python app.py
```

Frontend:
```bash
cd frontend
npm install
npm run dev
```

Visit http://localhost:8081
Track every LLM API call with automatic cost calculation. Monitor spending across 63+ models from OpenAI, Anthropic, Google, Cohere, Together AI, Mistral, Groq, and more.
- Instant cost visibility - See exactly what each request costs
- Provider comparison - Compare costs across different LLM providers
- Token usage analytics - Track prompt and completion tokens
- Auto-refresh - Dashboard updates every 5 seconds
Get intelligent recommendations for cheaper model alternatives:
- Cheapest models first - Groq's Llama-3-8B at $0.000065/1K tokens
- Side-by-side pricing - Compare input/output costs instantly
- Recent usage history - Track your last 100 API calls
- Save money automatically - Identify where you're overspending
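You can also pull the same ranking outside the dashboard. A quick sketch against the recommendations endpoint; the response is a list sorted cheapest-first, but inspect the /docs page for the exact field names:

```python
import requests

# Fetch models sorted cheapest-first from LLMscope
models = requests.get('http://localhost:8000/api/recommendations',
                      timeout=5).json()

# Field names vary - print entries raw and check /docs for the schema
for entry in models[:5]:
    print(entry)
```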
- 100% local - Your data never leaves your infrastructure
- No external dependencies - Runs entirely on Docker
- Open source - Audit every line of code
Endpoint: POST http://localhost:8000/api/usage
Request Body:

```json
{
  "provider": "openai",
  "model": "gpt-4",
  "prompt_tokens": 100,
  "completion_tokens": 50
}
```

Response:

```json
{
  "status": "logged",
  "cost_usd": 0.006,
  "timestamp": "2025-11-02T10:30:00Z"
}
```

Endpoint: GET http://localhost:8000/api/costs/summary
Response:

```json
{
  "total_cost": 15.23,
  "total_requests": 1250,
  "by_provider": {
    "openai": 12.45,
    "anthropic": 2.78
  },
  "by_model": {
    "gpt-4": 10.20,
    "gpt-3.5-turbo": 2.25,
    "claude-3-sonnet": 2.78
  }
}
```

Endpoint: GET http://localhost:8000/api/recommendations
Returns a list of LLM models sorted by cost (cheapest first) with pricing details.
Visit http://localhost:8000/docs for full interactive API documentation.
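For a quick end-to-end check from Python, log one sample call and read the aggregate back (the payload and field names are exactly the ones documented above):

```python
import requests

BASE = 'http://localhost:8000'

# Log a sample call (same payload as the request body above)
logged = requests.post(f'{BASE}/api/usage', json={
    'provider': 'openai',
    'model': 'gpt-4',
    'prompt_tokens': 100,
    'completion_tokens': 50,
}, timeout=5).json()
print('cost_usd:', logged['cost_usd'])

# Read back the aggregate summary
summary = requests.get(f'{BASE}/api/costs/summary', timeout=5).json()
print('total spend so far: $', summary['total_cost'])
```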
Create a .env file:

```
DATABASE_PATH=./data/llmscope.db
LLMSCOPE_API_KEY=your-secret-key
```

Architecture:

```
┌─────────────────┐
│    Frontend     │  React + Vite
│   (Port 8081)   │  Cost dashboard
└────────┬────────┘
         │
         │ HTTP/REST
         │
┌────────▼────────┐
│   Backend API   │  FastAPI + SQLite
│   (Port 8000)   │  Cost tracking
└─────────────────┘
```
Backend:
- FastAPI
- SQLite
- Python 3.9+
Frontend:
- React
- Vite
- Tailwind CSS
Deployment:
- Docker
- Docker Compose
- Tracks all API calls with token counts and costs
- Stores pricing data for different LLM models
- Application configuration
```bash
# Backend tests
cd backend
pytest
```
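If you add tests of your own, here's a minimal sketch using FastAPI's TestClient. It assumes backend/app.py exposes the FastAPI instance as `app` - adjust the import if the layout differs:

```python
# test_usage.py - illustrative only; assumes `app` is exported from app.py
from fastapi.testclient import TestClient
from app import app

client = TestClient(app)

def test_log_usage_returns_cost():
    resp = client.post('/api/usage', json={
        'provider': 'openai',
        'model': 'gpt-4',
        'prompt_tokens': 100,
        'completion_tokens': 50,
    })
    assert resp.status_code < 400  # exact success code depends on the handler
    assert 'cost_usd' in resp.json()
```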
```bash
# Frontend development
cd frontend
npm run dev

# Production build
docker-compose -f docker-compose.prod.yml up -d
```

Supported Providers: OpenAI, Anthropic, Google, Cohere, Together AI, Mistral, Groq, Ollama, and more - 63+ models in total
Coming Soon:
- Cost alerts and budget thresholds (see the DIY stopgap sketched after this list)
- Export to CSV/PDF
- More provider integrations (request yours in Issues!)
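Until built-in alerts land, a simple stopgap is to poll the summary endpoint and warn when spend crosses a threshold (the $50 budget and 5-minute interval below are just examples):

```python
import time
import requests

BUDGET_USD = 50.0  # illustrative threshold

while True:
    total = requests.get('http://localhost:8000/api/costs/summary',
                         timeout=5).json()['total_cost']
    if total > BUDGET_USD:
        print(f'WARNING: LLM spend ${total:.2f} exceeds budget ${BUDGET_USD:.2f}')
    time.sleep(300)  # check every 5 minutes
```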
Want to influence the roadmap? Open an issue or start a discussion!
Contributions are welcome! Please feel free to submit a Pull Request.
Business Source License 1.1
- ✅ Self-hosting for personal use, education, and research
- ✅ Modify and redistribute for non-commercial purposes
- ✅ Full source code access - no restrictions on reading the code
Commercial use includes:
- Using LLMscope to monitor production LLM deployments in a business
- Offering LLMscope as a hosted/managed service to customers
- Incorporating LLMscope into a commercial product
Need a commercial license? Contact: bbaker@blb3dprinting.com
On October 29, 2028 (3 years from first publication), this license automatically converts to MIT - making it fully open source forever.
Why BSL? We want LLMscope to be freely available for individuals and small teams, while ensuring companies using it commercially contribute back. This allows us to keep developing new features like SPC analysis, AI copilot, and enhanced reporting.
- GitHub Issues: Report bugs or request features
- Discussions: Ask questions
Built with ❤️ for the LLM community

