llm-price

Pricing and release metadata, plus cost estimation, for LLMs (OpenAI and Google Gemini).

Features

Model metadata with release dates
Static pricing in USD per 1M tokens
Cost estimation from token counts or raw text
Output in any currency (real-time FX rates from exchangerate.host, cached)
CLI for listing models and summing JSONL usage
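
The per-1M-token pricing listed above implies a simple calculation: multiply each token count by its price, divide by one million, and optionally apply an FX rate. The following is a minimal sketch of that arithmetic, not the library's implementation. The function name estimate_cost and the prices used are hypothetical and are not taken from models.json.

```python
from decimal import Decimal

PER_MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens, completion_tokens,
                  input_per_1m, output_per_1m, fx_rate=Decimal("1")):
    """Return the total cost in USD, or in the target currency if fx_rate is given.

    Hypothetical helper for illustration; prices are USD per 1M tokens.
    """
    input_cost = Decimal(prompt_tokens) * input_per_1m / PER_MILLION
    output_cost = Decimal(completion_tokens) * output_per_1m / PER_MILLION
    return (input_cost + output_cost) * fx_rate


# Hypothetical prices (USD per 1M tokens), chosen only to keep the numbers easy to follow.
usd = estimate_cost(1200, 400, Decimal("0.50"), Decimal("1.50"))
inr = estimate_cost(1200, 400, Decimal("0.50"), Decimal("1.50"),
                    fx_rate=Decimal("83.12"))
print("USD:", usd)
print("INR:", inr)
```

Using Decimal rather than float avoids binary rounding error when very small per-token amounts are added up across many requests.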

Install

pip install llm-price

Quickstart (Python)

from decimal import Decimal

from llm_price import cost_from_text, cost_from_tokens, get_fx_rate, list_models

models = list_models("openai")
print(models[0])

cost = cost_from_text(
    "openai",
    "gpt-4o-mini",
    prompt="Summarize this text.",
    completion="Here is a summary...",
)
print(cost.total_cost)

cost_in_inr = cost_from_tokens(
    "google",
    "gemini-1.5-flash",
    prompt_tokens=120,
    completion_tokens=40,
    currency="INR",
    fx_rate=Decimal("83.12"),
)
print(cost_in_inr.total_cost)

live_fx = get_fx_rate("USD", "INR")
print("Live USD→INR:", live_fx)

CLI

llm-price models --provider openai

llm-price cost \
  --provider openai \
  --model gpt-4o-mini \
  --prompt "Hello" \
  --completion "Hi" \
  --currency USD

llm-price cost \
  --provider google \
  --model gemini-1.5-flash \
  --prompt-tokens 100 \
  --completion-tokens 20 \
  --currency INR \
  --fx-rate "83.12"

JSONL Summation

llm-price sum usage.jsonl accepts lines in any of the following forms:

Token counts

{"provider":"openai","model":"gpt-4o-mini","prompt_tokens":1200,"completion_tokens":400}

Raw text (tokenized internally)

{"provider":"openai","model":"gpt-4o-mini","prompt":"Hello","completion":"Hi"}

Pre-computed totals

{"total_cost":{"amount":"0.0123","currency":"USD"}}
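
To make the three record shapes concrete, here is a rough sketch of how such a file could be summed. It is an assumption about the general approach, not the code behind llm-price sum. The PRICES table, line_cost_usd, and sum_jsonl are hypothetical names, the prices are made up, and raw-text records are counted as skipped because the sketch has no tokenizer.

```python
import json
from decimal import Decimal

# Hypothetical USD prices per 1M tokens (input, output), for illustration only.
PRICES = {("openai", "gpt-4o-mini"): (Decimal("0.50"), Decimal("1.50"))}


def line_cost_usd(record):
    """Return the USD cost of one usage record, or None if this sketch cannot price it."""
    if "total_cost" in record:
        total = record["total_cost"]
        if total["currency"] != "USD":
            return None  # currency conversion is out of scope here
        return Decimal(total["amount"])
    if "prompt_tokens" in record and "completion_tokens" in record:
        price = PRICES.get((record["provider"], record["model"]))
        if price is None:
            return None  # unknown model
        input_per_1m, output_per_1m = price
        return (Decimal(record["prompt_tokens"]) * input_per_1m
                + Decimal(record["completion_tokens"]) * output_per_1m) / Decimal(1_000_000)
    return None  # raw-text records need a tokenizer, which this sketch omits


def sum_jsonl(lines):
    """Sum the priceable records and count the ones that were skipped."""
    total = Decimal("0")
    skipped = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        cost = line_cost_usd(json.loads(line))
        if cost is None:
            skipped += 1
        else:
            total += cost
    return total, skipped


sample = [
    '{"provider":"openai","model":"gpt-4o-mini","prompt_tokens":1200,"completion_tokens":400}',
    '{"provider":"openai","model":"gpt-4o-mini","prompt":"Hello","completion":"Hi"}',
    '{"total_cost":{"amount":"0.0123","currency":"USD"}}',
]
total, skipped = sum_jsonl(sample)
print("total USD:", total, "skipped:", skipped)
```

To run the sketch against a real file, pass an open file object instead of the sample list, since sum_jsonl only needs an iterable of lines.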

Example Scripts

Run these from the repo root after installing dependencies:

python examples/basic_cost.py
python examples/inr_conversion.py
python examples/sum_jsonl.py

Notes

Pricing data is stored in src/llm_price/data/models.json in USD per 1M tokens.
OpenAI pricing is refreshed daily from https://bes-dev.github.io/openai-pricing-api/pricing.json via a GitHub Actions workflow.
OpenAI entries also store cached_input_per_1m when available.
For non-USD output, FX defaults to a real-time rate from exchangerate.host.
You can override it with fx_rate to use a fixed rate.
Rates are cached in-process for 1 hour by default.
Gemini token counting uses the official CountTokens API when GOOGLE_API_KEY is set; otherwise it falls back to an approximation.
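
The in-process caching of FX rates described above can be pictured as a small time-to-live cache. The sketch below shows the general technique under stated assumptions and is not the library's code. FxRateCache, fake_fetch, and the injected clock are hypothetical, and no request is made to exchangerate.host.

```python
import time
from decimal import Decimal


class FxRateCache:
    """Cache FX rates in-process and refetch them once ttl_seconds have passed."""

    def __init__(self, fetch, ttl_seconds=3600, clock=time.monotonic):
        self._fetch = fetch      # callable (base, quote) -> Decimal
        self._ttl = ttl_seconds  # 1 hour, matching the default described above
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # (base, quote) -> (rate, fetched_at)

    def get(self, base, quote):
        key = (base, quote)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]      # still fresh: serve from cache
        rate = self._fetch(base, quote)
        self._entries[key] = (rate, now)
        return rate


# Stand-in for a network call; records each invocation.
calls = []


def fake_fetch(base, quote):
    calls.append((base, quote))
    return Decimal("83.12")


now = [0.0]  # fake clock, advanced by hand
cache = FxRateCache(fake_fetch, ttl_seconds=3600, clock=lambda: now[0])
first = cache.get("USD", "INR")   # fetched
second = cache.get("USD", "INR")  # served from cache
now[0] = 3601.0
third = cache.get("USD", "INR")   # refetched after expiry
print("fetches:", len(calls))
```

Passing a fixed fx_rate, as in the Quickstart, bypasses this kind of lookup entirely, which is useful for reproducible reports.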

Development

pip install -e ".[dev]"
pytest

Release (PyPI)

This project uses GitHub Actions trusted publishing. Create a tag like v0.1.0 and push it to GitHub to trigger the publish workflow:

git tag v0.1.0
git push origin v0.1.0

Configure the PyPI Trusted Publisher with:

PyPI Project Name: llm-price
Owner: VA24d
Repository name: API-price
Workflow name: publish.yml
Environment name: pypi
