FEATURE: Added pricing tracking #122
Open
Seluj78 wants to merge 11 commits into withceleste:main from Seluj78:feat/cost-tracking
Introduce a pydantic-based Cost model that captures detailed pricing components for API calls. The new class supports token-based fields (input/output/cache/reasoning), modality-specific costs (image/audio/video), and a currency field. It also accepts an optional explicit_total at initialization and exposes a computed total_cost property that returns the explicit total when provided and otherwise sums all non-None components. This change centralizes cost representation to standardize pricing reporting across providers and pricing models, and makes total cost calculation robust and explicit.
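The total_cost rule described above can be sketched roughly as follows. This is not the PR's code: it uses a standard-library dataclass as a stand-in for the pydantic model so that it runs without dependencies, and the individual field names (input_cost, cache_read_cost, and so on) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Cost:
    """Stand-in for the pydantic-based Cost model (field names are assumed)."""

    # Token-based components, expressed in `currency` units.
    input_cost: Optional[float] = None
    output_cost: Optional[float] = None
    cache_read_cost: Optional[float] = None
    cache_creation_cost: Optional[float] = None
    reasoning_cost: Optional[float] = None
    # Modality-specific components.
    image_cost: Optional[float] = None
    audio_cost: Optional[float] = None
    video_cost: Optional[float] = None
    currency: str = "USD"
    # When set, this overrides the sum of the components.
    explicit_total: Optional[float] = None

    @property
    def total_cost(self) -> float:
        # Prefer the explicit total when one was supplied.
        if self.explicit_total is not None:
            return self.explicit_total
        # Otherwise sum every component field that is not None.
        components = (
            getattr(self, f.name) for f in fields(self) if f.name.endswith("_cost")
        )
        return sum((c for c in components if c is not None), 0.0)
```

With this sketch, Cost(input_cost=0.5, output_cost=0.25).total_cost sums the two components, while passing explicit_total bypasses the sum entirely.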
Add the future annotations import and include Cost typing in IO models. Import Cost from celeste.pricing.cost and expose it on response and streaming response models as an optional field. This allows callers to access pricing information returned with generation outputs and keeps type annotations forward-compatible.
Introduce CostTracker and CostBreakdown to support session-level cost aggregation across multiple API calls. The new CostTracker stores Cost objects behind a thread lock for concurrent safety and exposes:
- add(cost): safely append a Cost or ignore None
- total, breakdown, count properties for aggregated views
- get_costs(), reset(), and to_dict() helpers
CostBreakdown provides per-category floats (input, output, cache creation/read, reasoning, image, audio, video), a computed total, and a to_dict() method. These changes enable consistent, thread-safe tracking and reporting of cumulative costs for clients and testing purposes.
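A minimal sketch of the lock-guarded tracker described above, assuming a simplified two-field Cost stand-in. The method names follow the commit message, but the bodies are illustrative guesses rather than the PR's implementation, and breakdown and to_dict() are omitted for brevity.

```python
import threading
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cost:
    """Simplified stand-in with only two components."""

    input_cost: float = 0.0
    output_cost: float = 0.0

    @property
    def total_cost(self) -> float:
        return self.input_cost + self.output_cost


class CostTracker:
    """Accumulates Cost objects; every access to the list holds the lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._costs: List[Cost] = []

    def add(self, cost: Optional[Cost]) -> None:
        # None means "no pricing available" and is silently ignored.
        if cost is None:
            return
        with self._lock:
            self._costs.append(cost)

    @property
    def total(self) -> float:
        with self._lock:
            return sum((c.total_cost for c in self._costs), 0.0)

    @property
    def count(self) -> int:
        with self._lock:
            return len(self._costs)

    def get_costs(self) -> List[Cost]:
        # Return a copy so callers cannot mutate internal state.
        with self._lock:
            return list(self._costs)

    def reset(self) -> None:
        with self._lock:
            self._costs.clear()
```

Returning a copy from get_costs() keeps the internal list private, which matches the "retrieving a copy of costs" behavior that the test suite in this PR is said to cover.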
Introduce a pricing registry to load model cost data from litellm's GitHub repository. Users must explicitly call initialize_pricing() to enable cost tracking. The module fetches a JSON pricing file, caches it under ~/.cache/celeste/model_prices.json, and treats the cache as valid for 24 hours. On failure it warns and attempts to use a stale cache if available.
Add helper functions:
- initialize_pricing(force_refresh: bool) -> fetches remote data, updates local cache, and sets module state; reads cache first to avoid network calls.
- is_initialized() -> reports whether pricing data is available.
- get_model_info(model_id, provider) -> looks up model pricing using provider-prefixed keys, bare model IDs, and provider-to-prefix mapping for known providers.
Handle JSON and I/O errors with warnings rather than raising to avoid breaking consumers that do not opt in to pricing.
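The cache-freshness check and the lookup fallback described above might look roughly like this. The helper names load_fresh_cache and lookup_model, the PROVIDER_PREFIXES entry, and the exact order of the lookup candidates are all assumptions made for illustration; the commit message only names the three lookup strategies and the 24-hour window.

```python
import json
import time
from pathlib import Path
from typing import Any, Dict, Optional

CACHE_TTL_SECONDS = 24 * 60 * 60  # the cache is treated as valid for 24 hours

# Hypothetical provider-to-prefix mapping; the real table is not shown in the PR text.
PROVIDER_PREFIXES = {"google": "gemini"}


def load_fresh_cache(path: Path, now: Optional[float] = None) -> Optional[Dict[str, Any]]:
    """Return cached pricing data if the file exists, parses, and is under 24h old."""
    now = time.time() if now is None else now
    try:
        if now - path.stat().st_mtime > CACHE_TTL_SECONDS:
            return None  # stale: the caller should try the network
        return json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None  # a missing or corrupt cache is not fatal


def lookup_model(
    pricing: Dict[str, Any], model_id: str, provider: Optional[str] = None
) -> Optional[Any]:
    """Try provider-prefixed key, then bare model ID, then mapped prefix."""
    candidates = []
    if provider:
        candidates.append(f"{provider}/{model_id}")
    candidates.append(model_id)
    prefix = PROVIDER_PREFIXES.get(provider or "")
    if prefix:
        candidates.append(f"{prefix}/{model_id}")
    for key in candidates:
        if key in pricing:
            return pricing[key]
    return None
```

Returning None instead of raising mirrors the stated goal of not breaking consumers that never opt in to pricing.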
Implement a new pricing calculator module (src/celeste/pricing/calculator.py) that centralizes cost computation for various model usage patterns.
Key changes:
- Add calculate_cost entry point that loads model info and delegates to specific calculators based on model mode or provider-supplied costs.
- Implement token-based cost handling including:
  - standard input/output token billing
  - tiered input pricing (above 128k and 200k tokens)
  - prompt cache handling with separate read and creation costs
  - reasoning tokens support (separate reasoning rate)
- Add specialized calculators for embedding, image, and audio modes (stubs/structure prepared for per-image, per-pixel, per-second, per-character pricing).
- Respect provider-supplied costs when model metadata indicates it.
- Return a Cost object with granular breakdown (input, output, cache, reasoning) or None when pricing is unavailable.
Why:
- Consolidate pricing logic to support diverse billing models across providers and model types.
- Enable accurate cost attribution for features like prompt caching, large-context tiered pricing, and reasoning token accounting.
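The token-based path above can be illustrated with a small sketch. It is written under several assumptions that the PR text does not confirm: the pricing keys follow the naming style of litellm's pricing file, exceeding a tier threshold switches the whole prompt to the higher rate, cached and reasoning tokens are reported separately from input_tokens and output_tokens, and missing cache or reasoning rates fall back to the plain input or output rate. The function name token_cost is hypothetical.

```python
from typing import Any, Dict, Optional

# (threshold, pricing key) pairs, ordered so the highest matching tier wins.
# Key names are assumed to follow litellm's pricing file.
INPUT_TIERS = (
    (128_000, "input_cost_per_token_above_128k_tokens"),
    (200_000, "input_cost_per_token_above_200k_tokens"),
)


def token_cost(
    info: Dict[str, Any],
    input_tokens: int,
    output_tokens: int,
    cache_read_tokens: int = 0,
    cache_creation_tokens: int = 0,
    reasoning_tokens: int = 0,
) -> Optional[Dict[str, float]]:
    """Return a per-category breakdown, or None when base rates are missing."""
    in_rate = info.get("input_cost_per_token")
    out_rate = info.get("output_cost_per_token")
    if in_rate is None or out_rate is None:
        return None  # pricing unavailable for this model

    # Assumption: crossing a threshold re-prices the entire prompt.
    for threshold, key in INPUT_TIERS:
        if input_tokens > threshold and info.get(key) is not None:
            in_rate = info[key]

    # Assumption: absent special rates fall back to the plain rates.
    cache_read_rate = info.get("cache_read_input_token_cost", in_rate)
    cache_creation_rate = info.get("cache_creation_input_token_cost", in_rate)
    reasoning_rate = info.get("output_cost_per_reasoning_token", out_rate)

    return {
        "input": input_tokens * in_rate,
        "output": output_tokens * out_rate,
        "cache_read": cache_read_tokens * cache_read_rate,
        "cache_creation": cache_creation_tokens * cache_creation_rate,
        "reasoning": reasoning_tokens * reasoning_rate,
    }
```

Providers differ on whether cached tokens are already included in the reported input count, so a real calculator would likely need per-provider handling before applying rates like these.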
Introduce a new pricing package that centralizes cost tracking and calculation utilities. Add src/celeste/pricing/__init__.py which exposes:
- Cost model and trackers (Cost, CostBreakdown, CostTracker)
- Calculator helpers (calculate_cost, calculate_video_cost)
- Registry functions (initialize_pricing, is_initialized, clear_pricing, get_model_info, get_raw_pricing_data, register_model_pricing)
Document opt-in initialization behavior and example usage in the module docstring so callers understand that initialize_pricing() must be invoked for cost fields to be populated. This change makes the pricing API easily importable from celeste.pricing and prepares the package for consumers to opt into cost tracking and attach trackers to clients.
Add _aggregate_cost helper to text, audio, and image streaming modules to compute a stream's cost by preferring the last chunk's explicit cost and falling back to calculating cost from aggregated usage. Include cost in returned Output objects and compute usage once to avoid redundant aggregation. Hook into the client's cost_tracker (if configured) to record stream costs when building outputs. These changes enable per-stream cost reporting and tracking for streaming modalities, improving billing visibility and avoiding repeated usage calculations.
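The "prefer the last chunk's explicit cost, otherwise compute from aggregated usage" rule can be sketched as below. The Chunk type, its fields, and the price_usage callback are hypothetical stand-ins; the PR's _aggregate_cost works on celeste's own chunk and usage types, which are not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Chunk:
    """Hypothetical stream chunk: text plus optional cost and token usage."""

    text: str
    cost: Optional[float] = None  # explicit cost, if the provider sent one
    output_tokens: int = 0


def aggregate_cost(
    chunks: List[Chunk],
    price_usage: Callable[[int], Optional[float]],
) -> Optional[float]:
    """Prefer the last chunk's explicit cost; otherwise price the summed usage."""
    if not chunks:
        return None
    last = chunks[-1]
    if last.cost is not None:
        return last.cost
    # Aggregate usage once, then hand it to the pricing function.
    total_tokens = sum(c.output_tokens for c in chunks)
    return price_usage(total_tokens)
```

Passing the pricing function in as a callback keeps the sketch independent of any registry state; in the PR that role is presumably played by calculate_cost.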
Add pricing integration to the BaseClient by importing pricing utilities and wiring cost calculation into request handling and output construction.
- import Cost, CostTracker and calculate_cost from celeste.pricing.
- add optional cost_tracker field to client config for accumulating costs.
- compute usage once per response and derive cost via a new _calculate_cost(usage) helper that uses the pricing registry.
- attach cost to the returned Output and, when configured, add the cost to the client's CostTracker.
- add future annotations import for forward type hints.
This enables per-request cost computation and optional runtime accumulation so callers can monitor and report model usage costs.
Add exports for Cost, CostBreakdown, and CostTracker, plus initialize_pricing, is_initialized, and register_model_pricing to the package __init__.py. This surfaces the pricing API at the top-level celeste package so callers can import pricing types and initialization helpers directly (for example: from celeste import Cost, initialize_pricing). This change simplifies access to pricing functionality and makes it easier to integrate cost tracking and model pricing registration into client code without importing internal modules.
Add a full test suite for CostTracker and CostBreakdown covering default values, total calculation, dict conversion, add/reset behavior, and aggregation. Include tests for ignoring None adds, retrieving a copy of costs, to_dict output, and concurrent access to ensure thread safety. These tests catch edge cases and validate public APIs to prevent regressions in cost accounting and to ensure correctness under concurrent updates.
No description provided.