Release v1.1.0 · Center-for-Applied-AI/delm

What's New in v1.1.0

API kwargs pass-through (#47)

Pass arbitrary keyword arguments through to the underlying LLM API call via api_kwargs. For example, set store=False for Fireworks AI.

Flexible rate limit period (#58)

Rate limiting now supports configurable time windows via rate_limit_period_seconds. The old tokens_per_minute / requests_per_minute parameters have been replaced with rate_limit_tokens / rate_limit_requests that work with any time span.

tokencost integration (#57)

Model pricing is now powered by the tokencost package, providing up-to-date pricing for all major LLM providers without manual maintenance.

Other improvements

Oversized requests that exceed bucket capacity are now handled correctly (negative bucket balance with natural recovery)
Floating-point epsilon guard in rate limiter comparisons prevents edge-case hangs
34 new unit tests covering all three features
Updated documentation

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1.1.0

Choose a tag to compare

Sorry, something went wrong.