Skip to content

v1.1.0

Latest

Choose a tag to compare

@ErnaneJ ErnaneJ released this 05 Jun 22:26

What's new

Rate-limit retry with backoff

All LLM and embedding API calls now automatically retry when a rate-limit or quota-exceeded error is received from any provider (Gemini, OpenAI, OpenRouter).

  • If the provider returns a "retry in Xs" hint in the error message (Gemini does this), that exact delay is honoured.
  • Otherwise, exponential backoff is applied: llm_retry_delay × 2^(attempt − 1).
  • A warning is logged on each retry attempt.
  • Resolves #1.

New configuration options

Option Default Description
max_llm_retries 3 Max retry attempts before surfacing the error to the user. Set to 0 to disable.
llm_retry_delay 60 (seconds) Base delay when the provider does not supply a retry-after hint.

Documentation

  • Added a Rate limiting section to the README.
  • Added a Supported models by provider table with links to each provider's model catalogue and the ruby_llm model registry.

Upgrade

# Gemfile
gem "glancer", "~> 1.1"

No migration needed — new config options have sensible defaults and are fully optional.