
1.10.0


@crmne crmne released this 13 Jan 19:03
· 35 commits to main since this release

RubyLLM 1.10: Extended Thinking, Persistent Thoughts & Streaming Fixes πŸ§ βœ¨πŸš†

This release brings first-class extended thinking across providers, full Gemini 3 Pro/Flash thinking-signature support (chat + tools), a Rails upgrade path to persist thinking output, and a tighter streaming pipeline. Plus official Ruby 4.0 support, safer model registry refreshes, a Vertex AI global endpoint fix, and a docs refresh.

🧠 Extended Thinking Everywhere

Tune reasoning depth and budget across providers with with_thinking, and get thinking output back when available:

chat = RubyLLM.chat(model: "claude-opus-4.5")
  .with_thinking(effort: :high, budget: 8000)

response = chat.ask("Prove it with numbers.")
response.thinking&.text
response.thinking&.signature
response.thinking_tokens
  • response.thinking and chunk.thinking expose thinking content during normal and streaming requests.
  • response.thinking_tokens and response.tokens.thinking track thinking token usage when providers report it.
  • Gemini 3 Pro/Flash fully support thought signatures across chat and tool calls, so multi-step sessions stay consistent.
  • Extended thinking quirks are now normalized across providers so you can tune one API and get predictable output.

Stream thinking and answer content side-by-side:

chat = RubyLLM.chat(model: "claude-opus-4.5")
  .with_thinking(effort: :medium)

chat.ask("Solve this step by step: What is 127 * 43?") do |chunk|
  print chunk.thinking&.text
  print chunk.content
end
  • Streaming stays backward-compatible: existing apps can keep printing chunk.content, while richer UIs can also render chunk.thinking.

🧰 Rails + ActiveRecord Persistence

Thinking output can now be stored alongside messages (text, signature, and token usage), with an upgrade generator for existing apps:

rails generate ruby_llm:upgrade_to_v1_10
rails db:migrate
  • Adds thinking_text, thinking_signature, and thinking_tokens to message tables.
  • Adds thought_signature to tool calls for Gemini tool calling.
  • Fixes a Rails streaming issue where the first tokens could be dropped.
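The new columns map one-to-one onto message attributes. As a minimal sketch of what a persisted message carries after the migration, using a plain Struct as a stand-in for the ActiveRecord model (the record shape is assumed from the column names above; all values are illustrative):

```ruby
# Stand-in for the upgraded Message model; column names come from the
# release notes, the Struct itself is just an illustration.
Message = Struct.new(:content, :thinking_text, :thinking_signature, :thinking_tokens,
                     keyword_init: true)

msg = Message.new(
  content: "127 * 43 = 5461",
  thinking_text: "Break 127 * 43 into 127 * 40 + 127 * 3.",
  thinking_signature: "opaque-provider-signature", # hypothetical placeholder value
  thinking_tokens: 480
)

msg.thinking_tokens # => 480
```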

πŸ“Š Unified Token Tracking

All token counts now live in response.tokens and message.tokens, including input, output, cached, cache creation, and thinking tokens.
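A sketch of that unified breakdown, using a plain Struct as a stand-in for the real tokens object (the field names are taken from the categories listed above; the `total` helper is hypothetical):

```ruby
# Illustrative shape of response.tokens / message.tokens; the real class
# in the gem may differ.
Tokens = Struct.new(:input, :output, :cached, :cache_creation, :thinking,
                    keyword_init: true) do
  # Hypothetical convenience: billable tokens across input, output, and thinking.
  def total
    input + output + thinking
  end
end

usage = Tokens.new(input: 210, output: 640, cached: 0, cache_creation: 0, thinking: 480)
usage.total # => 1330 with these illustrative numbers
```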

βœ… Official Ruby 4.0 Support

Ruby 4.0 is now officially supported in CI and dependencies.

🧩 Model Registry Updates

  • Refreshing the registry no longer deletes models from providers you haven't configured.

🌍 Vertex AI Global Endpoint Fix

When vertexai_location is global, the API base now correctly resolves to:

https://aiplatform.googleapis.com/v1beta1
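As a sketch, the global location is set through configuration. `vertexai_location` is the setting named above; the surrounding `RubyLLM.configure` block and the companion project setting follow the gem's usual configuration pattern and are assumed here:

```ruby
RubyLLM.configure do |config|
  config.vertexai_project_id = "my-project" # assumed companion setting
  config.vertexai_location   = "global"     # now resolves to the endpoint above
end
```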

πŸ“š Docs Updates

  • New extended thinking guide.
  • Token usage docs include thinking tokens.

Installation

gem "ruby_llm", "1.10.0"

Upgrading from 1.9.x

bundle update ruby_llm
rails generate ruby_llm:upgrade_to_v1_10
rails db:migrate


Full Changelog: 1.9.2...1.10.0