Draft
Changes from all commits (101 commits)
5e8c1bb
Initial red-candle provider implementation
cpetersen Sep 8, 2025
5c770dd
Starting to work
cpetersen Sep 8, 2025
fe199a8
Swap qwen for mistral
cpetersen Sep 8, 2025
b8bf331
Trying to add red-candle to the models_to_test.rb
cpetersen Sep 8, 2025
d98834c
Adding red-candle to the models_to_test file
cpetersen Sep 8, 2025
b207f69
Trying to fix the way tool calling support is checked in the specs
cpetersen Sep 8, 2025
ab46320
Deconvoluting local model checks and tool calling support
cpetersen Sep 8, 2025
97d58d2
I think we finally got the local tool calling check correct
cpetersen Sep 8, 2025
9c7f9dc
Enable context length validation for the RedCandle Provider
cpetersen Sep 8, 2025
d5c9129
Working on rubocop fixes
cpetersen Sep 8, 2025
70e1b24
Fixing the rubocop errors
cpetersen Sep 9, 2025
6956724
stubbing the red-candle inference stuff to speed up specs
cpetersen Sep 9, 2025
0aad7d7
Adding an ENV variable so you toggle real red-candle inference on
cpetersen Sep 9, 2025
52a13ca
Adding red-candle to the list of providers in the README
cpetersen Sep 9, 2025
63b4a81
Refactor acts_as API for Rails integration (v1.7)
crmne Sep 9, 2025
78d6429
Updated models
crmne Sep 9, 2025
3f17200
Fix install generator template variable references
crmne Sep 9, 2025
6c7d7be
Use default model when none specified in ActiveRecord chats
crmne Sep 9, 2025
4076001
Add chat UI scaffold generator with Turbo streaming
crmne Sep 9, 2025
b883989
Adding a new bundle group so developer can choose to include red-cand…
cpetersen Sep 9, 2025
685230c
Adding a comment about possibly supporting more red-candle models in …
cpetersen Sep 9, 2025
a928bb1
Remove red-candle from the gemfiles
cpetersen Sep 9, 2025
ee5b762
Properly register red-candle models
cpetersen Sep 9, 2025
43cc0b8
Removed some unused config options
cpetersen Sep 9, 2025
4b67818
Updating the gemfiles again
cpetersen Sep 9, 2025
c1ac17d
Make the capabilities file match the actual capabilities
cpetersen Sep 9, 2025
54b9834
Deep merge chat options
cpetersen Sep 9, 2025
c78ce40
make red-candle off by default
orangewolf Sep 10, 2025
6816be9
improve error messages
orangewolf Sep 10, 2025
a258a39
improved error message
orangewolf Sep 10, 2025
c8df558
Move broadcasts_to from install to chat_ui generator
crmne Sep 10, 2025
f22e0bd
Update generators to support custom model names consistently
crmne Sep 10, 2025
89f8c3c
Simplified post install
crmne Sep 10, 2025
e109713
Remove Git LFS completely from the project
crmne Sep 10, 2025
c893323
Improve v1.7 upgrade experience
crmne Sep 10, 2025
7a77dcc
Update upgrade instructions for custom model names and API usage
crmne Sep 10, 2025
da34385
Enhance chat UI generator with improved UX and models management
crmne Sep 10, 2025
3b1e8cf
Bump to 1.7.0
crmne Sep 10, 2025
7f00d26
Updated appraisal gemfiles
crmne Sep 10, 2025
3d57b9e
Bust README gem cache
crmne Sep 10, 2025
ad036e1
Fix namespaced model table names in upgrade generator (#398)
willcosgrove Sep 10, 2025
c811173
Reorganize generators to follow Rails conventions
crmne Sep 10, 2025
ecc8afa
Fix namespaced models in Model migration and foreign key migrations (…
willcosgrove Sep 10, 2025
004563e
add additional models
orangewolf Sep 11, 2025
552732c
Improve upgrade generator and add troubleshooting docs
crmne Sep 11, 2025
1a5ad13
Remove git LFS support from pipelines
crmne Sep 11, 2025
939d532
Add automatic acts_as declaration updates to upgrade generator
crmne Sep 11, 2025
cd3dfc5
Bump version to 1.7.1
crmne Sep 11, 2025
6ee02d0
Updated appraisal gemfiles
crmne Sep 11, 2025
fa10f0c
Bust README gem cache
crmne Sep 11, 2025
c4895d6
seperate out tokenizers from gguf
orangewolf Sep 11, 2025
0dc8e9a
more complete error message
orangewolf Sep 11, 2025
8c87b59
Working on documentation
cpetersen Sep 11, 2025
252f97f
Merge branch 'main' into red-candle
cpetersen Sep 11, 2025
d437f73
red-candle is optional
cpetersen Sep 11, 2025
9bdb434
require 'candle' is standard
cpetersen Sep 11, 2025
d52e26e
rubocop
orangewolf Sep 12, 2025
8ec93e8
use a spec helper
orangewolf Sep 12, 2025
d1696ff
Remove the too cute pricing method
cpetersen Sep 12, 2025
62a0389
Fix the comment for RubyLLM::Providers::RedCandle::Capabilities
cpetersen Sep 12, 2025
90128bb
Make the require_relative actually relative
cpetersen Sep 12, 2025
9ab992d
Updatae to red-candle 1.3.0 to support ruby 3.1
cpetersen Sep 13, 2025
922e0e9
Update the comment
cpetersen Sep 13, 2025
0d23da4
Fix create_table migrations to prevent foreign key errors (#409) (#411)
matiasmoya Sep 13, 2025
078ef25
Fix: Add resolve method delegation from Models instance to class (#407)
kieranklaassen Sep 13, 2025
32b3648
Models helps should return all supporting modalities (#408)
dacamp Sep 13, 2025
497e3d8
Add Content Moderation Feature (#383)
iraszl Sep 14, 2025
e9f8d50
Fix [BUG] Inflection breaks Rails apps using the `Llm` module/name/na…
crmne Sep 14, 2025
aacd639
Updated appraisal gemfiles
crmne Sep 14, 2025
4ff2231
Add video file support (#405)
altxtech Sep 14, 2025
2ace2d3
Changed note style in video docs
crmne Sep 14, 2025
a4fae99
Remove outdated version notes and clarified moderation being availabl…
crmne Sep 14, 2025
e27eb10
Updated models
crmne Sep 14, 2025
0cb6299
Bump version to 1.8.0
crmne Sep 14, 2025
647756e
Bust gem version cache in README
crmne Sep 14, 2025
a309326
Updated documentation with latest changes
crmne Sep 14, 2025
e99371c
Added moderation to readme and index
crmne Sep 14, 2025
df8ef75
Update gpt-5 capabilities (#345)
mnort9 Sep 15, 2025
96d06c4
Updated models
crmne Sep 15, 2025
369e9d2
Merge branch 'main' into red-candle
orangewolf Sep 15, 2025
c79b852
Cleaned up injection into message model class for chat UI generator
crmne Sep 16, 2025
f9ce1e7
Updated appraisal gemfiles
crmne Sep 16, 2025
1e581cf
Merge branch 'main' into red-candle
orangewolf Sep 17, 2025
0e8cded
Production-ready chunk streaming for chat UI generator
crmne Sep 21, 2025
ae46014
Add funding URI to gemspec metadata
crmne Sep 21, 2025
6a9998a
Updated Appraisal gemfiles
crmne Sep 21, 2025
e08b2e5
Updated models
crmne Sep 21, 2025
46ac613
Bump version to 1.8.1
crmne Sep 21, 2025
a74233f
Display tool calls in message template (#416)
marckohlbrugge Sep 22, 2025
25ea0d3
Merge branch 'main' into red-candle
orangewolf Sep 22, 2025
636ef94
Set adapter to be net_http instead of Faraday.default_adapter. (#429)
jkogara Sep 24, 2025
8823739
Fix chat UI generator for namespaced models.
crmne Sep 24, 2025
c2e5bff
Simplify moderation example in documentation
crmne Sep 24, 2025
a70717f
Run generator tests only for latest version of Ruby and Rails
crmne Sep 24, 2025
d08118b
Run full test suite in latest Ruby and Rails version and upload test …
crmne Sep 24, 2025
1f5bc69
Exclude generators from codecov calculation
crmne Sep 24, 2025
b0fb8e8
Update Ruby Style Guide badge to point to RuboCop
crmne Sep 24, 2025
99e9594
Bump to 1.8.2
crmne Sep 24, 2025
10b31b3
Update README and index documentation to clarify embedding generation…
crmne Sep 26, 2025
a0efaa4
Updated appraisal gemfiles
crmne Sep 26, 2025
702b9b8
Merge branch 'main' into red-candle
orangewolf Oct 2, 2025
5 changes: 0 additions & 5 deletions .gitattributes

This file was deleted.

18 changes: 13 additions & 5 deletions .github/workflows/cicd.yml
@@ -49,8 +49,6 @@ jobs:

steps:
- uses: actions/checkout@v4
with:
lfs: true

- name: Set up Ruby
uses: ruby/setup-ruby@v1
@@ -67,13 +65,24 @@
run: bundle exec rubocop

- name: Run tests
if: matrix.ruby-version != '3.4' || matrix.rails-version != 'rails-8.0'
run: bundle exec appraisal ${{ matrix.rails-version }} rspec --tag '~generator'
env: # Dummy environment variables for local providers
OLLAMA_API_BASE: http://localhost:11434/v1
GPUSTACK_API_BASE: http://localhost:11444/v1
GPUSTACK_API_KEY: test
SKIP_COVERAGE: true

- name: Run full test suite with coverage (latest Ruby/Rails)
if: matrix.ruby-version == '3.4' && matrix.rails-version == 'rails-8.0'
run: bundle exec appraisal ${{ matrix.rails-version }} rspec
env: # Dummy environment variables for local providers
OLLAMA_API_BASE: http://localhost:11434/v1
GPUSTACK_API_BASE: http://localhost:11444/v1
GPUSTACK_API_KEY: test

- name: Upload coverage to Codecov
if: matrix.ruby-version == '3.4' && matrix.rails-version == 'rails-8.0'
uses: codecov/codecov-action@v5
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
@@ -85,11 +94,12 @@
run: |
FARADAY_VERSION=1.10.3 bundle install
bundle exec appraisal ${{ matrix.rails-version }} bundle install
bundle exec appraisal ${{ matrix.rails-version }} rspec
bundle exec appraisal ${{ matrix.rails-version }} rspec --tag '~generator'
env: # Dummy environment variables for local providers
OLLAMA_API_BASE: http://localhost:11434/v1
GPUSTACK_API_BASE: http://localhost:11444/v1
GPUSTACK_API_KEY: test
SKIP_COVERAGE: true

publish:
name: Build + Publish
@@ -99,8 +109,6 @@

steps:
- uses: actions/checkout@v4
with:
lfs: true

- name: Set up Ruby
uses: ruby/setup-ruby@v1
2 changes: 0 additions & 2 deletions .github/workflows/docs.yml
@@ -24,8 +24,6 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v4
with:
lfs: true

- name: Setup Ruby for models guide generation (root Gemfile)
uses: ruby/setup-ruby@v1
15 changes: 2 additions & 13 deletions .overcommit.yml
@@ -9,7 +9,7 @@ PreCommit:

RSpec:
enabled: true
command: ['bundle', 'exec', 'rspec']
command: ['bundle', 'exec', 'rspec', '--tag', '~generator']
on_warn: fail

TrailingWhitespace:
@@ -24,20 +24,9 @@ PreCommit:
description: 'Update appraisal gemfiles'
command: ['bundle', 'exec', 'appraisal', 'update']

PrePush:
GitLfs:
enabled: true
description: 'Push LFS objects to remote'
command: ['bash', '-c', 'git lfs pre-push "$@"', '--']

PostCheckout:
ALL: # Special hook name that customizes all hooks of this type
quiet: true # Change all post-checkout hooks to only display output on failure

IndexTags:
enabled: true # Generate a tags file with `ctags` each time HEAD changes

LfsInstall:
enabled: true
description: 'Ensure Git LFS files are pulled'
command: ['git', 'lfs', 'pull']
enabled: true # Generate a tags file with `ctags` each time HEAD changes
1 change: 1 addition & 0 deletions .rubocop.yml
@@ -10,6 +10,7 @@ AllCops:
- docs/**/*
- vendor/**/*
- gemfiles/**/*
- lib/generators/**/templates/**/*
SuggestExtensions: false

Metrics/ClassLength:
33 changes: 33 additions & 0 deletions CONTRIBUTING.md
@@ -55,6 +55,39 @@ rake vcr:record[all] # Everything

Always check cassettes for leaked API keys before committing.

## Optional Dependencies

### Red Candle Provider

The Red Candle provider enables local LLM execution using quantized GGUF models. It requires a Rust toolchain, so it's optional for contributors.

**To work WITHOUT Red Candle (default):**
```bash
bundle install
bundle exec rspec # Red Candle tests will be skipped
```

**To work WITH Red Candle:**
```bash
# Enable the Red Candle gem group
bundle config set --local with red_candle
bundle install

# Run tests with stubbed Red Candle (fast, default)
bundle exec rspec

# Run tests with real inference (slow, downloads models)
RED_CANDLE_REAL_INFERENCE=true bundle exec rspec
```

**To switch back to working without Red Candle:**
```bash
bundle config unset with
bundle install
```

The `bundle config` settings are stored in `.bundle/config` (gitignored), so each developer can choose their own setup without affecting others.
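
For reference, you can confirm which optional groups are enabled by inspecting that file (a quick sketch; the exact contents depend on your Bundler version, and the `BUNDLE_WITH` entry shown is simply what `bundle config set --local with red_candle` typically writes):

```bash
# Inspect the local Bundler settings (gitignored)
cat .bundle/config
# Expected to contain something like:
# BUNDLE_WITH: "red_candle"
```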

## Important Notes

* **Never edit `models.json`, `aliases.json`, or `available-models.md`** - they're auto-generated by `rake models`
6 changes: 6 additions & 0 deletions Gemfile
@@ -41,3 +41,9 @@ group :development do # rubocop:disable Metrics/BlockLength
# Optional dependency for Vertex AI
gem 'googleauth'
end

# Optional group for Red Candle provider (requires Rust toolchain)
# To include: bundle config set --local with red_candle
group :red_candle, optional: true do
gem 'red-candle', '~> 1.3'
end
25 changes: 19 additions & 6 deletions README.md
@@ -9,8 +9,8 @@

Battle tested at [<picture><source media="(prefers-color-scheme: dark)" srcset="https://chatwithwork.com/logotype-dark.svg"><img src="https://chatwithwork.com/logotype.svg" alt="Chat with Work" height="30" align="absmiddle"></picture>](https://chatwithwork.com) — *Claude Code for your documents*

[![Gem Version](https://badge.fury.io/rb/ruby_llm.svg?a=7)](https://badge.fury.io/rb/ruby_llm)
[![Ruby Style Guide](https://img.shields.io/badge/code_style-standard-brightgreen.svg)](https://github.com/testdouble/standard)
[![Gem Version](https://badge.fury.io/rb/ruby_llm.svg?a=10)](https://badge.fury.io/rb/ruby_llm)
[![Ruby Style Guide](https://img.shields.io/badge/code_style-rubocop-brightgreen.svg)](https://github.com/rubocop/rubocop)
[![Gem Downloads](https://img.shields.io/gem/dt/ruby_llm)](https://rubygems.org/gems/ruby_llm)
[![codecov](https://codecov.io/gh/crmne/ruby_llm/branch/main/graph/badge.svg?a=2)](https://codecov.io/gh/crmne/ruby_llm)

@@ -41,6 +41,7 @@ chat.ask "What's the best way to learn Ruby?"
```ruby
# Analyze any file type
chat.ask "What's in this image?", with: "ruby_conf.jpg"
chat.ask "What's happening in this video?", with: "video.mp4"
chat.ask "Describe this meeting", with: "meeting.wav"
chat.ask "Summarize this document", with: "contract.pdf"
chat.ask "Explain this code", with: "app.rb"
@@ -68,6 +69,11 @@ RubyLLM.paint "a sunset over mountains in watercolor style"
RubyLLM.embed "Ruby is elegant and expressive"
```

```ruby
# Moderate content for safety
RubyLLM.moderate "Check if this text is safe"
```

```ruby
# Let AI use your code
class Weather < RubyLLM::Tool
@@ -100,18 +106,19 @@ response = chat.with_schema(ProductSchema).ask "Analyze this product", with: "pr
## Features

* **Chat:** Conversational AI with `RubyLLM.chat`
* **Vision:** Analyze images and screenshots
* **Vision:** Analyze images and videos
* **Audio:** Transcribe and understand speech
* **Documents:** Extract from PDFs, CSVs, JSON, any file type
* **Image generation:** Create images with `RubyLLM.paint`
* **Embeddings:** Vector search with `RubyLLM.embed`
* **Embeddings:** Generate embeddings with `RubyLLM.embed`
* **Moderation:** Content safety with `RubyLLM.moderate`
* **Tools:** Let AI call your Ruby methods
* **Structured output:** JSON schemas that just work
* **Streaming:** Real-time responses with blocks
* **Rails:** ActiveRecord integration with `acts_as_chat`
* **Async:** Fiber-based concurrency
* **Model registry:** 500+ models with capability detection and pricing
* **Providers:** OpenAI, Anthropic, Gemini, VertexAI, Bedrock, DeepSeek, Mistral, Ollama, OpenRouter, Perplexity, GPUStack, and any OpenAI-compatible API
* **Providers:** OpenAI, Anthropic, Gemini, VertexAI, Bedrock, DeepSeek, Mistral, Ollama, OpenRouter, Perplexity, GPUStack, [RedCandle](https://github.com/scientist-labs/red-candle), and any OpenAI-compatible API

## Installation

@@ -132,18 +139,24 @@ end
## Rails

```bash
# Install Rails Integration
rails generate ruby_llm:install

# Add Chat UI (optional)
rails generate ruby_llm:chat_ui
```

```ruby
class Chat < ApplicationRecord
acts_as_chat
end

chat = Chat.create! model_id: "claude-sonnet-4"
chat = Chat.create! model: "claude-sonnet-4"
chat.ask "What's in this file?", with: "report.pdf"
```

Visit `http://localhost:3000/chats` for a ready-to-use chat interface!

## Documentation

[rubyllm.com](https://rubyllm.com)
103 changes: 41 additions & 62 deletions docs/_advanced/models.md
@@ -42,7 +42,7 @@ The registry stores crucial information about each model, including:
* **`name`**: A human-friendly name.
* **`context_window`**: Max input tokens (e.g., `128_000`).
* **`max_tokens`**: Max output tokens (e.g., `16_384`).
* **`supports_vision`**: If it can process images.
* **`supports_vision`**: If it can process images and videos.
* **`supports_functions`**: If it can use [Tools]({% link _core_features/tools.md %}).
* **`input_price_per_million`**: Cost in USD per 1 million input tokens.
* **`output_price_per_million`**: Cost in USD per 1 million output tokens.
@@ -86,7 +86,7 @@ chat_models = RubyLLM.models.refresh!.chat_models

**Local Provider Models:**

By default, `refresh!` includes models from local providers like Ollama and GPUStack if they're configured. To exclude local providers and only fetch from remote APIs (available in v1.6.5+):
By default, `refresh!` includes models from local providers like Ollama and GPUStack if they're configured. To exclude local providers and only fetch from remote APIs:

```ruby
# Only fetch from remote providers (Anthropic, OpenAI, etc.)
@@ -95,6 +95,33 @@ RubyLLM.models.refresh!(remote_only: true)

This is useful when you want to refresh only cloud-based models without querying local model servers.

### Dynamic Model Registration (Red Candle)

Some providers register their models dynamically at runtime rather than through the `models.json` file. Red Candle is one such provider: it registers its GGUF models when the gem is loaded.

**How Red Candle Models Work:**

1. **Not in models.json**: Red Candle models don't appear in the static models.json file since they're only available when the gem is installed.

2. **Dynamic Registration**: When ruby_llm.rb loads and Red Candle is available, it adds models to the in-memory registry:
```ruby
# This happens automatically in lib/ruby_llm.rb
RubyLLM::Providers::RedCandle.models.each do |model|
RubyLLM.models.instance_variable_get(:@models) << model
end
```

3. **Excluded from refresh!**: The `refresh!(remote_only: true)` flag excludes Red Candle and other local providers.

4. **Currently Supported Models**:
- `google/gemma-3-4b-it-qat-q4_0-gguf`
- `TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF`
- `TheBloke/Mistral-7B-Instruct-v0.2-GGUF`
- `Qwen/Qwen2.5-1.5B-Instruct-GGUF`
- `microsoft/Phi-3-mini-4k-instruct`

Red Candle models are only available when the gem is installed with the red_candle group enabled. See the [Configuration Guide]({% link _getting_started/configuration.md %}) for installation instructions.
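
As a quick illustration, once the group is enabled a registered Red Candle model can be used like any other model in the registry (a minimal sketch; the model ID comes from the list above, inference runs locally, and the first call may download model weights):

```ruby
# Sketch: chatting with a dynamically registered Red Candle model
chat = RubyLLM.chat(model: 'TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF')
chat.ask "Give me a one-line summary of Ruby."
```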

**For Gem Development:**

The `rake models:update` task is designed for gem maintainers and updates the `models.json` file shipped with the gem:
Expand All @@ -108,68 +135,20 @@ This task is not intended for Rails applications as it writes to gem directories

**Persisting Models to Your Database:**

If you want to store model information in your application's database for persistence, querying, or caching, create your own migration and sync logic. Here's an example schema and production-ready sync job:
For Rails applications, the install generator sets up everything automatically:

```ruby
# db/migrate/xxx_create_llm_models.rb
create_table "llm_models", force: :cascade do |t|
t.string "model_id", null: false
t.string "name", null: false
t.string "provider", null: false
t.boolean "available", default: false
t.boolean "is_default", default: false
t.datetime "last_synced_at"
t.integer "context_window"
t.integer "max_output_tokens"
t.jsonb "metadata", default: {}
t.datetime "created_at", null: false
t.datetime "updated_at", null: false
t.string "slug"
t.string "model_type"
t.string "family"
t.datetime "model_created_at"
t.date "knowledge_cutoff"
t.jsonb "modalities", default: {}, null: false
t.jsonb "capabilities", default: [], null: false
t.jsonb "pricing", default: {}, null: false

t.index ["model_id"], unique: true
t.index ["provider", "available", "context_window"]
t.index ["capabilities"], using: :gin
t.index ["modalities"], using: :gin
t.index ["pricing"], using: :gin
end
```bash
rails generate ruby_llm:install
rails db:migrate
```

# app/jobs/sync_llm_models_job.rb
class SyncLLMModelsJob < ApplicationJob
queue_as :default
retry_on StandardError, wait: 1.seconds, attempts: 5

def perform
RubyLLM.models.refresh!

found_model_ids = RubyLLM.models.chat_models.filter_map do |model_data|
attributes = model_data.to_h
attributes[:model_id] = attributes.delete(:id)
attributes[:model_type] = attributes.delete(:type)
attributes[:model_created_at] = attributes.delete(:created_at)
attributes[:last_synced_at] = Time.now

model = LLMModel.find_or_initialize_by(model_id: attributes[:model_id])
model.assign_attributes(**attributes)
model.save ? model.id : nil
end

# Mark missing models as unavailable instead of deleting them
LLMModel.where.not(id: found_model_ids).update_all(available: false)
end
end
This creates the Model table and loads model data from the gem's registry.

# Schedule it to run periodically
# config/schedule.rb (with whenever gem)
every 6.hours do
runner "SyncLLMModelsJob.perform_later"
end
To refresh model data from provider APIs:

```ruby
# Fetches latest model info from configured providers (requires API keys)
Model.refresh!
```
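
If you want this data refreshed automatically, one option is to wrap the call in a recurring job (a minimal sketch; the job class name and scheduling mechanism are illustrative, not generated by the installer):

```ruby
# app/jobs/refresh_llm_models_job.rb -- illustrative sketch
class RefreshLlmModelsJob < ApplicationJob
  queue_as :default

  def perform
    # Pulls the latest model info from configured providers (requires API keys)
    Model.refresh!
  end
end

# Enqueue it from your scheduler of choice (cron, sidekiq-cron, recurring jobs, etc.):
# RefreshLlmModelsJob.perform_later
```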

## Exploring and Finding Models
@@ -323,4 +302,4 @@ image = RubyLLM.paint(
* **Your Responsibility:** Ensure the model ID is correct for the target endpoint.
* **Warning Log:** A warning is logged indicating validation was skipped.

Use these features when the standard registry doesn't cover your specific model or endpoint needs. For standard models, rely on the registry for validation and capability awareness. See the [Chat Guide]({% link _core_features/chat.md %}) for more on using the `chat` object.
Use these features when the standard registry doesn't cover your specific model or endpoint needs. For standard models, rely on the registry for validation and capability awareness. See the [Chat Guide]({% link _core_features/chat.md %}) for more on using the `chat` object.