Ruby SDK for Langfuse - Open-source LLM observability and prompt management.
- Prompt Management - Fetch and compile prompts with variable substitution
- LLM Tracing - Built on OpenTelemetry for distributed tracing
- Flexible Caching - In-memory or Rails.cache (Redis) backends with TTL
- Text & Chat Prompts - Support for both simple text and chat/completion formats
- Mustache Templating - Logic-less variable substitution with nested objects and lists
- Automatic Retries - Built-in retry logic with exponential backoff
- Fallback Support - Graceful degradation when API is unavailable
- Rails-Friendly - Global configuration pattern with `Langfuse.configure`
Add to your Gemfile:
gem 'langfuse'

Then run:

bundle install

# config/initializers/langfuse.rb (Rails)
Langfuse.configure do |config|
config.public_key = ENV['LANGFUSE_PUBLIC_KEY']
config.secret_key = ENV['LANGFUSE_SECRET_KEY']
config.base_url = "https://cloud.langfuse.com" # Optional (default)
config.cache_ttl = 60 # Cache prompts for 60 seconds (optional)
config.cache_backend = :memory # :memory (default) or :rails for distributed cache
config.timeout = 5 # Request timeout in seconds (optional)
end

# Use the global singleton client
client = Langfuse.client
prompt = client.get_prompt("greeting")
text = prompt.compile(name: "Alice")
puts text # => "Hello Alice!"

Text prompts are simple string templates with Mustache variables:
# Fetch a text prompt
prompt = Langfuse.client.get_prompt("email-template")
# Access metadata
prompt.name # => "email-template"
prompt.version # => 3
prompt.labels # => ["production"]
prompt.tags # => ["email", "customer"]
# Compile with variables
email = prompt.compile(
customer_name: "Alice",
order_number: "12345",
total: "$99.99"
)
# => "Dear Alice, your order #12345 for $99.99 has shipped!"

Chat prompts return arrays of messages ready for LLM APIs:
# Fetch a chat prompt
prompt = Langfuse.client.get_prompt("support-assistant")
# Compile with variables
messages = prompt.compile(
company_name: "Acme Corp",
support_level: "premium"
)
# => [
# { role: :system, content: "You are a premium support agent for Acme Corp." },
# { role: :user, content: "How can I help you today?" }
# ]
# Use directly with OpenAI
require 'openai'
client = OpenAI::Client.new
response = client.chat(
parameters: {
model: "gpt-4",
messages: messages
}
)

# Fetch specific version
prompt = Langfuse.client.get_prompt("greeting", version: 2)
# Fetch by label
production_prompt = Langfuse.client.get_prompt("greeting", label: "production")
# Note: version and label are mutually exclusive
# client.get_prompt("greeting", version: 2, label: "production") # Raises ArgumentError

Important: The version and label parameters are mutually exclusive. You can specify one or the other, but not both. When using cache warming with version overrides, the SDK automatically handles this - prompts with version overrides won't send the default label.
The SDK uses Mustache for powerful templating:
# Nested objects
prompt.compile(
user: { name: "Alice", email: "alice@example.com" }
)
# Template: "Hello {{user.name}}, we'll email you at {{user.email}}"
# Result: "Hello Alice, we'll email you at alice@example.com"
# Lists/Arrays
prompt.compile(
items: [
{ name: "Apple", price: 1.99 },
{ name: "Banana", price: 0.99 }
]
)
# Template: "{{#items}}• {{name}}: ${{price}}\n{{/items}}"
# Result: "• Apple: $1.99\n• Banana: $0.99"
# HTML escaping (automatic)
prompt.compile(content: "<script>alert('xss')</script>")
# Template: "User input: {{content}}"
# Result: "User input: &lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;"
# Raw output (triple braces to skip escaping)
prompt.compile(raw_html: "<strong>Bold</strong>")
# Template: "{{{raw_html}}}"
# Result: "<strong>Bold</strong>"

See Mustache documentation for more templating features.
# One-liner: fetch and compile
text = Langfuse.client.compile_prompt("greeting", variables: { name: "Alice" })
# With fallback for graceful degradation
prompt = Langfuse.client.get_prompt(
"greeting",
fallback: "Hello {{name}}!",
type: :text
)

The SDK provides comprehensive LLM tracing built on OpenTelemetry. Traces capture LLM calls, nested operations, token usage, and costs.
Langfuse.trace(name: "chat-completion", user_id: "user-123") do |trace|
trace.generation(
name: "openai-call",
model: "gpt-4",
input: [{ role: "user", content: "Hello!" }]
) do |gen|
response = openai_client.chat(
parameters: {
model: "gpt-4",
messages: [{ role: "user", content: "Hello!" }]
}
)
# client.chat returns a plain Hash, so read fields with dig
gen.output = response.dig("choices", 0, "message", "content")
gen.usage = {
prompt_tokens: response.dig("usage", "prompt_tokens"),
completion_tokens: response.dig("usage", "completion_tokens"),
total_tokens: response.dig("usage", "total_tokens")
}
end
end

Langfuse.trace(name: "rag-query", user_id: "user-456") do |trace|
# Document retrieval
docs = trace.span(name: "retrieval", input: { query: "What is Ruby?" }) do |span|
results = vector_db.search(query_embedding, limit: 5)
span.output = { count: results.size }
results
end
# LLM generation with context
trace.generation(
name: "gpt4-completion",
model: "gpt-4",
input: build_prompt_with_context(docs)
) do |gen|
response = openai_client.chat(...)
gen.output = response.dig("choices", 0, "message", "content")
gen.usage = {
prompt_tokens: response.dig("usage", "prompt_tokens"),
completion_tokens: response.dig("usage", "completion_tokens")
}
end
end

Prompts fetched via the SDK are automatically linked to your traces:
prompt = Langfuse.client.get_prompt("support-assistant", version: 3)
Langfuse.trace(name: "support-query") do |trace|
trace.generation(
name: "response",
model: "gpt-4",
prompt: prompt # Automatically captured in trace!
) do |gen|
messages = prompt.compile(customer_name: "Alice")
response = openai_client.chat(parameters: { model: "gpt-4", messages: messages })
gen.output = response.dig("choices", 0, "message", "content")
end
end

The SDK is built on OpenTelemetry, which means:
- Automatic Context Propagation: Trace context flows through your application
- APM Integration: Traces appear in Datadog, New Relic, Honeycomb, etc.
- W3C Trace Context: Standard distributed tracing across microservices
- Rich Instrumentation: Works with existing OTel instrumentation (HTTP, Rails, Sidekiq)
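Because Langfuse traces flow through the standard OpenTelemetry pipeline, they sit alongside whatever OTel setup you already run. A minimal sketch, assuming the `opentelemetry-sdk` and `opentelemetry-instrumentation-all` gems are installed (they are separate from this SDK):

```ruby
require 'opentelemetry/sdk'
require 'opentelemetry/instrumentation/all'

# Standard OTel bootstrapping - exporters/APM vendors are configured here as usual
OpenTelemetry::SDK.configure do |c|
  c.service_name = 'my-llm-app'
  c.use_all # enable available instrumentation (HTTP, Rails, Sidekiq, ...)
end

# Langfuse spans created inside an instrumented request share the same
# trace context, so they nest under the surrounding HTTP/Rails spans.
Langfuse.trace(name: "otel-demo") do |trace|
  trace.span(name: "work", input: { step: 1 }) { |span| span.output = "done" }
end
```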
For more tracing examples and advanced usage, see Tracing Guide.
All configuration options:
Langfuse.configure do |config|
# Required: Authentication
config.public_key = "pk_..."
config.secret_key = "sk_..."
# Optional: API Settings
config.base_url = "https://cloud.langfuse.com" # Default
config.timeout = 5 # Seconds (default: 5)
# Optional: Caching
config.cache_backend = :memory # :memory (default) or :rails
config.cache_ttl = 60 # Cache TTL in seconds (default: 60, 0 = disabled)
config.cache_max_size = 1000 # Max cached prompts (default: 1000, only for :memory backend)
config.cache_lock_timeout = 10 # Lock timeout in seconds (default: 10, only for :rails backend)
# Optional: Logging
config.logger = Rails.logger # Custom logger (default: Logger.new($stdout))
end

Configuration can also be loaded from environment variables:
export LANGFUSE_PUBLIC_KEY="pk_..."
export LANGFUSE_SECRET_KEY="sk_..."
export LANGFUSE_BASE_URL="https://cloud.langfuse.com"

Langfuse.configure do |config|
# Keys are automatically loaded from ENV if not explicitly set
end

For non-global usage:
config = Langfuse::Config.new do |c|
c.public_key = "pk_..."
c.secret_key = "sk_..."
end
client = Langfuse::Client.new(config)

The SDK supports two caching backends:
Built-in thread-safe in-memory caching with TTL and LRU eviction:
Langfuse.configure do |config|
config.cache_backend = :memory # Default
config.cache_ttl = 60 # Cache for 60 seconds
config.cache_max_size = 1000 # Max 1000 prompts in memory
end
# First call hits the API
prompt1 = Langfuse.client.get_prompt("greeting") # API call
# Second call uses cache (within TTL)
prompt2 = Langfuse.client.get_prompt("greeting") # Cached!
# Different versions are cached separately
v1 = Langfuse.client.get_prompt("greeting", version: 1) # API call
v2 = Langfuse.client.get_prompt("greeting", version: 2) # API call

In-Memory Cache Features:
- Thread-safe with Monitor-based synchronization
- TTL-based expiration
- LRU eviction when max_size is reached
- Perfect for single-process apps, scripts, and Sidekiq workers
- No external dependencies
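To see the TTL in action, here is a small illustration assuming the 60-second `cache_ttl` configured above (the sleep is only for demonstration):

```ruby
Langfuse.client.get_prompt("greeting")  # API call, result cached
Langfuse.client.get_prompt("greeting")  # served from the in-memory cache

sleep 61                                # wait past cache_ttl = 60
Langfuse.client.get_prompt("greeting")  # entry expired, fetched from the API again
```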
For multi-process deployments (e.g., large Rails apps with many Passenger/Puma workers):
# config/initializers/langfuse.rb
Langfuse.configure do |config|
config.public_key = ENV['LANGFUSE_PUBLIC_KEY']
config.secret_key = ENV['LANGFUSE_SECRET_KEY']
config.cache_backend = :rails # Use Rails.cache (typically Redis)
config.cache_ttl = 300 # 5 minutes
end

Rails.cache Backend Features:
- Shared cache across all processes and servers
- Distributed caching with Redis/Memcached
- Automatic stampede protection with distributed locks
- Exponential backoff (50ms, 100ms, 200ms) when waiting for locks
- No max_size limit (managed by Redis/Memcached)
- Ideal for large-scale deployments (100+ processes)
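The `:rails` backend delegates to `Rails.cache`, so the distributed behaviour comes from whatever store backs it. A sketch of pointing `Rails.cache` at Redis, assuming the `redis` gem and a `REDIS_URL` environment variable (both part of your infrastructure, not this SDK):

```ruby
# config/environments/production.rb
Rails.application.configure do
  # Hypothetical URL - replace with your own Redis instance
  config.cache_store = :redis_cache_store, { url: ENV.fetch("REDIS_URL", "redis://localhost:6379/1") }
end
```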
How Stampede Protection Works:
When using Rails.cache backend, the SDK automatically prevents "thundering herd" problems:
- Cache Miss: First process acquires distributed lock, fetches from API
- Concurrent Requests: Other processes wait (exponential backoff) instead of hitting API
- Cache Populated: Waiting processes read from cache once first process completes
- Fallback: If lock holder crashes, lock auto-expires (configurable timeout)
# Configure lock timeout (default: 10 seconds)
Langfuse.configure do |config|
config.cache_backend = :rails
config.cache_lock_timeout = 15 # Lock expires after 15s
end

This is automatic for the Rails.cache backend - no additional configuration needed!
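Conceptually, the read-through path looks like the sketch below. This is illustrative pseudologic only, not the SDK's internal code; `cache_key`, `lock_key`, and `fetch_from_api` are made-up names:

```ruby
def read_through(cache_key, lock_key, ttl:, lock_timeout:)
  cached = Rails.cache.read(cache_key)
  return cached if cached

  # Only one process wins the distributed lock...
  if Rails.cache.write(lock_key, true, unless_exist: true, expires_in: lock_timeout)
    begin
      value = fetch_from_api  # a single API call for the whole fleet
      Rails.cache.write(cache_key, value, expires_in: ttl)
      value
    ensure
      Rails.cache.delete(lock_key)
    end
  else
    # ...everyone else backs off (50ms, 100ms, 200ms) and re-reads the cache
    [0.05, 0.1, 0.2].each do |delay|
      sleep delay
      cached = Rails.cache.read(cache_key)
      return cached if cached
    end
    fetch_from_api  # last resort if the lock holder crashed before populating
  end
end
```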
When to use Rails.cache:
- Large Rails apps with many worker processes (Passenger, Puma, Unicorn)
- Multiple servers sharing the same prompt cache
- Deploying with 100+ processes that all need consistent cache
- Already using Redis for Rails.cache
When to use in-memory cache:
- Single-process applications
- Scripts and background jobs
- Smaller deployments (< 10 processes)
- When you want zero external dependencies
Pre-warm the cache during deployment to prevent cold-start API calls:
Auto-Discovery (Recommended):
The SDK can automatically discover and warm ALL prompts in your Langfuse project. By default, it fetches prompts with the "production" label to ensure you're warming production-ready prompts during deployment.
# Rake task - automatically discovers all prompts with "production" label
bundle exec rake langfuse:warm_cache_all

# Programmatically - warms all prompts with "production" label
warmer = Langfuse::CacheWarmer.new
results = warmer.warm_all
puts "Cached #{results[:success].size} prompts"
# => Cached 12 prompts (automatically discovered with "production" label)
# Warm with a different label (e.g., staging)
results = warmer.warm_all(default_label: "staging")
# Warm latest versions (no label)
results = warmer.warm_all(default_label: nil)

Manual Prompt List:
Or specify exact prompts to warm:
# In your deploy script (Capistrano, etc.)
bundle exec rake langfuse:warm_cache[greeting,conversation,rag-pipeline]
# Or via environment variable
LANGFUSE_PROMPTS_TO_WARM=greeting,conversation rake langfuse:warm_cache

# In deployment scripts or initializers
warmer = Langfuse::CacheWarmer.new
results = warmer.warm(['greeting', 'conversation', 'rag-pipeline'])
puts "Cached #{results[:success].size} prompts"
# => Cached 3 prompts
# With error handling
if results[:failed].any?
results[:failed].each do |failure|
logger.warn "Failed to cache #{failure[:name]}: #{failure[:error]}"
end
end
# Strict mode - raise on any failures
warmer.warm!(['greeting', 'conversation']) # Raises CacheWarmingError if any fail

With specific versions or labels:
# Auto-discovery with version overrides (version takes precedence over label)
warmer.warm_all(
versions: { 'greeting' => 2 } # greeting uses version 2, others use "production" label
)
# Override label for specific prompts
warmer.warm_all(
default_label: "production",
labels: { 'greeting' => 'staging' } # greeting uses "staging", others use "production"
)
# Manual list with versions/labels
warmer.warm(
['greeting', 'conversation'],
versions: { 'greeting' => 2 },
labels: { 'conversation' => 'production' }
)

Available Rake Tasks:
# Auto-discover and warm ALL prompts (recommended for deployment)
rake langfuse:warm_cache_all
# Warm cache with specific prompts
rake langfuse:warm_cache[prompt1,prompt2,prompt3]
# List prompt details
LANGFUSE_PROMPT_NAMES=greeting,conversation rake langfuse:list_prompts
# Clear the cache
rake langfuse:clear_cache

begin
prompt = Langfuse.client.get_prompt("my-prompt")
rescue Langfuse::NotFoundError => e
puts "Prompt not found: #{e.message}"
rescue Langfuse::UnauthorizedError => e
puts "Authentication failed: #{e.message}"
rescue Langfuse::ApiError => e
puts "API error: #{e.message}"
end

Exception Hierarchy:
StandardError
└── Langfuse::Error
    ├── Langfuse::ConfigurationError
    └── Langfuse::ApiError
        ├── Langfuse::NotFoundError (404)
        └── Langfuse::UnauthorizedError (401)
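Since every SDK exception inherits from Langfuse::Error, one rescue catches them all when fine-grained handling isn't needed. A small sketch (the warning message is just an example):

```ruby
begin
  prompt = Langfuse.client.get_prompt("my-prompt")
rescue Langfuse::Error => e
  # Catches configuration, auth, not-found, and other API errors alike
  warn "Langfuse unavailable, continuing without prompt: #{e.message}"
  prompt = nil
end
```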
The SDK is designed to be test-friendly:
# RSpec example
RSpec.configure do |config|
config.before(:each) do
Langfuse.reset! # Clear global state
end
end
# Mock prompts in tests
before do
allow_any_instance_of(Langfuse::Client)
.to receive(:get_prompt)
.with("greeting")
.and_return(
Langfuse::TextPromptClient.new(
"name" => "greeting",
"version" => 1,
"type" => "text",
"prompt" => "Hello {{name}}!",
"labels" => [],
"tags" => [],
"config" => {}
)
)
end

# Get a prompt (returns TextPromptClient or ChatPromptClient)
client.get_prompt(name, version: nil, label: nil)
# List all prompts in your Langfuse project
client.list_prompts(page: nil, limit: nil) # Returns Array of prompt metadata
# Get and compile in one call
client.compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil)

# TextPromptClient
prompt.compile(variables = {}) # Returns String
# ChatPromptClient
prompt.compile(variables = {}) # Returns Array of message hashes
# Both have these properties:
prompt.name # String
prompt.version # Integer
prompt.prompt # String (text) or Array (chat)
prompt.labels # Array
prompt.tags # Array
prompt.config # Hash

# Create a trace
Langfuse.trace(name:, user_id: nil, session_id: nil, metadata: {}) do |trace|
# Add spans, generations, events
end
# Add a generation (LLM call)
trace.generation(name:, model:, input:, prompt: nil) do |gen|
gen.output = "..."
gen.usage = { prompt_tokens: 10, completion_tokens: 20 }
end
# Add a span (any operation)
trace.span(name:, input: nil) do |span|
span.output = "..."
span.metadata = { ... }
end
# Add an event (point-in-time)
trace.event(name:, input: nil, output: nil)

See API documentation for complete reference.
- Ruby >= 3.2.0
- No Rails dependency (works with any Ruby project)
All components are thread-safe:
- `Langfuse.configure` and `Langfuse.client` are safe to call from multiple threads
- `PromptCache` uses Monitor-based synchronization
- `Client` instances can be shared across threads
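For example, a single shared client can serve several threads concurrently. A small sketch, assuming a prompt named "greeting" exists in your project:

```ruby
client = Langfuse.client  # one shared, thread-safe instance

threads = 5.times.map do |i|
  Thread.new do
    # Concurrent reads go through the thread-safe prompt cache
    client.get_prompt("greeting").compile(name: "worker-#{i}")
  end
end
threads.each(&:join)
```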
# Clone the repository
git clone https://github.com/langfuse/langfuse-ruby.git
cd langfuse-ruby
# Install dependencies
bundle install
# Run tests
bundle exec rspec
# Run linter
bundle exec rubocop -a

Current Status: Production-ready with 99.6% test coverage
For detailed implementation plans and progress, see:
- IMPLEMENTATION_PLAN_V2.md - Detailed roadmap
- PROGRESS.md - Current status and milestones
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
- Check existing issues and roadmap
- Open an issue to discuss your idea
- Fork the repo and create a feature branch
- Write tests for your changes
- Ensure `bundle exec rspec` and `bundle exec rubocop` pass
- Submit a pull request
MIT License - see LICENSE for details.
Need help? Open an issue on GitHub.