Langfuse Ruby SDK

Ruby SDK for Langfuse - Open-source LLM observability and prompt management.

Features

  • 🎯 Prompt Management - Fetch and compile prompts with variable substitution
  • πŸ“Š LLM Tracing - Built on OpenTelemetry for distributed tracing
  • ⚑ Flexible Caching - In-memory or Rails.cache (Redis) backends with TTL
  • πŸ’¬ Text & Chat Prompts - Support for both simple text and chat/completion formats
  • πŸ”§ Mustache Templating - Logic-less variable substitution with nested objects and lists
  • πŸ”„ Automatic Retries - Built-in retry logic with exponential backoff
  • πŸ›‘οΈ Fallback Support - Graceful degradation when API is unavailable
  • πŸš€ Rails-Friendly - Global configuration pattern with Langfuse.configure

Installation

Add to your Gemfile:

gem 'langfuse'

Then run:

bundle install

Quick Start

Configuration

# config/initializers/langfuse.rb (Rails)
Langfuse.configure do |config|
  config.public_key = ENV['LANGFUSE_PUBLIC_KEY']
  config.secret_key = ENV['LANGFUSE_SECRET_KEY']
  config.base_url = "https://cloud.langfuse.com"  # Optional (default)
  config.cache_ttl = 60           # Cache prompts for 60 seconds (optional)
  config.cache_backend = :memory  # :memory (default) or :rails for distributed cache
  config.timeout = 5              # Request timeout in seconds (optional)
end

Basic Usage

# Use the global singleton client
client = Langfuse.client
prompt = client.get_prompt("greeting")
text = prompt.compile(name: "Alice")
puts text  # => "Hello Alice!"

Prompt Management

Text Prompts

Text prompts are simple string templates with Mustache variables:

# Fetch a text prompt
prompt = Langfuse.client.get_prompt("email-template")

# Access metadata
prompt.name        # => "email-template"
prompt.version     # => 3
prompt.labels      # => ["production"]
prompt.tags        # => ["email", "customer"]

# Compile with variables
email = prompt.compile(
  customer_name: "Alice",
  order_number: "12345",
  total: "$99.99"
)
# => "Dear Alice, your order #12345 for $99.99 has shipped!"

Chat Prompts

Chat prompts return arrays of messages ready for LLM APIs:

# Fetch a chat prompt
prompt = Langfuse.client.get_prompt("support-assistant")

# Compile with variables
messages = prompt.compile(
  company_name: "Acme Corp",
  support_level: "premium"
)
# => [
#      { role: :system, content: "You are a premium support agent for Acme Corp." },
#      { role: :user, content: "How can I help you today?" }
#    ]

# Use directly with OpenAI
require 'openai'
client = OpenAI::Client.new
response = client.chat(
  parameters: {
    model: "gpt-4",
    messages: messages
  }
)

Versioning

# Fetch specific version
prompt = Langfuse.client.get_prompt("greeting", version: 2)

# Fetch by label
production_prompt = Langfuse.client.get_prompt("greeting", label: "production")

# Note: version and label are mutually exclusive
# client.get_prompt("greeting", version: 2, label: "production")  # Raises ArgumentError

Important: The version and label parameters are mutually exclusive. You can specify one or the other, but not both. When using cache warming with version overrides, the SDK automatically handles this - prompts with version overrides won't send the default label.

Advanced Templating

The SDK uses Mustache for powerful templating:

# Nested objects
prompt.compile(
  user: { name: "Alice", email: "alice@example.com" }
)
# Template: "Hello {{user.name}}, we'll email you at {{user.email}}"
# Result: "Hello Alice, we'll email you at alice@example.com"

# Lists/Arrays
prompt.compile(
  items: [
    { name: "Apple", price: 1.99 },
    { name: "Banana", price: 0.99 }
  ]
)
# Template: "{{#items}}β€’ {{name}}: ${{price}}\n{{/items}}"
# Result: "β€’ Apple: $1.99\nβ€’ Banana: $0.99"

# HTML escaping (automatic)
prompt.compile(content: "<script>alert('xss')</script>")
# Template: "User input: {{content}}"
# Result: "User input: &lt;script&gt;alert('xss')&lt;/script&gt;"

# Raw output (triple braces to skip escaping)
prompt.compile(raw_html: "<strong>Bold</strong>")
# Template: "{{{raw_html}}}"
# Result: "<strong>Bold</strong>"

See Mustache documentation for more templating features.

Convenience Methods

# One-liner: fetch and compile
text = Langfuse.client.compile_prompt("greeting", variables: { name: "Alice" })

# With fallback for graceful degradation
prompt = Langfuse.client.get_prompt(
  "greeting",
  fallback: "Hello {{name}}!",
  type: :text
)
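
If the Langfuse API is unreachable, the fallback template is used instead, so downstream code keeps working. A minimal sketch of the expected behavior (assuming the fallback compiles like a regular text prompt):

# Compilation still works during an outage, using the fallback template
text = prompt.compile(name: "Alice")
# => "Hello Alice!"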

LLM Tracing & Observability

The SDK provides comprehensive LLM tracing built on OpenTelemetry. Traces capture LLM calls, nested operations, token usage, and costs.

Basic Example

Langfuse.trace(name: "chat-completion", user_id: "user-123") do |trace|
  trace.generation(
    name: "openai-call",
    model: "gpt-4",
    input: [{ role: "user", content: "Hello!" }]
  ) do |gen|
    response = openai_client.chat(
      parameters: {
        model: "gpt-4",
        messages: [{ role: "user", content: "Hello!" }]
      }
    )

    gen.output = response.choices.first.message.content
    gen.usage = {
      prompt_tokens: response.usage.prompt_tokens,
      completion_tokens: response.usage.completion_tokens,
      total_tokens: response.usage.total_tokens
    }
  end
end

RAG Pipeline Example

Langfuse.trace(name: "rag-query", user_id: "user-456") do |trace|
  # Document retrieval
  docs = trace.span(name: "retrieval", input: { query: "What is Ruby?" }) do |span|
    results = vector_db.search(query_embedding, limit: 5)
    span.output = { count: results.size }
    results
  end

  # LLM generation with context
  trace.generation(
    name: "gpt4-completion",
    model: "gpt-4",
    input: build_prompt_with_context(docs)
  ) do |gen|
    response = openai_client.chat(...)
    gen.output = response.choices.first.message.content
    gen.usage = {
      prompt_tokens: response.usage.prompt_tokens,
      completion_tokens: response.usage.completion_tokens
    }
  end
end

Automatic Prompt Linking

Prompts fetched via the SDK are automatically linked to your traces:

prompt = Langfuse.client.get_prompt("support-assistant", version: 3)

Langfuse.trace(name: "support-query") do |trace|
  trace.generation(
    name: "response",
    model: "gpt-4",
    prompt: prompt  # Automatically captured in trace!
  ) do |gen|
    messages = prompt.compile(customer_name: "Alice")
    response = openai_client.chat(parameters: { model: "gpt-4", messages: messages })
    gen.output = response.choices.first.message.content
  end
end

OpenTelemetry Integration

The SDK is built on OpenTelemetry, which means:

  • Automatic Context Propagation: Trace context flows through your application
  • APM Integration: Traces appear in Datadog, New Relic, Honeycomb, etc.
  • W3C Trace Context: Standard distributed tracing across microservices
  • Rich Instrumentation: Works with existing OTel instrumentation (HTTP, Rails, Sidekiq)
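
Because context propagation is automatic, a Langfuse trace started inside an existing OpenTelemetry span should appear as a child of that span in your APM. A sketch, assuming the opentelemetry-sdk gem is installed and configured (the tracer name "my-app" is illustrative):

require 'opentelemetry/sdk'

tracer = OpenTelemetry.tracer_provider.tracer('my-app')

tracer.in_span('handle-request') do |_span|
  # The Langfuse trace inherits the active OTel context
  Langfuse.trace(name: "chat-completion", user_id: "user-123") do |trace|
    trace.generation(
      name: "openai-call",
      model: "gpt-4",
      input: [{ role: "user", content: "Hello!" }]
    ) do |gen|
      gen.output = "Hi there!"
    end
  end
end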

For more tracing examples and advanced usage, see Tracing Guide.

Configuration

All configuration options:

Langfuse.configure do |config|
  # Required: Authentication
  config.public_key = "pk_..."
  config.secret_key = "sk_..."

  # Optional: API Settings
  config.base_url = "https://cloud.langfuse.com"  # Default
  config.timeout = 5                              # Seconds (default: 5)

  # Optional: Caching
  config.cache_backend = :memory      # :memory (default) or :rails
  config.cache_ttl = 60               # Cache TTL in seconds (default: 60, 0 = disabled)
  config.cache_max_size = 1000        # Max cached prompts (default: 1000, only for :memory backend)
  config.cache_lock_timeout = 10      # Lock timeout in seconds (default: 10, only for :rails backend)

  # Optional: Logging
  config.logger = Rails.logger    # Custom logger (default: Logger.new($stdout))
end

Environment Variables

Configuration can also be loaded from environment variables:

export LANGFUSE_PUBLIC_KEY="pk_..."
export LANGFUSE_SECRET_KEY="sk_..."
export LANGFUSE_BASE_URL="https://cloud.langfuse.com"

Langfuse.configure do |config|
  # Keys are automatically loaded from ENV if not explicitly set
end

Instance-Based Configuration

For non-global usage:

config = Langfuse::Config.new do |c|
  c.public_key = "pk_..."
  c.secret_key = "sk_..."
end

client = Langfuse::Client.new(config)
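
An instance client exposes the same methods as the global singleton, so it can, for example, point at a different Langfuse project without touching the global configuration:

# The instance client works just like Langfuse.client
prompt = client.get_prompt("greeting")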

Caching

The SDK supports two caching backends:

In-Memory Cache (Default)

Built-in thread-safe in-memory caching with TTL and LRU eviction:

Langfuse.configure do |config|
  config.cache_backend = :memory      # Default
  config.cache_ttl = 60               # Cache for 60 seconds
  config.cache_max_size = 1000        # Max 1000 prompts in memory
end

# First call hits the API
prompt1 = Langfuse.client.get_prompt("greeting")  # API call

# Second call uses cache (within TTL)
prompt2 = Langfuse.client.get_prompt("greeting")  # Cached!

# Different versions are cached separately
v1 = Langfuse.client.get_prompt("greeting", version: 1)  # API call
v2 = Langfuse.client.get_prompt("greeting", version: 2)  # API call

In-Memory Cache Features:

  • Thread-safe with Monitor-based synchronization
  • TTL-based expiration
  • LRU eviction when max_size is reached
  • Perfect for single-process apps, scripts, and Sidekiq workers
  • No external dependencies

Rails.cache Backend (Distributed)

For multi-process deployments (e.g., large Rails apps with many Passenger/Puma workers):

# config/initializers/langfuse.rb
Langfuse.configure do |config|
  config.public_key = ENV['LANGFUSE_PUBLIC_KEY']
  config.secret_key = ENV['LANGFUSE_SECRET_KEY']
  config.cache_backend = :rails      # Use Rails.cache (typically Redis)
  config.cache_ttl = 300              # 5 minutes
end

Rails.cache Backend Features:

  • Shared cache across all processes and servers
  • Distributed caching with Redis/Memcached
  • Automatic stampede protection with distributed locks
  • Exponential backoff (50ms, 100ms, 200ms) when waiting for locks
  • No max_size limit (managed by Redis/Memcached)
  • Ideal for large-scale deployments (100+ processes)

How Stampede Protection Works:

When using Rails.cache backend, the SDK automatically prevents "thundering herd" problems:

  1. Cache Miss: First process acquires distributed lock, fetches from API
  2. Concurrent Requests: Other processes wait (exponential backoff) instead of hitting API
  3. Cache Populated: Waiting processes read from cache once first process completes
  4. Fallback: If lock holder crashes, lock auto-expires (configurable timeout)

# Configure lock timeout (default: 10 seconds)
Langfuse.configure do |config|
  config.cache_backend = :rails
  config.cache_lock_timeout = 15  # Lock expires after 15s
end

This is automatic for the Rails.cache backend; no additional configuration is needed.

When to use Rails.cache:

  • Large Rails apps with many worker processes (Passenger, Puma, Unicorn)
  • Multiple servers sharing the same prompt cache
  • Deploying with 100+ processes that all need consistent cache
  • Already using Redis for Rails.cache

When to use in-memory cache:

  • Single-process applications
  • Scripts and background jobs
  • Smaller deployments (< 10 processes)
  • When you want zero external dependencies

Cache Warming

Pre-warm the cache during deployment to prevent cold-start API calls:

Auto-Discovery (Recommended):

The SDK can automatically discover and warm ALL prompts in your Langfuse project. By default, it fetches prompts with the "production" label to ensure you're warming production-ready prompts during deployment.

# Rake task - automatically discovers all prompts with "production" label
bundle exec rake langfuse:warm_cache_all

# Programmatically - warms all prompts with "production" label
warmer = Langfuse::CacheWarmer.new
results = warmer.warm_all

puts "Cached #{results[:success].size} prompts"
# => Cached 12 prompts (automatically discovered with "production" label)

# Warm with a different label (e.g., staging)
results = warmer.warm_all(default_label: "staging")

# Warm latest versions (no label)
results = warmer.warm_all(default_label: nil)

Manual Prompt List:

Or specify exact prompts to warm:

# In your deploy script (Capistrano, etc.)
bundle exec rake langfuse:warm_cache[greeting,conversation,rag-pipeline]

# Or via environment variable
LANGFUSE_PROMPTS_TO_WARM=greeting,conversation rake langfuse:warm_cache

# In deployment scripts or initializers
warmer = Langfuse::CacheWarmer.new
results = warmer.warm(['greeting', 'conversation', 'rag-pipeline'])

puts "Cached #{results[:success].size} prompts"
# => Cached 3 prompts

# With error handling
if results[:failed].any?
  results[:failed].each do |failure|
    logger.warn "Failed to cache #{failure[:name]}: #{failure[:error]}"
  end
end

# Strict mode - raise on any failures
warmer.warm!(['greeting', 'conversation'])  # Raises CacheWarmingError if any fail

With specific versions or labels:

# Auto-discovery with version overrides (version takes precedence over label)
warmer.warm_all(
  versions: { 'greeting' => 2 }  # greeting uses version 2, others use "production" label
)

# Override label for specific prompts
warmer.warm_all(
  default_label: "production",
  labels: { 'greeting' => 'staging' }  # greeting uses "staging", others use "production"
)

# Manual list with versions/labels
warmer.warm(
  ['greeting', 'conversation'],
  versions: { 'greeting' => 2 },
  labels: { 'conversation' => 'production' }
)

Available Rake Tasks:

# Auto-discover and warm ALL prompts (recommended for deployment)
rake langfuse:warm_cache_all

# Warm cache with specific prompts
rake langfuse:warm_cache[prompt1,prompt2,prompt3]

# List prompt details
LANGFUSE_PROMPT_NAMES=greeting,conversation rake langfuse:list_prompts

# Clear the cache
rake langfuse:clear_cache
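
As one way to wire this into a deployment, the auto-discovery task can run after the release is published. A sketch for Capistrano (the hook below is an assumption about your deploy setup, not part of the SDK):

# config/deploy.rb (hypothetical Capistrano hook)
namespace :langfuse do
  desc 'Warm the Langfuse prompt cache after deploy'
  task :warm_cache do
    on roles(:app) do
      within release_path do
        execute :bundle, :exec, :rake, 'langfuse:warm_cache_all'
      end
    end
  end
end

after 'deploy:published', 'langfuse:warm_cache'

Deploy-time warming is most useful with the :rails cache backend, where the warmed entries land in the shared Rails.cache; with the in-memory backend each process keeps its own cache, so the rake task only warms its own process.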

Error Handling

begin
  prompt = Langfuse.client.get_prompt("my-prompt")
rescue Langfuse::NotFoundError => e
  puts "Prompt not found: #{e.message}"
rescue Langfuse::UnauthorizedError => e
  puts "Authentication failed: #{e.message}"
rescue Langfuse::ApiError => e
  puts "API error: #{e.message}"
end

Exception Hierarchy:

StandardError
└── Langfuse::Error
    β”œβ”€β”€ Langfuse::ConfigurationError
    └── Langfuse::ApiError
        β”œβ”€β”€ Langfuse::NotFoundError    (404)
        └── Langfuse::UnauthorizedError (401)
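
Since every SDK error inherits from Langfuse::Error, a single rescue works as a catch-all when you don't need to distinguish failure modes (the logging and nil default below are illustrative):

begin
  prompt = Langfuse.client.get_prompt("my-prompt")
rescue Langfuse::Error => e
  logger.warn "Langfuse unavailable: #{e.message}"
  prompt = nil  # fall back to an application default
end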

Testing

The SDK is designed to be test-friendly:

# RSpec example
RSpec.configure do |config|
  config.before(:each) do
    Langfuse.reset!  # Clear global state
  end
end

# Mock prompts in tests
before do
  allow_any_instance_of(Langfuse::Client)
    .to receive(:get_prompt)
    .with("greeting")
    .and_return(
      Langfuse::TextPromptClient.new(
        "name" => "greeting",
        "version" => 1,
        "type" => "text",
        "prompt" => "Hello {{name}}!",
        "labels" => [],
        "tags" => [],
        "config" => {}
      )
    )
end

API Reference

Client Methods

# Get a prompt (returns TextPromptClient or ChatPromptClient)
client.get_prompt(name, version: nil, label: nil)

# List all prompts in your Langfuse project
client.list_prompts(page: nil, limit: nil)  # Returns Array of prompt metadata

# Get and compile in one call
client.compile_prompt(name, variables: {}, version: nil, label: nil, fallback: nil, type: nil)

Prompt Client Methods

# TextPromptClient
prompt.compile(variables = {})  # Returns String

# ChatPromptClient
prompt.compile(variables = {})  # Returns Array of message hashes

# Both have these properties:
prompt.name        # String
prompt.version     # Integer
prompt.prompt      # String (text) or Array (chat)
prompt.labels      # Array
prompt.tags        # Array
prompt.config      # Hash

Tracing Methods

# Create a trace
Langfuse.trace(name:, user_id: nil, session_id: nil, metadata: {}) do |trace|
  # Add spans, generations, events
end

# Add a generation (LLM call)
trace.generation(name:, model:, input:, prompt: nil) do |gen|
  gen.output = "..."
  gen.usage = { prompt_tokens: 10, completion_tokens: 20 }
end

# Add a span (any operation)
trace.span(name:, input: nil) do |span|
  span.output = "..."
  span.metadata = { ... }
end

# Add an event (point-in-time)
trace.event(name:, input: nil, output: nil)

See API documentation for complete reference.

Requirements

  • Ruby >= 3.2.0
  • No Rails dependency (works with any Ruby project)

Thread Safety

All components are thread-safe:

  • Langfuse.configure and Langfuse.client are safe to call from multiple threads
  • PromptCache uses Monitor-based synchronization
  • Client instances can be shared across threads
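
As a quick illustration of the last point, one shared client can serve concurrent threads without extra locking (a minimal sketch; the thread count is arbitrary):

client = Langfuse.client

threads = 5.times.map do
  Thread.new { client.get_prompt("greeting") }
end
prompts = threads.map(&:value)  # all threads share the same client and prompt cache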

Development

# Clone the repository
git clone https://github.com/langfuse/langfuse-ruby.git
cd langfuse-ruby

# Install dependencies
bundle install

# Run tests
bundle exec rspec

# Run linter
bundle exec rubocop -a

Roadmap & Status

Current Status: Production-ready with 99.6% test coverage

For detailed implementation plans and progress, see the project roadmap and open issues.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

  1. Check existing issues and roadmap
  2. Open an issue to discuss your idea
  3. Fork the repo and create a feature branch
  4. Write tests for your changes
  5. Ensure bundle exec rspec and bundle exec rubocop pass
  6. Submit a pull request

License

MIT License - see LICENSE for details.

Support

Need help? Open an issue on GitHub.
