A Ruby gem for LLM evaluation that provides prompt management and tracing. It supports both local and Langfuse backends for managing prompts and traces.
- Prompt Management: Fetch and compile prompts from local files or Langfuse
- Tracing: Track LLM calls with spans, generations, and traces
- Observable Pattern: Automatically trace method calls with decorators
- Multiple Adapters: Support for local file system and Langfuse backends
- Template Support: Liquid templating for dynamic prompt compilation
Add this line to your application's Gemfile:
gem 'llm_eval_ruby'

And then execute:
$ bundle install
Or install it yourself as:
$ gem install llm_eval_ruby
Configure the gem in your application initializer:
LlmEvalRuby.configure do |config|
  # Choose your adapter: :local or :langfuse
  config.adapter = :langfuse

  # Langfuse configuration
  config.langfuse_options = {
    public_key: "your_public_key",
    secret_key: "your_secret_key",
    host: "https://your-langfuse-instance.com"
  }

  # Local configuration (for Rails applications)
  config.local_options = {
    prompts_path: "app/prompts"
  }
end

# Fetch a text prompt
prompt = LlmEvalRuby::PromptRepositories::Text.fetch(name: "my_prompt")
# Fetch and compile with variables
compiled_prompt = LlmEvalRuby::PromptRepositories::Text.fetch_and_compile(
  name: "my_prompt",
  variables: { user_name: "John", task: "summarize" }
)

# Fetch chat prompts (returns array of messages)
messages = LlmEvalRuby::PromptRepositories::Chat.fetch(name: "chat_prompt")
# Fetch and compile chat prompts with variables
compiled_messages = LlmEvalRuby::PromptRepositories::Chat.fetch_and_compile(
  name: "chat_prompt",
  variables: { context: "some context", question: "What is Ruby?" }
)

# Fetch specific version
prompt = LlmEvalRuby::PromptRepositories::Text.fetch(
  name: "my_prompt",
  version: "v1.2.0"
)

# Create a trace
# Langfuse performs an upsert when an id is given
trace = LlmEvalRuby::Tracer.trace(
  id: "trace_id",
  name: "llm_call",
  input: { prompt: "Hello, world!" }
)
# Create a span within a trace
span = LlmEvalRuby::Tracer.span(
  name: "preprocessing",
  trace_id: trace.id,
  input: { data: "raw input" }
)
# Create a generation (LLM call)
generation = LlmEvalRuby::Tracer.generation(
  name: "gpt_call",
  trace_id: trace.id,
  input: { messages: [{ role: "user", content: "Hello!" }] },
  model: "gpt-4"
)

# Trace with automatic timing
result = LlmEvalRuby::Tracer.span(
  name: "data_processing",
  input: { data: input_data }
) do |span|
  # Your processing logic here
  process_data(input_data)
end
# Generation with automatic result capture
response = LlmEvalRuby::Tracer.generation(
  name: "llm_call",
  input: { prompt: "Translate this text" },
  model: "gpt-4"
) do |generation|
  # Your LLM call here
  client.completions(prompt: "Translate this text")
end

Use the Observable module to automatically trace method calls:
class MyLLMService
  include LlmEvalRuby::Observable

  # Trace as a span
  observe :preprocess_data, type: :span

  def preprocess_data(input)
    # Method implementation
  end

  # Trace as a generation
  observe :call_llm, type: :generation

  def call_llm(messages)
    # LLM call implementation
  end

  # Trace as a regular trace
  observe :process_request

  def process_request(request)
    # Processing logic
  end
end

# Usage
service = MyLLMService.new
service.instance_variable_set(:@trace_id, "some-trace-id")
service.process_request(request_data)

For local prompt management, organize your prompts in the configured directory:
app/prompts/
├── my_chat_prompt/
│   ├── system.txt
│   └── user.txt
└── my_text_prompt/
    └── user.txt
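For illustration only, and assuming each prompt's name matches its directory name (with system.txt and user.txt providing the system and user messages), these prompts could be fetched like this:

# Assumed mapping: directory name -> prompt name
messages = LlmEvalRuby::PromptRepositories::Chat.fetch(name: "my_chat_prompt")
prompt = LlmEvalRuby::PromptRepositories::Text.fetch(name: "my_text_prompt")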
Example prompt files with Liquid templating:
app/prompts/summarize/system.txt
You are a helpful assistant that summarizes text.
app/prompts/summarize/user.txt
Please summarize the following text for {{ user_name }}:
{{ text_to_summarize }}
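As a sketch of how these pieces fit together (again assuming the prompt name matches the summarize directory name), the Liquid variables are supplied at compile time:

# Hypothetical compile of the summarize prompt shown above
compiled_messages = LlmEvalRuby::PromptRepositories::Chat.fetch_and_compile(
  name: "summarize",
  variables: {
    user_name: "John",
    text_to_summarize: "Ruby is an interpreted, object-oriented language."
  }
)
# The user message renders as:
# "Please summarize the following text for John:\nRuby is an interpreted, object-oriented language."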
You can pass custom Langfuse client instances to use different credentials per request:
# Create a custom client with different credentials
custom_client = LlmEvalRuby::ApiClients::Langfuse.new(
  host: "https://custom-langfuse.com",
  username: "custom_public_key",
  password: "custom_secret_key"
)

# Use custom client with Tracer
tracer = LlmEvalRuby::Tracer.new(adapter: :langfuse, client: custom_client)
tracer.trace(name: "custom_trace", input: { query: "test" })

# Use custom client with Text repository
text_repo = LlmEvalRuby::PromptRepositories::Text.new(
  adapter: :langfuse,
  client: custom_client
)
prompt = text_repo.fetch(name: "my_prompt")

# Use custom client with Chat repository
chat_repo = LlmEvalRuby::PromptRepositories::Chat.new(
  adapter: :langfuse,
  client: custom_client
)
messages = chat_repo.fetch(name: "chat_prompt")

If no client is provided, the default client configured via langfuse_options is used.
# Update a generation with results
LlmEvalRuby::Tracer.update_generation(
  id: generation.id,
  output: { response: "Generated text" },
  usage: { prompt_tokens: 10, completion_tokens: 20 }
)

trace = LlmEvalRuby::Tracer.trace(
  name: "complex_workflow",
  input: { query: "user query" },
  metadata: { user_id: "123", session_id: "abc" },
  tags: ["production", "important"]
)

The Langfuse adapter provides:
- Cloud-based prompt management
- Advanced tracing and analytics
- Version control for prompts
- Team collaboration features
The local adapter provides:
- File-based prompt storage
- Local development workflow
- No external dependencies
- Simple prompt organization
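For a local-only setup, the configuration shown earlier reduces to a minimal sketch like the following (prompts_path is assumed to be resolved relative to the application root):

# Minimal local-adapter configuration; no Langfuse credentials required
LlmEvalRuby.configure do |config|
  config.adapter = :local
  config.local_options = {
    prompts_path: "app/prompts"
  }
end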
The gem includes basic error handling:
begin
  prompt = LlmEvalRuby::PromptRepositories::Text.fetch(name: "nonexistent")
rescue LlmEvalRuby::Error => e
  puts "Error: #{e.message}"
end

- Ruby >= 3.3.0
- HTTParty for API calls
- Liquid for template rendering
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and the created tag, and push the .gem file to rubygems.org.
Bug reports and pull requests are welcome on GitHub at https://github.com/test-IO/llm_eval_ruby. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.
The gem is available as open source under the terms of the MIT License.
Everyone interacting in the LlmEvalRuby project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.