Ruby Parallelization Benchmark Framework

NOTE: At the moment this is (mostly) vibecoded by Claude 4 Opus + Cursor. I have verified that everything runs and fixed RuboCop offenses by hand, but I have not yet audited the implementation of each parallelization approach.

A comprehensive benchmarking framework for evaluating Ruby's diverse parallelization approaches, including Ractors, threads, processes, fibers, and popular concurrency gems.

Features

  • Multiple Concurrency Models: Test various Ruby parallelization approaches

    • Sequential (baseline)
    • Native Threads
    • Process Forking
    • Fibers
    • Ractors (Ruby 3.0+)
    • Parallel gem (with automatic backend selection)
    • Async gem (fiber-based async I/O)
    • Concurrent-ruby (futures, promises, thread pools)
  • Realistic Workloads

    • CPU-intensive (fibonacci, prime checking, matrix operations)
    • I/O-bound (sleep, file operations, HTTP simulation)
    • Memory-intensive (array creation, string operations, hash manipulation)
    • Mixed workloads (combining CPU and I/O tasks)
    • Web application simulation
    • Data pipeline processing
  • Advanced Metrics

    • Iterations per second (IPS) with statistical analysis
    • Automatic warmup and confidence intervals
    • Garbage collection impact measurement
    • Multi-core scaling analysis

Requirements

  • Ruby 3.3.0+ (Ruby 3.0+ for Ractor support)
  • Bundler for dependency management

Installation

# Clone the repository
git clone https://github.com/maxemitchell/ruby-parallelization-benchmark.git
cd ruby-parallelization-benchmark

# Install dependencies
bundle install

Quick Start

Basic Usage

# Run all benchmarks with default settings
ruby bin/benchmark.rb

# Run specific adapters and workloads
ruby bin/benchmark.rb -a threads,processes -w cpu_intensive,io_bound

# Use a custom configuration file
ruby bin/benchmark.rb -c config/benchmark_config.yaml

Running Tests

# Run the test suite
bundle exec rspec

# Run with code style checking
bundle exec rake

Configuration

The framework uses YAML configuration files. See config/benchmark_config.yaml for an example:

warmup_iterations: 2
benchmark_time: 5
worker_counts: [1, 2, 4, 8, 16]

workload_config:
  cpu_intensive:
    task_count: 1000
    complexity: 1000
    algorithm: fibonacci

  io_bound:
    task_count: 100
    io_type: sleep
    duration: 0.01
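
At runtime these keys become an ordinary Ruby hash. As a rough illustration only (the framework's actual loader may symbolize keys or apply defaults differently), a file like the one above can be read with the standard yaml library:

require "yaml"

config = YAML.safe_load_file("config/benchmark_config.yaml")
config["worker_counts"]                        # => [1, 2, 4, 8, 16]
config.dig("workload_config", "cpu_intensive") # => {"task_count"=>1000, "complexity"=>1000, "algorithm"=>"fibonacci"}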

Available Adapters

  • sequential - Baseline sequential execution
  • threads - Ruby native threads
  • processes - Process forking
  • fibers - Fiber-based cooperative concurrency
  • ractor - Ractor-based true parallelism (Ruby 3.0+)
  • parallel_gem - Parallel gem with auto backend selection
  • async_gem - Async gem for fiber-based async I/O
  • concurrent_ruby - Concurrent-ruby thread pools and futures

Available Workloads

  • cpu_intensive - CPU-bound computational tasks
  • io_bound - I/O-bound operations
  • memory_intensive - Memory allocation and manipulation
  • mixed - Combined CPU and I/O operations
  • web_application - Web request simulation
  • data_pipeline - ETL-style data processing

Understanding Results

The benchmark generates JSON reports with detailed metrics:

{
  "metadata": {
    "timestamp": "20250130_143022",
    "ruby_version": "3.3.0",
    "platform": "arm64-darwin",
    "cpu_count": 8
  },
  "results": {
    "cpu_intensive": {
      "threads": {
        "1": {
          "ips": 150.5,
          "time": 0.00664,
          "gc_stats": {
            "count": 2,
            "time": 0.012
          }
        }
      }
    }
  }
}

Key Metrics

  • IPS (Iterations Per Second): Higher is better
  • Time: Average time per iteration (lower is better)
  • GC Stats: Garbage collection impact
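
Because reports are plain JSON, they are easy to post-process. A minimal sketch (not shipped with the framework) that prints IPS per workload, adapter, and worker count, given the path to a report file as its first argument:

require "json"

# Walk the results tree shown above and print one line per measurement.
report = JSON.parse(File.read(ARGV.fetch(0)))
report["results"].each do |workload, adapters|
  adapters.each do |adapter, by_workers|
    by_workers.each do |workers, metrics|
      puts format("%-18s %-16s workers=%-3s ips=%.1f", workload, adapter, workers, metrics["ips"])
    end
  end
end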

Development

Project Structure

lib/
├── parallelization_benchmark.rb          # Main entry point
├── parallelization_benchmark/
│   ├── version.rb                        # Version information
│   ├── benchmark_orchestrator.rb         # Main orchestrator
│   ├── base_adapter.rb                   # Base adapter class
│   ├── base_workload.rb                  # Base workload class
│   ├── adapters/                         # Concurrency adapters
│   │   ├── sequential_adapter.rb
│   │   ├── thread_adapter.rb
│   │   ├── process_adapter.rb
│   │   ├── fiber_adapter.rb
│   │   ├── ractor_adapter.rb
│   │   ├── parallel_gem_adapter.rb
│   │   ├── async_gem_adapter.rb
│   │   └── concurrent_ruby_adapter.rb
│   └── workloads/                        # Benchmark workloads
│       ├── cpu_intensive_workload.rb
│       ├── io_bound_workload.rb
│       ├── memory_intensive_workload.rb
│       ├── mixed_workload.rb
│       ├── web_application_workload.rb
│       └── data_pipeline_workload.rb

Adding Custom Adapters

class MyCustomAdapter < ParallelizationBenchmark::BaseAdapter
  def execute(workload)
    # Your implementation here
    workload.tasks.each(&:call)
  end
end

# Register the adapter
orchestrator.register_adapter(:my_custom, MyCustomAdapter)
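
As a slightly more concrete, hypothetical example (assuming, as above, that workload.tasks returns an array of callables), an adapter that spreads tasks across a fixed pool of threads might look like this:

class FixedThreadPoolAdapter < ParallelizationBenchmark::BaseAdapter
  THREAD_COUNT = 4

  def execute(workload)
    queue = Queue.new
    workload.tasks.each { |task| queue << task }
    THREAD_COUNT.times { queue << :done } # one stop marker per worker

    workers = Array.new(THREAD_COUNT) do
      Thread.new do
        while (task = queue.pop) != :done
          task.call
        end
      end
    end
    workers.each(&:join)
  end
end

orchestrator.register_adapter(:fixed_thread_pool, FixedThreadPoolAdapter)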

Adding Custom Workloads

class DatabaseWorkload < ParallelizationBenchmark::BaseWorkload
  def generate_tasks
    Array.new(@config[:task_count]) do
      -> { simulate_database_query }
    end
  end

  private

  def simulate_database_query
    # Your simulation here
  end
end

# Register the workload
orchestrator.register_workload(:database, DatabaseWorkload)

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run the tests (bundle exec rake)
  5. Commit your changes (git commit -am 'Add amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Performance Tips

For CPU-Intensive Workloads

  • CRuby (MRI): Process-based parallelism typically outperforms threads, because the GVL (Global VM Lock) prevents threads from running Ruby code in parallel (see the sketch after this list)
  • JRuby/TruffleRuby: Threads can achieve true parallelism, since these runtimes have no GVL
  • Ractors: Provide true parallelism on CRuby 3.0+
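
To see the GVL effect outside the framework, here is a small self-contained sketch (not part of the benchmark suite): on CRuby, the threaded version of a CPU-bound loop takes roughly as long as the sequential one, while forked processes can use multiple cores.

require "benchmark"

def burn
  (1..2_000_000).reduce(0) { |sum, i| sum + i % 7 }
end

puts Benchmark.realtime { 4.times { burn } }  # sequential baseline

# Threads: still roughly the sequential time on CRuby, because the GVL
# allows only one thread to execute Ruby code at a time.
puts Benchmark.realtime { Array.new(4) { Thread.new { burn } }.each(&:join) }

# Processes: true parallelism (fork is unavailable on Windows and JRuby).
puts Benchmark.realtime do
  pids = Array.new(4) { fork { burn } }
  pids.each { |pid| Process.wait(pid) }
end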

For I/O-Bound Workloads

  • Threads: Perform well, since the GVL is released while a thread is blocked on I/O (see the sketch after this list)
  • Async/Fibers: Excellent for high-concurrency I/O with very low per-task overhead
  • Processes: Higher overhead, but strong isolation
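
For example, ten threads each sleeping 100 ms finish in roughly 100 ms of wall time rather than one second, because the GVL is released during blocking calls (sleep stands in for real I/O in this sketch):

require "benchmark"

# Ten simulated I/O waits run concurrently in threads; total wall time is
# roughly one sleep, not ten, because the GVL is released while blocking.
elapsed = Benchmark.realtime do
  Array.new(10) { Thread.new { sleep 0.1 } }.each(&:join)
end
puts format("%.2fs", elapsed) # ≈ 0.10s, not 1.00s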

For Memory-Intensive Workloads

  • Processes: Highest memory usage, but each process garbage-collects independently
  • Threads: Share memory, but can contend on a single GC
  • Ractors: A middle ground; objects are isolated between Ractors, although GC on CRuby still runs process-wide (see the sketch after this list)
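
A minimal Ractor sketch (Ruby 3.0+, still experimental, so expect a warning): each Ractor works only on objects it owns or was given, and can execute Ruby code in parallel with the others on CRuby.

# Each Ractor receives its own argument, computes in isolation, and the
# block's return value is collected with #take.
ractors = Array.new(4) do |i|
  Ractor.new(i) do |n|
    (1..1_000_000).reduce(0) { |sum, x| sum + (x % (n + 7)) }
  end
end
puts ractors.map(&:take).inspect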
