parallax-compiler/parallax-benchmarks

Parallax Benchmarks

Comprehensive performance benchmarks for the Parallax GPU offload compiler.

Overview

This directory contains:

  • micro/ - Microbenchmarks for individual algorithms
  • apps/ - Real-world application benchmarks
  • results/ - Benchmark results and performance data
  • src/ - Shared benchmark infrastructure

Quick Start

# Build all benchmarks
mkdir build && cd build
cmake ..
make -j$(nproc)

# Run microbenchmarks
./micro/bench_for_each
./micro/bench_transform
./micro/bench_performance

# Run application benchmarks
./apps/bench_image_processing

Benchmark Categories

Microbenchmarks (micro/)

  • bench_for_each - std::for_each performance across dataset sizes
  • bench_transform - std::transform performance
  • bench_performance - Comprehensive performance suite
  • bench_correctness - Correctness validation
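
As a rough illustration of what a std::for_each microbenchmark measures, the sketch below times the algorithm on the CPU with std::chrono. It is not the repository's code: the names `ForEachTiming` and `time_for_each`, the element type, and the lambda body are assumptions made for this example, and no Parallax offload is involved.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical result type: average time per run plus one value read back
// from the data so the work has an observable result.
struct ForEachTiming {
    double avg_ms;
    float checksum;
};

// Times `iterations` sequential std::for_each passes over `n` floats and
// returns the average wall-clock time per pass in milliseconds.
ForEachTiming time_for_each(std::size_t n, int iterations) {
    assert(iterations > 0);
    std::vector<float> data(n, 1.0f);

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        std::for_each(data.begin(), data.end(),
                      [](float& x) { x = x * 0.5f + 1.0f; });
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return {elapsed.count() / iterations,
            data.empty() ? 0.0f : data.front()};
}
```

Calling `time_for_each(1000000, 100)` would give a CPU-side average for a 1M-element run. Averaging over several iterations reduces the influence of one-off startup costs on the reported time.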

Application Benchmarks (apps/)

  • bench_image_processing - Image filtering, transformations
  • bench_scientific_computing - Monte Carlo, numerical methods

Performance Results

Latest results on NVIDIA GeForce GTX 980M (December 2025):

std::for_each Performance

| Dataset Size | Time (ms) | Throughput (M elem/s) |
|--------------|-----------|-----------------------|
| 1K           | 5.57      | 0.18                  |
| 10K          | 0.27      | 36.77                 |
| 100K         | 0.44      | 228.06                |
| 1M           | 1.34      | 744.36                |

std::transform Performance

| Dataset Size | Time (ms) | Throughput (M elem/s) |
|--------------|-----------|-----------------------|
| 1K           | 2.61      | 0.38                  |
| 10K          | 0.27      | 36.63                 |
| 100K         | 0.41      | 243.33                |
| 1M           | 1.37      | 732.34                |

Key Findings:

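  • Throughput scales well on large datasets (>100K elements): 228-243 M elem/s at 100K
  • More than 700 million elements/second on the 1M dataset
  • Fixed overhead dominates on small datasets (<10K); the 1K runs take longer than the 10K runs
  • Throughput keeps rising with dataset size: going from 100K to 1M elements (10x) costs only about 3x the time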
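
The throughput column follows from the other two columns as elements divided by time. A minimal sketch of that arithmetic, assuming "1K" means 1,000 elements and "1M" means 1,000,000 (the function name is made up for this example):

```cpp
#include <cassert>
#include <cstddef>

// Throughput in millions of elements per second for `elements` items
// processed in `time_ms` milliseconds.
double throughput_melem_per_s(std::size_t elements, double time_ms) {
    assert(time_ms > 0.0);
    const double seconds = time_ms / 1000.0;
    return static_cast<double>(elements) / seconds / 1e6;
}
```

For the 1K std::for_each row, `throughput_melem_per_s(1000, 5.57)` gives about 0.18. The results computed from the rounded times differ slightly from the published figures (for example about 746 versus 744.36 for the 1M row), presumably because the published throughput was derived from unrounded timings.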

Running Benchmarks

Basic Usage

# Run single benchmark
./micro/bench_for_each

# Run with specific size
./micro/bench_performance --size 1000000

# Run with iterations
./micro/bench_performance --iterations 100

# Save results
./micro/bench_performance --output results/my_results.json

Comparison Mode

# Compare with CPU baseline
./micro/bench_performance --compare-cpu
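
A CPU comparison is usually summarized as a speedup ratio. The sketch below shows that calculation only; how `--compare-cpu` actually reports its results is not documented here, and the function name is an assumption.

```cpp
#include <cassert>

// Speedup of the GPU run over the CPU baseline: values above 1.0 mean the
// GPU run was faster, values below 1.0 mean the CPU baseline was faster.
double speedup(double cpu_ms, double gpu_ms) {
    assert(cpu_ms > 0.0 && gpu_ms > 0.0);
    return cpu_ms / gpu_ms;
}
```

With illustrative (not measured) timings of 10.0 ms on the CPU and 0.5 ms on the GPU, `speedup(10.0, 0.5)` is 20.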

Performance Guidelines

| Dataset Size | Expected Behavior | Use Case          |
|--------------|-------------------|-------------------|
| < 1K         | CPU likely faster | Don't use GPU     |
| 1K - 10K     | Break-even point  | Context dependent |
| 10K - 100K   | 10-100x speedup   | Good for GPU      |
| > 100K       | 100-1000x speedup | Excellent for GPU |
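
These guidelines could be encoded as a simple size-based dispatch hint. The sketch below is one possible reading of them, not part of Parallax: the `Target` enum and both function names are hypothetical, and the treatment of sizes exactly on a boundary (10K counts as GPU here) is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical dispatch hint derived from the guideline thresholds.
enum class Target { Cpu, ContextDependent, Gpu };

// Maps an element count to the suggested execution target.
Target suggest_target(std::size_t n) {
    if (n < 1000) return Target::Cpu;                // < 1K: CPU likely faster
    if (n < 10000) return Target::ContextDependent;  // 1K - 10K: break-even
    return Target::Gpu;                              // 10K and above
}

// Human-readable label for logging.
const char* target_name(Target t) {
    switch (t) {
        case Target::Cpu: return "cpu";
        case Target::ContextDependent: return "context-dependent";
        case Target::Gpu: return "gpu";
    }
    assert(false && "unhandled Target");
    return "";
}
```

The thresholds are worth re-measuring on the target hardware, since the break-even point depends on the GPU and on the work done per element.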

Hardware Requirements

  • Vulkan 1.2+ capable GPU
  • 4GB+ GPU memory recommended
  • CPU with AVX2 for baseline comparisons
