sql-mandelbrot-benchmark

Because why benchmark SQL engines with boring aggregates when you can generate fractals?

This project uses recursive Common Table Expressions (CTEs) to calculate the Mandelbrot set entirely in SQL — no loops, no procedural code, just pure SQL. It serves as a fun and visually appealing benchmark for testing recursive query performance, floating-point precision, and computational capabilities of SQL engines.

Mandelbrot Set

What is This?

A benchmark suite that:

  • Computes the famous Mandelbrot set using SQL recursive CTEs
  • Tests multiple SQL engines (ClickHouse, chDB, DuckDB, ArrowDatafusion, SQLite), plus Python and NumPy reference implementations
  • Generates beautiful fractal images as proof of correct computation
  • Reveals which database / SQL engine renders infinity fastest

Quick Start

```bash
# Clone the repository
git clone https://github.com/ClickHouse/sql-mandelbrot-benchmark.git
cd sql-mandelbrot-benchmark

# Install dependencies
pip install -r requirements.txt

# Run the benchmark suite
python main.py
```

Current Benchmark Results

Current results at 1400x800 pixels, 256 max iterations, on a MacBook Pro (M4 Max):

| 🏆 | Engine/Implementation | Time (ms) | Relative Performance |
|---|---|---|---|
| * | Mac Metal GPU (unfair, but the true limit) | 0.77 | ∞ 😵 |
| 1 | NumPy (vectorized, unrolled) | 664 | 0.78x |
| 2 | ArrowDatafusion (SQL) | 848 | 1.00x (baseline) |
| 3 | ClickHouse (SQL) | 1,039 | 1.23x slower |
| 4 | chDB (SQL) | 1,541 | 1.82x slower |
| 5 | DuckDB (SQL) | 1,839 | 2.17x slower |
| 6 | FasterPybrot | 2,654 | 3.13x slower |
| 7 | FastPybrot | 2,875 | 3.39x slower |
| 8 | Pure Python | 3,070 | 3.62x slower |
| 9 | SQLite (SQL) | 42,565 | 50.2x slower |

Winner overall: NumPy - About 22% faster than ArrowDatafusion (664 ms vs. 848 ms), thanks to loop unrolling and vectorized operations!

Winner SQL: ArrowDatafusion - Incredibly fast, nearly matching optimized NumPy performance!

How It Works

The Mandelbrot set is computed by iterating the formula z = z² + c for each pixel in the complex plane:

```sql
WITH RECURSIVE
  -- Generate pixel grid and map to complex plane
  pixels AS (
    SELECT
      x, y,
      -2.5 + (x * 3.5 / width) AS cx,
      -1.0 + (y * 2.0 / height) AS cy
    FROM generate_series(0, width-1) AS x,
         generate_series(0, height-1) AS y
  ),
  -- Recursively iterate z = z² + c
  mandelbrot_iterations AS (
    SELECT x, y, cx, cy, 0.0 AS zx, 0.0 AS zy, 0 AS iteration
    FROM pixels

    UNION ALL

    SELECT
      x, y, cx, cy,
      zx * zx - zy * zy + cx AS zx,
      2.0 * zx * zy + cy AS zy,
      iteration + 1
    FROM mandelbrot_iterations
    WHERE iteration < max_iterations
      AND (zx * zx + zy * zy) <= 4.0
  )
SELECT x, y, MAX(iteration) AS depth
FROM mandelbrot_iterations
GROUP BY x, y;
```

The iteration count determines the color of each pixel, creating the iconic fractal pattern.
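For reference, here is the same per-pixel escape-time loop as a minimal pure-Python sketch (the function name is illustrative, not taken from the repo's code):

```python
# Per-pixel escape-time iteration: repeat z = z**2 + c until |z| > 2
# (equivalently |z|^2 > 4) or the iteration cap is reached.
def escape_time(cx, cy, max_iterations):
    """Count z = z**2 + c iterations until |z| exceeds 2, capped at max_iterations."""
    zx, zy = 0.0, 0.0
    iteration = 0
    while iteration < max_iterations and zx * zx + zy * zy <= 4.0:
        # Complex squaring written on real/imaginary parts, mirroring the SQL
        zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
        iteration += 1
    return iteration
```

Points inside the set (e.g., c = 0) never escape and return the iteration cap; points far outside escape within a step or two.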

Adding New Benchmarks

Want to test PostgreSQL, MySQL, MariaDB, or even Oracle or SQL Server? Just:

  1. Create a new file (e.g., postgresqlbrot.py)
  2. Implement a run_postgresqlbrot(width, height, max_iterations) function (the DuckDB implementation is a good starting point)
  3. Add one line to main.py:
    BENCHMARKS = [
        ("ClickHouse (SQL)", "clickbrot", "run_clickbrot"),
        ("DuckDB (SQL)", "duckbrot", "run_duckbrot"),
        ("Pure Python", "pybrot", "run_pybrot"),
        ..., 
        ("PostgreSQL", "postgresqlbrot", "run_postgresqlbrot"),  # New!
    ]

The framework handles everything else automatically!
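As an illustration of the `run_<name>(width, height, max_iterations)` convention, here is a hedged sketch of what such a function could look like, using Python's stdlib `sqlite3` as the engine (the name `run_sqlitebrot`, the return shape, and the parameter binding are assumptions for this example, not the repo's actual code):

```python
# Hypothetical benchmark function following the run_<name>(width, height,
# max_iterations) convention described above; SQLite via the stdlib.
import sqlite3
import time

MANDELBROT_SQL = """
WITH RECURSIVE
  xs(x) AS (SELECT 0 UNION ALL SELECT x + 1 FROM xs WHERE x + 1 < :w),
  ys(y) AS (SELECT 0 UNION ALL SELECT y + 1 FROM ys WHERE y + 1 < :h),
  pixels AS (
    SELECT x, y,
           -2.5 + (x * 3.5 / CAST(:w AS REAL)) AS cx,
           -1.0 + (y * 2.0 / CAST(:h AS REAL)) AS cy
    FROM xs, ys
  ),
  iter AS (
    SELECT x, y, cx, cy, 0.0 AS zx, 0.0 AS zy, 0 AS iteration FROM pixels
    UNION ALL
    SELECT x, y, cx, cy,
           zx * zx - zy * zy + cx,
           2.0 * zx * zy + cy,
           iteration + 1
    FROM iter
    WHERE iteration < :n AND zx * zx + zy * zy <= 4.0
  )
SELECT x, y, MAX(iteration) FROM iter GROUP BY x, y
"""

def run_sqlitebrot(width, height, max_iterations):
    """Return (elapsed_ms, depth_grid) for a width x height Mandelbrot render."""
    con = sqlite3.connect(":memory:")
    start = time.perf_counter()
    rows = con.execute(
        MANDELBROT_SQL, {"w": width, "h": height, "n": max_iterations}
    ).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000
    # Scatter the (x, y, depth) rows into a height x width grid
    grid = [[0] * width for _ in range(height)]
    for x, y, depth in rows:
        grid[y][x] = depth
    return elapsed_ms, grid
```

SQLite lacks `generate_series` by default, so the sketch builds the pixel grid with two small recursive CTEs instead.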

Configuration

Adjust the benchmark parameters in main.py:

```python
WIDTH = 1400           # Image width in pixels
HEIGHT = 800           # Image height in pixels
MAX_ITERATIONS = 256   # Maximum recursion depth
```

Higher values = more detail, longer computation time.

Known Engine Compatibility

✅ Works Great

  • NumPy - Highly optimized with loop unrolling and vectorized operations (fastest!)
  • ClickHouse - Excellent performance, full precision
  • DuckDB - Excellent performance, proper DOUBLE precision
  • Pure Python - Reference implementation; a baseline for judging how fast the database engines really are
  • SQLite - Works but significantly slower due to recursive CTE overhead

Should Work (untested, please contribute 🤙)

  • PostgreSQL (with proper recursive CTE support)
  • others

Known Issues

  • Some engines might lack proper DOUBLE precision support and fall back to DECIMAL, which is a poor fit for fractals and leads to pixelated results
  • Watch out for type inference - explicit ::DOUBLE casts are critical!
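A quick illustration of the type-inference pitfall, using stdlib `sqlite3` (engines that support the `::DOUBLE` syntax would cast that way instead): with two integer operands, division silently truncates, which wrecks the pixel-to-complex-plane mapping.

```python
# Integer division truncates in SQLite; casting one operand to REAL
# restores the floating-point result the coordinate mapping needs.
import sqlite3

con = sqlite3.connect(":memory:")
truncated, exact = con.execute(
    "SELECT 7 / 2, CAST(7 AS REAL) / 2"
).fetchone()
# truncated is 3 (integer division), exact is 3.5
```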

What This Tests

This benchmark evaluates:

  1. Recursive CTE Performance - How efficiently engines handle deep recursion
  2. Floating-Point Precision - DOUBLE vs DECIMAL arithmetic accuracy
  3. Query Optimization - How well engines optimize complex recursive queries
  4. Scalability - Performance with increasing iterations and resolution

Contributing

Contributions very welcome! Especially:

  • New SQL engine implementations (PostgreSQL, MySQL, etc.)
  • Performance optimizations
  • Better visualization options
  • Benchmark result submissions

License

MIT License - See LICENSE file for details.

Credits

Created by Thomas Zeutschler, Ulrich Ludmann, and Jakub Jirak (the grand master of GPU fractals)

Inspired by the mathematical beauty of the Mandelbrot set and the curiosity about SQL engine performance.

Curious which database renders infinity fastest? Clone and find out! 🌀
