Skip to content

BenchBox 0.1.2

Choose a tag to compare

@joeharris76 joeharris76 released this 24 Feb 00:31
· 15 commits to main since this release

Added

  • DataFrame mode for all benchmarks — complete DataFrame query implementations across all 18 benchmarks including TPC-DS (102 queries), TPC-H (22 queries), SSB, ClickBench, NYC Taxi, TSBS DevOps, H2ODB, AMPLab, CoffeeShop, TPC-H Skew, and Data Vault. DataFrame platforms: Polars, DuckDB, DataFusion, PySpark, Pandas, Modin, Dask, and cuDF (GPU).
  • ASCII chart visualizations — terminal-native ASCII rendering replacing Plotly HTML charts. Seven chart types: performance bar, distribution box, query heatmap, comparison bar, diverging bar, summary box, and query latency histogram. ANSI colors, Unicode box-drawing, and best/worst highlighting.
  • 14 new SQL platform adapters — PostgreSQL, Trino, PrestoDB, Apache Spark, AWS Athena, Azure Synapse, Microsoft Fabric, Firebolt, MotherDuck, InfluxDB 3.x, TimescaleDB, ClickHouse Cloud, Onehouse Quanton, and managed Spark variants (EMR, Dataproc, Glue, Fabric Spark, Synapse Spark, Dataproc Serverless).
  • Open table format support — Delta Lake, Apache Iceberg, Apache Hudi, DuckLake, and Vortex columnar format with format conversion orchestration and manifest v2 for multi-format tracking.
  • Physical tuning DDL generation — platform-specific DDL generators for DuckDB, Snowflake, Redshift, BigQuery, ClickHouse, Firebolt, PostgreSQL, TimescaleDB, Trino/Presto/Athena, and Spark family with sort keys, partitioning, clustering, and compression.
  • Query plan capture and comparison — plan parsers for DuckDB, PostgreSQL, Redshift, DataFusion, and SQLite. Comparison engine with regression detection, fingerprinting, historical tracking with flapping detection, and CLI visualization.
  • Interactive CLI wizard — guided benchmark configuration with platform selection, tuning wizard, scale factor validation, phase/query selection, onboarding, and persistent preferences.
  • TPC-DI benchmark — complete implementation across 4 phases: core schema, query suite, ETL pipeline, and validation/testing.
  • Cross-platform comparison enginebenchbox compare command with multi-platform analysis, SQL vs DataFrame comparison, and unified visualization.
  • Unified tuning configuration — YAML-based tuning system with per-platform DDL generation, write-time physical layout configuration, and dry-run preview.
  • Cloud storage and deployment modes for S3/GCS/ADLS/DBFS with credential setup wizard and cost estimation.
  • TPC compliance improvements: stream-aware validation, query permutations, warmup/measurement iterations, maintenance operations (RF1/RF2), and --seed for reproducibility.
  • New benchmarks: AI/ML Primitives, Metadata Primitives, Write Primitives, Transaction Primitives.
  • MCP server: suggest_charts and generate_chart tools; --queries and --validation-mode CLI flags; tiered --help.

Fixed

  • TPC-DS data generation reliability — segfaults with fractional scale factors, parallel generation errors, streaming compression, and chunked file handling.
  • Cloud platform stability — credential refresh errors, schema creation ordering, UC Volume uploads, S3 key handling, and BigQuery/Snowflake/Redshift/Databricks adapter issues.
  • Type safety — 150+ type errors resolved across production code with proper annotations and TYPE_CHECKING imports.
  • SQL dialect translation — SQLGlot compatibility for DuckDB, ClickHouse, DataFusion, and Netezza; reserved keyword quoting and identifier case sensitivity.
  • Security hardening: SQL injection prevention, parameterized queries, path traversal protection.
  • CLI hanging in non-interactive mode, progress display precision, --quiet mode propagation.
  • TPC compliance: correct stream permutations, maintenance phase SQL execution, Power@Size calculation parity between SQL and DataFrame modes.

Changed

  • Dropped Plotly HTML charts in favor of ASCII-only rendering.
  • Lazy-load cloud platform adapters to speed up CLI startup and test suite.
  • Optimized TPC-DS smoke tests with selective table generation.

Full Changelog: https://github.com/joeharris76/BenchBox/blob/main/CHANGELOG.md#012---2026-02-09