
FabLab SQL Endpoint Benchmark

Performance benchmark comparing Microsoft Fabric SQL endpoints using TPC-DS-inspired queries.

What this benchmarks

Four endpoints are tested under identical conditions:

Endpoint                Type                    Delta config
----------------------  ----------------------  -----------------------------
lakehouse_default       Lakehouse SQL endpoint  No partitioning, no V-Order
lakehouse_partitioned   Lakehouse SQL endpoint  PARTITION BY ss_sold_date_sk
lakehouse_vorder        Lakehouse SQL endpoint  V-Order enabled at write time
warehouse               Fabric Warehouse        Standard configuration
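
The three Lakehouse variants differ only in how the Delta tables are written. A hypothetical PySpark sketch (it needs a Fabric Spark session; schema/table names, the source path, and the V-Order option spelling are illustrative assumptions — the actual ingestion lives in ingestion/01_lakehouse_ingest.ipynb):

```python
# Illustrative only: requires a Fabric Spark session; schema and table
# names are placeholders, not the repository's actual ingestion code.
df = spark.read.csv("Files/tpcds/store_sales", header=True, inferSchema=True)

# lakehouse_default: plain Delta, no partitioning, no V-Order
df.write.format("delta").mode("overwrite") \
    .saveAsTable("lakehouse_default.store_sales")

# lakehouse_partitioned: partition by the sold-date surrogate key
df.write.format("delta").mode("overwrite") \
    .partitionBy("ss_sold_date_sk") \
    .saveAsTable("lakehouse_partitioned.store_sales")

# lakehouse_vorder: request V-Order at write time
# (option name as documented for Fabric Spark; verify against current docs)
df.write.format("delta").mode("overwrite") \
    .option("parquet.vorder.enabled", "true") \
    .saveAsTable("lakehouse_vorder.store_sales")
```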

Five TPC-DS-inspired queries are run at SF100 (~100 GB) scale in two cache modes:

  • Cold — first run after capacity resume (true cold cache, guaranteed by pause/resume cycle)
  • Warm — 3 repetitions on a hot, active capacity
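
The cold/warm timing logic can be sketched as follows; `execute` and the repetition count are illustrative stand-ins, not the actual benchmark/runner.py interface:

```python
import time
from statistics import median

def time_query(execute, warm_repetitions=3):
    """Time one cold run followed by warm repetitions of a query.

    `execute` is any zero-argument callable that submits the query to an
    endpoint; this is an illustrative harness, not the repository's
    runner.py implementation.
    """
    start = time.perf_counter()
    execute()  # first run after capacity resume = cold cache
    cold = time.perf_counter() - start

    warm = []
    for _ in range(warm_repetitions):
        start = time.perf_counter()
        execute()  # subsequent runs hit a hot, active capacity
        warm.append(time.perf_counter() - start)

    return {"cold_s": cold, "warm_s": warm, "warm_median_s": median(warm)}
```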

Repository layout

fablab-sql-endpoint/
├── provision/          ← One-time Fabric resource setup + capacity manager
├── data_generation/    ← TPC-DS data generation (dsdgen) + OneLake upload
├── ingestion/          ← Spark notebooks to load CSVs → Delta / Warehouse
├── sql/                ← Five benchmark SQL queries (q01–q05)
├── benchmark/          ← Main runner, config, connection and utilities
├── analysis/           ← Results analysis notebook (charts + statistics)
├── results/            ← Output CSV/JSON files (committed to repo)
├── specs/              ← Authoritative project specification (Spanish)
├── .env.example        ← Environment variable template
└── requirements.txt

Documentation by folder

Folder            What its README covers
----------------  -----------------------------------------------------------------------
provision/        Create the Fabric workspace/Lakehouse/Warehouse; capacity pause/resume
data_generation/  Generate TPC-DS CSVs with dsdgen; split, gzip and upload to OneLake
ingestion/        Load CSVs into the 3 Lakehouse schemas and the Warehouse via Spark
benchmark/        Run the benchmark; config reference; output format
results/          Output CSV/JSON files, committed to the repo

End-to-end workflow

1. provision/setup_fabric.py       → create workspace, Lakehouse, Warehouse
2. data_generation/generate_csv.py → generate TPC-DS data with dsdgen
3. data_generation/upload_to_onelake.py → split, gzip, upload to OneLake
4. ingestion/01_lakehouse_ingest.ipynb  → load data into 3 Lakehouse schemas
5. ingestion/02_warehouse_ingest.sql    → load data into Warehouse (CTAS)
6. benchmark/runner.py             → run benchmark (cold + warm blocks)
7. analysis/analyze_results.ipynb  → compare results across endpoints

Prerequisites

# Install Python dependencies
pip install -r requirements.txt

# Authenticate
az login

Copy .env.example to .env and fill in the values printed by provision/setup_fabric.py.
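
Loading the .env values at startup takes only a few lines of standard-library Python. A minimal stdlib-only sketch (the repository may rely on a package such as python-dotenv instead; the variable names come from whatever setup_fabric.py prints):

```python
import os
from pathlib import Path

def load_env(path=".env"):
    """Load KEY=VALUE pairs from a .env file into os.environ.

    Stdlib-only sketch: skips blank lines and comments, and does not
    overwrite variables already set in the environment.
    """
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())
```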


Results

Benchmark output files are committed to results/:

File                            Scale  Description
------------------------------  -----  ------------------------------------------------
benchmark_20260410T160604.csv   SF100  Raw results: all 4 endpoints, Q1–Q5, cold + warm
benchmark_20260410T160604.json  SF100  Same data in JSON format

Open analysis/analyze_results.ipynb to generate comparison charts from any results file.
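
The aggregation behind those charts boils down to grouping durations by endpoint. A stdlib-only sketch, assuming hypothetical column names (endpoint, cache_mode, duration_s) rather than the actual output schema — check the files in results/ for the real one:

```python
import csv
from collections import defaultdict
from statistics import median

def median_warm_by_endpoint(csv_path):
    """Median warm-cache duration per endpoint from a results CSV.

    Column names here are assumptions for illustration; inspect the
    committed results files for the actual schema.
    """
    durations = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["cache_mode"] == "warm":
                durations[row["endpoint"]].append(float(row["duration_s"]))
    return {endpoint: median(values) for endpoint, values in durations.items()}
```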
