A minimal set of bash scripts to run the Join Order Benchmark (JOB) against:
- vanilla PostgreSQL (current
my_postgres10build); - Bao (learned planner-hook on a separate PG fork);
- Neo (neural optimizer, hints via
pg_hint_plan); - Balsa (successor to Neo by the same authors);
- HyperQO / AlphaJoin (hint-based learned optimizers — replayed from prior measurements in
plot_data/all_methods.csv); - SkinnerDB (standalone Java DBMS, no PostgreSQL);
- MCTS-Extreme (this repo's
contrib/mcts_extreme); - AQO (this repo's
contrib/aqo— adaptive cardinality learning); - SAIO (ported PGXN extension
contrib/saio— simulated-annealing join-order search; see SAIO_REPORT.md for the port story and bug fixes).
Each base runner does the same three things:
- Reads queries from
$QUERY_FILES(default source/queries/). - Saves the
EXPLAINplan toplans/<method>/<query>.plan. - Runs each query
ITERStimes and writes timings toresults/<method>_job.csv.
After a run, collect.sh aggregates all results/*_job.csv into a single wide table results/job_aggregate.csv (median / p95 / min / max / n per query and method).
| File / dir | Purpose |
|---|---|
| lib.sh | Shared variables (PG_BASE, INSTDIR, PGDATA, PGPORT, QUERY_DIR, ITERS, …) and helpers (pg_ensure_up, run_query_once, save_plain_explain, csv_header). Every other script does source ./lib.sh. |
| job_create.sh | Creates the imdb DB, applies schema.sql, loads CSVs via copy.sql, creates FK indexes, runs VACUUM ANALYZE. |
| job_run.sh | Runs against vanilla PG (method plain_pg). |
| job_run_bao.sh | Runs through Bao (port 5500, separate PG build, requires bao_server running). |
| job_run_neo.sh | Runs through Neo (port 5501, requires pg_hint_plan and pre-generated hints). |
| job_run_balsa.sh | Runs through Balsa (port 5502, requires pg_hint_plan and hints). |
| job_run_skinner.sh | Runs through SkinnerDB (Java jar, no PG). |
| bao_train_and_test.sh | Online-train Bao on JOB (EPOCHS passes, retrains every 25 queries), then one clean test pass into results/bao_job.csv. |
| job_run_mcts_approbation.sh | Approbation sweep for MCTS along the reward_mode × luby × rollout axes, on the production-best substrate. |
| job_run_mcts_ablation.sh | Ablation: start from production-best, turn ONE feature off at a time, measure the cost of removing each (gate, top_k, depth, kernels, luby, …). |
| job_run_mcts_aqo.sh | Three-way: plain_pg / mcts_only / mcts_aqo. Measures how much per-query cardinality correction (AQO learn) helps MCTS converge across the 5 reruns. |
| job_compare_methods.sh | Head-to-head comparison: pg / mcts / pg_aqo / mcts_aqo over JOB; AQO in controlled mode for stable cardinality estimates. |
| job_cardinality_analysis.sh | Per-node q-error analysis via EXPLAIN (ANALYZE, VERBOSE, BUFFERS) for the same configs. Phase B of the cardinality study. |
| job_dump_plans.sh | Standalone EXPLAIN dumper — saves plans per (config, query) and a side-by-side digest. |
| job_greedy_mcts.sh | Coordinate-descent (greedy) tuner over MCTS GUCs. Supports env-var pinning for multi-pass runs. |
| job_run_saio_vs_pg.sh | pg vs saio (defaults) on JOB queries with n_rels > 11 (GEQO threshold). Captures planning_ms and exec_ms separately. |
| job_run_saio_configs.sh | Multi-config SAIO sweep — pg plus saio_default / saio_mid / saio_cheap / saio_cheap_r3 / saio_cheap_r5 over JOB. Drives the SAIO findings in SAIO_REPORT.md. |
| job_saio_tuning_pilot.sh | SAIO parameter pilot on 4 worst-case JOB queries — exploratory sweep before launching the full multi-config run. Writes to results/saio_tuning/<run-id>/. |
| aqo_setup.sh | Builds & installs AQO, adds it to shared_preload_libraries, raises hash-table limits (fs_max_items, fss_max_items, dsm_size_max), restarts PG, runs CREATE EXTENSION aqo. |
| aqo_train.sh | Trains AQO over N iterations on the JOB workload, with optional --with-mcts to train on MCTS-shaped plans, --no-reset to keep prior state. Tracks per-iter geomean exec time, learned-query count, and AQO q-error from aqo_query_stat. |
| collect.sh | Folds results/*_job.csv into results/job_aggregate.csv. |
| source/ | Fork of join-order-benchmark: schema.sql, copy.sql, fkindexes.sql, IMDb CSV data (source/csv/), 113 queries (source/queries/). |
| articles/ | Reading list (PDFs) organised by topic: base/ (Selinger, Monma-Sidney, DPhyp, JOB), AI/ (Neo, Bao, Balsa, DQ, DeepDB, RTOS, HyperQO, JOGGER, LLMSteer, surveys), MCTS/ (AlphaJoin, HyperQO, our practical-MCTS draft), randomised/ (Ioannidis-Kang, quantum ant-colony, LOAM), reopt/ (SkinnerDB, QuerySplit, Liu-Ives, Robust JO), extra/. |
| plot_data/ | Pre-aggregated CSVs that feed the cross-method plot scripts: all_methods.csv (wide format: query,n_rel,pg_e2e_ms,hq_e2e_ms,alpha_e2e_ms,mcts_e2e_ms,bao_e2e_ms,neo_e2e_ms,skinner_e2e_ms), e2e_by_size.csv. |
plans/<method>/ |
(created on run) plain-text EXPLAIN plans. |
results/<method>_job.csv |
(created on run) query,iter,exec_ms. |
results/compare/<run-id>/ |
(created on run) MCTS-vs-AQO compare-run output: per_query.csv, summary.csv, plans/<cfg>/. |
results/compare_saio/<run-id>/ |
(created on run) SAIO compare runs: per_query.csv with planning_ms,exec_ms, summary.csv. |
results/saio_tuning/<run-id>/ |
(created on run) SAIO tuning pilots. |
logs/ |
(created on run) cluster logs. |
| SAIO_REPORT.md | Port of PGXN saio to PG 19devel: API migration, two runtime bugs (get_all_nodes_rec losing nodes; saio_move early-exit), saio_restarts GUC, and the empirical SAIO-vs-PG comparison on all 113 JOB queries. |
| RESULTS.md | Aggregated results / talk-track for the JOB head-to-head numbers. |
All plot scripts write into plots/ and accept an optional --src=PATH to point at an explicit CSV instead of the latest auto-detected run.
| Script | Inputs | Outputs |
|---|---|---|
| make_compare_plots.py | $BENCH_ROOT/results/compare/<latest>/per_query.csv |
slide37_worst_best_ratio.png, e2e_scatter_all.png, e2e_per_query_ratio_sorted.png. |
| make_e2e_scatter_split.py | plot_data/all_methods.csv |
One log-log scatter per method vs PG (e2e_scatter_{hq,alpha,bao,neo,skinner,mcts}.png) — same data as the 2×3 e2e_scatter_all.png grid. |
| make_method_vs_pg_per_query.py | plot_data/all_methods.csv |
Per-algorithm grouped bar charts (PG vs method, one bar per query). Flags: --min-n=N / --max-n=N to restrict by n_rel, --tag=STR to suffix the filename (e.g. _small / _large), --rows=N to force vertical wrap into N subplot rows, --horizontal for tall barh layout. |
| make_all_methods_per_query.py | plot_data/all_methods.csv |
Single grouped bar chart with all 7 methods (PG · HyperQO · AlphaJoin · Bao · Neo · SkinnerDB · MCTS) per query, log Y. Flags: --min-n=N, --out=PATH. |
| make_saio_plots.py | $BENCH_ROOT/results/compare_saio/<latest>/per_query.csv (pg + saio) |
saio_e2e_scatter.png (planning / execution / total scatter), saio_e2e_ratio_sorted.png, saio_planning_vs_exec.png (stacked side-by-side bars). |
| make_saio_configs_plots.py | $BENCH_ROOT/results/compare_saio/<latest>/per_query.csv (pg + N saio variants) |
saio_configs_e2e_scatter.png, saio_configs_e2e_ratio.png, saio_configs_by_nrels.png, saio_configs_planning_vs_exec.png. |
| make_per_query_planning_exec.py | $BENCH_ROOT/results/compare_saio/<latest>/per_query.csv |
One stacked-bar PNG per query in plots/per_query/ — every non-pg config side-by-side with PG. Flags: --min-n=N (default 12), --queries=10a,11b,.... |
| analyze_compare.py | $BENCH_ROOT/results/compare/<latest>/per_query.csv |
Numeric breakdown of AQO's effect on PG vs MCTS (pg_aqo/pg vs mcts_aqo/mcts); writes plots/aqo_improvement.png plus updates the slide37 / scatter PNGs. |
Environment variables (from lib.sh)
Override with export before invoking the script.
| Variable | Default | Meaning |
|---|---|---|
PG_BASE |
$HOME/my_postgres10 |
Single point of customization. All other PG paths derive from this. |
PG_DATA_NAME |
vacuum_stats9 |
Subdir name under $PG_BASE for PGDATA. |
INSTDIR |
$PG_BASE/my/inst/bin |
PG bin directory. |
PGDATA |
$PG_BASE/$PG_DATA_NAME |
Cluster data dir. |
PGPORT |
5499 |
Vanilla PG port. |
PGUSER |
$(whoami) |
|
PGDATABASE |
postgres |
Used before imdb is created. |
QUERY_DIR |
$HOME/source |
Root of the JOB fork (schema.sql, csv/, queries/ live here). |
QUERY_FILES |
$QUERY_DIR/queries |
Directory with *.sql. |
BENCH_ROOT |
$HOME/min_job |
Where to write plans/, results/, logs/. |
ITERS |
5 |
Reruns per query. |
STATEMENT_TIMEOUT_MS |
600000 |
10 minutes per query. |
Each learned optimizer brings its own *_PORT, *_PGDATA, *_INSTDIR, *_HINTS_DIR — see the header of the matching job_run_*.sh.
# 0. Bring up the cluster and confirm it listens on 5499
# (lib.sh::pg_ensure_up does this automatically).
# 1. Create the imdb DB from CSVs (source — source/)
export QUERY_DIR=$HOME/min_job/source
./job_create.sh imdb
# 2. Baseline run against vanilla PG
./job_run.sh imdb 5 # 5 iters per query -> results/plain_pg_job.csv
# 3. (Optional) AQO setup + training
./aqo_setup.sh imdb # extension + raised hash limits + restart
./aqo_train.sh imdb 30 # 30 iters; AQO learns from DP/GEQO plans
./aqo_train.sh imdb 30 \ # additional pass: AQO also learns
--with-mcts \ # from MCTS-shaped plans
--no-reset # preserves prior learning
# 4. Each learned optimizer (see prerequisites below)
./job_run_bao.sh imdb 5
./job_run_neo.sh imdb 5
./job_run_balsa.sh imdb 5
./job_run_skinner.sh 5
# 5. Head-to-head: pg vs mcts, with and without AQO (4 configs × 113 queries × ITERS)
./job_compare_methods.sh imdb 5
# 6. SAIO comparisons (optional)
./job_run_saio_vs_pg.sh imdb 3 12 # 2-way: pg vs saio on n_rels>=12
./job_run_saio_configs.sh imdb 3 12 # multi-config sweep
# 7. Build plots from the latest runs
python3 make_compare_plots.py # pg/mcts/aqo comparisons
python3 make_saio_plots.py # saio 2-way
python3 make_saio_configs_plots.py # saio multi-config
python3 make_per_query_planning_exec.py # per-query stacked bars
python3 make_all_methods_per_query.py # all 7 optimizers, one chart
python3 make_method_vs_pg_per_query.py --max-n=11 --tag=small --rows=2
python3 make_method_vs_pg_per_query.py --min-n=12 --tag=large
# 8. Aggregate everything
./collect.sh # -> results/job_aggregate.csvThis repo's contrib/mcts_extreme adds an MCTS-based join-search planner; contrib/aqo learns cardinality predictions from past executions. They compose: MCTS provides the join order, AQO provides the row estimates that drive cost-based pruning inside MCTS.
The typical pipeline:
# A. Build & install AQO; raise hash-table limits in postgresql.conf;
# restart PG; CREATE EXTENSION aqo.
./aqo_setup.sh imdb
# B. Train AQO over N iters (mode='learn' by default).
./aqo_train.sh imdb 30 # AQO learns from PG-DP/GEQO plans
./aqo_train.sh imdb 30 --with-mcts --no-reset # AQO also learns from MCTS plans
# Outputs: results/aqo_train/<run-id>/{summary.csv,per_query.csv,log.txt}
# C. Compare planning methods at AQO-converged cardinalities.
./job_compare_methods.sh imdb 5
# Six possible configs (default: 4):
# pg PG default join search. No AQO.
# mcts MCTS-extreme on ALL queries. No AQO.
# pg_aqo PG + AQO controlled (uses learned cardinalities).
# mcts_aqo MCTS + AQO. ★ flagship.
# dp_only diagnostic: force DP for all n_rels.
# geqo_only diagnostic: force GEQO for all n_rels.
# D. Per-node cardinality q-error analysis on the same configs.
./job_cardinality_analysis.sh imdb 1 # uses EXPLAIN (ANALYZE, VERBOSE)
# E. Ablation: cost of removing each MCTS feature from production-best.
./job_run_mcts_ablation.sh imdb 5
# Variants: pg, best, no_gate, k0_bushy, k_heur, k_bandit, with_topk5,
# low_depth, low_budget, no_luby, reward_avg
# F. Plots.
python3 make_compare_plots.pyjob_compare_methods.sh saves per-query plans alongside its CSV — output layout:
$RESULTS_DIR/compare/<run-id>/
├── per_query.csv (query,n_rels,config,iter,exec_ms,top_cost)
├── summary.csv (per-config geomean / median / wins-vs-pg / losses-vs-pg / geo_ratio_vs_pg)
├── log.txt
└── plans/
├── pg/<query>.plan
├── mcts/<query>.plan
├── pg_aqo/<query>.plan
└── mcts_aqo/<query>.plan
For visual per-query comparison, use job_dump_plans.sh to produce a side-by-side digest:
./job_dump_plans.sh imdb 0 # 0 = plain EXPLAIN (fast). 1 = EXPLAIN ANALYZE (slow).
# -> $PLANS_DIR/<config>/<query>.plan
# -> $PLANS_DIR/diff/<query>.txt -- aligned blocks of all configscontrib/saio is this repo's port of the PGXN saio extension (J. Urbański's simulated-annealing join planner) onto PG 19devel. The port fixes two long-standing runtime bugs and adds multi-restart via the new saio_restarts GUC — see SAIO_REPORT.md for the full story.
The driver scripts all share output schema query,n_rels,config,iter,planning_ms,exec_ms,top_cost, so all SAIO plot scripts work on any of their outputs.
# A. Build SAIO and ensure it's on the dynamic-library path.
cd ../contrib/saio && make && make install
# B. Quick pilot: SAIO defaults vs custom tunings on the 4 worst-case queries.
./job_saio_tuning_pilot.sh imdb 3
# -> $RESULTS_DIR/saio_tuning/<run-id>/per_query.csv
# C. Head-to-head: pg vs saio (defaults) on all queries with n_rels > 11.
./job_run_saio_vs_pg.sh imdb 3 12
# -> $RESULTS_DIR/compare_saio/<run-id>/per_query.csv
# D. Multi-config sweep: pg + saio_default + saio_cheap + saio_cheap_r3 …
./job_run_saio_configs.sh imdb 3 12
# E. Plots from any of the above runs.
python3 make_saio_plots.py # 2-config run (pg + saio)
python3 make_saio_configs_plots.py # N-config sweep
python3 make_per_query_planning_exec.py # one PNG per query
python3 make_per_query_planning_exec.py --min-n=0 # … including small queriesmake_per_query_planning_exec.py auto-detects every non-pg config in the run and stacks them side-by-side per query, so the same plot script works for a 2-way pg vs saio comparison and for the full pg / saio_default / saio_cheap / saio_cheap_r3 matrix.
The headline SAIO result on JOB (113 queries, saio_cheap config): SAIO wins outright on 10 queries (9%), loses within 2× on 51%, loses 2–5× on 29%; the most dramatic single-query speedup is 31c (n_rels=11) — PG 20.3 s → SAIO 0.9 s — and 31c is a DP query, not a GEQO one. Full breakdown in SAIO_REPORT.md.
plot_data/all_methods.csv is the canonical wide-format CSV behind every cross-method chart. It pre-aggregates per-query medians for pg / hq / alpha / bao / neo / skinner / mcts so the plotting scripts don't need to re-run any optimizer:
# Single combined chart, all queries.
python3 make_all_methods_per_query.py
# Same chart, restricted to n_rel >= 12 (the GEQO-trigger subset).
python3 make_all_methods_per_query.py --min-n=12 --out=plots/all_methods_per_query_n12.png
# One PG-vs-method bar chart per algorithm, split into "small" (n<12) and "large" (n>=12).
python3 make_method_vs_pg_per_query.py --max-n=11 --tag=small --rows=2
python3 make_method_vs_pg_per_query.py --min-n=12 --tag=large
# One log-log scatter per method (vs PG) — same data as the e2e_scatter_all.png grid.
python3 make_e2e_scatter_split.py--rows=N for vertical bar charts wraps a long query list into N stacked subplot rows (useful when the small subset is 93 queries wide); --horizontal flips to one row per query.
All learned optimizers below are external projects — not vendored here. The job_run_*.sh scripts assume you cloned them into $HOME and built them. CPU-only PyTorch is sufficient for the ML methods (no CUDA needed); training on CPU takes hours.
Adapted forks at github.com/Alena0704 — every repo referenced below has a fork on my GitHub patched to build against the current PostgreSQL master (PG 19devel) used in this repo. Upstream branches in many of these projects target PG 11–14 and no longer compile cleanly against master (API drift in
pathnodes.h, list-as-array changes,pg_prng_*, extension-state API, …). Prefer cloning my forks when you reproduce the JOB pipeline here; the upstream repos are still listed for traceability.Pinned upstream PG commit. All forks here are built and tested against PostgreSQL master at:
a0a0c0c20ec5f8787bb1be5f476c4e59f6810634— "Skip other sessions' temp tables in REPACK, CLUSTER, and VACUUM FULL" (Álvaro Herrera, 2026-05-05)This is the commit that
origin/HEADof the PG mirror in this repo points to. If you build against a substantially newer master, expect minorpathnodes.h/ planner-API drift; the patch series in each fork tracks this commit and may need a small rebase forward.Direct fork links: saio · aqo · Neo · balsa · BaoForPostgreSQL · skinnerdb · jo-bench · parameterized-jo-bench · pg_track_optimizer · postgres (PG master mirror with my patches).
Bao — upstream learnedsystems/BaoForPostgreSQL · fork Alena0704/BaoForPostgreSQL (adapted to current master)
git clone https://github.com/Alena0704/BaoForPostgreSQL ~/BaoForPostgreSQL
cd ~/BaoForPostgreSQL/pg_extension && make USE_PGXS=1 install
initdb -D ~/bao_pgdata
pg_ctl -D ~/bao_pgdata -o "-p 5500" start
psql -p 5500 postgres -c "CREATE EXTENSION pg_bao"
pip install --index-url https://download.pytorch.org/whl/cpu torch
pip install -r ~/BaoForPostgreSQL/bao_server/requirements.txt
export CUDA_VISIBLE_DEVICES=""
cd ~/BaoForPostgreSQL/bao_server && python main.py # listens on :9381Then re-run job_create.sh against port 5500 to load IMDb into the Bao cluster. Bao ships its own PG fork (planner modifications), so pg_bao cannot be loaded into a vanilla PG. Marcus et al., SIGMOD 2021.
job_run_bao.sh checks that bao_server is listening on http://localhost:9381; otherwise pg_bao silently falls back to the default planner, and timings would reflect PG rather than Bao.
Neo — upstream KostasMparmparousis/Neo · fork Alena0704/Neo (adapted to current master)
The original Marcus et al. (VLDB 2019) code was never released, so we use a community re-implementation. From 2022 onwards Neo has effectively been superseded by Balsa — if you pick one, pick Balsa.
git clone https://github.com/Alena0704/Neo ~/neo
# Build PG14 with pg_hint_plan:
git clone https://github.com/ossc-db/pg_hint_plan ~/pg_hint_plan
cd ~/pg_hint_plan && make PG_CONFIG=~/neo_pg/inst/bin/pg_config install
initdb -D ~/neo_pgdata
pg_ctl -D ~/neo_pgdata -o "-p 5501" start
echo "shared_preload_libraries = 'pg_hint_plan'" >> ~/neo_pgdata/postgresql.conf
pg_ctl -D ~/neo_pgdata restart
pip install --index-url https://download.pytorch.org/whl/cpu torch
pip install -r ~/neo/requirements.txt
export CUDA_VISIBLE_DEVICES=""
# Train Neo on the JOB train-split, then generate hints:
# ~/neo/hints/<query>.hint <- pg_hint_plan string per queryjob_run_neo.sh reads the hint from $NEO_HINTS_DIR/<query>.hint, loads pg_hint_plan, prepends the hint to the SQL, and measures.
Balsa — upstream balsa-project/balsa · fork Alena0704/balsa (adapted to current master)
Yang et al., SIGMOD 2022. Successor to Neo, trains without expert demonstrations.
git clone https://github.com/Alena0704/balsa ~/balsa
cd ~/balsa
./scripts/build_pg.sh # builds PG14 with pg_hint_plan and Balsa's patch -> ~/balsa/inst/
./scripts/init_db.sh # cluster on port 5502
pip install --index-url https://download.pytorch.org/whl/cpu torch
pip install -r ~/balsa/requirements.txt
export CUDA_VISIBLE_DEVICES=""
python -m balsa.train --workload=job --device=cpu
# Balsa writes per-query hints into ~/balsa/hints/job/<query>.hintjob_run_balsa.sh applies the hint and times via pg_hint_plan, exactly like Neo.
SkinnerDB — upstream cornelldbgroup/skinnerdb · fork Alena0704/skinnerdb (adapted to current master)
Trummer et al., SIGMOD 2019. Standalone Java DBMS — no PostgreSQL involved.
git clone https://github.com/Alena0704/skinnerdb ~/SkinnerDB
cd ~/SkinnerDB && mvn package
# Load IMDb into SkinnerDB's bundled storage:
java -jar target/skinnerdb-1.0-SNAPSHOT-jar-with-dependencies.jar \
--load $HOME/min_job/source/csvjob_run_skinner.sh pipes SQL into the jar via stdin and parses the Query took XXX ms line. Neither \timing nor pg_hint_plan apply here.
The upstream postgrespro/aqo master branch targets older PG releases. Two sources work against current master:
- postgrespro/aqo@master_development — upstream development branch, follows the core-side patch that adds the AQO hooks.
- Alena0704/postgres@aqo-patched-master — full PG-master fork with the AQO core-hooks patch (
contrib/aqo/aqo_master.patch) already applied; this is the branch the scripts here are tested against.
# Option A: build the PG fork that already has the AQO core hooks applied.
git clone -b aqo-patched-master https://github.com/Alena0704/postgres ~/my_postgres
cd ~/my_postgres && ./configure --prefix=$PWD/my/inst && make -j && make install
# AQO is in contrib/aqo/ on this branch; build & install it like any contrib module.
cd contrib/aqo && make && make install
# Option B: keep upstream PG and apply the dev-branch AQO on top.
git clone -b master_development https://github.com/postgrespro/aqo ~/aqo
# … then apply the matching core-hooks patch and rebuild PG (see aqo_setup.sh).
./aqo_setup.sh imdbAlena0704/saio is the fork of parkag/saio adapted to current master. The port story (API migration, two runtime bug fixes, new saio_restarts GUC) is documented in SAIO_REPORT.md. The actual extension code lives in this repo at contrib/saio/.
- The baseline
job_run_*.shscripts write CSVs in the samequery,iter,exec_msformat, so collect.sh picks up any new method automatically — just drop a file intoresults/<method>_job.csv. The SAIO and AQO comparison scripts use the wider schemaquery,n_rels,config,iter,planning_ms,exec_ms,top_cost(one row per config) and write underresults/compare/<run-id>/orresults/compare_saio/<run-id>/. - Bao is timed as plain wallclock (
\timing) — that's total cost (planning + execution); Bao injects hints viaplanner_hook. Neo and Balsa are timed against pre-generated hints, so inference time is not part ofexec_ms. HyperQO and AlphaJoin numbers inplot_data/all_methods.csvcome from prior measurements and include their own planning overhead. - The IMDb deployment has only been tested in C-locale (see source/README.md).
- AQO needs PG's core hooks. Don't apply patches by hand — clone Alena0704/postgres@aqo-patched-master (the branch already carries the AQO core-hooks patch on top of upstream master) and build from there.
- MCTS-Extreme's production-best config (
fixed K=1, min_relations=13, depth=8, top_k=0, expl=1.0, budget=100, phases=5) is documented inmin_job/adaptive_kernels/PARAM_STUDY.md. The comparison and ablation scripts usemin_relations=2so MCTS runs on every query — for a head-to-head with DP/GEQO across the full workload. - SAIO's "cheap" tuning (
saio_equilibrium_factor=4, saio_temperature_reduction_factor=0.7, saio_moves_before_frozen=2) brings planning overhead down from 5–11 s (defaults) to 0.3–0.8 s; the_r3/_r5variants enable multi-restart (saio_restarts=3/5) to recover from local minima. See SAIO_REPORT.md §3 for the ablation. plot_data/all_methods.csvis a pre-aggregated wide-format snapshot — regenerating it requires re-running the relevantjob_run_*.shand folding the outputs together; the cross-method plot scripts read this CSV directly and don't re-execute any queries.