This project spins up two containers:
- Postgres 16 (`db`)
- Python 3.11 benchmark runner (`bench`)
They share volumes; the bench container writes directly to a bind-mounted folder so results appear on the host as the run progresses. The benchmark generates a source table with 30 columns and performs experiments across table sizes of 2^N rows, measuring:
- Index storage as a percentage of table size
- SELECT latency using an index (cold-ish and hot runs)
- INSERT latency
- SELECT latency without using an index (cold-ish and hot runs)
- Table storage without indexes
Results are written incrementally to `results.json`.
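Because results are flushed after each (N, M) step, you can poll the file mid-run. A minimal sketch, assuming `results.json` holds a JSON array (the exact schema is the runner's; this only counts entries, and `OUT_DIR` here is an assumption matching the helper script's default):

```sh
# Count accumulated result tuples mid-run (OUT_DIR defaults to ./results)
OUT_DIR="${OUT_DIR:-./results}" python3 - <<'EOF'
import json, os

path = os.path.join(os.environ.get("OUT_DIR", "./results"), "results.json")
try:
    with open(path) as f:
        entries = json.load(f)
    print(f"{len(entries)} result tuples so far")
except FileNotFoundError:
    print("no results yet")
EOF
```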
Set via environment in `docker-compose.yml`:
- `MAX_N` (default 25): highest N; table rows = 2^N.
- `M_MAX` (default 30): sweep M from 1..M_MAX (1 = PK only). Additional B-tree indexes are created on c01..c{M-1}.
- `INSERT_BATCH` (default 1000): number of rows inserted for insert timing per N.
- `REPEATS` (default 3): target repeat count for timing samples when operations are slow.
- `TIME_THRESHOLD_S` (default 5): if the first measured times for a group of operations are below this threshold, use `FEW_REPEATS` instead of `REPEATS`.
- `FEW_REPEATS` (default 2): reduced repeat count used when times are below `TIME_THRESHOLD_S`.
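The repeat-count rule above can be stated compactly. This is an illustrative re-implementation of the described logic, not the runner's actual code:

```sh
# Illustrative only: the REPEATS / FEW_REPEATS rule from the settings above
python3 - <<'EOF'
REPEATS, FEW_REPEATS, TIME_THRESHOLD_S = 3, 2, 5  # defaults from this README

def pick_repeats(first_time_s):
    # First sample below the threshold -> use the reduced repeat count
    return FEW_REPEATS if first_time_s < TIME_THRESHOLD_S else REPEATS

print(pick_repeats(0.4), pick_repeats(12.0))  # prints: 2 3
EOF
```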
Quickest: use the provided helper script `run_benchmark.sh`.
Examples:

```sh
# Small smoke test with M sweep up to 5; writes results into ./results and tears down the stack afterwards
M_MAX=5 ./run_benchmark.sh -n 4 -b 1000 -o ./results --down

# Larger run (careful: 2^20 rows of prepopulation); keeps the stack running afterwards
M_MAX=8 REPEATS=5 TIME_THRESHOLD_S=5 FEW_REPEATS=2 ./run_benchmark.sh -n 20 -b 2000 -o ./results
```

Options:
- `-n MAX_N`: highest N (rows = 2^N). Default 25. The sweep for M is controlled via the env var `M_MAX` (default 30).
- `-b INSERT_BATCH`: insert batch size for timing. Default 1000.
- `-o OUT_DIR`: host folder to copy results into. Default ./results.
- `--down`: bring down the stack and remove volumes after the run.
Environment overrides: `MAX_N`, `M_MAX`, `INSERT_BATCH`, `OUT_DIR`, `BRING_DOWN=1`.
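A sketch of how such overrides typically resolve in a shell script via `${VAR:-default}` expansion (the defaults mirror the table above; the script's actual internals may differ):

```sh
# Resolve defaults unless overridden in the environment (sketch, not the script itself)
MAX_N="${MAX_N:-25}"
M_MAX="${M_MAX:-30}"
INSERT_BATCH="${INSERT_BATCH:-1000}"
echo "rows=$((2 ** MAX_N)) m_max=$M_MAX batch=$INSERT_BATCH"
# with no overrides set: rows=33554432 m_max=30 batch=1000
```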
How results are written
- The runner binds your chosen OUT_DIR to `/bench` in the `bench` container, so files appear on the host immediately after each (N, M) step.
- Files:
  - `OUT_DIR/results.json`: compact tuples for plotting
  - `OUT_DIR/results_full.json`: full structured results
Or manually via docker compose:

```sh
docker compose build
docker compose up --remove-orphans
```

Watch the bench container's logs (`docker compose logs -f bench`) for progress. Results will appear in OUT_DIR while the run is active; no copy step is required.
- "Cold" timings are first-run approximations; true cold cache requires superuser privileges or OS cache control which are not applied here.
- The planner is nudged to use or avoid indexes via `enable_*` GUCs.
- The dataset is pseudo-random but correlated, to reflect more realistic distributions.
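One common way to get correlated pseudo-random columns is to mix a shared per-row base value into each column. The sketch below illustrates that idea only; it is not the generator the benchmark actually uses:

```sh
python3 - <<'EOF'
import random
random.seed(42)  # deterministic for illustration

def correlated_row(ncols=5, corr=0.8):
    base = random.random()  # shared component drives the correlation
    # each column = corr * shared base + (1 - corr) * independent noise
    return [corr * base + (1 - corr) * random.random() for _ in range(ncols)]

row = correlated_row()
print(len(row), all(0.0 <= v <= 1.0 for v in row))  # prints: 5 True
EOF
```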
```sh
./run_benchmark.sh -n 25 -m 20 -r 1 -f 10 -b 10
scp -r testdb:postgres-demo/results/'*'.json results/ && python3 plot.py
```

Hardware: Standard D4ads v6 (4 vCPUs, 16 GiB memory).