LLM Query Performance Testing

This benchmark is designed to measure the experiential impact of database performance on LLM chat interactions. Unlike traditional benchmarks, this focuses on showing user experience differences rather than raw metrics.

Compare OLAP (ClickHouse) vs OLTP (PostgreSQL) using LLM-style query patterns with npm run bulk-test
Demonstrate how query latency affects AI-powered data conversations with npm run latency-sim

📊 Results → ClickHouse is faster on datasets >50K records (up to 16.8x faster at 10M records).

The above chat shows the resulting chat performance, at 10m rows, of Postgres (unindexed) and ClickHouse.

This repo ships with results for tests run from 10k to 10m rows (see Benchmarking Methodology for details).

Prerequisites

Node.js v18+
Docker v20+

Setup

# Get the code
git clone https://github.com/514-labs/llm-test.git
cd llm-test

# Setup and build
npm install
npm run build

# Configure (optional - defaults work fine)
cp .env.example .env

Run chat sim

Since the repo ships with results, you can run the chat sim with: npm run latency-sim.

Run your own tests

If you want to run the benchmark yourself:

# Run Docker Desktop

# Optionally, edit .env to set the test sizes you are interested in. 
# Default: BULK_TEST_SIZES=10000,50000,100000,500000,1000000,5000000,10000000

# Run the bulk test tool
npm run bulk-test

Commands

npm start - Run Benchmark (data generation + queries)
npm run query-test - Query-only test with statistical analysis (requires databases running/populated)
npm run bulk-test - Comprehensive testing (configurable databases and dataset sizes)
npm run generate-graphs - Generate performance visualizations
npm run latency-sim - Interactive LLM conversation delay simulator
npm run clean - Reset databases and clear results

CLI options: see npm run help. Use -- --help for details for a particular command.

Data

46-column aircraft tracking records (position, altitude, transponder codes). See Schema Comparison

Query Pattern

LLM simulation: "How many aircraft are in the air on average every minute for the past hour?"

Q1: Table discovery
Q2: Schema exploration
Q3: Hourly counts
Q4: Average calculation

Configuration

Key .env settings: BULK_TEST_SIZES, BULK_TEST_DATABASES, BULK_TEST_CLEAR_RESULTS, DATASET_SIZE, BATCH_SIZE, PARALLEL_INSERT, container resources.

🔧 Full config →

Output

output/test-results.{json,csv} with timing data and 95% confidence intervals.

Methodology

Deterministic data (seeded)
3x warmup queries
95% CI across 100+ iterations
Database-specific optimizations

🔬 Details → | Schema →

Troubleshooting

Connection errors: docker ps
Memory issues: reduce DATASET_SIZE
Port conflicts: check 8123, 5432, 5433
Reset: npm run clean && npm run kill-dbs && npm run start-dbs

Built by the team behind the open source Moose Stack.

Name		Name	Last commit message	Last commit date
Latest commit History 44 Commits
.github		.github
output		output
src		src
tests		tests
visualization		visualization
.env		.env
.env.example		.env.example
.env.test		.env.test
.gitignore		.gitignore
BENCHMARK_METHODOLOGY.md		BENCHMARK_METHODOLOGY.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONFIGURATION.md		CONFIGURATION.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
RESULTS.md		RESULTS.md
SCHEMA_COMPARISON.md		SCHEMA_COMPARISON.md
SECURITY.md		SECURITY.md
blog-db-learnings.md		blog-db-learnings.md
cleanup-containers.sh		cleanup-containers.sh
latency-sim-10m.gif		latency-sim-10m.gif
package-lock.json		package-lock.json
package.json		package.json
tsconfig.json		tsconfig.json
visualization-data.json		visualization-data.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

LLM Query Performance Testing

Prerequisites

Setup

Run chat sim

Run your own tests

Commands

Data

Query Pattern

Configuration

Output

Methodology

Troubleshooting

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 3

Uh oh!

Languages

License

514-labs/LLM-query-test

Folders and files

Latest commit

History

Repository files navigation

LLM Query Performance Testing

Prerequisites

Setup

Run chat sim

Run your own tests

Commands

Data

Query Pattern

Configuration

Output

Methodology

Troubleshooting

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 3

Uh oh!

Languages

Packages