This is a demo project for the "ClickHouse for Symfony Developers" talk presented at SymfonyCon 2025.
This project demonstrates how to integrate ClickHouse with Symfony applications and compares its performance against traditional relational databases (PostgreSQL, MySQL, MariaDB) and Elasticsearch for analytical workloads.
The benchmark suite provides a systematic way to compare query performance across different database engines. Here's how it works:
Creates identical table schemas across all supported databases:
- PostgreSQL, MariaDB, MySQL: Creates a `test_int_32` table with columns for `id` (auto-increment), `datetime` (timestamp), and `intValue` (integer), along with an index on the `datetime` column
- Elasticsearch: Creates an index `test_int_32` with mappings for `datetime` and `intValue` fields
- ClickHouse: Creates a `test_int_32` table using the MergeTree engine, ordered by `datetime`
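As an illustration, the ClickHouse variant of this schema could look roughly like the sketch below (the actual DDL is generated by the command; column types are assumptions):

```sql
-- Sketch of the ClickHouse schema described above (not the exact DDL
-- produced by benchmark:setup-tables). MergeTree needs no auto-increment id;
-- the ORDER BY clause doubles as the primary sort key.
CREATE TABLE test_int_32
(
    datetime DateTime,
    intValue Int32
)
ENGINE = MergeTree
ORDER BY datetime;
```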
Usage:
```shell
bin/console benchmark:setup-tables [--force]
```

The `--force` option drops existing tables/indices before creating new ones.
Generates and inserts time-series data across all databases:
- Generates data points evenly distributed across a time range (default: last 24 hours)
- Each data point contains a timestamp and a random integer value
- Supports configurable batch sizes for efficient bulk inserts
- Measures insertion performance (duration and points/second) for each database
Performance optimizations:
- SQL databases: Drops indexes before insertion, uses multi-value INSERT statements with transactions, then recreates indexes
- Elasticsearch: Uses the Bulk API with parallel requests (up to 100 concurrent requests)
- ClickHouse: Uses JSONEachRow format for efficient bulk inserts with parallel requests
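For example, a JSONEachRow bulk insert sends one JSON object per line, which makes streaming large batches cheap (the values below are made up):

```sql
-- Sketch of a JSONEachRow bulk insert as sent to ClickHouse;
-- each line after the FORMAT clause is one row.
INSERT INTO test_int_32 FORMAT JSONEachRow
{"datetime": "2025-11-12 14:30:00", "intValue": 42}
{"datetime": "2025-11-12 14:30:01", "intValue": 17}
```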
Usage:
```shell
bin/console benchmark:fill-tables \
    --datetime-from="-1 day" \
    --quantity=1_000_000 \
    --batch-size=1_000 \
    --db=clickhouse,postgresql
```

Options:

- `--datetime-from`: Start datetime (e.g., "-1 day", "-2 hours", "2025-11-12 14:30:00")
- `--quantity`: Number of data points to insert (supports underscore formatting, max 10,000,000)
- `--batch-size`: Number of rows per batch (supports underscore formatting)
- `--db`: Specific database(s) to fill (comma-separated). Options: postgresql, mariadb, mysql, elasticsearch, clickhouse
Tests basic aggregation performance by calculating the average of `intValue` across all databases:

- Executes `AVG()` and `COUNT()` queries with a date range filter
- Measures query execution time in milliseconds
- Compares performance across all engines
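On the ClickHouse side, the benchmark query is conceptually equivalent to the following sketch (the exact query text lives in the command):

```sql
-- Simple aggregation over a date range, as described above.
SELECT avg(intValue) AS avg_value, count() AS cnt
FROM test_int_32
WHERE datetime >= now() - INTERVAL 1 DAY;
```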
Usage:
```shell
bin/console benchmark:query-avg --datetime-from="-1 day"
```

Tests time-series aggregation by calculating averages grouped by time windows:
- Uses database-specific time bucketing functions:
  - PostgreSQL: `to_timestamp(floor(extract(epoch from datetime) / window) * window)`
  - MySQL/MariaDB: `FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(datetime) / window) * window)`
  - Elasticsearch: Date histogram aggregation with `fixed_interval`
  - ClickHouse: `toStartOfInterval(datetime, INTERVAL window SECOND)`
- Groups data into time windows (e.g., 1 hour, 15 minutes)
- Returns average value and count for each window
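As a sketch, the ClickHouse variant of this windowed aggregation with a 1-hour (3600-second) window looks roughly like:

```sql
-- Bucket rows into fixed 1-hour windows and aggregate per bucket.
SELECT
    toStartOfInterval(datetime, INTERVAL 3600 SECOND) AS window_start,
    avg(intValue) AS avg_value,
    count() AS cnt
FROM test_int_32
WHERE datetime >= now() - INTERVAL 1 DAY
GROUP BY window_start
ORDER BY window_start;
```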
Usage:
```shell
bin/console benchmark:query-avg-by-window \
    --datetime-from="-1 day" \
    --window="1 hour"
```

Window format examples: "1 day", "2 hours", "30 minutes", "15 minutes".

Use the `-v` flag to see detailed results for each window.
After each benchmark, a performance summary table is displayed showing:
- Engine name
- Duration (in milliseconds)
- Throughput (for inserts: points/second, for queries: rows scanned)
- vs Fastest (percentage comparison to the fastest engine)
This makes it easy to visually compare the performance characteristics of different databases for analytical workloads.
This project includes a real-time monitoring system that demonstrates ClickHouse's capabilities for observability use cases:
Creates ClickHouse tables optimized for monitoring data:
- Main table (`monitoring_data`): Stores raw request monitoring data (timestamp, duration, memory usage, controller, URI, status code)
- Aggregation table (`monitoring_data_hourly`): Uses the AggregatingMergeTree engine to store pre-aggregated statistics
- Materialized view (`monitoring_data_hourly_mv`): Automatically aggregates data as it is inserted
The materialized view calculates:
- Request counts
- Duration statistics (avg, min, max, percentiles: p50, p90, p95, p99)
- Memory usage statistics (avg, min, max, percentiles: p50, p90, p95, p99)
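The pattern looks roughly like the simplified sketch below (column names and the exact statistics are assumptions; the real DDL is created by `app:setup-monitoring-table`, and only duration percentiles are shown here):

```sql
-- Aggregation table: stores partial aggregate states, not final values.
CREATE TABLE monitoring_data_hourly
(
    hour DateTime,
    controller String,
    status_code UInt16,
    request_count AggregateFunction(count),
    duration_avg AggregateFunction(avg, Float64),
    duration_quantiles AggregateFunction(quantiles(0.5, 0.9, 0.95, 0.99), Float64)
)
ENGINE = AggregatingMergeTree
ORDER BY (hour, controller, status_code);

-- Materialized view: converts each insert into aggregate states
-- via the -State combinators and writes them to the table above.
CREATE MATERIALIZED VIEW monitoring_data_hourly_mv
TO monitoring_data_hourly
AS SELECT
    toStartOfHour(timestamp) AS hour,
    controller,
    status_code,
    countState() AS request_count,
    avgState(duration) AS duration_avg,
    quantilesState(0.5, 0.9, 0.95, 0.99)(duration) AS duration_quantiles
FROM monitoring_data
GROUP BY hour, controller, status_code;
```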
Usage:
```shell
bin/console app:setup-monitoring-table [--force]
```

Queries aggregated monitoring statistics with flexible filtering:
Usage:
```shell
bin/console app:query-monitoring-stats \
    --hours=24 \
    --controller="App\Controller\RootController::index" \
    --status-code=200 \
    --limit=50 \
    --format=table
```

Options:

- `--hours`: Number of hours to look back (default: 24)
- `--controller`: Filter by controller name
- `--status-code`: Filter by HTTP status code
- `--limit`: Limit the number of results (default: 50)
- `--format`: Output format: `table` or `json` (default: `table`)
The monitoring system uses ClickHouse's `AggregateFunction` types and merge functions (`countMerge`, `avgMerge`, `quantilesMerge`) to efficiently query pre-aggregated data, demonstrating the power of materialized views for real-time analytics.
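Reading pre-aggregated state back requires the corresponding `-Merge` functions; a sketch of such a query (column names are assumptions):

```sql
-- Finalize the partial aggregate states stored by the materialized view.
SELECT
    hour,
    controller,
    countMerge(request_count) AS requests,
    avgMerge(duration_avg) AS avg_duration,
    quantilesMerge(0.5, 0.9, 0.95, 0.99)(duration_quantiles) AS duration_pcts
FROM monitoring_data_hourly
WHERE hour >= now() - INTERVAL 24 HOUR
GROUP BY hour, controller
ORDER BY hour DESC;
```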
The monitoring system is built using Symfony's event system and Messenger component to capture and store HTTP request metrics asynchronously. Here's the architecture:
- `RequestMonitoringListener` (`src/EventListener/RequestMonitoringListener.php`)
  - Listens to `KernelEvents::REQUEST` and `KernelEvents::TERMINATE`
  - Captures metrics for each HTTP request:
    - Request duration (in seconds)
    - Peak memory usage (in bytes)
    - Controller name
    - Request URI
    - HTTP status code
  - Dispatches a `MonitoringData` message to Symfony Messenger
- `MonitoringData` (`src/Messenger/Message/MonitoringData.php`)
  - A simple readonly DTO (Data Transfer Object) that holds the monitoring metrics
  - Passed through Symfony Messenger for asynchronous processing
- `MonitoringDataHandler` (`src/Messenger/MessageHandler/MonitoringDataHandler.php`)
  - Message handler that processes `MonitoringData` messages
  - Implements buffering to optimize ClickHouse inserts:
    - Buffers up to 100 monitoring entries before flushing
    - Also flushes every 30 seconds to prevent stale data
    - Uses ClickHouse's JSONEachRow format for efficient bulk inserts
  - Gracefully handles errors to prevent monitoring from breaking the application
- `MessengerWorkerSubscriber` (`src/EventSubscriber/MessengerWorkerSubscriber.php`)
  - Ensures buffered data is properly flushed:
    - When the Messenger worker is stopped (`WorkerStoppedEvent`)
    - Periodically during worker runtime, every 30 seconds (`WorkerRunningEvent`)
  - Prevents data loss when the worker is restarted
```
HTTP Request
    ↓
RequestMonitoringListener (captures metrics)
    ↓
MonitoringData message → Symfony Messenger (async)
    ↓
MonitoringDataHandler (buffers data)
    ↓
ClickHouse (bulk insert via JSONEachRow)
    ↓
Materialized View (auto-aggregation)
    ↓
monitoring_data_hourly table (pre-aggregated stats)
```
First, create the necessary tables and materialized views:
```shell
bin/console app:setup-monitoring-table --force
```

This creates:

- `monitoring_data` table (stores raw request data)
- `monitoring_data_hourly` table (stores aggregated statistics)
- `monitoring_data_hourly_mv` materialized view (auto-aggregates on insert)
The monitoring system requires a running Messenger worker to process monitoring messages asynchronously:
```shell
bin/console messenger:consume async -vv
```

Important: Keep this worker running in the background. You can run it as a systemd service in production.
Make HTTP requests to your application. The `RequestMonitoringListener` will automatically capture metrics for every request:
```shell
# Example: Make requests to the demo controller
curl http://localhost:8000/
curl http://localhost:8000/some-other-route
```

Once you have collected some data, query the aggregated statistics:
```shell
# View all monitoring stats for the last 24 hours
bin/console app:query-monitoring-stats

# Filter by specific controller
bin/console app:query-monitoring-stats \
    --controller="App\Controller\RootController::index"

# Filter by status code
bin/console app:query-monitoring-stats --status-code=200

# Look back further (last 7 days)
bin/console app:query-monitoring-stats --hours=168

# Export as JSON
bin/console app:query-monitoring-stats --format=json
```

The output shows hourly aggregated statistics including:
- Total requests per hour
- Average/min/max request duration
- Duration percentiles (p50, p95)
- Average/min/max memory usage
- Memory percentiles (p50, p95)
- PHP 8.2+
- Symfony 7.3
- Composer
- Clone the repository:

  ```shell
  git clone <repository-url>
  cd clickhouse-symfony
  ```

- Install dependencies:

  ```shell
  composer install
  ```

- Configure environment variables in `.env.local`:

  ```shell
  # Add connection strings for PostgreSQL, MySQL, MariaDB, Elasticsearch, and ClickHouse
  # See .env for required variables
  ```

Run a complete benchmark test:
```shell
# 1. Setup tables
bin/console benchmark:setup-tables --force

# 2. Fill with 1 million data points
bin/console benchmark:fill-tables --quantity=1_000_000

# 3. Run simple aggregation benchmark
bin/console benchmark:query-avg

# 4. Run windowed aggregation benchmark
bin/console benchmark:query-avg-by-window --window="1 hour"
```