DQR is a distributed, vectorized query execution runtime implemented in C++20. It uses a coordinator-worker architecture designed for high-performance analytical query processing, featuring gRPC-based communication, Apache Arrow for in-memory data representation, and fault-tolerant scheduling.
The system consists of three primary components:
The Coordinator serves as the central brain of the cluster. It manages the global state and query lifecycle:
- Query Planning: Fragments physical query plans into a Directed Acyclic Graph (DAG) of execution stages separated by shuffle boundaries.
- Scheduling: Assigns tasks to workers using a resource-aware, least-loaded scheduling policy (see the sketch after this list).
- Worker Management: Tracks cluster health via a heartbeat mechanism and handles dynamic worker registration.
- Fault Tolerance: Automatically detects worker failures and reschedules affected tasks to healthy nodes.
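The scheduling policy is the easiest of these to show in code. Below is a minimal C++ sketch of a resource-aware, least-loaded assignment; the WorkerInfo and Task structs are hypothetical stand-ins, not the project's actual types, and the real policy will carry more detail.

```cpp
// Illustrative least-loaded scheduling decision; WorkerInfo and Task are
// hypothetical stand-ins for the coordinator's real bookkeeping types.
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

struct WorkerInfo {
  std::string id;
  bool healthy = true;
  uint32_t running_tasks = 0;     // current load
  uint64_t available_memory = 0;  // bytes the worker reports as free
};

struct Task {
  std::string id;
  uint64_t estimated_memory = 0;  // planner's estimate for this fragment
};

// Pick the healthy worker with the fewest running tasks that can still fit
// the task's estimated memory footprint.
std::optional<std::string> PickWorker(const std::vector<WorkerInfo>& workers,
                                      const Task& task) {
  const WorkerInfo* best = nullptr;
  for (const auto& w : workers) {
    if (!w.healthy || w.available_memory < task.estimated_memory) continue;
    if (best == nullptr || w.running_tasks < best->running_tasks) best = &w;
  }
  if (best == nullptr) return std::nullopt;  // no eligible worker; task stays pending
  return best->id;
}
```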
Workers are responsible for the actual execution of query fragments:
- Vectorized Execution: Processes data in batches using Apache Arrow to maximize CPU cache locality and SIMD throughput.
- Data Exchange (Shuffle): Implements a pull-based data exchange model where downstream workers stream intermediate results from upstream workers via gRPC.
- Memory Management: Monitors local memory usage and implements proactive disk spilling to prevent Out-Of-Memory (OOM) errors during large shuffles.
- Parquet Integration: Scans Parquet files from local storage using the Parquet C++ (parquet-cpp) library.
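To make the vectorized-execution and Parquet items concrete, here is a minimal sketch that scans a Parquet file and runs a vectorized aggregate with the Arrow and Parquet C++ APIs. The file path and column name are placeholders; this is not code from the repository.

```cpp
// Minimal sketch: scan a Parquet file into an Arrow Table, then run a
// vectorized "sum" over one column with Arrow compute.
#include <arrow/api.h>
#include <arrow/compute/api.h>
#include <arrow/io/api.h>
#include <parquet/arrow/reader.h>

#include <iostream>
#include <memory>
#include <string>

arrow::Status ScanAndSum(const std::string& path, const std::string& column) {
  // Open the Parquet file from local storage.
  ARROW_ASSIGN_OR_RAISE(auto infile, arrow::io::ReadableFile::Open(path));

  std::unique_ptr<parquet::arrow::FileReader> reader;
  ARROW_RETURN_NOT_OK(
      parquet::arrow::OpenFile(infile, arrow::default_memory_pool(), &reader));

  // Materialize the file as a columnar Arrow Table.
  std::shared_ptr<arrow::Table> table;
  ARROW_RETURN_NOT_OK(reader->ReadTable(&table));

  auto col = table->GetColumnByName(column);
  if (col == nullptr) return arrow::Status::Invalid("missing column: ", column);

  // Vectorized aggregation over the selected column via Arrow compute.
  ARROW_ASSIGN_OR_RAISE(arrow::Datum sum, arrow::compute::CallFunction("sum", {col}));
  std::cout << "sum(" << column << ") = " << sum.scalar()->ToString() << std::endl;
  return arrow::Status::OK();
}
```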
The CLI Client is a standalone command-line tool used to interact with the cluster:
- Query Submission: Submits serialized physical plans to the Coordinator.
- Result Retrieval: Polls the Coordinator to gather and aggregate final results from the egress stage of the query.
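As an illustration of this submit-then-poll flow, a client might look roughly like the sketch below. The CoordinatorService stub, method names, and message fields are hypothetical placeholders for whatever is actually defined in proto/.

```cpp
// Hypothetical sketch of the CLI's submit-then-poll flow; all RPC and message
// names (CoordinatorService, SubmitQuery, GetResults, ...) are illustrative.
#include <grpcpp/grpcpp.h>

#include <chrono>
#include <iostream>
#include <string>
#include <thread>

#include "dqr.grpc.pb.h"  // hypothetical generated header

int SubmitAndWait(const std::string& coordinator_addr, const std::string& query_id,
                  const std::string& plan_bytes) {
  auto channel = grpc::CreateChannel(coordinator_addr, grpc::InsecureChannelCredentials());
  auto stub = dqr::CoordinatorService::NewStub(channel);

  // 1. Submit the serialized physical plan.
  dqr::SubmitQueryRequest submit_req;
  submit_req.set_query_id(query_id);
  submit_req.set_serialized_plan(plan_bytes);
  dqr::SubmitQueryResponse submit_resp;
  grpc::ClientContext submit_ctx;
  grpc::Status status = stub->SubmitQuery(&submit_ctx, submit_req, &submit_resp);
  if (!status.ok()) {
    std::cerr << "submit failed: " << status.error_message() << std::endl;
    return 1;
  }

  // 2. Poll until the coordinator reports that the egress stage has finished.
  while (true) {
    dqr::GetResultsRequest poll_req;
    poll_req.set_query_id(query_id);
    dqr::GetResultsResponse poll_resp;
    grpc::ClientContext poll_ctx;
    if (stub->GetResults(&poll_ctx, poll_req, &poll_resp).ok() && poll_resp.done()) {
      std::cout << "query " << query_id << " finished" << std::endl;
      return 0;
    }
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
  }
}
```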
DQR uses a pull-based execution model: downstream stages initiate gRPC streams to upstream workers to "pull" data. This naturally provides backpressure, since a producer's pace is dictated by its consumer's consumption rate.
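The pull side of such an exchange might look roughly like this. The ShuffleService service, FetchPartition method, and RecordBatchChunk message are hypothetical names, not the project's actual proto definitions.

```cpp
// Sketch of a downstream task pulling a partition from an upstream worker via
// a gRPC server-streaming read. Consuming at its own pace is what provides the
// backpressure. ShuffleService/FetchPartition/RecordBatchChunk are hypothetical.
#include <grpcpp/grpcpp.h>

#include <memory>
#include <string>

#include "dqr_shuffle.grpc.pb.h"  // hypothetical generated header

void PullPartition(const std::string& upstream_addr, const std::string& task_id,
                   int partition) {
  auto channel = grpc::CreateChannel(upstream_addr, grpc::InsecureChannelCredentials());
  auto stub = dqr::ShuffleService::NewStub(channel);

  dqr::FetchPartitionRequest req;
  req.set_task_id(task_id);
  req.set_partition(partition);

  grpc::ClientContext ctx;
  std::unique_ptr<grpc::ClientReader<dqr::RecordBatchChunk>> reader =
      stub->FetchPartition(&ctx, req);

  dqr::RecordBatchChunk chunk;
  // Read() only advances when the consumer is ready for the next chunk, so the
  // producer's send rate is bounded by the consumer's processing rate.
  while (reader->Read(&chunk)) {
    // Deserialize chunk.arrow_ipc() into an arrow::RecordBatch and feed it to
    // the local operator pipeline (omitted here).
  }
  grpc::Status status = reader->Finish();
  (void)status;  // a real implementation would surface stream errors
}
```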
The scheduler maintains a state machine for every query. If a worker stops sending heartbeats, the Coordinator marks it as unhealthy and identifies all incomplete tasks assigned to it. These tasks are then reverted to a pending state and reassigned to other workers in the cluster.
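A simplified sketch of that detect-and-revert loop, using hypothetical bookkeeping structures rather than the coordinator's real state machine:

```cpp
// Illustrative heartbeat-timeout check and task rescheduling; the structs and
// threshold here are hypothetical stand-ins for the coordinator's real state.
#include <chrono>
#include <string>
#include <unordered_map>

using Clock = std::chrono::steady_clock;

enum class TaskState { kPending, kRunning, kFinished };

struct TaskEntry {
  TaskState state = TaskState::kPending;
  std::string assigned_worker;  // empty when pending
};

struct WorkerEntry {
  Clock::time_point last_heartbeat;
  bool healthy = true;
};

// Mark workers whose heartbeat is stale as unhealthy and revert their
// incomplete tasks to pending so the scheduler can reassign them.
void CheckHeartbeats(std::unordered_map<std::string, WorkerEntry>& workers,
                     std::unordered_map<std::string, TaskEntry>& tasks,
                     Clock::duration timeout) {
  const auto now = Clock::now();
  for (auto& [worker_id, worker] : workers) {
    if (!worker.healthy || now - worker.last_heartbeat < timeout) continue;
    worker.healthy = false;
    for (auto& [task_id, task] : tasks) {
      if (task.assigned_worker == worker_id && task.state != TaskState::kFinished) {
        task.state = TaskState::kPending;  // eligible for reassignment
        task.assigned_worker.clear();
      }
    }
  }
}
```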
When the worker's memory manager detects pressure (usage above 80% of the worker's memory limit), the execution engine identifies large intermediate buffers and spills them to a local temporary directory using Arrow IPC. The retrieval logic transparently handles reading from either memory or disk.
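A minimal sketch of the spill and read-back path using Arrow IPC, with a placeholder spill path and the pressure-detection policy and bookkeeping omitted:

```cpp
// Spill an intermediate RecordBatch to a local file in the Arrow IPC file
// format, and read it back on demand. Paths and policy are placeholders.
#include <arrow/api.h>
#include <arrow/io/api.h>
#include <arrow/ipc/api.h>

#include <memory>
#include <string>

arrow::Status SpillBatch(const std::shared_ptr<arrow::RecordBatch>& batch,
                         const std::string& spill_path) {
  // Open the spill file in the worker's temporary directory.
  ARROW_ASSIGN_OR_RAISE(auto sink, arrow::io::FileOutputStream::Open(spill_path));

  // Write the batch as an Arrow IPC file so it can be read back without an
  // extra conversion step.
  ARROW_ASSIGN_OR_RAISE(auto writer, arrow::ipc::MakeFileWriter(sink, batch->schema()));
  ARROW_RETURN_NOT_OK(writer->WriteRecordBatch(*batch));
  ARROW_RETURN_NOT_OK(writer->Close());
  return sink->Close();
}

arrow::Result<std::shared_ptr<arrow::RecordBatch>> ReadSpilledBatch(
    const std::string& spill_path) {
  // Transparently read the batch back from disk when a consumer asks for it.
  ARROW_ASSIGN_OR_RAISE(auto file, arrow::io::ReadableFile::Open(spill_path));
  ARROW_ASSIGN_OR_RAISE(auto reader, arrow::ipc::RecordBatchFileReader::Open(file));
  return reader->ReadRecordBatch(0);
}
```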
- C++20: Utilizing modern language features for safety and performance.
- gRPC / Protocol Buffers: High-performance RPC and structured serialization for control and data planes.
- Apache Arrow: Columnar in-memory format for vectorized execution and efficient IPC.
- Abseil: Used for advanced synchronization primitives and low-level utilities.
- spdlog: Structured, high-performance logging for system observability.
- Google Test (GTest): Comprehensive unit and integration testing suite.
- src/coordinator/: Query fragmentation, scheduling, and worker management logic.
- src/worker/: Vectorized execution engine, memory manager, and Parquet scanner.
- src/common/: Shared RPC clients, types, and Protobuf definitions.
- src/cli/: Client submission and result retrieval tool.
- proto/: Protobuf service and message definitions.
- tests/: Unit, integration, and stress tests (110+ tests).
- CMake (>= 3.15)
- gRPC and Protobuf
- Apache Arrow and Parquet-cpp
- spdlog and Abseil
- GTest
```bash
mkdir build && cd build
cmake ..
make -j$(nproc)
```

To spin up a coordinator and two workers automatically:
```bash
docker-compose up --build
```

The project includes a comprehensive test suite of over 110 tests covering the planner, scheduler, engine, and full integration cycles.
```bash
cd build
ctest -V
```

To run the components manually:

- Start the Coordinator:

  ```bash
  ./build/src/dqr_coordinator 50050
  ```

- Start a Worker:

  ```bash
  ./build/src/dqr_worker worker-1 50051 localhost:50050
  ```

- Submit via CLI:

  ```bash
  ./build/src/dqr_cli localhost:50050 submit q1 my_table
  ```

- Fetch results:

  ```bash
  ./build/src/dqr_cli localhost:50050 results q1
  ```