From dcddb2261ffe67c476ae32981b55e9e261c1d062 Mon Sep 17 00:00:00 2001
From: Ken Chen
Date: Thu, 14 Aug 2025 00:25:29 -0700
Subject: [PATCH 1/2] refine architecture

---
 docs/architecture.md | 70 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 54 insertions(+), 16 deletions(-)

diff --git a/docs/architecture.md b/docs/architecture.md
index fb2cf692..762511a1 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -1,36 +1,74 @@
 # Architecture
 
-## High Level Architecture
+## High Level Architecture 
 
-The following diagram depicts the high level architecture of Timeplus SQL engine, starting from a single node deployment.
+The following diagram depicts the high-level components of the Timeplus core engine.
 
 ![Architecture](/img/proton-high-level-arch.gif)
 
-All of the components / functionalities are built into one single binary.
+### The Flow of Data
 
-## Data Storage
+When data is ingested into Timeplus, it first lands in the NativeLog. As soon as the log commit completes, the data becomes immediately available for streaming queries.
 
-Users can create a stream by using `CREATE STREAM ...` [DDL SQL](/sql-create-stream). Every stream has 2 parts at storage layer by default:
+In the background, dedicated threads continuously tail new entries from the NativeLog and flush them to the Historical Store in larger, optimized batches.
 
-1. the real-time streaming data part, backed by Timeplus NativeLog
-2. the historical data part, backed by ClickHouse historical data store.
+### NativeLog
 
-Fundamentally, a stream in Proton is a regular database table with a replicated Write-Ahead-Log (WAL) in front but is streaming queryable.
+The **Timeplus NativeLog** is the system’s write-ahead log (WAL) or journal: an append-only, high-throughput store optimized for low-latency, highly concurrent data ingestion. In a cluster deployment, it is replicated using **Multi-Raft** for fault tolerance. By enforcing a strict ordering of records, NativeLog forms the backbone of streaming processing in **Timeplus Core**.
 
-## Data Ingestion
+NativeLog uses its own record format, consisting of two high-level types:
 
-When users `INSERT INTO ...` data to Proton, the data always first lands in NativeLog which is immediately queryable. Since NativeLog is in essence a replicated Write-Ahead-Log (WAL) and is append-only, it can support high frequent, low latency and large concurrent data ingestion work loads.
+- **Control records** (a.k.a. meta records) – store metadata and operational information.
+- **Data records** – columnar-encoded for fast serialization/deserialization and efficient vectorized streaming execution.
 
-In background, there is a separate thread tailing the delta data from NativeLog and commits the data in bigger batch to the historical data store. Since Proton leverages ClickHouse for the historical part, its historical query processing is blazing fast as well.
+Each record is assigned a monotonically increasing sequence number—similar to a Kafka offset—which guarantees ordering. 
 
-## External Stream
+Lightweight indexes are maintained to support rapid rewind and replay operations by **timestamp** or **sequence number** in streaming queries.
 
-In quite lots of scenarios, data is already in Kafka / Redpanda or other streaming data hubs, users can create [external streams](/external-stream) to point to the streaming data hub and do streaming query processing directly and then either materialize them in Proton or send the query results to external systems.
 
+### Historical Store
+
+The **Historical Store** in Timeplus stores data **derived** from the **NativeLog**. It powers use cases such as:
+
+- **Historical queries** (a.k.a. *table queries* in Timeplus)
+- **Fast backfill** into streaming queries
+- Acting as a **serving layer** for downstream applications
 
-## Learn More
+Timeplus supports two storage encodings for the Historical Store: **columnar** and **row**.
 
-Interested users can refer [How Timeplus Unifies Streaming and Historical Data Processing](https://www.timeplus.com/post/unify-streaming-and-historical-data-processing) blog for more details regarding its academic foundation and latest industry developments. You can also check the video below from [Kris Jenkins's Developer Voices podcast](https://www.youtube.com/watch?v=TBcWABm8Cro). Jove shared our key decision choices, how Timeplus manages data and state, and how Timeplus achieves high performance with single binary.
+#### 1. Columnar Encoding (*Append Stream*)
+Optimized for **append-mostly workloads** with minimal data mutation, such as telemetry or event logs. Benefits include:
 
-
+- High data compression ratios 
+- Blazing-fast scans for analytical workloads 
+- Backed by the **ClickHouse MergeTree** engine 
+
+This format is ideal when the dataset is largely immutable and query speed over large volumes is a priority.
+
+#### 2. Row Encoding (*Mutable Stream*)
+Designed for **frequently updated datasets** where `UPSERT` and `DELETE` operations are common. Features include:
+
+- Per-row **primary indexes**
+- **Secondary indexes** for flexible lookups
+- Faster and more efficient **point queries** compared to columnar storage
+
+Row encoding is the better choice when low-latency, high-frequency updates are required.
+
+## Deployment Architectures
+
+Timeplus supports multiple deployment architectures, allowing you to fine-tune the model **per Stream** or **per Materialized View**:
+
+- **MPP Shared-Nothing**
+  Each node has its own compute and storage.
+  Ideal for **ultra-low-latency** workloads where performance is critical.
+
+- **Shared-Storage**
+  Compute nodes share a common storage layer (e.g., S3).
+  Best for **large-scale ingestion**, **high concurrency**, and **high throughput** queries.
+
+- **Hybrid**
+  Combine both models in the same cluster.
+  Use the right architecture for each workload to balance latency and scalability.
+
+## References
+
+[How Timeplus Unifies Streaming and Historical Data Processing](https://www.timeplus.com/post/unify-streaming-and-historical-data-processing)
\ No newline at end of file

From 3dc76505775f6a46fc0d46484e02951f35135535 Mon Sep 17 00:00:00 2001
From: Ken Chen
Date: Fri, 15 Aug 2025 22:22:24 -0700
Subject: [PATCH 2/2] Better single instance architecture doc

---
 docs/architecture.md | 65 ++++++++++++++++++++++++++------------------
 docs/why-timeplus.md |  2 +-
 2 files changed, 40 insertions(+), 27 deletions(-)

diff --git a/docs/architecture.md b/docs/architecture.md
index 762511a1..62ca247e 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -1,6 +1,6 @@
 # Architecture
 
-## High Level Architecture 
+## High Level Architecture
 
 The following diagram depicts the high-level components of the Timeplus core engine.
 
@@ -8,11 +8,40 @@ The following diagram depicts the high-level components of the Timeplus core engine.
 
 ### The Flow of Data
 
+#### Ingest
+
 When data is ingested into Timeplus, it first lands in the NativeLog. As soon as the log commit completes, the data becomes immediately available for streaming queries.
 
 In the background, dedicated threads continuously tail new entries from the NativeLog and flush them to the Historical Store in larger, optimized batches.
 
-### NativeLog
+#### Query
+
+Timeplus supports three query modes: **historical**, **streaming**, and **hybrid (streaming + historical)**.
+
+- **Historical Query (a.k.a. Table Query)**
+
+  Works like a traditional database query. Data is fetched directly from the **Historical Store**, and standard database optimizations such as the following apply, making large-scale scans and point lookups fast and efficient:
+  - Primary index
+  - Skipping index
+  - Secondary index
+  - Bloom filter
+  - Partition pruning
+
+- **Streaming Query**
+
+  Operates on the **NativeLog**, which stores records in sequence. Queries run incrementally, enabling real-time processing patterns such as **incremental ETL**, **joins**, and **aggregations**.
+
+- **Hybrid Query**
+
+  Streaming queries can automatically **backfill** from the Historical Store when:
+  1. Data no longer exists in the NativeLog (due to retention policies).
+  2. Pulling from the Historical Store is faster than rewinding the NativeLog to replay old events.
+
+  This allows seamless handling of scenarios like **fast backfill** and **mixed real-time + historical analysis** without breaking query continuity, and it removes the need for yet another external batch system to load historical data, which typically introduces extra latency, inconsistency, and cost.
+
+### The Dual Storage
+
+#### NativeLog
 
 The **Timeplus NativeLog** is the system’s write-ahead log (WAL) or journal: an append-only, high-throughput store optimized for low-latency, highly concurrent data ingestion. In a cluster deployment, it is replicated using **Multi-Raft** for fault tolerance. By enforcing a strict ordering of records, NativeLog forms the backbone of streaming processing in **Timeplus Core**.
 
@@ -21,11 +50,11 @@ NativeLog uses its own record format, consisting of two high-level types:
 - **Control records** (a.k.a. meta records) – store metadata and operational information.
 - **Data records** – columnar-encoded for fast serialization/deserialization and efficient vectorized streaming execution.
 
-Each record is assigned a monotonically increasing sequence number—similar to a Kafka offset—which guarantees ordering. 
+Each record is assigned a monotonically increasing sequence number—similar to a Kafka offset—which guarantees ordering.
 
 Lightweight indexes are maintained to support rapid rewind and replay operations by **timestamp** or **sequence number** in streaming queries.
 
-### Historical Store
+#### Historical Store
 
 The **Historical Store** in Timeplus stores data **derived** from the **NativeLog**. It powers use cases such as:
 
@@ -35,16 +64,16 @@ The **Historical Store** in Timeplus stores data **derived** from the **NativeLo
 
 Timeplus supports two storage encodings for the Historical Store: **columnar** and **row**.
 
-#### 1. Columnar Encoding (*Append Stream*)
+##### 1. Columnar Encoding (*Append Stream*)
 Optimized for **append-mostly workloads** with minimal data mutation, such as telemetry or event logs. Benefits include:
 
-- High data compression ratios 
-- Blazing-fast scans for analytical workloads 
-- Backed by the **ClickHouse MergeTree** engine 
+- High data compression ratios
+- Blazing-fast scans for analytical workloads
+- Backed by the **ClickHouse MergeTree** engine
 
 This format is ideal when the dataset is largely immutable and query speed over large volumes is a priority.
 
-#### 2. Row Encoding (*Mutable Stream*)
+##### 2. Row Encoding (*Mutable Stream*)
 Designed for **frequently updated datasets** where `UPSERT` and `DELETE` operations are common. Features include:
 
 - Per-row **primary indexes**
 - **Secondary indexes** for flexible lookups
 - Faster and more efficient **point queries** compared to columnar storage
 
 Row encoding is the better choice when low-latency, high-frequency updates are required.
 
-## Deployment Architectures
-
-Timeplus supports multiple deployment architectures, allowing you to fine-tune the model **per Stream** or **per Materialized View**:
-
-- **MPP Shared-Nothing**
-  Each node has its own compute and storage.
-  Ideal for **ultra-low-latency** workloads where performance is critical.
-
-- **Shared-Storage**
-  Compute nodes share a common storage layer (e.g., S3).
-  Best for **large-scale ingestion**, **high concurrency**, and **high throughput** queries.
-
-- **Hybrid**
-  Combine both models in the same cluster.
-  Use the right architecture for each workload to balance latency and scalability.
-
 ## References
 
-[How Timeplus Unifies Streaming and Historical Data Processing](https://www.timeplus.com/post/unify-streaming-and-historical-data-processing)
\ No newline at end of file
+[How Timeplus Unifies Streaming and Historical Data Processing](https://www.timeplus.com/post/unify-streaming-and-historical-data-processing)
diff --git a/docs/why-timeplus.md b/docs/why-timeplus.md
index 35438268..3faf058b 100644
--- a/docs/why-timeplus.md
+++ b/docs/why-timeplus.md
@@ -75,4 +75,4 @@ SQL-based rules can be used to trigger or resolve alerts in systems such as Page
 
 ### Scalability and Elasticity
 
-Timeplus supports both MPP architecture for pure on-prem deployments—ideal when ultra-low latency is critical and storage/compute separation for elastic, cloud-native setups. In the latter mode, S3 is used for the NativeLog, Historical Store, and Query State Checkpoints. Combined with Kubernetes HPA or AWS Auto Scaling Groups, this enables highly concurrent continuous queries on clusters that scale automatically with demand.
+Timeplus supports three deployment models: **MPP (shared-nothing)** for on-premises setups where ultra-low latency is critical, **storage/compute separation** for elastic cloud-native environments using S3 (or similar object storage) to store the NativeLog, Historical Store, and Query State Checkpoints with zero replication overhead, and **hybrid mode** that combines both approaches. In storage/compute separation deployments, clusters integrate seamlessly with Kubernetes HPA or AWS Auto Scaling Groups, enabling highly concurrent continuous queries while scaling automatically with demand. Please refer to [Timeplus Architecture](/architecture) for more details.
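
The examples below make the data flow and query model described in the `docs/architecture.md` changes above concrete. This first sketch covers ingestion plus the three query modes. The stream and column names (`device_metrics`, `device_id`, `temperature`) are hypothetical, and the `table()` function, the `_tp_time` event-time column, and the interval shorthand are used as they commonly appear in Timeplus material; exact syntax may vary by version.

```sql
-- Hypothetical append stream; names and types are illustrative only.
CREATE STREAM device_metrics (
  device_id string,
  temperature float64
);

-- Ingest: rows land in the NativeLog first and become visible to streaming
-- queries as soon as the log commit completes.
INSERT INTO device_metrics (device_id, temperature) VALUES ('dev-1', 21.5);

-- Streaming query: tails the NativeLog and emits results incrementally.
SELECT device_id, count(), avg(temperature)
FROM device_metrics
GROUP BY device_id;

-- Historical (table) query: scans only the Historical Store and terminates,
-- like a classic database query.
SELECT count() FROM table(device_metrics);

-- Hybrid-style query: the time predicate asks for older events as well, so
-- the engine backfills them (from the Historical Store, or by rewinding the
-- NativeLog) before continuing with live data.
SELECT device_id, count()
FROM device_metrics
WHERE _tp_time > now() - 1h
GROUP BY device_id;
```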
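The NativeLog's strict ordering and its lightweight timestamp/sequence indexes are what make rewind and replay cheap. A short sketch, reusing the hypothetical `device_metrics` stream: `tumble()` windows follow the Timeplus streaming SQL dialect, while the `seek_to` setting name and its accepted values are assumptions based on the public Proton documentation and should be verified against your version.

```sql
-- Tumbling-window aggregation over live data: a result row is emitted per
-- 5-second window as records arrive from the NativeLog.
SELECT window_start, device_id, max(temperature) AS max_temp
FROM tumble(device_metrics, 5s)
GROUP BY window_start, device_id;

-- Replay: rewind the NativeLog by a relative timestamp (assumed setting name)
-- and re-process the last hour of records before continuing with new ones.
SELECT device_id, count()
FROM device_metrics
GROUP BY device_id
SETTINGS seek_to = '-1h';
```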
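The two Historical Store encodings map to two kinds of streams: an ordinary (columnar, append) stream for largely immutable events, and a mutable (row-encoded) stream for frequently updated state. The `MUTABLE STREAM` / `PRIMARY KEY` DDL below is an assumption drawn from Timeplus Enterprise material, so confirm the exact syntax for your release.

```sql
-- Columnar encoding: a regular append stream whose historical part is backed
-- by a ClickHouse MergeTree engine. Good for telemetry and event logs.
CREATE STREAM clickstream (
  user_id string,
  url string,
  event_time datetime64(3)
);

-- Row encoding: a mutable stream keyed by device_id, so UPSERTs, DELETEs and
-- point lookups stay cheap. The DDL shape is assumed; see the Timeplus docs.
CREATE MUTABLE STREAM device_state (
  device_id string,
  firmware string,
  last_seen datetime64(3)
) PRIMARY KEY (device_id);

-- Point query served by the row-encoded store's per-row primary index.
SELECT * FROM table(device_state) WHERE device_id = 'dev-42';
```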