Refactor distributed query (#321)
* Refactor distributed query

Signed-off-by: Gao Hongtao <hanahmily@gmail.com>
hanahmily committed Aug 31, 2023
1 parent 8ee20fd commit 91f0b71
Showing 1 changed file with 29 additions and 46 deletions.
75 changes: 29 additions & 46 deletions docs/concept/clustering.md
@@ -1,17 +1,17 @@
# BanyanDB Clustering

BanyanDB Clustering introduces a robust and scalable architecture that comprises "Query Nodes", "Liaison Nodes", "Data Nodes", and "Meta Nodes". This structure allows for effectively distributing and managing time-series data within the system.
BanyanDB Clustering introduces a robust and scalable architecture that comprises "Liaison Nodes", "Data Nodes", and "Meta Nodes". This structure allows for effectively distributing and managing time-series data within the system.

## 1. Architectural Overview

A BanyanDB installation includes four distinct types of nodes: Data Nodes, Meta Nodes, Query Nodes, and Liaison Nodes.

![clustering](https://skywalking.apache.org/doc-graph/banyandb/v0.5.0/clustring.png)
A BanyanDB installation includes three distinct types of nodes: Data Nodes, Meta Nodes, and Liaison Nodes.

### 1.1 Data Nodes

Data Nodes hold all the raw time series data, metadata, and indexed data. They handle the storage and management of data, including streams and measures, tag keys and values, as well as field keys and values.

Data Nodes also handle the local query execution. When a query is made, it is directed to a Liaison, which then interacts with Data Nodes to execute the distributed query and return results.

In addition to persistent raw data, Data Nodes also handle TopN aggregation calculation or other computational tasks.

### 1.2 Meta Nodes
@@ -21,40 +21,29 @@ Meta Nodes is implemented by etcd. They are responsible for maintaining high-lev
- All nodes in the cluster
- All database schemas

### 1.3 Query Nodes
### 1.3 Liaison Nodes

Query Nodes are responsible for handling computational tasks associated with querying the database. They build query tasks and search for data from Data Nodes.
Liaison Nodes serve as gateways, routing traffic to Data Nodes. In addition to routing, they also provide authentication, TTL, and other security services to ensure secure and effective communication within the cluster.

### 1.4 Liaison Nodes
Liaison Nodes are also responsible for handling computational tasks associated with distributed queries against the database. They build query tasks and search for data from Data Nodes.

Liaison Nodes serve as gateways, routing traffic to Query Nodes and Data Nodes. In addition to routing, they also provide authentication, TTL, and other security services to ensure secure and effective communication within the cluster.

### 1.5 Standalone Mode
### 1.4 Standalone Mode

BanyanDB integrates multiple roles into a single process in the standalone mode, making it simpler and faster to deploy. This mode is especially useful for scenarios with a limited number of data points or for testing and development purposes.

In this mode, the single process performs the roles of the Liaison Node, Query Node, Data Node, and Meta Node. It receives requests, maintains metadata, processes queries, and handles data, all within a unified setup.

### 1.6 Mix Mode in Storage Nodes

Query nodes and data nodes are implemented by a same executable binary, Storage Node. With the flag "mode", the storage node can be started as a query node or a data node. The default mode is "mix", which means the storage node is both a query node and a data node.

If the workload of query is high, you can start more storage nodes with the flag "mode" set to "query". If the workload of write is high, you can start more storage nodes with the flag "mode" set to "data".

Or you can start storage nodes with the flag "mode" set to "mix" to balance the workload of query and write.
In this mode, the single process performs the roles of the Liaison Node, Data Node, and Meta Node. It receives requests, maintains metadata, processes queries, and handles data, all within a unified setup.

## 2. Communication within a Cluster

All nodes within a BanyanDB cluster communicate with other nodes according to their roles:

- Meta Nodes share high-level metadata about the cluster.
- Data Nodes store and manage the raw time series data and communicate with Meta Nodes.
- Query Nodes interact with Data Nodes to execute queries and return results to the Liaison Nodes.
- Liaison Nodes distribute incoming requests to the appropriate Query Nodes or Data Nodes.
- Liaison Nodes distribute incoming data to the appropriate Data Nodes. They also handle distributed query execution and communicate with Meta Nodes.

### Nodes Discovery

All nodes in the cluster are discovered by the Meta Nodes. When a node starts up, it registers itself with the Meta Nodes. The Meta Nodes then share this information with the Liaison Nodes and Query Nodes, which use it to route requests to the appropriate nodes.
All nodes in the cluster are discovered by the Meta Nodes. When a node starts up, it registers itself with the Meta Nodes. The Meta Nodes then share this information with the Liaison Nodes, which use it to route requests to the appropriate nodes.
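
The register-then-look-up flow can be pictured with a toy in-memory registry. This is only a sketch: the `MetaRegistry` class, the role names, and the addresses are hypothetical stand-ins, and the real Meta Nodes are backed by etcd rather than a Python dictionary.

```python
from collections import defaultdict


class MetaRegistry:
    """Toy in-memory stand-in for the Meta Nodes' node registry (hypothetical)."""

    def __init__(self):
        # role -> set of node addresses that registered under that role
        self._nodes = defaultdict(set)

    def register(self, role, address):
        """Called by a node at startup to announce itself."""
        self._nodes[role].add(address)

    def nodes(self, role):
        """Called by a Liaison Node to learn which nodes it can route to."""
        return sorted(self._nodes[role])


registry = MetaRegistry()
registry.register("data", "data-1:9000")     # hypothetical addresses
registry.register("data", "data-0:9000")
registry.register("liaison", "liaison-0:9000")

print(registry.nodes("data"))
```

A Liaison Node would refresh this view whenever membership changes, so newly started Data Nodes become routable without manual configuration.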

## 3. **Data Organization**

@@ -70,17 +59,19 @@ By storing shard allocation information, Meta Nodes help ensure that data is rou

Data Nodes store all raw time series data, metadata, and indexed data. On disk, the data is organized by `<group>/shard-<shard_id>/<segment_id>/`. The segment is designed to support retention policy.
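
As a rough illustration of the `<group>/shard-<shard_id>/<segment_id>/` layout, the helper below assembles such a path. The root directory, the group name, and the segment identifier format used here are assumptions made for the example; the document does not specify them.

```python
from pathlib import PurePosixPath


def segment_dir(root, group, shard_id, segment_id):
    """Build <root>/<group>/shard-<shard_id>/<segment_id> (illustrative helper)."""
    return PurePosixPath(root) / group / f"shard-{shard_id}" / segment_id


# Hypothetical root, group name, and segment id.
path = segment_dir("/data", "sw_metric", 0, "seg-20230831")
print(path)
```

Because each segment lives in its own directory, a retention policy can drop expired data by removing whole segment directories rather than deleting individual records.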

### 3.3 Query Nodes

Query Nodes do not store data. They handle the computational tasks associated with data queries, interacting directly with Data Nodes to execute queries and return results.

### 3.4 Liaison Nodes
### 3.3 Liaison Nodes

Liaison Nodes do not store data but manage the routing of incoming requests to the appropriate Data Nodes. They also provide authentication, TTL, and other security services.

They also handle the computational tasks associated with data queries, interacting directly with Data Nodes to execute queries and return results.

## 4. **Determining Optimal Node Counts**

When creating a BanyanDB cluster, choosing the appropriate number of each node type to configure and connect is crucial. The number of Meta Nodes should always be odd, for instance, “3”. The number of Data Nodes scales based on your storage and query needs. The number of Query Nodes and Liaison Nodes depends on the expected query load and routing complexity.
When creating a BanyanDB cluster, choosing the appropriate number of each node type to configure and connect is crucial. The number of Meta Nodes should always be odd, for instance, “3”. The number of Data Nodes scales based on your storage and query needs. The number of Liaison Nodes depends on the expected query load and routing complexity.
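
The preference for an odd number of Meta Nodes follows from majority-quorum arithmetic, which applies to etcd as a Raft-based store. The sketch below, with illustrative function names, shows that adding a fourth member raises the quorum size without tolerating any additional failure.

```python
def quorum(members):
    """Smallest majority of a cluster with the given number of members."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


for n in (3, 4, 5):
    print(n, quorum(n), tolerated_failures(n))
```

Three and four members both tolerate one failure, while five tolerate two, so an even count adds cost without adding fault tolerance.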

If the write and read loads come from different sources, it is recommended to deploy separate Liaison Nodes for writes and reads. For instance, the write load may come from metrics, trace, or log collectors, while the read load comes from a web application.

This separation allows for more efficient routing of requests and better performance. It also allows for scaling out of the cluster based on the specific needs of each type of request. For instance, if the write load is high, you can scale out the write Liaison Nodes to handle the increased load.

The BanyanDB architecture allows for efficient clustering, scaling, and high availability, making it a robust choice for time series data management.

@@ -142,15 +133,15 @@ This architecture allows BanyanDB to execute write requests efficiently across a

## 6. Queries in a Cluster

BanyanDB utilizes a distributed architecture that allows for efficient query processing. When a query is made, it is directed to a Query Node.
BanyanDB utilizes a distributed architecture that allows for efficient query processing. When a query is made, it is directed to a Liaison Node.

### 6.1 Query Routing

Query Nodes differ from Liaison Nodes in that they do not store shard mapping information from Meta Nodes. Instead, they access all Data Nodes to retrieve the necessary data for queries. As the query load is lower, it is practical for query nodes to access all data nodes for this purpose. It may increase network traffic, but simplifies scaling out of the cluster.
Liaison Nodes do not use shard mapping information from Meta Nodes to execute distributed queries. Instead, they access all Data Nodes to retrieve the necessary data for queries. As the query load is lower, it is practical for liaison nodes to access all data nodes for this purpose. It may increase network traffic, but simplifies scaling out of the cluster.

Compared to the write load, the query load is relatively low. For instance, in a time series database, the write load is typically 100x higher than the query load. This is because the write load is driven by the number of devices sending data to the database, while the query load is driven by the number of users accessing the data.

This strategy enables scaling out of the cluster. When the cluster scales out, the query node can access all data nodes without any mapping info changes. It eliminates the need to backup previous shard mapping information, reducing complexity of scaling out.
This strategy enables scaling out of the cluster. When the cluster scales out, the liaison node can access all data nodes without any mapping info changes. It eliminates the need to back up previous shard mapping information, reducing the complexity of scaling out.

### 6.2 Query Execution

@@ -167,16 +158,9 @@ User
|
v
------------------------------------
| Liaison Node | <--- Routes the User's Request
| (Routes the request to the Query Node)|
------------------------------------
|
| API Request (Query)
|
v
------------------------------------
| Query Node | <--- Stateless Node
| (Identify relevant Data Nodes) |
| Liaison Node | <--- Stateless Node, Distributes Query
| (Access all Data nodes to |
| execute distributed queries) |
------------------------------------
| | |
v v v
@@ -188,9 +172,8 @@ User
```

1. A user makes an API request to the Liaison Node. This request may be a query for specific data.
2. The Liaison Node routes the request to the appropriate Query Node.
3. The Query Node, which is stateless, select all data nodes.
4. The query is executed in parallel across all Data Nodes. Each Data Node processes the data stored in its shard concurrently with the others.
5. The results from each shard are then returned to the Query Node, which consolidates them into a single response to the user.
2. The Liaison Node builds a distributed query to select all data nodes.
3. The query is executed in parallel across all Data Nodes. Each Data Node executes a local query plan to process the data stored in its shard concurrently with the others.
4. The results from each shard are then returned to the Liaison Node, which consolidates them into a single response to the user.
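
The steps above amount to a scatter-gather pattern, which can be sketched as follows. Everything here is illustrative: the `DATA_NODES` dictionary stands in for real Data Nodes, `local_query` stands in for a local query plan, and the merge simply sorts rows by timestamp. It is not BanyanDB's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for Data Nodes: node name -> (timestamp, value) rows.
DATA_NODES = {
    "data-0": [(1, "a"), (4, "d")],
    "data-1": [(2, "b")],
    "data-2": [(3, "c"), (5, "e")],
}


def local_query(node, start, end):
    """Local plan on one Data Node: filter that node's own rows by time range."""
    return [row for row in DATA_NODES[node] if start <= row[0] <= end]


def distributed_query(start, end):
    """Liaison-side plan: fan out to every Data Node, then merge the partial results."""
    nodes = list(DATA_NODES)
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda n: local_query(n, start, end), nodes))
    merged = [row for part in partials for row in part]
    return sorted(merged)  # consolidate into a single time-ordered response


print(distributed_query(2, 4))
```

Note that the fan-out targets every node rather than consulting a shard map, which mirrors the routing choice described in section 6.1.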

This architecture allows BanyanDB to execute queries efficiently across a distributed system, leveraging the routing capabilities of the Liaison Node, the stateless nature of Query Nodes, and the parallel processing of Data Nodes.
This architecture allows BanyanDB to execute queries efficiently across a distributed system, leveraging the distributed query capabilities of the Liaison Node and the parallel processing of Data Nodes.
