Explain data structure change in 3.0 #1119

Merged
merged 3 commits into from
Feb 24, 2022
@@ -70,37 +70,37 @@ Therefore, Nebula Graph develops its own KVStore with RocksDB as the local stora

- One Nebula Graph KVStore cluster supports multiple graph spaces, and each graph space has its own partition number and replica copies. Different graph spaces are isolated physically from each other in the same cluster.

<!--
## Data storage formats
## Data storage structure

Nebula Graph stores vertices and edges. Efficient property filtering is critical for a Graph Database. So, Nebula Graph uses keys to store vertices and edges, while uses values to store the related properties.
Graphs consist of vertices and edges. Nebula Graph uses key-value pairs to store vertices, edges, and their properties. Vertices and edges are stored in keys, and their properties are stored in values. This structure enables efficient property filtering.

Nebula Graph {{ nebula.base20 }} has changed a lot over its releases. The following will introduce the old and new data storage formats and cover their differences.
- The storage structure of vertices

- Vertex format
Unlike Nebula Graph 2.x, version 3.x adds a new key for each vertex. The old key still exists; compared with it, the new key has no `TagID` field and no value. Because of the new key, vertices in Nebula Graph can now exist without tags.

![The vertex format of storage service](https://docs-cdn.nebula-graph.com.cn/docs-2.0/1.introduction/2.nebula-graph-architecture/storage-vertex-format.png)
![The vertex structure of Nebula Graph](https://github.com/vesoft-inc/nebula-docs-cn/blob/{{nebula.branch}}/docs-2.0/1.introduction/3.nebula-graph-architecture/3.0-vertex-key.png?raw=true)

|Field|Description|
|:---|:---|
|`Type`|One byte, used to indicate the key type.|
|`PartID`|Three bytes, used to indicate the sharding partition and to scan the partition data based on the prefix when re-balancing the partition.|
|`VertexID`|Used to indicate vertex ID. For an integer VertexID, it occupies eight bytes. However, for a string VertexID, it is changed to `fixed_string` of a fixed length which needs to be specified by users when they create the space.|
|`VertexID`|The vertex ID. An integer VertexID occupies eight bytes. A string VertexID is stored as a `fixed_string` whose fixed length must be specified by users when they create the space.|
|`TagID`|Four bytes, used to indicate the tag that the vertex is associated with.|
|`SerializedValue`|The serialized value of the key. It stores the property information of the vertex.|
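The field widths in the table above can be made concrete with a small sketch. This is only an illustration of the layout, not the storage engine's actual encoder: the key-type constants, the little-endian byte order, the zero-padding of string IDs, and the function names (`tag_key`, `vertex_key`) are all assumptions introduced for the example.

```python
import struct

# Hypothetical key-type markers; the real values are internal to the storage engine.
KEY_TYPE_TAG = 0x01     # old-style key: carries a TagID and has a value
KEY_TYPE_VERTEX = 0x07  # new 3.x key: no TagID and no value


def _prefix(key_type: int, part_id: int) -> bytes:
    """Type (1 byte) followed by PartID (3 bytes). Byte order is an assumption."""
    if not 0 <= part_id < 2 ** 24:
        raise ValueError("PartID must fit in three bytes")
    return bytes([key_type]) + part_id.to_bytes(3, "little")


def _vid_bytes(vid, vid_len: int = 8) -> bytes:
    """Integer VIDs take eight bytes; string VIDs are padded to a fixed length."""
    if isinstance(vid, int):
        return struct.pack("<q", vid)
    raw = vid.encode("utf-8")
    if len(raw) > vid_len:
        raise ValueError("string VID is longer than the fixed length of the space")
    return raw.ljust(vid_len, b"\x00")  # zero-padding is an assumption


def tag_key(part_id: int, vid, tag_id: int, vid_len: int = 8) -> bytes:
    """Type | PartID | VertexID | TagID (4 bytes)."""
    return _prefix(KEY_TYPE_TAG, part_id) + _vid_bytes(vid, vid_len) + struct.pack("<i", tag_id)


def vertex_key(part_id: int, vid, vid_len: int = 8) -> bytes:
    """Type | PartID | VertexID, with no TagID field."""
    return _prefix(KEY_TYPE_VERTEX, part_id) + _vid_bytes(vid, vid_len)
```

Under these assumptions, an integer-VID tag key is 16 bytes and the tag-less vertex key is 12 bytes. Because both start with `Type` and `PartID`, all keys of one partition share a common prefix, which is what makes prefix scans during re-balancing possible.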

- Edge Format
- The storage structure of edges

![The edge format of storage service](https://docs-cdn.nebula-graph.com.cn/docs-2.0/1.introduction/2.nebula-graph-architecture/storage-edge-format.png)
![The edge structure of Nebula Graph](https://github.com/vesoft-inc/nebula-docs-cn/blob/{{nebula.branch}}/docs-2.0/1.introduction/3.nebula-graph-architecture/3.0-edge-key.png?raw=true)

|Field|Description|
|:---|:---|
|`Type`|One byte, used to indicate the key type.|
|`PartID`|Three bytes, used to indicate the sharding partition. This field can be used to scan the partition data based on the prefix when re-balancing the partition.|
|`VertexID`|Used to indicate vertex ID. The former VID refers to source VID in out-edge and dest VID in in-edge, while the latter VID refers to dest VID in out-edge and source VID in in-edge.|
|`Edge Type`|Four bytes, used to indicate edge type. Greater than zero means out-edge, less than zero means in-edge.|
|`PartID`|Three bytes, used to indicate the partition ID. This field can be used to scan the partition data based on the prefix when re-balancing the partition.|
|`VertexID`|Used to indicate vertex ID. The former VID refers to the source VID in the outgoing edge and the dest VID in the incoming edge, while the latter VID refers to the dest VID in the outgoing edge and the source VID in the incoming edge.|
|`Edge Type`|Four bytes, used to indicate the edge type. A value greater than zero indicates an outgoing edge, and a value less than zero indicates an incoming edge.|
|`Rank`|Eight bytes, used to indicate multiple edges in one edge type. Users can set the field based on needs and store weight, such as transaction time and transaction number.|
|`PlaceHolder`|One byte. Reserved.|
-->
|`SerializedValue`|The serialized value of the key. It stores the property information of the edge.|
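As a rough illustration of the edge key fields, the sketch below builds the outgoing and incoming keys for one logical edge with integer VIDs. It is a sketch under stated assumptions rather than the engine's encoder: the position of the second VID (placed after `Rank` here), the little-endian byte order, the key-type constant, the placeholder byte value, and the function names are all hypothetical.

```python
import struct

KEY_TYPE_EDGE = 0x02  # hypothetical key-type marker
PLACEHOLDER = b"\x00"  # reserved byte; its actual value is not specified here


def edge_key(part_id: int, first_vid: int, edge_type: int, rank: int, second_vid: int) -> bytes:
    """Type(1) | PartID(3) | VID(8) | EdgeType(4) | Rank(8) | VID(8) | PlaceHolder(1)."""
    return (
        bytes([KEY_TYPE_EDGE])
        + part_id.to_bytes(3, "little")
        + struct.pack("<q", first_vid)
        + struct.pack("<i", edge_type)
        + struct.pack("<q", rank)
        + struct.pack("<q", second_vid)
        + PLACEHOLDER
    )


def edge_key_pair(src_part: int, dst_part: int, src: int, dst: int, edge_type: int, rank: int = 0):
    """Return (out_key, in_key) for one logical edge.

    The outgoing key leads with the source VID and a positive edge type;
    the incoming key leads with the destination VID and the negated edge type.
    """
    if edge_type <= 0:
        raise ValueError("pass the positive edge type; the incoming key negates it")
    out_key = edge_key(src_part, src, edge_type, rank, dst)
    in_key = edge_key(dst_part, dst, -edge_type, rank, src)
    return out_key, in_key
```

With eight-byte VIDs, each key in this sketch is 33 bytes long. The sign of the four-byte edge type is the only thing that tells an outgoing key from an incoming one, and the swapped VID order lets each direction be scanned by the prefix of the vertex it starts from.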

### Property descriptions

@@ -114,12 +114,11 @@ Since in an ultra-large-scale relational network, vertices can be as many as ten

![data partitioning](https://www-cdn.nebula-graph.com.cn/nebula-blog/DataModel02.png)

<!--
### Edge and storage amplification
### Edge partitioning and storage amplification

In Nebula Graph, an edge corresponds to two key-value pairs on the hard disk. When there are lots of edges and each has many properties, storage amplification will be obvious. The storage format of edges is shown in the figure below.

![edge storage](https://docs-cdn.nebula-graph.com.cn/docs-2.0/1.introduction/2.nebula-graph-architecture/two-edge-format.png)
![partitioning by edge](https://github.com/vesoft-inc/nebula-docs-cn/blob/{{nebula.branch}}/docs-2.0/1.introduction/3.nebula-graph-architecture/edge-division.png?raw=true)

In this example, SrcVertex connects DstVertex via EdgeA, forming the path of `(SrcVertex)-[EdgeA]->(DstVertex)`. SrcVertex, DstVertex, and EdgeA will all be stored in Partition x and Partition y as four key-value pairs in the storage layer. Details are as follows:
