Update 2.graph-modeling.md #731

Merged
merged 3 commits into from
Sep 17, 2021
26 changes: 16 additions & 10 deletions docs-2.0/8.service-tuning/2.graph-modeling.md
@@ -1,6 +1,6 @@
# Graph data modeling suggestions

-This section provides general suggestions for modeling data in Nebula Graph.
+This topic provides general suggestions for modeling data in Nebula Graph.

!!! note

@@ -24,6 +24,8 @@ While creating Tags or Edge types, you need to define a set of properties. Prope

### Control changes in the business model and the data model

+Changes here refer to changes in business models and data models (meta-information), not changes in the data itself.
+
Some graph databases are designed to be Schema-free, so their data modeling, including the modeling of the graph topology and properties, can be very flexible. Properties can be re-modeled to graph topology, and vice versa. Such systems are often specifically optimized for graph topology access.

Nebula Graph {{ nebula.release }} is a strong-Schema (row storage) system, which means that the business model should not change frequently. For example, the property Schema should not change. It is similar to avoiding `ALTER TABLE` in MySQL.
@@ -34,37 +36,41 @@ For example, in a business model, people have relatively fixed properties such a

### Breadth-first traversal over depth-first traversal

-Nebula Graph has lower performance for depth-first traversal based on the Graph topology, and better performance for breadth-first traversal and obtaining properties. For example, if model A contains properties "name", "age", and "eye color", it is recommended to create a Tag `person` and add properties `name`, `age`, and `eye_color` to it. If you create a Tag `eye_color` and an Edge type `has`, and then create an edge to represent the eye color owned by the person, the traversal performance will not be high.
+- Nebula Graph has lower performance for depth-first traversal based on the Graph topology, and better performance for breadth-first traversal and obtaining properties. For example, if model A contains properties "name", "age", and "eye color", it is recommended to create a tag `person` and add properties `name`, `age`, and `eye_color` to it. If you create a tag `eye_color` and an edge type `has`, and then create an edge to represent the eye color owned by the person, the traversal performance will not be high.

-The performance of finding an edge by an edge property is close to that of finding a vertex by a vertex property. For some databases, it is recommended to re-model edge properties as those of the intermediate vertices. For example, model the pattern `(src)-[edge {P1, P2}]->(dst)` as `(src)-[edge1]->(i_node {P1, P2})-[edge2]->(dst)`. With Nebula Graph {{ nebula.release }}, you can use `(src)-[edge {P1, P2}]->(dst)` directly to decrease the depth of the traversal and increase the performance.
+- The performance of finding an edge by an edge property is close to that of finding a vertex by a vertex property. For some databases, it is recommended to re-model edge properties as those of the intermediate vertices. For example, model the pattern `(src)-[edge {P1, P2}]->(dst)` as `(src)-[edge1]->(i_node {P1, P2})-[edge2]->(dst)`. With Nebula Graph {{ nebula.release }}, you can use `(src)-[edge {P1, P2}]->(dst)` directly to decrease the depth of the traversal and increase the performance.
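
The two suggestions above might look like the following in nGQL. This is only a sketch: the VID `"p1"`, the edge type `knows`, and its properties `since` and `weight` are hypothetical, and the statements assume a graph space with string VIDs has already been created and selected.

```ngql
# Keep the properties on one tag instead of modeling eye color as a separate vertex.
CREATE TAG IF NOT EXISTS person(name string, age int, eye_color string);

# The eye color is read together with the vertex; no extra hop is needed.
FETCH PROP ON person "p1" YIELD person.name, person.eye_color;

# Keep P1 and P2 on the edge itself instead of on an intermediate vertex.
# `knows`, `since`, and `weight` are illustrative names only.
CREATE EDGE IF NOT EXISTS knows(since int, weight double);
GO FROM "p1" OVER knows WHERE knows.since > 2015 YIELD knows._dst, knows.weight;
```

Schema statements typically take effect only after a short delay, so the `CREATE` statements and the queries may need to be run separately.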

### Edge directions

-To query in the opposite direction of an edge, use the syntax `(dst)<-[edge]-(src)` or `GO FROM dst REVERSELY`.
+To query in the opposite direction of an edge, use the following syntax:
+
+`(dst)<-[edge]-(src)` or `GO FROM dst REVERSELY`.
+
+If you do not care about the directions or want to query against both directions, use the following syntax:

-If you don't care about the directions or want to query against both directions, use the syntax `(src)-[edge]-(dst)` or `GO FROM src BIDIRECT`.
+`(src)-[edge]-(dst)` or `GO FROM src BIDIRECT`.

Therefore, there is no need to insert the same edge redundantly in the reversed direction.
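
As a rough illustration of the direction syntax, the queries below assume a hypothetical edge type `follow` and a vertex whose VID is `"player100"`; neither name comes from this topic. How `_dst` is reported for edges walked in reverse is worth checking against the `GO` reference for your release.

```ngql
# Walk the incoming `follow` edges of the vertex.
GO FROM "player100" OVER follow REVERSELY YIELD follow._dst AS neighbor;
MATCH (v)<-[e:follow]-(v2) WHERE id(v) == "player100" RETURN v2;

# Walk `follow` edges in both directions.
GO FROM "player100" OVER follow BIDIRECT YIELD follow._dst AS neighbor;
MATCH (v)-[e:follow]-(v2) WHERE id(v) == "player100" RETURN v2;
```

In each case a single stored edge serves both directions, which is why a second, reversed copy is unnecessary.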

-### Set Tag properties appropriately
+### Set tag properties appropriately

-Put a group of properties that are on the same level into the same Tag. Different groups represent different concepts.
+Put a group of properties that are on the same level into the same tag. Different groups represent different concepts.

### Use indexes correctly

-Using property indexes helps find VIDs through properties, but can lead to performance decrease by 90% or even more. Only use an index when you need to find vertices or edges through their properties.
+Using property indexes helps find VIDs through properties, but can lead to performance reduction by 90% or even more. Only use an index when you need to find vertices or edges through their properties.
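
A minimal sketch of the index workflow, assuming the `person` tag from the earlier example exists; the index name `person_name_idx`, the prefix length `20`, and the values `"Alice"` and `"p1"` are hypothetical.

```ngql
# Create the index only because lookups by name are actually needed.
CREATE TAG INDEX IF NOT EXISTS person_name_idx ON person(name(20));

# Index data that was written before the index existed.
REBUILD TAG INDEX person_name_idx;

# Find vertices through a property; this is the case an index is for.
LOOKUP ON person WHERE person.name == "Alice" YIELD person.age;

# When the VID is already known, no index is involved.
FETCH PROP ON person "p1" YIELD person.name, person.age;
```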

### Design VIDs appropriately

See [VID](../1.introduction/3.vid.md).

### Long texts

-Do not use long texts to create edge properties. Edge properties are stored twice and long texts lead to greater write amplification. For how edges properties are stored, see [Storage architecture](../1.introduction/3.nebula-graph-architecture/4.storage-service.md). It is recommended to store long texts in HBase or Elasticsearch and store its address in Nehula Graph.
+Do not use long texts to create edge properties. Edge properties are stored twice and long texts lead to greater write amplification. For how edges properties are stored, see [Storage architecture](../1.introduction/3.nebula-graph-architecture/4.storage-service.md). It is recommended to store long texts in HBase or Elasticsearch and store its address in Nebula Graph.

## Dynamic graphs (sequence graphs) are not supported

-In some scenarios, graphs need to have time information to describe how the structure of the entire graph changes over time.[^twitter]
+In some scenarios, graphs need to have the time information to describe how the structure of the entire graph changes over time.[^twitter]

The Rank field on Edges in Nebula Graph {{ nebula.release }} can be used to store time in int64, but no field on vertices can do this because if you store the time information as property values, it will be covered by new insertion. Thus Nebula Graph does not support sequence graphs.
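
One way to picture the Rank field holding time is the sketch below. It assumes a hypothetical edge type `follow` with an `int` property `degree` and hypothetical VIDs `"p1"` and `"p2"`; the ranks are Unix timestamps in seconds, which is only one possible int64 encoding.

```ngql
CREATE EDGE IF NOT EXISTS follow(degree int);

# Two edges between the same pair of vertices, distinguished only by rank.
# 1631836800 is 2021-09-17 00:00:00 UTC; 1631923200 is one day later.
INSERT EDGE follow(degree) VALUES "p1"->"p2"@1631836800:(90);
INSERT EDGE follow(degree) VALUES "p1"->"p2"@1631923200:(95);

# Each version of the edge can be read back by its rank.
FETCH PROP ON follow "p1"->"p2"@1631836800 YIELD follow.degree;
```

Vertices have no comparable field, so a second `INSERT VERTEX` with the same VID and tag replaces the earlier property values rather than adding a new version.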
