2 changes: 1 addition & 1 deletion docs/docs/core/flow_def.mdx
@@ -348,7 +348,7 @@ It will use `Staging__doc_embeddings` as the collection name if the current app
Most of the time, a target storage is created by calling the `export()` method on a collector, and this `export()` call carries the configuration the target storage needs, e.g. options for storage indexes.
Occasionally, you may need to specify some configuration for a target storage outside the context of any specific data collector.

-For example, for graph database targets like `Neo4j`, you may have a data collector to export data to Neo4j relationships, which will create nodes referenced by various relationships in turn.
+For example, for graph database targets like `Neo4j` and `Kuzu`, you may have a data collector to export data to relationships, which will create nodes referenced by various relationships in turn.
These nodes don't come directly from any specific data collector (relationships from different data collectors may share the same nodes).
To specify configurations for these nodes, you can *declare* a spec for the related node labels.
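
To illustrate why such nodes need a declaration of their own, here is a minimal sketch in plain Python. It does not use the CocoIndex API: `Relationship` and `referenced_nodes` are hypothetical names introduced only for this illustration, and the sample rows are made up. It shows two collectors whose relationships reference an overlapping set of nodes, so no single collector "owns" the node label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Hypothetical stand-in for one exported relationship row."""
    rel_type: str
    source_label: str
    source_key: str
    target_label: str
    target_key: str


def referenced_nodes(relationships):
    """Return the distinct (label, key) pairs referenced by any relationship."""
    nodes = set()
    for rel in relationships:
        nodes.add((rel.source_label, rel.source_key))
        nodes.add((rel.target_label, rel.target_key))
    return nodes


# Two hypothetical collectors; both reference the node ("Entity", "CocoIndex").
from_collector_a = [Relationship("MENTION", "Document", "doc1", "Entity", "CocoIndex")]
from_collector_b = [Relationship("RELATED_TO", "Entity", "CocoIndex", "Entity", "Neo4j")]

nodes = referenced_nodes(from_collector_a + from_collector_b)
# Labels that would each need a declared spec (e.g. primary key fields).
labels = sorted({label for label, _ in nodes})
print(labels)      # ['Document', 'Entity']
print(len(nodes))  # 3
```

Because the shared `Entity` node is reachable from both collectors, its configuration is attached to the label rather than to either `export()` call.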

34 changes: 31 additions & 3 deletions docs/docs/ops/storages.md
@@ -391,6 +391,12 @@ graph TD
classDef node font-size:8pt,text-align:left,stroke-width:2;
```

#### Examples

End-to-end examples that fit any of the supported property graph targets are in the following directories:
* [examples/docs_to_knowledge_graph](https://github.com/cocoindex-io/cocoindex/tree/main/examples/docs_to_knowledge_graph)
* [examples/product_recommendation](https://github.com/cocoindex-io/cocoindex/tree/main/examples/product_recommendation)

### Neo4j

If you don't have a Neo4j database, you can start a Neo4j database using our docker compose config:
@@ -407,14 +413,14 @@ Please read and agree the license before starting the instance.

:::

-The `Neo4j` storage exports each row as a relationship to Neo4j Knowledge Graph. The spec takes the following fields:
+The `Neo4j` target spec takes the following fields:

* `connection` (type: [auth reference](../core/flow_def#auth-registry) to `Neo4jConnectionSpec`): The connection to the Neo4j database. `Neo4jConnectionSpec` has the following fields:
    * `url` (type: `str`): The URI of the Neo4j database to use as the internal storage, e.g. `bolt://localhost:7687`.
    * `user` (type: `str`): Username for the Neo4j database.
    * `password` (type: `str`): Password for the Neo4j database.
    * `db` (type: `str`, optional): The name of the Neo4j database to use as the internal storage, e.g. `neo4j`.
-* `mapping` (type: `NodeMapping | RelationshipMapping`): The mapping from collected row to nodes or relationships of the graph. 2 variations are supported:
+* `mapping` (type: `Nodes | Relationships`): The mapping from a collected row to nodes or relationships of the graph, given as either [nodes to export](#nodes-to-export) or [relationships to export](#relationships-to-export).
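
The connection fields listed above can be sketched as a small data class. This is only an illustration in plain Python, not the CocoIndex API: `ConnectionSketch`, its `check()` method, and the placeholder credentials are hypothetical, and the scheme check is an extra sanity test added for this example, assuming the URI schemes commonly accepted by Neo4j drivers.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse

# URI schemes commonly accepted by Neo4j drivers (assumption of this sketch).
NEO4J_SCHEMES = {"bolt", "bolt+s", "bolt+ssc", "neo4j", "neo4j+s", "neo4j+ssc"}


@dataclass
class ConnectionSketch:
    """Hypothetical stand-in mirroring the documented `Neo4jConnectionSpec` fields."""
    url: str
    user: str
    password: str
    db: Optional[str] = None  # optional, as in the field list

    def check(self) -> None:
        """Raise ValueError if the URL does not look like a Neo4j URI."""
        parsed = urlparse(self.url)
        if parsed.scheme not in NEO4J_SCHEMES or not parsed.hostname:
            raise ValueError(f"unexpected Neo4j URL: {self.url!r}")


# Placeholder credentials; replace with your own.
conn = ConnectionSketch(url="bolt://localhost:7687", user="neo4j", password="change-me")
conn.check()  # passes for the example URL from the field list
```

In a real flow the connection is passed by auth reference rather than inline, so the values above would live in the auth registry.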

Neo4j also provides a declaration spec `Neo4jDeclaration`, to configure indexing options for nodes only referenced by relationships. It has the following fields:

@@ -424,4 +430,26 @@ Neo4j also provides a declaration spec `Neo4jDeclaration`, to configure indexing
    * `primary_key_fields` (required)
    * `vector_indexes` (optional)

-You can find an end-to-end example [here](https://github.com/cocoindex-io/cocoindex/tree/main/examples/docs_to_knowledge_graph).
### Kuzu

CocoIndex supports talking to Kuzu through its [API server](https://github.com/kuzudb/api-server).
You can bring up a Kuzu API server locally by running:

```bash
KUZU_DB_DIR=$HOME/.kuzudb
KUZU_PORT=8123
docker run -d --name kuzu -p ${KUZU_PORT}:8000 -v ${KUZU_DB_DIR}:/database kuzudb/api-server:latest
```

The `Kuzu` target spec takes the following fields:

* `connection` (type: [auth reference](../core/flow_def#auth-registry) to `KuzuConnectionSpec`): The connection to the Kuzu database. `KuzuConnectionSpec` has the following fields:
    * `api_server_url` (type: `str`): The URL of the Kuzu API server, e.g. `http://localhost:8123`.
* `mapping` (type: `Nodes | Relationships`): The mapping from a collected row to nodes or relationships of the graph, given as either [nodes to export](#nodes-to-export) or [relationships to export](#relationships-to-export).
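
The `api_server_url` value follows from the port mapping in the `docker run` command: the container listens on port 8000, published on the host as `KUZU_PORT`. A minimal sketch of deriving the URL is below. The helper `kuzu_api_server_url` is hypothetical and not part of CocoIndex, and it assumes `KUZU_PORT` has been exported to the environment (the shell snippet only sets it as a shell variable), falling back to 8123 otherwise.

```python
import os


def kuzu_api_server_url(env=None, host="localhost", default_port=8123):
    """Build the Kuzu API server URL from the host-side port.

    `env` defaults to the process environment; pass a dict to override it.
    """
    env = os.environ if env is None else env
    port = int(env.get("KUZU_PORT", default_port))
    return f"http://{host}:{port}"


url = kuzu_api_server_url({"KUZU_PORT": "8123"})
print(url)  # http://localhost:8123
```

If you publish the server on a different host port, the same value must be used in both the `-p` flag and `api_server_url`.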

Kuzu also provides a declaration spec `KuzuDeclaration`, to configure indexing options for nodes only referenced by relationships. It has the following fields:

* `connection` (type: auth reference to `KuzuConnectionSpec`)
* Fields for [nodes to declare](#declare-extra-node-labels), including
    * `nodes_label` (required)
    * `primary_key_fields` (required)