From af80a59178fa90abe0411156d6a0d87c236115b0 Mon Sep 17 00:00:00 2001
From: Linghua Jin
Date: Fri, 3 Oct 2025 10:13:35 -0700
Subject: [PATCH 1/6] [Documentation] fix links

---
 docs/docs/core/flow_methods.mdx | 2 +-
 docs/docs/sources/index.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/docs/core/flow_methods.mdx b/docs/docs/core/flow_methods.mdx
index 35682185..c9098689 100644
--- a/docs/docs/core/flow_methods.mdx
+++ b/docs/docs/core/flow_methods.mdx
@@ -210,7 +210,7 @@ A data source may enable one or multiple *change capture mechanisms*:

 * Configured with a [refresh interval](flow_def#refresh-interval), which is generally applicable to all data sources.
 * Specific data sources also provide their specific change capture mechanisms.
- For example, [`Postgres` source](../ops/sources/#postgres) listens to PostgreSQL's change notifications, [`AmazonS3` source](../ops/sources/#amazons3) watches S3 bucket's change events, and [`GoogleDrive` source](../ops/sources#googledrive) allows polling recent modified files.
+ For example, [`Postgres` source](../sources/#postgres) listens to PostgreSQL's change notifications, [`AmazonS3` source](../sources/#amazons3) watches S3 bucket's change events, and [`GoogleDrive` source](../sources#googledrive) allows polling recent modified files.
 See documentations for specific data sources.

 Change capture mechanisms enable CocoIndex to continuously capture changes from the source data and update the target data accordingly, under live update mode.
diff --git a/docs/docs/sources/index.md b/docs/docs/sources/index.md
index bce063e9..ce21a19e 100644
--- a/docs/docs/sources/index.md
+++ b/docs/docs/sources/index.md
@@ -262,7 +262,7 @@ The spec takes the following fields:
 :::info

 Since it only retrieves metadata for recent modified files (up to the previous poll) during polling,
- it's typically cheaper than a full refresh by setting the [refresh interval](../core/flow_def#refresh-interval) especially when the folder contains a large number of files.
+ it's typically cheaper than a full refresh by setting the [refresh interval](/docs/core/flow_def#refresh-interval) especially when the folder contains a large number of files.
 So you can usually set it with a smaller value compared to the `refresh_interval`.

 On the other hand, this only detects changes for files that still exist.

From 3cb341bdb127d568f166a5bcb6913f5838d3568a Mon Sep 17 00:00:00 2001
From: Linghua Jin
Date: Fri, 3 Oct 2025 10:24:58 -0700
Subject: [PATCH 2/6] [Documentation] source overview

---
 docs/docs/sources/index.md | 17 +++++++++++++++++
 docs/docs/targets/index.md | 2 ++
 2 files changed, 19 insertions(+)

diff --git a/docs/docs/sources/index.md b/docs/docs/sources/index.md
index ce21a19e..d0e1d8d5 100644
--- a/docs/docs/sources/index.md
+++ b/docs/docs/sources/index.md
@@ -6,6 +6,23 @@ description: CocoIndex Built-in Sources

 # CocoIndex Built-in Sources

+In CocoIndex, a source is the data origin you import from (e.g., files, databases, APIs) that feeds into an indexing flow for transformation and retrieval.
+
+| Source Type | See Also |
+|------------------|-------------------------|
+| LocalFile | Local File System |
+| AmazonS3 | Amazon S3 |
+| AzureBlob | Azure Blob Storage |
+| GoogleDrive | Google Drive |
+| Postgres | PostgreSQL |
+
+Related:
+- [Life cycle of a indexing flow](/docs/core/basics#life-cycle-of-an-indexing-flow)
+- [Live Update Tutorial](/docs/tutorials/live_updates)
+for change capture mechanisms.
+
+
+
 ## LocalFile

 The `LocalFile` source imports files from a local file system.
diff --git a/docs/docs/targets/index.md b/docs/docs/targets/index.md
index 7915fe7f..3d3caea4 100644
--- a/docs/docs/targets/index.md
+++ b/docs/docs/targets/index.md
@@ -20,6 +20,8 @@ The way to map data from a data collector to a target depends on data model of t
 | [Neo4j](/docs/targets/neo4j) | [Property graph](#property-graph-targets) |
 | [Kuzu](/docs/targets/kuzu) | [Property graph](#property-graph-targets) |

+If you are looking for targets beyond here, you can always use [custom targets](/docs/custom_ops/custom_targets) as building blocks.
+
 ## Property Graph Targets

 Property graph is a widely-adopted model for knowledge graphs, where both nodes and relationships can have properties.

From 45d31e3e8d2ccf364a540c578ebbc619fc10db9f Mon Sep 17 00:00:00 2001
From: Linghua Jin
Date: Fri, 3 Oct 2025 11:36:15 -0700
Subject: [PATCH 3/6] Update index.md

---
 docs/docs/sources/index.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/docs/sources/index.md b/docs/docs/sources/index.md
index d0e1d8d5..1db2d69e 100644
--- a/docs/docs/sources/index.md
+++ b/docs/docs/sources/index.md
@@ -10,11 +10,11 @@ In CocoIndex, a source is the data origin you import from (e.g., files, database

 | Source Type | See Also |
 |------------------|-------------------------|
-| LocalFile | Local File System |
-| AmazonS3 | Amazon S3 |
-| AzureBlob | Azure Blob Storage |
-| GoogleDrive | Google Drive |
-| Postgres | PostgreSQL |
+| LocalFile | [Local File System](/docs/sources#localfile) |
+| AmazonS3 | [Amazon S3](/docs/sources#amazons3) |
+| AzureBlob | [Azure Blob Storage](/docs/sources#azureblob) |
+| GoogleDrive | [Google Drive](/docs/sources#googledrive) |
+| Postgres | [PostgreSQL](/docs/sources#postgres) |

 Related:
 - [Life cycle of a indexing flow](/docs/core/basics#life-cycle-of-an-indexing-flow)

From 3a6bcb2fa78e6df82db632d1e227550ede0baa6a Mon Sep 17 00:00:00 2001
From: Linghua Jin
Date: Fri, 3 Oct 2025 12:00:46 -0700
Subject: [PATCH 4/6] Update index.md

---
 docs/docs/sources/index.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/docs/sources/index.md b/docs/docs/sources/index.md
index 1db2d69e..43ae707e 100644
--- a/docs/docs/sources/index.md
+++ b/docs/docs/sources/index.md
@@ -8,13 +8,13 @@ description: CocoIndex Built-in Sources

 In CocoIndex, a source is the data origin you import from (e.g., files, databases, APIs) that feeds into an indexing flow for transformation and retrieval.

-| Source Type | See Also |
-|------------------|-------------------------|
-| LocalFile | [Local File System](/docs/sources#localfile) |
-| AmazonS3 | [Amazon S3](/docs/sources#amazons3) |
-| AzureBlob | [Azure Blob Storage](/docs/sources#azureblob) |
-| GoogleDrive | [Google Drive](/docs/sources#googledrive) |
-| Postgres | [PostgreSQL](/docs/sources#postgres) |
+| Source Type | Description |
+|----------------|------------------------------------|
+| [LocalFile](/docs/sources#localfile) | Local file system |
+| [AmazonS3](/docs/sources#amazons3) | Object store (Amazon S3 bucket) |
+| [AzureBlob](/docs/sources#azureblob) | Object store (Azure Blob Storage) |
+| [GoogleDrive](/docs/sources#googledrive) | Cloud file system (Google Drive) |
+| [Postgres](/docs/sources#postgres) | Relational database (Postgres) |

 Related:
 - [Life cycle of a indexing flow](/docs/core/basics#life-cycle-of-an-indexing-flow)

From aaa3390b185bd40a793e66df213492c66306bd60 Mon Sep 17 00:00:00 2001
From: Linghua Jin
Date: Fri, 3 Oct 2025 12:09:18 -0700
Subject: [PATCH 5/6] Update index.md

---
 docs/docs/sources/index.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/docs/sources/index.md b/docs/docs/sources/index.md
index 43ae707e..3b4476dd 100644
--- a/docs/docs/sources/index.md
+++ b/docs/docs/sources/index.md
@@ -8,7 +8,7 @@ description: CocoIndex Built-in Sources

 In CocoIndex, a source is the data origin you import from (e.g., files, databases, APIs) that feeds into an indexing flow for transformation and retrieval.

-| Source Type | Description |
+| Source Type | See Also |
 |----------------|------------------------------------|
 | [LocalFile](/docs/sources#localfile) | Local file system |
 | [AmazonS3](/docs/sources#amazons3) | Object store (Amazon S3 bucket) |

From e5962b5786fc15ead64def2b3d7b28017236c123 Mon Sep 17 00:00:00 2001
From: Linghua Jin
Date: Fri, 3 Oct 2025 12:22:13 -0700
Subject: [PATCH 6/6] update column

---
 docs/docs/sources/index.md | 2 +-
 docs/docs/targets/index.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/docs/sources/index.md b/docs/docs/sources/index.md
index 3b4476dd..43ae707e 100644
--- a/docs/docs/sources/index.md
+++ b/docs/docs/sources/index.md
@@ -8,7 +8,7 @@ description: CocoIndex Built-in Sources

 In CocoIndex, a source is the data origin you import from (e.g., files, databases, APIs) that feeds into an indexing flow for transformation and retrieval.

-| Source Type | See Also |
+| Source Type | Description |
 |----------------|------------------------------------|
 | [LocalFile](/docs/sources#localfile) | Local file system |
 | [AmazonS3](/docs/sources#amazons3) | Object store (Amazon S3 bucket) |
diff --git a/docs/docs/targets/index.md b/docs/docs/targets/index.md
index 3d3caea4..36d117b7 100644
--- a/docs/docs/targets/index.md
+++ b/docs/docs/targets/index.md
@@ -12,7 +12,7 @@ The way to map data from a data collector to a target depends on data model of t

 ## Targets Overview

-| Target Type | See Also |
+| Target Type | Description |
 |------------------|-------------------------|
 | [Postgres](/docs/targets/postgres) | Relational Database, Vector Search (PGVector) |
 | [Qdrant](/docs/targets/qdrant) | Vector Database, Keyword Search |