From d5b7a817ddd9bd4ae66d4f4d069d066a111216c0 Mon Sep 17 00:00:00 2001
From: LJ
Date: Wed, 26 Mar 2025 08:54:06 -0700
Subject: [PATCH] Add documents for UUID automatic generation.

---
 docs/docs/core/flow_def.mdx | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/docs/docs/core/flow_def.mdx b/docs/docs/core/flow_def.mdx
index 9f4e5d7c..d8fbd348 100644
--- a/docs/docs/core/flow_def.mdx
+++ b/docs/docs/core/flow_def.mdx
@@ -167,8 +167,22 @@ If the data slice has `Struct` type, you can obtain a data slice on a specific s

 ## Data Collector

-A **data collector** can be added from a specific data scope, and it collects multiple entries of data, represented by data slices in the same or children scope.
+A **data collector** can be added from a specific data scope, and it collects multiple entries of data from the same or children scope.
+
+### Collect
+
 Call its `collect()` method to collect a specific entry, which can have multiple fields.
+Each field has a name as specified by the argument name, and a value in one of the following representations:
+
+* A `DataSlice`.
+* An enum `cocoindex.GeneratedField.UUID` indicating its value is a UUID automatically generated by the engine.
+  The UUID will remain stable when other collected input values are unchanged.
+
+  :::note
+
+  An automatically generated UUID field is allowed to appear at most once.
+
+  :::

 For example,

@@ -182,17 +196,18 @@ def demo_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataSco
     demo_collector = data_scope.add_collector()
     with data_scope["documents"].row() as document:
         ...
-        demo_collector.collect(filename=document["filename"], summary=document["summary"])
-
+        demo_collector.collect(id=cocoindex.GeneratedField.UUID,
+                               filename=document["filename"],
+                               summary=document["summary"])
     ...
 ```

-Here the collector is in the top-level data scope, and it collects `filename` and `summary` fields from each row of `documents` field.
-
-To specify what to do with the collected data, call a specific method on the collector.
+Here the collector is in the top-level data scope.
+It collects `filename` and `summary` fields from each row of `documents`,
+and generates an `id` field with a UUID that remains stable when `filename` and `summary` are unchanged.

 ### Export
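
Note on the stability property documented in this patch: the text says a generated UUID stays the same as long as the other collected values are unchanged, but it does not say how the engine derives it. The sketch below illustrates that property only, assuming a name-based UUID (`uuid.uuid5`) computed over the collected fields. `DEMO_NAMESPACE` and `stable_row_id` are hypothetical names for this illustration, not part of the `cocoindex` API, and the engine's actual scheme may differ.

```python
import json
import uuid

# Hypothetical namespace for this illustration; not a cocoindex constant.
DEMO_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def stable_row_id(**fields: str) -> uuid.UUID:
    """Derive a UUID from the collected field values (illustration only)."""
    # Serialize with sorted keys so keyword-argument order does not matter.
    canonical = json.dumps(fields, sort_keys=True)
    # uuid5 is deterministic: same namespace and name give the same UUID.
    return uuid.uuid5(DEMO_NAMESPACE, canonical)


# Same field values, in either order, give the same id.
first = stable_row_id(filename="a.md", summary="Intro")
again = stable_row_id(summary="Intro", filename="a.md")

# Changing any collected value gives a different id.
changed = stable_row_id(filename="a.md", summary="Intro, revised")
```

Under this assumption, `first == again` holds while `first != changed`, which mirrors the documented behavior for `filename` and `summary` in the example above.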