Merged
27 changes: 21 additions & 6 deletions docs/docs/core/flow_def.mdx
@@ -167,8 +167,22 @@ If the data slice has `Struct` type, you can obtain a data slice on a specific s

## Data Collector

A **data collector** can be added from a specific data scope, and it collects multiple entries of data, represented by data slices in the same or children scope.
A **data collector** can be added from a specific data scope, and it collects multiple entries of data from the same scope or its child scopes.

### Collect

Call its `collect()` method to collect a specific entry, which can have multiple fields.
Each field has a name as specified by the argument name, and a value in one of the following representations:

* A `DataSlice`.
* An enum `cocoindex.GeneratedField.UUID`, indicating that its value is a UUID automatically generated by the engine.
  The UUID remains stable as long as the other collected input values are unchanged.

:::note

An automatically generated UUID field is allowed to appear at most once.

:::

For example,

@@ -182,17 +196,18 @@ def demo_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataSco
    demo_collector = data_scope.add_collector()
    with data_scope["documents"].row() as document:
        ...
        demo_collector.collect(filename=document["filename"], summary=document["summary"])

        demo_collector.collect(id=cocoindex.GeneratedField.UUID,
                               filename=document["filename"],
                               summary=document["summary"])
        ...
```

</TabItem>
</Tabs>

Here the collector is in the top-level data scope, and it collects `filename` and `summary` fields from each row of `documents` field.

To specify what to do with the collected data, call a specific method on the collector.
Here the collector is in the top-level data scope.
It collects `filename` and `summary` fields from each row of `documents`,
and generates an `id` field holding a UUID that remains stable as long as `filename` and `summary` are unchanged.
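
The stability property can be illustrated with a small standalone sketch. This is not how the engine computes the value (the docs above do not say); it only assumes one possible approach, a name-based UUID (`uuid.uuid5`) derived from the other collected values. The `stable_uuid` helper and `_DEMO_NAMESPACE` are hypothetical names introduced for this illustration.

```python
import json
import uuid

# Hypothetical namespace for this illustration only; not part of cocoindex.
_DEMO_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/collector-demo")


def stable_uuid(**fields: str) -> uuid.UUID:
    """Derive a UUID deterministically from the other collected field values."""
    # Serialize with sorted keys so keyword order does not affect the result.
    canonical = json.dumps(fields, sort_keys=True)
    return uuid.uuid5(_DEMO_NAMESPACE, canonical)


first = stable_uuid(filename="a.md", summary="Intro")
again = stable_uuid(summary="Intro", filename="a.md")
changed = stable_uuid(filename="a.md", summary="Updated intro")

# Same inputs give the same UUID; a changed input gives a different one.
print(first == again, first == changed)
```

Under this assumption, re-collecting a row with the same `filename` and `summary` yields the same `id`, while changing either value yields a new one.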

### Export
