28 changes: 27 additions & 1 deletion docs/docs/core/data_types.mdx
@@ -43,4 +43,30 @@ A struct has a bunch of fields, each with a name and a type.

A table has a collection of rows, each of which is a struct with a specified schema.

The first field of a table is always the primary key.

## Indexable Types

### Key Types

The following types are currently supported for key fields:

- `bytes`
- `str`
- `bool`
- `int64`
- `range`
- Struct with all fields being key types
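
The recursive rule in the last item can be sketched as a small check. This is only an illustration, not the library's implementation: the string type names and the dict-based struct representation (field name to field type) are assumptions made for this example.

```python
# Hypothetical representation: a basic type is a string, and a struct is a
# dict mapping field names to field types. Neither is the library's own API.
BASIC_KEY_TYPES = {"bytes", "str", "bool", "int64", "range"}


def is_key_type(t):
    """Return True if `t` may be used as the type of a key field."""
    if isinstance(t, str):
        return t in BASIC_KEY_TYPES
    if isinstance(t, dict):
        # A struct qualifies only if every field is itself a key type.
        return all(is_key_type(field_type) for field_type in t.values())
    return False
```

Under these assumptions, `is_key_type({"id": "int64", "span": "range"})` holds, while a struct containing a `vector` field does not qualify.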

### Vector Type

Users can create a vector index on fields of `vector` type.
A vector index also needs to be configured with a similarity metric, and the index is only effective when this metric is used during retrieval.

The following metrics are supported:

| Metric Name | Description | Similarity Order |
|-------------|-------------|------------------|
| `CosineSimilarity` | [Cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity) | Larger is more similar |
| `L2Distance` | [L2 distance (a.k.a. Euclidean distance)](https://en.wikipedia.org/wiki/Euclidean_distance) | Smaller is more similar |
| `InnerProduct` | [Inner product](https://en.wikipedia.org/wiki/Inner_product_space) | Larger is more similar |
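
To make the "Similarity Order" column concrete, here is a minimal pure-Python sketch of the three metrics and of how ranking direction depends on the metric. The `METRICS` table and the `rank` helper are hypothetical names introduced for illustration; they are not part of the library.

```python
import math


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


def l2_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical lookup: metric name -> (function, larger_is_more_similar)
METRICS = {
    "CosineSimilarity": (cosine_similarity, True),
    "L2Distance": (l2_distance, False),
    "InnerProduct": (inner_product, True),
}


def rank(query, candidates, metric):
    """Sort candidates from most to least similar under the given metric."""
    fn, larger_is_more_similar = METRICS[metric]
    return sorted(candidates, key=lambda c: fn(query, c),
                  reverse=larger_is_more_similar)
```

The same candidates can rank differently under different metrics, which is why an index built for one metric does not help queries that use another.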
4 changes: 2 additions & 2 deletions docs/docs/core/flow_def.mdx
@@ -198,8 +198,8 @@ Export must happen at the top level of a flow, i.e. not within any child scopes

* `name`: the name to identify the export target.
* `target_spec`: the storage spec as the export target.
* `primary_key_fields` (optional): the fields to be used as primary key.
* `vector_index` (optional): the fields to create vector index.
* `primary_key_fields` (optional): the fields to use as the primary key. The field types must be supported as key types. See [Key Types](data_types#key-types) for more details.
* `vector_index` (optional): the fields on which to create vector indexes. Each item is a tuple of a field name and a similarity metric. See [Vector Type](data_types#vector-type) for more details about supported similarity metrics.

<Tabs>
<TabItem value="python" label="Python" default>