Merged
2 changes: 1 addition & 1 deletion .github/CODEOWNERS
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# CODEOWNERS info: https://help.github.com/en/articles/about-code-owners
# Owners are automatically requested for review for PRs that changes code
# that they own.
* @rderbier @MichelDiz @damonfeldman @rarvikar @Rajakavitha1
* @dgraph-io/committers @rderbier
3 changes: 2 additions & 1 deletion README.md
@@ -106,5 +106,6 @@ Pass custom Go-GRPC example to the runnable by passing a `customExampleGoGRPC` t
**Note:** Runnable doesn't support passing a multiline string as an argument to a shortcode. Therefore, you have to create the whole custom example in a single line string by replacing newlines with `\n`.

## History
v24.0:
=======
Add Hypermode banner by updating the hugo-docs repository with the topbar template.
v24.0:
14 changes: 14 additions & 0 deletions content/dql/dql-schema.md
@@ -16,6 +16,9 @@ revenue: float .
```
running_time: int .
starring: [uid] .
director: [uid] .
description: string .

description_vector: float32vector @index(hnsw(metric:"cosine")) .

type Person {
name
@@ -28,6 +31,8 @@ type Film {
running_time
starring
director
description
description_vector
}
```

@@ -112,6 +117,15 @@ For all triples with a predicate of scalar types the object is a literal.
are RFC 3339 compatible which is different from ISO 8601(as defined in the RDF spec). You should
convert your values to RFC 3339 format before sending them to Dgraph.{{% /notice %}}

### Vector Type

The `float32vector` type denotes a vector of floating point numbers, i.e., an ordered array of float32 values. A node type can contain more than one vector predicate.

Vectors are normally used to store embeddings that an ML model computes from other information. When a `float32vector` predicate is [indexed]({{<relref "dql/predicate-indexing.md">}}), the DQL [similar_to]({{<relref "query-language/functions#vector-similarity-search">}}) function can be used for similarity search.
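
As a rough illustration of what "an ordered array of float32" means for client code, the sketch below rounds Python floats (which are 64-bit) to float32 and formats an embedding as the bracketed string used in the `similar_to` syntax example. The helper names `to_float32` and `vector_literal` are hypothetical, and the number formatting is an assumption, not something these docs specify.

```python
import struct


def to_float32(value):
    """Round a 64-bit Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def vector_literal(values):
    """Format an embedding as a bracketed, comma-separated string.

    Hypothetical helper: only the "[0.9, 0.8, 0, 0]" shape comes from the
    docs. Nine significant digits are enough to round-trip a float32.
    """
    if not values:
        raise ValueError("a vector needs at least one component")
    return "[" + ", ".join(format(v, ".9g") for v in values) + "]"


embedding = [0.9, 0.8, 0.0, 0.0]
print(vector_literal(embedding))

# float32 storage keeps fewer digits than a Python float.
print(to_float32(0.1) == 0.1)
```

The second `print` is a reminder that values read back from a float32 predicate may not compare equal to the 64-bit floats that were written.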




### UID Type

The `uid` type denotes a relationship; internally each node is identified by its UID, which is a `uint64`.
35 changes: 35 additions & 0 deletions content/dql/predicate-indexing.md
@@ -9,6 +9,15 @@ weight = 4

Filtering on a predicate by applying a [function]({{< relref "query-language/functions.md" >}}) requires an index.

Indices are defined in the [Dgraph types schema]({{<relref "dql/dql-schema.md" >}}) using the `@index` directive.

Here are some examples:
```
name: string @index(term) .
release_date: datetime @index(year) .
description_vector: float32vector @index(hnsw(metric:"cosine")) .
```

When filtering by applying a function, Dgraph uses the index to make the search through a potentially large dataset efficient.

All scalar types can be indexed.
@@ -17,6 +26,8 @@ Types `int`, `float`, `bool` and `geo` have only a default index each: with toke

Types `string` and `dateTime` have a number of indices.

Type `float32vector` supports the `hnsw` index.

## String Indices
The indices available for strings are as follows.

@@ -34,6 +45,30 @@ transaction conflict rate. Use only the minimum number of and simplest indexes
that your application needs.
{{% /notice %}}

## Vector Indices

The indices available for `float32vector` are as follows.

| Dgraph function | Required index / tokenizer | Notes |
| :----------------------- | :------------ | :--- |
| `similar_to` | `hnsw` | The HNSW index supports the parameters `metric` and `exponent`. |


The `hnsw` (**Hierarchical Navigable Small World**) index supports the following parameters:
- `metric`: the metric used to compute vector similarity. One of `cosine`, `euclidean`, or `dotproduct`. Default is `euclidean`.

- `exponent`: an integer, represented as a string, that gives the approximate number of vectors expected in the index as a power of 10. The exponent value is used to set "reasonable defaults" for HNSW internal tuning parameters. Default is "4" (10^4 vectors).


Here are some examples:
```
simple_vector: float32vector @index(hnsw) .
description_vector: float32vector @index(hnsw(metric:"cosine")) .
large_vector: float32vector @index(hnsw(metric:"euclidean",exponent:"6")) .
```
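
To make the relationship between the expected dataset size and `exponent` concrete, here is a minimal sketch that builds schema lines like the ones above. The function names are hypothetical, and rounding the exponent up to the next power of 10 is an assumption, since the docs only say the value "roughly" represents the expected number of vectors.

```python
ALLOWED_METRICS = ("cosine", "euclidean", "dotproduct")


def exponent_for(expected_vectors):
    """Return the smallest exponent e with 10**e >= expected_vectors, as a string.

    Assumption: rounding up is a reasonable reading of "roughly".
    """
    if expected_vectors < 1:
        raise ValueError("expected_vectors must be at least 1")
    exponent = 0
    while 10 ** exponent < expected_vectors:
        exponent += 1
    return str(exponent)


def hnsw_schema_line(predicate, metric=None, expected_vectors=None):
    """Build a schema line for a float32vector predicate (hypothetical helper)."""
    options = []
    if metric is not None:
        if metric not in ALLOWED_METRICS:
            raise ValueError("unknown metric: " + metric)
        options.append('metric:"' + metric + '"')
    if expected_vectors is not None:
        options.append('exponent:"' + exponent_for(expected_vectors) + '"')
    # With no options, fall back to the bare index name so defaults apply.
    index = "hnsw(" + ",".join(options) + ")" if options else "hnsw"
    return predicate + ": float32vector @index(" + index + ") ."


print(hnsw_schema_line("simple_vector"))
print(hnsw_schema_line("description_vector", metric="cosine"))
print(hnsw_schema_line("large_vector", metric="euclidean", expected_vectors=1_000_000))
```

Under these assumptions, the three calls reproduce the three example lines shown above. Whether `exponent` can be given without `metric` is not stated here, so treat that combination as untested.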

## DateTime Indices

The indices available for `dateTime` are as follows.
15 changes: 15 additions & 0 deletions content/query-language/functions.md
@@ -177,6 +177,21 @@ Same query with a Levenshtein distance of 3.
}
{{< /runnable >}}

## Vector Similarity Search

Syntax Examples: `similar_to(predicate, 3, "[0.9, 0.8, 0, 0]")`

Alternatively, the vector can be passed as a variable: `similar_to(predicate, 3, $vec)`

This function finds the nodes whose `predicate` value is close to the provided vector. The search uses the distance metric specified in the index (`cosine`, `euclidean`, or `dotproduct`). A shorter distance indicates greater similarity.
The second parameter, `3`, specifies that the top 3 matches are returned.
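
The ranking that `similar_to` describes can be sketched as an exhaustive search in plain Python. This is only a reference model under stated assumptions: the distance formulas are the textbook definitions, treating the negated dot product as a distance is an assumption, and an HNSW index is an approximate structure, so real results may differ from this exact search. The `films` data and all function names are made up for illustration.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    # 1 - cosine similarity; raises ZeroDivisionError for an all-zero vector.
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def negative_dot(a, b):
    # Assumption: a larger dot product means "closer", so negate it for sorting.
    return -sum(x * y for x, y in zip(a, b))


DISTANCES = {
    "euclidean": euclidean,
    "cosine": cosine_distance,
    "dotproduct": negative_dot,
}


def top_k(vectors, query, k, metric="euclidean"):
    """Return the ids of the k stored vectors closest to query, nearest first."""
    distance = DISTANCES[metric]
    ranked = sorted(vectors, key=lambda uid: distance(vectors[uid], query))
    return ranked[:k]


# Hypothetical data: uid -> embedding.
films = {
    "0x1": [0.9, 0.8, 0.0, 0.0],
    "0x2": [0.0, 0.0, 0.9, 0.7],
    "0x3": [0.8, 0.9, 0.1, 0.0],
    "0x4": [0.1, 0.0, 0.8, 0.9],
}

print(top_k(films, [0.9, 0.8, 0.0, 0.0], 3, metric="cosine"))
```

Here `k` plays the role of the second argument of `similar_to`, and `metric` stands in for the metric fixed in the index definition.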

Schema Types: `float32vector`

Index Required: `hnsw`



## Full-Text Search

Syntax Examples: `alloftext(predicate, "space-separated text")` and `anyoftext(predicate, "space-separated text")`