Commit

add distribution shift documentation
guimachiavelli committed Apr 23, 2024
1 parent 8b40874 commit c7f53c4
Showing 1 changed file with 29 additions and 0 deletions.
29 changes: 29 additions & 0 deletions learn/experimental/vector_search.mdx
@@ -140,6 +140,35 @@ curl -X POST -H 'content-type: application/json' \
]'
```

### Distribution shift

Hybrid search results do not always use the full vector space uniformly, which may impact relevancy. Use `distribution` when configuring an embedder to correct result distribution with an affine transformation:

```sh
curl \
  -X PATCH 'http://localhost:7700/indexes/movies/settings' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "embedders": {
      "default": {
        "source": "huggingFace",
        "model": "MODEL_NAME",
        "distribution": {
          "mean": 0.7,
          "sigma": 0.3
        }
      }
    }
  }'
```

`distribution` must be an object with two fields:

- `mean`: a number between `0` and `1` indicating the target mean value
- `sigma`: a number between `0` and `1` indicating the allowed variance
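
Since both fields must fall between `0` and `1`, it can help to check them before sending the request. The following is a minimal sketch, not part of Meilisearch: `build_distribution_payload` is a hypothetical helper name, and the `huggingFace` source and `MODEL_NAME` placeholder are copied from the example above.

```sh
# Hypothetical helper: prints the settings payload when both values are valid.
# Usage: build_distribution_payload MEAN SIGMA
build_distribution_payload() {
  # awk does the floating-point comparison; it exits 0 only when both
  # arguments are plain decimal numbers within [0, 1].
  if ! awk -v m="$1" -v s="$2" 'BEGIN {
    num = "^[0-9]*[.]?[0-9]+$"
    exit !(m ~ num && s ~ num && m >= 0 && m <= 1 && s >= 0 && s <= 1)
  }'; then
    echo "mean and sigma must be numbers between 0 and 1" >&2
    return 1
  fi
  printf '{"embedders":{"default":{"source":"huggingFace","model":"MODEL_NAME","distribution":{"mean":%s,"sigma":%s}}}}\n' "$1" "$2"
}

# Possible usage (left commented out because it needs a running instance):
# payload=$(build_distribution_payload 0.7 0.3) &&
#   curl -X PATCH 'http://localhost:7700/indexes/movies/settings' \
#     -H 'Content-Type: application/json' --data-binary "$payload"
```

The helper returns a non-zero status for out-of-range or non-numeric input, so the `&&` in the usage lines skips the request when validation fails.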

Changing `distribution` does not trigger a reindexing operation.
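
For readers unfamiliar with the term, an affine transformation of a score has the form `a * score + b`. The sketch below is a generic illustration and should not be read as Meilisearch's actual implementation: `affine_shift` and all of its parameters are hypothetical, and it simply rescales a score from one mean and sigma to another, clamping the result to the `0` to `1` range.

```sh
# Generic illustration only; names and parameters are hypothetical.
# Usage: affine_shift SCORE FROM_MEAN FROM_SIGMA TO_MEAN TO_SIGMA
# FROM_SIGMA must be greater than 0, otherwise the division fails.
affine_shift() {
  awk -v x="$1" -v m1="$2" -v s1="$3" -v m2="$4" -v s2="$5" 'BEGIN {
    a = s2 / s1        # scale factor: stretches or compresses the spread
    b = m2 - a * m1    # offset: moves the old mean onto the new mean
    y = a * x + b      # the affine transformation itself
    if (y < 0) y = 0   # keep the result within [0, 1]
    if (y > 1) y = 1
    printf "%.2f\n", y
  }'
}

affine_shift 0.7 0.7 0.3 0.5 0.4   # a score at the old mean lands on the new mean: 0.50
affine_shift 0.1 0.7 0.3 0.5 0.4   # a score far below the mean is clamped: 0.00
```

Scores near the original mean end up near the new mean, while the ratio of the two sigmas controls how far apart neighbouring scores are pushed.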

### Vector search with auto-embeddings

Perform searches with `q` and `hybrid` to retrieve search results using both keyword and semantic search: