9 changes: 6 additions & 3 deletions README.md
@@ -3,11 +3,14 @@

[![Codecov](https://img.shields.io/codecov/c/github/RedisVentures/RedisVL/dev?label=Codecov&logo=codecov&token=E30WxqBeJJ)](https://codecov.io/gh/RedisVentures/RedisVL)
[![License](https://img.shields.io/badge/License-BSD-3--blue.svg)](https://opensource.org/licenses/mit/)

![Language](https://img.shields.io/github/languages/top/RedisVentures/RedisVL)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
![GitHub last commit](https://img.shields.io/github/last-commit/RedisVentures/RedisVL)
![GitHub deployments](https://img.shields.io/github/deployments/RedisVentures/RedisVL/github-pages?label=doc%20build)

RedisVL provides a powerful Python client library for using Redis as a Vector Database. Leverage the speed and reliability of Redis along with vector-based semantic search capabilities to supercharge your application!

**Note:** This project is rapidly evolving, and the API may change frequently. Always refer to the most recent [documentation](https://redisvl.com/docs).
**Note:** This project is rapidly evolving, and the API may change frequently. Always refer to the most recent [documentation](https://www.redisvl.com).
## 🚀 What is RedisVL?

Vector databases have become increasingly popular in recent years due to their ability to store and retrieve vectors efficiently. However, most vector databases are complex to use and require a lot of time and effort to set up. RedisVL aims to solve this problem by providing a simple and intuitive interface for using Redis as a vector database.
@@ -32,7 +35,7 @@ RedisVL has a host of powerful features designed to streamline your vector datab

Please note that this library is still under heavy development, and while you can quickly try RedisVL and deploy it in a production environment, the API may be subject to change at any time.

`pip install redisvl`
`pip install redisvl` (Coming Soon)
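
Until the package lands on PyPI, installing straight from the GitHub repository should also work (a hedged suggestion, assuming the default branch is pip-installable):

`pip install git+https://github.com/RedisVentures/RedisVL.git`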

## Example Usage

2 changes: 1 addition & 1 deletion docs/user_guide/index.md
@@ -25,7 +25,7 @@ hybrid_queries_02
:caption: Providers
:maxdepth: 3

embedding_creation
providers_03
```

```{toctree}
265 changes: 265 additions & 0 deletions docs/user_guide/providers_03.ipynb
@@ -0,0 +1,265 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Embedding Providers\n",
"\n",
"In this notebook, we will show how to use RedisVL to create embeddings using the built-in Providers. Today RedisVL supports:\n",
"1. OpenAI\n",
"2. HuggingFace\n",
"\n",
"Before running this notebook, be sure to\n",
"1. Have installed ``redisvl`` and have that environment active for this notebook.\n",
"2. Have a running Redis instance with RediSearch > 2.4 running.\n"
]
},
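{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"If you don't already have a RediSearch-enabled Redis instance handy, one convenient option (a hedged suggestion -- it assumes Docker is installed and is not the only way) is the Redis Stack server image, which bundles RediSearch:\n",
"\n",
"```bash\n",
"docker run -d --name redis-stack -p 6379:6379 redis/redis-stack-server:latest\n",
"```"
]
},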
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"# import necessary modules\n",
"import os\n",
"from redisvl.utils.utils import array_to_buffer"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Creating Embeddings\n",
"\n",
"This example will show how to create an embedding from 3 simple sentences with a number of different providers\n",
"\n",
"- \"That is a happy dog\"\n",
"- \"That is a happy person\"\n",
"- \"Today is a nice day\"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Huggingface\n",
"\n",
"Huggingface is a popular NLP library that has a number of pre-trained models. RedisVL supports using Huggingface to create embeddings from these models. To use Huggingface, you will need to install the ``sentence-transformers`` library.\n",
"\n",
"```bash\n",
"pip install sentence-transformers\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0.00037813105154782534,\n",
" -0.05080341547727585,\n",
" -0.03514720872044563,\n",
" -0.023251093924045563,\n",
" -0.04415826499462128,\n",
" 0.020487893372774124,\n",
" 0.0014619074063375592,\n",
" 0.03126181662082672,\n",
" 0.056051574647426605,\n",
" 0.0188154224306345]"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"os.environ[\"TOKENIZERS_PARALLELISM\"] = \"false\"\n",
"from redisvl.providers import HuggingfaceProvider\n",
"\n",
"\n",
"# create a provider\n",
"hf = HuggingfaceProvider(model=\"sentence-transformers/all-mpnet-base-v2\")\n",
"\n",
"# embed a sentence\n",
"test = hf.embed(\"This is a test sentence.\")\n",
"test[:10]"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"# You can also create many embeddings at once\n",
"\n",
"sentences = [\n",
" \"That is a happy dog\",\n",
" \"That is a happy person\",\n",
" \"Today is a sunny day\"\n",
"]\n",
"\n",
"embeddings = hf.embed_many(sentences)\n"
]
},
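{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional sanity check: the length of each embedding must match the `dims`\n",
"# value declared in the index schema used below (768 for all-mpnet-base-v2).\n",
"len(embeddings), len(embeddings[0])"
]
},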
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Search with Provider Embeddings\n",
"\n",
"Now that we've created our embeddings, we can use them to search for similar sentences. We will use the same 3 sentences from above and search for similar sentences.\n",
"\n",
"First, we need to create the schema for our index.\n",
"\n",
"Here's what the schema for the example looks like in yaml for the HuggingFace Provider\n",
"\n",
"```yaml\n",
"index:\n",
" name: providers\n",
" prefix: rvl\n",
" storage_type: hash\n",
"\n",
"fields:\n",
" text:\n",
" - name: sentence\n",
" vector:\n",
" - name: embedding\n",
" dims: 768\n",
" algorithm: flat\n",
" distance_metric: cosine\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"from redisvl.index import SearchIndex\n",
"\n",
"# construct a search index from the schema\n",
"index = SearchIndex.from_yaml(\"./schema.yaml\")\n",
"\n",
"# connect to local redis instance\n",
"index.connect(\"redis://localhost:6379\")\n",
"\n",
"# create the index (no data yet)\n",
"index.create(overwrite=True)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32m15:50:34\u001b[0m \u001b[35msam.partee-NW9MQX5Y74\u001b[0m \u001b[34mredisvl.cli.index[33382]\u001b[0m \u001b[1;30mINFO\u001b[0m Indices:\n",
"\u001b[32m15:50:34\u001b[0m \u001b[35msam.partee-NW9MQX5Y74\u001b[0m \u001b[34mredisvl.cli.index[33382]\u001b[0m \u001b[1;30mINFO\u001b[0m 1. providers\n"
]
}
],
"source": [
"# use the CLI to see the created index\n",
"!rvl index listall"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"# load expects an iterable of dictionaries where\n",
"# the vector is stored as a bytes buffer\n",
"\n",
"data = [{\"text\": t,\n",
" \"embedding\": array_to_buffer(v)}\n",
" for t, v in zip(sentences, embeddings)]\n",
"\n",
"index.load(data)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"That is a happy dog\n",
"0.160862445831\n",
"That is a happy person\n",
"0.273598074913\n",
"Today is a sunny day\n",
"0.744559526443\n"
]
}
],
"source": [
"from redisvl.query import VectorQuery\n",
"\n",
"# use the HuggingFace Provider again to create a query embedding\n",
"query_embedding = hf.embed(\"That is a happy cat\")\n",
"\n",
"query = VectorQuery(\n",
" vector=query_embedding,\n",
" vector_field_name=\"embedding\",\n",
" return_fields=[\"text\"],\n",
" num_results=3\n",
")\n",
"\n",
"results = index.search(query.query, query_params=query.params)\n",
"for doc in results.docs:\n",
" print(doc.text)\n",
" print(doc.vector_distance)"
]
},
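{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional last step, you can drop the example index. This is a hedged sketch -- it assumes ``SearchIndex`` exposes a ``delete()`` method; check ``redisvl.index`` if your version differs.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Cleanup (assumes SearchIndex.delete() is available): remove the example index\n",
"index.delete()"
]
},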
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.13 ('redisvl2')",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "9b1e6e9c2967143209c2f955cb869d1d3234f92dc4787f49f155f3abbdfb1316"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}
14 changes: 14 additions & 0 deletions docs/user_guide/schema.yaml
@@ -0,0 +1,14 @@

index:
name: providers
prefix: rvl
storage_type: hash

fields:
text:
- name: sentence
vector:
- name: embedding
dims: 768
algorithm: flat
distance_metric: cosine
1 change: 0 additions & 1 deletion redisvl/index.py
@@ -262,7 +262,6 @@ def load(self, data: Iterable[Dict[str, Any]], **kwargs):
raise TypeError("data must be an iterable of dictionaries")

for record in data:
# TODO don't use colon if no prefix
key = f"{self._prefix}:{self._get_key_field(record)}"
self._redis_conn.hset(key, mapping=record) # type: ignore
