diff --git a/.docs_static/.nojekyll b/.docs_static/.nojekyll
new file mode 100644
index 00000000..e69de29b
diff --git a/.docs_static/CNAME b/.docs_static/CNAME
new file mode 100644
index 00000000..9b81a347
--- /dev/null
+++ b/.docs_static/CNAME
@@ -0,0 +1 @@
+www.redisvl.com
\ No newline at end of file
diff --git a/.docs_static/index.html b/.docs_static/index.html
new file mode 100644
index 00000000..4e7908c6
--- /dev/null
+++ b/.docs_static/index.html
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/docs/examples/index.md b/docs/examples/index.md
index 97a2ea85..efbe3713 100644
--- a/docs/examples/index.md
+++ b/docs/examples/index.md
@@ -9,7 +9,7 @@ myst:
```{toctree}
-:caption: Getting Started
+openai_qna
```
diff --git a/docs/examples/openai_qna.ipynb b/docs/examples/openai_qna.ipynb
new file mode 100644
index 00000000..7bd29102
--- /dev/null
+++ b/docs/examples/openai_qna.ipynb
@@ -0,0 +1,1001 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Question and Answer with OpenAI and RedisVL\n",
+ "\n",
+ "This example shows how to use RedisVL to create a question and answer system using OpenAI's API.\n",
+ "\n",
+ "In this notebook we will:\n",
+ "1. Download a dataset of Wikipedia articles (thanks to OpenAI's CDN)\n",
+ "2. Create embeddings for each article\n",
+ "3. Create a RedisVL index and store the embeddings with metadata\n",
+ "4. Construct a simple QnA system using the index and GPT-3.5\n",
+ "\n",
+ "\n",
+ "The image below shows the architecture of the system we will create in this notebook.\n",
+ "\n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Setup\n",
+ "\n",
+ "To run this example, you need a local Redis instance with the RediSearch module. You can start one with Docker by running the following command in your terminal:\n",
+ "\n",
+ "```bash\n",
+ "docker run --name redis-vecdb -d -p 6379:6379 -p 8001:8001 redis/redis-stack:latest\n",
+ "```\n",
+ "\n",
+ "This will also provide the RedisInsight GUI at http://localhost:8001\n",
+ "\n",
+ "Next, we will install the dependencies for this notebook."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# first we need to install a few things\n",
+ "\n",
+ "!pip install redisvl pandas wget tenacity tiktoken openai"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'wikipedia_articles_2000.csv'"
+ ]
+ },
+ "execution_count": 2,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import wget\n",
+ "import pandas as pd\n",
+ "\n",
+ "embeddings_url = 'https://cdn.openai.com/API/examples/data/wikipedia_articles_2000.csv'\n",
+ "\n",
+ "wget.download(embeddings_url)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " id url \\\n",
+ "0 3661 https://simple.wikipedia.org/wiki/Photon \n",
+ "1 7796 https://simple.wikipedia.org/wiki/Thomas%20Dolby \n",
+ "2 67912 https://simple.wikipedia.org/wiki/Embroidery \n",
+ "3 44309 https://simple.wikipedia.org/wiki/Consecutive%... \n",
+ "4 41741 https://simple.wikipedia.org/wiki/German%20Empire \n",
+ "... ... ... \n",
+ "1995 9252 https://simple.wikipedia.org/wiki/Relativity \n",
+ "1996 14 https://simple.wikipedia.org/wiki/Alanis%20Mor... \n",
+ "1997 49769 https://simple.wikipedia.org/wiki/Brontosaurus \n",
+ "1998 55998 https://simple.wikipedia.org/wiki/Work%20%28ph... \n",
+ "1999 6293 https://simple.wikipedia.org/wiki/Syllable \n",
+ "\n",
+ " title text \n",
+ "0 Photon Photons (from Greek φως, meaning light), in m... \n",
+ "1 Thomas Dolby Thomas Dolby (born Thomas Morgan Robertson; 14... \n",
+ "2 Embroidery Embroidery is the art of decorating fabric or ... \n",
+ "3 Consecutive integer Consecutive numbers are numbers that follow ea... \n",
+ "4 German Empire The German Empire (\"Deutsches Reich\" or \"Deuts... \n",
+ "... ... ... \n",
+ "1995 Relativity The word relativity usually means two things i... \n",
+ "1996 Alanis Morissette Alanis Nadine Morissette (born June 1, 1974) i... \n",
+ "1997 Brontosaurus Brontosaurus is a genus of sauropod dinosaur.... \n",
+ "1998 Work (physics) In physics, a force does work when it acts on ... \n",
+ "1999 Syllable A syllable is a unit of pronunciation uttered ... \n",
+ "\n",
+ "[2000 rows x 4 columns]"
+ ]
+ },
+ "execution_count": 14,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df = pd.read_csv('wikipedia_articles_2000.csv')\n",
+ "df = df.drop(columns=['Unnamed: 0'])\n",
+ "df"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Data Preparation\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Text Chunking\n",
+ "\n",
+ "To create embeddings for the articles, we first need to split the text into smaller chunks, because the OpenAI API limits how much text can be sent in a single request. The code that follows borrows heavily from this [notebook](https://github.com/openai/openai-cookbook/blob/main/apps/enterprise-knowledge-retrieval/enterprise_knowledge_retrieval.ipynb) by OpenAI.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "TEXT_EMBEDDING_CHUNK_SIZE = 1000\n",
+ "EMBEDDINGS_MODEL = \"text-embedding-ada-002\"\n",
+ "\n",
+ "\n",
+ "def chunks(text, n, tokenizer):\n",
+ "    \"\"\"Yield successive token chunks of roughly n tokens from text.\n",
+ "\n",
+ "    Each chunk preferably ends at the end of a sentence.\n",
+ "    \"\"\"\n",
+ "    tokens = tokenizer.encode(text)\n",
+ "    i = 0\n",
+ "    while i < len(tokens):\n",
+ "        # Find the nearest end of sentence within a range of 0.5 * n and 1.5 * n tokens\n",
+ "        j = min(i + int(1.5 * n), len(tokens))\n",
+ "        while j > i + int(0.5 * n):\n",
+ "            # Decode the tokens and check for full stop or newline\n",
+ "            chunk = tokenizer.decode(tokens[i:j])\n",
+ "            if chunk.endswith(\".\") or chunk.endswith(\"\\n\"):\n",
+ "                break\n",
+ "            j -= 1\n",
+ "        # If no end of sentence found, use n tokens as the chunk size\n",
+ "        if j == i + int(0.5 * n):\n",
+ "            j = min(i + n, len(tokens))\n",
+ "        yield tokens[i:j]\n",
+ "        i = j\n",
+ "\n",
+ "def get_unique_id_for_file_chunk(title, chunk_index):\n",
+ "    return f\"{title}-!{chunk_index}\"\n",
+ "\n",
+ "def chunk_text(record, tokenizer):\n",
+ "    \"\"\"Return a list of chunk records (id, url, title, content, file_chunk_index) for one article.\"\"\"\n",
+ "    url = record['url']\n",
+ "    title = record['title']\n",
+ "    file_body_string = record['text']\n",
+ "\n",
+ "    token_chunks = list(chunks(file_body_string, TEXT_EMBEDDING_CHUNK_SIZE, tokenizer))\n",
+ "    # Prefix every chunk with the article title so each chunk carries its context\n",
+ "    text_chunks = [f'Title: {title};\\n' + tokenizer.decode(chunk) for chunk in token_chunks]\n",
+ "\n",
+ "    chunked_records = []\n",
+ "    for i, text_chunk in enumerate(text_chunks):\n",
+ "        doc_id = get_unique_id_for_file_chunk(title, i)\n",
+ "        chunked_records.append({\"id\": doc_id,\n",
+ "                                \"url\": url,\n",
+ "                                \"title\": title,\n",
+ "                                \"content\": text_chunk,\n",
+ "                                \"file_chunk_index\": i})\n",
+ "    return chunked_records"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Initialise tokenizer\n",
+ "import tiktoken\n",
+ "oai_tokenizer = tiktoken.get_encoding(\"cl100k_base\")\n",
+ "\n",
+ "records = []\n",
+ "for _, record in df.iterrows():\n",
+ " records.extend(chunk_text(record, oai_tokenizer))\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 25,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " id \\\n",
+ "0 Photon-!0 \n",
+ "1 Photon-!1 \n",
+ "2 Thomas Dolby-!0 \n",
+ "3 Embroidery-!0 \n",
+ "4 Consecutive integer-!0 \n",
+ "... ... \n",
+ "2688 Alanis Morissette-!1 \n",
+ "2689 Brontosaurus-!0 \n",
+ "2690 Work (physics)-!0 \n",
+ "2691 Syllable-!0 \n",
+ "2692 Syllable-!1 \n",
+ "\n",
+ " url title \\\n",
+ "0 https://simple.wikipedia.org/wiki/Photon Photon \n",
+ "1 https://simple.wikipedia.org/wiki/Photon Photon \n",
+ "2 https://simple.wikipedia.org/wiki/Thomas%20Dolby Thomas Dolby \n",
+ "3 https://simple.wikipedia.org/wiki/Embroidery Embroidery \n",
+ "4 https://simple.wikipedia.org/wiki/Consecutive%... Consecutive integer \n",
+ "... ... ... \n",
+ "2688 https://simple.wikipedia.org/wiki/Alanis%20Mor... Alanis Morissette \n",
+ "2689 https://simple.wikipedia.org/wiki/Brontosaurus Brontosaurus \n",
+ "2690 https://simple.wikipedia.org/wiki/Work%20%28ph... Work (physics) \n",
+ "2691 https://simple.wikipedia.org/wiki/Syllable Syllable \n",
+ "2692 https://simple.wikipedia.org/wiki/Syllable Syllable \n",
+ "\n",
+ " content file_chunk_index \n",
+ "0 Title: Photon;\\nPhotons (from Greek φως, mean... 0 \n",
+ "1 Title: Photon;\\nElementary particles 1 \n",
+ "2 Title: Thomas Dolby;\\nThomas Dolby (born Thoma... 0 \n",
+ "3 Title: Embroidery;\\nEmbroidery is the art of d... 0 \n",
+ "4 Title: Consecutive integer;\\nConsecutive numbe... 0 \n",
+ "... ... ... \n",
+ "2688 Title: Alanis Morissette;\\nTwin people from Ca... 1 \n",
+ "2689 Title: Brontosaurus;\\nBrontosaurus is a genus... 0 \n",
+ "2690 Title: Work (physics);\\nIn physics, a force do... 0 \n",
+ "2691 Title: Syllable;\\nA syllable is a unit of pron... 0 \n",
+ "2692 Title: Syllable;\\nGrammar 1 \n",
+ "\n",
+ "[2693 rows x 5 columns]"
+ ]
+ },
+ "execution_count": 25,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "chunked_data = pd.DataFrame(records)\n",
+ "chunked_data"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Embedding Creation\n",
+ "\n",
+ "With the text broken up into chunks, we can create embeddings with the RedisVL ``OpenAIProvider``, which calls the OpenAI API to embed text. The code below embeds every chunk and converts each vector to a byte buffer for storage in Redis."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 47,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ " id \\\n",
+ "0 Photon-!0 \n",
+ "1 Photon-!1 \n",
+ "2 Thomas Dolby-!0 \n",
+ "3 Embroidery-!0 \n",
+ "4 Consecutive integer-!0 \n",
+ "... ... \n",
+ "2688 Alanis Morissette-!1 \n",
+ "2689 Brontosaurus-!0 \n",
+ "2690 Work (physics)-!0 \n",
+ "2691 Syllable-!0 \n",
+ "2692 Syllable-!1 \n",
+ "\n",
+ " url title \\\n",
+ "0 https://simple.wikipedia.org/wiki/Photon Photon \n",
+ "1 https://simple.wikipedia.org/wiki/Photon Photon \n",
+ "2 https://simple.wikipedia.org/wiki/Thomas%20Dolby Thomas Dolby \n",
+ "3 https://simple.wikipedia.org/wiki/Embroidery Embroidery \n",
+ "4 https://simple.wikipedia.org/wiki/Consecutive%... Consecutive integer \n",
+ "... ... ... \n",
+ "2688 https://simple.wikipedia.org/wiki/Alanis%20Mor... Alanis Morissette \n",
+ "2689 https://simple.wikipedia.org/wiki/Brontosaurus Brontosaurus \n",
+ "2690 https://simple.wikipedia.org/wiki/Work%20%28ph... Work (physics) \n",
+ "2691 https://simple.wikipedia.org/wiki/Syllable Syllable \n",
+ "2692 https://simple.wikipedia.org/wiki/Syllable Syllable \n",
+ "\n",
+ " content file_chunk_index \\\n",
+ "0 Title: Photon;\\nPhotons (from Greek φως, mean... 0 \n",
+ "1 Title: Photon;\\nElementary particles 1 \n",
+ "2 Title: Thomas Dolby;\\nThomas Dolby (born Thoma... 0 \n",
+ "3 Title: Embroidery;\\nEmbroidery is the art of d... 0 \n",
+ "4 Title: Consecutive integer;\\nConsecutive numbe... 0 \n",
+ "... ... ... \n",
+ "2688 Title: Alanis Morissette;\\nTwin people from Ca... 1 \n",
+ "2689 Title: Brontosaurus;\\nBrontosaurus is a genus... 0 \n",
+ "2690 Title: Work (physics);\\nIn physics, a force do... 0 \n",
+ "2691 Title: Syllable;\\nA syllable is a unit of pron... 0 \n",
+ "2692 Title: Syllable;\\nGrammar 1 \n",
+ "\n",
+ " embedding \n",
+ "0 b'\\xc2\\xf8\\xc9;\\xa7]\\xfb;\\x88\\x90P\\xbc`\\xcc\\x9... \n",
+ "1 b'\\x03\\x1d#\\xbc\\x00c\\x8d<\\xae\\xcam\\xbc\\xc5\\x1f... \n",
+ "2 b'k\\xaf\\xcc\\xbc\\x89\\xe5\\xad;3\\xea\\xd8\\xbc+\\x81... \n",
+ "3 b'07\\xf5\\xbc\\xaf\\xcb\\x02\\xbc\\x90\\xe6N\\xbc\\x84\\... \n",
+ "4 b'0(\\xfa\\xbb\\x81\\xd2\\xd9;\\xaf\\x92\\x9a;\\xd3FL\\x... \n",
+ "... ... \n",
+ "2688 b'\\xc1K5\\xbc\\xb8\"\\xe0\\xbc\\x17A\\x07\\xbb\\xb0\\xbc... \n",
+ "2689 b'3\\xf0\\xda\\xbcY\\xc0\\xb4:\\x1cN\\x81\\xbc\\xe9\\xcc... \n",
+ "2690 b'\\x97\\x82\\xb9\\xbbL\\x90d\\xbc\\xb7G\\x9c\\xba\\x94g... \n",
+ "2691 b'\\xe4\\xa3\\x1c:\\x83g\\x90<\\x99=s;*[E\\xbb\\x10 \"\\... \n",
+ "2692 b'\\x17U+\\xbb\\xe4\\xea\\x86;;\\\\\\x9a:^\\x82\\xc6:\\x1... \n",
+ "\n",
+ "[2693 rows x 6 columns]"
+ ]
+ },
+ "execution_count": 47,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import os\n",
+ "from redisvl.providers.openai import OpenAIProvider\n",
+ "from redisvl.utils.utils import array_to_buffer\n",
+ "\n",
+ "# Read the key from the environment; never hard-code a secret in a notebook\n",
+ "api_key = os.environ[\"OPENAI_API_KEY\"]\n",
+ "oaip = OpenAIProvider(EMBEDDINGS_MODEL, api_config={\"api_key\": api_key})\n",
+ "\n",
+ "chunked_data[\"embedding\"] = oaip.embed_many(chunked_data[\"content\"].tolist())\n",
+ "chunked_data[\"embedding\"] = chunked_data[\"embedding\"].apply(array_to_buffer)\n",
+ "chunked_data"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Construct the ``SearchIndex``\n",
+ "\n",
+ "Now that we have the embeddings, we can create a ``SearchIndex`` to store them in Redis, along with the metadata for each chunk."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 32,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Writing wiki_schema.yaml\n"
+ ]
+ }
+ ],
+ "source": [
+ "%%writefile wiki_schema.yaml\n",
+ "\n",
+ "index:\n",
+ "    name: wiki\n",
+ "    prefix: oaiWiki\n",
+ "    key_field: id\n",
+ "    storage_type: hash\n",
+ "\n",
+ "fields:\n",
+ "    text:\n",
+ "        - name: content\n",
+ "        - name: title\n",
+ "    tag:\n",
+ "        - name: id\n",
+ "    vector:\n",
+ "        - name: embedding\n",
+ "          dims: 1536\n",
+ "          distance_metric: cosine\n",
+ "          algorithm: flat"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 35,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from redisvl.index import AsyncSearchIndex\n",
+ "\n",
+ "index = AsyncSearchIndex.from_yaml(\"wiki_schema.yaml\")\n",
+ "index.connect(\"redis://localhost:6379\")\n",
+ "\n",
+ "await index.create()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 39,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\u001b[32m17:55:08\u001b[0m \u001b[35msam.partee-NW9MQX5Y74\u001b[0m \u001b[34mredisvl.cli.index[44333]\u001b[0m \u001b[1;30mINFO\u001b[0m Indices:\n",
+ "\u001b[32m17:55:08\u001b[0m \u001b[35msam.partee-NW9MQX5Y74\u001b[0m \u001b[34mredisvl.cli.index[44333]\u001b[0m \u001b[1;30mINFO\u001b[0m 1. wiki\n"
+ ]
+ }
+ ],
+ "source": [
+ "!rvl index listall"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 48,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "await index.load(chunked_data.to_dict(orient=\"records\"))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Build the QnA System\n",
+ "\n",
+ "Now that we have the data and the embeddings, we can build the QnA system. The system will perform three actions:\n",
+ "\n",
+ "1. Embed the user question and search for the most similar content\n",
+ "2. Make a prompt with the query and retrieved content\n",
+ "3. Send the prompt to the OpenAI API and return the answer\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 46,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import openai\n",
+ "from redisvl.query import VectorQuery"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 56,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "CHAT_MODEL = \"gpt-3.5-turbo\"\n",
+ "\n",
+ "def make_prompt(query, content):\n",
+ "    retrieval_prompt = f'''Use the content to answer the search query the customer has sent.\n",
+ "    If you can't answer the user's question, do not guess. If there is no content, respond with \"I don't know\".\n",
+ "\n",
+ "    Search query:\n",
+ "\n",
+ "    {query}\n",
+ "\n",
+ "    Content:\n",
+ "\n",
+ "    {content}\n",
+ "\n",
+ "    Answer:\n",
+ "    '''\n",
+ "    return retrieval_prompt\n",
+ "\n",
+ "async def retrieve_context(query):\n",
+ "    # Embed the query\n",
+ "    query_embedding = oaip.embed(query)\n",
+ "\n",
+ "    # Get the top result from the index\n",
+ "    vector_query = VectorQuery(\n",
+ "        vector=query_embedding,\n",
+ "        vector_field_name=\"embedding\",\n",
+ "        return_fields=[\"content\"],\n",
+ "        num_results=1\n",
+ "    )\n",
+ "\n",
+ "    results = await index.search(vector_query.query, query_params=vector_query.params)\n",
+ "    content = \"\"\n",
+ "    # num_results=1, so use the hit whenever at least one document is returned\n",
+ "    if results.docs:\n",
+ "        content = results.docs[0].content\n",
+ "    return content\n",
+ "\n",
+ "\n",
+ "async def answer_question(query):\n",
+ "\n",
+ "    # Retrieve the context\n",
+ "    content = await retrieve_context(query)\n",
+ "\n",
+ "    prompt = make_prompt(query, content)\n",
+ "    retrieval = await openai.ChatCompletion.acreate(\n",
+ "        model=CHAT_MODEL,\n",
+ "        messages=[{'role': \"user\",\n",
+ "                   'content': prompt}],\n",
+ "        max_tokens=500)\n",
+ "\n",
+ "    # Response provided by GPT-3.5\n",
+ "    return retrieval['choices'][0]['message']['content']"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 65,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "['A Brontosaurus is a genus of large, herbivorous dinosaurs that lived during the',\n",
+ " 'Late Jurassic period, around 150 million years ago. They were characterized by',\n",
+ " 'their long necks and tails, and were among the largest land animals to have ever',\n",
+ " 'lived. However, the name \"Brontosaurus\" is no longer considered valid in modern',\n",
+ " 'scientific classification. In the early 20th century, it was discovered that',\n",
+ " 'Brontosaurus fossils were actually the same as those of another dinosaur species',\n",
+ " 'called Apatosaurus. As a result, the name Brontosaurus was dropped and the',\n",
+ " 'species was officially classified as Apatosaurus excelsus.']"
+ ]
+ },
+ "execution_count": 65,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import textwrap\n",
+ "\n",
+ "question = \"What is a Brontosaurus?\"\n",
+ "textwrap.wrap(await answer_question(question), width=80)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 55,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "\"I don't know.\""
+ ]
+ },
+ "execution_count": 55,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Question that makes no sense\n",
+ "question = \"What is a trackiosamidon?\"\n",
+ "await answer_question(question)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 66,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "['Alanis Morissette is a Canadian-American singer-songwriter, known for',\n",
+ " 'her powerful and emotive vocal style. She gained international fame in',\n",
+ " 'the 1990s with her groundbreaking album \"Jagged Little Pill.\" Born on',\n",
+ " 'June 1, 1974, in Ottawa, Canada, Morissette started her career in',\n",
+ " 'music at a young age and released her first studio album at the age of',\n",
+ " '16. However, it was her third album, \"Jagged Little Pill,\" released in',\n",
+ " '1995, that propelled her to superstardom. The album became a cultural',\n",
+ " 'phenomenon, selling millions of copies worldwide and earning her',\n",
+ " \"numerous awards, including Grammy Awards. Morissette's music often\",\n",
+ " 'explores themes of love, anger, and personal growth. She has continued',\n",
+ " 'to release albums and tour extensively, maintaining a dedicated fan',\n",
+ " 'base. Additionally, Morissette has also delved into acting and',\n",
+ " 'activism, using her platform to raise awareness and advocate for',\n",
+ " \"various causes. Overall, Alanis Morissette's life has been marked by\",\n",
+ " 'her musical talent, successful career, and passion for using her voice',\n",
+ " 'to make a difference.']"
+ ]
+ },
+ "execution_count": 66,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "question = \"Tell me about the life of Alanis Morissette\"\n",
+ "textwrap.wrap(await answer_question(question))"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "rvl",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.13"
+ },
+ "orig_nbformat": 4
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
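Review note on the "Text Chunking" cell in `docs/examples/openai_qna.ipynb`: the sentence-boundary behavior of `chunks` can be checked without tiktoken or network access. The sketch below is only an illustration under stated assumptions. `WhitespaceTokenizer` is a hypothetical stand-in (one token per word) that is not part of the notebook, tiktoken, or RedisVL; the body of `chunks` follows the notebook's logic, and the sample text and chunk size of 4 are made up for the demonstration.

```python
class WhitespaceTokenizer:
    """Hypothetical stand-in for a tiktoken encoding: one token per word."""

    def encode(self, text):
        return text.split()

    def decode(self, tokens):
        return " ".join(tokens)


def chunks(text, n, tokenizer):
    """Yield token chunks of roughly n tokens, preferring sentence ends."""
    tokens = tokenizer.encode(text)
    i = 0
    while i < len(tokens):
        # Start with the largest allowed window (1.5 * n tokens) ...
        j = min(i + int(1.5 * n), len(tokens))
        # ... and shrink it until the decoded text ends a sentence,
        # but never below 0.5 * n tokens.
        while j > i + int(0.5 * n):
            piece = tokenizer.decode(tokens[i:j])
            if piece.endswith(".") or piece.endswith("\n"):
                break
            j -= 1
        # No sentence end found in the window: fall back to n tokens.
        if j == i + int(0.5 * n):
            j = min(i + n, len(tokens))
        yield tokens[i:j]
        i = j


tok = WhitespaceTokenizer()
text = (
    "Redis stores vectors. It can search them quickly. "
    "Each chunk is embedded once. Queries reuse the index."
)
pieces = [tok.decode(c) for c in chunks(text, 4, tok)]
for piece in pieces:
    print(piece)
```

With this toy tokenizer every chunk ends on a full stop, since each sentence fits inside the 2-to-6-token window. Text without any full stop falls back to fixed chunks of `n` tokens, with a shorter final chunk.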
diff --git a/docs/user_guide/index.md b/docs/user_guide/index.md
index e039d7b9..bddfa55d 100644
--- a/docs/user_guide/index.md
+++ b/docs/user_guide/index.md
@@ -8,11 +8,6 @@ myst:
# User Guide
-```{danger}
-RedisVL is still under active development and is subject to change at any time.
-```
-
-
```{toctree}
:caption: Introduction
:maxdepth: 3