<a href="https://github.com/Bennykillua/Build_a_RAG_Milvus/blob/main/build_RAG_with_milvus.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Preparation
### Dependencies and Environment

In [None]:
! pip install --upgrade pymilvus openai requests tqdm


In [None]:
import os

os.environ["OPENAI_API_KEY"] = "hello"  # Placeholder: replace with your own OpenAI API key
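
A key typed directly into a cell is easy to leak when the notebook is shared. As a minimal sketch, assuming you would rather reuse a key that is already exported and prompt only when it is missing, the hypothetical helper `load_api_key` below (not part of the original notebook) reads the environment first and falls back to a hidden prompt:

```python
import os
from getpass import getpass


def load_api_key(env=os.environ, prompt=getpass):
    """Return the OpenAI key from `env`, prompting only if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # getpass hides the typed value, so it does not end up in the cell output
        key = prompt("OpenAI API key: ").strip()
        env["OPENAI_API_KEY"] = key
    return key
```

Calling `load_api_key()` once before creating the OpenAI client would then take the place of the hard-coded assignment.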

## Preparing the Data and Embedding Model

In [None]:
import requests
import os

# Base URL for GitHub API to fetch file information
api_url = "https://api.github.com/repos/milvus-io/milvus/contents/docs/developer_guides"
raw_base_url = "https://raw.githubusercontent.com/milvus-io/milvus/master/docs/developer_guides/"
docs_path = "milvus_docs"

# Create a folder to save the downloaded documentation files
os.makedirs(docs_path, exist_ok=True)

# Send a request to GitHub API to list all files in the developer_guides directory
response = requests.get(api_url)

# Check if the request was successful
if response.status_code == 200:
    files = response.json()

    # Loop through the files and filter only markdown files
    for file in files:
        if file['name'].endswith('.md'):  # Only select markdown files
            file_url = raw_base_url + file['name']

            # Download each markdown file
            file_response = requests.get(file_url)
            if file_response.status_code == 200:
                # Save the content to a local markdown file
                with open(os.path.join(docs_path, file['name']), "wb") as f:
                    f.write(file_response.content)
                print(f"Downloaded: {file['name']}")
            else:
                print(f"Failed to download: {file_url} (Status code: {file_response.status_code})")
else:
    print(f"Failed to fetch file list from {api_url} (Status code: {response.status_code})")

Downloaded: appendix_a_basic_components.md
Downloaded: appendix_b_api_reference.md
Downloaded: appendix_c_system_configurations.md
Downloaded: appendix_d_error_code.md
Downloaded: appendix_e_statistics.md
Downloaded: chap01_system_overview.md
Downloaded: chap02_schema.md
Downloaded: chap03_index_service.md
Downloaded: chap04_message_stream.md
Downloaded: chap05_proxy.md
Downloaded: chap06_root_coordinator.md
Downloaded: chap07_query_coordinator.md
Downloaded: chap08_binlog.md
Downloaded: chap09_data_coord.md
Downloaded: developer_guides.md
Downloaded: how-guarantee-ts-works-cn.md
Downloaded: how-guarantee-ts-works.md
Downloaded: how_to_develop_with_local_milvus_proto.md
Downloaded: proxy-reduce-cn.md
Downloaded: proxy-reduce.md
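
The download loop rebuilds each raw URL by concatenating strings. Each entry returned by the GitHub contents endpoint also carries a `type` and a `download_url` field, so the selection step could rely on those instead. The sketch below shows only that filtering step on a made-up listing; `markdown_downloads` is a hypothetical helper and the example URLs are placeholders.

```python
def markdown_downloads(listing):
    """Yield (name, download_url) for each Markdown file in a contents-API listing."""
    for entry in listing:
        is_file = entry.get("type") == "file"
        if is_file and entry.get("name", "").endswith(".md") and entry.get("download_url"):
            yield entry["name"], entry["download_url"]


# Made-up listing in the shape the contents endpoint returns
sample_listing = [
    {"name": "chap02_schema.md", "type": "file", "download_url": "https://example.com/chap02_schema.md"},
    {"name": "figs", "type": "dir", "download_url": None},
    {"name": "notes.txt", "type": "file", "download_url": "https://example.com/notes.txt"},
]
selected = list(markdown_downloads(sample_listing))
print(selected)
```

Directories and non-Markdown files are skipped, so the loop that follows only has to fetch and save each returned URL.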


In [None]:
from glob import glob

text_lines = []

for file_path in glob(os.path.join(docs_path, "*.md")):
    with open(file_path, "r", encoding="utf-8") as file:
        file_text = file.read()

    # Split at every "# " marker. This also matches inside "## " and "### ",
    # so headings of every level start a new chunk.
    text_lines += file_text.split("# ")
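
Splitting on the literal string `"# "` cuts at headings of every level and leaves the extra `#` characters of `##` and `###` headings at the end of the preceding chunk. A heading-aware alternative, offered as a sketch rather than the notebook's method, splits only at the start of a heading line and keeps the heading with its own section. `split_by_heading` is a hypothetical name, and the sketch does not try to skip `#` comment lines inside fenced code.

```python
import re

# Zero-width match at the start of any line that begins with 1-6 '#' and a space
HEADING_START = re.compile(r"(?m)^(?=#{1,6} )")


def split_by_heading(markdown_text):
    """Split Markdown into chunks that each start at a heading line."""
    chunks = HEADING_START.split(markdown_text)
    # Drop empty pieces, such as the one before a heading on the first line
    return [chunk.strip() for chunk in chunks if chunk.strip()]


sample = "# Title\nintro\n\n## Section A\nbody a\n\n### Sub\nbody b\n"
chunks = split_by_heading(sample)
print(chunks)
```

Filtering out empty chunks also avoids sending empty strings to the embedding model.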

## Prepare the Embedding Model with OpenAI

In [None]:
from openai import OpenAI

openai_client = OpenAI()

In [None]:
def emb_text(text):
    return (
        openai_client.embeddings.create(input=text, model="text-embedding-3-small")
        .data[0]
        .embedding
    )
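
`emb_text` sends one request per text. The embeddings endpoint also accepts a list of inputs, so many chunks can share one request. The following is a sketch under that assumption; `emb_texts` and its `batch_size` default are illustrative choices, not part of the original notebook.

```python
def emb_texts(client, texts, model="text-embedding-3-small", batch_size=64):
    """Embed `texts` in batches and return one vector per input, in input order."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start : start + batch_size]
        response = client.embeddings.create(input=batch, model=model)
        # Sort by `index` so each vector lines up with its input text
        ordered = sorted(response.data, key=lambda item: item.index)
        vectors.extend(item.embedding for item in ordered)
    return vectors
```

With the client created above, `emb_texts(openai_client, text_lines)` would return the vectors in the same order as `text_lines`.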

## Loading Data into Milvus

In [None]:
! pip install -U pymilvus



## Create the Collection

In [None]:
from pymilvus import MilvusClient

milvus_client = MilvusClient(uri="./milvus_demo.db")

collection_name = "my_rag_collection"

DEBUG:pymilvus.milvus_client.milvus_client:Created new connection using: c0e007eb7b19405f84c3907ae4fd22fb


In [None]:
test_embedding = emb_text("This is a test")
embedding_dim = len(test_embedding)
print(embedding_dim)
print(test_embedding[:10])

1536
[0.009889289736747742, -0.005578675772994757, 0.00683477520942688, -0.03805781528353691, -0.01824733428657055, -0.04121600463986397, -0.007636285852640867, 0.03225184231996536, 0.018949154764413834, 9.352207416668534e-05]


Check if the collection already exists and drop it if it does.

In [None]:
if milvus_client.has_collection(collection_name):
    milvus_client.drop_collection(collection_name)

In [None]:
milvus_client.create_collection(
    collection_name=collection_name,
    dimension=embedding_dim,
    metric_type="IP",  # Inner product distance
    consistency_level="Strong",  # Strong consistency level
)

DEBUG:pymilvus.milvus_client.milvus_client:Successfully created collection: my_rag_collection
DEBUG:pymilvus.milvus_client.milvus_client:Successfully created an index on collection: my_rag_collection
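
The collection uses the inner product (`IP`) metric. OpenAI documents its embeddings as normalized to length 1, and for unit-length vectors the inner product equals the cosine similarity. The sketch below checks that identity on two made-up vectors in plain Python; the helper names are illustrative.

```python
import math


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)


def normalize(v):
    norm = math.sqrt(inner_product(v, v))
    return [x / norm for x in v]


# Made-up vectors, scaled to unit length for illustration
u = normalize([3.0, 4.0, 0.0])
w = normalize([1.0, 2.0, 2.0])
print(inner_product(u, w), cosine_similarity(u, w))
```

If the vectors were not normalized, the two scores would differ and `IP` would favor longer vectors.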


### Insert data
Iterate through the text lines, create an embedding for each one, and insert the results into Milvus.

The `text` field is not defined in the collection schema. Milvus adds it automatically to the reserved JSON dynamic field, and you can use it like a regular field in queries and search results.
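
To make the row shape concrete, here is a small sketch that pairs texts with their vectors and checks each vector's length before anything is sent to Milvus. `build_rows` is a hypothetical helper; it assumes, as the insert cell in this notebook does, that the collection expects the keys `id` and `vector`.

```python
def build_rows(texts, vectors, dim):
    """Pair each text with its vector in the row shape used for insertion."""
    if len(texts) != len(vectors):
        raise ValueError("texts and vectors must have the same length")
    rows = []
    for i, (text, vector) in enumerate(zip(texts, vectors)):
        if len(vector) != dim:
            raise ValueError(f"row {i}: expected {dim} dimensions, got {len(vector)}")
        # "id" and "vector" are the fields the collection expects; "text" is
        # not declared in the schema, so it goes to the dynamic field
        rows.append({"id": i, "vector": vector, "text": text})
    return rows


rows = build_rows(["first chunk", "second chunk"], [[0.1, 0.2], [0.3, 0.4]], dim=2)
print(rows[1])
```

A dimension mismatch then fails early with a readable message instead of surfacing during the insert call.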

In [None]:
from tqdm import tqdm

data = []

for i, line in enumerate(tqdm(text_lines, desc="Creating embeddings")):
    data.append({"id": i, "vector": emb_text(line), "text": line})

milvus_client.insert(collection_name=collection_name, data=data)

Creating embeddings: 100%|██████████| 246/246 [00:54<00:00,  4.52it/s]


{'insert_count': 246, 'ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 
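
The cell above sends all 246 rows in a single `insert` call, which is fine at this size. For a larger corpus, one option is to insert in slices. The sketch below assumes only what the output above shows, namely that `insert` returns a dictionary with an `insert_count` key; `insert_in_batches` and its `batch_size` default are hypothetical.

```python
def insert_in_batches(client, collection_name, rows, batch_size=100):
    """Insert `rows` in slices of `batch_size` and return the total inserted count."""
    total = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start : start + batch_size]
        result = client.insert(collection_name=collection_name, data=batch)
        total += result["insert_count"]
    return total
```

With the objects defined earlier, `insert_in_batches(milvus_client, collection_name, data)` would insert the same rows in three requests instead of one.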

## Build RAG

In [None]:
question = "What are the key features of Milvus that make it suitable for handling vector databases in AI applications?"

In [None]:
search_res = milvus_client.search(
    collection_name=collection_name,
    data=[
        emb_text(question)
    ],  # Use the `emb_text` function to convert the question to an embedding vector
    limit=3,  # Return top 3 results
    search_params={"metric_type": "IP", "params": {}},  # Inner product distance
    output_fields=["text"],  # Return the text field
)

In [None]:
import json

retrieved_lines_with_distances = [
    (res["entity"]["text"], res["distance"]) for res in search_res[0]
]
print(json.dumps(retrieved_lines_with_distances, indent=4))

[
    [
        "1. System Overview\n\nIn this section, we sketch the system design of Milvus, including the data model, data organization, architecture, and state synchronization.\n\n###",
        0.5789735913276672
    ],
    [
        "1.1 Data Model\n\nMilvus exposes the following set of data features to applications:\n\n- a data model based on schematized relational tables, in that rows must have primary keys,\n\n- a query language specifies data definition, data manipulation, and data query, where data definition includes create, drop, and data manipulation includes insert, upsert, delete, and data query falls into three types, primary key search, approximate nearest neighbor search (ANNS), ANNS with predicates.\n\nThe requests' execution order is strictly in accordance with their issue-time order. We take Proxy's issue time as a request's issue time. For a batch request, all its sub-requests share the same issue time. In cases there are multiple proxies, issue time from differen

## Use LLM to get a RAG response

In [None]:
context = "\n".join(
    [line_with_distance[0] for line_with_distance in retrieved_lines_with_distances]
)

In [None]:
SYSTEM_PROMPT = """
You are an AI assistant. You are able to find answers to the questions from the contextual passage snippets provided.
"""
USER_PROMPT = f"""
Use the following pieces of information enclosed in <context> tags to provide an answer to the question enclosed in <question> tags.
<context>
{context}
</context>
<question>
{question}
</question>
"""

In [None]:
response = openai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_PROMPT},
    ],
)
print(response.choices[0].message.content)

Milvus has key features that make it suitable for handling vector databases in AI applications:

1. Milvus provides a data model based on schematized relational tables where rows must have primary keys.
2. It supports a query language that specifies data definition, data manipulation, and data query, including operations like create, drop, insert, upsert, delete, primary key search, approximate nearest neighbor search (ANNS), and ANNS with predicates.
3. Requests' execution order follows their issue-time order, ensuring consistency.
4. Milvus guarantees atomic visibility for batch insert/delete operations.
5. Milvus implements a clock mechanism using timestamps to maintain consistency in the reading process by marking data with synchronization timestamps.
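
The retrieval and generation cells above run once for a fixed `question`, and `USER_PROMPT` is an f-string that is filled in only when its cell executes. To answer new questions, for example from a web front end, the same steps can be wrapped in one function. This is a sketch that reuses the calls shown in this notebook; `answer_question` and its parameters are hypothetical names.

```python
SYSTEM_PROMPT = (
    "You are an AI assistant. You are able to find answers to the questions "
    "from the contextual passage snippets provided."
)


def answer_question(question, milvus_client, openai_client, collection_name, embed,
                    top_k=3, chat_model="gpt-3.5-turbo"):
    """Retrieve `top_k` passages for `question`, then ask the chat model to answer from them."""
    hits = milvus_client.search(
        collection_name=collection_name,
        data=[embed(question)],
        limit=top_k,
        search_params={"metric_type": "IP", "params": {}},
        output_fields=["text"],
    )[0]  # one result list per query vector; there is a single query here
    context = "\n".join(hit["entity"]["text"] for hit in hits)
    # Build the prompt inside the function so it reflects the current question
    user_prompt = (
        "Use the following pieces of information enclosed in <context> tags to provide "
        "an answer to the question enclosed in <question> tags.\n"
        f"<context>\n{context}\n</context>\n"
        f"<question>\n{question}\n</question>\n"
    )
    response = openai_client.chat.completions.create(
        model=chat_model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content
```

With the objects defined earlier, a call would look like `answer_question(question, milvus_client, openai_client, collection_name, emb_text)`.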


## Deploying the System

In [None]:
!pip install streamlit
!pip install pymilvus
