<h1>RAG + Vector Search</h1>

<h3>
    
    1. RAG with minsearch keyword search, as built in module 1.
    
    2. RAG with vector search, continuing from module 2.
</h3>

<h2>Step 1: Building RAG with minsearch as the baseline for vector search</h2>

In [3]:
import requests
import json

docs_url = 'https://github.com/alexeygrigorev/llm-rag-workshop/raw/main/notebooks/documents.json'
docs_response = requests.get(docs_url)
docs_response.raise_for_status()  # stop early if the download failed
documents_raw = docs_response.json()

documents = []

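# Flatten the per-course lists into one list, tagging each record with its course
for course in documents_raw:
    course_name = course['course']

    for doc in course['documents']:
        doc['course'] = course_name
        documents.append(doc)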

In [4]:
documents[1]

{'text': 'GitHub - DataTalksClub data-engineering-zoomcamp#prerequisites',
 'section': 'General course-related questions',
 'question': 'Course - What are the prerequisites for this course?',
 'course': 'data-engineering-zoomcamp'}
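
The flattening step above can be tried offline on a small sample. This is a minimal sketch: `sample_raw` and the `flatten` helper are hypothetical, and the sample only mirrors the nested layout that the loop above assumes for documents.json.

```python
# Hypothetical sample that mirrors the nested course -> documents layout.
sample_raw = [
    {
        "course": "course-a",
        "documents": [
            {"text": "Answer 1", "section": "General", "question": "Question 1"},
            {"text": "Answer 2", "section": "General", "question": "Question 2"},
        ],
    },
    {
        "course": "course-b",
        "documents": [
            {"text": "Answer 3", "section": "Setup", "question": "Question 3"},
        ],
    },
]


def flatten(raw):
    """Return one flat list of records, each tagged with its course name."""
    flat = []
    for course in raw:
        for doc in course["documents"]:
            # Copy the record so the input structure is left untouched.
            flat.append({**doc, "course": course["course"]})
    return flat


sample_docs = flatten(sample_raw)
print(len(sample_docs))
print(sample_docs[2]["course"])
```

Unlike the notebook loop, this variant copies each record instead of mutating it, which keeps the raw data reusable if the cell is re-run.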

In [5]:
import minsearch

index = minsearch.Index(
    text_fields=["question", "text", "section"],
    keyword_fields=["course"]
)

index.fit(documents)

<minsearch.minsearch.Index at 0x7deef329bc80>

In [6]:
import os

import google.generativeai as genai

# Read the API key from the environment rather than hard-coding it
api_key = os.environ.get("GOOGLE_API_KEY")
genai.configure(api_key=api_key)

llmclient = genai.GenerativeModel('gemini-1.5-flash-latest')

In [7]:
def search(query):
    boost = {'question': 3.0, 'section': 0.5}

    results = index.search(
        query=query,
        # filter_dict={'course': 'data-engineering-zoomcamp'},
        boost_dict=boost,
        num_results=5
    )

    return results
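
To build intuition for what `boost_dict` does, here is a toy scorer. It is only an illustration of field weighting, not how minsearch computes scores internally: `toy_score` and the two sample records are hypothetical, and the scoring is a plain word-match count.

```python
def toy_score(doc, query, boost, text_fields=("question", "text", "section")):
    """Count query-word matches per field and weight each field by its boost."""
    query_words = set(query.lower().split())
    score = 0.0
    for field in text_fields:
        field_words = doc.get(field, "").lower().split()
        matches = sum(1 for word in field_words if word in query_words)
        # Fields missing from the boost dict keep a neutral weight of 1.0.
        score += boost.get(field, 1.0) * matches
    return score


boost = {"question": 3.0, "section": 0.5}
doc_a = {"question": "How do I run Kafka", "text": "See the module notes", "section": "Streaming"}
doc_b = {"question": "Where are the notes", "text": "Run Kafka with Docker", "section": "Streaming"}

query = "run kafka"
print(toy_score(doc_a, query, boost))  # both words match in the boosted question field
print(toy_score(doc_b, query, boost))  # both words match only in the unboosted text field
```

With the same number of matching words, the record that matches in `question` ranks higher because that field carries a weight of 3.0.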

In [8]:
def build_prompt(query, search_results):
    prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT: 
{context}
""".strip()

    context = ""
    
    for doc in search_results:
        context = context + f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"
    
    prompt = prompt_template.format(question=query, context=context).strip()
    # print(prompt)
    return prompt
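
To see what the LLM actually receives, the sketch below fills the same template with one made-up FAQ record. `format_context` and `sample_results` are hypothetical names; the record is illustrative and not taken from the real FAQ data.

```python
PROMPT_TEMPLATE = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

QUESTION: {question}

CONTEXT:
{context}
""".strip()


def format_context(search_results):
    """Render each search hit as a section/question/answer entry."""
    entries = [
        f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}"
        for doc in search_results
    ]
    # Separate entries with a blank line so the model can tell them apart.
    return "\n\n".join(entries)


# Made-up record, for illustration only.
sample_results = [
    {"section": "General", "question": "Can I join late?", "text": "Yes, you can still join."}
]

prompt = PROMPT_TEMPLATE.format(
    question="Can I still enroll?",
    context=format_context(sample_results),
)
print(prompt)
```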

In [9]:
def llm(prompt):
    # Reuse the llmclient created earlier instead of building a new model object per call
    response = llmclient.generate_content(contents=prompt)

    return response

In [10]:
def rag(query):
    search_results = search(query)
    prompt = build_prompt(query, search_results)
    answer = llm(prompt).text
    return answer

In [11]:
rag('how do I run kafka?')

'The provided text gives instructions for running Kafka producers and consumers in Java and Python.  For Java,  run `java -cp build/libs/<jar_name>-1.0-SNAPSHOT.jar:out src/main/java/org/example/JsonProducer.java` in the project directory. For Python, first create a virtual environment (`python -m venv env`), activate it (`source env/bin/activate` or `env/Scripts/activate` on Windows), install requirements (`pip install -r ../requirements.txt`), and then run the Python files within that environment.  Before running the Python files, ensure all Docker images are running.  If you encounter a  `./build.sh: Permission denied` error when running a Python Kafka script, use `chmod +x build.sh` in the `/docker/spark` directory.\n'

In [12]:
rag('the course has already started, can I still enroll?')

'Yes, you can still join the course even though it has already started.  You may miss some homework assignments, but you can still participate in the course and potentially earn a certificate by completing two out of three projects and reviewing three peer projects by the deadline.\n'

<h2>Step 2: Setting up vector search and plugging it into the RAG pipeline built above</h2>

In [13]:
from qdrant_client import QdrantClient, models

In [14]:
qdclient = QdrantClient("http://localhost:6333")

In [15]:
embedding_dimensionality = 512
model_handle = "jinaai/jina-embeddings-v2-small-en"

In [16]:
collection_name = "zoomcamp-rag"

In [17]:
# Uncomment to drop an existing collection before re-creating it
# qdclient.delete_collection(collection_name=collection_name)

In [None]:
# Create the collection; the vector size must match the embedding model's output dimension
qdclient.create_collection(
    collection_name=collection_name,
    vectors_config=models.VectorParams(
        size=embedding_dimensionality,
        distance=models.Distance.COSINE
    )
)
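
The collection ranks points by cosine distance. As a reminder of what that metric measures, here is a minimal pure-Python sketch of cosine similarity on toy 2-D vectors (`cosine_similarity` is a hypothetical helper; the real embeddings have 512 dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])
print(same_direction)
print(orthogonal)
```

Only the direction of the vectors matters, not their length, which is why the first pair scores as a perfect match even though the magnitudes differ.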

In [19]:
# Index the "course" payload field so it can be used for filtering
qdclient.create_payload_index(
    collection_name=collection_name,
    field_name="course",
    field_schema="keyword"
)

UpdateResult(operation_id=4, status=<UpdateStatus.COMPLETED: 'completed'>)

In [20]:
# Build one point per document; the question and answer text are embedded together
points = []

for i, doc in enumerate(documents):
    text = doc['question'] + ' ' + doc['text']
    vector = models.Document(text = text, model = model_handle)
    point = models.PointStruct(
        id = i,
        vector = vector,
        payload = doc
    )
    points.append(point)

In [21]:
# Upsert the points into the collection

qdclient.upsert(
    collection_name = collection_name,
    points = points
)

UpdateResult(operation_id=5, status=<UpdateStatus.COMPLETED: 'completed'>)
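
The cell above sends every point in a single call. For larger datasets, one option is to upsert in smaller batches. The helper below is a hedged sketch: `batched` and the batch size are assumptions, not part of the original notebook.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Demonstration on plain integers; in the notebook the items would be the points list.
demo_batches = list(batched(list(range(5)), 2))
print(demo_batches)

# Possible use in the notebook (assumes qdclient, collection_name and points from above):
# for batch in batched(points, 100):
#     qdclient.upsert(collection_name=collection_name, points=batch)
```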

In [22]:
# Vector search function; the course filter defaults to the data engineering course

def vector_search(question, course='data-engineering-zoomcamp'):

    query_points = qdclient.query_points(
        collection_name = collection_name,
        query = models.Document(
            text = question,
            model = model_handle,
        ),
        query_filter = models.Filter(
            must = [
                models.FieldCondition(
                    key = 'course',
                    match = models.MatchValue(value = course)
                )
            ]
        ),
        limit = 5,
        with_payload = True 
    )

    results = []

    for point in query_points.points:
        results.append(point.payload)

    return results
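
Conceptually, the query above does two things: it keeps only points whose `course` payload matches, then returns the closest vectors. The brute-force sketch below mimics that behavior on toy data. It is not how Qdrant executes the query internally, and `brute_force_search`, `cosine`, and `toy_points` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def brute_force_search(points, query_vector, course, limit=5):
    """Filter by course, rank by cosine similarity, return the top payloads."""
    candidates = [p for p in points if p["payload"]["course"] == course]
    ranked = sorted(candidates, key=lambda p: cosine(p["vector"], query_vector), reverse=True)
    return [p["payload"] for p in ranked[:limit]]


# Toy 2-D vectors standing in for real embeddings.
toy_points = [
    {"vector": [1.0, 0.0], "payload": {"course": "course-a", "question": "q1"}},
    {"vector": [0.0, 1.0], "payload": {"course": "course-a", "question": "q2"}},
    {"vector": [1.0, 0.1], "payload": {"course": "course-b", "question": "q3"}},
]

hits = brute_force_search(toy_points, [0.9, 0.1], course="course-a", limit=2)
print([hit["question"] for hit in hits])
```

The `course-b` point is the nearest neighbour of the query vector overall, yet it never appears in the results because the filter removes it before ranking.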

In [23]:
# Redefine rag() so it retrieves context with vector_search instead of minsearch

def rag(query):
    search_results = vector_search(query)
    prompt = build_prompt(query, search_results)
    answer = llm(prompt).text
    return answer
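
Redefining `rag` works fine in a notebook. An alternative, sketched here under assumptions, is to pass the retrieval function in as an argument so keyword search and vector search can be swapped without redefining anything. `make_rag` and the stub functions are hypothetical, and the stub LLM returns a plain string rather than a Gemini response object.

```python
def make_rag(search_fn, build_prompt_fn, llm_fn):
    """Build a rag(query) function from interchangeable pipeline steps."""
    def rag(query):
        search_results = search_fn(query)
        prompt = build_prompt_fn(query, search_results)
        return llm_fn(prompt)
    return rag


# Offline stubs so the wiring can be checked without Qdrant or an API key.
def fake_search(query):
    return [{"section": "General", "question": "Q", "text": "A"}]


def fake_build_prompt(query, search_results):
    return f"{query} | {len(search_results)} docs"


def fake_llm(prompt):
    return f"echo: {prompt}"


rag_stub = make_rag(fake_search, fake_build_prompt, fake_llm)
print(rag_stub("hello"))
```

In the notebook, the same factory could be called once with `search` and once with `vector_search` to compare both retrieval methods on identical questions.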

In [24]:
rag("how do i run kafka?")

'The provided text describes running Kafka producers and consumers using Java and Python, and troubleshooting related issues.  For Java, you run  `java -cp build/libs/<jar_name>-1.0-SNAPSHOT.jar:out src/main/java/org/example/JsonProducer.java` from the project directory.  For Python,  you need to create a virtual environment (`python -m venv env`), activate it (`source env/bin/activate`), install requirements (`pip install -r ../requirements.txt`), and then run your Python scripts. Ensure your Kafka broker Docker container is running (`docker ps`; `docker compose up -d`) if you encounter `kafka.errors.NoBrokersAvailable`.  Correct server URLs and cluster keys/secrets in your code are also crucial for successful execution.\n'

<h3> 
Conclusion: vector search is now integrated into the RAG pipeline. Next up: hybrid search, which combines keyword search with vector (semantic) search.
</h3>