# Introduction

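In this notebook, I use the `langchain_ollama` integration and ChromaDB to build a simple retrieval-augmented generation (RAG) pipeline on top of a local Ollama model.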


In [1]:
# Import required libraries
from langchain_ollama import OllamaEmbeddings, OllamaLLM
import chromadb
from pathlib import Path
from typing import List

In [2]:
# Define the model used for both the embeddings and the LLM
llm_model = "steamdj/llama3.1-cpu-only"
print(f"embedding model name: {llm_model}")

# Define the ChromaDB storage path
current_dir: Path = Path.cwd()
chroma_db_path: Path = current_dir.joinpath("vector_db")

# Maximum number of context documents the vector DB may return
MAX_CONTEXT_DOC_SIZE: int = 5

print(f"vector db path: {chroma_db_path}")

embedding model name: steamdj/llama3.1-cpu-only
vector db path: /home/pliu/git/LLMPlayplace/ollama/02_advance/ollama_rag/vector_db


## Step 1: Prepare indexing system

In this step, we prepare an indexing system made of two parts:
  - An embedding model, which transforms documents into vectors.
  - A vector DB, which stores the embeddings (the list of vectors).

In this tutorial:
- We use the `langchain_ollama` API to connect to an Ollama model runtime that serves as the embedding model.
- We use `ChromaDB` as the vector DB.
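
To make the two roles concrete, here is a minimal sketch that needs no external service. The `toy_embed` function and its three-word vocabulary are hypothetical stand-ins for a real embedding model, and a plain Python list plays the role of the vector DB. It only illustrates the idea that retrieval is a nearest-neighbour search over vectors; it is not how Ollama or ChromaDB work internally.

```python
import math
from typing import List, Tuple

# Hypothetical vocabulary: a real embedding model learns dense vectors instead.
VOCAB = ["food", "games", "swim"]

def toy_embed(text: str) -> List[float]:
    """Count how often each vocabulary word occurs in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)

def nearest(query: str, store: List[Tuple[str, List[float]]]) -> str:
    """Return the stored document whose vector is most similar to the query."""
    query_vec = toy_embed(query)
    return max(store, key=lambda item: cosine_similarity(query_vec, item[1]))[0]

# "Indexing": embed each document once and keep (document, vector) pairs.
docs = ["good food and more food", "video games all day", "learn to swim"]
toy_store = [(doc, toy_embed(doc)) for doc in docs]

print(nearest("which food", toy_store))
```

The rest of the notebook follows the same two phases, with `OllamaEmbeddings` in place of `toy_embed` and a ChromaDB collection in place of `toy_store`.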


### 1.1 Building the embedding model

In [3]:
# Create an embedding model that transforms input documents into vectors.
# In this example, we use the `langchain_ollama` API, which connects to a local Ollama model.
embedding_model = OllamaEmbeddings(
        model=llm_model,
        base_url="http://localhost:11434"  # Adjust the base URL as per your Ollama server configuration
    )

print(type(embedding_model))

<class 'langchain_ollama.embeddings.OllamaEmbeddings'>


In [4]:
# Define a custom embedding function for the vector DB (ChromaDB)
class EmbeddingFunction:
    """
    Custom embedding function that wraps an embedding model so that ChromaDB can call it.
    """
    def __init__(self, embeddings_model: OllamaEmbeddings):
        self.langchain_embeddings = embeddings_model

    def __call__(self, input: List[str]):
        # The signature must be __call__(self, input); the argument name cannot be changed.
        # Wrap a single string in a list so that embed_documents always receives a list.
        if isinstance(input, str):
            input = [input]
        return self.langchain_embeddings.embed_documents(input)

# Initialize the embedding function with Ollama embeddings
embedding = EmbeddingFunction(embedding_model)
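
The wrapper only relies on the model exposing `embed_documents`, so its contract can be checked without a running Ollama server. The sketch below swaps in a hypothetical `FakeEmbeddings` stub (not part of any library) and a copy of the wrapper named `SketchEmbeddingFunction`, both invented here for illustration. It shows that a single string and a list of strings both come back as a list of vectors.

```python
from typing import List, Union

class FakeEmbeddings:
    """Hypothetical stand-in for OllamaEmbeddings: one 2-d vector per text."""
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Encode each text as [character count, word count].
        return [[float(len(t)), float(len(t.split()))] for t in texts]

class SketchEmbeddingFunction:
    """Same shape as the wrapper above, duplicated here to stay self-contained."""
    def __init__(self, embeddings_model):
        self.embeddings_model = embeddings_model

    def __call__(self, input: Union[str, List[str]]) -> List[List[float]]:
        # Normalise a single string to a one-element list before embedding.
        texts = [input] if isinstance(input, str) else list(input)
        return self.embeddings_model.embed_documents(texts)

sketch_embedding = SketchEmbeddingFunction(FakeEmbeddings())
single = sketch_embedding("hello world")
batch = sketch_embedding(["a", "b c"])
print(single)
print(batch)
```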

### 1.2 Building the vector db

In [5]:
# Configure ChromaDB
# Initialize the ChromaDB client with persistent storage in the current directory
chroma_client = chromadb.PersistentClient(path=chroma_db_path.as_posix())

# Define a collection for the RAG workflow
collection_name = "casd_doc_collection"
collection = chroma_client.get_or_create_collection(
    name=collection_name,
    metadata={"description": "A collection for RAG for casd"},
    embedding_function=embedding  # Use the custom embedding function
)

# Function to add documents to the ChromaDB collection
def add_documents_to_collection(documents: List[str], ids: List[str]):
    """
    This function takes two lists of strings and adds them to the ChromaDB collection.
    :param documents: A list of documents, each item is the content of one document.
    :param ids: A list of ids, each item is the id of the document at the same position.
    :return: None
    """
    collection.add(
        documents=documents,
        ids=ids
    )


## Step 2: Prepare retrieval system

In this step, we prepare a system that retrieves context based on the given user query.

In [6]:
# Function to query the ChromaDB collection
def query_chromadb(query_text: str, n_results: int = MAX_CONTEXT_DOC_SIZE):
    """
    The function takes a query text and returns the related documents stored in the vector db.
    :param query_text: user input query text
    :param n_results: The maximum number of results to return
    :return: a tuple (documents, metadatas), each holding one inner list per query text
    """
    results = collection.query(
        query_texts=[query_text],
        n_results=n_results
    )
    return results["documents"], results["metadatas"]



In [7]:
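def build_compact_context(retrieved_docs: List[List[str]]) -> str:
    """
    This function takes the retrieved documents and builds a compact context in a single string.
    :param retrieved_docs: documents returned by the vector db, one inner list per query
    :return: the documents of the first query joined with ";", or a fallback message
    """
    if retrieved_docs and retrieved_docs[0]: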
        return ";".join(retrieved_docs[0])
    else:
        return "No relevant documents found."
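
The query result holds one inner list per query text, which is why the function above reads `retrieved_docs[0]`. As an alternative to joining with `;`, the sketch below numbers each document so that the model can tell them apart. `build_numbered_context` and the `mock_documents` value are hypothetical and only mirror the nested shape described above.

```python
from typing import List

def build_numbered_context(retrieved_docs: List[List[str]]) -> str:
    """Format the documents of the first query as a numbered list."""
    if not retrieved_docs or not retrieved_docs[0]:
        return "No relevant documents found."
    return "\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(retrieved_docs[0], start=1)
    )

# Hand-written value shaped like the documents of a single-query result.
mock_documents = [["first document", "second document"]]
print(build_numbered_context(mock_documents))
print(build_numbered_context([[]]))
```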


In [8]:
sample_doc = [
    "Pengfei loves to eat appels, Pengfei loves to eat meats",
    "Pengfei playes lots of video games",
    "Pengfei want to learn swimming"
]
sample_doc_id = ["1","2","3"]

add_documents_to_collection(sample_doc, sample_doc_id)


In [9]:
sample_query="What food does pengfei like the most?"
docs, metadatas = query_chromadb(sample_query)
print(docs)
print(metadatas)

Number of requested results 5 is greater than number of elements in index 3, updating n_results = 3


[['Pengfei playes lots of video games', 'Pengfei loves to eat appels, Pengfei loves to eat meats', 'Pengfei want to learn swimming']]
[[None, None, None]]
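
Because the collection holds only three documents, all of them are returned, whether or not they relate to the question. One possible refinement, sketched below, is to drop results whose distance exceeds a cutoff. The sketch assumes the query result also carries a `distances` entry shaped like `documents` (one inner list per query). The `mock_results` values and the `max_distance` cutoff are made up for illustration; a sensible cutoff depends on the embedding model and on the distance metric of the collection.

```python
from typing import Dict, List

def filter_by_distance(results: Dict[str, List[List]], max_distance: float) -> List[str]:
    """Keep documents of the first query whose distance is at most max_distance."""
    documents = results["documents"][0]
    distances = results["distances"][0]
    return [doc for doc, dist in zip(documents, distances) if dist <= max_distance]

# Made-up values, shaped like a single-query result.
mock_results = {
    "documents": [["doc about food", "doc about games", "doc about swimming"]],
    "distances": [[0.2, 0.9, 1.4]],
}
print(filter_by_distance(mock_results, max_distance=1.0))
```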


In [10]:
# Function to interact with the Ollama LLM
def query_ollama(prompt):
    """
    This function sends a prompt to the Ollama LLM and retrieves the response.
    :param prompt: the full prompt text to send to the model
    :return: the response generated by the model
    """
    llm = OllamaLLM(model=llm_model)
    return llm.invoke(prompt)

In [11]:
# RAG pipeline: Combine ChromaDB and Ollama for Retrieval-Augmented Generation
def rag_pipeline(query_text: str):
    """
    This function implements the RAG pipeline.
    :param query_text:
    :return:
    """
    # Step 1: Retrieve relevant documents from ChromaDB
    retrieved_docs, metadata = query_chromadb(query_text)
    context = build_compact_context(retrieved_docs)

    # Step 2: Send the query along with the context to Ollama
    augmented_prompt = f"Context: {context}\n\nQuestion: {query_text}\nAnswer:"
    print("######## Augmented Prompt ########")
    print(augmented_prompt)

    response = query_ollama(augmented_prompt)
    return response
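
The prompt above leaves it to the model to decide how much to rely on the context. A variant that states the instruction explicitly is sketched below. `build_prompt` and its wording are hypothetical, not part of `langchain_ollama`, and whether the instruction improves answers depends on the model.

```python
def build_prompt(context: str, question: str) -> str:
    """Assemble a prompt that asks the model to answer only from the context."""
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context: {context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt(
    "Pengfei is a data scientist who works for CASD",
    "who is Pengfei?",
)
print(prompt)
```

Keeping the prompt construction in its own function also makes it possible to inspect or test the prompt without calling the model.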

In [12]:
# Example: Add sample documents to the collection
documents2 = [
    "Pengfei is a data scientist who works for CASD",
]
doc_ids2 = ["doc4"]

# Documents only need to be added once or whenever an update is required.
# This line of code is included for demonstration purposes:
add_documents_to_collection(documents2, doc_ids2)
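
Re-running the cell above submits the same ID again. One way to keep IDs stable across runs, without tracking them by hand, is to derive them from the document text. The sketch below uses a hypothetical `content_id` helper based on SHA-256; the `doc` prefix and the 12-character truncation are arbitrary choices. Chroma collections also expose `upsert` for update-or-insert semantics, which is not used in this notebook.

```python
import hashlib
from typing import List

def content_id(document: str, prefix: str = "doc") -> str:
    """Derive a deterministic ID from the document text."""
    digest = hashlib.sha256(document.encode("utf-8")).hexdigest()
    return f"{prefix}-{digest[:12]}"

def make_ids(documents: List[str]) -> List[str]:
    """One ID per document; identical texts map to the same ID."""
    return [content_id(doc) for doc in documents]

ids = make_ids(["Pengfei is a data scientist who works for CASD"])
print(ids)
```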


In [13]:
# Example usage
# Define a query to test the RAG pipeline
query1 = "who is Pengfei?"  # Change the query as needed
response1 = rag_pipeline(query1)
print("######## Response from LLM ########\n", response1)

Number of requested results 5 is greater than number of elements in index 4, updating n_results = 4


######## Augmented Prompt ########
Context: Pengfei playes lots of video games;Pengfei loves to eat appels, Pengfei loves to eat meats;Pengfei want to learn swimming;Pengfei is a data scientist who works for CASD

Question: who is Pengfei?
Answer:
######## Response from LLM ########
 Based on the context provided, it appears that "Pengfei" refers to an individual with several characteristics and interests. Here's what can be inferred:

1. **Gamer**: The person plays lots of video games.
2. **Fruit lover**: They enjoy eating apples.
3. **Meat enthusiast**: They have a preference for meats.
4. **Aspiring swimmer**: There is an interest in learning how to swim.
5. **Data scientist**: They work as a data scientist, specifically working at CASD (which stands for Computer Applications and Systems Division or similar, but more context would be ideal).

Without additional information about what "Pengfei" refers to directly (e.g., it's someone known personally), the most accurate response based

In [14]:
# Define a query to test the RAG pipeline
query2 = "What Pengfei loves to do?"  # Change the query as needed
response2 = rag_pipeline(query2)
print("######## Response from LLM ########\n", response2)

Number of requested results 5 is greater than number of elements in index 4, updating n_results = 4


######## Augmented Prompt ########
Context: Pengfei playes lots of video games;Pengfei loves to eat appels, Pengfei loves to eat meats;Pengfei want to learn swimming;Pengfei is a data scientist who works for CASD

Question: What Pengfei loves to do?
Answer:
######## Response from LLM ########
 It seems there are multiple things that Pengfei enjoys doing! 

Based on the provided context, here's what we can infer:

1. **Play video games**: It's mentioned that Pengfei plays lots of video games.
2. **Eat apples**: Pengfei loves to eat apples.
3. **Eat meats**: Another thing Pengfei loves to do is eat meats.

These are the specific activities/enjoyments that were explicitly stated in the context.
