# Goal of this Notebook

In this notebook we use LangChain to build a simple RAG pipeline: slide content stored in Milvus is retrieved as context, and the Llama 3.2 model, served locally by Ollama, answers questions about it.

### 🚕 Simple Retrieval-Augmented Generation (RAG) with LangChain:

Build a simple Python [RAG](https://milvus.io/docs/integrate_with_langchain.md) application that uses Milvus and Ollama to 
answer questions about Tim's slides.

### 🔍 Summary
By the end of this notebook, you'll have a working understanding of using Milvus, ingesting semi-structured and unstructured data, and using open source models to build a local data retrieval system.


#### 🐍 AIM Stack - Easy Local Free Open Source RAG

* Ollama
* Python
* Jupyter Notebooks
* Llama 3.2
* Milvus-Lite
* LangChain

![milvuslogo](https://milvus.io/images/milvus_logo.svg)

#### 🎃 Resources

* https://zilliz.com/blog/a-beginners-guide-to-using-llama-3-with-ollama-milvus-langchain
* https://github.com/stephen37/ollama_local_rag/blob/main/rag_berlin_parliament.py
* https://python.langchain.com/docs/integrations/vectorstores/milvus/
* https://api.python.langchain.com/en/latest/vectorstores/langchain_milvus.vectorstores.milvus.Milvus.html
* https://developer.nvidia.com/blog/rag-101-demystifying-retrieval-augmented-generation-pipelines/
* https://zilliz.com/learn/build-rag-with-milvus-lite-llama3-and-llamaindex
  


In [1]:
import warnings
warnings.filterwarnings('ignore')
warnings.filterwarnings("ignore", category=DeprecationWarning)

In [8]:
!pip install -qU sentence-transformers langchain langchain-community langchain-milvus langchain-huggingface ollama langchain-ollama pypdf langchainhub "pymilvus[model]"
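
After the install finishes, a quick check that the packages can be found saves debugging later. The following is a minimal sketch using only the standard library; the helper name `missing_modules` and the `REQUIRED` list are illustrative, and the import names are assumed to match the packages installed above.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the pip packages installed above
REQUIRED = [
    "sentence_transformers",
    "langchain",
    "langchain_milvus",
    "langchain_huggingface",
    "ollama",
    "langchain_ollama",
    "pypdf",
    "pymilvus",
]


def missing_modules(module_names):
    """Return the names that Python cannot locate, without importing them."""
    return [name for name in module_names if find_spec(name) is None]


print("Missing:", missing_modules(REQUIRED) or "none")
```

If anything is listed as missing, re-run the install cell and restart the kernel.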

## 🔥 Install Ollama

![ollama](https://ollama.com/public/ollama.png)

### 🛩️ Download for Mac, Linux, Windows

https://ollama.com/download

### 👽 Install Open Source Llama 3.2 model from Meta

https://ollama.com/library/llama3.2

```
ollama run llama3.2

>>> /bye
```

Running the model for the first time downloads a couple of gigabytes of model weights. When the download completes, Ollama puts you in interactive chat mode.
You can test it, or type */bye* to exit.


###  🙅 List all of your models

````
ollama list

NAME                         ID              SIZE      MODIFIED
llava:7b                     8dd30f6b0cb1    4.7 GB    40 hours ago
mistral-nemo:latest          994f3b8b7801    7.1 GB    9 days ago
gemma2:2b                    8ccf136fdd52    1.6 GB    10 days ago
nomic-embed-text:latest      0a109f422b47    274 MB    10 days ago
llama3.2:3b-instruct-fp16    195a8c01d91e    6.4 GB    2 weeks ago
llama3.2:latest              a80c4f17acd5    2.0 GB    2 weeks ago
reader-lm:latest             33da2b9e0afe    934 MB    3 weeks ago
````
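
The same listing is available from Python through Ollama's local REST API. This is a minimal sketch, assuming the Ollama server is running at its default address `http://localhost:11434`; the helper names `model_names`, `list_local_models`, and `has_model` are illustrative and not part of any library.

```python
import json
from urllib.request import urlopen

# Default local address of the Ollama server (assumed unchanged)
OLLAMA_URL = "http://localhost:11434"


def model_names(payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask the running Ollama server which models are installed."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return model_names(json.load(response))


def has_model(names: list[str], wanted: str = "llama3.2") -> bool:
    """True if `wanted` appears in `names`, with or without a tag suffix."""
    return any(n == wanted or n.startswith(wanted + ":") for n in names)
```

Calling `has_model(list_local_models())` before building the chain gives an early, readable failure if the model has not been pulled yet.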

### 📊 Let's use it

### 🦙 First, let's import all the libraries we will need


In [2]:
import os
from pymilvus import MilvusClient
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_milvus import Milvus
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain import hub

#### Constant For PDF Downloads, If you Change This, Change in Section Below As Well
path_pdfs = "talks/"

#### Initialize Our Documents
documents = []

### 🐦 Download some PDFs of talks

1. Build a directory for the talks
2. Download PDFs

#### Note:

You can use your own PDFs or download more from:

* https://github.com/tspannhw/SpeakerProfile
* https://www.slideshare.net/bunkertor/presentations


In [25]:
!mkdir -p talks
!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/01-oct-2024pes-vectordatabasesandai-241001142959-0510fbe6.pdf
!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/06-04-2024-nyctechweek-discussiononvectordatabasesunstructureddataandai-240605125759-d80c571c.pdf
!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/06-04-2024-nyctechweek-localragwithllama3andmilvus-240605113619-6e91032b.pdf
!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/08-13-2024nycmeetup-unstructureddataprocessingfromcloudtoedge-240812185343-3ae3ff2b.pdf
!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/09-12-2024milvussensordatarag-240907202906-ea0e6890.pdf
!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/09-18-2024nycmeetup-vectordatabases102-240917124641-19bae3b0.pdf
!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/09-19-2024aicamphybridseach-240919215006-76282317.pdf
!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/09-25-2024njxventuresummitintroduction-240926012816-5fbfcc78.pdf
!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/06-20-2024-aicampmeetup-unstructureddataandvectordatabases-240620160248-1964efbc.pdf
!wget -P talks/ https://raw.githubusercontent.com/tspannhw/SpeakerProfile/main/2024/06-18-2024-princetonmeetup-introductiontomilvus-240619130633-2ad701db.pdf

!ls talks/
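
On systems where `wget` is not available (for example, a default Windows install), the downloads can be done from Python instead. This is a minimal sketch using only the standard library; `filename_from_url` and `download_pdfs` are illustrative helper names, and the default target directory `talks` matches `path_pdfs`.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlretrieve


def filename_from_url(url: str) -> str:
    """Return the last path segment of a URL, e.g. 'slides.pdf'."""
    return PurePosixPath(urlparse(url).path).name


def download_pdfs(urls, dest="talks"):
    """Download each URL into `dest`, skipping files that already exist."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        target = dest_dir / filename_from_url(url)
        if not target.exists():
            urlretrieve(url, target)
        saved.append(target)
    return saved
```

Pass it the same list of `raw.githubusercontent.com` URLs used in the `wget` commands above.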

### Iterate through PDFs and load into documents


In [3]:
for file in os.listdir(path_pdfs):
    if file.endswith(".pdf"):
        pdf_path = os.path.join(path_pdfs, file)
        print(pdf_path)
        loader = PyPDFLoader(pdf_path)
        documents.extend(loader.load())

talks/09-18-2024nycmeetup-vectordatabases102-240917124641-19bae3b0.pdf


could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead
could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead
could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead
could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead


talks/01-oct-2024pes-vectordatabasesandai-241001142959-0510fbe6.pdf


could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead
could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead
could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead
could not convert string to float: b'0.00-30' : FloatObject (b'0.00-30') invalid; use 0.0 instead


talks/06-04-2024-nyctechweek-localragwithllama3andmilvus-240605113619-6e91032b.pdf
talks/08-13-2024nycmeetup-unstructureddataprocessingfromcloudtoedge-240812185343-3ae3ff2b.pdf
talks/06-04-2024-nyctechweek-discussiononvectordatabasesunstructureddataandai-240605125759-d80c571c.pdf
talks/06-18-2024-princetonmeetup-introductiontomilvus-240619130633-2ad701db.pdf
talks/09-12-2024milvussensordatarag-240907202906-ea0e6890.pdf
talks/09-25-2024njxventuresummitintroduction-240926012816-5fbfcc78.pdf
talks/06-20-2024-aicampmeetup-unstructureddataandvectordatabases-240620160248-1964efbc.pdf


could not convert string to float: b'0.00-40' : FloatObject (b'0.00-40') invalid; use 0.0 instead
could not convert string to float: b'0.00-20' : FloatObject (b'0.00-20') invalid; use 0.0 instead
could not convert string to float: b'0.00-20' : FloatObject (b'0.00-20') invalid; use 0.0 instead
could not convert string to float: b'0.00-40' : FloatObject (b'0.00-40') invalid; use 0.0 instead
could not convert string to float: b'0.00-40' : FloatObject (b'0.00-40') invalid; use 0.0 instead
could not convert string to float: b'0.00-20' : FloatObject (b'0.00-20') invalid; use 0.0 instead
could not convert string to float: b'0.00-20' : FloatObject (b'0.00-20') invalid; use 0.0 instead
could not convert string to float: b'0.00-40' : FloatObject (b'0.00-40') invalid; use 0.0 instead
could not convert string to float: b'0.00-40' : FloatObject (b'0.00-40') invalid; use 0.0 instead


talks/09-19-2024aicamphybridseach-240919215006-76282317.pdf


### Connect to Milvus

Use Milvus Lite for a local database.

The data is stored in the file ./rag101.db.

You can easily switch to Milvus running in Docker, or to Zilliz Cloud, for a more advanced deployment.

You can drop the collection if you want to start fresh.
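
Switching between these deployments mostly comes down to the connection arguments. The sketch below is one way to keep that choice in a single place; the helper `milvus_connection_args` is hypothetical, `http://localhost:19530` is assumed to be the address of a default Milvus Docker install, and the Zilliz Cloud endpoint and token are values you must supply yourself.

```python
def milvus_connection_args(target="lite", *, uri=None, token=None) -> dict:
    """Build a connection_args dict for the chosen Milvus deployment."""
    if target == "lite":
        # Milvus Lite: a local file, as used in this notebook
        return {"uri": uri or "./rag101.db"}
    if target == "docker":
        # Milvus standalone in Docker (assumed default port 19530)
        return {"uri": uri or "http://localhost:19530"}
    if target == "zilliz":
        # Zilliz Cloud needs both the cluster endpoint and an API token
        if not uri or not token:
            raise ValueError("Zilliz Cloud requires both uri and token")
        return {"uri": uri, "token": token}
    raise ValueError(f"Unknown target: {target!r}")
```

The resulting dictionary can be passed as `connection_args` when creating the vector store, so the rest of the notebook stays unchanged.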

In [3]:
MILVUS_URL = "./rag101.db"

client = MilvusClient(uri=MILVUS_URL)

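if client.has_collection("LangChainCollection"):
    print("Collection exists")
    # Uncomment the next line to drop the collection and start fresh
    # client.drop_collection("LangChainCollection")
else:
    print("Collection does not exist yet")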

Collection exists


### 🐍 AIM Stack - Easy Local Free Open Source RAG

#### Choose Your Model

https://zilliz.com/ai-models

#### Free, Hugging Face Hosted, Open Source Apache Licensed 

https://zilliz.com/ai-models/all-minilm-l12-v2

````
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L12-v2")
````

#### Powerful JINA AI Model for Text Embedding

https://zilliz.com/ai-models/jina-embeddings-v2-base-en

#### Next step, chunk up our big documents


In [7]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)

all_splits = text_splitter.split_documents(documents)
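
To see how `chunk_size` and `chunk_overlap` interact, here is a deliberately simplified sketch in plain Python. It cuts fixed-size character windows and is not the algorithm `RecursiveCharacterTextSplitter` uses (which prefers to split on separators such as paragraphs and lines); the function name `chunk_text` is illustrative.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        end = start + chunk_size
        chunks.append(text[start:end])
        if end >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks


sample = "".join(str(i % 10) for i in range(1200))
pieces = chunk_text(sample, chunk_size=500, chunk_overlap=100)
print([len(p) for p in pieces])
```

With these settings each window advances 400 characters, so the last 100 characters of one chunk reappear at the start of the next. The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.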


#### We load our JINA AI text embedding model via HuggingFace

#### Then we load all of the splits and embeddings to Milvus

This will take some time, because it downloads the model and its metadata.

````
    drop_old=True 
````

Setting it to True deletes any previously loaded documents before loading. If you set it to False, the existing documents are kept, so you don't have to reload them. Anything you load is then added on top, which can create duplicates if you load the same files again.
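
One way to guard against duplicates within a batch is to filter chunks by a content hash before loading them. This is a minimal sketch, not something the notebook or LangChain does for you; `content_id` and `unique_chunks` are hypothetical helpers, and chunks are represented here as plain `(source, text)` tuples.

```python
import hashlib


def content_id(text: str, source: str = "") -> str:
    """Deterministic ID derived from a chunk's source and text."""
    return hashlib.sha256(f"{source}\n{text}".encode("utf-8")).hexdigest()


def unique_chunks(chunks):
    """Keep the first occurrence of each (source, text) pair, preserving order."""
    seen = set()
    unique = []
    for source, text in chunks:
        key = content_id(text, source)
        if key not in seen:
            seen.add(key)
            unique.append((source, text))
    return unique
```

Because the ID depends only on the content, the same chunk always maps to the same hash, which also makes it possible to compare a new batch against IDs recorded from an earlier load.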


#### Verify Documents are Loaded

We run a *similarity_search* on the newly loaded vector store


#### Reference

https://milvus.io/docs/integrate_with_langchain.md



In [4]:
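from langchain_huggingface import HuggingFaceEmbeddings

# Run on CPU; trust_remote_code lets the Jina model load its custom code
model_kwargs = {"device": "cpu", "trust_remote_code": True}

embeddings = HuggingFaceEmbeddings(model_name="jinaai/jina-embeddings-v2-base-en", model_kwargs=model_kwargs)

# Embed the chunks produced by the text splitter and store them in Milvus
vectorstore = Milvus.from_documents(
    documents=all_splits,
    embedding=embeddings,
    connection_args={
        "uri": MILVUS_URL,
    },
    drop_old=False,  # set to True to delete previously loaded documents first
)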

vectorstore.similarity_search("What is Milvus?", k=1)

[Document(metadata={'page': 23, 'pk': 453229992137722003, 'source': 'talks/06-20-2024-aicampmeetup-unstructureddataandvectordatabases-240620160248-1964efbc.pdf'}, page_content='24\n|   © Copyright Zilliz 24\nMilvus: From Dev to Prod \nAI Powered Search made easy \nMilvus is an Open-Source Vector \nDatabase  to store, index, manage, and \nuse the massive number of embedding  \nvectors  generated by deep neural \nnetworks and LLMs. \ncontributors \n267+\nstars\n27K+\ndownloads \n25M+\nforks\n2K+\n')]
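
Conceptually, a similarity search embeds the query and returns the stored vectors closest to it. The brute-force sketch below shows the idea with cosine similarity in plain Python. It is only an illustration: Milvus uses indexes rather than a full scan, and the metric actually applied depends on how the collection is configured. The names `cosine_similarity` and `top_k` are illustrative.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=1):
    """Return the k (index, score) pairs with the highest similarity."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional "embeddings"; real ones have hundreds of dimensions
hits = top_k([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], k=2)
print(hits)
```

The `k=1` argument in the call above plays the same role as `k` here: it limits the result to the single closest chunk.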


#### Code our loop to call Llama 3.2


We will pull the RAG prompt from the LangChain Hub and connect the documents loaded into Milvus to our
chat with Llama 3.2.
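
Under the hood, the chain fills a prompt template with the user's question and the chunks returned by the retriever, then sends the result to the model. The sketch below shows that step in plain Python. The template text is an illustrative stand-in, not the actual content of the hub prompt, and `build_prompt` is a hypothetical helper.

```python
# Illustrative template; the real hub prompt may be worded differently
RAG_TEMPLATE = (
    "Use the following retrieved context to answer the question. "
    "If you don't know the answer, say that you don't know.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)


def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Join the retrieved chunks into one context block and fill the template."""
    context = "\n\n".join(retrieved_chunks)
    return RAG_TEMPLATE.format(question=question, context=context)


prompt_text = build_prompt(
    "What is Milvus?",
    ["Milvus is an Open-Source Vector Database.", "AI Powered Search made easy"],
)
print(prompt_text)
```

This is why the model can only answer from what the retriever returns: if no retrieved chunk mentions a topic, the context block will not contain it.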


In [5]:
from langchain_ollama import OllamaLLM

def run_query() -> None:
    llm = OllamaLLM(
        model="llama3.2",
        callbacks=[StreamingStdOutCallbackHandler()],  # stream tokens to stdout as they are generated
        stop=["<|eot_id|>"],
    )

    query = input("\nQuery: ")
    prompt = hub.pull("rlm/rag-prompt")

    qa_chain = RetrievalQA.from_chain_type(
        llm, retriever=vectorstore.as_retriever(), chain_type_kwargs={"prompt": prompt}
    )

    result = qa_chain.invoke({"query": query})
    # print(result)


In [None]:
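if __name__ == "__main__":
    try:
        while True:
            run_query()
    except (KeyboardInterrupt, EOFError):
        # Interrupt the kernel (or press Ctrl+C) to leave the loop
        print("\nExiting.")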


Query:  What indexes are available in Milvus?


I don't know what indexes are available in Milvus, as the provided context only mentions that it is an Open-Source Vector Database without specifying the types of indexes. The context also does not provide any information about indexing capabilities or options.


Query:  What is Milvus?


Milvus is an Open-Source Vector Database used to store, index, manage, and use the massive number of embedding vectors generated by deep neural networks and LLMs. It provides an AI-powered search solution made easy. Milvus is developed in a collaborative environment with over 267 contributors.


Query:  What is a vector database?


A vector database is a type of data storage system designed for efficient storage and retrieval of dense vectors, often used in applications such as recommendation systems and computer vision. It stores and indexes large amounts of numerical data, allowing for fast querying and similarity searches. Vector databases are particularly useful for tasks that require computing distances or similarities between high-dimensional vectors.