# Ghost RAG

## Goal of this Notebook

In this notebook we use LangChain to build a simple RAG pipeline on top of Ollama: we ask the llama3.2 model about ghost photos, using context retrieved from the `ghosts` collection in Milvus.

### Simple Retrieval-Augmented Generation (RAG) with LangChain:

Build a simple Python [RAG](https://milvus.io/docs/integrate_with_langchain.md) application (streetcamrag.py) that uses Milvus to answer questions about the stored ghost images via Ollama. While printing to the screen, we also send the results to Slack formatted as Markdown.

### 🔍 Summary
By the end of this notebook, you'll understand how to use Milvus, ingest semi-structured and unstructured data, and use open source models to build a robust and efficient data retrieval system.


In [16]:
!pip3 install -Uq langchain-huggingface pdf2image

In [38]:
!pip3 install -Uq layoutparser torchvision slack-sdk

In [1]:
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
warnings.filterwarnings('ignore')
warnings.filterwarnings("ignore", category=DeprecationWarning)
import os
from pymilvus import MilvusClient
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_milvus import Milvus
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain import hub
import requests
import base64
from dotenv import load_dotenv
load_dotenv(override=True)

# Constants
MILVUS_URL = "http://192.168.1.166:19530" 
DIMENSION = 512
TEXTDIMENSION = 768
COLLECTION = "ghosts"
EMBEDDING_MODEL = "clip-ViT-B-32"

os.environ["TOKENIZERS_PARALLELISM"]  = "true"

model_kwargs = {"device": "cpu", "trust_remote_code": True}

embeddings = HuggingFaceEmbeddings(model_name=EMBEDDING_MODEL,  model_kwargs=model_kwargs)
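
The Milvus address above is hard-coded even though `load_dotenv` is already called. A minimal sketch of reading it from the environment instead, assuming a hypothetical `MILVUS_URL` entry in `.env` (the variable name and the localhost fallback are illustrative and not part of the original notebook):

```python
import os

def get_milvus_url(default: str = "http://localhost:19530") -> str:
    """Return the Milvus URI from the environment, or a local default.

    MILVUS_URL is a hypothetical variable name chosen for this sketch.
    """
    url = os.environ.get("MILVUS_URL", "").strip()
    # Treat an unset or empty variable the same way: fall back to the default.
    return url or default

MILVUS_URL = get_milvus_url()
```

This keeps machine-specific addresses out of the notebook, so the same cells can run against a different Milvus instance without edits.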

In [2]:
from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError

# Turn off Slack SDK warnings
os.environ["SKIP_SLACK_SDK_WARNING"] = "true"

slack_token = os.environ.get("SLACK_BOT_TOKEN")
slackclient = WebClient(token=slack_token)

In [19]:
# Create the Milvus vector store
vector_store = Milvus(
    embedding_function=embeddings,
    collection_name=COLLECTION,
    primary_field = "id",
    vector_field="vector",
    text_field="description",
    connection_args={"uri": MILVUS_URL},
)

results = vector_store.similarity_search("Describe any ghosts in the photo", k=1)

print(len(results))
print(results[0].page_content) 
print(results[0].metadata["category"])
print(results[0].metadata["ghostclass"])
print(results[0].metadata["s3path"])

1
 In the image, a ghostly figure is standing in the center of a room. The figure is dressed in a long blue sheet, with eye holes cut out, giving it an ethereal appearance. It stands with its arms outstretched, as if reaching for something or someone beyond our view. The background of the image reveals a dimly lit room with a hint of another person in the far distance. The overall atmosphere is one of solitude and mystery. 
Unstable
Class I
http://192.168.1.166:9000/images/victorian600.jpg
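
For intuition, `similarity_search` embeds the query text and returns the stored entries whose vectors are closest to it. The sketch below is a toy illustration in plain Python, not how Milvus is implemented: it assumes cosine similarity as the metric (the metric configured for the `ghosts` collection is not shown in this notebook) and uses made-up 3-dimensional vectors in place of the 512-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, stored, k=1):
    """Rank (text, vector) pairs by similarity to query_vec; return the k best as (score, text)."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy vectors standing in for real embeddings (values are invented for illustration)
stored = [
    ("ghost in a blue sheet", [0.9, 0.1, 0.0]),
    ("empty hallway", [0.1, 0.9, 0.2]),
    ("street camera at night", [0.0, 0.2, 0.9]),
]
best = top_k([1.0, 0.0, 0.1], stored, k=1)
print(best[0][1])
```

A real vector database replaces the exhaustive loop with an index so that the search stays fast on large collections.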


In [20]:
from langchain_ollama import OllamaLLM

def run_query() -> None:
    # Stream tokens to stdout as they are generated
    # (`callbacks` replaces the deprecated `callback_manager` argument).
    llm = OllamaLLM(
        model="llama3.2",
        callbacks=[StreamingStdOutCallbackHandler()],
        stop=["<|eot_id|>"],
    )

    query = input("\nQuery: ")
    prompt = hub.pull("rlm/rag-prompt")

    qa_chain = RetrievalQA.from_chain_type(
        llm, retriever=vector_store.as_retriever(), chain_type_kwargs={"prompt": prompt}
    )

    result = qa_chain.invoke({"query": query})
    # The streaming handler has already printed the answer, so only end the line here;
    # printing result["result"] again would show the answer twice.
    print()

    resultforslack = str(result["result"])

    try:
        slackclient.chat_postMessage(
            mrkdwn=True,
            channel="C06NE1FU6SE",
            # Plain-text fallback used for notifications and clients that cannot render blocks
            text=f"{query}\n\n{resultforslack}",
            blocks=[{"type": "section", "text": {"type": "mrkdwn", "text": f"*{query}*\n\n{resultforslack}\n"}}],
        )
    except SlackApiError as e:
        # You will get a SlackApiError if "ok" is False
        print(f"Slack failed: {e.response['error']}")
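
The Slack call above builds its mrkdwn section block inline. A minimal sketch of pulling that into a helper is shown below; `build_slack_payload` and `SECTION_TEXT_LIMIT` are hypothetical names introduced for illustration. It also truncates long answers, since Slack limits the text of a section block to 3000 characters.

```python
SECTION_TEXT_LIMIT = 3000  # maximum length Slack allows for a section block's text

def build_slack_payload(query: str, answer: str) -> dict:
    """Build the `text` fallback and mrkdwn `blocks` for one question/answer pair."""
    body = f"*{query}*\n\n{answer}"
    if len(body) > SECTION_TEXT_LIMIT:
        # Leave room for the ellipsis so the result is exactly at the limit.
        body = body[: SECTION_TEXT_LIMIT - 3] + "..."
    return {
        "text": f"{query}\n\n{answer}",  # plain-text fallback for notifications
        "blocks": [{"type": "section", "text": {"type": "mrkdwn", "text": body}}],
    }
```

With this helper, the call could be written as `slackclient.chat_postMessage(channel="C06NE1FU6SE", **build_slack_payload(query, resultforslack))`.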

In [21]:
if __name__ == "__main__":
    while True:
        run_query()

Query:  What do you see?

I don't see anything explicitly stated in the provided context. The description focuses on the atmosphere and setting of the scene, but doesn't provide any information about what is being seen.

KeyboardInterrupt: Interrupted by user

In [22]:
# Create the Milvus vector store
vector_store = Milvus(
    embedding_function=embeddings,
    collection_name=COLLECTION,
    primary_field = "id",
    text_field="description",
    vector_field="vector",
    connection_args={"uri": MILVUS_URL},
)

results = vector_store.similarity_search("ghost", k=100)
print(len(results))
#print(results)

# Create a retriever from the vector store
retriever = vector_store.as_retriever()

# https://github.com/AlaGrine/RAG_chatabot_with_Langchain/blob/main/RAG_notebook.ipynb
# https://colab.research.google.com/drive/1X16irfbWboi7BdyYhnF6QYirRba5doAi?usp=sharing#scrollTo=pyveyZh_LWVJ
# https://github.com/langchain-ai/langchain/blob/master/cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb?ref=blog.langchain.dev
# https://github.com/langchain-ai/langchain/blob/8fea07f92e5c5b80a659b4915f7349babd36fdc6/docs/docs/integrations/retrievers/milvus_hybrid_search.ipynb#L8

# build cookbooks like langchain for milvus 101

# Use the retriever. By default it returns the top 4 matches;
# pass search_kwargs={"k": 10} to as_retriever() to get more.
query = "Describe the ghost in the photo"
retrieved_docs = retriever.invoke(query)

print(len(retrieved_docs))

if retrieved_docs:
    # Image URL stored alongside the best-matching description
    print(retrieved_docs[0].metadata["s3path"])

49
4
http://192.168.1.166:9000/images/victorian600.jpg


In [23]:
from IPython.display import HTML, display
from PIL import Image
import base64
from pdf2image import convert_from_path
import layoutparser as lp
import cv2
import numpy as np

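def plt_img_base64(img):
    """Render an image inline. `img` is any valid <img> src: an image URL (as used here) or a base64 data URI."""
    image_html = f'<img  width="200px" height="200px" src="{img}" />'
    display(HTML(image_html))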

In [24]:
plt_img_base64(retrieved_docs[0].metadata["s3path"])

In [25]:
from langchain_core.messages import HumanMessage
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_milvus.vectorstores import Milvus
from langchain.schema import Document
from langchain_ollama import OllamaLLM

llm_model = OllamaLLM(model="llava:7b",temperature=1, top_p=0.85)

def prepare_image_context(docs):
    images = []
    for doc in docs:
        images.append(doc.metadata["s3path"])
    return {"images": images}

def img_prompt(data_dict):
    messages = []

    if data_dict["context"]["images"]:
        for image in data_dict["context"]["images"]:
            image_message = {
                "type": "images",
                "images": {"images": image},
            }
            messages.append(image_message)

    text_message = {
        "type": "text",
        "text": (
            "Use information contained in the image to provide contextualized answer related to the user question. \n"
            f"User-provided question: {data_dict['question']}\n\n"
        ),
    }

    messages.append(text_message)

    return [HumanMessage(content=messages)]

def multi_modal_rag_chain(retriever):

    # RAG pipeline
    chain = (
        {
            "context": retriever | RunnableLambda(prepare_image_context),
            "question": RunnablePassthrough(),
        }
        | RunnableLambda(img_prompt)
        | llm_model
        | StrOutputParser()
    )

    return chain

# Create RAG chain
chain_multimodal_rag = multi_modal_rag_chain(retriever)

In [27]:
# Run RAG chain
chain_multimodal_rag.invoke(query)

" I'm sorry, but the image you provided does not depict a ghost or any other supernatural entities. The images show Victorian-style interiors with various decorations and furniture. If you have specific questions about the style of the rooms or the types of objects in the photos, please let me know! "
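
The reply above mentions "Victorian-style interiors", which matches the file name `victorian600.jpg`. That may indicate the model only received the image URLs as text rather than the image pixels. One possible approach, sketched here under assumptions, is to download each image, base64-encode it, and pass it as an `image_url` content part to a chat model. The helper names `image_part` and `build_content` are hypothetical, and the usage in the trailing comment assumes `ChatOllama` from `langchain_ollama` accepts base64 data URIs in that part type.

```python
import base64

def image_part(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Wrap raw image bytes as an image_url content part holding a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"type": "image_url", "image_url": f"data:{mime_type};base64,{encoded}"}

def build_content(question: str, images: list) -> list:
    """Return message content: one part per image, followed by the text question."""
    parts = [image_part(data) for data in images]
    parts.append({"type": "text", "text": f"User-provided question: {question}"})
    return parts

# Assumed usage (not run here): fetch each s3path with requests.get(url).content,
# then send HumanMessage(content=build_content(query, images)) to a chat model
# such as ChatOllama(model="llava:7b").
content = build_content("Describe the ghost in the photo", [b"abc"])
print(content[0]["image_url"])
```

The placeholder bytes `b"abc"` only demonstrate the encoding step; real image bytes would come from the URLs stored in `metadata["s3path"]`.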