<a href="https://colab.research.google.com/github/Luciesprogram/Gen-AI/blob/main/RAG_Application_using_Langchain_Mistral_and_Weviate_P3_12.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install -U langchain-huggingface langchain-weaviate weaviate-client langchain-text-splitters pypdf

Collecting langchain-huggingface
  Downloading langchain_huggingface-1.2.0-py3-none-any.whl.metadata (2.8 kB)
Collecting langchain-weaviate
  Downloading langchain_weaviate-0.0.6-py3-none-any.whl.metadata (2.6 kB)
Collecting weaviate-client
  Downloading weaviate_client-4.19.0-py3-none-any.whl.metadata (3.7 kB)
Collecting langchain-text-splitters
  Downloading langchain_text_splitters-1.1.0-py3-none-any.whl.metadata (2.7 kB)
Collecting pypdf
  Downloading pypdf-6.5.0-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain-core<2.0.0,>=1.2.0 (from langchain-huggingface)
  Downloading langchain_core-1.2.4-py3-none-any.whl.metadata (3.7 kB)
Collecting validators<1.0.0,>=0.34.0 (from weaviate-client)
  Downloading validators-0.35.0-py3-none-any.whl.metadata (3.9 kB)
Collecting authlib<2.0.0,>=1.6.5 (from weaviate-client)
  Downloading authlib-1.6.6-py2.py3-none-any.whl.metadata (9.8 kB)
Collecting pydantic<3.0.0,>=2.12.0 (from weaviate-client)
  Downloading pydantic-2.12.5-py3-none-any.whl

In [2]:
import os
import locale
from google.colab import userdata

In [3]:
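# Colab workaround: report UTF-8 as the preferred encoding so later `!` shell commands do not fail on a non-UTF-8 locale
locale.getpreferredencoding = lambda: "UTF-8"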

In [64]:
WEAVIATE_CLUSTER = userdata.get('WEAVIATE_CLUSTER')
WEAVIATE_API_KEY = userdata.get('WEAVIATE_API_KEY')
hf_token = userdata.get('HUGGINGFACE_API_TOKEN')
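
Outside Colab, `google.colab.userdata` is not importable. A minimal sketch of a fallback, assuming the same secret names are exported as environment variables; `get_secret` is a hypothetical helper, not part of Colab or LangChain:

```python
import os


def get_secret(name, default=None):
    """Return a secret from Colab's userdata if available, else from the environment."""
    try:
        # Only importable inside a Google Colab runtime.
        from google.colab import userdata
    except ImportError:
        return os.environ.get(name, default)
    try:
        return userdata.get(name)
    except Exception:
        # Assumption: any lookup failure (missing secret, access not granted)
        # should fall back to the environment instead of stopping the notebook.
        return os.environ.get(name, default)
```

With this in place, a line such as `hf_token = get_secret('HUGGINGFACE_API_TOKEN')` would behave the same in Colab and in a local Jupyter session.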

In [65]:
from langchain_huggingface import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

In [66]:
import weaviate
from langchain_weaviate import WeaviateVectorStore

In [67]:
client = weaviate.connect_to_weaviate_cloud(
    cluster_url=WEAVIATE_CLUSTER,
    auth_credentials=weaviate.auth.AuthApiKey(WEAVIATE_API_KEY)
)

In [68]:
vector_db = WeaviateVectorStore(
    client=client,
    index_name="RAG_Notebook_Collection", # Collection name
    embedding=embeddings,
    text_key="text"
)

In [69]:
!pip install langchain_community



In [70]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

In [71]:
loader = PyPDFLoader("/content/2005.11401v4.pdf", extract_images=True)
pages = loader.load()

In [72]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
docs = text_splitter.split_documents(pages)
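
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window chunker in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraph and line breaks); `sliding_chunks` is a hypothetical helper that only illustrates how consecutive chunks share characters.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=50):
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

Each chunk starts with the last three characters of the previous one, which is the role `chunk_overlap=50` plays at the larger scale used above: a sentence cut at a chunk boundary still appears intact in one of the two neighbours.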

In [73]:
for doc in docs:
    # Create a copy of keys to avoid 'dictionary changed size during iteration' errors
    bad_keys = [k for k in doc.metadata.keys() if "." in k]
    for key in bad_keys:
        # Replace each dot with an underscore and move the value to the renamed key
        new_key = key.replace(".", "_")
        doc.metadata[new_key] = doc.metadata.pop(key)
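
The loop above can also be written as a small reusable function. A sketch, assuming that replacing dots with underscores is all the property names need; `sanitize_metadata_keys` is a hypothetical helper, and the sample dictionary is made up for illustration.

```python
def sanitize_metadata_keys(metadata, replacement="_"):
    """Return a new dict whose keys have every '.' replaced by `replacement`."""
    cleaned = {}
    for key, value in metadata.items():
        new_key = key.replace(".", replacement)
        if new_key in cleaned:
            # Two different keys would collapse into one; fail loudly instead of losing data.
            raise ValueError(f"key collision after sanitizing: {new_key!r}")
        cleaned[new_key] = value
    return cleaned


# Illustrative metadata only; real keys depend on the PDF being loaded.
example = {"source": "paper.pdf", "page": 0, "ptex.fullbanner": "pdfTeX"}
print(sanitize_metadata_keys(example))
```

Unlike the in-place loop, this version leaves the input dictionary untouched and raises if two keys would end up with the same name.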

In [74]:
vector_db.add_documents(docs)

['bc23a272-4cdf-4e5e-b37f-4c8c5e100ef4',
 'f99681c5-f4a1-43f3-b033-9283186d6f20',
 'a5c7bb8b-4c2f-42cb-8dc5-f17238e71358',
 'f02df0ec-df7a-4f41-b406-061b8ed95dee',
 '3126ac1f-9fdd-417d-b74b-650542fab5fc',
 'f9d88641-c1e4-477a-a4c0-47d9d74c84ec',
 '34d41292-8784-4609-823a-f91011056de5',
 'e5baaa8a-2e82-499f-be77-d9873fe22f2f',
 '41c076d5-f63e-4fc1-8638-8a1c9270f129',
 '0e7f4e12-d609-4799-b5f7-a1e187a57e3f',
 'd8e0579e-249e-4be8-a7d3-b1831cd578bf',
 '76c93569-466e-469c-afc4-bf657f9b5e4b',
 '8b376242-a445-4d34-8835-ccd11a6cd412',
 'e6909623-b778-4721-af16-3964ac372722',
 'ac618a90-8485-4701-942e-fef070097e4c',
 '844956d2-3503-4373-b434-3cc503245088',
 'cb3372c1-f9f4-49d6-9738-c557f27f30f4',
 '183ce8da-a075-4a63-9f4c-f26979045635',
 '189f913c-0f40-4652-88bb-5ed096c0e7ae',
 '6e1e5853-23f3-4501-b46b-c4d5f29b6825',
 '98ec35d4-9df2-470d-b521-72a688d7a0ed',
 'f7de1a6e-4e6f-4d36-aa5a-dd19345484b4',
 '364e396f-d5b2-4afb-b7a8-1e9e7de14ff9',
 '313d5296-3eba-4620-bd26-58c97fa2f923',
 '5c405317-2987-

In [75]:
from langchain_huggingface import HuggingFaceEndpoint
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

In [76]:
# HuggingFaceEndpoint is already imported in the previous cell
from langchain_huggingface import ChatHuggingFace

In [77]:
llm = HuggingFaceEndpoint(
    repo_id="mistralai/Mistral-7B-Instruct-v0.2",
    huggingfacehub_api_token=hf_token,
    temperature=0.2,
    max_new_tokens=256,
)

In [78]:
chat_model = ChatHuggingFace(llm=llm)

In [79]:
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an assistant for question-answering tasks. Use the context to answer."),
    ("human", "Context: {context}\n\nQuestion: {question}")
    ])

In [80]:
rag_chain = (
    {"context": vector_db.as_retriever(), "question": RunnablePassthrough()}
    | prompt
    | chat_model # Use the chat_model wrapper here!
    | StrOutputParser()
)
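
In the chain above, the retriever's output (a list of `Document` objects) is placed into `{context}` as is, so the prompt receives the string form of that list, metadata included. A common variation is to join only the page text first. A minimal sketch, assuming each retrieved item exposes a `page_content` attribute as LangChain documents do; `format_docs` is a hypothetical helper, and the stand-in objects exist only so the example runs without a vector store.

```python
from types import SimpleNamespace


def format_docs(docs, separator="\n\n"):
    """Join the text of retrieved documents into one context string."""
    return separator.join(doc.page_content for doc in docs)


# Stand-ins for retrieved documents (illustrative text, not real retrieval results).
fake_docs = [
    SimpleNamespace(page_content="first chunk"),
    SimpleNamespace(page_content="second chunk"),
]
print(format_docs(fake_docs))
```

With such a helper, the context entry could be written as `vector_db.as_retriever() | format_docs`; LangChain coerces a plain function into a runnable when it is piped after one.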

In [81]:
response = rag_chain.invoke("What is Retrieval-Augmented Generation (RAG)?")
print(response)

 Retrieval-Augmented Generation (RAG) is a model for knowledge-intensive natural language processing tasks that uses a retrieval component to gather relevant information from a large corpus before generating an answer or text. The model combines parametric and non-parametric memory to obtain state-of-the-art results on open-domain question answering. RAG models have been shown to obtain more factual and specific generations than purely parametric models, and the learned retrieval component has been validated to be effective. The retrieval index can be hot-swapped to update the model without requiring any retraining. RAG models use retrieved documents to generate more diverse and informative responses than baseline models.


In [82]:
print(rag_chain.invoke("How does the RAG model differ from traditional language generation models?"))

 The RAG model differs from traditional language generation models in several ways, as described in the context:

1. Access to Gold Passages: Traditional language generation models often have access to gold passages with specific information required to generate the reference answer. In contrast, the RAG model does not have access to these gold passages and must generate answers based on the given question alone.
2. Unanswerable Questions: Many questions in the datasets used to evaluate the models are unanswerable without the gold passages. The RAG model is able to handle these questions and generate appropriate responses, while traditional models may struggle or fail.
3. Answers from Wikipedia: Not all questions are answerable from Wikipedia alone. Traditional models may rely heavily on the gold passages to generate accurate answers, while the RAG model is able to generate answers based on the given question and its knowledge base, even if the question cannot be answered directly from

In [83]:
from IPython.display import Markdown, display
import textwrap

In [84]:
def gen(text):
    """Run the RAG chain on `text` and render the answer as a Markdown blockquote."""
    answer = rag_chain.invoke(text)
    return Markdown(textwrap.indent(answer, '> ', predicate=lambda _: True))
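
The `predicate=lambda _: True` argument matters because `textwrap.indent` skips whitespace-only lines by default, and an unprefixed blank line ends a Markdown blockquote. A small standalone comparison, using a made-up two-paragraph string in place of a model answer:

```python
import textwrap

# Made-up text standing in for a multi-paragraph model answer.
answer = "First paragraph.\n\nSecond paragraph."

# Default behaviour: the blank line gets no prefix, so the quote is split in two.
default_quote = textwrap.indent(answer, "> ")

# With the predicate, every line is prefixed and the quote stays in one block.
full_quote = textwrap.indent(answer, "> ", predicate=lambda _: True)

print(repr(default_quote))
print(repr(full_quote))
```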

In [85]:
gen("What is the RAG-Sequence model, what is its formula, and which mathematical concept is it based on?")

>  The RAG Sequence model is a generative model for question answering that uses a retriever to identify the top K relevant documents and a generator to produce the answer sequence based on the retrieved documents. The formula for the RAG Sequence model is given by:
> 
> $$p(y \mid x) \approx \sum_{z \in \text{top-}k(p(z \mid x))} p_\eta(z \mid x)\, p_\theta(y \mid x, z, y_{1:i-1})$$
> 
> Here, $x$ is the input question, $y$ is the output sequence of tokens, $z$ is a latent document, $p(z \mid x)$ is the probability of document $z$ given the input question $x$, $p_\eta(z \mid x)$ is the probability of the document $z$ being in the top K retrieved documents, and $p_\theta(y_i \mid x, z, y_{1:i-1})$ is the probability of the $i$th token in the output sequence $y$ given the input question $x$, the latent document $z$, and the previous tokens in the output sequence $y_{1:i-1}$.
> 
> The RAG Sequence model is based on the concept of sequence generation using a probabilistic model. It uses a gener
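
For comparison with the generated answer above, the RAG-Sequence model in the paper loaded earlier (arXiv:2005.11401) marginalizes over the top-k retrieved documents, with the generator probability factored token by token. It is stated here as a sketch to be checked against the PDF, with $N$ denoting the length of the output sequence:

```latex
p_{\text{RAG-Sequence}}(y \mid x)
  \approx \sum_{z \in \text{top-}k(p(\cdot \mid x))} p_\eta(z \mid x)\, p_\theta(y \mid x, z)
  = \sum_{z \in \text{top-}k(p(\cdot \mid x))} p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta(y_i \mid x, z, y_{1:i-1})
```

In this form $p_\eta(z \mid x)$ is the retriever's distribution over documents and $p_\theta$ is the generator, so the index $i$ in the generated answer belongs to a product over output tokens rather than to a single term.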