# RAG

## 1_rag_basics

In [None]:
%pip install --quiet --upgrade python-dotenv langchain langchain-community langchain-text-splitters langchain-ollama langchain-chroma chromadb beautifulsoup4

In [None]:
import os

from dotenv import load_dotenv
from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import CharacterTextSplitter
from langchain_ollama import OllamaEmbeddings

In [None]:
# Load environment variables from .env
load_dotenv()

In [None]:
def print_relevante_docs(docs):
    # Display the relevant results with metadata
    print("\n--- Relevant Documents ---")
    for i, doc in enumerate(docs, 1):
        print(f"Document {i}:\n{doc.page_content}\n")
        if doc.metadata:
            print(f"Source: {doc.metadata.get('source', 'Unknown')}\n")

In [None]:
# Resolve the project root (the parent of the working directory), then the
# source text file and the directory where Chroma persists its data
project_root = os.path.dirname(os.getcwd())
file_path = os.path.join(project_root, "notebooks", "documents", "langchain_demo.txt")
persistent_directory = os.path.join(project_root, "notebooks", "chroma_db")

In [None]:
# Read the text content from the file
loader = TextLoader(file_path)
documents = loader.load()

In [None]:
len(documents)

In [201]:
# Split the document into chunks
text_splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=1000,
    chunk_overlap=100,
    length_function=len,
    is_separator_regex=False,
)
docs_split = text_splitter.split_documents(documents)

In [202]:
# Display information about the split documents
print("\n--- Document Chunks Information ---")
print(f"Number of document chunks: {len(docs_split)}")
print(f"Sample chunk:\n{docs_split[-1].page_content}\n")


--- Document Chunks Information ---
Number of document chunks: 4
Sample chunk:
LangChain Documentation: https://python.langchain.com/
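
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal sketch of separator-based chunking with overlap. It is an illustration only, not LangChain's implementation: the helper `split_with_overlap` is hypothetical, and it carries over whole trailing pieces as overlap rather than reproducing the library's exact merge rules.

```python
def split_with_overlap(text, separator="\n", chunk_size=1000, chunk_overlap=100):
    """Greedily pack separator-delimited pieces into chunks of at most
    chunk_size characters, repeating trailing pieces as overlap.

    Hypothetical helper for illustration; not LangChain's algorithm.
    A single piece longer than chunk_size is emitted as an oversized chunk.
    """
    pieces = [p for p in text.split(separator) if p]
    chunks, current = [], []

    def joined_len(parts):
        return len(separator.join(parts))

    for piece in pieces:
        if current and joined_len(current + [piece]) > chunk_size:
            # The current chunk is full: emit it.
            chunks.append(separator.join(current))
            # Drop leading pieces until what remains fits in chunk_overlap
            # and still leaves room for the incoming piece.
            while current and (
                joined_len(current) > chunk_overlap
                or joined_len(current + [piece]) > chunk_size
            ):
                current.pop(0)
        current.append(piece)

    if current:
        chunks.append(separator.join(current))
    return chunks


toy_text = "aaaa\nbbbb\ncccc\ndddd"
toy_chunks = split_with_overlap(toy_text, chunk_size=9, chunk_overlap=4)
for chunk in toy_chunks:
    print(repr(chunk))
```

With these toy values each chunk repeats the last line of the previous one, which is the effect `chunk_overlap` is meant to have: text near a chunk boundary stays retrievable from either side.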



In [203]:
# Create the embedding model client (no text is embedded yet)
print("\n--- Creating embeddings ---")
embeddings = OllamaEmbeddings(model="nomic-embed-text")
print("\n--- Finished creating embeddings ---")


--- Creating embeddings ---

--- Finished creating embeddings ---


In [204]:
# Embed a sample text as a smoke test; the result holds one vector per input text
response = embeddings.embed_documents(["Sample test for embedding"])

In [None]:
len(response)

In [None]:
response[0]

In [205]:
# Create the vector store and persist it automatically
print("\n--- Creating vector store ---")
vectorstore = Chroma.from_documents(documents=docs_split, embedding=embeddings, persist_directory=persistent_directory)
print("\n--- Finished creating vector store ---")


--- Creating vector store ---

--- Finished creating vector store ---


In [206]:
# Retrieve relevant documents based on the query
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 2},
)

In [207]:
# Define the user's question
query = "What is LangChain?"

In [208]:
relevant_docs = retriever.invoke(query)

In [209]:
# Display the relevant results with metadata
print_relevante_docs(relevant_docs)


--- Relevant Documents ---
Document 1:
LangChain is a powerful and flexible framework designed to simplify the development of applications that harness the capabilities of large language models (LLMs). It provides a wide range of tools, abstractions, and integrations that help developers build, customize, and optimize applications that leverage LLMs for tasks like text generation, question answering, summarization, chatbots, and more.

Source: /home/mrego/Projects/workspace/langchain-notebook/notebooks/documents/langchain_demo.txt

Document 2:
LangChain Documentation: https://python.langchain.com/

Source: /home/mrego/Projects/workspace/langchain-notebook/notebooks/documents/langchain_demo.txt
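
The `similarity` search type returns the `k` chunks whose embeddings are closest to the query embedding. The sketch below shows that idea with plain Python lists and cosine similarity. The vectors are made-up toy values and `top_k_by_cosine` is a hypothetical helper; the distance metric Chroma actually uses depends on how the collection is configured, so treat cosine similarity here as an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_by_cosine(query_vec, doc_vecs, k=2):
    """Return (index, score) pairs for the k vectors most similar to query_vec.

    Hypothetical helper for illustration; a vector store would use an index
    instead of scoring every vector.
    """
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional "embeddings"; real embedding vectors are much longer.
toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [1.0, 0.0]
results = top_k_by_cosine(toy_query, toy_docs, k=2)
print(results)
```

Here the first toy vector points in the same direction as the query and ranks first, followed by the diagonal vector; the orthogonal one is left out because `k` is 2.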



## 2_web_scrape_basic

In [210]:
from langchain_community.document_loaders import WebBaseLoader

# Load the contents of the blog post (chunking and indexing happen in later cells).
loader = WebBaseLoader(
    web_paths=("https://medium.com/@drjulija/what-are-naive-rag-advanced-rag-modular-rag-paradigms-edff410c202e",)
)
documents = loader.load()

In [211]:
len(documents)

1

In [None]:
documents[0]

In [215]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len,
    is_separator_regex=False,
)
docs_split = text_splitter.split_documents(documents=documents)

In [216]:
# Display information about the split documents
print("\n--- Document Chunks Information ---")
print(f"Number of document chunks: {len(docs_split)}")
print(f"Sample chunk:\n{docs_split[-1].page_content}\n")


--- Document Chunks Information ---
Number of document chunks: 12
Sample chunk:
are stored in the systems memory.Fusion — involves parallel vector searches of both original and expanded queries, intelligent reranking to optimize results, and pairing the best outcomes with new queries.Routing — query routing decides the subsequent action to a user’s query for example summarization, searching specific databases, etc.🔗 Read about how I built a Naive RAG pipeline HERE.RagArtificial IntelligenceLlmNLPRag Optimization----2FollowWritten by Dr Julija134 Followers·6 FollowingFounder of MiniMe ai [myminime.ai] and Networky [networky.co] | AI Engineer | Entrepreneur | PhD | Machine Learning | NLP | Artificial IntelligenceFollowResponses (2)See all responsesHelpStatusAboutCareersPressBlogPrivacyTermsText to speechTeams



In [217]:
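# Create the vector store and persist it automatically.
# Note: this reuses persistent_directory and the default collection, so the web
# chunks are stored alongside the chunks indexed in 1_rag_basics.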
print("\n--- Creating vector store ---")
vectorstore = Chroma.from_documents(documents=docs_split, embedding=embeddings, persist_directory=persistent_directory)
print("\n--- Finished creating vector store ---")


--- Creating vector store ---

--- Finished creating vector store ---


In [277]:
# Retrieve relevant documents based on the query
retriever = vectorstore.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"k": 3, "score_threshold": 0.3},
)
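
With `similarity_score_threshold`, the retriever considers at most `k` results and drops any whose relevance score falls below `score_threshold`, so a query can return fewer than `k` documents. A minimal sketch of that filtering step follows. The scores are made up, `filter_by_threshold` is a hypothetical helper, and it assumes scores are already normalized to the range 0 to 1 with higher meaning more relevant.

```python
def filter_by_threshold(scored_docs, k=3, score_threshold=0.3):
    """Keep the k best-scoring (doc, score) pairs, then drop those below the threshold.

    Hypothetical helper for illustration; assumes higher scores are better.
    """
    ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)[:k]
    return [(doc, score) for doc, score in ranked if score >= score_threshold]


# Made-up relevance scores for four chunks.
toy_scores = [("chunk A", 0.42), ("chunk B", 0.18), ("chunk C", 0.27), ("chunk D", 0.05)]
kept = filter_by_threshold(toy_scores, k=3, score_threshold=0.3)
print(kept)
```

Only one of the three best-scoring toy chunks clears the threshold, so only one is returned even though `k` is 3.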

In [278]:
relevant_docs = retriever.invoke("Naive RAG")

In [279]:
# Display the relevant results with metadata
print_relevante_docs(relevant_docs)


--- Relevant Documents ---
Document 1:
LLM RAG Paradigms: Naive RAG, Advanced RAG & Modular RAG | by Dr Julija | MediumOpen in appSign upSign inWriteSign upSign inLLM RAG Paradigms: Naive RAG, Advanced RAG & Modular RAGDr Julija·Follow6 min read·Mar 10, 2024--2ListenShareThree RAG Paradigms | 📔 DrJulija’s Notebook📝 OverviewHere I describe my key learnings on how RAG systems evolved over the last few years. I share the differences between Naive RAG, Advanced RAG and Modular RAG frameworks. I summarize key insights from a great RAG technology survey paper Gao et al. 2024.🛠 What is a RAG Framework?Large Language Models (LLMs) such as the GPT series from OpenAI, LLama series by Meta, and Gemini by Google have achieved significant achievements in the generative AI field.But these models are non deterministic. Often, LLMs may produce content that is either inaccurate or irrelevant (known as hallucinations), rely on outdated information, and their decision-making processes are not transparen

In [280]:
relevant_docs = retriever.invoke("Tell me about Naive RAG, Advanced RAG & Modular RAG in LLM RAG Paradigms")

In [281]:
# Display the relevant results with metadata
print_relevante_docs(relevant_docs)


--- Relevant Documents ---
Document 1:
LLM RAG Paradigms: Naive RAG, Advanced RAG & Modular RAG | by Dr Julija | MediumOpen in appSign upSign inWriteSign upSign inLLM RAG Paradigms: Naive RAG, Advanced RAG & Modular RAGDr Julija·Follow6 min read·Mar 10, 2024--2ListenShareThree RAG Paradigms | 📔 DrJulija’s Notebook📝 OverviewHere I describe my key learnings on how RAG systems evolved over the last few years. I share the differences between Naive RAG, Advanced RAG and Modular RAG frameworks. I summarize key insights from a great RAG technology survey paper Gao et al. 2024.🛠 What is a RAG Framework?Large Language Models (LLMs) such as the GPT series from OpenAI, LLama series by Meta, and Gemini by Google have achieved significant achievements in the generative AI field.But these models are non deterministic. Often, LLMs may produce content that is either inaccurate or irrelevant (known as hallucinations), rely on outdated information, and their decision-making processes are not transparen

## 3_rag_conversational

In [255]:
from langchain_ollama.llms import OllamaLLM
model = OllamaLLM(base_url="http://localhost:11434", model="llama3.2", temperature=0)

In [254]:
from langchain_core.messages import AIMessage
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", """
You are an assistant for question-answering tasks.
Your name is {name}.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.
"""),
    # The retrieved context is injected here, once, as a list of messages
    ("placeholder", "{context}"),
    ("human", "Question: {question}")
])

In [None]:
prompt

In [256]:
prompt_with_partial = prompt.partial(name="R2D2")

In [None]:
prompt_with_partial

In [257]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

chain = (
    prompt_with_partial
    | model
    | StrOutputParser()
)

In [269]:
answer = chain.invoke({
    "question":"Tell me about Naive RAG, Advanced RAG & Modular RAG in LLM RAG Paradigms",
    "context": []
})

In [270]:
print(answer)

Beep boop, I don't have specific information on Naive RAG, Advanced RAG, and Modular RAG within the provided context. However, I can tell you that these are related to Large Language Model (LLM) paradigms, specifically in the context of RAG (Recurrent Autoencoder for Graphs). Beep boop, more information would be needed to provide a detailed answer.


In [271]:
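def format_docs(documents):
    """Wrap each retrieved document's text in a message so the list can fill the prompt's {context} placeholder."""
    return [AIMessage(content=doc.page_content) for doc in documents]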

In [272]:
format_docs(relevant_docs)

[AIMessage(content='LLM RAG Paradigms: Naive RAG, Advanced RAG & Modular RAG | by Dr Julija | MediumOpen in appSign upSign inWriteSign upSign inLLM RAG Paradigms: Naive RAG, Advanced RAG & Modular RAGDr Julija·Follow6 min read·Mar 10, 2024--2ListenShareThree RAG Paradigms | 📔 DrJulija’s Notebook📝 OverviewHere I describe my key learnings on how RAG systems evolved over the last few years. I share the differences between Naive RAG, Advanced RAG and Modular RAG frameworks. I summarize key insights from a great RAG technology survey paper Gao et al. 2024.🛠 What is a RAG Framework?Large Language Models (LLMs) such as the GPT series from OpenAI, LLama series by Meta, and Gemini by Google have achieved significant achievements in the generative AI field.But these models are non deterministic. Often, LLMs may produce content that is either inaccurate or irrelevant (known as hallucinations), rely on outdated information, and their decision-making processes are not transparent, leading to black-

In [273]:
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt_with_partial
    | model
    | StrOutputParser()
)
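
The dictionary at the head of `rag_chain` fans the input out: the question goes through the retriever and `format_docs` to fill `context`, while `RunnablePassthrough()` forwards it unchanged as `question`. The plain-Python sketch below mimics that data flow without LangChain. `fake_retriever`, `join_chunks`, `fan_out`, `render_prompt`, and the sample strings are all hypothetical stand-ins, and the context is joined into a single string here purely to keep the example short.

```python
def fake_retriever(question):
    """Stand-in for the real retriever: returns canned text chunks."""
    return [f"chunk 1 for: {question}", f"chunk 2 for: {question}"]


def join_chunks(chunks):
    """Stand-in for format_docs: merge the chunks into one context string."""
    return "\n\n".join(chunks)


def fan_out(question):
    """Mimic the dict step: one input feeds both the context and question keys."""
    return {
        "context": join_chunks(fake_retriever(question)),  # retriever | format_docs
        "question": question,                               # RunnablePassthrough()
    }


def render_prompt(inputs):
    """Stand-in for the prompt template: fill in the two variables."""
    return f"Context:\n{inputs['context']}\n\nQuestion: {inputs['question']}"


prompt_text = render_prompt(fan_out("What is Naive RAG?"))
print(prompt_text)
```

The point of the sketch is the shape of the data: a single string goes in, a two-key dictionary comes out of the first step, and the prompt consumes both keys.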

In [282]:
answer = rag_chain.invoke("Summarize anything you know about Naive RAG, Advanced RAG & Modular RAG in LLM RAG Paradigms")

In [275]:
print(answer)

Here's a summary of the three paradigms mentioned:

**Naive RAG**: The Naive RAG paradigm is one of the three categories of RAG (Retrieval-Augmented Generation) systems. It involves importing all relevant documents or information, splitting them into smaller pieces, converting the data into vector form using an embedding model, and storing these vector embeddings in a database for easy retrieval.

**Advanced RAG**: Unfortunately, there is no detailed explanation provided about Advanced RAG in the text snippet. However, based on general knowledge, it's likely that Advanced RAG builds upon the Naive RAG paradigm by incorporating additional features or techniques to improve its performance and efficiency.

**Modular RAG**: Similarly, Modular RAG is not explained in detail in the text snippet. However, based on general knowledge, it's possible that Modular RAG involves a more modular approach to building RAG systems, where different components are designed to work together seamlessly to ac