# RAG Introduction

<h4> <font size="5px" color="purple"><b>|</b></font> Retrieval-augmented generation (RAG)</h4>

RAG is <font color="lightGreen">a technique for augmenting LLM knowledge with additional data</font> (often private or real-time).

## RAG Concepts

To enable AI applications to handle <font color="lightGreen">private or post-cutoff data</font>, the model's knowledge must be augmented with that data. This process is called Retrieval Augmented Generation (RAG).

- Language models (LMs) are <font color="lightGreen">proficient at reasoning about diverse subjects</font>.

- <font color="orange">However</font>, their knowledge is constrained to <font color="orange">public data up to the training cutoff date</font>.



<h3>Architecture</h3>

<h4>A typical RAG application has two main components:</h4>


```mermaid
graph LR
   Indexing
```

Indexing: a pipeline for ingesting data from a source and indexing it. This usually happens offline.
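The indexing stage can be pictured with a minimal, standard-library sketch. It assumes a toy "embedding" made of word counts and a plain Python list as the store; the helper names `embed` and `build_index` are hypothetical and only illustrate the split, embed, store sequence. A real pipeline would call an embedding model and write to a vector store.

```python
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a model call."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(raw_text):
    """Split the source on blank lines, embed each split, and store both together."""
    splits = [part.strip() for part in raw_text.split("\n\n") if part.strip()]
    return [{"text": split, "vector": embed(split)} for split in splits]


source = "Harrison worked at YZW.\n\nBears like to eat honey."
index = build_index(source)

print(len(index))                     # number of stored splits
print(index[1]["vector"]["honey"])    # count of "honey" in the second split
```

Because the index is built once and then reused for every question, this step can run ahead of time.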





```mermaid
graph LR
   Retrieval
```

Retrieval and generation:

- Retrieve: Given a user input, relevant splits are retrieved from storage using a Retriever.
- Generate: A ChatModel / LLM produces an answer using a prompt that includes the question and the retrieved data.
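The two steps above can be sketched with the standard library only. This is an illustration under stated assumptions, not the LangChain implementation: word-count vectors stand in for real embeddings, `retrieve` and `build_prompt` are hypothetical helpers, and the model call is left out, so the sketch stops at the prompt that would be sent to the LLM.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, splits, k=1):
    """Retrieve: return the k splits most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(splits, key=lambda s: cosine(query_vector, embed(s)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_splits):
    """Generate (first half): put the question and retrieved data into one prompt."""
    context = "\n\n".join(context_splits)
    return f"Question: {question}\nContext: {context}\nAnswer:"


splits = ["Harrison worked at YZW", "bears like to eat honey"]
question = "Where did Harrison work?"

top = retrieve(question, splits, k=1)
prompt_text = build_prompt(question, top)
print(prompt_text)
```

Only the split that shares a word with the question ("Harrison") scores above zero, so it is the one placed in the prompt.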

```mermaid
graph LR
   Indexing
```


<img src="./img/indexing.png" width=700/>

```mermaid
graph LR
   Retrieval
```
<img src="./img/retrieval.png" width=700/>


Source: https://python.langchain.com/docs/use_cases/question_answering/

## ⚙️ Setup

In [1]:
import bs4
from PyPDF2 import PdfReader

from langchain import hub
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.prompts import ChatPromptTemplate

from langchain.embeddings import OpenAIEmbeddings
from langchain.schema import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

## About Chroma



<img src="https://www.mlq.ai/content/images/2023/08/1_admwyPyR6v_IZI0EYE--eA-1.webp" width="50px"/>

In this example we will use Chroma as the vector store.

Chroma is the open-source embedding database. Chroma makes it easy to build LLM apps by making knowledge, facts, and skills pluggable for LLMs.

Links:
- https://docs.trychroma.com/getting-started
- https://docs.trychroma.com/

<img src="https://docs.trychroma.com/img/hrm4.svg" width="500px"/>

## Example 1: From Text

In [2]:
texts =  [
    "Harrison worked at YZW",
    "bears like to eat honey",
    "Each day is a new opportunity for growth and success.",
    "I believe in my abilities and trust the journey I am on.",
    "I radiate positivity and attract abundance into my life.",
    "I am deserving of love, happiness, and all the good things life has to offer.",
]

vectorstore = Chroma.from_texts(texts, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

prompt = hub.pull("rlm/rag-prompt")
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain_1 = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

In [3]:
rag_chain_1.invoke("Where did Harrison work?")

'Harrison worked at YZW.'
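To make the data flow of `rag_chain_1` easier to follow, here is a plain-Python sketch of the same sequence of steps. It assumes stand-ins for the real components: `fake_retriever`, `fill_prompt`, and `fake_llm` are hypothetical functions, not LangChain APIs. The point it illustrates is that the leading dict sends the same input to both keys: the question goes through the retriever and `format_docs` to become `context`, and passes through unchanged as `question`.

```python
def format_docs(docs):
    """Join the retrieved splits into one context string."""
    return "\n\n".join(docs)


def fake_retriever(question):
    """Hypothetical stand-in for the retriever: always returns the same split."""
    return ["Harrison worked at YZW"]


def fill_prompt(inputs):
    """Stand-in for the prompt step: place both inputs into one string."""
    return f"Question: {inputs['question']}\nContext: {inputs['context']}\nAnswer:"


def fake_llm(prompt_text):
    """Hypothetical stand-in for the chat model: echoes the context line."""
    context_line = [
        line for line in prompt_text.splitlines() if line.startswith("Context: ")
    ][0]
    return context_line[len("Context: "):] + "."


def rag_chain(question):
    # Step 1: the same input feeds both keys of the dict.
    inputs = {
        "context": format_docs(fake_retriever(question)),
        "question": question,  # passed through unchanged
    }
    # Steps 2 and 3: build the prompt, then call the model.
    return fake_llm(fill_prompt(inputs))


answer = rag_chain("Where did Harrison work?")
print(answer)
```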

## Example 2: From Files (PDF)

In [4]:

pdf_data = PdfReader("./context/biographies.pdf")

# Concatenate the text of every page into one string.
pdf_text = ""

for page in pdf_data.pages:
    text = page.extract_text()
    # Skip pages where no text could be extracted.
    if text:
        pdf_text += text

print(len(pdf_text))

text_splitter = CharacterTextSplitter(
    separator = "\n",
    chunk_size = 1000,
    chunk_overlap = 100
)

final_data = text_splitter.split_text(pdf_text)

print(f"""
    # of Chunks: {len(final_data)}
    Chunk 0: {final_data[0]}
    Chunk 1: {final_data[1]}
""")

16005

    # of Chunks: 19
    Chunk 0: Ferdinando Langchain: Pioneering AI Engineer and Visionary  
 
Ferdinando Langchain was born on a brisk winter day in 1985 in Milan, Italy. From an early 
age, he displayed an insatiable curiosity about technology and a keen interest in 
understanding the mysteries of artificial intelligence. His journey into AI began when he 
stumbled upon an old computer in his father's study, sparking a fascination that would 
shape his future.  
 
After completing his undergraduate studies in computer science at the University of Milan, 
Ferdinando ventured to the United States to pursue a Ph.D. in Artificial Intelligence at the 
Massachusetts Institute of Technology (MIT). Under the mentorship of ren owned AI 
researchers, he quickly established himself as a rising star in the field.  
 
Ferdinando's breakthroughs in artificial general intelligence (AGI) were revolutionary. His 
doctoral thesis, titled "Synaptic Horizons: A Journey Towards AGI," laid the fou
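The effect of `chunk_size` and `chunk_overlap` can be illustrated with a simplified sketch. It assumes a fixed-width sliding window over characters, which is not how `CharacterTextSplitter` works internally (that class splits on the separator first and then merges pieces), so chunk boundaries and counts will differ from the output above. The function name `chunk_with_overlap` is hypothetical.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


chunks = chunk_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Each chunk repeats the last character of the previous one, so a sentence that straddles a boundary still appears partly in both chunks.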

In [5]:
from langchain.prompts import HumanMessagePromptTemplate
from langchain.schema.messages import SystemMessage

chat_template = ChatPromptTemplate.from_messages(
    [
        # Note: SystemMessage is a fixed message, not a template, so the
        # "{context}" placeholder in its content is sent to the model as-is.
        SystemMessage(
            content=(
                "You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use information from context {context} to answer the question"
            )
        ),
        # The retrieved context and the question are substituted here.
        HumanMessagePromptTemplate.from_template("{context} {question}"),
    ]
)


In [6]:
vectorstore = Chroma.from_texts(final_data, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

# Not used in this example: rag_chain_2 is built from chat_template instead.
prompt = hub.pull("rlm/rag-prompt")

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain_2 = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | chat_template
    | llm
    | StrOutputParser()
)

In [7]:
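question = "Who is Ferdinando Langchain?"
# Preview the messages the template produces. The raw Document list is passed
# as context here, so the human message shows the Documents' repr; inside
# rag_chain_2 the same splits are joined into plain text by format_docs.
chat_template.format_messages(
    context=vectorstore.similarity_search(question, k=4),
    question=question,
)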

[SystemMessage(content="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use information from context {context} to answer the question"),
 HumanMessage(content='[Document(page_content=\'Ferdinando Langchain: Pioneering AI Engineer and Visionary  \\n \\nFerdinando Langchain was born on a brisk winter day in 1985 in Milan, Italy. From an early \\nage, he displayed an insatiable curiosity about technology and a keen interest in \\nunderstanding the mysteries of artificial intelligence. His journey into AI began when he \\nstumbled upon an old computer in his father\\\'s study, sparking a fascination that would \\nshape his future.  \\n \\nAfter completing his undergraduate studies in computer science at the University of Milan, \\nFerdinando ventured to the United States to pursue a Ph.D. in Artificial Intelligence at the \\nMassachusetts Institute of Technol

In [8]:
rag_chain_2.invoke("Who is Ferdinando Langchain?")

'Ferdinando Langchain is a pioneering AI engineer and visionary. He was born in Milan, Italy in 1985 and displayed a strong interest in technology and artificial intelligence from a young age. He completed his undergraduate studies in computer science at the University of Milan and went on to pursue a Ph.D. in Artificial Intelligence at the Massachusetts Institute of Technology (MIT) in the United States. \n\nFerdinando\'s breakthroughs in artificial general intelligence (AGI) were revolutionary, and his doctoral thesis titled "Synaptic Horizons: A Journey Towards AGI" laid the foundation for a new era in machine learning. He received several prestigious awards for his contributions to the field, including the Turing Award, the Lighthaven Medal, and the Global Innovation Prize. \n\nIn addition to his research, Ferdinando was a sought-after speaker at conferences and events worldwide, where he made complex AI concepts accessible to a broader audience. He also advocated for ethical AI pr

In [9]:
rag_chain_2.invoke("How many awards did Ferdinando Langchain win?")

'Ferdinando Langchain won three prestigious international prizes for his outstanding contributions to the field of AI. The awards include the Turing Award, the Lighthaven Medal, and the Global Innovation Prize.'

## Example 3: From web

In [10]:
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)

vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

prompt = hub.pull("rlm/rag-prompt")

print(prompt)
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain_3 = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

input_variables=['context', 'question'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"))]
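The idea behind recursive splitting can be shown with a simplified, standard-library sketch. It assumes a fixed separator order (paragraphs, then lines, then words), omits `chunk_overlap`, and is not the `RecursiveCharacterTextSplitter` implementation; the function name `recursive_split` is hypothetical. The coarsest separator is tried first, and a finer one is used only for pieces that are still too long.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ")):
    """Split text into chunks of at most chunk_size, preferring coarse separators."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard cut.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            # The piece still fits: keep merging into the current chunk.
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry with the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


chunks = recursive_split("alpha beta\n\ngamma delta epsilon", chunk_size=12)
print(chunks)  # ['alpha beta', 'gamma delta', 'epsilon']
```

The first paragraph fits in one chunk and is kept whole; only the second paragraph, which exceeds the limit, is broken at word boundaries.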


In [11]:
rag_chain_3.invoke("What is Task Decomposition?")

"Task decomposition is a technique used to break down complex tasks into smaller and simpler steps. It can be done through various methods such as using prompting techniques, task-specific instructions, or human inputs. The goal is to make the task more manageable and facilitate the interpretation of the model's thinking process."

Source: https://lilianweng.github.io/posts/2023-06-23-agent/