<div class='flex gap-1 items-center'>
    <img alt='self llama' src='../images/self-rag.jpeg' width='128' height='128' class='rounded'>
    <h1>Self-Corrective RAG with LangGraph and Groq Llama 3</h1>
</div>

---


## Install dependencies

```bash
pip install langchain langchain-core langchain-community langchain-groq langgraph wikipedia
```


## Import libraries


```python
from langchain_core.documents import Document
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.utils.function_calling import convert_to_openai_tool
from langchain_core.output_parsers import StrOutputParser, PydanticOutputParser
from langchain_groq.chat_models import ChatGroq
from langchain_community.retrievers import WikipediaRetriever
```

## Basic RAG Components


### Language Model


```python
llm = ChatGroq(
    model='llama3-groq-8b-8192-tool-use-preview'
)

res = llm.invoke('Hi! How are you today?')

res.content
```

"I'm doing well, thank you! How can I assist you today?"

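`ChatGroq` reads its API key from the `GROQ_API_KEY` environment variable unless a key is passed explicitly. The sketch below is one possible pre-flight check, not part of the original notebook; the helper name `require_env` is hypothetical.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper: fails early instead of at the first API call.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "export it before creating the client."
        )
    return value


# Intended use (assumption): call this once before constructing ChatGroq.
# require_env("GROQ_API_KEY")
```

Calling `require_env("GROQ_API_KEY")` before the `ChatGroq(...)` cell surfaces a missing key with a readable message rather than an authentication error from the first request.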
### Document Retriever


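```python
# Return up to six Wikipedia articles per query.
retriever = WikipediaRetriever(top_k_results=6)

docs = retriever.invoke("Meta AI")

# Each Document carries the article title, summary and source URL as metadata.
docs[0].metadata
```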

{'title': 'Meta AI',
 'summary': 'Meta AI is an American company owned by Meta (formerly Facebook) that develops artificial intelligence and augmented and artificial reality technologies. Meta AI deems itself an academic research laboratory, focused on generating knowledge for the AI community, and should not be confused with Meta\'s Applied Machine Learning (AML) team, which focuses on the practical applications of its products. \n\nThe laboratory was founded as Facebook Artificial Intelligence Research (FAIR) with locations at the headquarters in Menlo Park, California, London, United Kingdom, and a new laboratory in Manhattan. FAIR was officially announced in September 2013. FAIR was first directed by New York University\'s Yann LeCun, a deep learning professor and Turing Award winner. Working with NYU\'s Center for Data Science, FAIR\'s initial goal was to research data science, machine learning, and artificial intelligence and to "understand intelligence, to discover its fundament

```python
print(docs[0].page_content)
```

Meta AI is an American company owned by Meta (formerly Facebook) that develops artificial intelligence and augmented and artificial reality technologies. Meta AI deems itself an academic research laboratory, focused on generating knowledge for the AI community, and should not be confused with Meta's Applied Machine Learning (AML) team, which focuses on the practical applications of its products. 

The laboratory was founded as Facebook Artificial Intelligence Research (FAIR) with locations at the headquarters in Menlo Park, California, London, United Kingdom, and a new laboratory in Manhattan. FAIR was officially announced in September 2013. FAIR was first directed by New York University's Yann LeCun, a deep learning professor and Turing Award winner. Working with NYU's Center for Data Science, FAIR's initial goal was to research data science, machine learning, and artificial intelligence and to "understand intelligence, to discover its fundamental principles, and to make machines sign

## Tools


### Format documents


```python
def format_docs(docs: list[Document]) -> str:
    """Render documents as numbered source blocks for the prompt."""
    formatted = [
        (
            # Source IDs are 1-based; the model cites documents by these IDs.
            f"Source ID: {i+1}\n"
            f"Article Title: {doc.metadata['title']}\n"
            f"Article URL: {doc.metadata['source']}\n"
            f"Article Content: {doc.page_content}"
        )
        for i, doc in enumerate(docs)
    ]
    return "\n\n" + "\n\n".join(formatted)
```

```python
print(format_docs(docs[:2]))
```



Source ID: 1
Article Title: Meta AI
Article URL: https://en.wikipedia.org/wiki/Meta_AI
Article Content: Meta AI is an American company owned by Meta (formerly Facebook) that develops artificial intelligence and augmented and artificial reality technologies. Meta AI deems itself an academic research laboratory, focused on generating knowledge for the AI community, and should not be confused with Meta's Applied Machine Learning (AML) team, which focuses on the practical applications of its products. 

The laboratory was founded as Facebook Artificial Intelligence Research (FAIR) with locations at the headquarters in Menlo Park, California, London, United Kingdom, and a new laboratory in Manhattan. FAIR was officially announced in September 2013. FAIR was first directed by New York University's Yann LeCun, a deep learning professor and Turing Award winner. Working with NYU's Center for Data Science, FAIR's initial goal was to research data science, machine learning, and artificial intel
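`format_docs` indexes `doc.metadata['title']` and `doc.metadata['source']` directly, so it raises `KeyError` if a document lacks either key. The following is a minimal sketch of a more tolerant variant, assuming placeholder strings are acceptable in the prompt. `format_docs_safe` and `StubDocument` are hypothetical names; the stub stands in for `Document` so the example runs without network access.

```python
from dataclasses import dataclass, field


@dataclass
class StubDocument:
    """Hypothetical stand-in exposing the two attributes the formatter reads."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs_safe(docs) -> str:
    """Like format_docs, but falls back to placeholders for missing metadata."""
    blocks = []
    for source_id, doc in enumerate(docs, start=1):
        title = doc.metadata.get("title", "Unknown title")
        url = doc.metadata.get("source", "Unknown URL")
        blocks.append(
            f"Source ID: {source_id}\n"
            f"Article Title: {title}\n"
            f"Article URL: {url}\n"
            f"Article Content: {doc.page_content}"
        )
    return "\n\n" + "\n\n".join(blocks)


sample = [
    StubDocument("First article text.", {"title": "Meta AI", "source": "https://en.wikipedia.org/wiki/Meta_AI"}),
    StubDocument("Second article text."),  # no metadata at all
]
print(format_docs_safe(sample))
```

The output layout matches `format_docs`, so the variant can be swapped into the chain without changing the prompt.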

## Chains


### Answer question with citations


#### CitedAnswer Output Model


```python
class CitedAnswer(BaseModel):
    """Answer the user question based only on the given sources, and cite the sources used."""

    answer: str = Field(
        ...,
        description="The answer to the user question, which is based only on the given sources.",
    )
    citations: list[int] = Field(
        ...,
        description="The integer IDs of the SPECIFIC sources which justify the answer.",
    )
```
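The schema only constrains `citations` to be a list of integers; nothing stops the model from citing an ID that was never shown to it. A small check along the following lines could flag such cases. This is a sketch with a hypothetical helper name, `invalid_citations`, and it assumes the 1-based source IDs produced by `format_docs`.

```python
def invalid_citations(citations: list[int], num_sources: int) -> list[int]:
    """Return the cited IDs that fall outside 1..num_sources, in order."""
    return [cid for cid in citations if not 1 <= cid <= num_sources]


# With six retrieved sources, IDs 7 and 0 do not refer to any document.
print(invalid_citations([1, 3, 7, 0], num_sources=6))  # [7, 0]
```

An empty result means every citation points at a retrieved document. It does not show that the cited document actually supports the answer.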

#### Prompt


```python
RAG_SYSTEM_PROMPT = (
    "You are a helpful AI assistant. Your task is to answer questions based SOLELY on the information provided in the given articles.\n\n"
    "Main instructions:\n"
    "1. Answer the user's question using ONLY the information from the provided articles.\n"
    "2. Cite the source for EVERY statement you make, using the format [source_id] at the end of each sentence.\n"
    "3. If the information needed to answer the question is not in the provided articles, respond: \"I don't have enough information in the provided sources to answer this question.\"\n"
    "4. DO NOT use external knowledge or information not in the given articles, even if you believe it to be correct.\n"
    "5. If you can only partially answer the question, provide the information you have and then indicate that the rest of the question cannot be answered with the available information.\n"
    "6. Be concise and direct in your responses.\n\n"
    "Reference articles:\n"
    "{documents}\n\n"
)


RAG_PROMPT = ChatPromptTemplate.from_messages(
    [
        ("system", RAG_SYSTEM_PROMPT),
        ("human", "{question}"),
    ]
)

RAG_PROMPT.pretty_print()
```


You are a helpful AI assistant. Your task is to answer questions based SOLELY on the information provided in the given articles.

Main instructions:
1. Answer the user's question using ONLY the information from the provided articles.
2. Cite the source for EVERY statement you make, using the format [source_id] at the end of each sentence.
3. If the information needed to answer the question is not in the provided articles, respond: "I don't have enough information in the provided sources to answer this question."
4. DO NOT use external knowledge or information not in the given articles, even if you believe it to be correct.
5. If you can only partially answer the question, provide the information you have and then indicate that the rest of the question cannot be answered with the available information.
6. Be concise and direct in your responses.

Reference articles:
[33;1m[1;3m{documents}[0m




[33;1m[1;3m{question}[0m


#### RAG Chain


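```python
# Ask the model to return a CitedAnswer object instead of free text.
rag_llm = llm.with_structured_output(CitedAnswer)

# Turn the retrieved Document list into prompt text, then query the model.
rag_response_chain = (
    RunnablePassthrough.assign(
        documents=lambda x: format_docs(x["documents"])
    )
    | RAG_PROMPT
    | rag_llm
)

# Pull the question out of the input dict and pass it to the retriever.
retrieve_docs = (lambda x: x["question"]) | retriever

# The final output keeps "question" and adds "documents" and "response".
rag_chain = RunnablePassthrough.assign(documents=retrieve_docs).assign(
    response=rag_response_chain
)
```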
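The data flow through the two chained `assign` calls can be pictured with plain dictionaries. The sketch below is a rough analogue, not LangChain's implementation: `assign`, `fake_retrieve`, and `fake_respond` are hypothetical stand-ins, and the real runnables also handle streaming, batching, and async execution.

```python
from typing import Any, Callable

State = dict[str, Any]


def assign(state: State, **steps: Callable[[State], Any]) -> State:
    """Return a copy of state with one new key per step.

    Every step receives the input state, not the partially updated one.
    """
    updated = dict(state)
    for key, step in steps.items():
        updated[key] = step(state)
    return updated


# Hypothetical stand-ins for the retriever and the response chain.
def fake_retrieve(state: State) -> list[str]:
    return [f"doc about {state['question']}"]


def fake_respond(state: State) -> str:
    return f"answer from {len(state['documents'])} document(s)"


state = assign({"question": "What is Meta AI?"}, documents=fake_retrieve)
state = assign(state, response=fake_respond)
print(sorted(state))  # ['documents', 'question', 'response']
```

The second step can read `documents` only because the first `assign` has already finished, which mirrors why `rag_chain` chains two `assign` calls rather than passing both keys to a single one.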

```python
result = rag_chain.invoke({"question": "What is Meta AI?"})

result['response']
```

CitedAnswer(answer='Meta AI is an American company owned by Meta (formerly Facebook) that develops artificial intelligence and augmented and artificial reality technologies.', citations=[1])

```python
result = rag_chain.invoke({"question": "Who is Jann LeCun?"})

result['response']
```

CitedAnswer(answer='Jann LeCun is a French computer scientist and director of AI Research at Facebook. He is also a professor at New York University and is known for his work on convolutional neural networks and his role in the development of the LeNet-5 and LeNet-7 algorithms.', citations=[1])
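Because `rag_chain` returns the retrieved documents alongside the response, the cited IDs can be mapped back to article titles and URLs for inspection. The following is a minimal sketch, assuming the 1-based IDs from `format_docs` and metadata dictionaries with `title` and `source` keys; `cited_sources` is a hypothetical helper, and the sample metadata is illustrative rather than real retriever output.

```python
def cited_sources(citations: list[int], metadatas: list[dict]) -> list[tuple]:
    """Map 1-based citation IDs to (id, title, url), skipping out-of-range IDs."""
    resolved = []
    for cid in citations:
        if 1 <= cid <= len(metadatas):
            meta = metadatas[cid - 1]
            resolved.append((cid, meta.get("title"), meta.get("source")))
    return resolved


# Illustrative metadata, shaped like the retriever's but not real output.
sample_metadata = [
    {"title": "Meta AI", "source": "https://en.wikipedia.org/wiki/Meta_AI"},
    {"title": "Second article", "source": "https://example.org/second"},
]
print(cited_sources([1, 5], sample_metadata))
```

With the notebook's objects, the call would look like `cited_sources(result['response'].citations, [d.metadata for d in result['documents']])`. Reading the cited article next to the answer is a quick manual way to see whether each claim is actually supported by the source.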

## To be continued