# CRAG-ReAct: Intelligent Retrieval & Reasoning
## Robust Retrieval-Augmented Generation with Step-by-Step Reasoning

This Jupyter notebook explores a retrieval workflow that combines two techniques:

1. **Corrective Retrieval Augmented Generation (CRAG)**
2. **Reasoning+Acting (ReAct)**

![Local image](./assets/RAG.png "Flow Diagram")

Let's get started!

<small>

**SOURCES:**

[The ReAct paper](https://arxiv.org/abs/2210.03629) introduces an approach that interleaves reasoning traces with task-specific actions in large language models (LLMs). Letting the model alternate between reasoning steps and actions while it interacts with external sources improves performance, interpretability, and trustworthiness on tasks such as question answering, fact verification, and interactive decision making.


[Corrective Retrieval Augmented Generation (CRAG)](https://arxiv.org/abs/2401.15884) aims to make RAG-based generation more robust. It adds a lightweight retrieval evaluator that assesses the quality of retrieved documents, falls back on large-scale web search when retrieval is poor, and applies a decompose-then-recompose algorithm that keeps key information and filters out irrelevant content.
</small>
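
The interleaved reasoning-and-acting loop that ReAct describes can be sketched in a few lines. This is a minimal illustration, not the implementation used by `RAGWorkflowGraph`: the `TOOLS` registry, the `scripted_policy` function (a stand-in for the LLM), and the trace format are all assumptions made up for this example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry; the tool name and its data are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "lookup": lambda query: {"capital of Belgium": "Brussels"}.get(query, "unknown"),
}


def scripted_policy(question: str, trace: List[str]) -> Tuple[str, str, str]:
    """Stand-in for the LLM: returns (thought, action, action_input)."""
    observations = [line for line in trace if line.startswith("Observation: ")]
    if not observations:
        # Nothing observed yet, so reason that a tool call is needed.
        return ("I need to look this up.", "lookup", question)
    # Use the latest observation as the final answer.
    latest = observations[-1][len("Observation: "):]
    return ("The observation answers the question.", "finish", latest)


def react(question: str, policy, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Alternate thought -> action -> observation until the policy finishes."""
    trace: List[str] = []
    for _ in range(max_steps):
        thought, action, action_input = policy(question, trace)
        trace.append(f"Thought: {thought}")
        if action == "finish":
            return action_input, trace
        trace.append(f"Action: {action}[{action_input}]")
        observation = TOOLS[action](action_input)
        trace.append(f"Observation: {observation}")
    return "no answer within step budget", trace


answer, trace = react("capital of Belgium", scripted_policy)
print(answer)
print("\n".join(trace))
```

In a real agent the policy would be an LLM call that receives the accumulated trace as context, and the step budget guards against loops that never reach a final answer.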


In [None]:
# Make sure you're running this in an IPython environment that supports top-level await
# You might need to run '%autoawait asyncio' at the start of your notebook if it's not enabled by default

from dotenv import load_dotenv
load_dotenv()
from paginx.graphs.vectrix import RAGWorkflowGraph
from langchain_core.messages import HumanMessage
from IPython.display import Image, display
from paginx.db.postgresql import PostgresSaver, BaseCheckpointSaver
import os
from langsmith import Client
from psycopg_pool import AsyncConnectionPool

In [None]:


# Enable LangSmith tracing
os.environ["LANGCHAIN_TRACING_V2"] = "true"

DB_URI = os.getenv("DB_URI")

# Create the pool without opening it
pool = AsyncConnectionPool(
    conninfo=DB_URI,
    max_size=20,
    open=False  # This prevents the pool from opening in the constructor
)

# Explicitly open the pool
await pool.open()

checkpointer = PostgresSaver(async_connection=pool)
await checkpointer.acreate_tables(pool)

config = {"configurable": {"thread_id": "999999999"}}

demo_graph = RAGWorkflowGraph(DB_URI=DB_URI)
graph = demo_graph.create_graph(checkpointer=checkpointer)

display(Image(graph.get_graph().draw_mermaid_png()))

# Use a descriptive name instead of shadowing the built-in `input`
question = HumanMessage(content="How's the weather in Antwerp ?")

run_id = ""
langgraph_node = ""

async for event in graph.astream_events({"question": question}, version="v1", config=config):
    run_id = event['run_id']
    kind = event["event"]
    if kind == "on_chat_model_stream":
        if langgraph_node != event['metadata']['langgraph_triggers']:
            langgraph_node = event['metadata']['langgraph_triggers']
            print(f"Answering Step: {langgraph_node[0].split(':')[-1]}")
        if event['metadata']['langgraph_node'] == "generate_response":
            content = event["data"]["chunk"].content
            if content:
                print(content, end="|")
    # Print the documents used to generate the answer, if any
    if kind == "on_chain_end":
        if event["name"] == "generate_response":
            sources = []
            for doc in event["data"]["input"]["documents"]:
                sources.append(doc.dict())
            print(sources)


client = Client()
run = client.read_run(run_id)
print('\n\n', run.url)

# Don't forget to close the pool when you're done
# You can run this in a separate cell when you're finished
await pool.close()

In [None]:
import os
from dotenv import load_dotenv
load_dotenv()

os.environ["LANGCHAIN_TRACING_V2"] = "true"


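from langchain_core.pydantic_v1 import BaseModel, Field
from langchain.prompts import ChatPromptTemplate
# OllamaFunctions is provided by the langchain_experimental package
from langchain_experimental.llms.ollama_functions import OllamaFunctions

llm = OllamaFunctions(model="llama3-groq-tool-use:8b-q8_0", format="json")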


class GradeDocuments(BaseModel):
    binary_score: str = Field(
        description="Documents are relevant to the question, 'yes' or 'no'"
    )


system = """You are a grader assessing relevance of a retrieved document to a user question. \n
    If the document contains keyword(s) or semantic meaning related to the question, grade it as relevant. \n
    Give a binary score, 'yes' or 'no', to indicate whether the document is relevant to the question."""
grade_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "Retrieved document: \n\n {document} \n\n User question: {question}"),
    ]
)

structured_llm_grader = llm.with_structured_output(GradeDocuments)
grade_chain = grade_prompt | structured_llm_grader

# ainvoke returns a coroutine; await it directly to get the GradeDocuments result
answer = await grade_chain.ainvoke(
    {"document": "25 degrees currently in NYC", "question": "How's the weather in Antwerp ?"}
)
print(answer.binary_score)


In [37]:
from langchain_ollama import ChatOllama
from langchain_core.messages import AIMessage
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from langchain.output_parsers.openai_tools import JsonOutputKeyToolsParser
from langchain.output_parsers.openai_tools import PydanticToolsParser




class GradeDocuments(BaseModel):
    binary_score: str = Field(
        description="Documents are relevant to the question, 'yes' or 'no'"
    )


llm = ChatOllama(
    model="llama3-groq-tool-use:8b-q8_0",
    temperature=0,
)

llm = llm.bind_tools([GradeDocuments])

#parser = JsonOutputToolsParser()
#parser = JsonOutputKeyToolsParser(key_name="GradeDocuments")
parser = PydanticToolsParser(tools=[GradeDocuments])


system = """You are a grader assessing relevance of a retrieved document to a user question. \n
    If the document contains keyword(s) or semantic meaning related to the question, grade it as relevant. \n
    Give a binary score, 'yes' or 'no', to indicate whether the document is relevant to the question."""
grade_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "Retrieved document: \n\n {document} \n\n User question: {question}"),
    ]
)

grade_chain = grade_prompt | llm | parser

# PydanticToolsParser returns a list with one GradeDocuments instance per tool call
answer = await grade_chain.ainvoke(
    {"document": "25 degrees currently in NYC", "question": "How's the weather in Antwerp ?"}
)
answer[0].binary_score

'no'
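
The `'yes'`/`'no'` grade above is what a CRAG-style workflow needs to decide whether retrieval was good enough. The sketch below shows one way the grades might be used: keep the documents graded relevant and flag a web-search fallback when none survive. It is a simplification of the paper's evaluator, and everything in it is an assumption for illustration: `keyword_grader` is a hypothetical stand-in for the LLM grade chain so the example runs without a model, the sample documents are made up, and the fallback rule (search only when no document is relevant) is just one possible policy.

```python
from typing import Callable, List, Tuple


def keyword_grader(document: str, question: str) -> str:
    """Hypothetical stand-in for the LLM grader: 'yes' if a longer question word appears in the document."""
    keywords = [word.strip("?.,!").lower() for word in question.split()]
    keywords = [word for word in keywords if len(word) > 5]
    text = document.lower()
    return "yes" if any(word in text for word in keywords) else "no"


def filter_documents(
    documents: List[str],
    question: str,
    grader: Callable[[str, str], str],
) -> Tuple[List[str], bool]:
    """Keep documents graded 'yes'; request a web search if nothing relevant remains."""
    relevant = [doc for doc in documents if grader(doc, question) == "yes"]
    needs_web_search = len(relevant) == 0
    return relevant, needs_web_search


question = "How's the weather in Antwerp ?"
documents = [
    "25 degrees currently in NYC",      # made-up sample document
    "Antwerp: light rain, 14 degrees",  # made-up sample document
]

relevant, needs_web_search = filter_documents(documents, question, keyword_grader)
print(relevant, needs_web_search)

# With only the off-topic document, nothing survives and the fallback is triggered.
only_irrelevant, fallback = filter_documents(documents[:1], question, keyword_grader)
print(only_irrelevant, fallback)
```

To use the real grader, `keyword_grader` could be replaced by a function that awaits `grade_chain.ainvoke(...)` and returns the `binary_score`, with `filter_documents` made asynchronous accordingly.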