# **Important**: Run the cell below to load the OpenAI API key for the rest of the notebook

In [None]:
import os
import openai

from dotenv import load_dotenv, find_dotenv
from typing import Optional
_ = load_dotenv(find_dotenv()) # read local .env file
doc_address = 'resources/Noria_eBook_The_Insurance_Industry_2025_171205_v8.pdf' #PDF file address for question answering.

# What is Retrieval Augmented Generation (RAG)

LLMs are limited in their knowledge: they have no access to data outside their training set. This limits their ability to present new or updated information out of the box when they lack context on a topic. In this section, we cover how to provide outside context to LLMs so that their reasoning can be used to answer questions based on accurate sources of information.

Retrieval Augmented Generation (RAG) enhances the knowledge of a Large Language Model (LLM) by incorporating additional data. Although LLMs can handle a diverse range of topics, their understanding is confined to publicly available information up to their training cutoff date. To enable AI applications to reason about private or post-cutoff data, the model's knowledge must be supplemented with the specific information required. RAG does this by retrieving relevant information and incorporating it into the model prompt.

LangChain encompasses various components specifically designed to facilitate the development of Q&A applications and, more broadly, RAG applications.

## A RAG system consists of two main parts:
### Retrieval and generation
#### 1. Retrieve: Given a user input, relevant splits are retrieved from storage using a Retriever.
#### 2. Generate: A ChatModel / LLM produces an answer using a prompt that includes the question and the retrieved data.
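To make the two parts concrete, here is a minimal sketch that uses no LLM and no vector store. Everything in it is an assumption for illustration: the three sentences in `TOY_SPLITS` are made up (they are not taken from the PDF), and word overlap stands in for the embedding-based Retriever used later in this notebook.

```python
import re

# Hypothetical in-memory "storage" of text splits (illustrative sentences only).
TOY_SPLITS = [
    "Peer-to-peer insurance lets a group of people pool their premiums.",
    "Telematics devices record driving behaviour for usage-based pricing.",
    "Chatbots can handle simple claims without a human agent.",
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, splits: list[str], k: int = 1) -> list[str]:
    """Retrieve step: rank splits by word overlap with the question, keep the top k."""
    q_words = tokenize(question)
    ranked = sorted(splits, key=lambda s: len(q_words & tokenize(s)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, context: list[str]) -> str:
    """Generate step (input side): put the retrieved context and the question in one prompt."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

question = "What is peer-to-peer insurance?"
context = retrieve(question, TOY_SPLITS)
toy_prompt = build_prompt(question, context)
print(toy_prompt)
```

In a real system the string in `toy_prompt` would be sent to the ChatModel / LLM, which produces the final answer.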


<div style="display: flex;  height: 500px;">
    <img src="resources/RAG.png"  style="margin-left:auto; margin-right:auto"/>
</div>

We will go through each step and build a RAG system with LangChain to chat with a PDF file.

# 1. Retrieval

In this use case, we look at retrieval from a PDF file. The same technique can be extended to data in a database, on a website, and so on.
The first step is to preprocess the PDF file. This is usually done in three steps:

1. **Load**: First we need to load our data. This is done with ```DocumentLoaders```.
2. **Split**: Text splitters break large ```Documents``` into smaller chunks. This is useful both for indexing data and for passing it in to a model, since large chunks are harder to search over and won’t fit in a model’s finite context window.
3. **Store**: We need somewhere to store and index our splits, so that they can later be searched over. This is often done using a VectorStore and Embeddings model.

We will go through each step and write a code block to perform them.
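Before using the library splitter, it may help to see what a chunk size and a chunk overlap mean. The sketch below is a simplified fixed-window splitter written for illustration. It is not how `RecursiveCharacterTextSplitter` works internally (that class prefers to break on separators such as paragraph and line breaks), and the function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters; neighbours share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of storing some text twice.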

In [None]:
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_core.output_parsers import StrOutputParser

A sample PDF document is provided. The document is 25 pages long and contains a lot of text. Its length prevents it from being fed directly to an LLM, so some splitting and indexing is required.

In [None]:
from IPython.display import display, IFrame

# Specify the path to your PDF file
pdf_path = doc_address

# Create an IFrame to embed the PDF viewer
pdf_viewer_iframe = IFrame(src=pdf_path, width=1100, height=900)

# Display the IFrame
pdf_viewer_iframe

## 1.1 Load the file

<div style="display: flex;  height: 500px;">
    <img src="resources/doc_load.png"  style="margin-left:auto; margin-right:auto"/>
</div>

In [None]:
from langchain_community.document_loaders import PyPDFLoader


# Decode the file
loader = PyPDFLoader(doc_address)

# Check out the text from the PDF
loader.load()[:3]

## 1.2 Split the text

<div style="display: flex;  height: 500px;">
    <img src="resources/doc_split.png"  style="margin-left:auto; margin-right:auto"/>
</div>

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

pages = loader.load()


text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)



# Split the document into chunks
splits = text_splitter.split_documents(pages)

splits[:3]

## 1.3 Embed and store the text

<div style="display: flex;  height: 500px;">
    <img src="resources/doc_embed_store.png"  style="margin-left:auto; margin-right:auto"/>
</div>

In [None]:
from langchain_community.embeddings.openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma


# Define the embedding function for the document text
embeddings = OpenAIEmbeddings()

# Create a Chroma vector store for searching the documents
docsearch = Chroma.from_documents(splits,
                                 embeddings)

A sample embedding vector for one of the splits is shown in the cell below:

In [None]:
docsearch.get([docsearch.get()['ids'][5]], include=['embeddings', 'documents'])

You can search with different questions, and the nearest splits, with their context, will be returned through the ```Chroma``` API.

In [None]:
docsearch.as_retriever().get_relevant_documents('What is Peer-to-Peer (P2P) Insurance?')
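The nearest-split lookup above can be pictured as ranking stored vectors by their similarity to the query vector. The sketch below uses cosine similarity, which is one common choice; the metric a given vector store actually applies depends on its configuration. The 3-dimensional vectors and their labels are hypothetical, since real embedding models return much longer vectors.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy index: split label -> made-up embedding vector.
toy_index = {
    "split about P2P insurance": [0.9, 0.1, 0.0],
    "split about telematics": [0.1, 0.9, 0.1],
    "split about chatbots": [0.0, 0.2, 0.9],
}

def nearest(query_vector: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k labels whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query_vector, vec), label) for label, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [label for _, label in scored[:k]]

query_vector = [0.8, 0.2, 0.1]  # made-up embedding of a question about P2P insurance
top = nearest(query_vector, toy_index)
print(top)  # ['split about P2P insurance', 'split about telematics']
```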

# 2. RAG Agent

<div style="display: flex;  height: 500px;">
    <img src="resources/RAG_agent.png"  style="margin-left:auto; margin-right:auto"/>
</div>

Next, we use ```LangGraph```, showcased in the previous notebook, to build a LangChain agent that has access to the retrieval engine defined as a tool. To achieve this, we follow two steps:

1. Define the retrieval tool using ```langchain``` built-in functions.
2. Create the agent pipeline graph using ```LangGraph```.

The code for the steps above is written in the following cell:

In [None]:
import json
import operator
from typing import Annotated, Sequence, TypedDict

from langgraph.prebuilt import ToolExecutor
from langchain.tools.retriever import create_retriever_tool
from langchain_core.messages import BaseMessage
from langchain.output_parsers import PydanticOutputParser
from langchain.prompts import PromptTemplate
from langchain.tools.render import format_tool_to_openai_function
from langchain_core.messages import BaseMessage, FunctionMessage
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import ToolInvocation
from langgraph.graph import END, StateGraph

prompt = PromptTemplate.from_template("Page {page}: {page_content}")
doc_sep = '========='

tool = create_retriever_tool(
    docsearch.as_retriever(),
    "retrieve_insurance_doc",
    "Use this tool to answer any question about the insurance and finance using the Noria documents",
    document_prompt= prompt,
    document_separator=doc_sep
)

tools = [tool]

# We will set streaming=True so that we can stream tokens
model = ChatOpenAI(temperature=0, streaming=True)

functions = [format_tool_to_openai_function(t) for t in tools]

model_with_functions = model.bind_functions(functions)

tool_executor = ToolExecutor(tools)

class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]



### Edges


def should_retrieve(state):
    """
    Decides whether the agent should retrieve more information or end the process.

    This function checks the last message in the state for a function call. If a function call is
    present, the process continues to retrieve information. Otherwise, it ends the process.

    Args:
        state (messages): The current state of the agent, including all messages.

    Returns:
        str: A decision to either "continue" the retrieval process or "end" it.
    """
    print("---DECIDE TO RETRIEVE---")
    messages = state["messages"]
    last_message = messages[-1]
    # If there is no function call, then we finish
    if "function_call" not in last_message.additional_kwargs:
        print("---DECISION: DO NOT RETRIEVE / DONE---")
        return "end"
    # Otherwise there is a function call, so we continue
    else:
        print("---DECISION: RETRIEVE---")
        return "continue"


### Nodes


# Define the function that calls the model
def call_model(state):
    """
    Invokes the agent model to generate a response based on the current state.

    This function calls the agent model to generate a response to the current conversation state.
    The response is added to the state's messages.

    Args:
        state (messages): The current state of the agent, including all messages.

    Returns:
        dict: The updated state with the new message added to the list of messages.
    """
    print("---CALL AGENT---")
    messages = state["messages"]
    response = model_with_functions.invoke(messages)
    # We return a list, because this will get added to the existing list
    return {"messages": [response]}


# Define the function to execute tools
def call_tool(state):
    """
    Executes a tool based on the last message's function call.

    This function is responsible for executing a tool invocation based on the function call
    specified in the last message. The result from the tool execution is added to the conversation
    state as a new message.

    Args:
        state (messages): The current state of the agent, including all messages.

    Returns:
        dict: The updated state with the new function message added to the list of messages.
    """
    print("---EXECUTE RETRIEVAL---")
    messages = state["messages"]
    # Based on the continue condition
    # we know the last message involves a function call
    last_message = messages[-1]
    # We construct a ToolInvocation from the function_call
    action = ToolInvocation(
        tool=last_message.additional_kwargs["function_call"]["name"],
        tool_input=json.loads(
            last_message.additional_kwargs["function_call"]["arguments"]
        ),
    )
    # We call the tool_executor and get back a response
    response = tool_executor.invoke(action)
    # print(type(response))
    # We use the response to create a FunctionMessage
    function_message = FunctionMessage(content=str(response), name=action.tool)

    # We return a list, because this will get added to the existing list
    return {"messages": [function_message]}

# Define a new graph
workflow = StateGraph(AgentState)

# Define the nodes we will cycle between
workflow.add_node("agent", call_model)  # agent
workflow.add_node("action", call_tool)  # retrieval


# Call agent node to decide to retrieve or not
workflow.set_entry_point("agent")

# Decide whether to retrieve
workflow.add_conditional_edges(
    "agent",
    # Assess agent decision
    should_retrieve,
    {
        # Call tool node
        "continue": "action",
        "end": END,
    },
)

workflow.add_edge('action', 'agent')

# Compile
agent = workflow.compile()
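The cycle that the graph above encodes (agent, then a conditional edge, then the retrieval action, then back to the agent) can be sketched in plain Python. This is only an illustration of the control flow, not of LangGraph itself: `fake_model` and `fake_retrieve` are hypothetical stand-ins for the LLM and the retriever tool, and messages are plain dictionaries rather than LangChain message objects.

```python
def fake_model(messages: list[dict]) -> dict:
    """Stand-in for the LLM: ask for retrieval once, then answer from the retrieved text."""
    if not any(m["role"] == "function" for m in messages):
        return {
            "role": "ai",
            "content": "",
            "function_call": {"name": "retrieve", "arguments": {"query": messages[0]["content"]}},
        }
    return {"role": "ai", "content": "Answer based on: " + messages[-1]["content"]}

def fake_retrieve(query: str) -> str:
    """Stand-in for the retriever tool."""
    return f"(retrieved text for '{query}')"

def next_step(messages: list[dict]) -> str:
    """Conditional edge: continue to the tool only if the last message requests a call."""
    return "continue" if "function_call" in messages[-1] else "end"

def run_toy_agent(question: str) -> tuple[list[dict], list[str]]:
    messages = [{"role": "human", "content": question}]
    trace = []  # order in which the nodes were visited
    while True:
        messages.append(fake_model(messages))  # "agent" node
        trace.append("agent")
        if next_step(messages) == "end":
            return messages, trace
        call = messages[-1]["function_call"]  # "action" node
        result = fake_retrieve(**call["arguments"])
        messages.append({"role": "function", "name": call["name"], "content": result})
        trace.append("action")

final_messages, trace = run_toy_agent("What is the future of insurance?")
print(trace)  # ['agent', 'action', 'agent']
```

Each node only appends to the message list, which mirrors how the `operator.add` annotation on `AgentState` accumulates messages.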

In [None]:
import pprint

from langchain_core.messages import HumanMessage

inputs = {
    "messages": [
        HumanMessage(
            content="What is the future of insurance?"
        )
    ]
}

for output in agent.stream(inputs):
    for key, value in output.items():
        pprint.pprint(f"Output from node '{key}':")
        pprint.pprint("---")
        pprint.pprint(value, indent=2, width=80, depth=None)
    pprint.pprint("\n---\n")



## Try out different questions with the document provided above.

In [None]:
import pprint

from langchain_core.messages import HumanMessage

inputs = {
    "messages": [
        HumanMessage(
            content="..." ## write the question instead of three dots.
        )
    ]
}

for output in agent.stream(inputs):
    for key, value in output.items():
        pprint.pprint(f"Output from node '{key}':")
        pprint.pprint("---")
        pprint.pprint(value, indent=2, width=80, depth=None)
    pprint.pprint("\n---\n")

# TODO: Chatbot with RAG capability

Finally, we want to create a chatbot interface for our agent. To do so, we use the ***[holoviz Panel library](https://panel.holoviz.org/index.html)*** and create a class for the chatbot. Fill in the TODO parts of the code to get the chatbot up and running so that it can answer user questions about the PDF file. The chatbot class uses the same agent that is defined above.
#### !!Make sure to run all the cells above for the agent to work properly

Fill in the blanks marked with TODO comments (`#`) to complete the chatbot class and functions, then see the chatbot in action.
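As a hint for updating the source pages, here is one possible way to recover page numbers from the retrieval tool's text output. It assumes the output follows the `"Page {page}: {page_content}"` prompt and the `'========='` separator passed to the tool earlier. The helper name `extract_page_numbers` and the sample string are hypothetical, and whether the page values are zero-based (as `fitz` indexing expects) is worth verifying against your own output.

```python
import re

DOC_SEP = "========="  # assumed to match the separator given to the retriever tool

def extract_page_numbers(tool_output: str, separator: str = DOC_SEP) -> list[int]:
    """Collect the page number at the start of each retrieved document."""
    pages = []
    for part in tool_output.split(separator):
        # Match only at the start of a part, so "Page 12:" inside the text is ignored.
        match = re.match(r"\s*Page (\d+):", part)
        if match:
            pages.append(int(match.group(1)))
    return pages

# Made-up tool output in the assumed format.
sample_output = "Page 4: first snippet=========Page 11: second snippet=========Page 4: third snippet"
print(extract_page_numbers(sample_output))  # [4, 11, 4]
```

A list produced this way could then be stored on the chatbot instance before re-rendering the source pages.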

In [None]:
import panel as pn
import fitz
from PIL import Image
from functools import partial
import param
import re
from langchain_core.messages.function import FunctionMessage
from langchain.schema.messages import AIMessage
from panel.chat import ChatMessage

pn.extension()

def chat_handler(contents, user, instance):
    # TODO: Try playing around with ChatMessage class to create fancy responses.
    # use the instance.generate_response(contents) function to return the answer to the user. 

    return instance.generate_response(contents)

class PDFChatBot(pn.chat.ChatInterface):
    """
    A HoloViz Panel extension providing a front end for a chatbot equipped with RAG tool.

    This class extends the `pn.chat.ChatInterface` widget to integrate with a chatbot interface
    implemented in Python. It provides a user-friendly chat interface within a Panel
    application, allowing users to interact with the underlying chatbot.

    Attributes:
    -----------
    callback: function
        The function to handle the response to the user chat message.
        
    callback_user: str
        The name of the chatbot user.

    

    Methods:
    --------
    switch_source_tab:
        Switches the active tab to show the source page.
    layout:
        Returns the Panel layout containing the chat interface and other components.
    generate_response:
        Generates a response based on the user query and chat history.
    render_page:
        Renders a specific page of a PDF file as an image.

    """
    def __init__(self, *objects, **params):
        """
        Initialize the PDFChatBot.
        """
        self.callback = chat_handler
        self.callback_user = 'Assistant'
        super(PDFChatBot, self).__init__( *objects, **params)



        ##############################################################
        ## The page numbers of the returned source material from retrieval engine
        ## an array to keep track of chat history
        ## the langchain agent defined above
        
        self.srouce_page_num = [0, 1, 2]
        self.chat_history = []
        self.agent = agent
        ##############################################################

        
        
        self.source_images = [pn.pane.Image(width=500) for _ in range(3)]
        

        self.source_pane = pn.layout.Row(*self.source_images)

        
        
        self.change_source_tab_button = pn.widgets.Button(name="Show source", button_type='primary')
        # Connect the button click event to the method
        self.change_source_tab_button.on_click(self.switch_source_tab)


        self.tabs = pn.Tabs(('Conversation', self), ('Show source page', self.source_pane))
        
        self.render_page()

    ##############################################################
    ### Complete the function definition below.
    ## The aim of the function is to answer the user's query using self.agent.
    ## If a function call has happened, make the necessary adjustments to update the source pages.
    
    def generate_response(self, query):
        ##TODO: complete this function to generate the response to the user query, 
        ## Determine whether the openAI function has been called.
        """
        Generate a response based on user query and chat history.

        Parameters:
        -----------
        query : str
            User's query.

        Returns:
        --------
        answer : str
            Returned output from the agent.
        function_called : bool
            Indicates if a function was called in the response.
        """
        inputs = {
            "messages": [
                HumanMessage(
                    content=query
                )
            ]
        }

        # TODO: check for function calls and adjust the response of the model accordingly.
        # The langchain agent is stored in self.agent; you can call it with self.agent.invoke(inputs)
        # If the retrieval function is called, you need to update the source page numbers from the response.
        answer = 'default response'
        return answer


    

    def switch_source_tab(self, event):
        """
        Switches the active tab to show the source page.

        Parameters:
        -----------
        event : Event
            The event object representing the button click event.
        """
        # Toggle between the conversation tab (0) and the source tab (1)
        self.tabs.active = 1 if self.tabs.active == 0 else 0

    def layout(self):
        """
        Returns the Panel layout containing the chat interface and other components.

        Returns:
        --------
        Panel: The Panel layout containing the chat interface and other components.
        """
        return self.tabs

    def render_page(self):
        """
        Renders source pages of a PDF file in the source tab.

        Returns:
        --------
        None
        """
        doc = fitz.open(doc_address)

        
        for i, pdf_page in enumerate(self.srouce_page_num[:3]):
            page = doc[pdf_page]
            pix = page.get_pixmap(matrix=fitz.Matrix(300 / 72, 300 / 72))
            image = Image.frombytes('RGB', [pix.width, pix.height], pix.samples)
            self.source_images[i].param.update(object=image)

    


In [None]:
chatbot = PDFChatBot()

In [None]:
chatbot.layout()

# You can try with a new PDF file. Upload your PDF file and change the file address at the top of this notebook.