## RAG & Tools 101


## Session Overview

- Introduction to Retrieval-Augmented Generation (RAG) and practical tools.
- Step-by-step guide to building a personal RAG agent using your LinkedIn profile and a summary.
- Overview of integrating tools such as email notifications for unanswered questions and contact collection.
- Instructions for setting up prerequisites and dependencies.


### RAG (Retrieval-Augmented Generation) Steps

1. **Download your LinkedIn Profile as PDF**

   - Go to your LinkedIn profile, click on the "More" button, and select "Save to PDF".
   - Example screenshot:

     <img src="./data/linkedin_pdf_download.png" alt="How to download LinkedIn profile as PDF" width="800"/>

2. **Write a Summary Text File**

   - Use ChatGPT (or another assistant that already has context about you) to generate a professional summary about yourself.
   - **Prompt:**  
     ```
     Write down a detailed page on everything you know about me as a professional. Write it as if I am telling about myself. Output it as markdown so I can copy and paste it.
     ```

3. **Build an Agent to Answer Questions About You**

   - Create an agent that can answer questions based on your LinkedIn PDF and summary.

4. **Build a Chat Interface Using Gradio**

   - Implement a chat UI with Gradio.
   - Ensure the LLM has access to the conversation history for context.

---

### Tools

- **Email Tool for Unanswered Questions:**  
  Add a tool to send an email whenever the LLM cannot answer a user's question.

- **Email Tool for Contact Details:**  
  Add a tool to send an email if the user provides their contact details (name and email).

---

### Prerequisites

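- OpenAI API key (the `OpenAI()` client reads it from the `OPENAI_API_KEY` environment variable)
- Resend API key (read from `RESEND_API_KEY` below)
- LinkedIn Profile PDF
- Install dependencies (the notebook also imports `openai` and `python-dotenv`, so add them if your environment does not already have them):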
  ```
  uv add pypdf resend gradio
  ```

In [None]:
# import necessary libraries
from openai import OpenAI
from dotenv import load_dotenv
from pypdf import PdfReader
import resend
import gradio as gr
import json
import os

In [None]:
# Load environment variables, initialize the OpenAI client, and set the Resend API key
load_dotenv(override=True)
openai_client = OpenAI()
resend.api_key = os.getenv("RESEND_API_KEY")
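
Before going further, it can help to confirm that the expected keys were actually loaded. The sketch below assumes the OpenAI key lives in `OPENAI_API_KEY` (the variable the `OpenAI()` client reads by default) and the Resend key in `RESEND_API_KEY`, as above. `missing_env_vars` is a hypothetical helper, not part of either library.

```python
import os

def missing_env_vars(names, env=None):
    """Return the names from `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

# Hypothetical check; adjust the list to the keys your setup uses.
missing = missing_env_vars(["OPENAI_API_KEY", "RESEND_API_KEY"])
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```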

In [None]:
# function for sending email through Resend

def send_email(subject: str, html_body: str):
    params: resend.Emails.SendParams = {
        "from": "onboarding@resend.dev",
        "to": "YOUR_EMAIL_ADDRESS_HERE@gmail.com",
        "subject": subject,
        "html": html_body,
    }
    resend.Emails.send(params)
    return {"status": "success"}
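
As written, `send_email` reports success whenever `resend.Emails.send` returns, and any exception would propagate into the chat handler. One option, sketched here with a hypothetical `safe_send` wrapper and stand-in senders so it runs without an API key, is to catch failures and return an error status that the LLM can see in the tool result.

```python
def safe_send(send_fn, subject: str, html_body: str) -> dict:
    """Call send_fn(subject, html_body) and report the outcome as a dict."""
    try:
        send_fn(subject, html_body)
    except Exception as exc:  # broad on purpose: any failure becomes a status
        return {"status": "error", "message": str(exc)}
    return {"status": "success"}

# Stand-in senders for illustration; in the notebook you would pass send_email.
def fake_sender(subject, html_body):
    return None

def failing_sender(subject, html_body):
    raise RuntimeError("network unavailable")

print(safe_send(fake_sender, "Hello", "<p>Hi</p>"))
print(safe_send(failing_sender, "Hello", "<p>Hi</p>"))
```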

In [None]:
# read summary file
with open("./data/summary.md", "r", encoding="utf-8") as f:
    summary = f.read()

In [None]:
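# Read the LinkedIn profile PDF and collect the text of every page
reader = PdfReader("./data/Profile.pdf")
linkedin = ""
for page in reader.pages:
    text = page.extract_text()
    if text:
        # Separate pages with a newline so words at page boundaries do not run together
        linkedin += text + "\n"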

In [None]:
# Write a prompt providing information about you.
# Ask it to act as you and answer questions related to your career, background, skills and experience.
# Use the summary and linkedin profile to answer questions.
# If the LLM cannot answer a question, it should simply say so.
name = "YOUR NAME HERE"

system_prompt = f"""
You are acting as {name}. You are answering questions on {name}'s website.
You should answer questions related to {name}'s career, background, skills and experience.
Your responsibility is to represent {name} for interactions on the website as faithfully as possible.
You are given a summary of {name}'s background and LinkedIn profile which you can use to answer questions.
Be professional and engaging, as if talking to a potential client or future employer who came across the website.
If you don't know the answer, say so.

Summary:
{summary}

LinkedIn Profile:
{linkedin}
"""
system_message = {"role": "system", "content": system_prompt}
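
Because the whole summary and profile are pasted into the system prompt, a long PDF makes every request large. The sketch below is a rough size check. `estimate_tokens` and `check_prompt_size` are hypothetical helpers, the budget is an arbitrary example value, and the four-characters-per-token ratio is only a common rule of thumb for English text, not an exact tokenizer count.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate based on character count (an assumption)."""
    return int(len(text) / chars_per_token)

def check_prompt_size(prompt: str, budget_tokens: int = 8000) -> bool:
    """Return True when the estimated size fits within the (hypothetical) budget."""
    return estimate_tokens(prompt) <= budget_tokens

sample_prompt = "You are acting as Jane Doe. " * 10
print(estimate_tokens(sample_prompt), check_prompt_size(sample_prompt))
```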

In [None]:
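# Get user query and answer it using the system prompt and LLM

def answer_query(user_queries):
    # user_queries is a list of chat messages: [{"role": ..., "content": ...}, ...]
    messages = [system_message] + user_queries
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
    )
    return response.choices[0].message.content

# Simple console loop that keeps the conversation history; type "exit" to quit.
console_history = []
while True:
    user_query = input("What would you like to know about me? ")
    if user_query.strip().lower() == "exit":
        break
    console_history.append({"role": "user", "content": user_query})
    answer = answer_query(console_history)
    console_history.append({"role": "assistant", "content": answer})
    print(answer)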

In [None]:
# Console is too boring. Let's build a chat interface using Gradio.
# Define gradio chat function first.

def chat(message, history):
    answer = answer_query(history + [{"role": "user", "content": message}])
    return answer
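
With `type="messages"`, Gradio passes `history` as a list of dictionaries with `role` and `content`. Depending on the Gradio version, entries may also carry extra keys (for example `metadata`), which some OpenAI-compatible providers reject. A minimal sketch, assuming that shape, that keeps only the two fields the chat API needs; `to_chat_messages` is a hypothetical helper.

```python
def to_chat_messages(history, message):
    """Build the message list for the LLM from prior turns plus the new user message."""
    cleaned = [{"role": turn["role"], "content": turn["content"]} for turn in history]
    return cleaned + [{"role": "user", "content": message}]

# Example history in the assumed Gradio "messages" shape.
example_history = [
    {"role": "user", "content": "Hi", "metadata": None},
    {"role": "assistant", "content": "Hello! How can I help?", "metadata": None},
]
print(to_chat_messages(example_history, "What is your background?"))
```

Inside `chat`, the call would then become `answer_query(to_chat_messages(history, message))`.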

In [None]:
# Launch gradio chat interface
gr.ChatInterface(chat, type="messages").launch()

In [None]:
# Define tools for recording unknown questions and user details.
# Each tool definition is a dictionary with the following keys:
# - type: "function"
# - function: a dictionary with the following keys:
#   - name: the name of the tool
#   - description: a description of the tool
#   - parameters: a dictionary with the following keys:
#     - type: "object"
#     - properties: a dictionary with the following keys:
#       - question: a dictionary with the following keys:
#         - type: "string"
#         - description: "The question that the user asked"
#     - required: a list of the required parameters

record_unknown_question_tool = {
    "type": "function",
    "function": {
        "name": "record_unknown_question",
        "description": "Tool for recording an unknown question for future reference",
        "parameters": {
            "type": "object",
            "properties": {
                "question": {
                    "type": "string",
                    "description": "The question that the user asked"
                }
            },
            "required": ["question"],
        }
    }
}

record_user_details_tool = {
    "type": "function",
    "function": {
        "name": "record_user_details",
        "description": "Tool for recording user details",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {
                    "type": "string",
                    "description": "The name of the user"
                },
                "email": {
                    "type": "string",
                    "description": "The email of the user"
                }
            },
            "required": ["name", "email"],
        }
    }
}

tools = [record_unknown_question_tool, record_user_details_tool]
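
The two definitions above share the same outer shape, so a small builder can cut the repetition. This is a sketch only: `make_function_tool` is a hypothetical helper that assumes every parameter is a required string, which happens to hold for both tools here.

```python
def make_function_tool(name: str, description: str, params: dict) -> dict:
    """Build a function-tool definition; params maps parameter name -> description."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    key: {"type": "string", "description": text}
                    for key, text in params.items()
                },
                "required": list(params),
            },
        },
    }

example_tool = make_function_tool(
    "record_unknown_question",
    "Tool for recording an unknown question for future reference",
    {"question": "The question that the user asked"},
)
print(example_tool["function"]["parameters"]["required"])
```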

In [None]:
# Define functions to run the tools

def record_unknown_question(question: str):
    return send_email(subject="Unknown Question", html_body=question)

def record_user_details(name: str, email: str):
    return send_email(subject="New User Details", html_body=f"Name: {name}, Email: {email}")

In [None]:
# Update the prompt and ask LLM to use tools this time.

name = "Murtaza Khan"

system_prompt = f"""
You are acting as {name}. You are answering questions on {name}'s website.
You should answer questions related to {name}'s career, background, skills and experience.
Your responsibility is to represent {name} for interactions on the website as faithfully as possible.
You are given a summary of {name}'s background and LinkedIn profile which you can use to answer questions.
Be professional and engaging, as if talking to a potential client or future employer who came across the website.
If you don't know the answer to a question, use the record_unknown_question tool to record it, even if the question is trivial or unrelated to {name}'s career.
If the user is engaging in discussion, try to steer them towards getting in touch via email; ask for their name and email and record them using the record_user_details tool.

Summary:
{summary}

LinkedIn Profile:
{linkedin}
"""
system_message = {"role": "system", "content": system_prompt}

In [None]:
# Recreate the answer_query function to handle tool calls

def answer_query(user_queries):
    messages = [system_message] + user_queries

    # Keep calling the model until it returns a normal answer instead of tool calls
    while True:
        response = openai_client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=tools,
        )
        choice = response.choices[0]

        if choice.finish_reason != "tool_calls":
            return choice.message.content

        # The assistant message containing the tool calls must precede the tool results
        messages.append(choice.message)
        for tool_call in choice.message.tool_calls:
            arguments = json.loads(tool_call.function.arguments)
            if tool_call.function.name == "record_unknown_question":
                result = record_unknown_question(arguments["question"])
            elif tool_call.function.name == "record_user_details":
                result = record_user_details(arguments["name"], arguments["email"])
            else:
                # Every tool call needs a reply, even one the code does not recognize
                result = {"status": "error", "message": f"Unknown tool: {tool_call.function.name}"}
            messages.append({"role": "tool", "content": json.dumps(result), "tool_call_id": tool_call.id})
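
As more tools are added, the `if`/`elif` chain above grows with them. One common alternative is a dispatch table that maps tool names to Python functions. The sketch below uses stand-in functions (prefixed `demo_`) and a hypothetical `run_tool` helper. It returns an error result for unknown tool names or malformed arguments so that every tool call still gets a reply.

```python
import json

def demo_record_unknown_question(question: str) -> dict:
    # Stand-in for the notebook function that sends an email
    return {"status": "success", "recorded": question}

def demo_record_user_details(name: str, email: str) -> dict:
    # Stand-in for the notebook function that sends an email
    return {"status": "success", "recorded": f"{name} <{email}>"}

# Map each tool name (as declared in the tool definitions) to its function
TOOL_FUNCTIONS = {
    "record_unknown_question": demo_record_unknown_question,
    "record_user_details": demo_record_user_details,
}

def run_tool(name: str, arguments_json: str) -> dict:
    """Look up the tool by name, parse its JSON arguments, and call it."""
    func = TOOL_FUNCTIONS.get(name)
    if func is None:
        return {"status": "error", "message": f"Unknown tool: {name}"}
    try:
        arguments = json.loads(arguments_json)
        return func(**arguments)
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON, or arguments that do not match the function signature
        return {"status": "error", "message": str(exc)}

print(run_tool("record_unknown_question", '{"question": "Do you play chess?"}'))
print(run_tool("no_such_tool", "{}"))
```

In the loop, each `tool_call` would then be handled with `run_tool(tool_call.function.name, tool_call.function.arguments)`, with the real functions in the table instead of the stand-ins.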

In [None]:
# Define gradio chat function again

def chat(message, history):
    answer = answer_query(history + [{"role": "user", "content": message}])
    return answer

In [None]:
# Launch chat interface

gr.ChatInterface(chat, type="messages").launch()