# Welcome to RAG week!!

## Expert Knowledge Worker

### A question answering Assistant that is an expert knowledge worker
### To be used by employees of Insurellm, an Insurance Tech company
### The AI assistant needs to be accurate and the solution should be low cost.

This project will use RAG (Retrieval Augmented Generation) to ensure our question-answering assistant has high accuracy.

This first implementation will use a simplistic, brute-force type of RAG.
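
As a rough illustration of what "brute-force" means here, the sketch below looks for whole words of the question in a small dictionary and pastes any matching text into the system prompt. The names `toy_knowledge`, `retrieve`, and `build_system_prompt`, along with the sample data, are hypothetical and exist only for this illustration; the notebook's own version is built up in the cells that follow.

```python
# Toy illustration of brute-force RAG: retrieve by exact keyword, then augment the prompt.
# All names and data here are hypothetical placeholders.
toy_knowledge = {
    "refunds": "Refunds are processed within 14 days.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
}

def retrieve(question):
    """Return every stored document whose key appears as a word in the question."""
    words = question.lower().replace("?", " ").split()
    return [toy_knowledge[word] for word in words if word in toy_knowledge]

def build_system_prompt(question):
    """Prepend any retrieved documents to a base instruction."""
    context = "\n".join(retrieve(question)) or "No additional context found."
    return "Answer briefly using this context:\n" + context

print(build_system_prompt("How do refunds work?"))
```

There is no embedding model or vector store involved: retrieval is a dictionary lookup, which is cheap but only works when the question contains an exact key.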

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business applications of this week's projects</h2>
            <span style="color:#181;">RAG is perhaps the most immediately applicable technique of anything that we cover in the course! In fact, there are commercial products that do precisely what we build this week: nuanced querying across large databases of information, such as company contracts or product specs. RAG gives you a quick-to-market, low cost mechanism for adapting an LLM to your business area.</span>
        </td>
    </tr>
</table>

In [1]:
import os
import glob
from dotenv import load_dotenv
from pathlib import Path
import gradio as gr
from openai import OpenAI

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h1 style="color:#900;">Important Note</h1>
            <span style="color:#900;">
            This lab, and all the labs for Week 5, have been updated to use LangChain 1.0. This is intended to be reviewed with the new video series as of November 2025. If you're reviewing the older video series, then please consider doing <code>git checkout original</code> to revert to the prior code, then later <code>git checkout main</code> to get back to the new code. I have a really exciting week ahead, with evals and Advanced RAG!
            </span>
        </td>
    </tr>
</table>

In [2]:
# Setting up

load_dotenv(override=True)
openai_api_key = os.getenv('kimi_k2_api_key')
if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")

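MODEL = "qwen/qwen3-next-80b-a3b-instruct"
# Pass the key explicitly; with no arguments the client only looks for OPENAI_API_KEY.
# A non-OpenAI provider also needs its endpoint, e.g. via the OPENAI_BASE_URL environment variable.
openai = OpenAI(api_key=openai_api_key)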

OpenAI API Key exists and begins nvapi-JH


### Let's read all the knowledge-base files into a dictionary

In [3]:
knowledge = {}

filenames = glob.glob("knowledge-base/about/*")

for filename in filenames:
    name = Path(filename).stem.split(' ')[-1]  # .stem drops the extension; keep the last space-separated word as the key
    with open(filename, "r", encoding="utf-8") as f:
        knowledge[name.lower()] = f.read()

In [4]:
knowledge

{'bio': "# Professional Bio\n\nAyush Tyagi is a dedicated Computer Science Engineering student at JIMS, Greater Noida (IP University), on track to graduate in 2025. With a strong foundation built during his PCM schooling at Vivekanand School, Anand Vihar, he has channeled his analytical mindset into a passion for technology. \n\nAyush thrives on the creative process of front-end development, specializing in building visually stunning, user-friendly, and innovative web applications. He is a proactive learner, constantly seeking new challenges to expand his skill set and create digital experiences that are both functional and beautiful.\n\n## What Makes Me Unique\nMy uniqueness lies in the powerful combination of a creative eye and a logical, problem-solving mindset. I don't just write code; I craft experiences. I have a natural strength for designing interfaces that are not only aesthetically amazing but also intuitive and engaging. This allows me to bridge the gap between technical fun

In [5]:
knowledge["bio"]

"# Professional Bio\n\nAyush Tyagi is a dedicated Computer Science Engineering student at JIMS, Greater Noida (IP University), on track to graduate in 2025. With a strong foundation built during his PCM schooling at Vivekanand School, Anand Vihar, he has channeled his analytical mindset into a passion for technology. \n\nAyush thrives on the creative process of front-end development, specializing in building visually stunning, user-friendly, and innovative web applications. He is a proactive learner, constantly seeking new challenges to expand his skill set and create digital experiences that are both functional and beautiful.\n\n## What Makes Me Unique\nMy uniqueness lies in the powerful combination of a creative eye and a logical, problem-solving mindset. I don't just write code; I craft experiences. I have a natural strength for designing interfaces that are not only aesthetically amazing but also intuitive and engaging. This allows me to bridge the gap between technical functionali

In [6]:
filenames = glob.glob("knowledge-base/skills/*")

for filename in filenames:
    name = Path(filename).stem
    with open(filename, "r", encoding="utf-8") as f:
        knowledge[name.lower()] = f.read()

In [7]:
knowledge.keys()

dict_keys(['bio', 'contact_links', 'personal_details', 'tagline', 'ai_llm', 'backend', 'frontend', 'game_engines', 'languages', 'tools'])
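
The two loading cells above repeat the same loop, so they could be folded into one helper. The sketch below is one possible way to do that; `load_folder` is a hypothetical name, and it keys each entry by the whole lowercased file stem (as the skills cell does) rather than by the last word of the stem (as the about cell does). A temporary directory stands in for the real `knowledge-base` folders so the example runs on its own.

```python
import tempfile
from pathlib import Path

def load_folder(folder, knowledge=None):
    """Read every file in `folder` into a dict keyed by the lowercased file stem."""
    knowledge = {} if knowledge is None else knowledge
    for path in sorted(Path(folder).glob("*")):
        if path.is_file():
            knowledge[path.stem.lower()] = path.read_text(encoding="utf-8")
    return knowledge

# Demo with a throwaway folder (hypothetical file name and content)
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Frontend.txt").write_text("React, CSS", encoding="utf-8")
    demo = load_folder(tmp)

print(demo)  # {'frontend': 'React, CSS'}
```

Passing an existing dictionary as `knowledge` lets several folders accumulate into the same lookup table.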

In [8]:
SYSTEM_PREFIX = """
You are Ayush Tyagi's personal assistant chatbot.
You are an expert in answering questions about Ayush Tyagi: his projects, skills, and experiences.
You are provided with additional context that might be relevant to the user's question.
Give brief, accurate answers. If you don't know the answer, say so.

Relevant context:
"""

In [9]:
# Simple RAG retrieval: check whether any word in the message is a key in `knowledge`
def get_relevant_context_simple(message):
    # Keep only letters and whitespace (a regex would be a tidier way to do this)
    text = ''.join(ch for ch in message if ch.isalpha() or ch.isspace())
    words = text.lower().split()  # split the text into a list of words
    relevant_context = []
    for word in words:
        if word in knowledge:
            relevant_context.append(knowledge[word])
    return relevant_context
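
The comment in the cell above mentions regex as a tidier option. A minimal sketch of that idea follows, assuming ASCII-only input; `tokenize`, `lookup`, and `demo_knowledge` are hypothetical names used for illustration. One detail worth noting: the character filter above drops underscores, so keys such as `personal_details` can never match a word, whereas the pattern here keeps underscores.

```python
import re

def tokenize(message):
    """Lowercase the message and return runs of ASCII letters and underscores."""
    return re.findall(r"[a-z_]+", message.lower())

def lookup(message, knowledge):
    """Return the stored text for every token that is a key in `knowledge`."""
    return [knowledge[word] for word in tokenize(message) if word in knowledge]

# Hypothetical stand-in for the notebook's `knowledge` dictionary
demo_knowledge = {"bio": "Bio text", "personal_details": "Details text"}

print(tokenize("Show me the bio and personal_details!"))
print(lookup("Show me the bio and personal_details!", demo_knowledge))
```

Unlike `str.isalpha`, the `[a-z_]` character class ignores accented and non-Latin letters, so this version trades Unicode coverage for brevity.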

## A shorter, hard-coded way:

In [10]:
def get_relevant_context(message):
    message = message.lower()
    if "ayush" in message:
        return [knowledge["bio"], knowledge["personal_details"]]
    return []


In [11]:
get_relevant_context("Whhat is ayush ?")

["# Professional Bio\n\nAyush Tyagi is a dedicated Computer Science Engineering student at JIMS, Greater Noida (IP University), on track to graduate in 2025. With a strong foundation built during his PCM schooling at Vivekanand School, Anand Vihar, he has channeled his analytical mindset into a passion for technology. \n\nAyush thrives on the creative process of front-end development, specializing in building visually stunning, user-friendly, and innovative web applications. He is a proactive learner, constantly seeking new challenges to expand his skill set and create digital experiences that are both functional and beautiful.\n\n## What Makes Me Unique\nMy uniqueness lies in the powerful combination of a creative eye and a logical, problem-solving mindset. I don't just write code; I craft experiences. I have a natural strength for designing interfaces that are not only aesthetically amazing but also intuitive and engaging. This allows me to bridge the gap between technical functional

In [12]:
get_relevant_context("Who is ayush ")

["# Professional Bio\n\nAyush Tyagi is a dedicated Computer Science Engineering student at JIMS, Greater Noida (IP University), on track to graduate in 2025. With a strong foundation built during his PCM schooling at Vivekanand School, Anand Vihar, he has channeled his analytical mindset into a passion for technology. \n\nAyush thrives on the creative process of front-end development, specializing in building visually stunning, user-friendly, and innovative web applications. He is a proactive learner, constantly seeking new challenges to expand his skill set and create digital experiences that are both functional and beautiful.\n\n## What Makes Me Unique\nMy uniqueness lies in the powerful combination of a creative eye and a logical, problem-solving mindset. I don't just write code; I craft experiences. I have a natural strength for designing interfaces that are not only aesthetically amazing but also intuitive and engaging. This allows me to bridge the gap between technical functional

In [13]:
# Next we need a better retrieval step that can handle noisier, fuzzier input

In [14]:
keyword_map = {
    "ayush": ["bio", "personal_details"],
}

def get_relevant_context(message):
    words = ''.join(ch if ch.isalpha() or ch.isspace() else ' ' for ch in message).lower().split()
    relevant_context = []
    for word in words:
        if word in keyword_map:
            for key in keyword_map[word]:
                relevant_context.append(knowledge[key])
    return relevant_context
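
The keyword map above still needs an exact word match, so a typo such as "Ayussh" retrieves nothing, and a keyword repeated in the message adds the same document twice. The sketch below is one possible extension, not the notebook's method: it uses the standard library's `difflib.get_close_matches` to tolerate small typos and a `seen` set to skip duplicates. The function name, the `cutoff` of 0.8, and the demo dictionaries are assumptions made for illustration.

```python
import difflib

# Hypothetical stand-ins for the notebook's `keyword_map` and `knowledge`
demo_keyword_map = {"ayush": ["bio", "personal_details"]}
demo_knowledge = {"bio": "Bio text", "personal_details": "Details text"}

def get_relevant_context_fuzzy(message, keyword_map, knowledge, cutoff=0.8):
    """Like the exact-match version, but tolerates small typos in keywords."""
    cleaned = "".join(ch if ch.isalpha() or ch.isspace() else " " for ch in message)
    words = cleaned.lower().split()
    seen = set()
    relevant_context = []
    for word in words:
        # Returns at most one keyword whose similarity to `word` is >= cutoff
        for keyword in difflib.get_close_matches(word, list(keyword_map), n=1, cutoff=cutoff):
            for key in keyword_map[keyword]:
                if key not in seen:  # avoid adding the same document twice
                    seen.add(key)
                    relevant_context.append(knowledge[key])
    return relevant_context

print(get_relevant_context_fuzzy("Who is Ayussh?", demo_keyword_map, demo_knowledge))
```

A lower `cutoff` catches more typos but also raises the chance of matching an unrelated word, so the value is worth tuning against real questions.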

In [15]:
def additional_context(message):
    relevant_context = get_relevant_context(message)
    if not relevant_context:
        result = "There is no additional context relevant to the user's question."
    else:
        result = "The following additional context might be relevant in answering the user's question:\n\n"
        result += "\n\n".join(relevant_context)
    return result

In [16]:
print(additional_context("Who is ayush?"))
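# Limitation: only names listed in keyword_map are found, so a question about "priya" gets no context,
# and looser matching could confuse similar names (e.g. return Priya's data for "Riya Sharma")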

The following additional context might be relevant in answering the user's question:

# Professional Bio

Ayush Tyagi is a dedicated Computer Science Engineering student at JIMS, Greater Noida (IP University), on track to graduate in 2025. With a strong foundation built during his PCM schooling at Vivekanand School, Anand Vihar, he has channeled his analytical mindset into a passion for technology. 

Ayush thrives on the creative process of front-end development, specializing in building visually stunning, user-friendly, and innovative web applications. He is a proactive learner, constantly seeking new challenges to expand his skill set and create digital experiences that are both functional and beautiful.

## What Makes Me Unique
My uniqueness lies in the powerful combination of a creative eye and a logical, problem-solving mindset. I don't just write code; I craft experiences. I have a natural strength for designing interfaces that are not only aesthetically amazing but also intuitiv

In [17]:
def chat(message, history):
    # Defines the chat function that takes the new user 'message' and the conversation 'history' as input
    
    # 1. Constructs a dynamic system message for the AI
    #    - Starts with the predefined constant 'SYSTEM_PREFIX' (the assistant's role and instructions)
    #    - Appends 'additional_context(message)', which retrieves context relevant to the current user input
    system_message = SYSTEM_PREFIX + additional_context(message)
    
    # 2. Builds the complete list of messages for the API call
    #    - Starts with the 'system_message' to set the AI's behavior/context
    #    - Inserts the existing 'history' (a list of previous user/assistant turns)
    #    - Appends the current user 'message' as the final entry
    messages = [{"role": "system", "content": system_message}] + history + [{"role": "user", "content": message}]
    
    # 3. Calls the Chat Completions API through the OpenAI client
    #    - Uses the specified 'MODEL' (here a Qwen model)
    #    - Sends the compiled 'messages' list for the model to process
    #    - Stores the API response object
    response = openai.chat.completions.create(model=MODEL, messages=messages)
    
    # 4. Extracts and returns the text content of the AI's response
    #    - Accesses the first (and usually only) choice in the response
    #    - Gets the 'message' object from that choice
    #    - Retrieves the actual text 'content' from the message
    return response.choices[0].message.content
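
To see what the model actually receives, the message assembly from `chat` can be pulled into a small pure function and inspected without calling any API. This is only an illustrative sketch: `build_messages`, `demo_history`, and the sample strings are hypothetical, and the history is assumed to be a list of `{"role", "content"}` dictionaries as used with `type="messages"`.

```python
def build_messages(system_prefix, context, history, message):
    """Assemble the message list in the same order as the chat function."""
    system_message = system_prefix + context
    return (
        [{"role": "system", "content": system_message}]
        + history
        + [{"role": "user", "content": message}]
    )

# Hypothetical earlier turns of a conversation
demo_history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]

demo_messages = build_messages(
    "You are a helpful assistant.\n", "No extra context.", demo_history, "Who is Ayush?"
)
for entry in demo_messages:
    print(entry["role"], "->", entry["content"])
```

Because the system message is rebuilt on every call, the retrieved context always reflects the latest user message, while earlier retrieved context is not carried forward in `history`.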

## Now we will bring this up in Gradio using the Chat interface

A quick and easy way to prototype a chat with an LLM

In [18]:
view = gr.ChatInterface(chat, type="messages").launch(inbrowser=True)

* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.


In [19]:
# It can answer "Who is Priya?" only because of the conversation history.
# In a new chat, the same question would fail.
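
A related gap is that retrieval only looks at the current message, so a follow-up like "What are his skills?" contains no keyword and retrieves nothing. One possible mitigation, sketched below under the assumption that `history` is a list of `{"role", "content"}` dictionaries, is to build the retrieval query from the last few user turns as well. `retrieval_query` and `max_turns` are hypothetical names, not part of the notebook.

```python
def retrieval_query(message, history, max_turns=3):
    """Combine the latest message with the most recent user turns from history."""
    previous = [
        turn["content"]
        for turn in history
        if turn["role"] == "user" and isinstance(turn["content"], str)
    ]
    recent = previous[-max_turns:] if max_turns > 0 else []
    return " ".join(recent + [message])

# Hypothetical earlier turns of a conversation
demo_history = [
    {"role": "user", "content": "Who is Ayush?"},
    {"role": "assistant", "content": "Ayush is a Computer Science Engineering student."},
]

print(retrieval_query("What are his skills?", demo_history))
```

The combined string could then be passed to the keyword lookup in place of the raw message. This does not help in a brand new chat, where there is no history to draw on and the question itself must contain a known keyword.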