## Expert Knowledge Worker

### A question answering agent that is an expert knowledge worker
### To be used by employees of Insurellm, an Insurance Tech company
### The agent needs to be accurate and the solution should be low cost.

This project will use RAG (Retrieval Augmented Generation) to ensure our question-answering assistant has high accuracy.

This first implementation will use a simple, brute-force type of RAG.

### Sidenote: Business applications of this week's projects

RAG is perhaps the most immediately applicable technique of anything that we cover in the course! In fact, there are commercial products that do precisely what we build this week: nuanced querying across large databases of information, such as company contracts or product specs. RAG gives you a quick-to-market, low-cost mechanism for adapting an LLM to your business area.

In [1]:
# imports

import os
import glob
from dotenv import load_dotenv
import gradio as gr
from openai import OpenAI

In [2]:
# price is a factor for our company, so we're going to use a low cost model

MODEL = "moonshotai/kimi-k2-instruct-0905"

In [3]:
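# Load environment variables from a file called .env

load_dotenv(override=True)
# os.getenv returns None when the key is missing, and assigning None to os.environ
# raises a TypeError, so fall back to an empty string
os.environ['kimi_k2_api_key'] = os.getenv('kimi_k2_api_key', '')
# OpenAI() with no arguments reads OPENAI_API_KEY (and OPENAI_BASE_URL, if set) from the environment
openai = OpenAI()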
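
A missing key is easier to diagnose when it fails early with a clear message. A minimal sketch, where `require_env` is a hypothetical helper and not part of the notebook:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set; check your .env file")
    return value


# Usage (commented out so the snippet runs without a real key):
# api_key = require_env("kimi_k2_api_key")
```

Calling it right after `load_dotenv` would surface a missing or misspelled entry in `.env` before any request is made.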

In [4]:
context = {}  # dictionary to store files

about_files = glob.glob("knowledge-base/about/*")

for file_path in about_files:
    # extract filename without extension (works for .md, .txt, or any other suffix)
    filename = os.path.splitext(os.path.basename(file_path))[0]
    with open(file_path, "r", encoding="utf-8") as f:
        doc = f.read()
    context[filename] = doc

# ✅ Print what’s inside the context dictionary
for key, value in context.items():
    print("\n============================")
    print("FILE NAME:", key)
    print("----------------------------")
    print(value[:300], "...")  # print first 300 chars


FILE NAME: bio
----------------------------
# Professional Bio

Ayush Tyagi is a dedicated Computer Science Engineering student at JIMS, Greater Noida (IP University), on track to graduate in 2025. With a strong foundation built during his PCM schooling at Vivekanand School, Anand Vihar, he has channeled his analytical mindset into a passion  ...

FILE NAME: contact_links
----------------------------
# Contact & Professional Links

## Primary Contact
**Email:** tyagiayush239@gmail.com

## Professional Profiles
- **LinkedIn:** [linkedin.com/in/ayush-tyagi-0a3694267](https://www.linkedin.com/in/ayush-tyagi-0a3694267)
- **GitHub:** [github.com/AyushTyagi239](https://github.com/AyushTyagi239)

## Pr ...

FILE NAME: personal_details
----------------------------
# Personal Details

**Name:** Ayush Tyagi  
**Email:** tyagiayush239@gmail.com  
**Location:** Delhi NCR, India  
**Status:** Computer Science Engineering Student  
**Institution:** JIMS, Greater Noida (IP University)  
**Graduation

In [5]:
context.keys()

dict_keys(['bio', 'contact_links', 'personal_details', 'tagline'])

In [8]:
context = {}

# Load ALL files inside all folders
files = glob.glob("knowledge-base/**/*", recursive=True)

for file_path in files:
    # skip directories
    if os.path.isdir(file_path):
        continue

    # extract filename (without extension)
    name = os.path.splitext(os.path.basename(file_path))[0]

    # read file
    with open(file_path, "r", encoding="utf-8") as f:
        doc = f.read()

    context[name] = doc

# Debug print
print("\n===== CONTEXT LOADED =====")
for key, value in context.items():
    print(f"\nFILE NAME: {key}")
    print("----------------------------")
    print(value[:300] + "...\n")


===== CONTEXT LOADED =====

FILE NAME: bio
----------------------------
# Professional Bio

Ayush Tyagi is a dedicated Computer Science Engineering student at JIMS, Greater Noida (IP University), on track to graduate in 2025. With a strong foundation built during his PCM schooling at Vivekanand School, Anand Vihar, he has channeled his analytical mindset into a passion ...


FILE NAME: contact_links
----------------------------
# Contact & Professional Links

## Primary Contact
**Email:** tyagiayush239@gmail.com

## Professional Profiles
- **LinkedIn:** [linkedin.com/in/ayush-tyagi-0a3694267](https://www.linkedin.com/in/ayush-tyagi-0a3694267)
- **GitHub:** [github.com/AyushTyagi239](https://github.com/AyushTyagi239)

## Pr...


FILE NAME: personal_details
----------------------------
# Personal Details

**Name:** Ayush Tyagi  
**Email:** tyagiayush239@gmail.com  
**Location:** Delhi NCR, India  
**Status:** Computer Science Engineering Student  
**Institution:** JIMS, Greater Noida (I

In [9]:
context.keys()

dict_keys(['bio', 'contact_links', 'personal_details', 'tagline', 'certifications', 'college', 'school', 'intensity_global', 'tara_application_internship', 'achievements', 'future_goals', 'interests', 'code_converter', 'multimodal_assistant', 'smart_deal_notifier', 'car_simulator', 'mountain_rider', 'soldierio', 'space_shooter_ranger', 'foodies_hub', 'food_website_react', 'investment_calculator', 'react_tictactoe', 'timerace', 'ai_llm', 'backend', 'frontend', 'game_engines', 'languages', 'tools'])
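
Keying documents by bare file name means two files with the same name in different folders would silently overwrite each other. A minimal sketch of one alternative, keying by the path relative to the knowledge-base root; `load_context` and the demo folder names are hypothetical, chosen only for illustration:

```python
import tempfile
from pathlib import Path


def load_context(root: str) -> dict[str, str]:
    """Read every file under root, keyed by its path relative to root (without extension)."""
    base = Path(root)
    loaded = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            key = path.relative_to(base).with_suffix("").as_posix()
            loaded[key] = path.read_text(encoding="utf-8")
    return loaded


# Demo on a throwaway folder with two files that share a base name
with tempfile.TemporaryDirectory() as tmp:
    for folder in ("about", "projects"):
        Path(tmp, folder).mkdir()
        Path(tmp, folder, "overview.md").write_text(f"# {folder}", encoding="utf-8")
    demo = load_context(tmp)

print(sorted(demo))  # ['about/overview', 'projects/overview']
```

The trade-off is that keys such as `about/bio` no longer appear verbatim in a user's question, so a lookup that matches on the key would need to compare against the last path component instead.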

In [10]:
system_message = """
You are Ayush Tyagi’s personal professional assistant AI. 
Your job is to give accurate, helpful answers about Ayush’s work, projects, skills, 
education, experience, and professional background.

Use only the information provided in the context/knowledge-base. 
If the answer is not present in the context, clearly say that the information is not available.

Do NOT guess, fabricate details, or assume anything beyond the given context.

Maintain a friendly, clear, concise, and professional tone.
"""


In [12]:
def get_relevant_context(message):
    relevant_context = []
    for context_title, context_details in context.items():
        if context_title.lower() in message.lower():
            relevant_context.append(context_details)
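    return relevant_context
    # This is very basic and brittle: it only matches when a file name (e.g. 'bio') appears
    # verbatim in the message, so a key like 'contact_links' never matches "contact links",
    # and a question that names no file returns nothing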

In [14]:
print(context.keys())


dict_keys(['bio', 'contact_links', 'personal_details', 'tagline', 'certifications', 'college', 'school', 'intensity_global', 'tara_application_internship', 'achievements', 'future_goals', 'interests', 'code_converter', 'multimodal_assistant', 'smart_deal_notifier', 'car_simulator', 'mountain_rider', 'soldierio', 'space_shooter_ranger', 'foodies_hub', 'food_website_react', 'investment_calculator', 'react_tictactoe', 'timerace', 'ai_llm', 'backend', 'frontend', 'game_engines', 'languages', 'tools'])


In [15]:
get_relevant_context("Who is ayush")

[]
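
The empty list above shows the limitation: no file name appears in the question, so nothing is retrieved. A slightly more forgiving variant could match on individual words instead of the whole key. This is a minimal sketch, not the notebook's method; `tokenize`, `get_relevant_context_by_words`, and `sample_context` are hypothetical names introduced for illustration:

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def get_relevant_context_by_words(message: str, context: dict[str, str]) -> list[str]:
    """Return documents whose key shares at least one word with the message."""
    message_words = tokenize(message)
    matches = []
    for title, details in context.items():
        # 'contact_links' becomes {'contact', 'links'}
        if tokenize(title) & message_words:
            matches.append(details)
    return matches


# Hypothetical miniature knowledge base, for illustration only
sample_context = {
    "bio": "Bio text",
    "contact_links": "Contact text",
    "future_goals": "Goals text",
}

print(get_relevant_context_by_words("What are his contact details?", sample_context))
# ['Contact text']
```

This still returns nothing for "Who is ayush", because no key contains any of those words; it only relaxes the exact-substring requirement.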

In [16]:
def add_context(message):
    relevant_context = get_relevant_context(message)
    if relevant_context:
        message += "\n\nThe following additional context might be relevant in answering this question:\n\n"
        for relevant in relevant_context:
            message += relevant + "\n\n"
    return message
    # This relies entirely on the string lookup above, so it adds nothing when the question does not name a file

In [None]:
print(add_context("Who is Ayush?"))

Who is Ayush?


In [18]:
def chat(message, history):
    messages = [{"role": "system", "content": system_message}] + history  # system prompt + prior turns
    message = add_context(message)  # append any matching knowledge-base documents to the question
    messages.append({"role": "user", "content": message})  # the augmented user message

    # Send the whole conversation to the model and stream the response piece by piece
    stream = openai.chat.completions.create(model=MODEL, messages=messages, stream=True)
    response = ""
    for chunk in stream:
        response += chunk.choices[0].delta.content or ''
        yield response
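
To see what is sent to the model on each turn without making an API call, the message assembly can be pulled out on its own. A minimal sketch, assuming the history arrives as a list of role/content dictionaries, as `type="messages"` suggests; `build_messages` and the demo values are hypothetical:

```python
def build_messages(system_message: str, history: list[dict], user_message: str) -> list[dict]:
    """Assemble the message list: system prompt, prior turns, then the new user turn."""
    messages = [{"role": "system", "content": system_message}]
    messages.extend(history)
    messages.append({"role": "user", "content": user_message})
    return messages


# Hypothetical one-turn history in the role/content format
demo_history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]
demo_messages = build_messages("You are a helpful assistant.", demo_history, "Tell me about the bio")
print([m["role"] for m in demo_messages])
# ['system', 'user', 'assistant', 'user']
```

Because the full history is resent every turn, the prompt grows with the conversation, which is worth keeping in mind given the low-cost goal.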

## Now we will bring this up in Gradio using the Chat interface

A quick and easy way to prototype a chat with an LLM

In [19]:
view = gr.ChatInterface(chat, type="messages").launch()

* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.
