<a href="https://colab.research.google.com/github/khushi-1909/ragqa-bot/blob/main/RAG_QA_BOT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 👩‍💻 RAG QA Bot for Business Operations – Interactive Demo

This notebook shows how to build a question-answering bot for any business using the OpenAI and Pinecone Python SDKs.
The goal is to make it easy for anyone in the company to get instant, reliable answers from your policy documents, HR handbook, or internal FAQs without searching through endless PDFs or Slack threads.

---

### How It Works

1. First, the relevant documents are converted into vectors (numerical fingerprints of their meaning).
2. When a question is asked, it is also converted into a vector, and Pinecone finds the documents whose vectors are most similar.
3. Finally, an AI model (such as GPT) writes a natural-sounding answer using only the information in the retrieved documents.
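
The retrieval step above can be illustrated without any external service. The following is a minimal sketch, assuming hand-made 3-dimensional vectors in place of real embeddings; the labels, the numbers, and the helpers `cosine_similarity` and `retrieve` are hypothetical and are not part of this notebook's pipeline.

```python
import numpy as np

# Hypothetical 3-dimensional "embeddings"; real OpenAI embeddings have 1536 dimensions.
doc_vectors = {
    "leave policy": np.array([0.9, 0.1, 0.0]),
    "password reset": np.array([0.0, 0.2, 0.9]),
    "remote work": np.array([0.1, 0.9, 0.1]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec, k=1):
    """Return the k document labels most similar to the query vector."""
    ranked = sorted(
        doc_vectors,
        key=lambda name: cosine_similarity(query_vec, doc_vectors[name]),
        reverse=True,
    )
    return ranked[:k]

# Step 2: a made-up query vector that points roughly toward "leave policy".
query_vec = np.array([0.8, 0.2, 0.1])
top = retrieve(query_vec, k=1)

# Step 3 would pass the retrieved text to a language model; here we only print the label.
print(top)
```

Pinecone performs the same kind of nearest-neighbour ranking, only at scale and over stored vectors.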

---

### About Embeddings and Demo Mode

This notebook is designed to demonstrate a real Retrieval-Augmented Generation (RAG) pipeline for business QA, using OpenAI’s embeddings and Pinecone for semantic search.

However, due to OpenAI API quota limitations on my current account, I am unable to generate real embeddings for the documents and queries in this demo.

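Instead, I have used simulated (random) embeddings so that the pipeline, retrieval logic, and answer generation flow can still be demonstrated and reviewed. Because random vectors carry no meaning, the documents retrieved in the sample output below are essentially arbitrary and do not match the questions.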
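
To see why simulated embeddings cannot produce relevant matches, it may help to look at how similar random vectors are to one another. The sketch below assumes the same construction as the demo (uniform values in [0, 1) with 1536 dimensions) but uses its own random generator, so the exact numbers will differ from the notebook's.

```python
import numpy as np

rng = np.random.default_rng(42)

# Ten stand-in "document" vectors and one stand-in "query" vector,
# all drawn uniformly from [0, 1) like the simulated embeddings.
vectors = rng.random((10, 1536))
query = rng.random(1536)

# Cosine similarity between the query and every document vector.
sims = vectors @ query / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(query))
spread = float(sims.max() - sims.min())

print(f"min={sims.min():.3f} max={sims.max():.3f} spread={spread:.3f}")
```

All similarities should land in a narrow band, so the "best" match is decided by noise rather than by meaning.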

---

### In a production or fully working demo

With an active OpenAI API key and sufficient quota, the system would use the following line to generate each document’s semantic vector:
  ```python
  client.embeddings.create(input=[text], model="text-embedding-ada-002").data[0].embedding
  ```
  
Follow the instructions below to swap in real embeddings.


---

### To switch from demo to production
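1. Uncomment the real embedding function (`get_embedding`).
2. Add your OpenAI API key.
3. Comment out the simulated vectors (`fake_embeddings`) and use `vec=get_embedding(doc)` in the upsert loop.
4. Replace the simulated `rag_query` function with the commented-out version that calls OpenAI.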

Instructions for this are given in the code cells as well for clarity.
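
As an alternative to commenting code in and out, the switch could be wrapped in a single function. This is only a sketch under assumptions: the flag `USE_REAL_EMBEDDINGS` and the helper `embed` are hypothetical names that do not appear in the notebook, and `client` is assumed to be the `OpenAI` client created in the setup cell.

```python
import numpy as np

USE_REAL_EMBEDDINGS = False  # hypothetical flag, not part of the notebook

def embed(text, client=None, model="text-embedding-ada-002", dim=1536):
    """Return an embedding for `text`: real if enabled, otherwise simulated."""
    if USE_REAL_EMBEDDINGS and client is not None:
        # Same call the notebook shows for production use.
        return client.embeddings.create(input=[text], model=model).data[0].embedding
    # Simulated fallback: random values with no semantic meaning.
    return np.random.rand(dim).tolist()

vec = embed("How many paid leaves do I get in a year?")
print(len(vec))
```

With this shape, both the upsert loop and the query function could call `embed(...)`, and switching modes would mean changing one flag.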



In [None]:
#Install required libraries for OpenAI and Pinecone
!pip install openai pinecone --quiet

In [None]:
# Import the new OpenAI SDK, Pinecone SDK, and numpy for handling vectors
from openai import OpenAI
from pinecone import Pinecone, ServerlessSpec
import numpy as np

In [None]:
openai_api_key='Enter your OpenAI API key here'
pinecone_api_key='Enter your pinecone API key here'
pinecone_env='gcp-starter'

#Initialize Pinecone and OpenAI clients
pc=Pinecone(api_key=pinecone_api_key, environment=pinecone_env)
client=OpenAI(api_key=openai_api_key)

In [None]:
# Define the name for the Pinecone index
index_name='business------faq'
# Create the index if it doesn't already exist.
# list_indexes() returns index descriptions, so compare against their names.
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=1536,   #dimension for OpenAI embeddings
        metric='cosine',  #Similarity metric used
        spec=ServerlessSpec(
            cloud='aws',
            region='us-east-1'
            )
    )

#connect to the pinecone index
index=pc.Index(name=index_name)

In [None]:
# Example business FAQ documents, you can add your other documents here as well
documents=[
    "Our company offers 24 paid leaves per year to full-time employees.",
    "To reset your IT password, visit the internal portal and follow the instructions.",
    "The onboarding process includes orientation, HR documentation, and system access setup.",
    "Health insurance benefits cover employee, spouse, and two children.",
    "Reimbursement claims for travel must be submitted within 30 days.",
    "Remote work is allowed for up to 3 days a week with manager approval.",
    "For hardware issues, contact IT support through the helpdesk ticketing system.",
    "Performance reviews are conducted every 6 months by your direct manager.",
    "Company policy prohibits sharing confidential information with external parties.",
    "To request a training program, fill the form on the HR portal."
]
# For demonstration, since my OpenAI quota is limited,
# this generates a random vector for each document.
# Comment out the two lines below when switching to real embeddings.
np.random.seed(42)
fake_embeddings=[np.random.rand(1536).tolist() for _ in documents]

In [None]:
# The following function converts text into an embedding vector using OpenAI.
# Uncomment this function to switch to real embeddings.
"""def get_embedding(text, model="text-embedding-ada-002"):
   return client.embeddings.create(input = [text], model=model).data[0].embedding"""

# Prepare and upsert vectors for all documents into the Pinecone index.
vectors=[]
for i, doc in enumerate(documents):
  vec=fake_embeddings[i]      #replace with: vec=get_embedding(doc)
  vectors.append((f'doc{i}', vec, {'text':doc}))

index.upsert(vectors=vectors)

{'upserted_count': 10}

In [None]:

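# Simulated version: no OpenAI calls, so retrieval is not semantically meaningful.
def rag_query(query, k=2):
  # Seed from the query text so the same question gives the same vector within a session.
  # Note: Python randomizes str hashes per process, so results differ between sessions.
  np.random.seed(abs(hash(query))%(10**8))
  query_vec=np.random.rand(1536).tolist()
  # Retrieve the top-k nearest documents from Pinecone
  results=index.query(vector=query_vec, top_k=k, include_metadata=True)
  retrieved_docs=[match['metadata']['text'] for match in results['matches']]
  answer=f'Based on our company policy/rules: {retrieved_docs[0]}'
  return answer, retrieved_docs

# The real version below takes a question from the user, finds the most relevant documents,
# and asks GPT to answer using only those documents.
# Replace the function above with it to switch from simulated embeddings to real embeddings.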
"""
def rag_query(query, k=2):
  query_vec=get_embedding(query)
  # Retrieve top-k similar documents from Pinecone
  results=index.query(vector=query_vec, top_k=k, include_metadata=True)
  retrieved_docs=[match['metadata']['text'] for match in results['matches']]
  # the prompt for GPT to answer based on the retrieved docs only
  prompt=(
    f'Answer the question using only the information given below. \n'
    f'Information:\n {chr(10).join(retrieved_docs)} \n'
    f'Question: {query} \n'
    f'Answer:'
  )
  try:
    #get the answer from GPT 3.5
    completion=client.chat.completions.create(
      model='gpt-3.5-turbo',
      messages=[
        {'role':'user', 'content':prompt}
      ],
      max_tokens=100,
      temperature=0
    )
    answer=completion.choices[0].message.content.strip()
  except Exception as e:
    answer=f'Retrieved info: {retrieved_docs[0]}'
  return answer, retrieved_docs
"""

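The simulated `rag_query` seeds NumPy from Python's built-in `hash()`, which is randomized per process for strings, so the same question can retrieve different documents after a kernel restart. If reproducible demo output matters, one option is to derive the seed from a cryptographic digest instead. This is a minimal sketch; `stable_seed` and `simulated_query_vector` are hypothetical helpers, not part of the notebook.

```python
import hashlib
import numpy as np

def stable_seed(text):
    """Derive a 32-bit seed from the text that is identical in every Python session."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")

def simulated_query_vector(query, dim=1536):
    """Reproducible stand-in for a query embedding (it still carries no meaning)."""
    rng = np.random.default_rng(stable_seed(query))
    return rng.random(dim).tolist()

v1 = simulated_query_vector("How can I reset my IT password?")
v2 = simulated_query_vector("How can I reset my IT password?")
print(v1 == v2)
```

This only makes the simulated results repeatable; relevance still requires real embeddings.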

In [None]:
#example questions to see how the bot responds, you can add your own questions here as well
sample_questions=[
    "How many paid leaves do I get in a year?",
    "Can I work remotely every week?",
    "Who is covered under the company health insurance?",
    "How do I submit a reimbursement for travel?",
    "How can I reset my IT password?"
]

for q in sample_questions:
  print(f'Question: {q}')
  ans, ctx=rag_query(q)
  print(f'Answer: {ans} \n Context: {ctx} \n {"-" * 80} \n')

Question: How many paid leaves do I get in a year?
Answer: Based on our company policy/rules: For hardware issues, contact IT support through the helpdesk ticketing system. 
 Context: ['For hardware issues, contact IT support through the helpdesk ticketing system.', 'To reset your IT password, visit the internal portal and follow the instructions.'] 
 -------------------------------------------------------------------------------- 

Question: Can I work remotely every week?
Answer: Based on our company policy/rules: Performance reviews are conducted every 6 months by your direct manager. 
 Context: ['Performance reviews are conducted every 6 months by your direct manager.', 'Health insurance benefits cover employee, spouse, and two children.'] 
 -------------------------------------------------------------------------------- 

Question: Who is covered under the company health insurance?
Answer: Based on our company policy/rules: Remote work is allowed for up to 3 days a week with manag