<a href="https://colab.research.google.com/github/Abhiprameesh/Voice-AI-call-center-RAG_Models/blob/main/RAG_MODEL.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:

# 1. Installing the Required Dependencies
!pip install faiss-cpu sentence-transformers transformers pandas --quiet
import pandas as pd
import faiss
import os
import pickle
from sentence_transformers import SentenceTransformer
from transformers import pipeline
# 2. Config (Choose Backend)
USE_OPENAI = False  # Change to True if you want to use OpenAI API
# 3. Load Dataset
df = pd.read_csv("RAGCHAT1.csv")
print("Sample rows:")
print(df.head())
# Expected Columns: ["Intent (Category)", "User Question Variation", "Official Bot Response", "Source URL", "Notes"]
# 4. Embeddings + FAISS Index
embedder = SentenceTransformer('all-MiniLM-L6-v2')
embeddings = embedder.encode(df["User Question Variation"].tolist(), normalize_embeddings=True)
dimension = embeddings.shape[1]
index = faiss.IndexFlatIP(dimension)
index.add(embeddings)
print(f"FAISS index built with {index.ntotal} entries.")
# 5. Save / Load Pickle
def save_index(filename="rag_index.pkl"):
    with open(filename, "wb") as f:
        pickle.dump((df, index), f)
    print(f" Saved index to {filename}")

def load_index(filename="rag_index.pkl"):
    global df, index
    with open(filename, "rb") as f:
        df, index = pickle.load(f)
    print(f"Loaded index from {filename}")
# 6. Retrieval Function
def retrieve_answer(query, top_k=2):
    query_vec = embedder.encode([query], normalize_embeddings=True)
    scores, indices = index.search(query_vec, top_k)

    retrieved_data = []
    for idx, score in zip(indices[0], scores[0]):
        if idx != -1:
            row = df.iloc[idx]
            retrieved_data.append(f"- {row['Official Bot Response']} (Source: {row['Source URL']})")
    return "\n".join(retrieved_data)
# 7. Choose LLM Backend
if USE_OPENAI:
    from openai import OpenAI
    os.environ["OPENAI_API_KEY"] = "sk-..."  # replace with your key
    client = OpenAI()

    def respectful_bot(query):
        context = retrieve_answer(query, top_k=2)
        if not context.strip():
            return "üôè I don‚Äôt know the answer. Please check the BBMP official website."

        template = f"""
You are a respectful municipal chatbot.
Answer the user question ONLY using the context below.
If the answer is not in the context, reply exactly: "I don’t know the answer."

User Question: {query}
Context:
{context}
"""
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "You are a polite municipal assistant."},
                {"role": "user", "content": template}
            ],
            temperature=0
        )
        # The v1 OpenAI client returns message objects, so read .content as an attribute
        return response.choices[0].message.content

else:
    rag_llm = pipeline("text2text-generation", model="google/flan-t5-base", device=-1)

    def respectful_bot(query):
        context = retrieve_answer(query, top_k=2)
        if not context.strip():
            return "üôè I don‚Äôt know the answer. Please check the BBMP official website."

        prompt = f"""
You are a respectful municipal chatbot.
Answer the user question ONLY using the context below.
If the answer is not in the context, reply exactly: "I don’t know the answer."

User Question: {query}
Context: {context}
"""
        # Greedy decoding (do_sample=False), capped at 256 newly generated tokens
        result = rag_llm(prompt, max_new_tokens=256, do_sample=False)[0]['generated_text']
        return result
# 8. Test Queries
test_queries = [
    "How do I dispose hazardous waste like batteries, e-waste?",
    "How can I appeal against property tax penalties?",
    "What is the fastest way to complaint the BBMP about road damage?",
    "Who is the Prime Minister of India?"  # should say "I don’t know the answer."
]
for q in test_queries:
    print("User:", q)
    print("Bot:", respectful_bot(q))
    print("-" * 60)
# 9. Save Index (for later use)
save_index("RAG_index_1.pkl")


Sample rows:
      Intent (Category)                            User Question Variation  \
0  ask_garbage_schedule         What time is garbage collected in my area?   
1  ask_garbage_schedule      When does garbage pickup happen in my street?   
2   ask_collection_days        Which days is garbage collected in my area?   
3     ask_missed_pickup   My garbage was not picked up today—what do I do?   
4        ask_bulk_waste  How can I dispose of old furniture or bulky wa...   

                               Official Bot Response  \
0  As per the latest BBMP update effective from A...   
1  BBMP collects waste door-to-door beginning at ...   
2  BBMP arranges daily wet waste collection for m...   
3  We regret the inconvenience. Missed pickups mu...   
4  Bulk waste such as furniture, mattresses, and ...   

                                          Source URL  \
0  https://www.hindustantimes.com/cities/bengalur...   
1  https://www.goodreturns.in/news/bengaluru-wast...   
2  https://

Device set to use cpu


User: How do I dispose hazardous waste like batteries, e-waste?


Both `max_new_tokens` (=256) and `max_length`(=256) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


Bot: at designated collection centers or through authorized channels to prevent environmental contamination
------------------------------------------------------------
User: How can I appeal against property tax penalties?
Bot: Complete the appeal form on the BBMP portal with proper documentation explaining the grounds. Appeals are reviewed by officers; decisions communicated online. (Source: https://bbmptax.karnataka.gov.in/forms/Complaintrequest.aspx)
------------------------------------------------------------
User: What is the fastest way to complaint the BBMP about road damage?
Bot: The quickest way is to register your complaint on the BBMP’s dedicated “Fix Pothole” Android app or call the 1533 helpline. WhatsApp messaging to local ward numbers with location and photos also speeds up notification and response. (Source: https://play.google.com/store/apps/details?id=com.indigo.bbmp.fixpothole)
------------------------------------------------------------
User: Who is the Prime Minister of India?
Bot: I don’t know the answer.
------------------------------------------------------------
 Saved index to RAG_index_1.pkl
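
A note on the last test query: `index.search` returns the `top_k` nearest rows no matter how weak the match is, so `context` is rarely empty and the `if not context.strip()` guard seldom fires. The "I don’t know the answer." reply for the Prime Minister question therefore depends on the LLM following the prompt. One possible safeguard is to drop low-scoring hits before building the context. The sketch below is not part of the notebook: `filter_hits` and `MIN_SCORE` are hypothetical names, the 0.5 cutoff is an untuned assumption, and the scores are made up for illustration. Since the embeddings are normalized and the index is `IndexFlatIP`, the scores it would receive are cosine similarities.

```python
# Hypothetical cutoff; tune it on real queries before relying on it.
MIN_SCORE = 0.5


def filter_hits(indices, scores, min_score=MIN_SCORE):
    """Keep (row_index, score) pairs whose similarity clears min_score.

    `indices` and `scores` stand in for indices[0] and scores[0] as returned
    by index.search; -1 marks an empty slot, as in retrieve_answer.
    """
    return [
        (int(idx), float(score))
        for idx, score in zip(indices, scores)
        if idx != -1 and score >= min_score
    ]


# Made-up scores for illustration only, not output from the notebook.
on_topic = filter_hits([12, 40], [0.83, 0.61])
off_topic = filter_hits([7, 3], [0.21, 0.18])

print(on_topic)   # [(12, 0.83), (40, 0.61)]
print(off_topic)  # []
```

If something like this were wired into `retrieve_answer`, an off-topic query would yield an empty context and the existing fallback message would be returned without calling the LLM at all.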
