<span style="color: red; font-size: 20px;">Install Required Packages</span>

In [1]:
!pip install faiss-cpu
!pip install sentence-transformers
!pip install transformers accelerate sentencepiece
!pip install gradio
!pip install numpy
!pip install openai


<span style="color: red; font-size: 20px;">Set Up Groq Client</span>

In [2]:
import os
from openai import OpenAI

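# Set your Groq API key: "GROQ_API_KEY" below is only a placeholder.
# setdefault keeps a real key that is already exported in the environment.
os.environ.setdefault("GROQ_API_KEY", "GROQ_API_KEY")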

# Initialize Groq client
client = OpenAI(
    api_key=os.getenv("GROQ_API_KEY"),
    base_url="https://api.groq.com/openai/v1"
)

print("PHASE 1 DONE — Groq Client Connected Successfully!")


PHASE 1 DONE — Groq Client Connected Successfully!
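
The cell above only constructs the client object; no request is sent, so a missing or placeholder key is not detected at this point. A minimal sketch of failing fast before the client is created is shown here. The helper name `require_api_key` and the `PLACEHOLDER` constant are hypothetical and not part of the `openai` package or the Groq API.

```python
import os

PLACEHOLDER = "GROQ_API_KEY"  # assumption: the placeholder string used in the setup cell


def require_api_key(env=None, name="GROQ_API_KEY"):
    """Return the key stored under `name`, or raise if it is missing or still the placeholder."""
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(f"Set {name} to a real API key before creating the client.")
    return key
```

It could be used as `client = OpenAI(api_key=require_api_key(), base_url="https://api.groq.com/openai/v1")`, so a configuration mistake surfaces in this cell rather than at the first chat request.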


<span style="color: red; font-size: 20px;">PHASE 2 — Create sample TN Gov Dataset</span>

In [3]:
# ================================
# PHASE 2 — Create Dataset Folder & Sample TN Govt Data
# ================================

from pathlib import Path

DATA_DIR = Path("tn_docs")
DATA_DIR.mkdir(exist_ok=True)

sample_text = """
WIDOW PENSION:
Eligibility:
- Applicant must be a widow living in Tamil Nadu.
- Annual income must be below government limit.

How to Apply:
- Visit your Taluk Office.
- Submit Husband's Death Certificate, Aadhaar, Address Proof.
- Application reviewed by VAO and RI.
- Pension amount credited monthly.

COMMUNITY CERTIFICATE:
Process:
- Apply via e-Sevai Centre or Taluk Office.
- Submit Aadhaar, Address Proof, School TC.
- Fee: Rs. 60
- Certificates issued by Revenue Department.

DRIVING LICENSE RENEWAL:
Procedure:
- Apply via Parivahan Website or RTO.
- Required: Existing DL, Aadhaar, Address Proof.
- Pay renewal fee online.
- Collect renewed DL from RTO.

OLD AGE PENSION:
Eligibility:
- Age 60+
- No family support
- Must be a resident of Tamil Nadu
- Monthly pension amount: Rs. 1,000

DISABILITY CERTIFICATE:
Steps:
- Visit District HQ Hospital.
- Meet Medical Board.
- Submit medical records, Aadhaar, photos.

VOTER ID APPLICATION (FORM-6):

Eligibility:
- Must be 18+ years old.
- Must be a resident of the constituency.
- Must not be registered as a voter in any other location.

How to Apply:
1. Visit https://voters.eci.gov.in or use the Voter Helpline App.
2. Select "Form 6 – New Voter Registration".
3. Fill personal details: Name, DOB, Gender.
4. Upload documents:
   - Address proof (Aadhaar / Ration Card / Passport / Electricity Bill)
   - Age proof (Birth Certificate / SSLC Certificate)
5. Add family voter details (optional).
6. Submit application.

Offline Method:
1. Visit your nearest BLO (Booth Level Officer) / Taluk office.
2. Fill Form-6 manually.
3. Attach photocopies of address and age proof.
4. Submit to BLO.

Verification:
- BLO will visit your home or verify documents.
- After approval, Voter ID will be generated.

Download:
- Visit https://voters.eci.gov.in
- Select “Download e-EPIC”
"""

# Save sample data
(DATA_DIR / "tn_sample.txt").write_text(sample_text, encoding="utf-8")

print("PHASE 2 DONE — tn_docs folder created & sample data saved")


PHASE 2 DONE — tn_docs folder created & sample data saved


<span style="color: red; font-size: 20px;">PHASE 3 — Text Chunking for RAG</span>

In [4]:
def chunk_text(text, max_words=120):
    """
    Splits text into consecutive chunks of whitespace-separated words.
    max_words=120 means each chunk has at most 120 words (the last one may be shorter).
    """
    words = text.split()
    chunks = []

    for i in range(0, len(words), max_words):
        chunk = " ".join(words[i:i + max_words])
        chunks.append(chunk)

    return chunks


# Load all text files from tn_docs/
docs = []
sources = []

for file in DATA_DIR.glob("*.txt"):
    content = file.read_text(encoding="utf-8")
    chunks = chunk_text(content)

    docs.extend(chunks)
    sources.extend([file.name] * len(chunks))

print("PHASE 3 DONE — Chunking complete!")
print("Total Chunks Created:", len(docs))
print("Example Chunk:\n", docs[0][:300], "...")


PHASE 3 DONE — Chunking complete!
Total Chunks Created: 3
Example Chunk:
 WIDOW PENSION: Eligibility: - Applicant must be a widow living in Tamil Nadu. - Annual income must be below government limit. How to Apply: - Visit your Taluk Office. - Submit Husband's Death Certificate, Aadhaar, Address Proof. - Application reviewed by VAO and RI. - Pension amount credited monthly ...
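
Fixed-size chunks ignore section boundaries: the first chunk above runs from WIDOW PENSION into the other schemes. One common mitigation is to let consecutive chunks overlap, so text near a cut appears in both neighbours. The following is a minimal sketch of that idea, not the chunker used in this notebook; the function name `chunk_text_overlap` and the `overlap` parameter are assumptions made for illustration.

```python
def chunk_text_overlap(text, max_words=120, overlap=20):
    """Split text into chunks of at most max_words words, sharing `overlap` words between neighbours."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    chunks = []

    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the window already reached the end of the text

    return chunks
```

With `overlap=0` this reduces to the same consecutive split as `chunk_text`. Whether overlap improves retrieval on such a small dataset is untested here.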


<span style="color: red; font-size: 20px;">PHASE 4 — Embeddings (multilingual-e5-small)</span>


In [6]:


from sentence_transformers import SentenceTransformer
import numpy as np

# Free multilingual embedding model (covers Tamil and English), small enough to run on CPU
embed_model = SentenceTransformer("intfloat/multilingual-e5-small")

def embed_text(text_list):
    """
    Generate embeddings using multilingual-e5-small (local CPU).
    """
    vectors = embed_model.encode(text_list, convert_to_numpy=True, normalize_embeddings=True)
    return vectors.astype("float32")

# Generate embeddings for all chunks
embeddings = embed_text(docs)

print("PHASE 4 DONE — HuggingFace Embeddings Created.")
print("Embedding shape:", embeddings.shape)


PHASE 4 DONE — HuggingFace Embeddings Created.
Embedding shape: (3, 384)


In [7]:
print(embed_text(["widow pension tamil nadu"]).shape)


(1, 384)
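
The model card for `intfloat/multilingual-e5-small` states that each input should start with `query: ` or `passage: `, including for non-English text. The cells above embed raw strings without these prefixes. A minimal sketch of adding them is given here; the helper names `as_passages` and `as_query` are hypothetical, and the effect on retrieval quality for this dataset has not been measured.

```python
def as_passages(texts):
    """Prefix document chunks the way the E5 model card describes for indexed text."""
    return [f"passage: {t}" for t in texts]


def as_query(text):
    """Prefix a user question the way the E5 model card describes for search queries."""
    return f"query: {text}"
```

Under this sketch, the chunks would be embedded with `embed_text(as_passages(docs))` and a question with `embed_text([as_query(query)])`. Both sides need to change together, and the FAISS index has to be rebuilt afterwards.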


<span style="color: red; font-size: 20px;">PHASE 5 — Build FAISS Index</span>

In [8]:
import faiss
import json

# Create FAISS index
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

# Save index + metadata
faiss.write_index(index, "tn_faiss.index")

metadata = {
    "docs": docs,
    "sources": sources
}
# Use a context manager so the file is flushed and closed
with open("tn_meta.json", "w", encoding="utf-8") as f:
    json.dump(metadata, f)

print("PHASE 5 DONE — FAISS Index Created & Saved.")


PHASE 5 DONE — FAISS Index Created & Saved.
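
`IndexFlatL2` reports squared Euclidean distances. Because `embed_text` normalizes every vector to unit length, the squared distance between two embeddings equals `2 - 2 * cosine_similarity`, so ranking by smallest L2 distance gives the same order as ranking by largest cosine similarity. The sketch below checks that identity on two made-up vectors using only the standard library; the vectors are illustrative and not taken from the notebook.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


a = normalize([1.0, 2.0, 3.0])
b = normalize([3.0, 1.0, 0.5])

lhs = squared_l2(a, b)         # the kind of value IndexFlatL2 returns
rhs = 2.0 - 2.0 * dot(a, b)    # the same quantity written via cosine similarity
print(abs(lhs - rhs) < 1e-9)
```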


<span style="color: red; font-size: 20px;">PHASE 6 — Retrieval Function</span>

In [9]:
def retrieve(query, k=3):
    """
    Finds top-k similar chunks using FAISS.
    """
    index = faiss.read_index("tn_faiss.index")
    with open("tn_meta.json", encoding="utf-8") as f:
        meta = json.load(f)

    q_emb = embed_text([query])
    distances, idxs = index.search(q_emb, k)

    results = []
    for i in idxs[0]:
        if i < 0:  # FAISS pads with -1 when the index holds fewer than k vectors
            continue
        results.append({
            "text": meta["docs"][i],
            "source": meta["sources"][i]
        })

    return results

# Quick test
print("PHASE 6 DONE — Retrieval test:", retrieve("widow pension")[0])


PHASE 6 DONE — Retrieval test: {'text': "WIDOW PENSION: Eligibility: - Applicant must be a widow living in Tamil Nadu. - Annual income must be below government limit. How to Apply: - Visit your Taluk Office. - Submit Husband's Death Certificate, Aadhaar, Address Proof. - Application reviewed by VAO and RI. - Pension amount credited monthly. COMMUNITY CERTIFICATE: Process: - Apply via e-Sevai Centre or Taluk Office. - Submit Aadhaar, Address Proof, School TC. - Fee: Rs. 60 - Certificates issued by Revenue Department. DRIVING LICENSE RENEWAL: Procedure: - Apply via Parivahan Website or RTO. - Required: Existing DL, Aadhaar, Address Proof. - Pay renewal fee online. - Collect renewed DL from RTO. OLD AGE PENSION: Eligibility: - Age 60+ - No family support - Must be", 'source': 'tn_sample.txt'}
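
The retrieval function discards the distances that FAISS returns, so a weak match is treated the same as a strong one. A minimal sketch of keeping the distance with each hit and optionally dropping distant ones follows. The helper `collect_hits` and its `max_distance` parameter are hypothetical, and any threshold value would need tuning on real queries.

```python
def collect_hits(distances, idxs, docs, sources, max_distance=None):
    """Pair each returned index with its distance, skipping padding and far-away hits."""
    hits = []
    for dist, i in zip(distances, idxs):
        if i < 0:
            continue  # FAISS uses -1 to pad results when fewer than k vectors exist
        if max_distance is not None and dist > max_distance:
            continue  # too far from the query to be useful context
        hits.append({"text": docs[i], "source": sources[i], "distance": float(dist)})
    return hits
```

Inside a retrieval function it could be called as `collect_hits(distances[0], idxs[0], meta["docs"], meta["sources"])`, since `index.search` returns one row of results per query.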


<span style="color: red; font-size: 20px;">PHASE 7 — Prompt Builder (English + Tamil)</span>

In [10]:
def build_prompt(query):
    ctx_chunks = retrieve(query)
    context_text = "\n\n".join([c["text"] for c in ctx_chunks])

    prompt = f"""
You are a Tamil Nadu Government Assistant.

Context:
{context_text}

User Question:
{query}

Instructions:
1. First answer in clear ENGLISH based ONLY on the context.
2. Then give the SAME answer in clear TAMIL.
3. If context does NOT contain the answer:
   English: "Information not available."
   Tamil: "தகவல் கிடைக்கவில்லை."

Answer:
"""
    return prompt

print("PHASE 7 DONE — Prompt builder ready.")


PHASE 7 DONE — Prompt builder ready.
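
The prompt joins chunk texts without saying where each came from, although `retrieve` returns a `source` for every chunk. A minimal sketch of a context formatter that numbers the chunks and names their source file is shown here; `format_context` is a hypothetical helper, and the bracketed label format is an arbitrary choice.

```python
def format_context(chunks):
    """Render retrieved chunks as numbered blocks that include the source file name."""
    parts = []
    for n, chunk in enumerate(chunks, start=1):
        parts.append(f"[{n}] (source: {chunk['source']})\n{chunk['text']}")
    return "\n\n".join(parts)
```

It accepts the list of dictionaries that `retrieve` returns, so it could stand in for the `"\n\n".join(...)` expression when building the context string.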


<span style="color: red; font-size: 20px;">PHASE 8 — Answer Generator (gpt-oss-120b)</span>

In [11]:
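def is_small_talk(text):
    # Match known greetings only. A word-count test would also catch short
    # real questions such as "widow pension" and make them skip retrieval.
    text = text.lower().strip(" !.,?")
    greetings = ["hi", "hello", "hey", "vanakkam", "good morning", "good evening", "good afternoon"]
    return text in greetings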

def tn_answer(query):
    # ---- 1. Handle greetings / small talk (NO RAG) ----
    if is_small_talk(query):
        response = client.chat.completions.create(
            model="openai/gpt-oss-120b",
            messages=[
                {"role": "system", "content": "You are a friendly Tamil Nadu Government Assistant."},
                {"role": "user", "content": f"Respond politely in English and Tamil to: {query}"}
            ]
        )
        return response.choices[0].message.content
    
    # ---- 2. Normal RAG processing ----
    prompt = build_prompt(query)

    response = client.chat.completions.create(
        model="openai/gpt-oss-120b",
        messages=[
            {"role": "system", "content": "You are a helpful Tamil Nadu Government Assistant."},
            {"role": "user", "content": prompt}
        ]
    )

    return response.choices[0].message.content
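
Both branches of `tn_answer` call the remote API directly, so a network error or an invalid key raises inside the chat handler. A minimal sketch of a wrapper that returns the bilingual fallback message from the prompt instructions is given below. The name `answer_safely` is hypothetical, and catching every `Exception` is a simplification chosen for brevity.

```python
FALLBACK = "Information not available.\nதகவல் கிடைக்கவில்லை."


def answer_safely(answer_fn, query, fallback=FALLBACK):
    """Call answer_fn(query); return the fallback text for empty input or any failure."""
    query = (query or "").strip()
    if not query:
        return fallback
    try:
        return answer_fn(query)
    except Exception as exc:  # simplification: a real handler might catch narrower errors
        print(f"answer_fn failed: {exc!r}")
        return fallback
```

A chat handler could then return `answer_safely(tn_answer, message)` instead of calling `tn_answer` directly.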


In [12]:
# Test

print(tn_answer("How to apply for widow pension in Tamil Nadu?"))

**Answer in English**

**How to apply for Widow Pension in Tamil Nadu**

1. **Check eligibility** – You must be a widow residing in Tamil Nadu and your annual family income must be below the government‑specified limit.  
2. **Visit your Taluk Office** – Go to the Taluk Office that serves your area.  
3. **Submit the required documents**  
   - Husband’s Death Certificate  
   - Your Aadhaar card (identity & address)  
   - Proof of your current address (e.g., electricity bill, rent agreement, etc.)  
4. **Application processing** – The application will be examined by the Village Administrative Officer (VAO) and the Revenue Inspector (RI).  
5. **Pension credit** – Once approved, the widow pension amount will be credited to your bank account on a monthly basis.  

---

**தமிழில் பதில்**

**தமிழ்நாட்டில் விதவை பेंஷனை பெற விண்ணப்பிக்கும் முறை**

1. **தகுதி உறுதி** – நீங்கள் தமிழ்நாட்டில் வசிக்கும் விதவை ஆவீர்கள், மேலும் உங்கள் வருடாந்திர குடும்ப வருமானம் அரசு நிர்ணயித்த வரம்புக்கு கீழ் இர

<span style="color: red; font-size: 20px;">PHASE 10 — Gradio User Interface (Chatbot)</span>

In [15]:
import gradio as gr

def chat_fn(message, history):
    return tn_answer(message)

ui = gr.ChatInterface(
    chat_fn,
    title="TN Vazikatti AI (TN Government Assistant)",
    description="Ask anything about TN Government Schemes. Answers in English + Tamil."
)

ui.launch()



* Running on local URL:  http://127.0.0.1:7862
* To create a public link, set `share=True` in `launch()`.


