<a href="https://colab.research.google.com/github/SanjiGautam/Quotation-Microservice/blob/main/RAG_Knowledge.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [21]:
!pip install -q sentence-transformers faiss-cpu langchain langchain-community pypdf




In [22]:
import os
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
from sentence_transformers import SentenceTransformer
import numpy as np


In [24]:
PDF_PATH = "led_catalog.pdf"   # upload this file to the Colab session before running this cell
loader = PyPDFLoader(PDF_PATH)
pages = loader.load()
print(f"Loaded {len(pages)} pages from PDF")

Loaded 1 pages from PDF


In [25]:
splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=80)
all_texts = []
metadatas = []

for i, doc in enumerate(pages):
    chunks = splitter.split_text(doc.page_content)
    for ci, c in enumerate(chunks):
        all_texts.append(c)
        metadatas.append({"source": PDF_PATH, "page": i, "chunk_id": ci})

print(f"Created {len(all_texts)} chunks")


Created 3 chunks
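
To make the `chunk_size=400, chunk_overlap=80` settings concrete, here is a simplified sketch of overlapping character windows. It is not how `RecursiveCharacterTextSplitter` works internally (that class prefers to break on separators such as paragraphs and newlines, so its boundaries will differ). The helper name `sliding_chunks` and the sample text are hypothetical, used only for illustration.

```python
def sliding_chunks(text, chunk_size=400, chunk_overlap=80):
    """Split text into fixed-size character windows that overlap.

    Simplified illustration only (assumption): the real splitter also
    tries to break on separators, so its chunks will not match these.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "x" * 1000  # made-up input, 1000 characters long
demo_chunks = sliding_chunks(sample)
print([len(c) for c in demo_chunks])  # [400, 400, 360]
```

Each window starts 320 characters after the previous one, so neighbouring chunks share 80 characters. That shared region is why the same catalog line can show up at the end of one retrieved chunk and the start of the next.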


In [26]:
st_model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

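class LocalEmbeddings:
    """Thin wrapper exposing the embed_documents / embed_query methods that LangChain calls."""

    def __init__(self, model):
        self.model = model

    def embed_documents(self, texts):
        # Encode a batch of texts and return plain lists of float32 values.
        arr = self.model.encode(texts, convert_to_numpy=True)
        return [v.astype(np.float32).tolist() for v in arr]

    def embed_query(self, text):
        # Encode a single query string the same way as the documents.
        v = self.model.encode([text], convert_to_numpy=True)[0]
        return v.astype(np.float32).tolist()

    # This class does not subclass LangChain's Embeddings base class, so the
    # FAISS wrapper calls the object directly on the query text; __call__
    # makes that work.
    def __call__(self, text):
        return self.embed_query(text)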

embeddings = LocalEmbeddings(st_model)
print("Local embeddings ready")


Local embeddings ready


In [27]:
vector_store = FAISS.from_texts(all_texts, embeddings, metadatas=metadatas)
print("FAISS index built successfully")



FAISS index built successfully


In [28]:
def query_rag(query, k=3):
    """Return the k most similar chunks for a query, each with its source metadata."""
    results = vector_store.similarity_search(query, k=k)
    answers = []
    for r in results:
        answers.append({
            "answer": r.page_content,
            "source": r.metadata["source"],
            "page": r.metadata["page"],
            "chunk_id": r.metadata["chunk_id"]
        })
    return answers
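
For intuition about what `similarity_search` does, the sketch below ranks vectors by squared Euclidean (L2) distance with a brute-force scan, assuming the vector store uses a flat L2 index (which I believe is the default here). The toy 2-D vectors and the helper name `top_k_l2` are made up for illustration; real embeddings from `all-MiniLM-L6-v2` have 384 dimensions.

```python
import heapq


def top_k_l2(query_vec, doc_vecs, k=3):
    """Return (index, squared_distance) pairs for the k closest vectors."""
    scored = []
    for idx, vec in enumerate(doc_vecs):
        # Squared L2 distance between the query and this document vector.
        dist = sum((q - d) ** 2 for q, d in zip(query_vec, vec))
        scored.append((dist, idx))
    # Smaller distance means more similar, so keep the k smallest.
    return [(idx, dist) for dist, idx in heapq.nsmallest(k, scored)]


# Hypothetical toy "embeddings", not taken from the catalog.
toy_docs = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
toy_hits = top_k_l2([0.9, 1.2], toy_docs, k=2)
print(toy_hits)
```

The query `[0.9, 1.2]` is closest to the second toy vector, then the first. Note that a nearest-neighbour search always returns `k` results, even when none of the stored chunks is actually relevant to the query.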



In [29]:
queries = [
    "Do you have LED street lights? Which types are available?",
    "What is the warranty on flood lights?",
    "Can I place a bulk order of 500 panel lights?",
    "What is the price of a 90W LED street light?",
    "Why should I use LED instead of CFL?",
    "هل تحتوي إضاءة LED على الزئبق؟"  # Arabic: "Does LED lighting contain mercury?"
]

for q in queries:
    print("\n" + "="*50)
    print("Query:", q)
    results = query_rag(q, k=2)
    for i, r in enumerate(results, start=1):
        print(f"\nResult #{i}")
        print("Answer:", r["answer"])
        print("Source:", r["source"], "| Page:", r["page"], "| Chunk:", r["chunk_id"])




Query: Do you have LED street lights? Which types are available?

Result #1
Answer: Alrouf Lighting Technology Pvt Ltd – LED Lighting Catalog & Policies 
Product Catalog 
• LED Street Light (90W, 120W, 150W) – High efficiency, IP65, 5-year warranty. 
• LED Flood Light (50W, 100W, 200W) – Waterproof, suitable for outdoor projects. 
• LED Panel Light (2x2, 2x4) – Slim design, office/indoor use. 
• لدينا مصابيح  LED للشارع  بقدرات  مختلفة )90 واط، 120 واط، 150 واط(.
Source: led_catalog.pdf | Page: 0 | Chunk: 0

Result #2
Answer: • لدينا مصابيح  LED للشارع  بقدرات  مختلفة )90 واط، 120 واط، 150 واط(.   
Warranty Policy 
• All LED lights come with 5 years warranty against manufacturing defects. 
• Replacement or repair provided within warranty period. 
Bulk Order & Pricing Policy 
• Bulk orders above 100 units get a 10% discount. 
• Standard delivery: 2–3 weeks for bulk orders. 
• Pricing: 
o Street Light 90W – 240 SAR
Source: led_catalog.pdf | Page: 0 | Chunk: 1

Query: What is the warrant