<a href="https://colab.research.google.com/github/Devraj02-sys/Generative-AI/blob/main/Embeddings_%26_Indexing_Documents.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Step 1: Install Required Packages
!pip install faiss-cpu sentence-transformers gradio

# Step 2: Import Libraries
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np
import gradio as gr

# Step 3: Load Embedding Model (trained for question-answer semantic search)
model = SentenceTransformer('multi-qa-MiniLM-L6-cos-v1')

# 📄 Step 4: Expanded and Meaningful Documents
documents = [
    "Democracy is a system of government where citizens elect representatives.",
    "India is the largest democracy in the world.",
    "In a democracy, power rests with the people and their elected leaders.",
    "Democracy allows freedom of speech, press, and expression.",
    "A core principle of democracy is equal participation of all citizens.",
    "Chess is a strategic board game.",
    "Python is a popular programming language."
]

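# Step 5: Generate Embeddings (FAISS expects a float32 array of shape [n_documents, dim])
embeddings = np.asarray(model.encode(documents), dtype="float32")

# Step 6: Create FAISS Index (exact, brute-force search over L2 distance)
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)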

# 🔁 Step 7: Define Search Function with Distance Filter
def search_documents(query, k=3):
    # Encode the query and retrieve at most k nearest documents
    query_embedding = np.asarray(model.encode([query]), dtype="float32")
    D, I = index.search(query_embedding, min(k, len(documents)))
    results = []
    for idx, dist in zip(I[0], D[0]):
        # IndexFlatL2 returns squared L2 distances, so lower means more similar;
        # an index of -1 marks an empty result slot
        if idx != -1 and dist < 1.0:  # Distance threshold
            results.append(f"{len(results) + 1}. {documents[idx]}")
    return "\n".join(results) if results else "No relevant result found. Try rephrasing your query."

# Step 8: Gradio Interface
ui = gr.Interface(
    fn=search_documents,
    inputs=gr.Textbox(label="Enter your query", placeholder="e.g. What is democracy?"),
    outputs=gr.Textbox(label="Top Matching Documents"),
    title="Document Search with FAISS + Embeddings",
    description="Search through documents using SentenceTransformers + FAISS."
)

ui.launch()

# ✅ Try query: "What is democracy?" or "Explain democratic principles."
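
The `dist < 1.0` filter in Step 7 is easier to interpret once the distance is tied to cosine similarity. `IndexFlatL2` reports squared L2 distances, and for unit-length vectors the squared distance equals `2 - 2 * cosine`, so a cutoff of 1.0 corresponds to a cosine similarity of 0.5. The sketch below checks that identity in plain Python. It assumes the embeddings are normalized, which should hold for this model but is worth verifying on your own embeddings. The helper names `normalize`, `squared_l2`, `cosine`, and `cosine_from_squared_l2` are illustrative only and are not part of FAISS or sentence-transformers.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a, b):
    """Squared Euclidean distance, the quantity IndexFlatL2 reports."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity computed directly from the definition."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_from_squared_l2(dist):
    """Convert a squared L2 distance to cosine similarity.

    Assumption: both vectors have unit length; otherwise this is not valid.
    """
    return 1.0 - dist / 2.0


# Hypothetical toy vectors standing in for two embeddings
a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])
d = squared_l2(a, b)

print("squared L2:", d)
print("cosine (direct):", cosine(a, b))
print("cosine (from distance):", cosine_from_squared_l2(d))
print("cosine at threshold 1.0:", cosine_from_squared_l2(1.0))
```

If the threshold feels too strict or too loose for a given query set, it may be simpler to reason about the desired cosine similarity first and convert it to a distance with `2 - 2 * cosine`.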

It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://020e459e277f4dbe86.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
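
For intuition about what the flat index created in Step 6 does during `index.search`, the following is a minimal pure-Python sketch of exact nearest-neighbour search: compute the squared L2 distance from the query to every stored vector and keep the `k` smallest. It is not how FAISS is implemented (FAISS performs this in optimized native code). The function `brute_force_search` and the toy 2-D vectors are hypothetical and chosen only for illustration.

```python
import heapq


def brute_force_search(vectors, query, k=3):
    """Return (distances, indices) of the k vectors closest to query,
    ordered from nearest to farthest by squared L2 distance."""
    scored = [
        (sum((x - q) ** 2 for x, q in zip(vec, query)), idx)
        for idx, vec in enumerate(vectors)
    ]
    nearest = heapq.nsmallest(k, scored)  # sorted ascending by distance
    distances = [dist for dist, _ in nearest]
    indices = [idx for _, idx in nearest]
    return distances, indices


# Toy 2-D "embeddings" (illustrative values, not real model output)
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
distances, indices = brute_force_search(toy_vectors, [0.9, 0.1], k=2)
print(indices, distances)
```

As in the notebook, results come back sorted by ascending distance, which is why a simple distance threshold only ever trims the tail of the result list.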


