Lab Title:
Build Your First Mini RAG System Using the OpenAI API and FAISS

Objective:

By the end of this lab, you will:

Understand how Retrieval-Augmented Generation works.

Learn to load custom data, convert it into embeddings, store it in a vector database, and retrieve it for context-aware question answering.

See the difference between a normal LLM response and a RAG-powered response.
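As a rough preview of the flow described in the objectives, the sketch below embeds a few documents, retrieves the closest one for a question, and builds a context-augmented prompt. It is a minimal illustration, not the lab's implementation: `toy_embed`, `retrieve`, `build_rag_prompt`, and `sample_docs` are hypothetical names, and the hashed bag-of-words vector is only a stand-in for a real embedding model. The brute-force dot product plays the role that an exact inner-product index such as `faiss.IndexFlatIP` would play in a FAISS-based version.

```python
import re
import zlib

import numpy as np


def toy_embed(text, dim=256):
    """Hypothetical stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = np.zeros(dim, dtype="float32")
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # crc32 is deterministic across runs, unlike Python's built-in hash() for strings
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def retrieve(question, docs, doc_vectors, k=1):
    """Return the k documents whose vectors have the highest inner product with the question."""
    scores = doc_vectors @ toy_embed(question, dim=doc_vectors.shape[1])
    top = np.argsort(-scores)[:k]
    return [docs[i] for i in top]


def build_rag_prompt(question, context_docs):
    """Prepend the retrieved text to the question so the model can ground its answer."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Illustrative data; the second document is an assumption added for contrast
sample_docs = [
    "Retrieval-Augmented Generation combines information retrieval with generation to create accurate answers.",
    "FAISS is a library for efficient similarity search over dense vectors.",
]
doc_vectors = np.stack([toy_embed(doc) for doc in sample_docs])

sample_question = "How does retrieval-augmented generation work?"
top_docs = retrieve(sample_question, sample_docs, doc_vectors, k=1)
rag_prompt = build_rag_prompt(sample_question, top_docs)
print(rag_prompt)
```

Sending `rag_prompt` instead of the bare question is the only difference between a plain LLM call and a RAG-powered one; everything else about the chat request stays the same.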

In [None]:
!pip install faiss-cpu tiktoken

In [None]:
!pip install openai

In [14]:
from openai import OpenAI
import faiss
import numpy as np
import os

Define a simple text sample:

In [16]:
docs = [
    "Retrieval-Augmented Generation combines information retrieval with generation to create accurate answers."
]

Ask a question without retrieval:

In [17]:
# Read your OpenRouter API key from the environment; never hard-code secrets in a notebook
OPENROUTER_API_KEY = os.environ["OPENROUTER_API_KEY"]

# Initialize an OpenAI-compatible client that points at OpenRouter
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=OPENROUTER_API_KEY
)

In [19]:
question = "What does RAG mean?"

# Send the question to a model served through OpenRouter (swap in any available model)
response = client.chat.completions.create(
    model="google/gemini-2.0-flash-exp:free",  # OpenRouter model ID
    messages=[
        {"role": "user", "content": question}
    ]
)

print(response.choices[0].message.content)

RAG stands for **Retrieval-Augmented Generation**. It's an AI framework that combines a retrieval mechanism with a generative model (like a large language model or LLM) to improve the quality and relevance of generated text.

Here's a breakdown of what that means:

*   **Retrieval:** This involves searching a knowledge base (documents, databases, web pages, etc.) for information that is relevant to a specific query or prompt.
*   **Augmented:** The retrieved information is then "augmented" or added to the original prompt, providing the LLM with external context.
*   **Generation:** Finally, the LLM uses the augmented prompt (original prompt + retrieved information) to generate a more informed and contextually relevant response.

In essence, RAG allows LLMs to access and incorporate external knowledge, making them more accurate and reliable, overcoming LLM limitations such as:

*   **Lack of Real-Time Information:** LLMs are trained on a snapshot of data and can't provide up-to-date inf