Lab Title:
Build Your First Mini RAG System Using the OpenAI API and FAISS

Objective:

By the end of this lab, you will:

Understand how Retrieval-Augmented Generation works.

Learn to load custom data, convert it into embeddings, store it in a vector database, and retrieve it for context-aware question answering.

See the difference between a normal LLM response and a RAG-powered response.
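Before installing anything, the load, embed, store, and retrieve steps listed above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the lab's implementation: the word-count `embed` function stands in for a real embedding model, the plain Python list `store` stands in for a FAISS index, and the names `embed`, `cosine`, `retrieve`, and `sample_docs` are hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    cleaned = text.lower().replace(".", " ").replace("?", " ")
    return Counter(cleaned.split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, store, k=1):
    """Return the k stored texts most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Hypothetical sample data, used only for this sketch.
sample_docs = [
    "Retrieval-Augmented Generation combines information retrieval with generation to create accurate answers.",
    "FAISS is a library for similarity search over dense vectors.",
]

# "Store" step: keep each text next to its vector (a vector database does this at scale).
store = [(doc, embed(doc)) for doc in sample_docs]

# "Retrieve" step: find the document closest to the question.
top = retrieve("What is similarity search in FAISS?", store, k=1)
print(top[0])
```

In a real pipeline the embedding would come from a model and the nearest-neighbour search from a vector index, but the flow of data is the same.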

In [27]:
!pip install faiss-cpu tiktoken



In [28]:
!pip install openai python-dotenv



In [29]:
from openai import OpenAI
import faiss
import numpy as np
import os

Define a simple text sample

In [30]:
docs = [
    "Retrieval-Augmented Generation combines information retrieval with generation to create accurate answers."
]

Set up the API client (the OpenAI SDK pointed at OpenRouter), then ask a question without retrieval:

In [31]:
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Get API key from environment variable
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY")

if not OPENROUTER_API_KEY:
    raise ValueError("OPENROUTER_API_KEY not found! Please set it in your .env file")

# Initialize OpenRouter client
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=OPENROUTER_API_KEY
)

In [32]:
question = "What does RAG mean?"

# Use the existing client (OpenRouter with free model)
try:
    response = client.chat.completions.create(
        model="qwen/qwen-2-7b-instruct:free",  # Free model on OpenRouter
        messages=[
            {"role": "user", "content": question}
        ]
    )
    print("WITHOUT RAG:")
    print("-" * 60)
    print(response.choices[0].message.content)
except Exception as e:
    print(f"❌ OpenRouter Error: {e}")
    print("\nPlease check:")
    print("1. Get a fresh API key from https://openrouter.ai/keys")
    print("2. Update your .env file with the new key")
    print("3. Make sure your OpenRouter account is active")
    print("\nFallback answer (without API):")
    print("-" * 60)
    print("RAG typically stands for Retrieval-Augmented Generation, a technique that combines information retrieval with text generation to produce more accurate and contextual responses.")

❌ OpenRouter Error: Error code: 401 - {'error': {'message': 'User not found.', 'code': 401}}

Please check:
1. Get a fresh API key from https://openrouter.ai/keys
2. Update your .env file with the new key
3. Make sure your OpenRouter account is active

Fallback answer (without API):
------------------------------------------------------------
RAG typically stands for Retrieval-Augmented Generation, a technique that combines information retrieval with text generation to produce more accurate and contextual responses.
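
For contrast with the call above, which sends the bare question, a RAG-style request places retrieved text in the prompt before the question. The sketch below only builds the message list and makes no API call. The helper name `build_rag_messages` and the wording of the system prompt are assumptions for illustration, and the `retrieved` list is hard-coded here rather than produced by a vector search.

```python
def build_rag_messages(question, contexts):
    """Return chat messages that ask the model to answer from the given context."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    system = (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context_block}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# Hard-coded stand-in for the result of a retrieval step.
retrieved = [
    "Retrieval-Augmented Generation combines information retrieval with generation to create accurate answers."
]

messages = build_rag_messages("What does RAG mean?", retrieved)
print(messages[0]["content"])
```

The resulting `messages` list has the same shape as the one passed to `client.chat.completions.create(...)` in the cell above, so it could be used in its place once the API key issue is resolved.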
