# 🤖💻 **Implement a Simple RAG Pipeline with FAISS and Hugging Face**

**Time Estimate:** 60 minutes

## 📋 **Overview**

This activity introduces Retrieval-Augmented Generation (RAG), which combines document retrieval with a generative language model. Using FAISS for efficient vector search and Hugging Face for language model integration, you'll build a system that supplies a generative model with contextually relevant passages retrieved from a text corpus at query time.

- Connect the technique to real-world applications by building a knowledge-enhanced AI system.
- Gain hands-on experience building an application with key industry tools.
- Understand how to merge retrieval capabilities with generative AI for improved outcomes.

## 🎯 **Learning Outcomes**

By the end of this lab, you will be able to:

- Implement a simple RAG pipeline using FAISS and Hugging Face Transformers.
- Generate contextually accurate and relevant outputs by integrating retrieval processes with generative models.

## Task 1: Prepare Your Dataset [15 minutes]

In [None]:
# imports
from sentence_transformers import SentenceTransformer
import faiss
from transformers import pipeline

Use a small, structured text corpus as the knowledge base for retrieval. We've provided one in the `documents` variable; feel free to extend or adjust the list as you see fit.

In [None]:
# Task 1
documents = [
    "Deep learning models are solving complex problems.",
    "Generative AI can create lifelike images and videos.",
    "AI models need optimization to reduce biases.",
    "Natural language processing enables better human-computer interaction.",
    "Computer vision algorithms can detect objects in real-time.",
    "Reinforcement learning helps agents learn optimal strategies.",
    "Transfer learning accelerates model training on new tasks.",
    "Attention mechanisms have revolutionized sequence modeling."
]

✅ **Success Checklist**

- The dataset is formatted as a list of documents.
- Documents contain diverse content for testing different queries.

💡 **Key Points**

- Ensure the dataset is relevant and diverse to test varied prompts.
- Quality of documents directly impacts retrieval effectiveness.

❗ **Common Mistakes to Avoid**

- Creating documents that are too similar to each other.
- Using empty strings or very short, uninformative text.
- Not considering the domain relevance of your documents.
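
The first two mistakes above can be caught with a quick check before any embeddings are computed. The sketch below is one possible approach, not part of the lab's required solution: `validate_documents` and its `min_words` threshold are hypothetical names chosen for illustration, and it only detects exact duplicates (ignoring case and whitespace), not documents that are merely similar in meaning.

```python
def validate_documents(docs, min_words=3):
    """Return (index, problem) pairs; an empty list means no issues were found."""
    problems = []
    seen = {}  # normalized text -> index of its first occurrence
    for i, doc in enumerate(docs):
        text = doc.strip()
        if not text:
            problems.append((i, "empty"))
            continue
        if len(text.split()) < min_words:  # min_words is an assumed threshold
            problems.append((i, "too short"))
        key = " ".join(text.lower().split())  # ignore case and extra whitespace
        if key in seen:
            problems.append((i, f"duplicate of {seen[key]}"))
        else:
            seen[key] = i
    return problems


sample = [
    "Deep learning models are solving complex problems.",
    "",
    "AI",
    "deep learning models are solving  complex problems.",
]
print(validate_documents(sample))
# [(1, 'empty'), (2, 'too short'), (3, 'duplicate of 0')]
```

Running the same check on your own `documents` list should return an empty list before you move on to Task 2.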

## Task 2: Create Embeddings and Index [15 minutes]
Transform documents into vector representations and build a FAISS index.
1. Generate embeddings for your documents
2. Create a FAISS index
3. Add the embeddings to the index

In [None]:
# Task 2
# your code here...

✅ **Success Checklist**

- Embeddings are generated without errors.
- FAISS index is created successfully.
- Index contains the correct number of vectors.

💡 **Key Points**

- Use Sentence Transformers for efficient and scalable embedding generation.
- FAISS provides fast similarity search capabilities.
- Embedding dimension must match the model's output dimension.

❗ **Common Mistakes to Avoid**

- Using different embedding models for documents and queries.
- Not checking that embedding dimensions match FAISS index requirements.
- Forgetting to convert embeddings to the correct data type for FAISS.
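
To make the dimension requirement concrete, here is a pure-Python sketch of what an exact ("flat") L2 index does conceptually: store every vector and compare the query against all of them. `TinyFlatL2` is a hypothetical teaching class, not FAISS and not how FAISS is implemented. It takes a single query as a plain list, whereas `faiss.IndexFlatL2` expects 2-D `float32` arrays. Like `IndexFlatL2`, it reports squared Euclidean distances.

```python
class TinyFlatL2:
    """Illustrative stand-in for an exact L2 index; not FAISS itself."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = []

    @property
    def ntotal(self):
        # Number of stored vectors, mirroring the attribute name used by FAISS
        return len(self.vectors)

    def add(self, vectors):
        for v in vectors:
            if len(v) != self.dim:
                raise ValueError(f"expected dimension {self.dim}, got {len(v)}")
            self.vectors.append([float(x) for x in v])

    def search(self, query, k):
        """Return (distances, indices) for the k nearest stored vectors."""
        if len(query) != self.dim:
            raise ValueError(f"expected dimension {self.dim}, got {len(query)}")
        # Brute force: squared L2 distance to every stored vector, smallest first
        scored = sorted(
            (sum((a - b) ** 2 for a, b in zip(v, query)), i)
            for i, v in enumerate(self.vectors)
        )[:k]
        return [d for d, _ in scored], [i for _, i in scored]


toy_index = TinyFlatL2(dim=2)
toy_index.add([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
distances, indices = toy_index.search([0.9, 0.1], k=2)
print(toy_index.ntotal, indices)  # 3 [1, 0]
```

The dimension check in `add` and `search` is the same constraint the real index enforces: the index is created for one dimension, and every document and query vector must match it.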

## Task 3: Retrieve and Generate [30 minutes]
Retrieve relevant information and integrate it with language model queries.
1. Perform a query and retrieve documents
2. Create an enhanced prompt with retrieved context
3. Generate a response using a language model
4. Compare the enhanced response with a baseline response

In [None]:
# Task 3
# your code here ...

✅ **Success Checklist**

- Retrieved content accurately reinforces the language model's response.
- Responses show improved contextual relevance over baseline generation.

💡 **Key Points**

- Combining retrieved data with generative prompts creates more grounded responses.
- The quality of retrieval directly impacts the final generation quality.
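
One way to keep weak matches out of the prompt is to filter the search results by distance before using them. The sketch below assumes you pass in one row of distances and one row of indices, such as `distances[0]` and `indices[0]` from `index.search`. `select_hits` and the `max_distance` cutoff are hypothetical; a sensible cutoff depends on the embedding model and has to be tuned by inspection. The check for `-1` is there because FAISS uses `-1` as the index for any result slot it cannot fill.

```python
def select_hits(distances, indices, max_distance):
    """Keep (index, distance) pairs that are valid and within max_distance."""
    return [
        (int(i), float(d))
        for d, i in zip(distances, indices)
        if i != -1 and d <= max_distance  # skip unfilled slots and distant matches
    ]


# Toy values standing in for one row of search output
hits = select_hits([0.4, 1.9, 0.0], [2, 5, -1], max_distance=1.0)
print(hits)  # [(2, 0.4)]
```

If the filtered list comes back empty, that is useful information too: the corpus may not contain anything relevant to the query, and the baseline prompt may be the more honest choice.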

❗ **Common Mistakes to Avoid**

- Not structuring the prompt clearly to separate context from the question.
- Retrieving too many or too few documents for the context.
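
A small helper can make the separation between context and question explicit. This is a minimal sketch under assumed conventions: `build_prompt` is a hypothetical function, and the `Context:` / `Question:` / `Answer:` labels are just one reasonable template, not a format required by any model.

```python
def build_prompt(context_docs, question):
    """Format retrieved documents and the question into one labeled prompt."""
    # One retrieved document per bullet so document boundaries stay visible
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


prompt = build_prompt(
    [
        "AI models need optimization to reduce biases.",
        "Transfer learning accelerates model training on new tasks.",
    ],
    "How do AI models optimize data?",
)
print(prompt)
```

Because the number of documents is just the length of the list you pass in, the same helper makes it easy to try different values of `k` and compare the resulting generations.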

## 🚀  **Next Steps**

Explore further by using larger datasets or different model architectures. Consider how RAG can be applied across varied text domains to tasks such as sentiment analysis or domain-specific information retrieval.

## 💻 Exemplar Solution

<details>    
<summary><strong>Click HERE to see an exemplar solution</strong></summary>

### Task 1 Solution
    
```python
# Loading a sample dataset
documents = [
    "Deep learning models are solving complex problems.",
    "Generative AI can create lifelike images and videos.",
    "AI models need optimization to reduce biases.",
    "Natural language processing enables better human-computer interaction.",
    "Computer vision algorithms can detect objects in real-time.",
    "Reinforcement learning helps agents learn optimal strategies.",
    "Transfer learning accelerates model training on new tasks.",
    "Attention mechanisms have revolutionized sequence modeling."
]

print(f"Dataset prepared with {len(documents)} documents")
print("Sample document:", documents[0])
```

### Task 2 Solution
    
```python
# Generate embeddings
model = SentenceTransformer('paraphrase-MiniLM-L6-v2')
embeddings = model.encode(documents)

print(f"Generated embeddings shape: {embeddings.shape}")

# Create FAISS index
dim = embeddings.shape[1]
index = faiss.IndexFlatL2(dim)
index.add(embeddings.astype('float32'))

print(f"FAISS index created with {index.ntotal} vectors")
```

### Task 3 Solution

```python
# Perform a query and retrieve documents
query = "How do AI models optimize data?"
query_embedding = model.encode([query]).astype('float32')

k = 2  # Number of nearest neighbors
distances, indices = index.search(query_embedding, k)

retrieved_text = " ".join([documents[i] for i in indices[0]])

print(f"Query: {query}")
print(f"Retrieved documents: {retrieved_text}")

# Create enhanced prompt (retrieved context first, then the question)
complete_prompt = f"Info: {retrieved_text}\nQ: {query}\nA:"

# Integrate with LLM
generator = pipeline('text-generation', model='distilgpt2')
# max_new_tokens caps only the generated continuation, so both prompts get the same budget
response = generator(complete_prompt, max_new_tokens=50)

print("Enhanced Response:", response[0]['generated_text'])

# Compare with baseline (no retrieval)
baseline_prompt = f"Q: {query}\nA:"
baseline_response = generator(baseline_prompt, max_new_tokens=50)

print("\nBaseline Response:", baseline_response[0]['generated_text'])
print("\nComparison: The enhanced response should be more informed and contextual.")
```
</details>