<a href="https://colab.research.google.com/github/MohamedMadhoun/mini-rag-system-project/blob/main/Copy_of_Model_3_Section_1_Homework_fixed.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Build and Test a Mini RAG System from Scratch 🧠

> **🎯 Today's Goal**: Combine the knowledge from the first three lessons (Embeddings, Retrieval, Generation) to build a functional Retrieval-Augmented Generation (RAG) system from scratch. Then, test it with a self-assessment!

In [None]:
!pip install sentence-transformers transformers torch



## ⚙️ Part 1: The Retriever - Finding the Right Knowledge

First, we'll set up our Retriever. Its job is to take a question and find the most relevant piece of text from our knowledge base.

1.  **Load the Embedding Model** (`all-MiniLM-L6-v2`)
2.  **Create our Knowledge Base**
3.  **Encode Everything into Embeddings**
4.  **Calculate Similarity** to find the best match
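
To make step 4 concrete, here is a minimal sketch of cosine-similarity retrieval in plain Python. The three-dimensional vectors are made up for illustration (real `all-MiniLM-L6-v2` embeddings have 384 dimensions), and `cosine_similarity` and `retrieve` are hypothetical helper names, not part of `sentence-transformers`.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, doc_vecs):
    """Return the index of the document vector most similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy "embeddings" (made-up numbers, one vector per document)
toy_doc_vecs = [
    [0.9, 0.1, 0.0],  # stands in for the Paris sentence
    [0.1, 0.8, 0.3],  # stands in for the Everest sentence
    [0.0, 0.2, 0.9],  # stands in for the photosynthesis sentence
]
toy_query_vec = [0.2, 0.9, 0.2]  # stands in for "What is the highest mountain?"

best_index = retrieve(toy_query_vec, toy_doc_vecs)
print(best_index)
```

The notebook's own code performs the same two operations on tensors: one call computes all similarity scores, and `torch.argmax` picks the highest.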

In [None]:
import torch
from sentence_transformers import SentenceTransformer, util
from transformers import pipeline

print("✅ Libraries imported successfully!")

# 1. Load our embedding model
retriever_model = SentenceTransformer('all-MiniLM-L6-v2')

# 2. Create a simple knowledge base
knowledge_base = [
    "The capital of France is Paris, a city famous for the Eiffel Tower and the Louvre museum.",
    "The Amazon rainforest is the world's largest tropical rainforest, known for its incredible biodiversity.",
    "Mount Everest is the highest mountain on Earth, located in the Himalayas.",
    "The Great Wall of China is a series of fortifications stretching over 13,000 miles.",
    "Photosynthesis is the process used by plants to convert light energy into chemical energy."
]

# 3. Encode our knowledge base into embeddings
knowledge_embeddings = retriever_model.encode(knowledge_base, convert_to_tensor=True)

print(f"✅ Retriever model loaded and knowledge base encoded with {len(knowledge_base)} documents.")

✅ Libraries imported successfully!



✅ Retriever model loaded and knowledge base encoded with 5 documents.


## ✍️ Part 2: The Generator - Extracting the Answer

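Now we set up our Generator. This model takes the question and the context found by the retriever and extracts the answer from that context. Note that it is an extractive question-answering model, so it selects a span of the retrieved text rather than composing new text.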
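
The sketch below mimics the shape of the dictionary that a `question-answering` pipeline returns (`score`, `start`, `end`, `answer`), where `start` and `end` are character offsets into the context. The values are hand-written for illustration, not real model output.

```python
context = "Mount Everest is the highest mountain on Earth, located in the Himalayas."

# Hand-written stand-in for a pipeline result; the score is made up
example_result = {
    "score": 0.98,
    "start": 0,
    "end": 13,
    "answer": "Mount Everest",
}

# The answer is the slice of the context between the two character offsets
extracted_span = context[example_result["start"]:example_result["end"]]
print(extracted_span)
```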

In [None]:
# Load our question-answering (generator) model
generator = pipeline('question-answering', model='distilbert-base-cased-distilled-squad')

print("✅ Generator (QA) model loaded.")



✅ Generator (QA) model loaded.


## 🚀 Part 3: Testing our RAG System

Time to put it all together! The function below will simulate a full RAG pipeline and grade itself against a predefined set of questions and answers.

It will test two key things:
1.  **Retrieval Accuracy**: Did we find the right document?
2.  **Generation Accuracy**: Did we extract the correct answer from that document?
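
Both checks are simple string tests, which the sketch below isolates from the models. `grade_result` is a hypothetical helper name; the test case and context string are taken from the knowledge base above.

```python
def grade_result(test, retrieved_context, generated_answer):
    """Return (retrieval_ok, generation_ok) for one test case."""
    # Retrieval passes if the expected keyword appears in the retrieved text
    retrieval_ok = test["expected_keyword"] in retrieved_context
    # Generation passes if the expected answer appears in the model's answer,
    # ignoring case
    generation_ok = test["expected_answer"].lower() in generated_answer.lower()
    return retrieval_ok, generation_ok

test_case = {
    "question": "Which city is home to the Louvre museum?",
    "expected_keyword": "France",
    "expected_answer": "Paris",
}
context = "The capital of France is Paris, a city famous for the Eiffel Tower and the Louvre museum."

print(grade_result(test_case, context, "Paris"))
print(grade_result(test_case, context, "the Eiffel Tower"))
```

The substring test is lenient: a longer answer such as "Paris, a city" would also pass the generation check.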

In [None]:
def run_rag_assessment():
    """Runs a self-assessment of the RAG pipeline with multiple questions."""

    # Define our questions, expected context keywords, and expected answers
    test_questions = [
        {
            "question": "What is the highest mountain?",
            "expected_keyword": "Everest",
            "expected_answer": "Mount Everest"
        },
        {
            "question": "Which city is home to the Louvre museum?",
            "expected_keyword": "France",
            "expected_answer": "Paris"
        },
        {
            "question": "What process do plants use for energy?",
            "expected_keyword": "Photosynthesis",
            "expected_answer": "Photosynthesis"
        }
    ]

    score = 0
    total = len(test_questions) * 2 # 2 points per question (1 for retrieval, 1 for generation)

    print("--- 🚀 Starting RAG System Assessment ---\n")

    for i, test in enumerate(test_questions):
        question = test["question"]
        print(f"\n--- Question {i+1}: '{question}' ---")

        # --- 1. Retrieval Step ---
        question_embedding = retriever_model.encode(question, convert_to_tensor=True)
        cos_scores = util.pytorch_cos_sim(question_embedding, knowledge_embeddings)[0]
        # .item() converts the 0-dim tensor to a plain Python int for list indexing
        top_result_index = torch.argmax(cos_scores).item()
        retrieved_context = knowledge_base[top_result_index]

        print(f"🔎  Retrieved Context: '{retrieved_context}'")

        # Check if the retrieval was correct
        if test["expected_keyword"] in retrieved_context:
            print("✅  Retrieval Correct!")
            score += 1
        else:
            print(f"❌  Retrieval Failed. Expected context with keyword: '{test['expected_keyword']}'")

        # --- 2. Generation Step ---
        qa_result = generator(question=question, context=retrieved_context)
        generated_answer = qa_result['answer']

        print(f"✍️  Generated Answer: '{generated_answer}'")

        # Check if the generation was correct
        if test["expected_answer"].lower() in generated_answer.lower():
            print("✅  Generation Correct!")
            score += 1
        else:
            print(f"❌  Generation Failed. Expected answer: '{test['expected_answer']}'")

    # --- Final Score ---
    print("\n--- 🏁 Assessment Complete ---")
    print(f"🎯 Final Score: {score} / {total}")
    if score == total:
        print("🎉🎉🎉 Perfect! Your RAG system is working as expected!")
    elif score >= total / 2:
        print("👍 Good job! The system is mostly correct.")
    else:
        print("🔧 The system ran into some issues. Review the steps and check the logic.")

# Run the assessment!
run_rag_assessment()

--- 🚀 Starting RAG System Assessment ---


--- Question 1: 'What is the highest mountain?' ---
🔎  Retrieved Context: 'Mount Everest is the highest mountain on Earth, located in the Himalayas.'
✅  Retrieval Correct!
✍️  Generated Answer: 'Mount Everest'
✅  Generation Correct!

--- Question 2: 'Which city is home to the Louvre museum?' ---
🔎  Retrieved Context: 'The capital of France is Paris, a city famous for the Eiffel Tower and the Louvre museum.'
✅  Retrieval Correct!
✍️  Generated Answer: 'Paris'
✅  Generation Correct!

--- Question 3: 'What process do plants use for energy?' ---
🔎  Retrieved Context: 'Photosynthesis is the process used by plants to convert light energy into chemical energy.'
✅  Retrieval Correct!
✍️  Generated Answer: 'Photosynthesis'
✅  Generation Correct!

--- 🏁 Assessment Complete ---
🎯 Final Score: 6 / 6
🎉🎉🎉 Perfect! Your RAG system is working as expected!


# STUDENT TASKS 🧑‍💻

Now it's your turn to be the AI engineer. Your tasks are to run, analyze, and extend the RAG system you've just built.

### Task 1: Execute and Understand

Your first task is to simply run all the cells above and carefully read the output of the final self-assessment.

* **Observe the Score:** Did the system get a perfect score (6/6)?
* **Analyze Each Step:** For each question, look at the "Retrieved Context" and the "Generated Answer."
    * Did the retriever find the correct piece of knowledge?
    * Did the generator extract the right answer from that context?
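
When analyzing the retrieval step, it can be informative to look at how every document scored, not just the winner. The sketch below ranks documents by score; the similarity values are invented for illustration, and `rank_documents` is a hypothetical helper.

```python
def rank_documents(documents, scores):
    """Pair each document with its score and sort from most to least similar."""
    return sorted(zip(scores, documents), reverse=True)

documents = [
    "Paris sentence",
    "Everest sentence",
    "Photosynthesis sentence",
]
# Invented similarity scores for the question "What is the highest mountain?"
invented_scores = [0.05, 0.72, 0.11]

ranked = rank_documents(documents, invented_scores)
for score, doc in ranked:
    print(f"{score:.2f}  {doc}")
```

A small gap between the top two scores would suggest the retriever came close to choosing a different document.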

### Task 2 (Challenge): Add a New Question

Your second task is to test the system with a new question about the **existing knowledge**.

**Instructions:**
1.  Copy the code from the cell below. It's the same assessment function as before, but with a new test question added.
2.  Run the cell and see if the system can answer correctly. The score should now be out of 8.

In [None]:
# Task 2: Add a new question to the assessment function

def run_rag_assessment_task_2():
    test_questions = [
        {
            "question": "What is the highest mountain?",
            "expected_keyword": "Everest",
            "expected_answer": "Mount Everest"
        },
        {
            "question": "Which city is home to the Louvre museum?",
            "expected_keyword": "France",
            "expected_answer": "Paris"
        },
        {
            "question": "What process do plants use for energy?",
            "expected_keyword": "Photosynthesis",
            "expected_answer": "Photosynthesis"
        },
        {
            "question": "Which wall stretches over 13,000 miles?",
            "expected_keyword": "Wall of China",
            "expected_answer": "Great Wall of China"
        }
    ]

    score = 0
    total = len(test_questions) * 2

    print("--- 🚀 Starting RAG System Assessment (Task 2) ---\n")

    for i, test in enumerate(test_questions):
        question = test["question"]
        print(f"\n--- Question {i+1}: '{question}' ---")
        question_embedding = retriever_model.encode(question, convert_to_tensor=True)
        cos_scores = util.pytorch_cos_sim(question_embedding, knowledge_embeddings)[0]
        # .item() converts the 0-dim tensor to a plain Python int for list indexing
        top_result_index = torch.argmax(cos_scores).item()
        retrieved_context = knowledge_base[top_result_index]
        print(f"🔎  Retrieved Context: '{retrieved_context}'")
        if test["expected_keyword"] in retrieved_context:
            print("✅  Retrieval Correct!")
            score += 1
        else:
            print(f"❌  Retrieval Failed. Expected context with keyword: '{test['expected_keyword']}'")
        qa_result = generator(question=question, context=retrieved_context)
        generated_answer = qa_result['answer']
        print(f"✍️  Generated Answer: '{generated_answer}'")
        if test["expected_answer"].lower() in generated_answer.lower():
            print("✅  Generation Correct!")
            score += 1
        else:
            print(f"❌  Generation Failed. Expected answer: '{test['expected_answer']}'")

    print("\n--- 🏁 Assessment Complete ---")
    print(f"🎯 Final Score: {score} / {total}")
    if score == total:
        print("🎉🎉🎉 Perfect! Your RAG system handled the new question!")

# Run the updated assessment
run_rag_assessment_task_2()


--- 🚀 Starting RAG System Assessment (Task 2) ---


--- Question 1: 'What is the highest mountain?' ---
🔎  Retrieved Context: 'Mount Everest is the highest mountain on Earth, located in the Himalayas.'
✅  Retrieval Correct!
✍️  Generated Answer: 'Mount Everest'
✅  Generation Correct!

--- Question 2: 'Which city is home to the Louvre museum?' ---
🔎  Retrieved Context: 'The capital of France is Paris, a city famous for the Eiffel Tower and the Louvre museum.'
✅  Retrieval Correct!
✍️  Generated Answer: 'Paris'
✅  Generation Correct!

--- Question 3: 'What process do plants use for energy?' ---
🔎  Retrieved Context: 'Photosynthesis is the process used by plants to convert light energy into chemical energy.'
✅  Retrieval Correct!
✍️  Generated Answer: 'Photosynthesis'
✅  Generation Correct!

--- Question 4: 'Which wall stretches over 13,000 miles?' ---
🔎  Retrieved Context: 'The Great Wall of China is a series of fortifications stretching over 13,000

### Task 3 (Advanced Challenge): Add New Knowledge & Test It

Your final and most important task is to **expand the RAG system's knowledge base** and then test it.

**Instructions:**
1.  **Add a new fact** to the `knowledge_base` in the code cell below.
2.  **You must re-run this cell** to update the `knowledge_embeddings`! The system won't know about the new fact until you do.
3.  Finally, run the last code cell, which has a new test question about the knowledge you just added.
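
Step 2 matters because retrieval compares the question against the stored embeddings, not against the text list itself. The toy sketch below shows the mismatch that arises when the list grows but the embeddings are stale. `toy_encode` is a made-up stand-in for `retriever_model.encode` that just counts vowels; it is not a real embedding.

```python
def toy_encode(sentences):
    """Made-up 'embedding': one vowel-count vector per sentence."""
    return [[s.lower().count(v) for v in "aeiou"] for s in sentences]

toy_knowledge_base = ["Paris is in France.", "Everest is a mountain."]
toy_embeddings = toy_encode(toy_knowledge_base)

# Add a fact but forget to re-encode: the two lists now disagree in length
toy_knowledge_base.append("Tokyo is in Japan.")
stale_count = len(toy_embeddings)
print(len(toy_knowledge_base), stale_count)

# Re-encoding brings the embeddings back in sync with the text
toy_embeddings = toy_encode(toy_knowledge_base)
print(len(toy_knowledge_base), len(toy_embeddings))
```

With stale embeddings, the new sentence has no vector to score against, so it can never be the top match.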

In [None]:
# Task 3, Step 1: Add a new sentence to the knowledge base

# Extend the original knowledge base so the new fact sits alongside the existing ones
knowledge_base_task_3 = knowledge_base + [
    "The capital of Japan is Tokyo, famous for its technology and culture."
]

# Re-encode the updated knowledge base
knowledge_embeddings_task_3 = retriever_model.encode(knowledge_base_task_3, convert_to_tensor=True)

print(f"✅ Knowledge base updated and re-encoded with {len(knowledge_base_task_3)} documents.")

✅ Knowledge base updated and re-encoded with 6 documents.


In [None]:
# Task 3, Step 2: Test your newly added knowledge

def run_rag_assessment_task_3():
    test_questions = [
        {
            "question": "What is the capital of Japan?",
            "expected_keyword": "Japan",
            "expected_answer": "Tokyo"
        }
    ]

    score = 0
    total = len(test_questions) * 2

    print("--- 🚀 Starting RAG System Assessment (Task 3) ---\n")

    for i, test in enumerate(test_questions):
        question = test["question"]
        print(f"\n--- Question {i+1}: '{question}' ---")
        question_embedding = retriever_model.encode(question, convert_to_tensor=True)
        cos_scores = util.pytorch_cos_sim(question_embedding, knowledge_embeddings_task_3)[0]
        # .item() converts the 0-dim tensor to a plain Python int for list indexing
        top_result_index = torch.argmax(cos_scores).item()
        retrieved_context = knowledge_base_task_3[top_result_index]
        print(f"🔎  Retrieved Context: '{retrieved_context}'")
        if test["expected_keyword"] in retrieved_context:
            print("✅  Retrieval Correct!")
            score += 1
        else:
            print(f"❌  Retrieval Failed. Expected context with keyword: '{test['expected_keyword']}'")
        qa_result = generator(question=question, context=retrieved_context)
        generated_answer = qa_result['answer']
        print(f"✍️  Generated Answer: '{generated_answer}'")
        if test["expected_answer"].lower() in generated_answer.lower():
            print("✅  Generation Correct!")
            score += 1
        else:
            print(f"❌  Generation Failed. Expected answer: '{test['expected_answer']}'")

    print("\n--- 🏁 Assessment Complete ---")
    print(f"🎯 Final Score: {score} / {total}")
    if score == total:
        print("🏆🏆🏆 Success! You have successfully extended the knowledge of your RAG system!")

# Run the final assessment
run_rag_assessment_task_3()

--- 🚀 Starting RAG System Assessment (Task 3) ---


--- Question 1: 'What is the capital of Japan?' ---
🔎  Retrieved Context: 'The capital of Japan is Tokyo, famous for its technology and culture.'
✅  Retrieval Correct!
✍️  Generated Answer: 'Tokyo'
✅  Generation Correct!

--- 🏁 Assessment Complete ---
🎯 Final Score: 2 / 2
🏆🏆🏆 Success! You have successfully extended the knowledge of your RAG system!
