# Building an Evaluation Dataset

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/wandb/weave/blob/master/docs/docs/guides/cookbooks/llamaindex_rag_ncert/notebooks/02_fetch_question_banks.ipynb)

We build an evaluation dataset by scraping a question bank of solved question-answer pairs for the Flamingo textbook from [LearnCBSE](https://www.learncbse.in/chapter-wise-important-questions-class-12-english/). The dataset consists of 358 question-answer pairs covering the 8 chapters in our knowledge base, in the following format:

```python
{
  "question": "What was the mood in the classroom when M. Hamel gave his last French lesson? ",
  "answer": "When M.Hamel was giving his last French ; lesson, the mood in the classroom was solemn and sombre. When he announced that this was their last French lesson everyone present in the classroom suddenly developed patriotic feelings for their native language and genuinely regretted ignoring their mother tongue.",
  "marks": "3-4",
  "chapter_name": "The Last Lesson"
}
```

## Install the Dependencies

First, install the libraries needed to build the dataset.

In [None]:
!pip install -qU weave requests beautifulsoup4

In [None]:
import re

import requests
from bs4 import BeautifulSoup

import weave

## Scrape the Dataset

In [None]:
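def remove_selected_substrings(text):
    """Remove parenthesised exam tags such as "(Delhi 2009)" from the text."""

    def check_if_contains_target(x):
        targets = ["Comptt.", "Delhi", "All India"]
        return any(target in x for target in targets)

    def replace_func(match):
        # Drop the group only if it mentions one of the exam tags.
        if check_if_contains_target(match.group()):
            return ""
        else:
            return match.group()

    # Match parenthesised groups, allowing one level of nested parentheses.
    result = re.sub(r"\(([^()]|\([^()]*\))*\)", replace_func, text)
    return result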
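
To make the behaviour of this helper concrete, here is a minimal sketch that repeats the function in compact form so it runs on its own. The sample strings are invented for illustration and are not taken from the scraped pages. Note that removing a trailing tag leaves the space that preceded it, which is consistent with the trailing space in the sample row at the top of this page.

```python
import re


def remove_selected_substrings(text):
    """Drop parenthesised groups that mention an exam tag; keep all others."""
    targets = ["Comptt.", "Delhi", "All India"]

    def replace_func(match):
        group = match.group()
        return "" if any(target in group for target in targets) else group

    # Matches "(...)" with at most one level of nested parentheses.
    return re.sub(r"\(([^()]|\([^()]*\))*\)", replace_func, text)


# Invented sample strings, used only to illustrate the three cases.
tagged = remove_selected_substrings("What was the mood in the classroom? (Delhi 2009)")
untagged = remove_selected_substrings("Who was Franz (the narrator)?")
nested = remove_selected_substrings("Why was he late? (All India 2010 (Comptt.))")

print(repr(tagged))    # 'What was the mood in the classroom? '
print(repr(untagged))  # 'Who was Franz (the narrator)?'
print(repr(nested))    # 'Why was he late? '
```

If the trailing space is unwanted, calling `.strip()` on the result would remove it.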

In [None]:
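def extract_qa(qa_pair):
    """Split a "Question N. ... Answer: ..." string into a question/answer dict."""
    # Group 1 is the question text (lazy match), group 2 is everything after "Answer:".
    match = re.search(r"Question \d+\.(.*?)Answer:(.*)", qa_pair)
    if match:
        question = match.group(1).strip()
        answer = match.group(2).strip()
        return {"question": remove_selected_substrings(question), "answer": answer}
    # Return None when the text does not follow the expected pattern.
    return None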
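
The pattern used in `extract_qa` can be tried on its own. The sketch below uses invented strings and a hypothetical constant name, `QA_PATTERN`. It also shows one property worth knowing: `.` does not match a newline by default, so a pair whose text contains a line break between the question and the answer would be skipped unless `re.DOTALL` is passed. Whether the scraped pages contain such pairs is not checked here.

```python
import re

# Same pattern as in extract_qa; the constant name is only for this example.
QA_PATTERN = r"Question \d+\.(.*?)Answer:(.*)"

# Invented inputs, shaped like the text produced by get_text(strip=True).
one_line = "Question 1.Who is Franz?Answer:The narrator of the story."
two_lines = "Question 2.Who is M. Hamel?\nAnswer:The French teacher."

match = re.search(QA_PATTERN, one_line)
print(match.group(1).strip())  # Who is Franz?
print(match.group(2).strip())  # The narrator of the story.

# "." does not match "\n" by default, so the plain search finds nothing.
print(re.search(QA_PATTERN, two_lines))  # None

# With re.DOTALL the same pattern spans the line break.
dotall_match = re.search(QA_PATTERN, two_lines, flags=re.DOTALL)
print(dotall_match.group(1).strip())  # Who is M. Hamel?
```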

In [None]:
def generate_question_bank(chapter_url: str, chapter_name: str):
    """Scrape one chapter page and return its question-answer pairs."""
    response = requests.get(chapter_url, timeout=30)
    response.raise_for_status()  # fail on HTTP errors instead of parsing an error page
    html_content = response.content
    soup = BeautifulSoup(html_content, "html.parser")
    # Collect the text of every <p> tag: section headers and question-answer pairs.
    question_answer_pairs = []
    for p_tag in soup.find_all("p"):
        question_answer_pair = p_tag.get_text(strip=True)
        question_answer_pairs.append(question_answer_pair)
    marks = ""  # stays empty until the first section header is seen
    question_bank = []
    for qa_pair in question_answer_pairs:
        # A section header sets the marks for every pair that follows it.
        if "short answer type questions" in qa_pair.lower():
            marks = "3-4"
        elif "long answer type questions" in qa_pair.lower():
            marks = "5-6"
        elif "Question" in qa_pair and "Answer:" in qa_pair:
            extracted_qa = extract_qa(qa_pair)
            if extracted_qa is not None:
                extracted_qa["marks"] = marks
                extracted_qa["chapter_name"] = chapter_name
                question_bank.append(extracted_qa)
    return question_bank
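
The way section headers assign marks to the pairs that follow them can be sketched without any network access. The helper `label_paragraphs` below is hypothetical and simplified: it takes a list of already-extracted paragraph strings (invented here) instead of fetching a page, and it skips the exam-tag cleanup.

```python
import re


def label_paragraphs(paragraphs, chapter_name):
    """Hypothetical helper mirroring the labelling loop in generate_question_bank."""
    marks = ""  # empty until a section header is seen
    rows = []
    for text in paragraphs:
        lowered = text.lower()
        if "short answer type questions" in lowered:
            marks = "3-4"
        elif "long answer type questions" in lowered:
            marks = "5-6"
        elif "Question" in text and "Answer:" in text:
            match = re.search(r"Question \d+\.(.*?)Answer:(.*)", text)
            if match:
                rows.append(
                    {
                        "question": match.group(1).strip(),
                        "answer": match.group(2).strip(),
                        "marks": marks,
                        "chapter_name": chapter_name,
                    }
                )
    return rows


# Invented paragraph texts standing in for the <p> tags of a chapter page.
paragraphs = [
    "Short Answer Type Questions (3-4 Marks)",
    "Question 1.Who is Franz?Answer:The narrator.",
    "Long Answer Type Questions (5-6 Marks)",
    "Question 2.Describe M. Hamel.Answer:A devoted teacher.",
]

rows = label_paragraphs(paragraphs, "The Last Lesson")
print([row["marks"] for row in rows])  # ['3-4', '5-6']
```

Because `marks` is carried over from the most recent header, any pair that appears before the first header would be stored with an empty `marks` value.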

In [None]:
chapter_urls = [
    "https://www.learncbse.in/the-last-lesson-important-questions/",
    "https://www.learncbse.in/lost-spring-important-questions/",
    "https://www.learncbse.in/deep-water-important-questions/",
    "https://www.learncbse.in/the-rattrap-important-questions/",
    "https://www.learncbse.in/indigo-important-questions/",
    "https://www.learncbse.in/poets-and-pancakes-important-questions/",
    "https://www.learncbse.in/the-interview-important-questions/",
    "https://www.learncbse.in/going-places-important-questions/",
]

chapter_names = [
    "The Last Lesson",
    "Lost Spring",
    "Deep Water",
    "The Rattrap",
    "Indigo",
    "Poets and Pancakes",
    "The Interview",
    "Going Places",
]

question_bank = []
for chapter_url, chapter_name in zip(chapter_urls, chapter_names):
    question_bank.extend(generate_question_bank(chapter_url, chapter_name))
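
Before publishing, it may be worth checking the scraped rows. The sketch below is one possible sanity check, not part of the original notebook: `summarize_question_bank` is a hypothetical helper, and the sample rows are placeholders so the snippet runs without scraping anything.

```python
from collections import Counter


def summarize_question_bank(rows):
    """Hypothetical helper: count rows per chapter and collect incomplete rows."""
    required_keys = {"question", "answer", "marks", "chapter_name"}
    per_chapter = Counter(row.get("chapter_name", "<missing>") for row in rows)
    # A row is incomplete if a key is missing or no marks were assigned.
    incomplete = [
        row for row in rows if not required_keys.issubset(row) or not row.get("marks")
    ]
    return per_chapter, incomplete


# Placeholder rows for illustration only.
sample_rows = [
    {"question": "Q1", "answer": "A1", "marks": "3-4", "chapter_name": "The Last Lesson"},
    {"question": "Q2", "answer": "A2", "marks": "5-6", "chapter_name": "The Last Lesson"},
    {"question": "Q3", "answer": "A3", "marks": "", "chapter_name": "Indigo"},
]

per_chapter, incomplete = summarize_question_bank(sample_rows)
print(dict(per_chapter))  # {'The Last Lesson': 2, 'Indigo': 1}
print(len(incomplete))    # 1
```

In the notebook, the same helper could be called on `question_bank`, and `len(question_bank)` compared against the 358 pairs mentioned at the top of this page. The site content may have changed since the dataset was built, so a different count is possible.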

## Publish the Dataset to Weave

We log this dataset as a [`weave.Dataset`](../../core-types/datasets.md), which lets us collect examples for evaluation and automatically tracks versions for accurate comparisons.

In [None]:
# Initialize Weave
weave.init(project_name="groq-rag")

# Create the Weave Dataset
dataset = weave.Dataset(name="flamingos-prose-question-bank", rows=question_bank)

# Publish the Dataset
weave.publish(dataset)

| ![](../images/weave_evaluation_dataset.gif) |
|---|
| Exploring the evaluation dataset using the Weave UI. |