## 5. Usage Notes

### How to Run This Notebook:
1. Make sure you have the required dependencies installed: `pip install datasets`
2. Run all cells in order
3. The notebook downloads GSM8K from HuggingFace (network access is required on first run) and processes it
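
Before running the cells, it can help to confirm that the `datasets` package is importable in the active kernel. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of the notebook.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if the top-level `package` can be found in this environment."""
    return importlib.util.find_spec(package) is not None


# Warn early instead of failing later at `from datasets import load_dataset`.
if not is_installed("datasets"):
    print("Missing dependency: run `pip install datasets` first.")
```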

### Customization Options:
- **Number of examples**: Change `test[:200]` to `test[:N]` where N is your desired number
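- **Dataset split**: Change `"test"` to `"train"` (the GSM8K `main` configuration provides only `train` and `test` splits, with no `validation` split)
- **Difficulty metric**: Replace the expression `len(example["question"]) / 100` in `collect_data()` with your own calculation
- **Output format**: Add or remove keys in the dictionary that `collect_data()` appends for each example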
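
The options above could also be exposed as function parameters rather than edited in place. The following is a sketch under stated assumptions, not the notebook's own code: `split_spec`, `default_difficulty`, `build_records`, and `collect_data_custom` are hypothetical names, and the `"gsm8k"`, `"main"` arguments are copied from `collect_data()`.

```python
from typing import Callable, Iterable, Optional


def split_spec(split: str = "test", n: Optional[int] = 200) -> str:
    """Build a split string such as 'test[:200]'; n=None selects the whole split."""
    return split if n is None else f"{split}[:{n}]"


def default_difficulty(question: str) -> float:
    """Same proxy as the notebook: question length in characters divided by 100."""
    return len(question) / 100


def build_records(
    examples: Iterable[dict],
    difficulty_fn: Callable[[str], float] = default_difficulty,
) -> list:
    """Convert raw examples into the notebook's record format."""
    records = []
    for i, example in enumerate(examples):
        records.append({
            "id": f"example_{i:03d}",
            "question": example["question"],
            "answer": example["answer"],
            "difficulty": difficulty_fn(example["question"]),
        })
    return records


def collect_data_custom(split="test", n=200, difficulty_fn=default_difficulty):
    """Hypothetical parameterized variant of collect_data()."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    ds = load_dataset("gsm8k", "main", split=split_spec(split, n))
    return build_records(ds, difficulty_fn)
```

For example, `collect_data_custom(split="train", n=50)` would request `train[:50]`, and passing `difficulty_fn=lambda q: len(q.split()) / 50` would swap in a word-count proxy without touching the record-building code.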

### Self-Contained Design:
This notebook doesn't require any local data files. GSM8K is downloaded from HuggingFace when `collect_data()` runs, and the sample output is inlined as a Python variable instead of being read from a JSON file.

In [None]:
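# Sample output data (inlined instead of reading from file).
# The values are hand-written to illustrate the format; they are not computed from GSM8K.
import json  # also imported in section 1; repeated so this cell runs on its own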
sample_data = [
    {
        "id": "example_000",
        "question": "What is 2+2?",
        "answer": "4",
        "difficulty": 0.15
    },
    {
        "id": "example_001",
        "question": "If x=5, what is 2x?",
        "answer": "10",
        "difficulty": 0.22
    },
    {
        "id": "example_002",
        "question": "Solve: 3y + 6 = 15",
        "answer": "y=3",
        "difficulty": 0.28
    }
]

print("Sample processed data:")
print(json.dumps(sample_data, indent=2))
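
If the output format is changed, a quick structural check can catch missing or mistyped fields. The following is a minimal sketch, assuming the four fields shown in the sample above; `EXPECTED_FIELDS` and `validate_record` are hypothetical names, not part of the notebook.

```python
# Field names and types assumed from the sample records above.
EXPECTED_FIELDS = {"id": str, "question": str, "answer": str, "difficulty": float}


def validate_record(record: dict) -> list:
    """Return a list of problems found in one record (empty if it looks fine)."""
    problems = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems


record = {"id": "example_000", "question": "What is 2+2?", "answer": "4", "difficulty": 0.15}
print(validate_record(record))  # prints: []
```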

## 4. Sample Output Data

Here's what the processed data would look like. Instead of saving to a file, we'll inline a sample of the expected output format:

In [None]:
# Collect the data
data = collect_data()
print(f"Collected {len(data)} examples")

# Display first 3 examples to see the structure
print("\nFirst 3 examples:")
for i in range(min(3, len(data))):
    example = data[i]
    print(f"\nExample {i+1}:")
    print(f"  ID: {example['id']}")
    print(f"  Question: {example['question'][:100]}{'...' if len(example['question']) > 100 else ''}")
    print(f"  Answer: {example['answer'][:50]}{'...' if len(example['answer']) > 50 else ''}")
    print(f"  Difficulty: {example['difficulty']:.2f}")

## 3. Execute Data Collection

Now let's run the data collection function and see what we get!

In [None]:
def collect_data():
    """Collect benchmark data for DKW controller evaluation."""
    # Load HuggingFace dataset
    ds = load_dataset("gsm8k", "main", split="test[:200]")

    data = []
    for i, example in enumerate(ds):
        data.append({
            "id": f"example_{i:03d}",
            "question": example["question"],
            "answer": example["answer"],
            "difficulty": len(example["question"]) / 100,  # Simple proxy
        })

    return data

## 2. Data Collection Function

This function loads the GSM8K dataset from HuggingFace and processes it for our benchmark. We'll take the first 200 examples from the test split and create a structured format with:
- Unique ID for each example
- The question text
- The answer
- A difficulty metric (based on question length as a simple proxy)
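
One detail about the answer field: in GSM8K, the `answer` string holds the worked solution followed by a final line of the form `#### <number>`. If only the final number is needed, it could be split off as sketched below. `final_answer` is a hypothetical helper, and the sample string is hand-written in the GSM8K style rather than taken from the dataset.

```python
def final_answer(answer: str) -> str:
    """Return the text after the last '####' marker, or the whole string if absent."""
    _, marker, tail = answer.rpartition("####")
    return tail.strip() if marker else answer.strip()


# Hand-written example in the GSM8K answer style (not a real dataset row).
sample = "She has 2 + 2 = 4 apples.\n#### 4"
print(final_answer(sample))  # prints: 4
```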

In [None]:
"""Dataset collection script for DKW benchmark."""
import json
from datasets import load_dataset

## 1. Import Required Libraries

We'll import the necessary libraries for data processing and JSON handling.

# Dataset Collection for DKW Benchmark

**Artifact:** dataset_001 (data.py)

This notebook demonstrates a dataset collection script for DKW controller evaluation. It loads data from the GSM8K (Grade School Math 8K) dataset and processes it for benchmarking purposes.

The notebook needs no local data files. The dataset is downloaded from HuggingFace at run time, so network access is required on first run.