In [None]:
# Uncomment the lines below if you want to save to a file
# with open("data_out.json", "w") as f:
#     json.dump(processed_data, f, indent=2)
# print("Data saved to data_out.json")

print("Note: File saving is commented out to keep this notebook self-contained")
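The commented-out save step above can be tried without leaving files behind. The following is a minimal sketch, assuming a temporary directory is an acceptable target. The `records` list is a hypothetical stand-in for `processed_data`, included so that the block runs on its own.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for processed_data, shaped like the output of collect_data.
records = [
    {"id": "example_000", "question": "What is 2+2?", "answer": "4", "difficulty": 0.12},
]

# Write to a temporary directory so nothing persists after the block finishes.
with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "data_out.json"
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2)

    # Read the file back to confirm the data survives the round trip.
    with open(out_path, encoding="utf-8") as f:
        reloaded = json.load(f)

print(f"Round trip preserved the data: {reloaded == records}")
```

Reading the file back is a cheap way to check that every value in the records is JSON-serializable before relying on the saved file elsewhere.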

## Optional: Save to File

To save the processed data to a JSON file, as the original script does, uncomment the lines in this cell and run it:

In [None]:
# Process the data using our function
processed_data = collect_data(SAMPLE_DATASET)

print(f"Collected {len(processed_data)} examples")
print("\nProcessed data:")
print(json.dumps(processed_data, indent=2))

## Process the Data

Let's run the data collection function and see the results:

In [None]:
def collect_data(dataset: List[Dict[str, str]]) -> List[Dict[str, Any]]:
    """Collect benchmark data for DKW controller evaluation.
    
    Args:
        dataset: List of dictionaries with 'question' and 'answer' keys
        
    Returns:
        List of processed examples with metadata
    """
    data = []
    for i, example in enumerate(dataset):
        data.append({
            "id": f"example_{i:03d}",  # Zero-padded index, e.g. example_000
            "question": example["question"],
            "answer": example["answer"],
            "difficulty": len(example["question"]) / 100,  # Crude proxy: question length in characters / 100
        })
    
    return data
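To make the difficulty proxy concrete, the sketch below applies the same length-based formula to the three sample questions. The helper name `difficulty_proxy` is introduced here for illustration only and is not part of the original script.

```python
def difficulty_proxy(question: str) -> float:
    """Length-based proxy: number of characters in the question divided by 100."""
    return len(question) / 100


# The three questions from the sample dataset, repeated so the block runs on its own.
questions = [
    "What is 2+2?",
    "If x=5, what is 2x?",
    "Solve: 3y + 6 = 15",
]

scores = [difficulty_proxy(q) for q in questions]
for question, score in zip(questions, scores):
    print(f"{score:.2f}  {question}")
```

The questions are 12, 19, and 18 characters long, so the scores are 0.12, 0.19, and 0.18. The linear equation scores slightly below the substitution question, which shows that the proxy measures length rather than mathematical difficulty.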

## Data Collection Function

The main function, `collect_data`, copies each question and answer from the raw dataset, assigns a zero-padded ID, and adds a difficulty score based on question length:

In [None]:
# Sample data inlined from the original JSON output
# This represents what would typically be loaded from HuggingFace datasets
SAMPLE_DATASET = [
    {
        "question": "What is 2+2?",
        "answer": "4"
    },
    {
        "question": "If x=5, what is 2x?", 
        "answer": "10"
    },
    {
        "question": "Solve: 3y + 6 = 15",
        "answer": "y=3"
    }
]

print(f"Loaded {len(SAMPLE_DATASET)} sample questions")

## Sample Data

Instead of loading from an external source, the notebook uses three short inline questions that stand in for the math word problems in the GSM8K dataset:

In [None]:
"""Dataset collection script for DKW benchmark."""
import json
from typing import List, Dict, Any

## Overview

This notebook contains a dataset collection script that was originally designed to load the GSM8K dataset from Hugging Face. For demonstration purposes, this version uses inline sample data so that the notebook is self-contained.
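For reference, the external loading step that the inline data replaces might look like the sketch below. It is an assumption about the original script, not a copy of it. It assumes the `datasets` package is installed, network access is available, and the dataset is published as `openai/gsm8k` with the `main` configuration. The helper names `to_example` and `load_gsm8k_examples` are hypothetical. GSM8K answers end with a `#### <final answer>` line, and this sketch keeps only that final answer.

```python
from typing import Dict, List


def to_example(row: Dict[str, str]) -> Dict[str, str]:
    """Reduce a GSM8K row to a question and its final answer.

    GSM8K answers contain worked reasoning followed by '#### <final answer>'.
    """
    final_answer = row["answer"].split("####")[-1].strip()
    return {"question": row["question"].strip(), "answer": final_answer}


def load_gsm8k_examples(split: str = "test", limit: int = 3) -> List[Dict[str, str]]:
    """Hypothetical loader; requires the `datasets` package and network access."""
    # Imported lazily so the rest of the block works without `datasets` installed.
    from datasets import load_dataset

    rows = load_dataset("openai/gsm8k", "main", split=split)
    subset = rows.select(range(min(limit, len(rows))))
    return [to_example(row) for row in subset]


# Offline check of the row conversion with a hand-written row.
print(to_example({"question": "What is 2+2? ", "answer": "2 + 2 = 4\n#### 4"}))
```

The result has the same `question` and `answer` keys as the inline sample data, so it could be passed to the collection function unchanged.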

### What this notebook does:
1. Defines a small inline sample dataset in place of the GSM8K download
2. Defines a data collection function, `collect_data`
3. Processes the data to add an ID and a length-based difficulty score to each example
4. Prints the processed examples as indented JSON

# Dataset Collection for DKW Benchmark

**Artifact ID:** dataset_001  
**Name:** data.py

This notebook demonstrates a dataset collection script for DKW controller evaluation. The original script has been converted into an interactive, self-contained notebook that processes benchmark data using only the Python standard library.