# Dataset Usage Example (using Marketsplit)

This example demonstrates how to work with datasets in OMMX Quantum Benchmarks using Marketsplit as a representative case.

**Important**: The patterns shown here apply to all datasets in the collection (Labs, Portfolio, Topology, etc.). Replace `Marketsplit` with any other dataset class name; the API is identical across datasets.
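
Because the API is shared, a small helper can be written once and reused for every dataset. The sketch below relies only on the three attributes used later in this example (`name`, `model_names`, `available_instances`). The helper name `summarize_dataset` and the `_StubDataset` class are hypothetical and not part of the library; the stub exists only so the snippet runs without downloading any data.

```python
from typing import Any


def summarize_dataset(dataset: Any) -> dict:
    """Summarize any object exposing name, model_names and available_instances."""
    return {
        "name": dataset.name,
        "models": list(dataset.model_names),
        "instance_counts": {
            model: len(instances)
            for model, instances in dataset.available_instances.items()
        },
    }


class _StubDataset:
    # Hypothetical stand-in that mimics the attributes used in this example.
    name = "stub"
    model_names = ["binary_linear"]
    available_instances = {"binary_linear": ["a", "b"]}


summary = summarize_dataset(_StubDataset())
print(summary)
```

With the package installed, passing `Marketsplit()` or another dataset object in place of the stub should work the same way, assuming the attributes behave as they do in the cells below.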

## Dataset Overview

In [2]:
from ommx_quantum_benchmarks.qoblib import Marketsplit

# Initialize the dataset
dataset = Marketsplit()

print(f"Dataset: {dataset.name}")
print(f"Description: {dataset.description}")
print(f"Available models: {dataset.model_names}")

# Check available instances
for model, instances in dataset.available_instances.items():
    print(f"{model}: {len(instances)} instances")

Dataset: 01_marketsplit
Description: Marketsplit dataset in ommx format, originally provided by https://git.zib.de/qopt/qoblib-quantum-optimization-benchmarking-library/-/tree/main/01-marketsplit?ref_type=heads.
Available models: ['binary_linear', 'binary_unconstrained']
binary_linear: 156 instances
binary_unconstrained: 156 instances
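
Equal instance counts do not by themselves show that both models cover the same instance names. The following sketch checks this for any mapping shaped like `dataset.available_instances` (model name to list of instance names). The function name `missing_instances` and the toy data are illustrative assumptions, not library output.

```python
def missing_instances(available: dict[str, list[str]]) -> dict[str, list[str]]:
    """For each model, list the instance names that other models have but it lacks."""
    all_names = set().union(*available.values())
    return {
        model: sorted(all_names - set(names))
        for model, names in available.items()
    }


# Toy data for illustration only; real names come from dataset.available_instances.
toy = {
    "binary_linear": ["ms_a", "ms_b"],
    "binary_unconstrained": ["ms_a"],
}
gaps = missing_instances(toy)
print(gaps)
```

An empty list for every model means the models cover identical instance names.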


## Working with Binary Linear Model

In [None]:
# Load a specific instance
model_name = "binary_linear"
instance_name = "ms_03_050_002"

instance, solution = dataset(model_name, instance_name)

print(f"Loaded instance: {instance_name}")
print(f"Instance type: {type(instance)}")
print(f"Solution available: {solution is not None}")

if solution:
    print(f"Objective value: {solution.objective}")
    print(f"Feasible: {solution.feasible}")
    print(f"Number of variables: {len(solution.state.entries)}")

## Solution Verification

In [None]:
if solution is not None:
    # Evaluate the solution using the instance
    evaluated = instance.evaluate(solution.state)
    
    print("Solution Verification:")
    print(f"Original objective: {solution.objective}")
    print(f"Evaluated objective: {evaluated.objective}")
    print(f"Objectives match: {solution.objective == evaluated.objective}")
    
    print(f"Original feasibility: {solution.feasible}")  
    print(f"Evaluated feasibility: {evaluated.feasible}")
    print(f"Feasibility matches: {solution.feasible == evaluated.feasible}")
    
    # Check state consistency
    state_match = solution.state.entries == evaluated.state.entries
    print(f"States match: {state_match}")
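
The verification above compares objectives with `==`, which is exact. If the objective is a floating-point value, rounding differences between the stored and re-evaluated numbers can cause a spurious mismatch. A minimal sketch of a tolerance-based comparison using the standard library follows; the helper name `objectives_match` and the default tolerances are assumptions chosen for illustration.

```python
import math


def objectives_match(stored: float, evaluated: float,
                     rel_tol: float = 1e-9, abs_tol: float = 1e-9) -> bool:
    """Compare two objective values up to a small numerical tolerance."""
    return math.isclose(stored, evaluated, rel_tol=rel_tol, abs_tol=abs_tol)


# Exact equality fails on a classic rounding case, the tolerant check does not.
exact = (0.1 + 0.2) == 0.3
tolerant = objectives_match(0.1 + 0.2, 0.3)
print(exact, tolerant)
```

In the verification cell, `objectives_match(solution.objective, evaluated.objective)` could replace the `==` comparison when exact equality proves too strict.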

## Analyzing Multiple Instances

In [None]:
# Analyze one instance from each of the first 5 size categories
def analyze_marketsplit_instances():
    results = []
    
    # Group instances by size (extract size from name)
    size_groups = {}
    for instance_name in dataset.available_instances["binary_linear"]:
        # Extract size info from name like "ms_03_050_002"
        parts = instance_name.split('_')
        if len(parts) >= 3:
            size_key = f"{parts[1]}_{parts[2]}"  # e.g., "03_050"
            if size_key not in size_groups:
                size_groups[size_key] = []
            size_groups[size_key].append(instance_name)
    
    # Analyze the first instance from each of the first 5 size groups
    for size_key, instances in list(size_groups.items())[:5]:
        instance_name = instances[0]  # Take first instance of this size
        
        try:
            instance, solution = dataset("binary_linear", instance_name)
            
            if solution:
                evaluated = instance.evaluate(solution.state)
                results.append({
                    'instance': instance_name,
                    'size_category': size_key,
                    'objective': solution.objective,
                    'feasible': solution.feasible,
                    'variables': len(solution.state.entries),
                    'verification_passed': (
                        solution.objective == evaluated.objective and
                        solution.feasible == evaluated.feasible
                    )
                })
                
        except Exception as e:
            print(f"Error with {instance_name}: {e}")
    
    return results

# Run analysis
results = analyze_marketsplit_instances()

print("\nAnalysis Results:")
print(f"{'Instance':<15} {'Size':<8} {'Variables':<10} {'Objective':<12} {'Feasible':<9} {'Verified':<9}")
print("-" * 70)

for r in results:
    print(f"{r['instance']:<15} {r['size_category']:<8} {r['variables']:<10} "
          f"{r['objective']:<12.2f} {str(r['feasible']):<9} {str(r['verification_passed']):<9}")
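
The grouping step inside the function can be tested on its own, without loading any data. The sketch below isolates it, assuming instance names follow the `ms_03_050_002` pattern seen above. The helper name `group_by_size` is hypothetical, and every name in the sample list except `ms_03_050_002` is made up for illustration.

```python
from collections import defaultdict


def group_by_size(instance_names: list[str]) -> dict[str, list[str]]:
    """Group names by their second and third underscore-separated fields."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in instance_names:
        parts = name.split("_")
        if len(parts) >= 3:  # skip names that do not match the pattern
            groups[f"{parts[1]}_{parts[2]}"].append(name)
    return dict(groups)


# Sample names for illustration; only the first appears in this example.
names = ["ms_03_050_002", "ms_03_050_007", "ms_04_100_001", "oddname"]
groups = group_by_size(names)
print(groups)
```

Names that do not split into at least three fields are skipped silently, which matches the behavior of the loop in the cell above.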

This example demonstrates the key patterns for working with the Marketsplit dataset: inspecting the dataset, loading an instance, verifying its stored solution, and analyzing one instance per size category with basic error handling.