# Knapsack Solver Execution Framework

This notebook provides the framework to run knapsack solvers against the 27 instances generated by `sample.ipynb`.

**Purpose:**
1.  Load the 27 generated problem instances.
2.  Define a common solver interface (`fit` function) as specified in `team_project_plan.md`.
3.  Import solvers from `my_solvers.py`.
4.  Execute the solvers against all instances, handling deterministic (1 run) and randomized (5 runs) protocols.
5.  Measure runtime and capture the `best_value`.
6.  Save all results to a single `solver_runs.json` file, structured for the evaluation notebook.

In [1]:
import os
import json
import time
import random
from pathlib import Path
from typing import List, Dict, Any, Tuple, Callable, Optional
from tqdm import tqdm

# --- Configuration ---

# Directory where 'sample.ipynb' saved its data
DATA_ROOT = Path("knapsack_multisize_data")

# Directory to save the output of this notebook
RESULTS_DIR = Path("experiment_results")
RESULTS_DIR.mkdir(exist_ok=True)

# Protocol defined in team_project_plan.md
N_RUNS_RANDOMIZED = 5

# A base seed for the solver *runs* (distinct from instance generation seeds)
BASE_RUN_SEED = 20241019 

print(f"Data root (input): {DATA_ROOT.resolve()}")
print(f"Results dir (output): {RESULTS_DIR.resolve()}")

Data root (input): C:\Users\abhay\Desktop\Projects\COMA_IIITR\knapsack_multisize_data
Results dir (output): C:\Users\abhay\Desktop\Projects\COMA_IIITR\experiment_results


In [2]:
def load_instance_json(path: Path) -> Optional[Dict]:
    """Load a single JSON instance file; return None if it is missing or unreadable."""
    if not path.exists():
        print(f"Warning: File not found, skipping: {path}")
        return None
    try:
        with path.open("r", encoding="utf8") as f:
            return json.load(f)
    except Exception as e:
        print(f"Error loading {path}: {e}")
        return None

def load_all_instances(data_root: Path) -> List[Tuple[str, str, str, Dict, Path]]:
    """
    Loads all 27 instances based on the metadata file.
    
    Returns:
        A flat list of tuples, where each tuple is:
        (n, dist_name, cap_name, instance_dict, instance_path)
    """
    meta_path = data_root / "generation_metadata.json"
    if not meta_path.exists():
        print(f"Error: Metadata file not found at {meta_path}")
        print("Please run 'sample.ipynb' first to generate the datasets.")
        return []
    
    with meta_path.open("r", encoding="utf8") as f:
        meta = json.load(f)
        
    all_instances = []
    print(f"Loading instances from {meta_path}...")

    # meta[dist_name][cap_name] -> list of records
    for dist_name, cap_dict in meta.items():
        for cap_name, records in cap_dict.items():
            for rec in records:
                n = rec["n"]
                json_path = Path(rec["json"])
                
                inst_data = load_instance_json(json_path)
                
                # Compare against None so that only failed loads are skipped
                if inst_data is not None:
                    all_instances.append((n, dist_name, cap_name, inst_data, json_path))
    
    print(f"Successfully loaded {len(all_instances)} instances.")
    return all_instances

# --- Load the data ---
# This list contains all 27 problem instances
ALL_INSTANCES = load_all_instances(DATA_ROOT)

if ALL_INSTANCES:
    print("\nExample instance check:")
    n, dist, cap, inst, path = ALL_INSTANCES[0]
    print(f"  n = {n}, dist = {dist}, cap = {cap}")
    print(f"  Path = {path.name}")
    print(f"  Capacity = {inst['capacity']}")
    print(f"  Num Items = {len(inst['items'])}")

Loading instances from knapsack_multisize_data\generation_metadata.json...
Successfully loaded 27 instances.

Example instance check:
  n = 50, dist = uniform, cap = cap_0.2
  Path = knapsack_n50_seed20251020.json
  Capacity = 2516
  Num Items = 50


## 2. Solver Interface & Implementations

This is where you will add your own solvers.

**Requirement:** Every solver *must* follow the `fit` function interface defined in `team_project_plan.md`:

`fit(instance: Dict, timeout: Optional[float] = None, seed: Optional[int] = None) -> Tuple[List[int], float, Dict]`

**Returns:**
1.  `best_solution`: A list of item IDs (e.g., `[0, 5, 12]`) that are in the knapsack.
2.  `best_value`: The total value of the `best_solution` (e.g., `1580.0`).
3.  `logs`: A dictionary of extra info (e.g., convergence plot data, final weight). This will be saved in the results JSON.
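
To make the contract concrete, here is a minimal sketch of a randomized solver that follows the `fit` signature and return shape described above. It is not one of the project's solvers: the name `random_restart_solver`, the `n_restarts` argument, and the item fields `"id"`, `"value"`, and `"weight"` are assumptions made for illustration (only `instance["capacity"]` and `instance["items"]` appear in this notebook), so adjust the keys to match the files written by `sample.ipynb`.

```python
import random
from typing import Dict, List, Optional, Tuple


def random_restart_solver(
    instance: Dict,
    timeout: Optional[float] = None,  # accepted for interface compatibility, ignored here
    seed: Optional[int] = None,
    n_restarts: int = 50,  # hypothetical tuning knob, not part of the required interface
) -> Tuple[List[int], float, Dict]:
    """Pack items in a random order several times and keep the best packing."""
    rng = random.Random(seed)  # private RNG: same seed gives the same result
    items = instance["items"]  # assumed: list of dicts with "id", "value", "weight"
    capacity = instance["capacity"]

    best_ids: List[int] = []
    best_value = 0.0
    best_weight = 0
    history: List[float] = []

    for _ in range(n_restarts):
        order = list(items)
        rng.shuffle(order)
        ids, value, weight = [], 0.0, 0
        for item in order:
            # Add the item only if it still fits in the remaining capacity
            if weight + item["weight"] <= capacity:
                ids.append(item["id"])
                value += item["value"]
                weight += item["weight"]
        if value > best_value:
            best_ids, best_value, best_weight = sorted(ids), value, weight
        history.append(best_value)  # best value seen so far, one entry per restart

    logs = {"final_weight": best_weight, "convergence_history": history}
    return best_ids, float(best_value), logs
```

Calling `random_restart_solver(instance=inst, timeout=None, seed=20241019)` matches the keyword call used by the execution loop, and every value in `logs` is a plain Python type, so the record can be written to JSON without conversion.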

In [None]:
# --- Solver Interface (Type Hint) ---
SolverFunction = Callable[[Dict, Optional[float], Optional[int]], Tuple[List[int], float, Dict]]

# --- Import your solvers from external .py files ---
# Ensure my_solvers.py is in the same directory as this notebook
try:
    # Amod's Solvers
    from my_solvers import (
        dynamic_programming_solver, greedy_ratio_solver, 
        genetic_algorithm_solver, tabu_search_solver, grover_search_solver
    )
    # Kartik's Solvers
    from my_solvers import (
        branch_and_bound_solver, greedy_value_solver, 
        ant_colony_solver, differential_evolution_solver, qaoa_solver
    )
    # Sudarshan's Solvers
    from my_solvers import (
        backtracking_solver, greedy_weight_solver, 
        particle_swarm_solver, simulated_annealing_solver, quantum_annealing_solver
    )
    
    print("Successfully imported all 15 custom solvers from my_solvers.py")
    
except ImportError as e:
    print(f"WARNING: Could not import from my_solvers.py. {e}")
    print("Please create 'my_solvers.py' in the same directory.")
    print("Falling back to placeholder solvers that raise NotImplementedError.")
    
    # Define placeholder functions if import fails so notebook doesn't crash
    def placeholder_solver(*args, **kwargs): 
        raise NotImplementedError("Solver not found. Check my_solvers.py.")
    
    # Amod
    dynamic_programming_solver = greedy_ratio_solver = genetic_algorithm_solver = tabu_search_solver = grover_search_solver = placeholder_solver
    # Kartik
    branch_and_bound_solver = greedy_value_solver = ant_colony_solver = differential_evolution_solver = qaoa_solver = placeholder_solver
    # Sudarshan
    backtracking_solver = greedy_weight_solver = particle_swarm_solver = simulated_annealing_solver = quantum_annealing_solver = placeholder_solver


# --- ** DEFINE THE FULL LIST OF SOLVERS TO RUN ** ---
SOLVERS_TO_RUN = {
    # --- Amod's Solvers ---
    "DynamicProgramming": {
        "function": dynamic_programming_solver,
        "is_randomized": False,
        "params": {"note": "Exact DP solver"}
    },
    "Greedy_Ratio": {
        "function": greedy_ratio_solver,
        "is_randomized": False,
        "params": {"heuristic": "value/weight ratio"}
    },
    "GeneticAlgorithm": {
        "function": genetic_algorithm_solver,
        "is_randomized": True,
        "params": {"pop_size": 100, "generations": 200, "mutation_rate": 0.02}
    },
    "TabuSearch": {
        "function": tabu_search_solver,
        "is_randomized": True, # Randomized if init solution or neighborhood is sampled
        "params": {"tabu_tenure": 20, "iterations": 500}
    },
    "GroverSearch": {
        "function": grover_search_solver,
        "is_randomized": True, # Quantum simulation is stochastic, uses seed
        "params": {"note": "Grover-based search"}
    },
    
    # --- Kartik's Solvers ---
    "BranchAndBound": {
        "function": branch_and_bound_solver,
        "is_randomized": False,
        "params": {"note": "Exact B&B solver"}
    },
    "Greedy_Value": {
        "function": greedy_value_solver,
        "is_randomized": False,
        "params": {"heuristic": "value-first"}
    },
    "AntColonyOptimization": {
        "function": ant_colony_solver,
        "is_randomized": True,
        "params": {"ants": 50, "iterations": 100, "evaporation": 0.1}
    },
    "DifferentialEvolution": {
        "function": differential_evolution_solver,
        "is_randomized": True,
        "params": {"pop_size": 50, "generations": 100, "F": 0.8, "CR": 0.7}
    },
    "QAOA": {
        "function": qaoa_solver,
        "is_randomized": True, # Quantum simulation is stochastic, uses seed
        "params": {"reps": 1, "optimizer": "COBYLA"}
    },

    # --- Sudarshan's Solvers ---
    "Backtracking": {
        "function": backtracking_solver,
        "is_randomized": False,
        "params": {"note": "Exact backtracking solver"}
    },
    "Greedy_Weight": {
        "function": greedy_weight_solver,
        "is_randomized": False,
        "params": {"heuristic": "weight-first"}
    },
    "ParticleSwarmOptimization": {
        "function": particle_swarm_solver,
        "is_randomized": True,
        "params": {"particles": 50, "iterations": 100, "w": 0.7, "c1": 1.5, "c2": 1.5}
    },
    "SimulatedAnnealing": {
        "function": simulated_annealing_solver,
        "is_randomized": True,
        "params": {"start_temp": 1000, "end_temp": 0.1, "alpha": 0.99}
    },
    "QuantumAnnealing": {
        "function": quantum_annealing_solver,
        "is_randomized": True, # Samplers are stochastic, use seed
        "params": {"sampler": "SimulatedAnnealingSampler", "num_reads": 100}
    }
}

print(f"Defined {len(SOLVERS_TO_RUN)} solvers to run.")
print(f"Solvers: {list(SOLVERS_TO_RUN.keys())}")

## 3. Experiment Execution Loop

This cell iterates through all solvers and all 27 instances, applying the correct run protocol (1 run or 5 runs) for each.

It builds a list of result dictionaries, `all_results`, which matches the format required by the project plan.
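
As a quick sanity check on the run protocol, the sketch below reproduces the seed assignment and the expected task count in isolation. The constants are the ones configured earlier in this notebook; the helper names `plan_runs` and `count_tasks` are hypothetical and exist only for this illustration. The flags mirror `SOLVERS_TO_RUN`, which marks 6 solvers as deterministic and 9 as randomized.

```python
from typing import Dict, List, Optional, Tuple

N_RUNS_RANDOMIZED = 5
BASE_RUN_SEED = 20241019


def plan_runs(is_randomized: bool) -> List[Tuple[int, Optional[int]]]:
    """Return the (run_index, seed) pairs for one solver on one instance."""
    if not is_randomized:
        return [(0, None)]  # deterministic solvers run once, without a seed
    return [(i, BASE_RUN_SEED + i) for i in range(N_RUNS_RANDOMIZED)]


def count_tasks(solver_flags: Dict[str, bool], n_instances: int) -> int:
    """Total number of solver calls for the given is_randomized flags."""
    return sum(len(plan_runs(flag)) * n_instances for flag in solver_flags.values())


# 6 deterministic and 9 randomized solvers, as flagged in SOLVERS_TO_RUN
flags = {f"deterministic_{k}": False for k in range(6)}
flags.update({f"randomized_{k}": True for k in range(9)})

print(count_tasks(flags, n_instances=27))  # 6*27*1 + 9*27*5 = 1377
```

If the `Total experiment tasks to run` line printed by the next cell differs from this figure, either some instances failed to load or an `is_randomized` flag has changed.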

In [None]:
# Create a flat list of all tasks to run
tasks = []
for solver_name, config in SOLVERS_TO_RUN.items():
    # (n, dist_name, cap_name, instance_dict, instance_path)
    for (n, dist, cap, instance, path) in ALL_INSTANCES:
        
        n_runs = N_RUNS_RANDOMIZED if config["is_randomized"] else 1
        
        for i in range(n_runs):
            # Each randomized run gets a unique, reproducible seed
            run_seed = (BASE_RUN_SEED + i) if config["is_randomized"] else None
            tasks.append((solver_name, config, n, dist, cap, instance, path, run_seed, i, n_runs))

print(f"Total experiment tasks to run: {len(tasks)}")

all_results = []
start_total_time = time.perf_counter()

for task in tqdm(tasks, desc="Running Solvers"):
    (solver_name, config, n, dist, cap, instance, path, run_seed, run_idx, n_runs) = task
    
    solver_func = config["function"]
    
    try:
        # --- Execute the solver and time it ---
        start_run_time = time.perf_counter()
        
        best_solution_ids, best_value, logs = solver_func(
            instance=instance,
            timeout=None, # You could set a timeout here, e.g., 300 seconds
            seed=run_seed
        )
        
        end_run_time = time.perf_counter()
        runtime = end_run_time - start_run_time
        # ----------------------------------------

        # Build the result record as specified in the project plan
        result_record = {
            # --- Solver Info ---
            "method": solver_name,
            "parameters": config["params"],
            "seed": run_seed, # Will be None for deterministic
            "run_index": run_idx,
            
            # --- Instance Info (for easy filtering) ---
            "instance_file": str(path.name),
            "instance_n": n,
            "instance_dist": dist,
            "instance_cap_ratio_str": cap,
            "instance_seed": instance['meta']['seed'],
            
            # --- Core Results ---
            "best_value": best_value,
            "runtime": runtime,
            
            # --- Extra Data ---
            # Omit 'best_solution_ids' from this summary file to keep it small,
            # but store the solver's logs.
            "logs": logs 
        }
        
        all_results.append(result_record)
    
    except Exception as e:
        print(f"!! ERROR running {solver_name} on {path.name} (Run {run_idx+1}) !!")
        print(f"  Error: {e}")
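        # Record the failure with the same identifying fields as a successful
        # run, so downstream code can filter both kinds of record the same way
        all_results.append({
            "method": solver_name,
            "parameters": config["params"],
            "seed": run_seed,
            "run_index": run_idx,
            "instance_file": str(path.name),
            "instance_n": n,
            "instance_dist": dist,
            "instance_cap_ratio_str": cap,
            "instance_seed": instance.get("meta", {}).get("seed"),
            "best_value": None,
            "runtime": None,
            "logs": {"error": str(e)}
        })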

end_total_time = time.perf_counter()
print("\n--- Execution Complete ---")
print(f"Total results collected: {len(all_results)}")
print(f"Total time: {end_total_time - start_total_time:.2f} seconds")

## 4. Save Results to JSON

This final step serializes `all_results` into a single JSON file, which the evaluation notebook reads.
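
The save cell reports a `TypeError` when a solver's `logs` contain values that `json` cannot encode. One possible way to tolerate such values, sketched here under assumptions, is a `default=` hook for `json.dump`. The helper name `to_jsonable` is hypothetical, and the `tolist()` branch relies on NumPy arrays and NumPy scalars exposing that method; whether the project's solvers actually return NumPy objects is not known from this notebook.

```python
import json
from pathlib import Path
from typing import Any


def to_jsonable(obj: Any) -> Any:
    """Fallback for json.dump; called only for objects json cannot encode itself."""
    if hasattr(obj, "tolist"):  # e.g. NumPy arrays and NumPy scalars
        return obj.tolist()
    if isinstance(obj, (set, frozenset)):
        return list(obj)
    if isinstance(obj, Path):
        return str(obj)
    # Anything else is still an error, so unexpected types are not hidden
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Usage: pass the hook through the `default` argument
record = {"logs": {"selected": {3}, "source": Path("instance.json")}}
print(json.dumps(record, default=to_jsonable))
```

Passing `default=to_jsonable` to the `json.dump` call in the next cell would apply the same conversion to `all_results`. Converting values inside each solver before returning `logs` is the stricter alternative and keeps the results file independent of this hook.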

In [None]:
output_path = RESULTS_DIR / "solver_runs.json"

try:
    with open(output_path, "w", encoding="utf8") as f:
        json.dump(all_results, f, indent=2)
    
    print(f"Successfully saved {len(all_results)} results to:")
    print(f"{output_path.resolve()}")

except TypeError as e:
    print(f"Error: Could not serialize results to JSON. {e}")
    print("Check that your 'logs' dictionary contains JSON-compatible types (no numpy arrays, etc.)")
except Exception as e:
    print(f"An unexpected error occurred while saving: {e}")

## 5. Next Steps

1.  **Implement Your Solver**:
    * Open `my_solvers.py`.
    * Fill in the logic for your assigned solvers (e.g., `dynamic_programming_solver`, `genetic_algorithm_solver`).
    * **Crucially**, make sure every randomized solver (e.g., GA, SA, ACO) uses the `seed` parameter, and that metaheuristics save a `convergence_history` list in their `logs`.

2.  **Configure Parameters**:
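    * In this notebook (Cell 5, which defines `SOLVERS_TO_RUN`), adjust the `"params"` dictionary for your solvers (e.g., population size, temperatures). The execution loop only records these values in the results and does not pass them to the solver, so keep them in sync with the settings used inside `my_solvers.py`.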

3.  **Run the Experiment**:
    * Run this entire notebook (`solver_template.ipynb`) from top to bottom.
    * This will execute all solvers defined in `SOLVERS_TO_RUN` on all 27 instances.

4.  **Analyze the Results**:
    * All your data is now in `experiment_results/solver_runs.json`.
    * Open `evaluation_template.ipynb` and run it to generate all summary tables and plots.
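
Before opening the evaluation notebook, a short script can confirm that the results file looks sane. The sketch below groups records by `method` and averages `best_value` and `runtime`, skipping failed runs (those saved with `best_value` set to `None`). The function name `summarize_by_method` is hypothetical, and this is not the analysis performed by `evaluation_template.ipynb`, whose contents are not shown here.

```python
import json
from collections import defaultdict
from pathlib import Path
from statistics import mean
from typing import Dict, List


def summarize_by_method(records: List[Dict]) -> Dict[str, Dict[str, float]]:
    """Average best_value and runtime per solver, ignoring failed runs."""
    grouped = defaultdict(list)
    for rec in records:
        if rec.get("best_value") is None:  # error records carry no result
            continue
        grouped[rec["method"]].append(rec)
    return {
        method: {
            "runs": len(recs),
            "mean_best_value": mean(r["best_value"] for r in recs),
            "mean_runtime": mean(r["runtime"] for r in recs),
        }
        for method, recs in grouped.items()
    }


results_path = Path("experiment_results") / "solver_runs.json"
if results_path.exists():  # only runs once the experiment has produced output
    with results_path.open("r", encoding="utf8") as f:
        summary = summarize_by_method(json.load(f))
    for method, stats in summary.items():
        print(method, stats)
```

A solver that is missing from the summary failed on every run, which usually means its function in `my_solvers.py` is still a placeholder.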