# Comprehensive Benchmarking of Parallel QAOA Portfolio Optimization with DWE

This notebook systematically benchmarks the performance and solution quality of our parallel QAOA implementation for portfolio optimization, leveraging the Domain Wall Encoding (DWE) technique. As a telecom engineer with a mission to raise quantum technology awareness, I expect this empirical data to be invaluable.

We will evaluate how varying problem size, QAOA depth, optimizer effort, number of parallel processes, and transpilation levels impact both the time to solution and the quality of the results.

**Key enhancements in this benchmarking notebook:**
1.  **Systematic Parameter Variation:** We'll iterate through predefined ranges of `N`, `p`, `max_iterations_optimizer`, `num_parallel_runs`, and `transpilation_level`.
2.  **Robust Data Collection:** Results from each run will be collected into a Pandas DataFrame.
3.  **Comprehensive Metrics:** We'll track wall-clock time, solution energy, approximation ratio (where applicable), and details of each optimization attempt.
4.  **Automated Visualization:** (Placeholder for future plots to illustrate trends and insights from the benchmarking data).
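
The approximation ratio in item 3 follows the definition used later in this notebook, `(E - E_min) / (E_max - E_min)`, so a value of 0 means the brute-force optimum was reached. Below is a minimal sketch of that calculation as a standalone helper. The function name `approximation_ratio` and the tolerance `tol` are illustrative assumptions, not part of the original worker code.

```python
import math

def approximation_ratio(energy, e_min, e_max, tol=1e-12):
    """Normalized distance of `energy` from the classical optimum (0 is best)."""
    if e_min is None or e_max is None:
        return None  # brute-force reference unavailable
    span = e_max - e_min
    if math.isclose(span, 0.0, abs_tol=tol):
        # Degenerate spectrum: every feasible solution has the same energy.
        return 0.0 if math.isclose(energy, e_min, abs_tol=tol) else math.inf
    return (energy - e_min) / span

# Toy numbers chosen only to exercise the helper.
print(approximation_ratio(-0.5, -1.0, 1.0))
print(approximation_ratio(-0.5, None, 1.0))
```

Comparing with a tolerance rather than `==` avoids misclassifying the degenerate case because of floating-point rounding.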

---

## 1. Setup, Imports, and Worker Function

This section contains all necessary imports and the `run_single_optimization` worker function, originally from `qaoa_parallel_optimizer_worker.py`. This ensures the notebook is self-contained.
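
The cost function defined in this section averages a quadratic energy, `sum_{i<j} J[i,j] x_i x_j + sum_i h[i] x_i`, over the measured bitstrings, weighted by their shot counts. The sketch below reproduces that averaging on hand-written toy data so it can be checked without a simulator. The helper name `expected_energy_from_counts` and the toy `J`, `h`, and counts are made up for illustration.

```python
import numpy as np

def expected_energy_from_counts(counts, J, h):
    """Shot-weighted mean of sum_{i<j} J_ij x_i x_j + sum_i h_i x_i."""
    n = len(next(iter(counts)))  # number of measured bits
    # Dense upper-triangular coupling matrix; diagonal keys are ignored.
    J_mat = np.zeros((n, n))
    for (i, j), val in J.items():
        if i != j:
            J_mat[min(i, j), max(i, j)] += val
    h_vec = np.array([h.get(i, 0.0) for i in range(n)])

    total_shots = sum(counts.values())
    energy = 0.0
    for bitstring, count in counts.items():
        # Qiskit prints qubit 0 as the rightmost character, hence the reversal.
        x = np.array([int(b) for b in bitstring[::-1]], dtype=float)
        energy += count * (x @ J_mat @ x + h_vec @ x)
    return energy / total_shots

# Toy data (assumed, not produced by any simulator run).
toy_J = {(0, 1): 0.5, (1, 2): -0.25}
toy_h = {0: -1.0, 1: 0.2, 2: 0.0}
toy_counts = {"011": 3, "100": 1}
print(expected_energy_from_counts(toy_counts, toy_J, toy_h))
```

Here `"011"` maps to `x = [1, 1, 0]` with energy `0.5 - 1.0 + 0.2 = -0.3`, and `"100"` maps to `x = [0, 0, 1]` with energy 0, so the weighted mean is `-0.225`.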

In [None]:
import numpy as np
import os
import time
import concurrent.futures # For parallel processing (though we'll use multiprocessing.Pool)
from multiprocessing import Pool # Explicitly use Pool for robustness
import json
import pandas as pd # For data collection and analysis
import matplotlib.pyplot as plt # For visualization
import itertools # For iterating over all parameter combinations

from qokit.qaoa_circuit_portfolio import get_parameterized_qaoa_circuit
from qiskit.circuit import ParameterVector, QuantumCircuit
from qiskit import transpile
from scipy.optimize import minimize

# --- Qiskit Simulator Setup ---
# Note: each worker process creates its own simulator instance (see run_single_optimization),
# which avoids relying on the backend being picklable. The module-level backend below
# mainly verifies at import time that Qiskit Aer is installed.
try:
    from qiskit_aer import AerSimulator
    GLOBAL_SIMULATOR_BACKEND = AerSimulator()
    print("Initialized AerSimulator for global use (can be re-transpiled per worker).")
except ImportError:
    try:
        from qiskit.providers.aer import Aer
        GLOBAL_SIMULATOR_BACKEND = Aer.get_backend('qasm_simulator')
        print("Initialized Aer.get_backend('qasm_simulator') for global use.")
    except ImportError:
        raise ImportError("Qiskit Aer backend not found. Please install qiskit-aer (pip install qiskit-aer).")


# --- Cost Function for QAOA (from qaoa_parallel_optimizer_worker.py) ---
def qaoa_cost_function(params, po_problem_arg, p_layers, num_shots_simulator, transpilation_level, cost_function_calls, simulator_backend):
    """
    Computes the QAOA cost function for a given set of parameters (betas and gammas).
    This function will be minimized by scipy.optimize.
    """
    cost_function_calls[0] += 1  # Increment call counter

    gammas = params[:p_layers]
    betas = params[p_layers:]

    qaoa_circuit = get_parameterized_qaoa_circuit(
        po_problem=po_problem_arg,
        depth=p_layers,
        gamma=gammas,
        beta=betas
    )
    qaoa_circuit.measure_all()

    # Transpile for the simulator with specified optimization_level
    transpiled_circuit = transpile(qaoa_circuit, simulator_backend, optimization_level=transpilation_level)

    job = simulator_backend.run(transpiled_circuit, shots=num_shots_simulator)
    result = job.result()
    counts = result.get_counts(transpiled_circuit)

    expected_energy = 0
    total_shots = sum(counts.values())

    if total_shots == 0:
        print(f"Warning: No shots recorded for bitstring counts in cost function. Counts: {counts}")
        return np.inf

    J = po_problem_arg["J"]
    h = po_problem_arg["h"]

    for bitstring, count in counts.items():
        x = np.array([int(b) for b in bitstring[::-1]])
        energy_for_bitstring = 0

        for i in range(len(x)):
            for j in range(i + 1, len(x)):
                if (i, j) in J:
                    energy_for_bitstring += J[(i, j)] * x[i] * x[j]
                elif (j, i) in J:
                    energy_for_bitstring += J[(j, i)] * x[i] * x[j]

        for i in range(len(x)):
            if i in h:
                energy_for_bitstring += h[i] * x[i]

        expected_energy += energy_for_bitstring * count

    return expected_energy / total_shots


# --- Worker Function for Parallel Processing (from qaoa_parallel_optimizer_worker.py) ---
def run_single_optimization(initial_point_tuple, po_problem_arg, p_layers, max_iterations_optimizer,
                            num_shots_simulator, run_id, transpilation_level):
    """
    Performs a single QAOA optimization run from a given initial point.
    This function is designed to be run in parallel processes.
    """
    start_time = time.perf_counter()
    initial_point = np.array(initial_point_tuple)

    # Initialize simulator within the worker for robustness with multiprocessing.
    # AerSimulator is undefined (NameError) when the import cell fell back to legacy Aer.
    try:
        simulator = AerSimulator()
    except NameError:
        from qiskit.providers.aer import Aer
        simulator = Aer.get_backend('qasm_simulator')

    print(f"Run {run_id}: Starting optimization from initial point: {np.round(initial_point, 3)} with N={po_problem_arg['N']}, p={p_layers}, MaxIter={max_iterations_optimizer}, TL={transpilation_level}")

    cost_function_calls = [0]

    try:
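        # gammas in [0, 2*pi], betas in [0, pi].
        # Note: older SciPy releases ignore `bounds` for COBYLA (with a warning); newer ones honor them.
        bounds = [(0, 2 * np.pi)] * p_layers + [(0, np.pi)] * p_layers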

        result = minimize(qaoa_cost_function, initial_point,
                          args=(po_problem_arg, p_layers, num_shots_simulator, transpilation_level, cost_function_calls, simulator),
                          method='COBYLA', bounds=bounds,
                          options={'maxiter': max_iterations_optimizer, 'disp': False})

        end_time = time.perf_counter()
        runtime = end_time - start_time

        return {
            "run_id": run_id,
            "optimal_params": result.x.tolist(),
            "optimal_energy": result.fun,
            "nfev": result.nfev,
            "success": result.success,
            "message": result.message,
            "runtime_seconds": runtime
        }

    except Exception as e:
        end_time = time.perf_counter()
        runtime = end_time - start_time
        print(f"Run {run_id}: Optimization failed - {e}")
        return {
            "run_id": run_id,
            "optimal_energy": float('inf'),
            "optimal_params": None,
            "nfev": cost_function_calls[0],
            "success": False,
            "message": str(e),
            "runtime_seconds": runtime
        }


## 2. Define Benchmarking Parameters and Fixed Problem Setup

Here, we define the ranges for the parameters we want to benchmark and set up the fixed problem parameters for the portfolio optimization with DWE.
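
The `lambda_sum` coefficient below adds a sum-constraint penalty to the `J` and `h` coefficients. Assuming the penalty has the standard quadratic form `lambda * (sum_i x_i - K)^2` for binary `x` (an assumption; the notebook does not spell out the form), it expands into a pairwise term `2 * lambda`, a linear term `lambda * (1 - 2K)`, and a constant `lambda * K^2`. The sketch below checks that expansion numerically over all bitstrings. The helper names are hypothetical.

```python
import itertools
import numpy as np

def penalty_coefficients(K, lam):
    """Pairwise, linear, and constant terms of lam * (sum(x) - K)**2 for binary x."""
    return 2 * lam, lam * (1 - 2 * K), lam * K**2

def expansion_holds(N, K, lam):
    """Compare the direct penalty with its expanded form on every N-bit string."""
    pair, linear, const = penalty_coefficients(K, lam)
    for bits in itertools.product([0, 1], repeat=N):
        x = np.array(bits)
        direct = lam * (x.sum() - K) ** 2
        pair_sum = sum(x[i] * x[j] for i, j in itertools.combinations(range(N), 2))
        expanded = pair * pair_sum + linear * x.sum() + const
        if not np.isclose(direct, expanded):
            return False
    return True

print(penalty_coefficients(K=3, lam=1.0))
print(expansion_holds(N=5, K=2, lam=0.7))
```

Under this assumption the linear coefficient depends on `K`: it equals `-5 * lambda` only when `K = 3`.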

In [None]:
# --- Benchmarking Parameter Ranges ---
num_parallel_runs_values = [1, 2, 3] # Number of parallel multi-start runs
p_values = [1, 30, 60] # Number of QAOA layers (depth)
max_iterations_optimizer_values = [50, 300, 700] # Max iterations for classical optimizer
N_values = [5, 8, 15] # Number of assets
transpilation_level_values = [0, 3] # Transpilation optimization level

# --- Fixed Problem Parameters ---
num_shots_simulator = 256 # Number of shots for quantum circuit simulation
q = 0.5 # Risk aversion parameter
lambda_sum = 0 # DWE-inspired penalty coefficient for sum constraint (0 disables the penalty).

# Seed for reproducibility of problem definition (different for each N in benchmark, if desired)
PROBLEM_SEED = 42 # For consistent portfolio problem generation across runs if N is fixed

# --- Helper Function to Define Portfolio Optimization Problem (as in your original notebook) ---
def define_po_problem(N_val, K_val, q_val, lambda_sum_val, problem_seed=None):
    if problem_seed is not None:
        np.random.seed(problem_seed)

    mu = np.random.uniform(0.05, 0.20, N_val)
    Sigma = np.random.uniform(0.001, 0.015, (N_val, N_val))
    Sigma = (Sigma + Sigma.T) / 2 # Make it symmetric
    Sigma = Sigma + np.eye(N_val) * 0.005 # Add a small diagonal component for stability

    factor_J_obj = (2 * q_val) / (K_val**2)
    factor_h_linear_obj = -1 / K_val
    factor_h_diagonal_obj = q_val / (K_val**2)

    J_coeffs_objective = {}
    h_coeffs_objective = {}

    for i in range(N_val):
        for j in range(i + 1, N_val):
            J_coeffs_objective[(i, j)] = factor_J_obj * Sigma[i, j]

    for i in range(N_val):
        h_coeffs_objective[i] = factor_h_linear_obj * mu[i] + factor_h_diagonal_obj * Sigma[i, i]

    J_coeffs_total = {}
    h_coeffs_total = {}

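    # Expanding lambda * (sum_i x_i - K)^2 for binary x adds 2 * lambda to every pair term
    # and lambda * (1 - 2K) to every linear term (the constant lambda * K^2 is dropped).
    for (i, j), val in J_coeffs_objective.items():
        J_coeffs_total[(i, j)] = val + 2 * lambda_sum_val

    for i, val in h_coeffs_objective.items():
        h_coeffs_total[i] = val + (1 - 2 * K_val) * lambda_sum_val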

    po_problem_dict = {
        "N": N_val,
        "K": K_val,
        "q": q_val,
        "J": J_coeffs_total,
        "h": h_coeffs_total,
        "means": mu,
        "cov": Sigma,
        "q_orig": q_val,
        "scale": 1.0
    }
    return po_problem_dict, mu, Sigma # Return mu and Sigma for brute-force if needed

# Function to evaluate the classical objective energy for a given bitstring
def evaluate_bitstring_energy(bitstring_array, original_mu, original_Sigma, original_q):
    x = bitstring_array
    portfolio_variance = np.dot(x, np.dot(original_Sigma, x))
    portfolio_return = np.dot(original_mu, x)
    return original_q * portfolio_variance - portfolio_return




## 3. Benchmarking Execution Loop

This is the main section that orchestrates the benchmarking. It iterates through all combinations of parameters, runs the parallel QAOA optimization, collects the results, and stores them in a Pandas DataFrame.
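
As a rough illustration of the size of the sweep, the sketch below enumerates the same Cartesian product with `itertools.product`, using the parameter ranges from Section 2. The dictionary keys and the `grid` variable are illustrative names only.

```python
import itertools

# Parameter ranges copied from Section 2.
num_parallel_runs_values = [1, 2, 3]
p_values = [1, 30, 60]
max_iterations_optimizer_values = [50, 300, 700]
N_values = [5, 8, 15]
transpilation_level_values = [0, 3]

keys = ["num_parallel_runs", "p", "max_iter", "N", "transpilation_level"]
grid = [
    dict(zip(keys, combo))
    for combo in itertools.product(
        num_parallel_runs_values,
        p_values,
        max_iterations_optimizer_values,
        N_values,
        transpilation_level_values,
    )
]

print(len(grid))   # number of configurations in the sweep
print(grid[0])     # first configuration
print(grid[1])     # the last-listed parameter varies fastest
```

With these ranges the sweep covers 3 x 3 x 3 x 3 x 2 = 162 configurations. Materializing the product as a list also makes `len(grid)` available directly, instead of multiplying the range lengths by hand.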

In [None]:
if __name__ == "__main__":
    all_benchmarking_results = []

    # Create a list of all parameter combinations
    param_combinations = itertools.product(
        num_parallel_runs_values,
        p_values,
        max_iterations_optimizer_values,
        N_values,
        transpilation_level_values
    )

    total_runs_count = len(num_parallel_runs_values) * \
                       len(p_values) * \
                       len(max_iterations_optimizer_values) * \
                       len(N_values) * \
                       len(transpilation_level_values)

    print(f"Total benchmarking configurations to run: {total_runs_count}")
    current_run_idx = 0

    for current_num_parallel_runs, current_p, current_max_iter, current_N, current_transpilation_level in param_combinations:
        current_run_idx += 1
        print(f"\n--- Starting Benchmarking Run {current_run_idx}/{total_runs_count} ---")
        print(f"Parameters: num_parallel_runs={current_num_parallel_runs}, p={current_p}, max_iter={current_max_iter}, N={current_N}, TL={current_transpilation_level}")

        # --- Define Problem Parameters dynamically for current N ---
        current_K = int(current_N * 0.4)
        po_problem, mu_for_bruteforce, cov_for_bruteforce = define_po_problem(
            current_N, current_K, q, lambda_sum, problem_seed=PROBLEM_SEED
        )

        # --- Calculate Brute-Force Classical Optimal Energy (if feasible) ---
        E_min_classical = None
        E_max_classical = None
        if current_N <= 20: # Keep brute-force limited for practical runtime
            try:
                from qokit.portfolio_optimization import portfolio_brute_force
                original_po_problem_for_brute_force = {
                    "N":current_N, "K":current_K, "q":q, "means":mu_for_bruteforce, "cov":cov_for_bruteforce, "scale": 1.0
                }
                E_min_classical, E_max_classical = portfolio_brute_force(original_po_problem_for_brute_force, return_bitstring=False)
                print(f"Brute-force E_min: {E_min_classical:.6f}, E_max: {E_max_classical:.6f}")
            except ImportError:
                print("qokit.portfolio_optimization.portfolio_brute_force not available.")
            except Exception as e:
                print(f"Brute-force calculation failed for N={current_N}: {e}")
        else:
            print(f"Brute-force calculation skipped for N={current_N} (too large).")

        # --- Generate Initial Points for Multi-Start ---
        initial_points = []
        np.random.seed(current_run_idx) # Use run_idx as seed for initial points to ensure variety
        for _ in range(current_num_parallel_runs):
            gammas_initial = np.random.uniform(0, 2 * np.pi, current_p)
            betas_initial = np.random.uniform(0, np.pi, current_p)
            initial_points.append(tuple(np.concatenate((gammas_initial, betas_initial))))

        # --- Parallel Execution ---
        start_overall_time = time.perf_counter()

        worker_args = [(initial_point, po_problem, current_p, current_max_iter, num_shots_simulator, i + 1, current_transpilation_level)
                       for i, initial_point in enumerate(initial_points)]

        individual_run_results = []
        try:
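            # Size the pool to the number of runs so idle workers do not add startup overhead to the timing.
            with Pool(processes=min(current_num_parallel_runs, os.cpu_count() or 1)) as pool: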
                results = [pool.apply_async(run_single_optimization, args) for args in worker_args]

                for future in results:
                    individual_run_results.append(future.get())
        except Exception as e:
            print(f"Error during parallel execution for current configuration: {e}")
            # Append a partial result to indicate failure for this config
            individual_run_results = [{'success': False, 'message': str(e), 'optimal_energy': float('inf'), 'runtime_seconds': (time.perf_counter() - start_overall_time)}]

        end_overall_time = time.perf_counter()
        overall_duration = end_overall_time - start_overall_time
        print(f"All parallel optimization runs for current config completed in {overall_duration:.2f} seconds.")

        # --- Process and Summarize Results for current config ---
        best_overall_energy = float('inf')
        best_overall_params = None
        num_successful_individual_runs = 0
        all_individual_energies = []

        if individual_run_results:
            for result in individual_run_results:
                if result['success']:
                    num_successful_individual_runs += 1
                    all_individual_energies.append(result['optimal_energy'])
                if result['optimal_energy'] < best_overall_energy:
                    best_overall_energy = result['optimal_energy']
                    best_overall_params = result['optimal_params']

        approximation_ratio = None
        if E_min_classical is not None and E_max_classical is not None and (E_max_classical - E_min_classical) != 0:
            approximation_ratio = (best_overall_energy - E_min_classical) / (E_max_classical - E_min_classical)
        elif E_min_classical is not None and E_max_classical is not None and (E_max_classical - E_min_classical) == 0:
            # Special case for AR when range is zero (all solutions are optimal)
            approximation_ratio = 0.0 if best_overall_energy == E_min_classical else np.inf # Or handle as NaN

        # Store results for this configuration
        config_results = {
            'N': current_N,
            'K': current_K,
            'p': current_p,
            'num_parallel_runs': current_num_parallel_runs,
            'max_iterations_optimizer': current_max_iter,
            'num_shots_simulator': num_shots_simulator,
            'q': q,
            'lambda_sum': lambda_sum,
            'transpilation_level': current_transpilation_level,
            'overall_runtime_seconds': overall_duration,
            'best_overall_energy': best_overall_energy,
            'approximation_ratio': approximation_ratio,
            'E_min_classical': E_min_classical,
            'E_max_classical': E_max_classical,
            'num_successful_individual_runs': num_successful_individual_runs,
            'total_individual_runs_attempted': current_num_parallel_runs,
            'all_individual_energies': all_individual_energies # Store for detailed analysis if needed
        }
        all_benchmarking_results.append(config_results)

    print("\n--- Benchmarking Completed ---")
    # Convert results to a Pandas DataFrame
    results_df = pd.DataFrame(all_benchmarking_results)
    print("\nBenchmarking Results DataFrame Head:")
    print(results_df.head())

    # Optional: Save results to CSV or other format
    # results_df.to_csv("qaoa_benchmarking_results.csv", index=False)
    # print("Results saved to qaoa_benchmarking_results.csv")




## 4. Results Analysis and Visualization (Future)

In this section, analyze `results_df` and create visualizations to understand how the varied parameters affect performance and solution quality.

### Example Analysis Ideas:
- Plot `overall_runtime_seconds` vs. `N` (for different `p` values).
- Plot `best_overall_energy` vs. `p` (for different `N` values).
- Plot `approximation_ratio` vs. `max_iterations_optimizer`.
- Analyze the success rate (`num_successful_individual_runs / total_individual_runs_attempted`).
- Create heatmaps or 3D plots to show interactions between multiple parameters.

You can use libraries like `matplotlib.pyplot` and `seaborn` for creating insightful plots.
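
Several of the ideas above reduce to a group-and-aggregate step before plotting. The sketch below shows one way to do that with pandas on a small synthetic table. The numbers in `toy_df` are placeholders invented for illustration, not benchmark results; only the column names match those collected in Section 3.

```python
import pandas as pd

# Synthetic placeholder rows (NOT measured data); column names follow results_df.
toy_df = pd.DataFrame({
    "N": [5, 5, 5, 5, 8, 8],
    "p": [1, 1, 30, 30, 1, 30],
    "best_overall_energy": [-0.10, -0.12, -0.15, -0.14, -0.20, -0.22],
    "num_successful_individual_runs": [1, 2, 2, 3, 1, 3],
    "total_individual_runs_attempted": [1, 2, 3, 3, 2, 3],
})

# Per-configuration success rate.
toy_df["success_rate"] = (
    toy_df["num_successful_individual_runs"]
    / toy_df["total_individual_runs_attempted"]
)

# Collapse configurations that share (N, p): keep the best energy and the mean success rate.
summary = toy_df.groupby(["N", "p"], as_index=False).agg(
    best_energy=("best_overall_energy", "min"),
    mean_success_rate=("success_rate", "mean"),
)

# Rows indexed by p, one column per N: ready for a line plot or heatmap.
pivot = summary.pivot(index="p", columns="N", values="best_energy")
print(summary)
print(pivot)
```

Aggregating first matters because the sweep produces several rows for each `(N, p)` pair, one per combination of the remaining parameters.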

In [None]:
# Example of basic plotting (uncomment and run after the main loop finishes)
# plt.figure(figsize=(10, 6))
# for N_val in N_values:
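#     subset = results_df[results_df['N'] == N_val]
#     # Several configurations share each (N, p); plot the best energy per p.
#     best_per_p = subset.groupby('p')['best_overall_energy'].min()
#     plt.plot(best_per_p.index, best_per_p.values, marker='o', label=f'N={N_val}')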
# plt.xlabel('QAOA Layers (p)')
# plt.ylabel('Best Overall Energy')
# plt.title('Best Energy vs. QAOA Layers for different N')
# plt.legend()
# plt.grid(True)
# plt.show()

# You can also save the DataFrame to a CSV for external analysis:
# results_df.to_csv('qaoa_benchmarking_results.csv', index=False)
# print("Benchmarking results saved to 'qaoa_benchmarking_results.csv'")