## Environment Setup: Installing Required Python Libraries

This cell installs all external Python libraries required for the notebook to run successfully. It ensures that the computational environment has the necessary quantum computing, scientific computing, and data analysis dependencies.

### Quantum computing libraries
* `qiskit`: IBM’s open-source framework for quantum computing, used to construct, simulate, and run quantum circuits.

* `qiskit-aer`: Provides high-performance simulators for quantum circuits, enabling execution without access to real quantum hardware.

* `qiskit-algorithms`: Contains high-level quantum algorithms (e.g., optimization, variational algorithms, and sampling routines) built on top of Qiskit primitives.

### Scientific computing and data handling libraries


* `numpy`: Core numerical library for array-based computations and linear algebra.

* `pandas`: Used for structured data manipulation, tabular datasets, and data analysis workflows.

* `matplotlib`: Provides plotting and visualization utilities for results and diagnostics.

* `scipy`: Supplies scientific routines such as optimization, statistics, and numerical solvers.

* `openpyxl`: Enables reading from and writing to Excel (.xlsx) files, useful for exporting results.

In [None]:
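#qiskit-optimization provides QuadraticProgram and QuadraticProgramToQubo, which are imported later
!pip install -q qiskit qiskit-aer qiskit-algorithms qiskit-optimization
!pip install -q numpy pandas matplotlib scipy openpyxl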


In [None]:
import qiskit
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

print(f"Qiskit version: {qiskit.__version__}")
print(f"NumPy version: {np.__version__}")
print(f"Pandas version: {pd.__version__}")

Qiskit version: 2.2.3
NumPy version: 2.0.2
Pandas version: 2.2.2


## Importing dependencies and configuring the environment

**Core scientific and data libraries**: `numpy`, `pandas`

**Type hinting and structured data utilities**:
* `typing`: enables static type hints for improved code readability and maintainability. `Dict`, `List`, `Tuple`, `Optional` are used to clearly specify function inputs and outputs.

* `dataclass`: facilitates the creation of lightweight classes for storing structured data with minimal boilerplate code.

**Quantum computing imports**

* `QuantumCircuit`: Core object used to define quantum circuits.

* `QAOA`: The Quantum Approximate Optimization Algorithm implementation from `qiskit_algorithms`.

* `COBYLA`: Gradient-free classical optimizer suited for small-scale variational problems.

* `SPSA`: Noise-robust classical optimizer commonly used in quantum settings.

* `Sampler`: Executes quantum circuits and returns measurement samples.

* `QuadraticProgram`: Represents optimization problems in a mathematical programming form.

* `QuadraticProgramToQubo`: Converts constrained quadratic programs into QUBO (Quadratic Unconstrained Binary Optimization) form, the binary quadratic form that QAOA-based solvers work with.

**Availability flag**

`QISKIT_AVAILABLE`:

* Set to `True` if all Qiskit imports succeed.

* Set to `False` if Qiskit is not installed or unavailable.

* This flag allows the rest of the notebook to gracefully degrade to classical alternatives or skip quantum-specific sections when quantum libraries are missing.

In [None]:
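import numpy as np
import pandas as pd
from typing import Dict, List, Tuple, Optional
from dataclasses import dataclass
import warnings
warnings.filterwarnings('ignore')  #Silences all warnings, including deprecation notices

#Quantum imports
try:
  from qiskit import QuantumCircuit
  from qiskit_algorithms import QAOA
  from qiskit_algorithms.optimizers import COBYLA, SPSA
  try:
    from qiskit.primitives import Sampler
  except ImportError:
    #Qiskit 2.x removed the V1 Sampler; StatevectorSampler is the reference V2 sampler.
    #Whether QAOA accepts it depends on the installed qiskit-algorithms version.
    from qiskit.primitives import StatevectorSampler as Sampler
  from qiskit_optimization import QuadraticProgram
  from qiskit_optimization.converters import QuadraticProgramToQubo
  QISKIT_AVAILABLE=True
except ImportError:
  QISKIT_AVAILABLE=False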

## Portfolio optimization configuration
This cell defines a configuration container for the portfolio optimization problem using a Python `dataclass`. It centralizes all tunable parameters controlling risk, constraints, and quantum algorithm settings.

The configuration controls three main aspects of the system:

1. **Portfolio structure and constraints**

It specifies the size of the investment universe, limits how many positions can be changed during rebalancing, and enforces diversification through sector exposure caps. These parameters define the feasible solution space before any optimization is performed.

2. **Risk and cost modeling**

The configuration includes a risk aversion parameter that balances expected returns against portfolio risk, as well as a transaction cost multiplier that penalizes excessive trading. These values directly influence how the optimization objective is constructed, shaping the trade-off between performance and stability.

3. **Quantum optimization behavior**

The configuration determines whether quantum methods are enabled and, if so, how complex the quantum algorithm should be. The `qaoa_depth` parameter controls the expressiveness of the quantum circuit, while the `quantum_enabled` flag allows the system to switch between quantum and classical solvers depending on availability or experimental needs.




In [None]:
@dataclass
class PortfolioConfig:
  """Configuration for portfolio optimization"""
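  portfolio_size: int=20             #Number of assets to hold
  max_position_changes: int=10       #Cap on buys plus sells per rebalance
  risk_aversion: float=0.5           #Weight on the covariance (risk) term
  transaction_cost_mult: float=1.0   #Multiplier applied to per-asset transaction costs
  sector_limit: float=0.4            #Maximum fraction of the portfolio in one sector
  qaoa_depth: int=2                  #Number of QAOA layers (reps)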
  quantum_enabled: bool=True
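
As a usage sketch, the defaults can be overridden per experiment. The snippet below repeats the dataclass so that it runs on its own; the names `small_cfg` and `classical_cfg` are illustrative and do not appear elsewhere in the notebook.

```python
from dataclasses import dataclass, replace, asdict

@dataclass
class PortfolioConfig:
  """Configuration for portfolio optimization (copied from the cell above)"""
  portfolio_size: int=20
  max_position_changes: int=10
  risk_aversion: float=0.5
  transaction_cost_mult: float=1.0
  sector_limit: float=0.4
  qaoa_depth: int=2
  quantum_enabled: bool=True

#Override selected fields at construction time
small_cfg=PortfolioConfig(portfolio_size=8, qaoa_depth=1)

#Derive a classical-only variant without mutating the original
classical_cfg=replace(small_cfg, quantum_enabled=False)

print(asdict(classical_cfg))
```

Using `replace` keeps each experiment's settings in a separate object, which makes side-by-side comparisons of quantum and classical runs easier to reproduce.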


## Portfolio data processing and preparation
**Initialization and Data Preprocessing Flow**

Upon initialization, the processor creates a defensive copy of the input dataset to avoid mutating the original data source. It then immediately triggers an internal preprocessing routine.

This preprocessing step standardizes the dataset by:

* Correcting inconsistent or malformed column names.

* Ensuring that key financial fields are converted to numeric types, coercing invalid values to missing values.

* Handling missing portfolio position data by assuming a neutral (zero) prior position where information is unavailable.

By the end of this stage, the dataset is guaranteed to be clean, numerically consistent, and safe for quantitative analysis.

**Covariance and risk structure modelling**

Once the data is prepared, the processor can generate a realistic covariance matrix that captures asset-level risk and interdependence.

Rather than relying on historical return time series, the covariance structure is synthesized using:

* Market capitalization as a proxy for volatility, based on the empirical observation that smaller-cap assets tend to exhibit higher volatility.

* Sector-based correlation assumptions, where assets within the same sector are modeled as more strongly correlated than assets across different sectors.

The flow proceeds by:

1. Estimating individual asset volatilities from market capitalization.

2. Constructing a correlation matrix that reflects sector relationships.

3. Enforcing symmetry and numerical stability.

4. Converting the correlation structure into a covariance matrix using the estimated volatilities.

This approach provides a plausible risk model even when historical data is sparse or unavailable.
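
The four steps above can be sketched on a toy universe. The market caps and sector labels below are hypothetical, the random noise is omitted for clarity, and the eigenvalue clipping in step 3 is an added safeguard assumed here, not something the notebook's processor performs.

```python
import numpy as np

#Hypothetical inputs: market caps in billions and sector labels
market_cap=np.array([500.0, 20.0, 150.0, 5.0])
sectors=np.array(['Tech', 'Tech', 'Energy', 'Energy'])

#Step 1: volatility shrinks as market cap grows (same formula as the notebook)
vol=0.15+0.35/np.log(market_cap+1)

#Step 2: 0.6 within a sector, 0.1 across sectors, 1.0 on the diagonal
same_sector=sectors[:, None]==sectors[None, :]
corr=np.where(same_sector, 0.6, 0.1)
np.fill_diagonal(corr, 1.0)

#Step 3: symmetrize, then clip negative eigenvalues (assumed safeguard)
corr=(corr+corr.T)/2
eigvals, eigvecs=np.linalg.eigh(corr)
corr_psd=eigvecs@np.diag(np.clip(eigvals, 0.0, None))@eigvecs.T

#Step 4: covariance_ij = corr_ij * vol_i * vol_j
cov=corr_psd*np.outer(vol, vol)
print(np.round(cov, 4))
```

Symmetrizing alone does not guarantee a positive semi-definite matrix once random noise is added, which is why an explicit eigenvalue check can be worthwhile before the matrix is used as a risk model.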

**Risk-adjusted performance estimation**

Using the synthesized covariance structure, the processor computes risk-adjusted performance metrics for each asset.

Specifically, it calculates Sharpe ratios by:

* Extracting individual asset volatilities from the covariance matrix.

* Adjusting expected returns by a configurable risk-free rate.

* Normalizing returns by asset-level risk.

The resulting Sharpe ratios provide a standardized measure of return per unit of risk and serve as a key ranking signal for asset selection.
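
A minimal worked example of the calculation, using three hypothetical assets (`AAA`, `BBB`, `CCC`) and a diagonal covariance matrix chosen so the volatilities are easy to read off:

```python
import numpy as np
import pandas as pd

#Hypothetical expected returns and a diagonal covariance matrix
expected_return=pd.Series([0.12, 0.06, 0.02], index=['AAA', 'BBB', 'CCC'])
covariance=np.diag([0.04, 0.01, 0.0025])  #Variances, so volatilities are 0.2, 0.1, 0.05
risk_free_rate=0.02

#Volatility is the square root of each diagonal (variance) entry
volatility=np.sqrt(np.diag(covariance))

#Sharpe ratio = excess return per unit of volatility
sharpe=(expected_return-risk_free_rate)/volatility

print(sharpe.sort_values(ascending=False))
```

An asset whose expected return equals the risk-free rate gets a Sharpe ratio of zero regardless of its volatility, so it ranks below any asset with positive excess return.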

**Intelligent asset universe reduction**

To ensure computational tractability—especially for quantum optimization—the processor includes a smart asset screening mechanism.

This screening stage reduces a potentially large asset universe to a manageable subset by:

* Ranking assets based on their Sharpe ratios.

* Enforcing minimum representation from each sector to preserve diversification.

* Selecting top-performing assets within each sector in a first pass.

* Filling any remaining slots with the highest-ranked assets overall, regardless of sector.

The final output is a compact, diversified asset set that balances risk-adjusted performance with sector coverage, making it suitable for both classical and quantum portfolio optimization.


In [None]:
class PortfolioDataProcessor:
  """Processes and prepares portfolio data"""
  def __init__(self, df: pd.DataFrame):
    self.df=df.copy()
    self._preprocess_data()

  def _preprocess_data(self):
    """Clean and prepare the dataset"""
    #Fix column name
    if '0Market_Cap (billions)' in self.df.columns:
      self.df.rename(columns={'0Market_Cap (billions)':'Market_Cap'}, inplace=True)

    #Ensure numeric types
    numeric_cols=['Expected_Return', 'Transaction_Cost','Market_Cap', 'Previous_Position']
    for col in numeric_cols:
      if col in self.df.columns:
        self.df[col]=pd.to_numeric(self.df[col], errors='coerce')

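    #Fill any missing previous positions with 0
    #(assignment instead of inplace=True on a column selection, which newer pandas versions warn about)
    self.df['Previous_Position']=self.df['Previous_Position'].fillna(0)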

  def generate_sector_covariance(self)-> np.ndarray:
    """
    Generate realistic covariance matrix based on sector relationships
    and market cap (as proxy for volatility)
    """
    n=len(self.df)

    #Calculate volatility based on market cap (inverse relationship)
    #Smaller companies typically have higher volatility
    volatilities=0.15+0.35 /np.log(self.df['Market_Cap']+1)

    #Create correlation matrix based on sectors
    sectors=self.df['Sector'].values
    correlation=np.zeros((n, n))

    for i in range(n):
      for j in range(n):
        if i==j:
          correlation[i, j]=1.0
        elif sectors[i]==sectors[j]:
          #Same sector: high correlation, 0.6 plus small Gaussian noise (std 0.01)
          correlation[i, j]=0.6+0.1*np.random.randn()*0.1
        else:
          #Different sector: low correlation, 0.1 plus small Gaussian noise (std 0.005)
          correlation[i, j]=0.1+0.05*np.random.randn()*0.1

    #Symmetrize by averaging the two independent noise draws for each pair
    #Note: this alone does not guarantee positive semi-definiteness
    correlation=(correlation+correlation.T)/2

    #Convert to covariance using volatilities
    vol_matrix=np.outer(volatilities, volatilities)
    covariance=correlation*vol_matrix

    return covariance

  def calculate_sharpe_ratios(self, risk_free_rate: float=0.02)-> pd.Series:
    """Calculate Sharpe ratio for each asset"""
    volatilities=np.sqrt(np.diag(self.generate_sector_covariance()))
    sharpe_ratios=(self.df['Expected_Return']-risk_free_rate)/volatilities
    return pd.Series(sharpe_ratios, index=self.df.index)

  def smart_asset_screening(self, target_size: int=20)-> pd.DataFrame:
    """
    Intelligently reduce asset universe using Sharpe ratio
    while maintaining sector diversification
    """
    #Calculate Sharpe ratios
    sharpe_ratios=self.calculate_sharpe_ratios()
    self.df['Sharpe_Ratio']=sharpe_ratios

    #Get sector distribution
    sectors=self.df['Sector'].unique()
    min_per_sector=max(2, target_size//len(sectors))

    selected_assets=[]

    #First pass: select top assets from each sector
    for sector in sectors:
      sector_assets=self.df[self.df['Sector']==sector].nlargest(
          min_per_sector, 'Sharpe_Ratio'
      )
      selected_assets.append(sector_assets)

    selected_df=pd.concat(selected_assets)

    #Second pass: Fill remaining slots with best overall Sharpe ratios
    if len(selected_df)<target_size:
      remaining=self.df[~self.df.index.isin(selected_df.index)]
      additional=remaining.nlargest(
          target_size-len(selected_df), 'Sharpe_Ratio'
      )
      selected_df=pd.concat([selected_df, additional])

    return selected_df.head(target_size).reset_index(drop=True)

## QUBO formulator
This component is responsible for translating the portfolio optimization problem into a mathematical form suitable for quantum optimization. It serves as the bridge between financial intuition (returns, risk, costs, constraints) and the formal representation required by the quantum algorithms.

After the portfolio data has been cleaned and a covariance matrix constructed, the QUBO formulator takes over. Its task is to encode the entire optimization objective into a single quadratic energy function over binary decision variables, where each variable indicates whether an asset is included in the portfolio.

The output of this module is a mathematical object that can be consumed by a quantum algorithm via an equivalent Ising Hamiltonian.

## QUBO construction logic
The optimization problem is constructed in two parts: linear terms and quadratic terms.

**Linear Terms — Asset-Level Effects**

The linear component captures effects that depend on individual asset selection:

* Expected returns are encoded with a negative sign so that maximizing return becomes equivalent to minimizing energy.

* Transaction costs penalize changes from previous portfolio positions, discouraging unnecessary turnover.

* The formulation approximates position changes using asymmetric penalties for buying and selling, allowing prior portfolio state to influence the optimization outcome.

**Quadratic Terms — Risk and Interactions**

The quadratic component captures interactions between assets:

* Portfolio risk is modeled through the covariance matrix, scaled by a configurable risk-aversion parameter.

* This term penalizes combinations of assets that jointly increase portfolio variance, encouraging diversification at the optimization level.

**Portfolio Size Constraint via Penalty Method**

Since QUBO problems are unconstrained by definition, the portfolio size constraint is enforced using a quadratic penalty formulation.

The logic is as follows:

* The desired number of selected assets is specified in the configuration.

* A large penalty weight is computed dynamically to dominate the objective whenever the constraint is violated.

* Both linear and quadratic coefficients are adjusted so that solutions selecting too many or too few assets incur a high energy cost.

This converts a hard combinatorial constraint into a soft penalty embedded directly in the QUBO energy function.
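
Written out, the penalty and its expansion are as follows. The symbols $K$ (target portfolio size) and $P$ (penalty weight) are introduced here for illustration, and the expansion uses $x_i^2 = x_i$ for binary variables:

$$
P\Big(\sum_i x_i - K\Big)^2 = P\sum_i x_i + P\sum_{i \neq j} x_i x_j - 2PK\sum_i x_i + PK^2
$$

Under a full-matrix convention $x^\top Q x + c^\top x$, this amounts to adding $P$ to every entry of $Q$ (diagonal and off-diagonal) and subtracting $2PK$ from every entry of $c$. The constant $PK^2$ shifts all energies equally and can be dropped.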

**QUBO Representation**

The result of the formulation process is:

* A quadratic coefficient matrix representing pairwise asset interactions.

* A linear coefficient vector representing asset-level costs and benefits.

Together, these fully define the QUBO objective function to be minimized.

**Ising Hamiltonian conversion**

To support quantum execution, the QUBO is further converted into an Ising Hamiltonian, the native representation used by quantum algorithms.

This conversion:

* Maps binary decision variables to spin variables.

* Separates the problem into linear (`h`) and quadratic (`J`) spin coefficients.

* Computes a constant energy offset that preserves equivalence between the QUBO and Ising formulations.

The resulting Ising model can be directly passed to QAOA or other variational quantum algorithms.
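
A brute-force check on a small random instance can confirm that such a conversion preserves energies. The sketch below assumes the substitution $x_i = (1 - z_i)/2$ (so that a measured bit of 1 corresponds to a Pauli-Z eigenvalue of $-1$) and the full-matrix objective $x^\top Q x + c^\top x$; the random `Q` and `c` are illustrative, not portfolio data.

```python
import itertools
import numpy as np

rng=np.random.default_rng(0)
n=4
Q=rng.normal(size=(n, n))
Q=(Q+Q.T)/2           #Symmetric quadratic coefficients
c=rng.normal(size=n)  #Linear coefficients

#Substitute x_i=(1-z_i)/2 into x^T Q x + c^T x and collect terms
h=np.zeros(n)
J={}
for i in range(n):
  h[i]=-(c[i]+Q[i, i])/2
  for j in range(n):
    if i!=j:
      h[i]-=(Q[i, j]+Q[j, i])/4
  for j in range(i+1, n):
    J[(i, j)]=(Q[i, j]+Q[j, i])/4
offset=(c.sum()+np.trace(Q))/2+(Q.sum()-np.trace(Q))/4

def qubo_energy(x):
  return x@Q@x+c@x

def ising_energy(z):
  pair_term=sum(coeff*z[i]*z[j] for (i, j), coeff in J.items())
  return h@z+pair_term+offset

#Compare both energies on all 2^n assignments
max_gap=0.0
for bits in itertools.product([0, 1], repeat=n):
  x=np.array(bits)
  z=1-2*x
  max_gap=max(max_gap, abs(qubo_energy(x)-ising_energy(z)))
print(f"largest QUBO/Ising mismatch: {max_gap:.2e}")
```

Exhaustive enumeration only scales to a handful of variables, but it is a cheap way to catch sign or factor-of-two mistakes before handing the coefficients to a quantum solver.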

In [None]:
class QUBOFormulator:
  """Formulates the QUBO problem for quantum optimization"""
  def __init__(self, df: pd.DataFrame, covariance: np.ndarray, config: PortfolioConfig):
    self.df=df
    self.covariance=covariance
    self.config=config
    self.n_assets=len(df)

  def build_qubo_matrix(self)-> Tuple[np.ndarray, np.ndarray]:
    """
    Build the QUBO coefficients: quadratic matrix Q and linear vector c
    Minimize x^T Q x + c^T x over binary x
    """
    n=self.n_assets

    #Linear terms (c vector)
    linear=np.zeros(n)

    #Maximize returns(negative because we minimize)
    linear-=self.df['Expected_Return'].values

    #Transaction costs
    prev_positions=self.df['Previous_Position'].values
    transaction_penalties=self.df['Transaction_Cost'].values*self.config.transaction_cost_mult

    #Penalty for changing positions: \tau * |x-x_prev|
    #For binary x this equals \tau*x for x_prev=0, and \tau*(1-x) for x_prev=1
    for i in range(n):
      if prev_positions[i]==0:
        linear[i]+=transaction_penalties[i] #Cost to buy
      else:
        #Cost to sell: \tau*(1-x) contributes -\tau*x plus a constant
        linear[i]-=transaction_penalties[i]

    #Quadratic terms (Q matrix)
    quadratic=np.zeros((n, n))

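    #Risk terms: \lambda * \sum_{i,j} \sigma_{ij} x_i x_j
    quadratic+=self.config.risk_aversion*self.covariance

    #Portfolio size constraint (quadratic penalty)
    #We want: \sum_i x_i = K
    #Penalty: P * (\sum_i x_i - K)^2
    target_size=self.config.portfolio_size
    penalty_weight=max(np.abs(linear).max(), np.abs(quadratic).max())*10

    #Expand with x_i^2=x_i: P*\sum_i x_i + P*\sum_{i!=j} x_i x_j - 2*P*K*\sum_i x_i + const
    for i in range(n):
      linear[i]-=2*penalty_weight*target_size
      quadratic[i, i]+=penalty_weight
      for j in range(i+1, n):
        #Both triangles, because x^T Q x counts (i, j) and (j, i)
        quadratic[i, j]+=penalty_weight
        quadratic[j, i]+=penalty_weight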

    return quadratic, linear

  def build_ising_hamiltonian(self)-> Tuple[Dict, Dict, float]:
    """
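    Convert QUBO to Ising Hamiltonian
    Substituting x_i=(1-z_i)/2 (bit 1 <-> Z eigenvalue -1) into x^T Q x + c^T x
    gives sum_i h_i z_i + sum_{i<j} J_ij z_i z_j + offset
    """
    Q, c=self.build_qubo_matrix()
    n=self.n_assets

    #Linear coefficients (h)
    h={}
    for i in range(n):
      h[i]=-(c[i]+Q[i, i])/2
      for j in range(n):
        if i!=j:
          h[i]-=(Q[i, j]+Q[j, i])/4

    #Quadratic coefficients (J)
    J={}
    for i in range(n):
      for j in range(i+1, n):
        J[(i, j)]=(Q[i, j]+Q[j, i])/4

    #Offset
    offset=(sum(c)+np.trace(Q))/2+(Q.sum()-np.trace(Q))/4

    return h, J, offset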



## Quantum optimizer: the execution layer
This component implements the quantum optimization stage of the portfolio optimization pipeline. Its role is to take a problem already expressed as an Ising Hamiltonian and solve it using the Quantum Approximate Optimization Algorithm (QAOA).

It represents the final execution layer where the abstract optimization problem is handed off to a quantum algorithm.

**Conceptual Workflow**

1. **Hamiltonian Assembly**

The optimizer receives:

* Linear spin coefficients (`h`), representing individual asset biases.

* Pairwise interaction coefficients (`J`), representing asset-to-asset interactions.

* A constant energy offset.

These coefficients define the energy landscape that QAOA will attempt to minimize. The Hamiltonian is constructed using Pauli-Z operators, mapping each binary decision variable to a quantum spin.

At this stage, the financial optimization problem exists entirely as a quantum mechanical object.

2. **QAOA Configuration**

The optimizer configures a QAOA instance with:

* A sampler backend for executing quantum circuits.

* A classical optimizer to tune variational parameters.

* A configurable circuit depth (p), controlling the expressive power of the quantum ansatz.

The depth parameter directly influences the trade-off between solution quality and computational cost.

3. **Hybrid Quantum–Classical Optimization Loop**

QAOA operates as a hybrid algorithm:

* The quantum circuit prepares parameterized quantum states.

* Measurement results are sampled from the quantum system.

* A classical optimizer updates circuit parameters to minimize the expected energy.

This loop continues until convergence or until the optimization budget is exhausted.

4. **Solution extraction**

Once optimization completes:

* The lowest-energy measured quantum state is identified.

* The corresponding bitstring is extracted, representing the best asset selection found (QAOA is approximate, so optimality is not guaranteed).

* This bitstring is converted into a binary array, where each element indicates whether an asset is included in the portfolio.

If a definitive measurement is unavailable, a conservative fallback solution is returned to ensure robustness.
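
The bitstring-to-selection step can be illustrated as follows. The tickers and the bitstring are hypothetical. Note that Qiskit writes both Pauli labels and measured bitstrings with the highest-index qubit first, so reading the string left to right as asset 0, 1, 2, ... is only consistent when the Hamiltonian labels were assembled in the same left-to-right order.

```python
import numpy as np

#Hypothetical tickers and a hypothetical best bitstring from the sampler
tickers=['AAA', 'BBB', 'CCC', 'DDD', 'EEE']
bitstring='10110'

#Character k of the string is read as the decision for asset k
solution=np.array([int(b) for b in bitstring])
selected=[t for t, keep in zip(tickers, solution) if keep==1]

print(solution)
print(selected)
```

A raw QAOA sample is not guaranteed to satisfy the portfolio size or turnover limits, so the resulting array is a candidate rather than a final portfolio.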

In [None]:
class QuantumOptimizer:
  """Quantum optimization using QAOA"""
  def __init__(self, config: PortfolioConfig):
    self.config=config

  def solve_qaoa(self, h: Dict, J: Dict, offset: float)-> np.ndarray:
    """
    Solve using QAOA (Quantum Approximate Optimization Algorithm)
    """

    n=len(h)

    #Create Hamiltonian terms
    from qiskit.quantum_info import SparsePauliOp

    #Build Pauli operators
    pauli_list=[]

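    #Linear terms (Z operators)
    for i, coeff in h.items():
      pauli_str=['I']*n
      pauli_str[i]='Z'
      pauli_list.append((''.join(pauli_str), coeff))

    #Quadratic terms (ZZ operators)
    for (i, j), coeff in J.items():
      pauli_str=['I']*n
      pauli_str[i]='Z'
      pauli_str[j]='Z'
      pauli_list.append((''.join(pauli_str), coeff))

    #Create operator (the constant offset does not change the minimizer and is omitted)
    hamiltonian=SparsePauliOp.from_list(pauli_list)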

    #Setup QAOA
    optimizer=COBYLA(maxiter=1000)
    qaoa=QAOA(
        sampler=Sampler(),
        optimizer=optimizer,
        reps=self.config.qaoa_depth
    )

    #Run optimization
    result=qaoa.compute_minimum_eigenvalue(hamiltonian)

    #Extract best bitstring
    best=getattr(result, 'best_measurement', None)
    if best is not None:
      bitstring=best['bitstring']
    else:
      #Fallback: no measurement available, so select no assets
      bitstring='0'*n

    #Convert to array
    solution=np.array([int (b) for b in bitstring])

    return solution

## Baseline and fallback
This component implements classical optimization strategies for portfolio selection. It serves two critical roles in the system:

1. A reliable fallback when quantum execution is unavailable or impractical.

2. A baseline reference against which quantum optimization results can be compared.

By providing classical heuristics and metaheuristics, this module ensures robustness, interpretability, and reproducibility of results.

**Optimization Strategies Implemented**

1. **Greedy Risk-Adjusted Selection**

The greedy strategy provides a fast, deterministic approximation to the portfolio optimization problem.

The logic proceeds as follows:

* Each asset is scored using a risk-adjusted return metric, balancing expected return against volatility.

* Transaction costs are incorporated as penalties, particularly discouraging changes from the existing portfolio.

* Assets are ranked by their adjusted scores.

* The top-ranked assets are selected until the target portfolio size is reached.

2. **Simulated annealing for combinatorial optimization**

To improve upon the greedy solution, the optimizer implements simulated annealing, a stochastic metaheuristic designed to escape local optima.

The annealing process follows a physically inspired workflow:

* Initialization begins from the greedy solution to ensure a strong starting point.

* At each iteration, a neighboring portfolio is generated by swapping one included asset with one excluded asset, maintaining portfolio size.

* The change in objective value (energy) determines whether the new solution is accepted:

  * Improvements are always accepted.

  * Worsening solutions may be accepted probabilistically, depending on temperature.

* The temperature is gradually reduced according to a cooling schedule, shifting the search from exploration to exploitation.

The best solution encountered during the process is retained and returned.
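
The acceptance rule and cooling schedule can be sketched in isolation. The helper name `accept_move` is hypothetical; the temperature settings (`1.0`, `0.9995`, `0.01`) are the ones used in the notebook's annealing loop, and the uphill step of `0.05` is an arbitrary illustrative value.

```python
import math
import random

def accept_move(delta_e: float, temperature: float, rng: random.Random)-> bool:
  """Metropolis rule: always accept improvements, sometimes accept worse moves"""
  if delta_e<0:
    return True
  return rng.random()<math.exp(-delta_e/temperature)

rng=random.Random(42)
temperature=1.0
cooling_rate=0.9995
temp_min=0.01

#Acceptance probability of a fixed uphill move as the system cools
delta_e=0.05
for step in (0, 2000, 6000, 10000):
  t=max(temperature*cooling_rate**step, temp_min)
  print(f"step {step:>5}: T={t:.4f}, P(accept)={math.exp(-delta_e/t):.3f}")
```

Early on, most uphill moves are accepted, which lets the search leave the greedy starting point; as the temperature approaches its floor, the same move is accepted far less often.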

**Objective function and energy interpretation**

Both classical strategies rely on a shared objective function that mirrors the QUBO formulation:

* Portfolio return contributes negatively to the energy, as higher returns are preferred.

* Portfolio risk is penalized through a quadratic form involving the covariance matrix and a risk-aversion parameter.

* Transaction costs penalize deviations from the previous portfolio state.

The combined objective is minimized, ensuring consistency with both classical and quantum formulations.
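
In symbols, the shared objective can be written as below. The notation is introduced here for illustration: $\mu_i$ is the expected return, $\sigma_{ij}$ the covariance, $\lambda$ the risk-aversion parameter, $\tau_i$ the per-asset transaction cost, $m$ the transaction cost multiplier, and $x_i^{\text{prev}}$ the previous position.

$$
E(x) = -\sum_i \mu_i x_i + \lambda \sum_{i,j} \sigma_{ij} x_i x_j + m \sum_i \tau_i \,\lvert x_i - x_i^{\text{prev}} \rvert
$$

Lower $E(x)$ is better. Unlike the QUBO, this classical energy carries no size penalty, because the swap move in simulated annealing keeps the number of selected assets fixed.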

In [None]:
class ClassicalOptimizer:
  """Classical optimization fallback and baseline"""
  def __init__(self, config: PortfolioConfig):
    self.config=config

  def solve_greedy(self, df: pd.DataFrame, covariance: np.ndarray)-> np.ndarray:
    """
    Greedy algorithm: Select assets with best risk-adjusted returns
    """
    n=len(df)

    #Calculate risk-adjusted scores
    returns=df['Expected_Return'].values
    volatilities=np.sqrt(np.diag(covariance))
    transaction_costs=df['Transaction_Cost'].values
    prev_positions=df['Previous_Position'].values

    #Score=return/risk; transaction costs for new purchases are subtracted below
    scores=returns/(volatilities+1e-6)

    #Penalize changes from previous positions
    for i in range(n):
      if prev_positions[i]==0:
        scores[i]-=transaction_costs[i]*self.config.transaction_cost_mult

    #Select top N assets
    selected_indices=np.argsort(scores)[::-1][:self.config.portfolio_size]

    solution=np.zeros(n, dtype=int)
    solution[selected_indices]=1

    return solution

  def solve_simulated_annealing(self, df:pd.DataFrame, covariance: np.ndarray,
                                iterations: int=10000)-> np.ndarray:
      """
      Simulated annealing for QUBO
      """
      n=len(df)

      #Initialize with greedy solution
      current_solution=self.solve_greedy(df, covariance)
      current_energy=self._calculate_energy(current_solution, df, covariance)

      best_solution=current_solution.copy()
      best_energy=current_energy

      #Annealing parameters
      temp=1.0
      temp_min=0.01
      cooling_rate=0.9995

      for iteration in range(iterations):
        #Generate neighbour by swapping one held and one unheld asset (maintains portfolio size)
        neighbor=current_solution.copy()

        #Find one asset to remove and one to add
        held_indices=np.where(current_solution==1)[0]
        not_held_indices=np.where(current_solution==0)[0]

        if len(held_indices)>0 and len(not_held_indices)>0:
          remove_idx=np.random.choice(held_indices)
          add_idx=np.random.choice(not_held_indices)

          neighbor[remove_idx]=0
          neighbor[add_idx]=1

          #Calculate energy
          neighbor_energy=self._calculate_energy(neighbor, df, covariance)

          #Accept or reject
          delta_e=neighbor_energy-current_energy
          if delta_e<0 or np.random.random()<np.exp(-delta_e/ temp):
            current_solution=neighbor
            current_energy=neighbor_energy

            if current_energy<best_energy:
              best_solution=current_solution.copy()
              best_energy=current_energy


        #Cool down
        temp=max(temp*cooling_rate, temp_min)

      return best_solution

  def _calculate_energy(self, solution: np.ndarray, df: pd.DataFrame,
                        covariance: np.ndarray)-> float:

      """Calculate objective function value"""
      returns=df['Expected_Return'].values
      prev_positions=df['Previous_Position'].values
      transaction_costs=df['Transaction_Cost'].values

      #Portfolio return
      portfolio_return=np.dot(returns, solution)

      #Portfolio risk
      portfolio_risk=np.dot(solution, np.dot(covariance, solution))

      #Transaction costs
      changes=np.abs(solution-prev_positions)
      total_transaction_cost=np.dot(transaction_costs, changes)

      #Objective (minimize)
      energy=-portfolio_return+self.config.risk_aversion*portfolio_risk\
              +self.config.transaction_cost_mult*total_transaction_cost

      return energy

## Constraint enforcer
This component implements a deterministic post-processing stage that ensures all hard portfolio constraints are strictly satisfied. It is designed to operate after either quantum or classical optimization has produced a candidate solution.

Because both QUBO-based quantum solvers and heuristic classical solvers may return solutions that slightly violate constraints, this module acts as a final corrective layer that guarantees feasibility.

### Constraint enforcer flow

1. **Maximum Position Changes (Turnover Constraint)**

The first and most critical constraint enforced is the maximum number of allowed position changes relative to the previous portfolio.

The logic is:

* Count how many assets changed state (buy or sell).

* If the number exceeds the allowed maximum:

  * Rank changes by their impact on expected return.

  * Retain only the most impactful changes.

  * Revert lower-impact changes back to their previous positions.

This ensures that turnover remains within operational limits while preserving the most economically meaningful adjustments.
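
A small worked example of the turnover repair, on six hypothetical assets with a limit of three changes. It follows the same ranking idea (absolute expected return of each changed position), though the variable names are illustrative.

```python
import numpy as np

prev_positions=np.array([1, 1, 0, 0, 0, 1])
solution=np.array([0, 1, 1, 1, 1, 0])
expected_return=np.array([0.02, 0.05, 0.09, 0.01, 0.07, 0.03])
max_changes=3

changed=np.where(solution!=prev_positions)[0]
if len(changed)>max_changes:
  #Rank changed positions by |expected return|, largest first
  order=np.argsort(np.abs(expected_return[changed]))[::-1]
  keep=changed[order[:max_changes]]
  #Everything not kept goes back to its previous state
  revert=np.setdiff1d(changed, keep)
  solution=solution.copy()
  solution[revert]=prev_positions[revert]

print(solution, int((solution!=prev_positions).sum()))
```

Five positions differ from the previous portfolio at the start. The two changes involving the lowest expected returns (assets 0 and 3) are reverted, leaving exactly three changes.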

2. **Portfolio Size Constraint (Cardinality)**

Once turnover is controlled, the solution is adjusted to ensure exactly the target number of assets is selected.

This step:

* Computes a risk-adjusted score for each asset.

* Adds high-quality assets if the portfolio is too small.

* Removes low-quality assets if the portfolio is too large.

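All adjustments are intended to respect the already-enforced turnover constraint. In the implementation below, however, the size repair ranks assets by score alone and does not re-check the turnover limit, so it can introduce additional position changes.

3. **Sector Diversification Constraints**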

Next, the enforcer ensures that no sector exceeds the maximum allowable allocation.

For each sector:

* If overrepresented, the weakest assets in that sector are removed.

* Replacement assets are selected from other sectors based on risk-adjusted performance.

This maintains diversification without significantly degrading portfolio quality.
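
The sector check itself reduces to counting held assets per sector against an integer cap. The sketch below uses hypothetical sectors and a deliberately small portfolio so that the effect of integer truncation is visible.

```python
import numpy as np

sectors=np.array(['Tech', 'Tech', 'Tech', 'Energy', 'Energy', 'Health'])
solution=np.array([1, 1, 1, 1, 0, 0])
sector_limit=0.4
portfolio_size=4

#Same rounding as the notebook: the cap is truncated to an integer
max_per_sector=int(sector_limit*portfolio_size)

#Count held assets in each sector and flag the ones above the cap
held=solution==1
counts={str(s): int((held&(sectors==s)).sum()) for s in np.unique(sectors)}
over_limit={s: c for s, c in counts.items() if c>max_per_sector}

print(max_per_sector, counts, over_limit)
```

Because `int()` truncates, a 40% limit on a four-asset portfolio allows only one asset per sector. For small portfolios it may be worth checking that the resulting caps still leave room for a feasible selection.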





In [None]:
class ConstraintEnforcer:
  """Post-processing to enforce hard constraints"""
  def __init__(self, config: PortfolioConfig):
    self.config=config

  def repair_solution(self, solution: np.ndarray, df: pd.DataFrame,
                      covariance: np.ndarray)-> np.ndarray:

      """
      Repair solution to satisfy all constraints
      """
      solution=solution.copy()
      prev_positions=df['Previous_Position'].values.astype(int)

      #1. Fix max_changes FIRST (hard constraint)
      solution=self._fix_max_changes(solution, df, covariance)

      #2. Fix size while respecting max_changes
      solution=self._fix_portfolio_size_constrained(solution, df, covariance, prev_positions)

      #3. Fix sectors while respecting max_changes
      solution=self._fix_sector_limits_constrained(solution, df, covariance, prev_positions)

      #4. Final check
      solution=self._fix_portfolio_size_constrained(solution, df, covariance, prev_positions)

      return solution

  def _fix_portfolio_size_constrained(self, solution: np.ndarray, df: pd.DataFrame,
                          covariance: np.ndarray, prev_positions: np.ndarray)-> np.ndarray:
      """Ensure exactly N assets are selected"""
      current_size=solution.sum()
      target_size=self.config.portfolio_size

      if current_size==target_size:
        return solution

      #Calculate asset scores
      returns=df['Expected_Return'].values
      volatilities=np.sqrt(np.diag(covariance))
      scores=returns / (volatilities+1e-6)

      if current_size<target_size:
        #Add best assets
        available=np.where(solution==0)[0]
        to_add=target_size-current_size
        best_available=available[np.argsort(scores[available])[::-1][:to_add]]
        solution[best_available]=1
      else:
        #Remove worst assets
        held=np.where(solution==1)[0]
        to_remove=current_size-target_size
        worst_held=held[np.argsort(scores[held])[:to_remove]]
        solution[worst_held]=0

      return solution

  def _fix_sector_limits_constrained(self, solution: np.ndarray, df: pd.DataFrame,
                         covariance: np.ndarray, prev_positions: np.ndarray)->np.ndarray:
      """Ensure no sector exceeds maximum allocation"""
      sectors=df['Sector'].values
      unique_sectors=df['Sector'].unique()
      max_per_sector=int(self.config.sector_limit*self.config.portfolio_size)

      returns=df['Expected_Return'].values
      volatilities=np.sqrt(np.diag(covariance))
      scores=returns /(volatilities+1e-6)

      for sector in unique_sectors:
        sector_mask=sectors==sector
        held_in_sector=(solution==1)&sector_mask
        sector_count=held_in_sector.sum()

        if sector_count>max_per_sector:
          #Remove worst assets from this sector
          sector_indices=np.where(held_in_sector)[0]
          to_remove=sector_count-max_per_sector
          worst_in_sector=sector_indices[np.argsort(scores[sector_indices])[:to_remove]]
          solution[worst_in_sector]=0

          #Add best assets from other sectors
          other_sectors_mask=~sector_mask& (solution==0)
          available=np.where(other_sectors_mask)[0]
          if len(available)>=to_remove:
            best_others=available[np.argsort(scores[available])[::-1][:to_remove]]
            solution[best_others]=1

      return solution


  def _fix_max_changes(self, solution: np.ndarray, df: pd.DataFrame,
                       covariance: np.ndarray)-> np.ndarray:
      """Ensure position changes don't exceed K"""
      prev_positions=df['Previous_Position'].values
      changes=np.abs(solution-prev_positions)
      num_changes=changes.sum()

      if num_changes<=self.config.max_position_changes:
        return solution

      #Revert minimal-impact changes
      changed_indices=np.where(changes==1)[0]

      #Score changes by impact
      returns=df['Expected_Return'].values
      change_impacts=np.abs(returns[changed_indices])

      #Keep most impactful changes
      to_keep=self.config.max_position_changes
      keep_indices=changed_indices[np.argsort(change_impacts)[::-1][:to_keep]]

      #Revert others
      revert_indices=changed_indices[~np.isin(changed_indices, keep_indices)]
      solution[revert_indices]=prev_positions[revert_indices]

      return solution
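
The size and sector repair steps above share one heuristic: rank assets by expected return per unit of volatility, then add the best unheld assets or drop the worst held ones. The sketch below isolates that idea in a standalone function. The name `repair_size` and the toy numbers are assumptions made for illustration and are not part of the notebook; unlike the methods above, this version works on a copy instead of modifying the input array in place.

```python
import numpy as np

def repair_size(solution, returns, covariance, target_size):
    """Return a copy of `solution` with exactly `target_size` assets selected."""
    solution = np.asarray(solution, dtype=int).copy()  # do not mutate the caller's array
    scores = returns / (np.sqrt(np.diag(covariance)) + 1e-6)  # return per unit volatility
    gap = target_size - int(solution.sum())
    if gap > 0:
        # Too few holdings: add the highest-scoring unheld assets.
        available = np.where(solution == 0)[0]
        solution[available[np.argsort(scores[available])[::-1][:gap]]] = 1
    elif gap < 0:
        # Too many holdings: drop the lowest-scoring held assets.
        held = np.where(solution == 1)[0]
        solution[held[np.argsort(scores[held])[:-gap]]] = 0
    return solution

# Toy example (made-up numbers): 5 assets, 1 held, target of 3 holdings.
returns = np.array([0.10, 0.02, 0.08, 0.05, 0.01])
covariance = np.diag([0.04, 0.04, 0.01, 0.04, 0.04])
start = np.array([0, 1, 0, 0, 0])
repaired = repair_size(start, returns, covariance, target_size=3)
print(repaired)
```

Working on a copy keeps the optimizer's raw output available for comparison, which can help when diagnosing how much the repair step changed a solution.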

## Portfolio Analyzer
The PortfolioAnalyzer component is responsible for evaluating a finalized portfolio solution and translating it into financial, risk, and operational metrics that are easy to interpret.

It produces a complete analytical summary suitable for:

* Model evaluation

* Strategy comparison

* Reporting to decision-makers

* Exporting results (e.g., Excel, dashboards)

**Output Summary**

The analyzer returns a structured report containing:

* Selected assets

* Number of holdings

* Expected return

* Risk (volatility)

* Sharpe ratio

* Transaction costs

* Number of position changes

* Sector allocation percentages

* Buy / Sell / Hold action lists

This output is designed to be:

* Machine-readable

* Human-interpretable

* Ready for visualization or export
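
To illustrate the machine-readable and export-ready points, the sketch below flattens a report dictionary into two tables. The helper name `report_to_frames` and the sample values are assumptions made for this example. The keys follow the dictionary returned by `calculate_metrics`.

```python
import pandas as pd

def report_to_frames(metrics):
    """Split a metrics dict into a scalar summary table and a per-asset action table."""
    scalar_keys = ['num_assets', 'portfolio_return', 'portfolio_risk',
                   'sharpe_ratio', 'transaction_costs', 'num_changes']
    summary = pd.DataFrame({'metric': scalar_keys,
                            'value': [metrics[k] for k in scalar_keys]})
    # One row per asset, labelled with the action taken on it.
    actions = pd.DataFrame(
        [(asset, action)
         for action, key in [('buy', 'buy_list'), ('sell', 'sell_list'), ('hold', 'hold_list')]
         for asset in metrics[key]],
        columns=['asset', 'action'])
    return summary, actions

# Made-up report used only to exercise the helper.
sample = {'num_assets': 2, 'portfolio_return': 0.15, 'portfolio_risk': 0.20,
          'sharpe_ratio': 0.65, 'transaction_costs': 0.01, 'num_changes': 2,
          'buy_list': ['A'], 'sell_list': ['C'], 'hold_list': ['B']}
summary, actions = report_to_frames(sample)
print(summary)
print(actions)
```

Either table can then be written out with `DataFrame.to_excel`, which relies on the `openpyxl` package installed at the top of the notebook.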


In [None]:
class PortfolioAnalyzer:
  """Analyzes and reports portfolio metrics"""
  @staticmethod
  def calculate_metrics(solution: np.ndarray, df:pd.DataFrame,
                        covariance: np.ndarray)-> Dict:

      """Calculate comprehensive portfolio metrics"""
      selected_assets=df[solution==1].copy()

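      #Portfolio return (unweighted sum over the selected assets)
      portfolio_return=selected_assets['Expected_Return'].sum()

      #Portfolio risk (standard deviation with unit weight on each holding:
      #square root of the sum of all selected covariance entries)
      selected_covariance=covariance[solution==1][:, solution==1]
      portfolio_variance=np.sum(selected_covariance)
      portfolio_risk=np.sqrt(portfolio_variance)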

      #Transaction costs
      prev_positions=df['Previous_Position'].values
      changes=np.abs(solution-prev_positions)
      transaction_costs=(df['Transaction_Cost']*changes).sum()

      #Sector allocation
      sector_allocation=selected_assets.groupby('Sector').size().to_dict()
      sector_percentages={k: v/len(selected_assets)*100 for k, v in sector_allocation.items()}

      #Sharpe ratio (assuming risk-free rate=2%)
      sharpe_ratio=(portfolio_return-0.02)/ (portfolio_risk+1e-6)

      #Position changes
      num_changes=changes.sum()

      #Transaction list
      buy_assets=df[((solution==1)& (prev_positions==0))]['Asset'].tolist()
      sell_assets=df[((solution==0)& (prev_positions==1))]['Asset'].tolist()
      hold_assets=df[((solution==1)& (prev_positions==1))]['Asset'].tolist()


      return{
          'selected_assets':selected_assets['Asset'].tolist(),
          'num_assets':len(selected_assets),
          'portfolio_return':portfolio_return,
          'portfolio_risk':portfolio_risk,
          'sharpe_ratio':sharpe_ratio,
          'transaction_costs':transaction_costs,
          'num_changes':int(num_changes),
          'sector_allocation':sector_percentages,
          'buy_list':buy_assets,
          'sell_list':sell_assets,
          'hold_list':hold_assets
      }
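
`calculate_metrics` sums expected returns and covariance entries over the selected assets, which amounts to placing a weight of 1 on every holding, so the reported return and risk grow with the number of holdings. The sketch below shows the same quantities under an equal-weight assumption, w_i = 1/N. The notebook does not state a weighting scheme, so this is an illustrative assumption, and `equal_weight_metrics` is a hypothetical helper.

```python
import numpy as np

def equal_weight_metrics(solution, expected_returns, covariance, risk_free=0.02):
    """Return (return, risk, Sharpe) for an equal-weighted portfolio of the selected assets."""
    mask = np.asarray(solution) == 1
    n = int(mask.sum())
    if n == 0:
        raise ValueError("no assets selected")
    weights = np.full(n, 1.0 / n)                  # assumed weights: 1/N each
    sub_cov = covariance[np.ix_(mask, mask)]       # covariance of the selected assets
    port_return = float(weights @ expected_returns[mask])
    port_risk = float(np.sqrt(weights @ sub_cov @ weights))
    sharpe = (port_return - risk_free) / (port_risk + 1e-6)
    return port_return, port_risk, sharpe

# Toy data (made-up numbers): three uncorrelated assets, two selected.
expected_returns = np.array([0.10, 0.20, 0.30])
covariance = np.diag([0.04, 0.04, 0.04])
solution = np.array([1, 0, 1])
ret, risk, sharpe = equal_weight_metrics(solution, expected_returns, covariance)
print(f"return={ret:.4f} risk={risk:.4f} sharpe={sharpe:.3f}")
```

With N holdings, the unit-weight return and risk are exactly N times their equal-weight counterparts. For the 20-asset run recorded at the end of the notebook, the reported 1400.44% return and 288.03% risk would therefore correspond to roughly 70.02% and 14.40% under this assumption.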

## Quantum Portfolio Optimization
The QuantumPortfolioOptimizer is the main orchestration layer of the portfolio optimization system.
It coordinates data preparation, optimization strategy selection, constraint enforcement, and performance analysis into a single, reproducible workflow.

Rather than focusing on a single solver, it is designed to flexibly support quantum, classical, and hybrid approaches, making it robust to both hardware availability and problem scale.

The optimizer follows a six-stage pipeline, each stage building on the previous one:

1. Data processing

2. Asset screening

3. QUBO / Ising problem formulation

4. Optimization (quantum, classical, or hybrid)

5. Constraint enforcement

6. Portfolio analysis and reporting

Each stage is deliberately modular, allowing different methods to be swapped without breaking the pipeline.



In [None]:
class QuantumPortfolioOptimizer:
  """Main operator for quantum portfolio optimization"""
  def __init__(self, config: PortfolioConfig):
    self.config=config

  def optimize(self, df: pd.DataFrame, method:str='hybrid')-> Tuple[np.ndarray, Dict, pd.DataFrame]:
    """
    Main optimization method

    Args:
      df: Portfolio data
      method: 'quantum', 'classical', or 'hybrid'

    Returns:
      solution: Binary array indicating asset selection (indexed over screened_df)
      metrics: Portfolio performance metrics
      screened_df: Screened asset universe that solution refers to
    """
    #Step 1: Data processing
    print("\n[1/6] Processing data...")
    processor=PortfolioDataProcessor(df)

    #Step 2: Smart screening (quantum: 2x portfolio size, capped at 30; otherwise: portfolio size + 10)
    screening_size=min(30, self.config.portfolio_size*2) if method=='quantum' else self.config.portfolio_size+10
    print(f"[2/6] Screening assets (50 -> {screening_size})...")
    screened_df=processor.smart_asset_screening(target_size=screening_size)

    #Generate covariance for the screened assets
    temp_processor=PortfolioDataProcessor(screened_df)
    covariance=temp_processor.generate_sector_covariance()

    print(f" Reduced to {len(screened_df)} assets")
    print(f" Sectors: {screened_df['Sector'].unique().tolist()}")

    #Step 3: QUBO formulation
    print("\n[3/6] Formulating QUBO problem...")
    formulator=QUBOFormulator(screened_df, covariance, self.config)

    #Step 4: Optimization
    print(f"\n[4/6] Running {method} optimization...")

    if method=='quantum' and self.config.quantum_enabled and QISKIT_AVAILABLE:
      h, J, offset=formulator.build_ising_hamiltonian()
      quantum_opt=QuantumOptimizer(self.config)
      try:
        solution=quantum_opt.solve_qaoa(h, J, offset)
        print(" QAOA optimization complete")
      except Exception as e:
        print(f" Quantum optimization failed: {e}. Falling back to classical.")
        classical_opt=ClassicalOptimizer(self.config)
        solution=classical_opt.solve_simulated_annealing(screened_df, covariance)

    elif method=='hybrid':
      #Use simulated annealing(quantum-inspired)
      classical_opt=ClassicalOptimizer(self.config)
      solution=classical_opt.solve_simulated_annealing(screened_df, covariance, iterations=5000)
      print(" Simulated annealing complete")

    else:
      #Classical greedy
      classical_opt=ClassicalOptimizer(self.config)
      solution=classical_opt.solve_greedy(screened_df, covariance)
      print(" Greedy optimization complete")


    #Step 5: Constraint enforcement...
    print("\n[5/6] Enforcing constraints...")
    enforcer=ConstraintEnforcer(self.config)
    solution=enforcer.repair_solution(solution, screened_df, covariance)
    print("All constraints satisfied")

    #Step 6: Analysis
    print("\n[6/6] Analyzing results...")
    metrics=PortfolioAnalyzer.calculate_metrics(solution, screened_df, covariance)

    return solution, metrics, screened_df
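
The pipeline prints its success message immediately after the repair step without re-checking the result. A small verification pass, sketched below, could make that message conditional. The function name `check_constraints` is an assumption made for this example. The column names `Sector` and `Previous_Position` and the three thresholds follow the notebook's data and `PortfolioConfig` fields.

```python
import numpy as np
import pandas as pd

def check_constraints(solution, df, portfolio_size, sector_limit, max_position_changes):
    """Return a dict mapping each constraint name to True (satisfied) or False."""
    solution = np.asarray(solution, dtype=int)
    held = solution == 1
    max_per_sector = int(sector_limit * portfolio_size)
    sector_counts = df.loc[held, 'Sector'].value_counts()
    num_changes = int(np.abs(solution - df['Previous_Position'].values).sum())
    return {
        'portfolio_size': int(held.sum()) == portfolio_size,
        'sector_limit': bool((sector_counts <= max_per_sector).all()),
        'max_position_changes': num_changes <= max_position_changes,
    }

# Toy data (made-up): four assets, two sectors, two previously held.
toy_df = pd.DataFrame({'Sector': ['TECH', 'TECH', 'FIN', 'FIN'],
                       'Previous_Position': [1, 1, 0, 0]})
status = check_constraints(np.array([1, 0, 1, 0]), toy_df,
                           portfolio_size=2, sector_limit=0.5, max_position_changes=1)
print(status)
```

In the toy case the size and sector checks pass while the change limit fails, because one sale and one purchase make two changes against a limit of one. A check of this kind seems relevant here: the run recorded at the end of the notebook reports 16 position changes with `max_position_changes=10`.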

In [None]:
def print_results(metrics: Dict):
  print(f"\nPORTFOLIO SUMMARY")
  print(f"Assets Selected: {metrics['num_assets']}")
  print(f"Expected Return: {metrics['portfolio_return']*100:.2f}%")
  print(f"Portfolio Risk:{metrics['portfolio_risk']*100:.2f}%")
  print(f"Sharpe ratio:{metrics['sharpe_ratio']:.3f}")
  print(f"Transaction Cost: ${metrics['transaction_costs']:.4f}")


  print(f"\nTRANSACTIONS ({metrics['num_changes']} changes)")
  print(f" Buy: {len(metrics['buy_list'])} assets")
  if metrics['buy_list']:
    print(f"    {', '.join(metrics['buy_list'][:5])}"+
          (f"... (+{len(metrics['buy_list'])-5} more)" if len(metrics['buy_list'])>5 else ""))

  print(f" Sell: {len(metrics['sell_list'])} assets")
  if metrics['sell_list']:
    print(f"    {', '.join(metrics['sell_list'][:5])}"+
          (f"... (+{len(metrics['sell_list'])-5} more)" if len(metrics['sell_list'])>5 else ""))

  print(f"  Hold: {len(metrics['hold_list'])} assets")

  print(f"\nSECTOR ALLOCATION")
  for sector, pct in sorted(metrics['sector_allocation'].items(), key=lambda x: -x[1]):
    bar='|'*int(pct/5)
    print(f"  {sector:8s} |{bar:<20s}{pct:5.1f}%")

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
if __name__=="__main__":
  df=pd.read_excel('/content/drive/MyDrive/AQC-PAQC-FinanceTrack-Dataset.xlsx')

  config=PortfolioConfig(
      portfolio_size=20,
      max_position_changes=10,
      risk_aversion=0.5,
      transaction_cost_mult=1.0,
      sector_limit=0.4,
      qaoa_depth=2,
      quantum_enabled=True
  )

  optimizer=QuantumPortfolioOptimizer(config)
  solution, metrics, screened_df=optimizer.optimize(df, method='hybrid')

  print_results(metrics)


[1/6] Processing data...
[2/6] Screening assets (50 -> 30)...
 Reduced to 30 assets
 Sectors: ['ENERGY', 'CONS', 'HEALTH', 'FIN', 'TECH']

[3/6] Formulating QUBO problem...

[4/6] Running hybrid optimization...
 Simulated annealing complete

[5/6] Enforcing constraints...
All constraints satisfied

[6/6] Analyzing results...

PORTFOLIO SUMMARY
Assets Selected: 20
Expected Return: 1400.44%
Portfolio Risk:288.03%
Sharpe ratio:4.855
Transaction Cost: $0.5626

TRANSACTIONS (16 changes)
Buy: 5... (+3 more)
 Sell: 8 assets
    ASSET_035, ASSET_037, ASSET_006, ASSET_011, ASSET_020... (+3 more)
  Hold: 12 assets

SECTOR ALLOCATION
  ENERGY   |||||||               30.0%
  CONS     |||||                 20.0%
  HEALTH   |||||                 20.0%
  FIN      ||||                  15.0%
  TECH     ||||                  15.0%
