This notebook was used to figure out how to solve the portfolio optimization problem with an LP/MIP/QP solver. It contains a prototype that attempts to use [SCIP](https://scipopt.org) to solve the problem.

The problem requires support for:
* Quadratic programming (QP) - supports minimizing a quadratic objective function (e.g., the sum of squared differences)
* Mixed-integer programming (MIP) - supports integer variables (e.g., the number of funds)
* Linear constraints - supports constraining the overall allocation to 100% and the allocation to any specific asset class or fund to no more than 100%

Because of these requirements, the solver needs to support mixed-integer quadratic programming (MIQP).
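
The requirements above can be written as one MIQP. The formulation below is a sketch, not necessarily the exact model the prototype ends up building. The symbols are assumptions introduced here: $F$ is the set of funds, $A$ the set of asset classes, $E_{f,a}$ the exposure of fund $f$ to asset class $a$, $T_a$ the target for asset class $a$, $x_f$ the allocation to fund $f$, $y_f$ a 0/1 indicator for whether fund $f$ is used, $\lambda$ the penalty weight per fund, and $K$ the maximum number of funds. The linking constraint $x_f \le y_f$ is also an assumption; it is what makes $y_f$ count the funds with non-zero allocations.

$$
\begin{aligned}
\min_{x,\,y} \quad & \sum_{a \in A} \Big( \sum_{f \in F} E_{f,a}\, x_f - T_a \Big)^2 + \lambda \sum_{f \in F} y_f \\
\text{subject to} \quad & \sum_{f \in F} x_f = 1, \\
& 0 \le x_f \le y_f \quad \text{for all } f \in F, \\
& \sum_{f \in F} y_f \le K, \\
& y_f \in \{0, 1\} \quad \text{for all } f \in F.
\end{aligned}
$$

The squared term makes the objective quadratic, the $y_f$ variables make the problem mixed-integer, and every constraint is linear.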

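Default builds of SCIP do not accept this MIQP as written: through PySCIPOpt, `setObjective` rejects nonlinear (including quadratic) objectives. I was not able to find clear instructions on how to create a build that supports MIQP.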

In [1]:
# import required packages
import pandas as pd
from pyscipopt import Model, quicksum

In [None]:
file_path = "../data/exposure_matrix.csv"

# Read only the header row
headers = pd.read_csv(file_path, nrows=0).columns.tolist()

# Define the default dtype for all columns except 'Ticker'
dtype_dict = {col: float for col in headers if col != 'Ticker'}

# Read the full file with the dynamically created dtype and converter
data = pd.read_csv(
    file_path,
    dtype=dtype_dict,  # Set all columns to float except Ticker
    converters={'Ticker': lambda x: x.strip()}  # Strip whitespace from Ticker column
)
data.set_index('Ticker', inplace=True)
data.loc['BNDX']
data.loc['BNDX', 'Intl Bonds']
data.loc[:, 'Intl Bonds']
data

In [None]:
# extract_data(data):

# Extract fund_matrix (all rows except the footer)
fund_matrix = data.query("index != 'Targets'")
fund_matrix.loc['BNDX']
fund_matrix.loc[:,'Cash']
fund_matrix.loc['BNDX','Cash']
fund_matrix

In [None]:
# Extract asset_class_targets (footer row)
asset_class_targets = data.loc['Targets']
asset_class_targets.loc['Emerging']
asset_class_targets

In [None]:
# Extract fund tickers (first column)
funds = fund_matrix.index
funds

In [None]:
# Extract asset classes (header row, excluding the first column)
asset_classes = data.columns
asset_classes

In [7]:
# Problem:
# Minimize the following:
# - sum of the squared difference between final portfolio asset class allocations and target
#   asset class allocations
# - the number of funds included in the portfolio (# of funds with non-zero allocations)
#
# Subject to:
# - sum of the final portfolio asset class allocations equals 1
# - sum of the final portfolio fund allocations equals 1
# - portfolio allocation for each asset class is less than 1
# - portfolio allocation for each fund is less than 1
# - number of funds included in the portfolio is less than max_funds
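
To make the objective concrete, here is a minimal plain-Python sketch that evaluates it for two candidate portfolios. The fund names, exposures, and targets are hypothetical toy values, not taken from `exposure_matrix.csv`, and `objective_value` is a helper name introduced only for this illustration.

```python
# Hypothetical toy data: two asset classes, three funds (illustrative values only).
exposures = {
    "FUND_A": {"Stocks": 1.0, "Bonds": 0.0},
    "FUND_B": {"Stocks": 0.0, "Bonds": 1.0},
    "FUND_C": {"Stocks": 0.5, "Bonds": 0.5},
}
targets = {"Stocks": 0.6, "Bonds": 0.4}


def objective_value(allocations, exposures, targets, sparsity_weight):
    """Return the sum of squared differences plus a penalty per fund with non-zero weight."""
    squared_error = 0.0
    for asset_class, target in targets.items():
        # Portfolio allocation to this asset class given the fund weights
        achieved = sum(weight * exposures[fund][asset_class]
                       for fund, weight in allocations.items())
        squared_error += (achieved - target) ** 2
    funds_used = sum(1 for weight in allocations.values() if weight > 0)
    return squared_error + sparsity_weight * funds_used


# Two candidates: one hits the targets exactly with two funds, one uses a single fund.
exact_fit = {"FUND_A": 0.6, "FUND_B": 0.4, "FUND_C": 0.0}
single_fund = {"FUND_A": 0.0, "FUND_B": 0.0, "FUND_C": 1.0}

exact_fit_score = objective_value(exact_fit, exposures, targets, sparsity_weight=0.5)
single_fund_score = objective_value(single_fund, exposures, targets, sparsity_weight=0.5)
print(exact_fit_score, single_fund_score)
```

With a weight of 0.5 per fund, the single-fund portfolio scores lower (about 0.52 versus 1.0) even though it misses both targets. Allocations are fractions, so the squared differences are small compared with the per-fund penalty, and the weight has to be chosen with that scale in mind.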

In [None]:
# Initialize SCIP model
model = Model("Portfolio Optimization")
model

In [None]:
# Variables: allocation for each fund (lower bound = 0, upper bound = 1)
portfolio_fund_allocations = {fund: model.addVar(vtype="C", lb=0, ub=1, name=f"x_{fund}") for fund in funds}
portfolio_fund_allocations

In [None]:
# Variables: indicator (0/1) for whether a fund is included
fund_included = {fund: model.addVar(vtype="B", name=f"y_{fund}") for fund in funds}
fund_included
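
The indicator variables only count funds if they are tied to the allocations. One common way to do this, assumed here rather than taken from the notebook, is to require `x_f <= y_f` for every fund; in PySCIPOpt that would presumably be one `model.addCons(...)` call per fund. The plain-Python sketch below shows what the relationship enforces, using hypothetical fund names and a helper, `indicators_consistent`, introduced only for illustration.

```python
def indicators_consistent(allocations, included, tolerance=1e-9):
    """Check x_f <= y_f for every fund: a fund with weight must have its indicator set."""
    return all(allocations[fund] <= included[fund] + tolerance for fund in allocations)


# Hypothetical allocations (illustrative values only)
allocations = {"FUND_A": 0.6, "FUND_B": 0.4, "FUND_C": 0.0}

# Indicators that match the allocations
linked = {"FUND_A": 1, "FUND_B": 1, "FUND_C": 0}
# Indicators all left at zero, which an unlinked model would pick to avoid the penalty
unlinked = {"FUND_A": 0, "FUND_B": 0, "FUND_C": 0}

linked_ok = indicators_consistent(allocations, linked)
unlinked_ok = indicators_consistent(allocations, unlinked)
print(linked_ok, unlinked_ok)
```

Without such a constraint, a solver is free to set every indicator to zero while still allocating to any fund, so the sparsity penalty would have no effect.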

In [None]:
# Objective: minimize
# - sum of the squared difference between final portfolio allocations and target allocations
# - penalty for number of funds used

#asset_allocations = quicksum(fund_allocations[ticker] * fund_matrix.loc[ticker] for ticker in tickers)
#squared_diff = quicksum((asset_allocations[i] - target_allocations[i])**2 for i in range(len(target_allocations)))
#sparsity_penalty = quicksum(fund_included[ticker] for ticker in tickers)


# create dictionary with sums that calculate the portfolio's allocation to each asset
# class given the allocation to each fund (portfolio_fund_allocations: variable to be optimized) and the known
# asset class allocations for each fund (defined in fund_matrix)
portfolio_asset_class_allocations = {asset_class: quicksum(portfolio_fund_allocations[fund] * 
                                                           fund_matrix.loc[fund, asset_class]
                                                           for fund in funds)
                                     for asset_class in asset_classes}

# create a dictionary with the squared differences between the portfolio asset class allocation and
# the target asset class allocations for each asset class (defined in asset_classes)
asset_class_allocation_diff_squared = {asset_class: (portfolio_asset_class_allocations[asset_class] -
                                                     asset_class_targets[asset_class]) ** 2
                                                    for asset_class in asset_classes}

# calculate the sum of the squared differences (this is the objective function)
sum_of_squared_diff = quicksum(asset_class_allocation_diff_squared[asset_class] for asset_class in asset_classes)
print("sum of squared diff:\n")
for term in sum_of_squared_diff.terms:
    print(term)

# calculate the sparsity penalty for number of funds included
sparsity_penalty = quicksum(fund_included[fund] for fund in funds)
print("\nsparsity penalty:\n")
for term in sparsity_penalty.terms:
    print(term)

# objective function
sparsity_weight = 0.5
objective = sum_of_squared_diff + (sparsity_weight * sparsity_penalty)

print("\nobjective:\n")
for term in objective.terms:
    print(term)

In [None]:
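# Set the objective (the model is not solved yet: the constraints listed above still need to be added)
# PySCIPOpt's setObjective() raises an error for nonlinear expressions, so minimize an
# auxiliary variable that is constrained to be at least the quadratic objective.
objective_bound = model.addVar(vtype="C", lb=None, name="objective_bound")  # lb=None: no lower bound
model.addCons(objective_bound >= objective)
model.setObjective(objective_bound, sense="minimize")

# Alternative: newer PySCIPOpt releases ship a recipe that appears to do the same reformulation
#from pyscipopt.recipe.nonlinear import set_nonlinear_objective

#set_nonlinear_objective(model, objective)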