The goal of this notebook is to take the ad hoc implementation in `MIPExample1.ipynb` and vectorize the operations such that any size Tyche problem can be converted to a MIP and solved.

In [1]:
import os
import sys
sys.path.insert(0, os.path.abspath("../src"))

In [2]:
import numpy             as np
import matplotlib.pyplot as pl
import pandas            as pd
import seaborn           as sb
import tyche             as ty

from copy            import deepcopy
from IPython.display import Image 

In [3]:
import cProfile
import timeit

In [4]:
from mip import Model, minimize, BINARY, xsum

In [5]:
designs = ty.Designs("data")
investments = ty.Investments("data")
designs.compile()
tranche_results = investments.evaluate_tranches(designs, sample_count=250)
results = investments.tranches.join(tranche_results.summary)
evaluator = ty.Evaluator(investments.tranches, tranche_results.summary)

Get the wide-format interpolated elicitation data from the Tyche Evaluator and reset the multi-level index.

In [6]:
wide = evaluator.evaluate_corners_wide().reset_index()

In [7]:
wide

Index,CIGS,CdTe,GaAs,InGaP,Perovskite,Polysilicon,Power Electronics,Soft Costs,Capital,Efficiency,GHG,Hazardous,LCOE,Lifetime,Strategic,Yield
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-1.530424,2.068725,-0.003592,0.972416,-0.144040,187.983314,0.063388,9999.033113
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1000000.0,-1.450576,2.068036,-0.003592,0.973190,-0.141542,187.983324,0.063388,9999.031773
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5000000.0,-1.280386,2.068401,-0.003592,0.972725,-0.136217,187.983370,0.063388,9999.036431
3,0.0,0.0,0.0,0.0,0.0,0.0,1000000.0,0.0,-1.465804,2.067513,-0.003592,0.961801,-0.138266,187.983293,0.063388,10063.916501
4,0.0,0.0,0.0,0.0,0.0,0.0,1000000.0,1000000.0,-1.385955,2.066823,-0.003592,0.962575,-0.135768,187.983303,0.063388,10063.915161
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6556,3000000.0,6000000.0,7500000.0,9500000.0,9500000.0,4500000.0,1000000.0,1000000.0,3.272544,2.334725,-0.003592,0.799772,0.054470,187.979747,0.043162,10277.622035
6557,3000000.0,6000000.0,7500000.0,9500000.0,9500000.0,4500000.0,1000000.0,5000000.0,3.442734,2.335090,-0.003592,0.799307,0.059795,187.979793,0.043162,10277.626693
6558,3000000.0,6000000.0,7500000.0,9500000.0,9500000.0,4500000.0,5000000.0,0.0,3.278471,2.336357,-0.003592,0.790923,0.056700,187.979821,0.043162,10319.807406
6559,3000000.0,6000000.0,7500000.0,9500000.0,9500000.0,4500000.0,5000000.0,1000000.0,3.358319,2.335667,-0.003592,0.791697,0.059199,187.979831,0.043162,10319.806065


**Input to MIP constructor needed**: List of investment categories

**Input to MIP constructor needed**: List of metrics to optimize and/or constrain

**Data check**: Confirm that all elements of both input lists match columns in the `evaluate_corners_wide()` data frame.

In [8]:
categories = ['Soft Costs', 'Perovskite', 'CIGS']

metrics = ['Capital']

Fill in the various index values for the one-by-one data set.

In [9]:
# Number of investment categories
Inv = len(categories)

# Number of metrics
J = len(metrics)

In [10]:
# Named series of the number of elicited funding levels in each investment category of interest
l = pd.Series(data=wide.nunique(axis=0, dropna=True)[categories], index=categories)

# Named series of the number of linear intervals in each investment category of interest
n = pd.Series(data=[x - 1 for x in l], index=categories)

# Named series of the number of new lambda variables in each investment category of interest
k = l.copy()

In [57]:
levels = pd.DataFrame(columns=categories)
for i in categories:
    levels.loc[:,i] = pd.Series(pd.unique(wide.loc[:,categories[0]]))

In [58]:
levels

Unnamed: 0,Soft Costs,Perovskite,CIGS
0,0.0,0.0,0.0
1,1000000.0,1000000.0,1000000.0
2,5000000.0,5000000.0,5000000.0


Pull out the investment values and metric values from the data set.

In [11]:
# Investment levels
v = wide.loc[:,categories]

# Elicited metric values
m = wide.loc[:,metrics]

In [27]:
v

Index,Soft Costs,Perovskite,CIGS
0,0.0,0.0,0.0
1,1000000.0,0.0,0.0
2,5000000.0,0.0,0.0
3,0.0,0.0,0.0
4,1000000.0,0.0,0.0
...,...,...,...
6556,1000000.0,9500000.0,3000000.0
6557,5000000.0,9500000.0,3000000.0
6558,0.0,9500000.0,3000000.0
6559,1000000.0,9500000.0,3000000.0


In [28]:
m

Index,Capital
0,-1.530424
1,-1.450576
2,-1.280386
3,-1.465804
4,-1.385955
...,...
6556,3.272544
6557,3.442734
6558,3.278471
6559,3.358319


**Question**: Is there a standard way we're setting up the budget constraint(s)? e.g. Will there always be as many budget constraints as there are investment categories or always only one budget constraint?

Define upper bound(s) for budget constraint(s).

In [12]:
B = 3000000.0

Instantiate the MIP optimization problem.

In [62]:
example = Model()
bin_vars = pd.DataFrame(columns=categories)
lmbd_vars = pd.DataFrame(columns=categories)

Create binary (integer) variables $y_{in_i}$.

In [63]:
for i in categories:
    for n_i in range(n.loc[i]):
        #_name = 'y' + '_' + str(i) + '_' + str(n_i)
        bin_vars.loc[n_i, i] = example.add_var(var_type=BINARY)

Look at structure of `bin_vars` data frame to verify correct variable creation.

In [64]:
bin_vars

Unnamed: 0,Soft Costs,Perovskite,CIGS
0,var(0),var(2),var(4)
1,var(1),var(3),var(5)


Create continuous $\lambda$ variables with lower bound 0.0 and upper bound 1.0

In [65]:
for i in categories:
    for k_i in range(k.loc[i]):
        #_name = 'lmbd' + '_' + str(i) + '_' + str(k_i)
        lmbd_vars.loc[k_i, i] = example.add_var(lb=0.0, ub=1.0)

Check structure of `lmbd_vars`.

In [93]:
sum(sum(lmbd_vars.values))

<mip.entities.LinExpr at 0x1ba281ad148>

Ideally the $\lambda$ and $y_{in_i}$ variables would not need to be defined with nested `for` loops. However, these loops will execute quite quickly for the foreseeable problem types, so vectorizing the operations is a low priority.

Create budget constraint as a function of the $\lambda$ variables and the elicited investment levels from the data set.

In [87]:
example += xsum(lmbd_vars.loc[i][j] * levels.loc[i][j] for i in range(len(levels)) for j in categories) <= B, 'Budget'

Convexity constraints on $\lambda$ variables.

In [19]:
example += sum(sum(lmbd_vars.values)) == 1, 'Lambda convexity' 

**In progress/Not implemented from this point onward**

Constrain binary $y$ variables within each category such that at most one of the $y$ variables per category can be equal to 1.

In [20]:
example += sum(y_1) == 1, 'Interval selection'

Interval constraints on $y$ variables and $\lambda$ variables.

In [21]:
example += y_1[0] <= lmbd_1[0] + lmbd_1[1]

In [22]:
example += y_1[1] <= lmbd_1[1] + lmbd_1[2]

Create objective function: Capital (metric) as a function of $\lambda$s and $y$s.

In [23]:
example.objective = minimize(-1.0 * xsum(lmbd_1[i] * m_1[i] for i in range(k_1)))

Optimize.

In [24]:
example.optimize()

<OptimizationStatus.OPTIMAL: 0>

Print optimal objective function value, with a negative applied to reverse the -1.0 in the objective function definition (minimizing a negative -> maximize a positive).

In [25]:
-1.0 * example.objective_value

-1.3555357403466024

Get the optimal $\lambda$ and $y$ values.

In [26]:
for v in example.vars:
    print('{} : {}'.format(v.name, v.x))

y_1 : 0.0
y_1 : 1.0
lmbd_1 : 0.0
lmbd_1 : 0.5
lmbd_1 : 0.5


\$3,000,000 is halfway between the second and third investment levels, so this solution is the correct optimum.