# WFP CASE STUDY  
Requirements: Gurobi license; InterpretableAI (IAI) license (optional)

**Problem description**  
In this case study, we use a simplified version of the model proposed by [Peters et al. (2016)](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2880438), which seeks to optimize humanitarian food aid. The extended version aims to provide the World Food Programme (WFP) with a decision-making tool that simultaneously optimizes the food basket to be delivered, the sourcing plan, the delivery plan, and the transfer modality of a long-term recovery operation for each month in a predefined time horizon. The model proposed by Peters et al. enforces that the food baskets address the nutrient gap and are palatable. To guarantee a certain level of palatability, the authors use a number of “unwritten rules” that were defined in collaboration with nutrition experts. In this case study, we go a step further and infer the palatability constraints directly from data that reflect local people's opinions.

**The Model**  
The optimization model is a combination of a capacitated, multi-commodity network flow model and a diet model with constraints for nutrition levels and food basket palatability. 
The sets, parameters, and variables used to define the constraints and the objective function are shown in the following tables:
<img src="figures/WFPsets.jpg" alt="WFP sets" width="300"/>
<img src="figures/WFPparams.jpg" alt="WFP params" width="700"/>
<img src="figures/WFPvars.jpg" alt="WFP vars" width="650"/>

$\min_{x, y, F} \sum_{i \in \mathcal{N_S}} \sum_{j\in \mathcal{N_T \cup N_D}} \sum_{k \in \mathcal{K}} p_{ik}^PF_{ijk} + \sum_{i \in \mathcal{N_S \cup N_T}} \sum_{j \in \mathcal{N_T \cup N_D}} \sum_{k \in \mathcal{K}} p_{ijk}^TF_{ijk}$  

*Subject to*  

$\sum_{j \in \mathcal{N_T}} F_{ijk} = \sum_{j \in \mathcal{N_T}} F_{jik} \ \ \ \forall i \in \mathcal{N_T}, \ \forall k \in \mathcal{K},$  
$\sum_{j \in \mathcal{N_S \cup N_T}} \alpha F_{jik} = d_i \, x_k \, days \ \ \ \forall i \in \mathcal{N_D}, \ \forall k \in \mathcal{K},$  
$ \sum_{j \in \mathcal{N_T \cup N_D}} F_{ijk} \leq cap_{ik}^P \ \ \ \forall i \in \mathcal{N_S}, \ \forall k \in \mathcal{K},$  
$ \sum_{k \in \mathcal{K}} Nutval_{kl} x_{k} \geq Nutreq_{l} \ \ \ \forall l\in\mathcal{L},$  
$ x_{salt} = 5,$  
$ x_{sugar} = 20,$  
$ g(y) \leq 0,$  
$ y = \hat{h}(x),$  
$ F_{ijk}, x_{k} \geq 0 \ \ \ \forall i,j \in  \mathcal{N}, \ \forall k \in \mathcal{K}.$  


For a detailed description of the model we refer to [TBD et al. (2021)](https://google.com).
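
To make the demand constraint concrete, the following sketch works through its units with made-up numbers. It assumes, based on the sugar and salt values used in the code of this notebook (0.2 and 0.05 for 20 g and 5 g), that $x_k$ is measured in units of 100 g per person per day and $F_{ijk}$ in metric tonnes. The number of beneficiaries is hypothetical.

```python
# Hypothetical numbers, chosen only to illustrate the units of the demand constraint.
alpha = 10000          # 1 metric tonne = 10,000 units of 100 g (assumed unit of x_k)
days = 30              # length of the planning period
beneficiaries = 5000   # d_i: people served at demand node i (made up)
x_sugar = 0.2          # 20 g of sugar per person per day, in units of 100 g

# Constraint: alpha * (inflow in tonnes) = d_i * x_k * days
required_units = beneficiaries * x_sugar * days   # in units of 100 g
required_tonnes = required_units / alpha          # inflow needed at the node

print(required_tonnes)
```

Under these assumptions the node needs roughly three tonnes of sugar for the month, which is the quantity the flow variables entering that node have to add up to.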

In [1]:
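import pandas as pd
import numpy as np
import math
from gurobipy import Model, GRB, quicksum, tupledict
from sklearn.utils.extmath import cartesian
from sklearn.metrics import roc_auc_score  # needed for the AUC metrics computed below
import time
import sys
import os
from importlib import reload  # the `imp` module was removed in Python 3.12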

In [None]:
question1 = input("Do you have the InterpretableAI license? Y/n: ")
if question1.upper() == 'Y':
    print('Importing InterpretableAI packages...')
    from interpretableai import iai
else:
    print("Optimal trees will not be used")

In [None]:
## Add modules
sys.path.append(os.path.abspath('../../src'))  # TODO: has to be changed
import ConstraintLearning
import embed_mip as em 
import run_MLmodels as ml

### Data loading
**df_fb**: dataframe of food basket instances  
**nutr_val**: nutritional values for each of the 25 foods  
**nutr_req**: 11 nutrition requirements  
**suppliers**: list of supplier nodes  
**transshippers**: list of transshipment nodes  
**demand**: list of demand nodes  
**cost_p**: matrix of procurement costs  
**cost_t**: matrix of transportation costs  
**nodes**: list of nodes in the network

In [None]:
# Forward slashes work on every platform and avoid invalid escape sequences such as '\S'
nutr_req = pd.read_excel('processed-data/Syria_instance.xlsx', sheet_name='nutr_req', index_col='Type')
nutr_val = pd.read_excel('processed-data/Syria_instance.xlsx', sheet_name='nutr_val', index_col='Food')
demand = pd.read_excel('processed-data/Syria_instance.xlsx', sheet_name='Demand')
suppliers = pd.read_excel('processed-data/Syria_instance.xlsx', sheet_name='Suppliers')
transshippers = pd.read_excel('processed-data/Syria_instance.xlsx', sheet_name='Transhippers')
edges = pd.read_excel('processed-data/Syria_instance.xlsx', sheet_name='EdgesCost')
tcost_matrix = pd.read_excel('processed-data/Syria_instance.xlsx', sheet_name='EdgesCost')
tcost_matrix.drop(['distance', 'duration', 'edge'], axis=1, inplace=True)
cost_p = pd.read_excel('processed-data/Syria_instance.xlsx', sheet_name='FoodCost', index_col='Supplier')
df_fb = pd.read_csv('processed-data/WFP_dataset.csv').sample(frac=1)  # shuffle the rows
bigM = 1e6
I = pd.DataFrame(data=np.concatenate([edges['From'].drop_duplicates(), edges['To'].drop_duplicates()]), columns=['Node']).drop_duplicates()  # list of nodes
TC = pd.DataFrame(cartesian((I['Node'], I['Node'])), columns=['From', 'To'])
TC['tCost'] = np.ones(TC.shape[0])*bigM
for index, row in tcost_matrix.iterrows():
    for index1, row1 in TC.iterrows():
        if row['From'] == row1['From'] and row['To'] == row1['To']:  # access by label, not by position
            TC.loc[index1, 'tCost'] = row['tCost']
cost_t = TC.drop_duplicates()
days = 30
alpha = 10000  # converts metric tonnes to units of 100 g (1 t = 10,000 x 100 g)
nodes = np.unique(TC['From'])
df_fb
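
The nested loop above builds a dense From-To cost table in which every pair without a listed edge receives the prohibitive cost `bigM`. The same idea can be sketched with a dictionary lookup, shown here on a made-up three-node network (the node names and costs are hypothetical and not taken from the Syria instance).

```python
from itertools import product

big_m = 1e6  # prohibitive cost for pairs that are not connected by an edge

# Hypothetical edge list: (from, to) -> transportation cost
known_edges = {("S1", "T1"): 12.5, ("T1", "D1"): 7.0}

# Collect every node that appears at either end of an edge
node_names = sorted({name for edge in known_edges for name in edge})

# Dense cost table: listed edges keep their cost, all other pairs get big_m
dense_cost = {
    (a, b): known_edges.get((a, b), big_m)
    for a, b in product(node_names, repeat=2)
}

print(dense_cost[("S1", "T1")], dense_cost[("S1", "D1")])
```

A lookup per pair avoids comparing every edge with every node pair, which matters once the network grows beyond a handful of nodes.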

# <font color='blue'>OptiCL</font>

## <font color='#3399FF'>Step 1: Conceptual Model</font>

In [None]:
def init_conceptual_model(cost_p):
    conceptual_model = Model('WFP case study')

    N = list(nutr_val.index)  # foods
    M = nutr_req.columns  # nutrient requirements
    
    '''
    Decision variables
    '''
    x = conceptual_model.addVars(N, vtype=GRB.CONTINUOUS, name='x', lb=0)  # variables controlling the food basket
    f = conceptual_model.addVars(edges['From'].drop_duplicates(), edges['To'].drop_duplicates(), N, vtype=GRB.CONTINUOUS, name='flow', lb=0)
    
    '''
    Objective function.
    '''
    conceptual_model.modelSense = GRB.MINIMIZE
    # Procurement costs: flow leaving each supplier, priced per supplier and food
    procurement_cost = quicksum(
        f[supplier, node['To'], food] * float(cost_p.loc[supplier, food])
        for supplier in suppliers['Suppliers']
        for idx, node in edges[edges['From'] == supplier].iterrows()
        for food in N)
    # Transportation costs: flow on every edge, priced per edge
    transportation_cost = quicksum(
        f[node_from, node_to['To'], food]
        * cost_t[(cost_t['From'] == node_from) & (cost_t['To'] == node_to['To'])]['tCost'].values[0]
        for node_from in edges['From'].drop_duplicates()
        for idx_to, node_to in edges[edges['From'] == node_from].iterrows()
        for food in N)
    objective = procurement_cost + transportation_cost
    conceptual_model.setObjective(objective)
    '''
    Flow constraints.
    '''
    conceptual_model.addConstrs(quicksum(f[tr, node['To'], food] for idx, node in edges[edges['From'] == tr].iterrows()) == quicksum(f[node['From'], tr, food] for idx, node in edges[edges['To'] == tr].iterrows())
                                for food in N for tr in transshippers['Transhippers'])
    
    conceptual_model.addConstrs(alpha*quicksum(f[node['From'], dl, food] for idx, node in edges[edges['To'] == dl].iterrows()) == demand[demand['Node'] == dl]['Demand'].values[0]*days*x[food]
                                for food in N for dl in demand['Node'])

    '''
    Nutrients requirements constraint.
    '''
    conceptual_model.addConstrs(quicksum(x[food] * nutr_val.loc[food, req] for food in N) >= nutr_req[req].item() for req in M)
    '''
    Sugar constraint
    '''
    conceptual_model.addConstr(x['Sugar'] == 0.2)
    '''
    Salt constraint
    '''
    conceptual_model.addConstr(x['Salt'] == 0.05)
    
    return conceptual_model, x
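
The nutrient requirement constraint is a weighted sum per nutrient. As a plain-Python illustration with invented foods, nutrient contents, and requirements (none of these numbers come from `nutr_val` or `nutr_req`), the check below reports which nutrients fail $\sum_k Nutval_{kl} x_k \geq Nutreq_l$ for a given basket.

```python
# Invented nutrient content per unit of food
nutval = {
    "Rice": {"Energy": 360.0, "Protein": 7.0},
    "Lentils": {"Energy": 340.0, "Protein": 25.0},
}
# Invented daily requirement per nutrient
nutreq = {"Energy": 2100.0, "Protein": 52.0}

def unmet_nutrients(basket):
    """Return the nutrients whose requirement the basket does not reach."""
    unmet = []
    for nutrient, required in nutreq.items():
        supplied = sum(qty * nutval[food][nutrient] for food, qty in basket.items())
        if supplied < required:
            unmet.append(nutrient)
    return unmet

feasible_basket = unmet_nutrients({"Rice": 4.0, "Lentils": 2.0})
infeasible_basket = unmet_nutrients({"Rice": 5.0, "Lentils": 0.0})
print(feasible_basket, infeasible_basket)
```

In the optimization model the same inequality is imposed on the decision variables `x` instead of being checked after the fact.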

## <font color='#3FCB83'>Step 2: Data Processing</font>
The palatability score is normalized to lie between 0 and 1, where 1 is assigned to the most palatable rations and 0 to the least palatable ones.

In [None]:
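# Normalize the palatability score to lie between 0 and 1 (1 = most palatable)
def normalize(y):
    minimum = 71.969  
    maximum = 444.847  
    return 1 - (y - minimum)/(maximum - minimum)

# Note: normalize() is not called in this cell; the 'label' column is used as stored in the dataset
y = df_fb['label']
X = df_fb.drop(['label'], axis=1, inplace=False)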

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
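
As a quick check of the normalization, the formula maps the smallest raw score to 1 and the largest to 0, so a lower raw score corresponds to a more palatable ration. The snippet restates the `normalize` function from the cell above, with the two bounds exposed as arguments for convenience, and evaluates it at the bounds and at their midpoint.

```python
def normalize(y, minimum=71.969, maximum=444.847):
    # Same formula as in the cell above; the bounds are the constants used there
    return 1 - (y - minimum) / (maximum - minimum)

best = normalize(71.969)                      # raw minimum
worst = normalize(444.847)                    # raw maximum
middle = normalize((71.969 + 444.847) / 2)    # halfway between the bounds

print(best, worst, middle)
```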

## <font color='#CBE221'>Step 3: Learn the predictive models</font>

In [None]:
version = 'v1_exp'
if question1.upper() == 'Y':
    alg_list = ['iai','mlp', 'linear','cart','rf','svm','gbm']
else:
    alg_list = ['mlp', 'linear','cart','rf','svm','gbm']
outcome_list = ['palatability']  # Constraint to be learned

question2 = input('What palatability threshold do you want to use in the constraint? Choose a value between 0 and 1: ')
constraint_extrapolation_type = 'r'
threshold = float(question2)  # input() returns a string; convert it for numeric comparisons
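
Since `input` always returns a string, the threshold has to be converted to a number before it can be compared with the palatability scores. A small, hypothetical helper (`parse_threshold` is not part of the OptiCL code) that also rejects values outside [0, 1] could look like this:

```python
def parse_threshold(text):
    """Convert user input to a float and make sure it lies in [0, 1]."""
    value = float(text.strip())  # raises ValueError for non-numeric input
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"threshold must be between 0 and 1, got {value}")
    return value

parsed = parse_threshold(" 0.5 ")
print(parsed)
```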

In [None]:
performance = pd.DataFrame()
reload(ml)
reload(ConstraintLearning)

if not os.path.exists('../results/'):
    os.makedirs('../results/')

for outcome in outcome_list:
    print(f'Learning a constraint for {outcome}')

    for alg in alg_list:
        if not os.path.exists('../results/%s/' % alg):
            os.makedirs('../results/%s/' % alg)
        print(f'Training {alg}')
        s = 0

        ## Run shallow/small version of RF
        alg_run = 'rf_shallow' if alg == 'rf' else alg

        m, perf = ml.run_model(X_train, y_train, X_test, y_test, alg_run, task = 'continuous', 
                               seed = s, cv_folds = 5, 
                               save = False,
#                               parameter_grid = {'hidden_layer_sizes':[(5),(10)]}
                              )

        ## Save model
        constraintL = ConstraintLearning.ConstraintLearning(X_train, y_train, m, alg)
        constraint_add = constraintL.constraint_extrapolation(constraint_extrapolation_type)
        constraint_add.to_csv('../results/%s/%s_%s_model.csv' % (alg, version, outcome), index = False)

        ## Extract performance metrics
        try:
            perf['auc_train'] = roc_auc_score(y_train >= threshold, m.predict(X_train))
            perf['auc_test'] = roc_auc_score(y_test >= threshold, m.predict(X_test))
        except Exception:  # e.g. AUC is undefined when only one class is present
            perf['auc_train'] = np.nan
            perf['auc_test'] = np.nan

        perf['seed'] = s
        perf['outcome'] = outcome
        perf['alg'] = alg
        perf['save_path'] = '../results/%s/%s_%s_model.csv' % (alg, version, outcome)
        
            
        perf.to_csv('../results/%s/%s_%s_performance.csv' % (alg, version, outcome), index = False)
        
        performance = pd.concat([performance, perf])  # DataFrame.append was removed in pandas 2.0
        print()
print('Saving the performance...')
performance.to_csv('../results/%s_performance.csv' % version, index = False)
print('Done!')
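
The AUC entries above treat the regression output as a ranking score: the true palatability is binarized at the threshold, and the continuous predictions are used to rank the rations. The sketch below reproduces that idea in plain Python with a pairwise definition of AUC and made-up values. It is meant as an illustration, not as a replacement for scikit-learn's `roc_auc_score`.

```python
def pairwise_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for lab, s in zip(labels, scores) if lab]
    negatives = [s for lab, s in zip(labels, scores) if not lab]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

threshold = 0.5
y_true = [0.9, 0.7, 0.4, 0.2]    # made-up normalized palatability scores
y_pred = [0.8, 0.45, 0.6, 0.1]   # made-up model predictions

labels = [y >= threshold for y in y_true]   # binarize the true scores
auc = pairwise_auc(labels, y_pred)
print(auc)
```

Three of the four positive-negative pairs are ranked correctly here, so the AUC is 0.75. With only one class present the function raises an error, which mirrors why the notebook falls back to `NaN` in that case.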

## <font color='#EA4A34'>Step 4: Predictive model selection</font>


### Model selection

In [None]:
constraints_embed = ['palatability']
objectives_embed = {}
version = 'v1_exp'
performance = pd.read_csv('../results/%s_performance.csv' % version)
performance.dropna(axis='columns')

Automatic selection based on measure  
**valid_score**: r2
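
The selection rule can be pictured as picking, for each outcome, the algorithm with the highest `valid_score`. The snippet below sketches that rule on invented scores. It is an assumption about what the selection amounts to and does not reproduce the internals of `em.model_selection`.

```python
# Invented validation scores (higher is better); not results from this notebook
candidates = [
    {"outcome": "palatability", "alg": "linear", "valid_score": 0.61},
    {"outcome": "palatability", "alg": "cart", "valid_score": 0.74},
    {"outcome": "palatability", "alg": "rf", "valid_score": 0.70},
]

def select_best(rows, outcome):
    """Return the row with the highest validation score for the given outcome."""
    matching = [row for row in rows if row["outcome"] == outcome]
    return max(matching, key=lambda row: row["valid_score"])

best_model = select_best(candidates, "palatability")
print(best_model["alg"])
```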

In [None]:
reload(em)
model_master = em.model_selection(performance, constraints_embed, objectives_embed)
model_master

### Constraints embedding and Model optimization

In [None]:
palatability_threshold = question2
trust_region = input("Do you want to use the trust region? True/False ").strip().lower() == 'true'  # bool('False') would be True
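
One caveat when reading a True/False answer with `input`: the result is a string, and any non-empty string is truthy, so `bool('False')` evaluates to `True`. A minimal sketch of an explicit conversion follows (the helper name `parse_bool` is made up for this example).

```python
def parse_bool(text):
    """Interpret 'true'/'false' (any case, surrounding spaces allowed) as a boolean."""
    cleaned = text.strip().lower()
    if cleaned not in ("true", "false"):
        raise ValueError(f"expected True or False, got {text!r}")
    return cleaned == "true"

naive = bool("False")          # True, because the string is non-empty
parsed = parse_bool("False")   # False, as the user intended
print(naive, parsed)
```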

In [None]:
model_master['lb'] = float(palatability_threshold)
model_master['ub'] = None
em.check_model_master(model_master)
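
With a lower bound and no upper bound, the learned constraint $g(y) \leq 0$ from the model formulation presumably reduces to a simple bound on the predicted palatability. Writing $\tau$ for the chosen threshold (a symbol introduced here only for illustration):

$g(y) = \tau - y \leq 0 \ \ \Longleftrightarrow \ \ \hat{h}(x) \geq \tau.$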

In [None]:
def getSolution(model, X):
    solution = {}
    palatability = 0
    count = 0
    for v in model.getVars():
        if 'x[' in v.varName:
            solution[list(X.columns)[count]]=[v.x]
            print(v.varName)
            count += 1
            
    for v in model.getVars():
        if 'y_palatability' == v.varName:
            palatability = v.x
    return solution, palatability

In [None]:
reload(em)
result = {}
for i in range(1):
    print(f'iter {i}')
    np.random.seed(seed=i)
    price_random = pd.DataFrame(np.random.random((len(suppliers), len(nutr_val)))*1000, columns=cost_p.columns, index=cost_p.index)
    conceptual_model, x = init_conceptual_model(price_random)
    conceptual_model.update()
    MIP_final_model = em.optimization_MIP(conceptual_model, x, model_master, X, tr=bool(trust_region))
    MIP_final_model.Params.LogToConsole = 0
    start_time = time.time()
    MIP_final_model.optimize()
    computation_time = time.time() - start_time
    status = MIP_final_model.Status  # optimize() returns None; the status code is a model attribute

### Optimal Solution

In [None]:
solution, result['predicted_palatability'] = getSolution(MIP_final_model, X)
result['objective_function'] = MIP_final_model.objVal
result['time'] = computation_time
result['algorithm'] = model_master['model_type'].item()
result['parameters'] = performance[performance['alg']==model_master['model_type'].item()]['best_params'].item()
result['validation MSE'] = -performance[performance['alg']==model_master['model_type'].item()]['valid_score'].item()

In [None]:
X