# Neural Network training with CORLAT dataset, using a feasibility promoting weighted BCE loss

This notebook explores the training of a simple Multi Layer Perceptron (MLP) neural network for the CORLAT dataset.

The MLP outputs assignments of binary variables for the CORLAT dataset. 

The idea behind the custom loss is to give higher weights to assignments that result in a better objective value (lower for minimization, higher for maximization). Training the MLP in this experiment differs from most neural network training paradigms. The important thing to note is that:

$$\color{lightblue}\text{For each sample, we have multiple sets of assignments}$$

For example, sample 1 may come with 100 solutions, each of which is a set of binary assignments.

We train on every feasible solution gathered (up to `n_sols` specified during data collection using the `corlat.py` script).

The idea is to learn the conditional probability distribution $$p(Y^{i} | X^{i}) \quad \text{for} \quad i=1, 2, \dots, n $$

over feasible assignments, where $n$ is the number of samples and $i$ indexes the $i$-th sample. That is, $p(y^{i}_{j} = 1 | X^{i})$ is the probability of assigning a $1$ to binary variable $j$ of sample $i$ such that the assignment is feasible.

The weight attached to each set of assignments therefore encourages the network to favor assignments with better objective values.
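
The weights used in this notebook are loaded from a file, and the formula behind them is not shown here. Purely as an illustration of the idea, the sketch below turns the objective values of a sample's solutions into weights with a softmax, so that better objective values receive larger weights. The function name `objective_to_weights` and the `temperature` parameter are assumptions made for this example, not part of the project.

```python
import math


def objective_to_weights(obj_values, maximize=False, temperature=1.0):
    """Hypothetical weighting scheme: softmax over (signed) objective values.

    Better objective values receive larger weights and the weights of one
    sample sum to 1. This is an assumption for illustration only.
    """
    # Flip the sign for minimization so that "larger is better" in both cases
    signed = [v if maximize else -v for v in obj_values]
    # Subtract the maximum for numerical stability before exponentiating
    top = max(signed)
    exps = [math.exp((s - top) / temperature) for s in signed]
    total = sum(exps)
    return [e / total for e in exps]


# Three solutions of one sample with their objective values
weights_min = objective_to_weights([10.0, 12.0, 15.0], maximize=False)
weights_max = objective_to_weights([10.0, 12.0, 15.0], maximize=True)
print(weights_min)
print(weights_max)
```

For minimization the first solution gets the largest weight, and for maximization the last one does.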





In [33]:
import torch
import random
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torch.utils.data import TensorDataset
from torch.optim import AdamW
from xgboost import XGBClassifier

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
import pickle as pkl
import scipy
import os
import gc

from torch.nn import Linear, ReLU, Dropout
from torch.nn.functional import relu
from sklearn.model_selection import train_test_split

from sklearn.metrics import f1_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import confusion_matrix

import gurobipy as gb
import time

from operator import itemgetter
from typing import *
import numpy.typing as npt

In [34]:
!nvidia-smi -L

GPU 0: NVIDIA A100 80GB PCIe (UUID: GPU-4045b1e6-3428-f9e1-5643-862c4834363d)
GPU 1: NVIDIA A100 80GB PCIe (UUID: GPU-35ac16d5-81e8-f772-b9cb-a681af1fd2b5)
  MIG 2g.20gb     Device  0: (UUID: MIG-98c0ec5f-a99f-58b2-bbfd-d5521a6986ce)
GPU 2: NVIDIA A100 80GB PCIe (UUID: GPU-d949dd0a-b88e-ee87-9621-3a824f914f82)
GPU 3: NVIDIA A100 80GB PCIe (UUID: GPU-8c13f3ad-24ee-bb68-5eff-7f73091e682a)


In [35]:
!nvidia-smi

Mon Jun 12 19:58:15 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.43.04    Driver Version: 515.43.04    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA A100 80G...  On   | 00000000:17:00.0 Off |                   On |
| N/A   64C    P0   163W / 300W |  12227MiB / 81920MiB |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100 80G...  On   | 00000000:65:00.0 Off |                   On |
| N/A   37C    P0    65W / 300W |   2250MiB / 81920MiB |     N/A      Default |
|       

### If there are multiple GPUs, you can choose which one to use

In [36]:
# set CUDA_VISIBLE_DEVICES=0
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

### Set CUBLAS config for deterministic `torch` behavior

In [37]:
os.environ['CUBLAS_WORKSPACE_CONFIG'] = ":4096:8"
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

cuda:0


### Function to set random seeds for reproducibility

In [38]:
def set_seeds(seed):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.benchmark = (
        False  # Force cuDNN to use a consistent convolution algorithm
    )
    torch.backends.cudnn.deterministic = (
        True  # Force cuDNN to use deterministic algorithms if available
    )
    torch.use_deterministic_algorithms(
        True
    )  # Force torch to use deterministic algorithms if available


### Load the dataset

If the dataset cannot be loaded from the current working directory, we `cd` to the project directory and try again.

In [39]:
dataset_path = "Data/corlat_presolved/processed_data/corlat_presolved_preprocessed.pickle"
try:
    with open(dataset_path, "rb") as f:
        corlat_dataset = pkl.load(f)
except FileNotFoundError:
    # move dir to /ibm/gpfs/home/yjin0055/Project/DayAheadForecast and retry
    os.chdir("/ibm/gpfs/home/yjin0055/Project/DayAheadForecast")
    with open(dataset_path, "rb") as f:
        corlat_dataset = pkl.load(f)

In [40]:
binary_indices = corlat_dataset[0]["indices"]["indices"]

In [43]:
# read X_train, X_test, y_train, y_test from Data/corlat_presolved/ using numpy.load
X_train = np.load("Data/corlat_presolved/train_test_data/X_train.npy")
X_test = np.load("Data/corlat_presolved/train_test_data/X_test.npy")
y_train = np.load("Data/corlat_presolved/train_test_data/y_train.npy", allow_pickle=True)
y_test = np.load("Data/corlat_presolved/train_test_data/y_test.npy", allow_pickle=True)

In [46]:
y_train.dtype

dtype('O')

### Convert all targets to binary values, ensuring there are no values such as -0.0

In [12]:
# for each instance in y_train and y_test, convert it to binary
for i in range(y_train.shape[0]):
    # make all values non-negative using abs
    # y_train[i] is an array of shape (n_sols_i, num_vars), where n_sols_i varies per sample
    y_train[i] = np.abs(y_train[i])
    
    # use numpy where to convert values > 0.5 to 1, and values <= 0.5 to 0
    y_train[i] = np.where(y_train[i] > 0.5, 1.0, 0.0)
    
for i in range(y_test.shape[0]):
    # make all values non-negative using abs
    # y_test[i] is an array of shape (n_sols_i, num_vars), where n_sols_i varies per sample
    y_test[i] = np.abs(y_test[i])
    
    # use numpy where to convert values > 0.5 to 1, and values <= 0.5 to 0
    y_test[i] = np.where(y_test[i] > 0.5, 1.0, 0.0)

In [13]:
y_train[0]

array([[1., 0., 0., ..., 1., 1., 1.],
       [1., 0., 0., ..., 0., 1., 1.],
       [1., 0., 1., ..., 0., 1., 1.],
       ...,
       [1., 1., 1., ..., 0., 1., 1.],
       [1., 0., 1., ..., 0., 1., 1.],
       [1., 0., 1., ..., 1., 1., 1.]])
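
As a small check of the binarization step, the sketch below applies the same `np.abs` and `np.where` operations to a toy array that contains `-0.0` and values that are only approximately 0 or 1. The array `raw` is made up for this example.

```python
import numpy as np

# Toy solver output with a negative zero and near-integer values
raw = np.array([[-0.0, 1.0, 0.9999999],
                [1e-9, -1.0, 0.0]])

# Same two operations as in the cell above, applied in one expression
binary = np.where(np.abs(raw) > 0.5, 1.0, 0.0)

print(binary)
# np.signbit is True for -0.0, so this confirms no negative zeros remain
print(np.signbit(binary).any())
```

Note that `np.abs` also maps a value such as `-1.0` to `1`, so the step assumes that negative entries only arise from numerical noise or sign artifacts.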

### Load the training and testing sample indices

In [14]:
# train and test indices
train_indices = np.load("Data/corlat_presolved/train_test_data/train_idx.npy")
test_indices = np.load("Data/corlat_presolved/train_test_data/test_idx.npy")

In [15]:
n_features = X_train.shape[1]
out_channels = y_train[0].shape[1]

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

In [18]:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(n_features, n_features//8)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(n_features//8, n_features//16)
        self.fc3 = nn.Linear(n_features//16, n_features//32)
        self.fc4 = nn.Linear(n_features//32, out_channels)
        self.sigmoid = nn.Sigmoid()
        
        # add regularization
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc3(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc4(x)
        x = self.sigmoid(x)
        
        return x

In [16]:
print("n_features: ", n_features)
print("out_channels: ", out_channels)

n_features:  14602
out_channels:  100


### Load the weights for the feasibility promoting weighted BCELoss

In [17]:
weights = np.load("Data/corlat_presolved/train_test_data/train_weights.npy", allow_pickle=True)

In [19]:
config = {
        'train_val_split': [0.80, 0.20], # These must sum to 1.0
        'batch_size' : 32, # Num samples to average over for gradient updates
        'EPOCHS' : 1000, # Num times to iterate over the entire dataset
        'LEARNING_RATE' : 5e-4, # Learning rate for the optimizer
        'BETA1' : 0.9, # Beta1 parameter for the Adam optimizer
        'BETA2' : 0.999, # Beta2 parameter for the Adam optimizer
        'WEIGHT_DECAY' : 1e-4, # Weight decay parameter for the Adam optimizer
    }

### Define a custom class for our dataset

The custom class is needed because our `y` data is an `object`-dtype array: `y` holds `n` samples, and each `y[i]` has shape `n_sols` x 100, where 100 is the number of binary outputs. `n_sols` is at most 100, since we collect up to 100 possible solutions per sample. However, `n_sols` varies from sample to sample, as some samples have fewer than 100 solutions.
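
To make the ragged layout concrete, here is a minimal sketch that builds an `object`-dtype array whose elements have different numbers of solutions. The toy arrays in `sols` are made up for this example; only the layout mirrors the description above.

```python
import numpy as np

# Two toy samples: the first has 3 solutions, the second only 1,
# each solution being an assignment of 100 binary variables
sols = [np.ones((3, 100)), np.zeros((1, 100))]

# Allocate an object array first, then fill it element by element,
# which avoids NumPy trying to stack the ragged arrays into one block
y = np.empty(len(sols), dtype=object)
for i, s in enumerate(sols):
    y[i] = s

print(y.dtype, y.shape)
print([y_i.shape for y_i in y])
```

Because the outer array only stores references, it cannot be converted into a single tensor, which is why the dataset class converts one `y[i]` at a time.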

In [20]:
class multipleTargetCORLATDataset(TensorDataset):
    def __init__(self, X, y, weights=None, test=False):
        super(multipleTargetCORLATDataset, self).__init__()
        self.X = X
        self.y = y
        self.weights = weights
        self.test = test
        # self.obj_coeffs = get_nth_feature(self.X, 1)
        
    def __getitem__(self, index):
        X = self.X[index]
        y = self.y[index]
        
        # duplicate X to match the number of targets
        # X = np.repeat(X[np.newaxis,:], y.shape[0], axis=0)
    
        if self.weights is None and self.test:
            return torch.tensor(X, dtype=torch.float32), torch.tensor(y, dtype=torch.float32)
        
        
        
        weights = self.weights[index]
        
        
        X_tensor = torch.tensor(X, dtype=torch.float32)
        y_tensor = torch.tensor(y, dtype=torch.float32)
        weights_tensor = torch.tensor(weights, dtype=torch.float32)

        # obj_coeffs_tensor = torch.tensor(self.obj_coeffs[index], dtype=torch.float32)
        return X_tensor, y_tensor, weights_tensor
    
    def __len__(self):
        return len(self.X)

### A collate function is defined for our DataLoader

The collate function determines how the individual samples returned by the dataset are combined into a batch by the `DataLoader`. A custom collate function is needed because our data must be returned in a different format than the default.

We return `X` in our collate function as stacked samples of `X` (which is how a `DataLoader` handles data by default), resulting in a `batch_size` x `size_of_X` shape for `X`.

For `Y` we return a list of `Y` samples, whose elements have varying shapes. Each element of the list has shape `n_sols_i` x 100, where `n_sols_i` is the `n_sols` of the `i`th sample.

The `weights` are returned in the same way, as a list. Each element of the list has shape `n_sols_i`, where `n_sols_i` is the `n_sols` of the `i`th sample.
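
The following self-contained sketch shows the same batching idea on toy data: the inputs are stacked while the ragged targets stay in a list. `ToyRaggedDataset` and `ragged_collate` are hypothetical names used only in this example.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyRaggedDataset(Dataset):
    """Three samples with 4 features each and 2, 1 and 5 solutions of 3 variables."""

    def __init__(self):
        self.X = [torch.zeros(4), torch.ones(4), torch.full((4,), 2.0)]
        self.Y = [torch.zeros(2, 3), torch.ones(1, 3), torch.zeros(5, 3)]

    def __len__(self):
        return len(self.X)

    def __getitem__(self, index):
        return self.X[index], self.Y[index]


def ragged_collate(batch):
    # Inputs share one shape, so they can be stacked into (batch_size, n_features)
    X = torch.stack([x for x, _ in batch])
    # Targets differ in their first dimension, so they are kept as a list
    Y = [y for _, y in batch]
    return X, Y


loader = DataLoader(ToyRaggedDataset(), batch_size=3, shuffle=False,
                    collate_fn=ragged_collate)
X_batch, Y_batch = next(iter(loader))
print(X_batch.shape)
print([tuple(y.shape) for y in Y_batch])
```

The default collate function would try to stack the targets as well, which fails when their first dimensions differ.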

In [31]:

def collate_fn(data):
    # data is a list of tuples (X, Y, weights), or (X, Y) for the test set
    X = torch.stack([item[0] for item in data])
    Y = [item[1] for item in data]
    
    # only X and Y, no weights
    if len(data[0]) == 2:
        return X, Y
    
    weights = [item[2] for item in data]
    
    return X, Y, weights
    

### Definition for feasibility promoting weighted BCELoss
The custom loss calculates the BCE loss between the predicted binary targets and each of the sample's solutions (keep in mind that a sample has at most 100 solutions).

For each solution of a sample, we take the mean of the BCE losses over the 100 binary variables and multiply it by that solution's weight. We then sum the weighted losses over all `n_sols_i` solutions of the sample.

For each batch, we then take the mean loss of the batch.
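
Written out, the description above corresponds to the following expression. The symbols $B$ (batch size), $S_i$ (the `n_sols_i` of sample $i$), $m$ (the number of binary variables, 100 here), $w^{i}_{k}$ (the weight of solution $k$ of sample $i$), $y^{i,k}_{j}$ (the value of variable $j$ in that solution) and $\hat{y}^{i}_{j}$ (the network output, an estimate of $p(y^{i}_{j} = 1 | X^{i})$) are notation introduced here for convenience.

$$\mathcal{L} = \frac{1}{B}\sum_{i=1}^{B}\sum_{k=1}^{S_i} w^{i}_{k}\left[-\frac{1}{m}\sum_{j=1}^{m}\Big(y^{i,k}_{j}\log\hat{y}^{i}_{j} + \big(1-y^{i,k}_{j}\big)\log\big(1-\hat{y}^{i}_{j}\big)\Big)\right]$$

The inner sum is the mean BCE over the variables of one solution, the middle sum adds the weighted solutions of one sample, and the outer average is taken over the batch.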

In [None]:
# custom loss for neural network
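def feasibility_promoting_weighted_BCELoss(y_pred: torch.Tensor, y_true: List[torch.Tensor], weights: List[torch.Tensor], device: torch.device):
    
    batch_loss = []
    
    loss_fn = nn.BCELoss(reduction='none')
        
    # loop over the samples in the batch
    for i in range(len(y_true)):
        # broadcast the prediction of sample i against each of its solutions,
        # then average the BCE over the binary variables: shape (n_sols_i,)
        loss = torch.mean(loss_fn(y_pred[i].expand(len(y_true[i]), -1), y_true[i].to(device)), dim=1)
        # weight the loss of each solution
        loss = torch.mul(loss, weights[i].to(device))
        # sum over the solutions of sample i
        batch_loss.append(torch.sum(loss))
    
    # mean over all samples in the batch
    batch_loss = torch.mean(torch.stack(batch_loss))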

    return batch_loss
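
As a hand-checkable illustration of the same computation, the sketch below re-implements the loss in plain Python on a toy batch. It is not the function used for training: `weighted_multi_solution_bce` is a hypothetical name, and the clamping of the logarithm that `nn.BCELoss` performs is omitted.

```python
import math


def bce(p, y):
    """Binary cross-entropy of one predicted probability p against a 0/1 target y."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def weighted_multi_solution_bce(y_pred, y_true, weights):
    """Mean over samples of the weighted sum of per-solution mean BCE."""
    batch_losses = []
    for pred, sols, sol_weights in zip(y_pred, y_true, weights):
        sample_loss = 0.0
        for sol, w in zip(sols, sol_weights):
            # mean BCE over the binary variables of this solution
            mean_bce = sum(bce(p, y) for p, y in zip(pred, sol)) / len(pred)
            # weighted sum over the solutions of this sample
            sample_loss += w * mean_bce
        batch_losses.append(sample_loss)
    # mean over the samples in the batch
    return sum(batch_losses) / len(batch_losses)


# Toy batch: 2 samples, 2 binary variables, 2 and 1 solutions respectively
y_pred = [[0.5, 0.5], [0.5, 0.5]]
y_true = [[[1, 0], [0, 1]], [[1, 1]]]
weights = [[0.75, 0.25], [1.0]]

loss = weighted_multi_solution_bce(y_pred, y_true, weights)
print(loss)
```

With every prediction at 0.5, each BCE term equals $\log 2$, and because the toy weights of each sample sum to 1 the batch loss is $\log 2$ as well.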

### Create train and test dataset

In [40]:
train_dataset = multipleTargetCORLATDataset(X_train, y_train, weights=weights)

In [41]:
test_dataset = multipleTargetCORLATDataset(X_test, y_test, test=True)

### Initialize neural network, train loader, valid loader, and optimizer.

We use a one-cycle learning rate schedule, in which the learning rate warms up to a peak of `max_lr` and then decays to a small value.
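
The shape of such a schedule can be sketched in plain Python. The function `one_cycle_lr` below is a hypothetical approximation that assumes the default settings documented for PyTorch's `OneCycleLR` (`pct_start=0.3`, `div_factor=25`, `final_div_factor=1e4`, cosine annealing). It is not the library's implementation.

```python
import math


def one_cycle_lr(step, total_steps, max_lr,
                 pct_start=0.3, div_factor=25.0, final_div_factor=1e4):
    """Approximate learning rate at a given step of a two-phase one-cycle schedule."""
    initial_lr = max_lr / div_factor          # learning rate at step 0
    min_lr = initial_lr / final_div_factor    # learning rate at the last step
    warmup_end = pct_start * total_steps - 1  # last step of the warm-up phase

    def cos_anneal(start, end, pct):
        # Cosine interpolation from start (pct=0) to end (pct=1)
        return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)

    if step <= warmup_end:
        return cos_anneal(initial_lr, max_lr, step / warmup_end)
    pct = (step - warmup_end) / (total_steps - 1 - warmup_end)
    return cos_anneal(max_lr, min_lr, pct)


total_steps = 1000
lrs = [one_cycle_lr(s, total_steps, max_lr=5e-4) for s in range(total_steps)]
print(lrs[0], max(lrs), lrs[-1])
```

Under these assumptions a `max_lr` of `5e-4` gives a starting learning rate of `5e-4 / 25 = 2e-5`, which agrees with the `lr: 0.000020` reported for the first epochs in the training log.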

In [None]:
net = NeuralNetwork()

batch_size = 32
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, collate_fn=collate_fn)

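batch_size_test = 32
# keep the test set in its original order so that batch i lines up with the slice of test_models used later
valid_loader = DataLoader(test_dataset, batch_size=batch_size_test, shuffle=False, collate_fn=collate_fn)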

params = list(net.parameters())

optimizer = optim.Adam(net.parameters(), lr=0.0001)
total_steps = len(train_loader)

scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=config['LEARNING_RATE'], steps_per_epoch=total_steps, epochs=config['EPOCHS'])

In [49]:
net = net.to(device)

In [50]:
set_seeds(42)

### The main training loop

If the mean training loss of an epoch is lower than the lowest loss of all previous epochs, the model is saved.

In [51]:
loss_list = []

for epoch in range(config["EPOCHS"]):
    running_loss = 0.0
    curr_lr = optimizer.param_groups[0]['lr']
    for i, data in enumerate(train_loader):
        inputs, labels, weights = data
        
        inputs = inputs.to(device)        
        optimizer.zero_grad()
        outputs = net(inputs)
        
        loss = feasibility_promoting_weighted_BCELoss(outputs, labels, weights, device=device)
        loss.backward()
        optimizer.step()
        scheduler.step()
        running_loss += loss.item()
    print('Epoch %d loss: %.3f lr: %.6f' % (epoch + 1, running_loss / len(train_loader), curr_lr))
    
    # if the training loss is lower than the lowest loss so far, save the model
    if len(loss_list) > 0:
        print("min loss: ", min(loss_list))
        if (running_loss / len(train_loader)) < min(loss_list):
            torch.save(net.state_dict(), "Data/corlat_presolved/models/MLP_corlat_presolved_constraint_weighted_loss.pth")
            print("Model saved")
    
    loss_list.append(running_loss / len(train_loader))


Epoch 1 loss: 0.684 lr: 0.000020
Epoch 2 loss: 0.634 lr: 0.000020
min loss:  0.6844135401200275
Model saved
Epoch 3 loss: 0.615 lr: 0.000020
min loss:  0.6340614399131463
Model saved
Epoch 4 loss: 0.603 lr: 0.000020
min loss:  0.6148994783965909
Model saved
Epoch 5 loss: 0.592 lr: 0.000020
min loss:  0.6025286633141187
Model saved
Epoch 6 loss: 0.579 lr: 0.000020
min loss:  0.592104079772015
Model saved
Epoch 7 loss: 0.568 lr: 0.000020
min loss:  0.5792618965616032
Model saved
Epoch 8 loss: 0.556 lr: 0.000021
min loss:  0.5682333634824169
Model saved
Epoch 9 loss: 0.543 lr: 0.000021
min loss:  0.5556475532298185
Model saved
Epoch 10 loss: 0.537 lr: 0.000021
min loss:  0.5428663115112149
Model saved
Epoch 11 loss: 0.533 lr: 0.000021
min loss:  0.5373138608981152
Model saved
Epoch 12 loss: 0.532 lr: 0.000022
min loss:  0.5332775973543828
Model saved
Epoch 13 loss: 0.531 lr: 0.000022
min loss:  0.5316924379796398
Model saved
Epoch 14 loss: 0.530 lr: 0.000022
min loss:  0.5312055945396423


In [53]:
# load the model
net = NeuralNetwork()
net.load_state_dict(torch.load("Data/corlat_presolved/models/MLP_corlat_presolved_constraint_weighted_loss.pth"))

<All keys matched successfully>

In [54]:
# test number of feasible solutions
# test the model on the test set
net.eval()
net.to(device)

NeuralNetwork(
  (fc1): Linear(in_features=14602, out_features=1825, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=1825, out_features=912, bias=True)
  (fc3): Linear(in_features=912, out_features=456, bias=True)
  (fc4): Linear(in_features=456, out_features=100, bias=True)
  (sigmoid): Sigmoid()
  (dropout): Dropout(p=0.2, inplace=False)
)

# Testing feasibility and time needed for optimization

### Function for testing feasibility.



In [42]:
def feasibility_test(batch_size: int, y_pred: npt.NDArray, test_models: List, indices: List) -> List[int]:
    
    """
    Function to test the feasibility of the solution predicted by the neural network.
    For each instance in the test set, we will relax the binary variables to continuous variables with bounds of 0 and 1, and set the value of the binary variables to the value predicted by the neural network.
    We will then compute the IIS to find the list of violated constraints and variables.
    The number of violated constraints is the number of non zero elements in IISConstr.
    
    Args:
    batch_size (int): batch size
    y_pred (npt.NDArray): predictions of the neural network
    test_models (List): list of gurobi models for each instance in the test set
    indices (List): list of indices of binary variables
    

    Returns:
    n_violated_constraints: list of number of violated constraints for each instance in the test set    
    """
    
    n_violated_constraints = []

    # convert predictions of N_samples, N_variables to binary
    y_pred_binary = np.where(y_pred > 0.5, 1, 0)
    
    # Test the feasibility of the prediction for each test instance
    for i in range(len(test_models)):
        
        model = test_models[i]
        
        modelVars = model.getVars()
        
        instanceBinaryIndices = indices

        # relax the binary variables to continuous variables by changing their VType attribute with setAttr
        for j in range(len(instanceBinaryIndices)):
            modelVars[instanceBinaryIndices[j]].setAttr("VType", "C")

            # fix each binary variable to the value predicted by the neural network by setting LB = UB
            modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
            modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])
        
        
        # Compute the IIS to find the list of violated constraints and variables
        try:
            model.computeIIS()
        except gb.GurobiError:
            print("Model is feasible")
            n_violated_constraints.append(0)
            continue
            
        
        # get number of violated constraints
        IISConstr = model.getAttr("IISConstr", model.getConstrs())

        # count number of non zero elements in IISConstr        
        n_violated_constraints.append(np.count_nonzero(IISConstr))
        
    return n_violated_constraints

### Append the gurobi test instances into a list

In [56]:
test_models = []
gurobi_env = gb.Env()
gurobi_env.setParam("OutputFlag", 0)
model_files = pkl.load(open("Data/corlat_presolved/train_test_data/pickle_filenames.pkl", "rb"))
# rename from .pickle to .lp
filename = []
for i in range(len(model_files)):
    filename.append(model_files[i].replace(".pickle", ".lp"))
    
for i in range(len(test_indices)):
    model = gb.read("Data/corlat_presolved/instances/" + filename[test_indices[i]], env=gurobi_env)
    test_models.append(model)
    
    

Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-02


In [57]:
# ModelSense is 1 for minimization and -1 for maximization
print("Model objective sense: ", model.ModelSense)

Model objective sense:  -1


In [58]:
obj = model.getObjective()
print(model.getAttr("Sense"))

['<', '=', '=', '=', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<',

### Get number of violated constraints for each test instance

In [None]:
n_violated_constraints = []
for i, data in enumerate(valid_loader):
    inputs, labels = data
    
    inputs = inputs.to(device)
    # labels = labels.to(device)
    
    outputs = net(inputs)
    
    # get slices of test_models according to batch size
    len_test_models = len(test_models)
    test_models_batch = test_models[i*batch_size: min((i+1)*batch_size, len_test_models)]
    
    n_violated_constraints_batch = feasibility_test(batch_size_test, outputs.detach().cpu().numpy(), test_models_batch, binary_indices)
    
    n_violated_constraints.append(n_violated_constraints_batch)
    #

In [60]:
# flatten n_violated_constraints
n_violated_constraints = [item for sublist in n_violated_constraints for item in sublist]

In [61]:
print("Average number of violated constraints: ", np.mean(n_violated_constraints))
print("Length of n_violated_constraints: ", len(n_violated_constraints))
print(n_violated_constraints)

Average number of violated constraints:  2.2313624678663238
Length of n_violated_constraints:  389
[1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 0, 1, 1, 1, 74, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 38, 2, 2, 1, 1, 11, 1, 1, 1, 0, 1, 1, 1, 1, 1, 30, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 57, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 13, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 15, 1, 27, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 0, 1, 1, 2, 1, 1, 1, 16, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 19, 1, 28, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 21, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 38, 1, 1, 1, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 0, 1, 19, 1, 1, 1, 14, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 5, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 15,
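
The mean is strongly influenced by a few instances with many violated constraints (for example the value 74 in the list above), so the median and the share of feasible predictions are useful complements. The helper `summarize_violations` below is a hypothetical addition, demonstrated only on the first 16 values printed above rather than on the full list.

```python
from statistics import mean, median


def summarize_violations(counts):
    """Summary statistics for a list of per-instance violated-constraint counts."""
    return {
        "mean": mean(counts),
        "median": median(counts),
        "max": max(counts),
        # a count of 0 means the fixed prediction was feasible
        "feasible_fraction": sum(c == 0 for c in counts) / len(counts),
    }


# First 16 entries of the list printed above
first_16 = [1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 0, 1, 1, 1, 74]
summary = summarize_violations(first_16)
print(summary)
```

In the notebook the same function could be called on the flattened `n_violated_constraints` list.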

### Function to calculate the optimization time for each instance in the test set when using a $\color{lightblue}\text{warm start}$ to find the optimal solution.

In [62]:
def calculate_diving_opt_time(models: List, binary_indices: List, y_pred: npt.NDArray):
    
    """
    Function to calculate the optimization time for each instance in the test set if we use warm start to find the optimal solution.
    For each instance in the test set, we first relax the binary variables to continuous variables and fix them (LB = UB) to the value predicted by the neural network.
    We then compute the IIS to find the list of violated constraints and variables.
    All binary variables are then restored to binary type with bounds of 0 and 1. Variables that are not part of the IIS receive the predicted value as their starting value; variables in the IIS receive no starting value.
    We then optimize the model and record the optimization time.

    Args:
    models: list of gurobi models for each instance in the test set
    binary_indices: list of indices of binary variables
    y_pred: predictions of the neural network
    
    Returns:
    opt_time: list of optimization time for each instance in the test set
    """
    
    opt_time = []
    
    for i in range(len(models)):
        model = models[i]
        
        modelVars = model.getVars()
        
        instanceBinaryIndices = binary_indices
        
        y_pred_binary = np.where(y_pred > 0.5, 1, 0)
        
        # relax the binary variables to continuous variables by changing their VType attribute with setAttr
        for j in range(len(instanceBinaryIndices)):
            modelVars[instanceBinaryIndices[j]].setAttr("VType", "C")

            # fix each binary variable to the value predicted by the neural network by setting LB = UB
            modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
            modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])
        
        
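        # Compute the IIS to find the list of violated constraints and variables.
        # computeIIS raises a GurobiError when the model is feasible; in that case
        # we fall through to the feasible branch below instead of skipping the instance.
        try:
            model.computeIIS()
            infeasible_flag = True
        except gb.GurobiError:
            print("Model is feasible")
            infeasible_flag = False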
        
        if infeasible_flag:
            for j in range(len(instanceBinaryIndices)):
                # variable is not part of the IIS: restore it to binary and
                # use the predicted value as a warm-start hint
                if modelVars[instanceBinaryIndices[j]].IISLB == 0 and modelVars[instanceBinaryIndices[j]].IISUB == 0:
                    modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                    modelVars[instanceBinaryIndices[j]].setAttr("LB", 0)
                    modelVars[instanceBinaryIndices[j]].setAttr("UB", 1)
                    modelVars[instanceBinaryIndices[j]].setAttr("Start", y_pred_binary[i, j])
                # variable is part of the IIS: restore it to binary with
                # bounds of 0 and 1 and give no warm-start hint
                else:
                    modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                    modelVars[instanceBinaryIndices[j]].setAttr("LB", 0)
                    modelVars[instanceBinaryIndices[j]].setAttr("UB", 1)
        
        else:
            for j in range(len(instanceBinaryIndices)):
                modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                modelVars[instanceBinaryIndices[j]].setAttr("LB", 0)
                modelVars[instanceBinaryIndices[j]].setAttr("UB", 1)
                modelVars[instanceBinaryIndices[j]].setAttr("Start", y_pred_binary[i, j])
        
        model.Params.Threads = 1
        model.optimize()
        print("Optimization time for model ", i, ": ", model.Runtime)
        opt_time.append(model.Runtime)
        
    return opt_time



### Calculate the optimization time for each instance in the test set when using a $\color{lightblue}\text{warm start}$ to find the optimal solution

In [63]:
test_models = []
gurobi_env = gb.Env()
gurobi_env.setParam("OutputFlag", 0)

model_files = pkl.load(open("Data/corlat_presolved/train_test_data/pickle_filenames.pkl", "rb"))
# rename from .pickle to .lp
filename = []
for i in range(len(model_files)):
    filename.append(model_files[i].replace(".pickle", ".lp"))
    
for i in range(len(test_indices)):
    model = gb.read("Data/corlat_presolved/instances/" + filename[test_indices[i]], env=gurobi_env)
    test_models.append(model)
    
    
# loop through all test models and calculate average optimization time
opt_time = []
for i, data in enumerate(valid_loader):
    inputs, labels = data
    
    inputs = inputs.to(device)
    # labels = labels.to(device)
    
    outputs = net(inputs)
    
    # get slices of test_models according to batch size
    len_test_models = len(test_models)

    test_models_batch = test_models[i*batch_size: min((i+1)*batch_size, len_test_models)]
    
    opt_time_batch = calculate_diving_opt_time(test_models_batch, binary_indices, outputs.detach().cpu().numpy())
    
    opt_time.append(opt_time_batch)
    
# save opt_time
with open("Data/corlat_presolved/opt_time.pickle", "wb") as f:
    pkl.dump(opt_time, f)

Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-02
Optimization time for model  0 :  0.02565288543701172
Optimization time for model  1 :  0.8026258945465088
Optimization time for model  2 :  8.61807894706726
Optimization time for model  3 :  1.3193280696868896
Optimization time for model  4 :  1.424036979675293
Optimization time for model  5 :  0.07971501350402832
Optimization time for model  6 :  3.008183002471924
Optimization time for model  7 :  0.03709888458251953
Optimization time for model  8 :  0.021116018295288086
Optimization time for model  9 :  0.09508609771728516
Optimization time for model  10 :  0.6641838550567627
Optimization time for model  11 :  0.0036399364471435547
Optimization time for model  12 :  0.9146218299865723
Optimization time for model  13 :  0.19783592224121094
Optimization time for model  14 :  0.08869719505310059
Optimization time for model  15 :  0.1148231029510498
Optimization time for model  16 :  1.87258410453

In [64]:
# flatten opt_time
opt_time_flat = [item for sublist in opt_time for item in sublist]
print("Average optimization time: ", np.mean(opt_time_flat))

Average optimization time:  1.0126795436206617


### Calculate the optimization time for each instance in the test set when using $\color{lightblue}\text{equality constraints}$ to find the optimal solution

In [65]:
def calculate_equality_constraint_opt_time(models: list, binary_indices: list, y_pred: npt.NDArray):
    """
    Function to calculate the optimization time for each instance in the test set if we use equality constraint to find the optimal solution.
    For each instance in the test set, we first relax the binary variables to continuous variables and fix them (LB = UB) to the value predicted by the neural network.
    We then compute the IIS to find the list of violated constraints and variables.
    Binary variables that are not part of the IIS are restored to binary type and stay fixed to the predicted value (LB = UB = prediction), while variables in the IIS are restored to binary type with bounds of 0 and 1.
    We then optimize the model and record the optimization time.

    Args:
        models: list of gurobi models for each instance in the test set
        binary_indices: list of indices of binary variables
        y_pred: predictions of the neural network

    Returns:
        opt_time: list of optimization time for each instance in the test set
    """
    
    opt_time = []
    
    for i in range(len(models)):
        model = models[i]
        
        modelVars = model.getVars()
        
        instanceBinaryIndices = binary_indices
        
        y_pred_binary = np.where(y_pred > 0.5, 1, 0)
        
        # relax the binary variables to continuous variables by changing their VType attribute with setAttr
        for j in range(len(instanceBinaryIndices)):
            modelVars[instanceBinaryIndices[j]].setAttr("VType", "C")

            # fix each binary variable to the value predicted by the neural network by setting LB = UB
            modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
            modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])
        
        
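        # Compute the IIS to find the list of violated constraints and variables.
        # computeIIS raises a GurobiError when the model is feasible; in that case
        # we fall through to the feasible branch below instead of skipping the instance.
        try:
            model.computeIIS()
            infeasible_flag = True
        except gb.GurobiError:
            print("Model is feasible")
            infeasible_flag = False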
        
        if infeasible_flag:
            for j in range(len(instanceBinaryIndices)):
                if modelVars[instanceBinaryIndices[j]].IISLB == 0 and modelVars[instanceBinaryIndices[j]].IISUB == 0:
                    modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                    # for each index in binary_indices, set the value of the corresponding variable to the value predicted by model
                    modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
                    modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])                 
                    
                else:
                    # The variable is in the IIS: make it binary again
                    # and free its bounds to 0 and 1
                    modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                    modelVars[instanceBinaryIndices[j]].setAttr("LB", 0)
                    modelVars[instanceBinaryIndices[j]].setAttr("UB", 1)
        
        else:
            for j in range(len(instanceBinaryIndices)):
                modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
                modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])

        
        model.Params.Threads = 1
        model.optimize()
        print("Optimization time for model ", i, ": ", model.Runtime)
        opt_time.append(model.Runtime)
        
    return opt_time
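
The fix-then-free rule used in the function above can be illustrated without a solver. The following is a minimal sketch, not the notebook's implementation: `bounds_after_iis` is a hypothetical helper, and the `in_iis` flags are a stand-in for the `IISLB`/`IISUB` attributes that Gurobi sets after `computeIIS()`. It only shows which bounds each binary variable would end up with.

```python
def bounds_after_iis(probabilities, in_iis, threshold=0.5):
    """Return a (lb, ub) pair for each binary variable after the IIS check.

    probabilities: predicted probabilities, one per binary variable
    in_iis: hypothetical flags, True when the variable's fixed bound is in the IIS
    """
    bounds = []
    for p, flagged in zip(probabilities, in_iis):
        # Same rounding rule as np.where(y_pred > 0.5, 1, 0)
        value = 1 if p > threshold else 0
        if flagged:
            # Freed: the solver decides this variable
            bounds.append((0, 1))
        else:
            # Kept fixed to the predicted value
            bounds.append((value, value))
    return bounds


# Hypothetical example: only the third variable is flagged as part of the IIS
example_bounds = bounds_after_iis([0.9, 0.2, 0.7, 0.4], [False, False, True, False])
print(example_bounds)
```

The fewer variables are flagged, the more of the prediction is kept and the smaller the remaining search space for the solver.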



# Equality Constraint test optimization time

In [66]:
test_models = []
gurobi_env = gb.Env()
gurobi_env.setParam("OutputFlag", 0)

with open("Data/corlat_presolved/train_test_data/pickle_filenames.pkl", "rb") as f:
    model_files = pkl.load(f)
# rename from .pickle to .lp
filename = [name.replace(".pickle", ".lp") for name in model_files]
    
for i in range(len(test_indices)):
    model = gb.read("Data/corlat_presolved/instances/" + filename[test_indices[i]], env=gurobi_env)
    test_models.append(model)
    
# loop through the validation batches and record the optimization time of each test model
opt_time = []
for i, data in enumerate(valid_loader):
    inputs, labels = data
    
    inputs = inputs.to(device)
    # labels = labels.to(device)
    
    outputs = net(inputs)
    
    # get slices of test_models according to batch size
    len_test_models = len(test_models)

    test_models_batch = test_models[i*batch_size: min((i+1)*batch_size, len_test_models)]
    
    opt_time_batch = calculate_equality_constraint_opt_time(test_models_batch, binary_indices, outputs.detach().cpu().numpy())
    
    opt_time.append(opt_time_batch)
    
# save opt_time
with open("Data/corlat_presolved/opt_time_equality_constraint.pickle", "wb") as f:
    pkl.dump(opt_time, f)

Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-02
Optimization time for model  0 :  0.00022721290588378906
Optimization time for model  1 :  0.0001480579376220703
Optimization time for model  2 :  0.0001850128173828125
Optimization time for model  3 :  2.482743978500366
Optimization time for model  4 :  0.0002071857452392578
Optimization time for model  5 :  0.00013899803161621094
Optimization time for model  6 :  0.00017905235290527344
Optimization time for model  7 :  0.0001270771026611328
Optimization time for model  8 :  0.0001277923583984375
Optimization time for model  9 :  0.0001220703125
Optimization time for model  10 :  0.00016999244689941406
Optimization time for model  11 :  0.00016689300537109375
Optimization time for model  12 :  1.0004680156707764
Optimization time for model  13 :  0.00015020370483398438
Optimization time for model  14 :  0.27437400817871094
Optimization time for model  15 :  0.00019598007202148438
Optimization ti
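
The slicing of `test_models` in the cell above pairs batch `i` of the loader with the models at positions `i*batch_size` up to `(i+1)*batch_size`. This only lines up if the loader is not shuffled and yields one row per test instance, which is an assumption here rather than something the cell checks. A minimal sketch of the slicing, with the hypothetical helper `batch_slice` and placeholder model names:

```python
def batch_slice(items, batch_index, batch_size):
    """Return the items that belong to batch number batch_index."""
    start = batch_index * batch_size
    # The last batch may be shorter than batch_size
    stop = min(start + batch_size, len(items))
    return items[start:stop]


# Placeholder names standing in for the Gurobi models
model_names = [f"model_{k}" for k in range(10)]
batches = [batch_slice(model_names, b, 4) for b in range(3)]
print([len(b) for b in batches])
```

Concatenating the batches in order gives back the original list, so each model is paired with exactly one prediction row under the stated assumption.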

In [67]:
# flatten opt_time
opt_time_flat = [item for sublist in opt_time for item in sublist]
print("Average optimization time: ", np.mean(opt_time_flat))

Average optimization time:  0.2586452220189665
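
The printed times are heavily skewed: most instances solve in well under a millisecond, while a few take a second or more, so the mean alone hides the typical case. A minimal sketch of a fuller summary, using only the first 16 times printed above (rounded to 6 decimals); the helper `summarize` is hypothetical and not part of the notebook:

```python
import statistics

# First 16 optimization times printed above, rounded to 6 decimals (seconds)
times = [
    0.000227, 0.000148, 0.000185, 2.482744,
    0.000207, 0.000139, 0.000179, 0.000127,
    0.000128, 0.000122, 0.000170, 0.000167,
    1.000468, 0.000150, 0.274374, 0.000196,
]


def summarize(values):
    """Return count, mean, median and maximum of a list of times."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "max": max(values),
    }


summary = summarize(times)
print(summary)
```

On the full run, the same summary could be applied to `opt_time_flat`; reporting the median next to the mean makes it easier to see whether a handful of slow instances dominate the average.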
