# Neural Network training with CORLAT dataset, using a feasibility promoting weighted BCE loss

This notebook explores the training of a simple Multi Layer Perceptron (MLP) neural network for the CORLAT dataset.

The MLP outputs assignments of binary variables for the CORLAT dataset. 

The idea behind the custom loss is to provide higher weights for assignments that results in a better objective value (depending on minimization or maximization). The training of MLP in this experiment differs from the majority of neural network training paradigms. The important thing to note here is that:

$$\color{lightblue}\text{For each sample, we have multiple sets of assignments}$$

For example:
Sample 1, 100 solutions (each solution is a set of binary assignments).

We train on every feasible solution gathered (up to `n_sols` specified during data collection using the `corlat.py` script).

The idea is to establish the conditional probability distribution $$p(Y_{i} | X_{i}) \quad \text{for} \quad i=0, 1, 2, \dots, n $$

for feasible assignments. $n$ is the number samples, and $i$ represents the $i$-th sample. i.e., $p(y^{i}_{j} = 1 | X^{i})$ is the probability of assigning a $1$ to binary variable $j$, of sample $i$, such that the assignment is feasible. 

Hence, it becomes clear now that the weights for each set of assignments is to encourage assignments with better objective values.

The notebook ends with feasibility test for:
1. Number of violated constraints.
2. Optimization time for data-driven optimization using warm-start assignments.
3. Optimization time for data-driven optimization using equality constraint assignments.


In [None]:
import torch
import random
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torch.utils.data import TensorDataset
from torch.optim import AdamW
from xgboost import XGBClassifier

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
import pickle as pkl
import scipy
import os
import gc

from torch.nn import Linear, ReLU, Dropout
from torch.nn.functional import relu
from sklearn.model_selection import train_test_split

from sklearn.metrics import f1_score
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import confusion_matrix

import gurobipy as gb
import time

from operator import itemgetter

In [2]:
!nvidia-smi -L

GPU 0: NVIDIA A100 80GB PCIe (UUID: GPU-4045b1e6-3428-f9e1-5643-862c4834363d)
GPU 1: NVIDIA A100 80GB PCIe (UUID: GPU-35ac16d5-81e8-f772-b9cb-a681af1fd2b5)
  MIG 2g.20gb     Device  0: (UUID: MIG-0447e452-22d5-5021-ab0b-6cc63b39ac83)
GPU 2: NVIDIA A100 80GB PCIe (UUID: GPU-d949dd0a-b88e-ee87-9621-3a824f914f82)
GPU 3: NVIDIA A100 80GB PCIe (UUID: GPU-8c13f3ad-24ee-bb68-5eff-7f73091e682a)


In [3]:
!nvidia-smi

Fri Jun  9 16:23:52 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.43.04    Driver Version: 515.43.04    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA A100 80G...  On   | 00000000:17:00.0 Off |                   On |
| N/A   57C    P0   134W / 300W |  19347MiB / 81920MiB |     N/A      Default |
|                               |                      |              Enabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100 80G...  On   | 00000000:65:00.0 Off |                   On |
| N/A   42C    P0    46W / 300W |     24MiB / 81920MiB |     N/A      Default |
|       

In [4]:
torch.cuda.empty_cache()
gc.collect()

0

In [5]:
# set CUDA to MIG-30c35cbb-1b1b-56b5-a681-575ef4494c6d
# set CUDA_VISIBLE_DEVICES=0
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

In [6]:
os.environ['CUBLAS_WORKSPACE_CONFIG'] = ":4096:8"
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

cuda:0


In [7]:
def set_seeds(seed):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.benchmark = (
        False  # Force cuDNN to use a consistent convolution algorithm
    )
    torch.backends.cudnn.deterministic = (
        True  # Force cuDNN to use deterministic algorithms if available
    )
    torch.use_deterministic_algorithms(
        True
    )  # Force torch to use deterministic algorithms if available


In [10]:
try:
    corlat_dataset = pkl.load(open("Data/corlat/processed_data/corlat_preprocessed.pickle", "rb"))
except:
    # move dir to /ibm/gpfs/home/yjin0055/Project/DayAheadForecast
    os.chdir("/ibm/gpfs/home/yjin0055/Project/DayAheadForecast")
    corlat_dataset = pkl.load(open("Data/corlat/processed_data/corlat_preprocessed.pickle", "rb"))

In [11]:
num_nodes = corlat_dataset[0]["var_node_features"].shape[0]
n_var_node_features = corlat_dataset[0]["var_node_features"].shape[1]
max_constraint_size = corlat_dataset[0]["constraint_node_features"].shape[0]
n_constraint_node_features = corlat_dataset[0]["constraint_node_features"].shape[1]

In [12]:
# for all check the number of nodes and features
for i in range(len(corlat_dataset)):
    assert num_nodes == corlat_dataset[i]["var_node_features"].shape[0]
    assert n_var_node_features == corlat_dataset[i]["var_node_features"].shape[1]
    # assert max_constraint_size == corlat_dataset[i]["constraint_node_features"].shape[0]
    assert n_constraint_node_features == corlat_dataset[i]["constraint_node_features"].shape[1]

### Load the dataset

If cannot load, we `cd` to the respective project directory.

In [13]:
try:
    corlat_presolved_dataset = pkl.load(open("Data/corlat_presolved/processed_data/corlat_presolved_preprocessed.pickle", "rb"))
except:
    # move dir to /ibm/gpfs/home/yjin0055/Project/DayAheadForecast
    os.chdir("/ibm/gpfs/home/yjin0055/Project/DayAheadForecast")
    corlat_presolved_dataset = pkl.load(open("Data/corlat_presolved/processed_data/corlat_presolved_preprocessed.pickle", "rb"))

### Get the binary variable indices

In [16]:
# binary indices for each sample is the same
binary_indices = corlat_dataset[0]["indices"]["indices"]

In [17]:
# read X_train, X_test, y_train, y_test from Data/corlat/ using numpy.load
X_train = np.load("Data/corlat/train_test_data/X_train.npy")
X_test = np.load("Data/corlat/train_test_data/X_test.npy")
y_train = np.load("Data/corlat/train_test_data/y_train.npy", allow_pickle=True)
y_test = np.load("Data/corlat/train_test_data/y_test.npy", allow_pickle=True)

### Convert to binary and ensure there are no -0.0 values

In [18]:
# for each instance in y_train and y_test, convert it to binary
for i in range(y_train.shape[0]):
    # make all values positive using abs
    # y_train[i] is a tensor of shape (arbritary shape), num_vars
    y_train[i] = np.abs(y_train[i])
    
    # use numpy where to convert values > 0.5 to 1, and values <= 0.5 to 0
    y_train[i] = np.where(y_train[i] > 0.5, 1.0, 0.0)
    
for i in range(y_test.shape[0]):
    # make all values positive using abs
    # y_train[i] is a tensor of shape (arbritary shape), num_vars
    y_test[i] = np.abs(y_test[i])
    
    # use numpy where to convert values > 0.5 to 1, and values <= 0.5 to 0
    y_test[i] = np.where(y_test[i] > 0.5, 1.0, 0.0)

In [19]:
y_train[0]

array([[1., 1., 1., ..., 1., 0., 1.],
       [1., 1., 1., ..., 1., 0., 1.],
       [1., 1., 1., ..., 1., 0., 1.],
       ...,
       [1., 1., 1., ..., 1., 0., 1.],
       [1., 1., 1., ..., 1., 0., 1.],
       [1., 1., 1., ..., 1., 0., 1.]])

### Load the indices for training set and testing set

In [20]:
# train and test indices
train_indices = np.load("Data/corlat/train_test_data/train_idx.npy")
test_indices = np.load("Data/corlat/train_test_data/test_idx.npy")

In [21]:
n_features = X_train.shape[1]
out_channels = y_train[0].shape[1]

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

In [22]:
print("n_features: ", n_features)
print("out_channels: ", out_channels)

n_features:  15847
out_channels:  100


### Load the weights for feasibility promoting weighted BCE loss loss

In [23]:
weights = np.load("Data/corlat/train_test_data/train_weights.npy", allow_pickle=True)

### Define neural network

In [27]:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(n_features, n_features//8)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(n_features//8, n_features//16)
        self.fc3 = nn.Linear(n_features//16, n_features//32)
        self.fc4 = nn.Linear(n_features//32, out_channels)
        self.sigmoid = nn.Sigmoid()
        
        # add regularization
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc3(x)
        x = self.relu(x)
        x = self.dropout(x)
        x = self.fc4(x)
        x = self.sigmoid(x)
        
        return x

In [28]:
config = {
        'train_val_split': [0.80, 0.20], # These must sum to 1.0
        'batch_size' : 32, # Num samples to average over for gradient updates
        'EPOCHS' : 1000, # Num times to iterate over the entire dataset
        'LEARNING_RATE' : 5e-4, # Learning rate for the optimizer
        'BETA1' : 0.9, # Beta1 parameter for the Adam optimizer
        'BETA2' : 0.999, # Beta2 parameter for the Adam optimizer
        'WEIGHT_DECAY' : 1e-4, # Weight decay parameter for the Adam optimizer
    }

### Define a custom class for our dataset

The custom class is needed as our `y` data is an `object` type tensor. This is because `y` is an array of `n` samples, where each `y[i]` is of `n_sols` x 100, where 100 is 100 binary outputs. `n_sols` can have a maximum of 100, due to our setting of collecting 100 possibl solutions. However `n_sols` varies from sample to sample, as for some sample there might not be 100 sols.  

In [29]:
class multipleTargetCORLATDataset(TensorDataset):
    def __init__(self, X, y, weights=None, test=False):
        super(multipleTargetCORLATDataset, self).__init__()
        self.X = X
        self.y = y
        self.weights = weights
        self.test = test
        
    def __getitem__(self, index):
        X = self.X[index]
        y = self.y[index]
    
        if self.weights is None and self.test:
            return torch.tensor(X, dtype=torch.float32), torch.tensor(y, dtype=torch.float32)
        
        
        
        weights = self.weights[index]
        
        
        X_tensor = torch.tensor(X, dtype=torch.float32)
        y_tensor = torch.tensor(y, dtype=torch.float32)
        weights_tensor = torch.tensor(weights, dtype=torch.float32)

        return X_tensor, y_tensor, weights_tensor
    
    def __len__(self):
        return len(self.X)
    

def collate_fn(data):
    # data is a list of tuples (X, Y, weights)
    
    X = torch.stack([item[0] for item in data])
    Y = [item[1] for item in data]
    
    
    # only X and Y, no weights
    if len(data[0]) == 2:
        return X, Y    
    
    weights = [item[2] for item in data]

    
    return X, Y, weights
    

### Create train and test dataset

In [30]:
train_dataset = multipleTargetCORLATDataset(X_train, y_train, weights=weights)

In [31]:
test_dataset = multipleTargetCORLATDataset(X_test, y_test, test=True)

### Initialize neural network, train loader, valid, loader, and optimizer.

We used a one cycle learning rate in this case, where the learning rate warms up and peaks at `max_lr`. Then decreases to a small learning rate.

In [48]:
net = NeuralNetwork()
# net = torch.compile(net)

batch_size = 32

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, collate_fn=collate_fn)

batch_size_test = 32
valid_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=True, collate_fn=collate_fn)

params = list(net.parameters())

optimizer = optim.Adam(net.parameters(), lr=0.0001)
total_steps = len(train_loader)

scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=config['LEARNING_RATE'], steps_per_epoch=total_steps, epochs=config['EPOCHS'])

### Definition for feasibility promoting weighted BCELoss
The custom loss calculates the BCEloss between the predicted binary targets and each (possible) output binary solutions (keep in mind that we have a maximum of 100 total solutions). 

For each sample, we take the mean across the losses of the 100 binary variables. Then for each sample, we multiply the calculated loss with the respective weight. We then sum all the losses of `n_sols_i`. 

For each batch, we then take the mean loss of the batch.

In [49]:
# custom loss for neural network
def feasibility_promoting_weighted_BCELoss(y_pred: torch.tensor, y_true: torch.tensor, weights: torch.tensor, device: torch.device):
    
    batch_loss = []
    
    loss_fn = nn.BCELoss(reduction='none')
        
    # sum over all targets
    for i in range(len(y_true)):
        loss = torch.mean(loss_fn(y_pred[i].expand(len(y_true[i]), -1), y_true[i].to(device)), dim=1)
        loss = torch.mul(loss, weights[i].to(device))
        batch_loss.append(torch.sum(loss))
    
    # sum over all samples
    batch_loss = torch.mean(torch.stack(batch_loss))
    
    return batch_loss

In [50]:
net = net.to(device)

In [51]:
set_seeds(42)

### The main training loop

If the loss is smaller than the minimum loss, then save the model. 

In [52]:
loss_list = []

for epoch in range(config["EPOCHS"]):
    running_loss = 0.0
    curr_lr = optimizer.param_groups[0]['lr']
    for i, data in enumerate(train_loader):
        inputs, labels, weights = data
        
        inputs = inputs.to(device)        
        optimizer.zero_grad()
        outputs = net(inputs)
        
        loss = feasibility_promoting_weighted_BCELoss(outputs, labels, weights, device=device)
        loss.backward()
        optimizer.step()
        scheduler.step()
        running_loss += loss.item()
    print('Epoch %d loss: %.3f lr: %.6f' % (epoch + 1, running_loss / len(train_loader), curr_lr))
    
    if len(loss_list) > 0:
        print("min loss: ", min(loss_list))
        if (running_loss / len(train_loader)) < min(loss_list):
            torch.save(net.state_dict(), "Data/corlat/models/MLP_corlat_feasibility_promoting_weighted_BCELoss.pth")
            print("Model saved")
    
    loss_list.append(running_loss / len(train_loader))
    
    # if training loss is lower than previous loss, save the model


Epoch 1 loss: 0.691 lr: 0.000020
Epoch 2 loss: 0.649 lr: 0.000020
min loss:  0.6905858302116394
Model saved
Epoch 3 loss: 0.623 lr: 0.000020
min loss:  0.648573921918869
Model saved
Epoch 4 loss: 0.590 lr: 0.000020
min loss:  0.6225843667984009
Model saved
Epoch 5 loss: 0.562 lr: 0.000020
min loss:  0.5899189734458923
Model saved
Epoch 6 loss: 0.542 lr: 0.000020
min loss:  0.5617670226097107
Model saved
Epoch 7 loss: 0.533 lr: 0.000020
min loss:  0.5422833728790283
Model saved
Epoch 8 loss: 0.527 lr: 0.000021
min loss:  0.533007031083107
Model saved
Epoch 9 loss: 0.526 lr: 0.000021
min loss:  0.5274606078863144
Model saved
Epoch 10 loss: 0.524 lr: 0.000021
min loss:  0.5264981603622436
Model saved
Epoch 11 loss: 0.522 lr: 0.000021
min loss:  0.5243312108516693
Model saved
Epoch 12 loss: 0.521 lr: 0.000022
min loss:  0.5223673379421234
Model saved
Epoch 13 loss: 0.521 lr: 0.000022
min loss:  0.5214317381381989
Model saved
Epoch 14 loss: 0.521 lr: 0.000022
min loss:  0.5209495574235916
M

In [53]:
# load the model
net = NeuralNetwork()
net.load_state_dict(torch.load("Data/corlat/models/MLP_corlat_feasibility_promoting_weighted_BCELoss.pth"))

<All keys matched successfully>

In [54]:
# test number of feasible solutions
# test the model on the test set
net.eval()
net.to(device)

NeuralNetwork(
  (fc1): Linear(in_features=15847, out_features=1980, bias=True)
  (relu): ReLU()
  (fc2): Linear(in_features=1980, out_features=990, bias=True)
  (fc3): Linear(in_features=990, out_features=495, bias=True)
  (fc4): Linear(in_features=495, out_features=100, bias=True)
  (sigmoid): Sigmoid()
  (dropout): Dropout(p=0.2, inplace=False)
)

# Testing feasibility and time needed for optimization

### Function for testing feasibility.



In [55]:
def feasibility_test(batch_size, y_pred, test_models, indices):
    """
    Function to test the feasibility of the solution predicted by the neural network.
    For each instance in the test set, we will relax the binary variables to continuous variables with bounds of 0 and 1, and set the value of the binary variables to the value predicted by the neural network.
    We will then compute the IIS to find the list of violated constraints and variables.
    The number of violated constraints is the number of non zero elements in IISConstr.
    
    Args:
    batch_size (int): batch size
    y_pred (npt.NDArray): predictions of the neural network
    test_models (List): list of gurobi models for each instance in the test set
    indices (List): list of indices of binary variables
    

    Returns:
    n_violated_constraints (List): list of number of violated constraints for each instance in the test set    
    """
    
    n_violated_constraints = []

    # convert predictions of N_samples, N_variables to binary
    y_pred_binary = np.where(y_pred > 0.5, 1, 0)
    
    # Compute the weights for each training instance
    for i in range(len(test_models)):
        
        model = test_models[i]
        
        modelVars = model.getVars()
        
        instanceBinaryIndices = indices

        # need to relax the binary variables to continuous variables with bounds of 0 and 1, we can use the setAttr method to change their vtype attribute
        for j in range(len(instanceBinaryIndices)):
            modelVars[instanceBinaryIndices[j]].setAttr("VType", "C")

            # for each index in firstInstanceTestBinaryIndices, set the value of the corresponding variable to the value predicted by xgboost
            modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
            modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])
        
        
        # Compute the IIS to find the list of violated constraints and variables
        try:
            model.computeIIS()
        except gb.GurobiError:
            print("Model is feasible")
            n_violated_constraints.append(0)
            continue
            
        
        # get number of violated constraints
        IISConstr = model.getAttr("IISConstr", model.getConstrs())

        # count number of non zero elements in IISConstr        
        n_violated_constraints.append(np.count_nonzero(IISConstr))
        
    return n_violated_constraints

### Append the gurobi test instances into a list

In [58]:
test_models = []
gurobi_env = gb.Env()
gurobi_env.setParam("OutputFlag", 0)
# model_files = os.listdir("instances/mip/data/COR-LAT")
model_files = pkl.load(open("Data/corlat/train_test_data/pickle_filenames.pkl", "rb"))
# rename from .pickle to .lp
filename = []
for i in range(len(model_files)):
    filename.append(model_files[i].replace(".pickle", ".lp"))

for i in range(len(test_indices)):
    model = gb.read("Data/corlat/instances/" + filename[test_indices[i]], env=gurobi_env)
    test_models.append(model)
    

Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-02


In [59]:
model.ModelSense
# if -1, minimize, if 1, maximize
print("Model objective sense: ", model.ModelSense)

Model objective sense:  -1


In [60]:
obj = model.getObjective()
print(model.getAttr("Sense"))

['<', '=', '=', '=', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<', '<',

### Get number of violated constraints for each test instance

In [61]:
n_violated_constraints = []
for i, data in enumerate(valid_loader):
    inputs, labels = data
    
    inputs = inputs.to(device)
    # labels = labels.to(device)
    
    outputs = net(inputs)
    
    # get slices of test_models according to batch size
    len_test_models = len(test_models)
    print(i)
    test_models_batch = test_models[i*batch_size: min((i+1)*batch_size, len_test_models)]
    
    n_violated_constraints_batch = feasibility_test(batch_size_test, outputs.detach().cpu().numpy(), test_models_batch, binary_indices)
    
    n_violated_constraints.append(n_violated_constraints_batch)
    #

0
1
Model is feasible
2
Model is feasible
3
Model is feasible
Model is feasible
Model is feasible
4
Model is feasible
5
6
Model is feasible
Model is feasible
7
Model is feasible
8
9
10
Model is feasible
11
Model is feasible
12


In [62]:
# flatten n_violated_constraints
n_violated_constraints = [item for sublist in n_violated_constraints for item in sublist]

In [63]:
print("Average number of violated constraints: ", np.mean(n_violated_constraints))
print("Length of n_violated_constraints: ", len(n_violated_constraints))
print(n_violated_constraints)

Average number of violated constraints:  2.3575
Length of n_violated_constraints:  400
[24, 28, 1, 1, 1, 1, 1, 1, 27, 2, 2, 1, 1, 27, 36, 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 0, 2, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 27, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 0, 1, 1, 1, 77, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 2, 14, 2, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 11, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 1, 2, 1, 0, 1, 1, 1, 2, 1, 2, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 47, 2, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 0, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 14, 21, 2, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 16, 1, 1, 2, 1, 1, 1,

### Test the average optimization time without any warm start or equality constrain set.

The naive optimization routine.

In [64]:
test_models = []
gurobi_env = gb.Env()
gurobi_env.setParam("OutputFlag", 0)

model_files = pkl.load(open("Data/corlat/train_test_data/pickle_filenames.pkl", "rb"))
# rename from .pickle to .lp
filename = []
for i in range(len(model_files)):
    filename.append(model_files[i].replace(".pickle", ".lp"))
    
for i in range(len(test_indices)):
    model = gb.read("Data/corlat/instances/" + filename[test_indices[i]], env=gurobi_env)
    test_models.append(model)

# loop through all test models and calculate average optimization time
opt_time = []
for i in range(len(test_models)):
    model = test_models[i]
    model.Params.Threads = 1
    model.optimize()
    print("Optimization time for model ", i, ": ", model.Runtime)
    opt_time.append(model.Runtime)

print("Average optimization time: ", np.mean(opt_time))

Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-02
Optimization time for model  0 :  0.20464301109313965
Optimization time for model  1 :  2.814563035964966
Optimization time for model  2 :  2.0997819900512695
Optimization time for model  3 :  2.0235610008239746
Optimization time for model  4 :  1.327219009399414
Optimization time for model  5 :  0.9989018440246582
Optimization time for model  6 :  1.5220320224761963
Optimization time for model  7 :  0.9181039333343506
Optimization time for model  8 :  0.00653386116027832
Optimization time for model  9 :  2.900223970413208
Optimization time for model  10 :  0.19867801666259766
Optimization time for model  11 :  0.06766915321350098
Optimization time for model  12 :  0.0037429332733154297
Optimization time for model  13 :  0.22120189666748047
Optimization time for model  14 :  0.0034461021423339844
Optimization time for model  15 :  0.49202394485473633
Optimization time for model  16 :  0.198295831

### Function to calculate the optimization time for each instance in the test set if we use $\color{lightblue}\text{warm start}$ to find optimal solution.

In [65]:
def calculate_diving_opt_time(models, binary_indices, y_pred):
    
    """
    Function to calculate the optimization time for each instance in the test set if we use warm start to find the optimal solution.
    For each instance in the test set, we will relax the binary variables to continuous variables with bounds of 0 and 1, and set the value of the binary variables to the value predicted by the neural network.
    We will then compute the IIS to find the list of violated constraints and variables.
    If the model is infeasible, we will set the bounds of the binary variables to 0 and 1, and set the starting value of the binary variables to the value predicted by the neural network.
    We will then optimize the model and record the optimization time.

    Args:
    models: list of gurobi models for each instance in the test set
    binary_indices: list of indices of binary variables
    y_pred: predictions of the neural network
    
    Returns:
    opt_time: list of optimization time for each instance in the test set
    """
    
    opt_time = []
    
    for i in range(len(models)):
        model = models[i]
        
        modelVars = model.getVars()
        
        instanceBinaryIndices = binary_indices
        
        y_pred_binary = np.where(y_pred > 0.5, 1, 0)
        
        # need to relax the binary variables to continuous variables with bounds of 0 and 1, we can use the setAttr method to change their vtype attribute
        for j in range(len(instanceBinaryIndices)):
            modelVars[instanceBinaryIndices[j]].setAttr("VType", "C")

            # for each index in firstInstanceTestBinaryIndices, set the value of the corresponding variable to the value predicted by xgboost
            modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
            modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])
        
        
        # Compute the IIS to find the list of violated constraints and variables
        try:
            model.computeIIS()
            infeasible_flag = True
        except gb.GurobiError:
            print("Model is feasible")
            infeasible_flag = False
            continue
        
        if infeasible_flag:
            for j in range(len(instanceBinaryIndices)):
                if modelVars[instanceBinaryIndices[j]].IISLB == 0 and modelVars[instanceBinaryIndices[j]].IISUB == 0:
                    modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                    # for each index in binary_indices, set the value of the corresponding variable to the value predicted by model
                    # modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
                    # modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])                 
                    modelVars[instanceBinaryIndices[j]].setAttr("LB", 0)
                    modelVars[instanceBinaryIndices[j]].setAttr("UB", 1)
                    modelVars[instanceBinaryIndices[j]].setAttr("Start", y_pred_binary[i, j])
                    
                    # else if the variable is in the IIS, 
                    # get the relaxed variable and 
                    # set the bounds to 0 and 1 for the relaxed binary variables
                else:
                    modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                    modelVars[instanceBinaryIndices[j]].setAttr("LB", 0)
                    modelVars[instanceBinaryIndices[j]].setAttr("UB", 1)
        
        else:
            for j in range(len(instanceBinaryIndices)):
                modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                # modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
                # modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])
                modelVars[instanceBinaryIndices[j]].setAttr("LB", 0)
                modelVars[instanceBinaryIndices[j]].setAttr("UB", 1)
                modelVars[instanceBinaryIndices[j]].setAttr("Start", y_pred_binary[i, j])
        
        model.Params.Threads = 1
        model.optimize()
        print("Optimization time for model ", i, ": ", model.Runtime)
        opt_time.append(model.Runtime)
        
    return opt_time



In [66]:
test_models = []
gurobi_env = gb.Env()
gurobi_env.setParam("OutputFlag", 0)

model_files = pkl.load(open("Data/corlat/train_test_data/pickle_filenames.pkl", "rb"))
# rename from .pickle to .lp
filename = []
for i in range(len(model_files)):
    filename.append(model_files[i].replace(".pickle", ".lp"))
    
for i in range(len(test_indices)):
    model = gb.read("Data/corlat/instances/" + filename[test_indices[i]], env=gurobi_env)
    test_models.append(model)
    
# loop through all test models and calculate average optimization time
opt_time = []
for i, data in enumerate(valid_loader):
    inputs, labels = data
    
    inputs = inputs.to(device)
    # labels = labels.to(device)
    
    outputs = net(inputs)
    
    # get slices of test_models according to batch size
    len_test_models = len(test_models)

    test_models_batch = test_models[i*batch_size: min((i+1)*batch_size, len_test_models)]
    
    opt_time_batch = calculate_diving_opt_time(test_models_batch, binary_indices, outputs.detach().cpu().numpy())
    
    opt_time.append(opt_time_batch)
    
# save opt_time
with open("Data/corlat/opt_time.pickle", "wb") as f:
    pkl.dump(opt_time, f)

Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-02
Optimization time for model  0 :  0.21126794815063477
Optimization time for model  1 :  2.7527899742126465
Optimization time for model  2 :  1.5212550163269043
Optimization time for model  3 :  1.369001865386963
Optimization time for model  4 :  0.42718982696533203
Optimization time for model  5 :  0.2759220600128174
Optimization time for model  6 :  0.396716833114624
Optimization time for model  7 :  0.9202249050140381
Optimization time for model  8 :  0.0068509578704833984
Optimization time for model  9 :  1.1727099418640137
Optimization time for model  10 :  0.48308396339416504
Optimization time for model  11 :  0.01904892921447754
Optimization time for model  12 :  0.003921985626220703
Optimization time for model  13 :  0.221466064453125
Optimization time for model  14 :  0.003690004348754883
Optimization time for model  15 :  0.7606759071350098
Optimization time for model  16 :  0.1981639862

In [67]:
# flatten opt_time
opt_time_flat = [item for sublist in opt_time for item in sublist]
print("Average optimization time: ", np.mean(opt_time_flat))

Average optimization time:  1.2182969812782345


### Calculate the optimization time for each instance in the test set if we use $\color{lightblue}\text{equality constraint}$ to find optimal solution

In [68]:
def calculate_equality_constraint_opt_time(models, binary_indices, y_pred):
    """
    Function to calculate the optimization time for each instance in the test set if we use equality constraint to find the optimal solution.
    For each instance in the test set, we will relax the binary variables to continuous variables with bounds of 0 and 1, and set the value of the binary variables to the value predicted by the neural network.
    We will then compute the IIS to find the list of violated constraints and variables.
    If the model is infeasible, we will set the bounds of the binary variables to 0 and 1, and set the starting value of the binary variables to the value predicted by the neural network.
    We will then optimize the model and record the optimization time.

    Args:
        models: list of gurobi models for each instance in the test set
        binary_indices: list of indices of binary variables
        y_pred: predictions of the neural network

    Returns:
        opt_time: list of optimization time for each instance in the test set
    """
    
    opt_time = []
    
    for i in range(len(models)):
        model = models[i]
        
        modelVars = model.getVars()
        
        instanceBinaryIndices = binary_indices
        
        y_pred_binary = np.where(y_pred > 0.5, 1, 0)
        
        # need to relax the binary variables to continuous variables with bounds of 0 and 1, we can use the setAttr method to change their vtype attribute
        for j in range(len(instanceBinaryIndices)):
            modelVars[instanceBinaryIndices[j]].setAttr("VType", "C")

            # for each index in firstInstanceTestBinaryIndices, set the value of the corresponding variable to the value predicted by xgboost
            modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
            modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])
        
        
        # Compute the IIS to find the list of violated constraints and variables
        try:
            model.computeIIS()
            infeasible_flag = True
        except gb.GurobiError:
            print("Model is feasible")
            infeasible_flag = False
            continue
        
        if infeasible_flag:
            for j in range(len(instanceBinaryIndices)):
                if modelVars[instanceBinaryIndices[j]].IISLB == 0 and modelVars[instanceBinaryIndices[j]].IISUB == 0:
                    modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                    # for each index in binary_indices, set the value of the corresponding variable to the value predicted by model
                    modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
                    modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])                 
                    
                    # else if the variable is in the IIS, 
                    # get the relaxed variable and 
                    # set the bounds to 0 and 1 for the relaxed binary variables
                else:
                    modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                    modelVars[instanceBinaryIndices[j]].setAttr("LB", 0)
                    modelVars[instanceBinaryIndices[j]].setAttr("UB", 1)
        
        else:
            for j in range(len(instanceBinaryIndices)):
                modelVars[instanceBinaryIndices[j]].setAttr("VType", "B")
                modelVars[instanceBinaryIndices[j]].setAttr("LB", y_pred_binary[i, j])
                modelVars[instanceBinaryIndices[j]].setAttr("UB", y_pred_binary[i, j])
        
        model.Params.Threads = 1
        model.optimize()
        print("Optimization time for model ", i, ": ", model.Runtime)
        opt_time.append(model.Runtime)
        
    return opt_time



# Equality Constraint test optimization time

In [69]:
test_models = []
gurobi_env = gb.Env()
gurobi_env.setParam("OutputFlag", 0)

model_files = pkl.load(open("Data/corlat/train_test_data/pickle_filenames.pkl", "rb"))
# rename from .pickle to .lp
filename = []
for i in range(len(model_files)):
    filename.append(model_files[i].replace(".pickle", ".lp"))
    
for i in range(len(test_indices)):
    model = gb.read("Data/corlat/instances/" + filename[test_indices[i]], env=gurobi_env)
    test_models.append(model)
    
# loop through all test models and calculate average optimization time
opt_time = []
for i, data in enumerate(valid_loader):
    inputs, labels = data
    
    inputs = inputs.to(device)
    # labels = labels.to(device)
    
    outputs = net(inputs)
    
    # get slices of test_models according to batch size
    len_test_models = len(test_models)

    test_models_batch = test_models[i*batch_size: min((i+1)*batch_size, len_test_models)]
    
    opt_time_batch = calculate_equality_constraint_opt_time(test_models_batch, binary_indices, outputs.detach().cpu().numpy())
    
    opt_time.append(opt_time_batch)
    
# save opt_time
with open("Data/corlat/opt_time_equality_constraint.pickle", "wb") as f:
    pkl.dump(opt_time, f)

Set parameter Username
Academic license - for non-commercial use only - expires 2024-06-02
Optimization time for model  0 :  0.0006139278411865234
Optimization time for model  1 :  0.0001938343048095703
Optimization time for model  2 :  0.0001881122589111328
Optimization time for model  3 :  0.00017189979553222656
Optimization time for model  4 :  0.0001499652862548828
Optimization time for model  5 :  0.00014209747314453125
Optimization time for model  6 :  0.00011801719665527344
Optimization time for model  7 :  0.00017309188842773438
Optimization time for model  8 :  0.006464958190917969
Optimization time for model  9 :  0.0001800060272216797
Optimization time for model  10 :  0.25037503242492676
Optimization time for model  11 :  0.03884005546569824
Model is feasible
Optimization time for model  13 :  0.0001780986785888672
Model is feasible
Optimization time for model  15 :  0.00022411346435546875
Optimization time for model  16 :  0.00014090538024902344
Optimization time for model

In [70]:
# flatten opt_time
opt_time_flat = [item for sublist in opt_time for item in sublist]
print("Average optimization time: ", np.mean(opt_time_flat))

Average optimization time:  0.44385798549165534
