# Test: define a time set (t0, t1, t2, ...) and use a single machine learning model to predict at every time step

-----------
How to do it:

- To predict over multiple time intervals with the same model, define the time periods as a set (a pandas index used with gurobipy-pandas), build a dataframe indexed by that time set, and pass the whole dataframe to the model, as in model.predict(dataframe).
- As the problem was defined, the models have primary variables, secondary variables, a target, and observed variables. All of them are decision variables in the optimization, but the observed variables take fixed values, and those values stay the same across the whole optimization horizon.

## Root folder and read env variables

In [1]:
import os
# fix root path to save outputs
actual_path = os.path.abspath(os.getcwd())
list_root_path = actual_path.split('\\')[:-2]
root_path = '\\'.join(list_root_path)
os.chdir(root_path)
print('root path: ', root_path)

root path:  D:\github-mi-repo\Gurobi-ML-tips-modeling
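
The cell above splits the path on `\\`, which assumes Windows-style paths. A minimal sketch of a separator-independent alternative using `pathlib` follows. The helper name `project_root` and the example paths are hypothetical; the two-levels-up layout is taken from the `[:-2]` slice in the cell above.

```python
from pathlib import PurePosixPath, PureWindowsPath


def project_root(current_dir, levels_up=2):
    """Return the directory `levels_up` levels above `current_dir`."""
    # parents[0] is one level up, parents[1] two levels up, and so on
    return current_dir.parents[levels_up - 1]


# hypothetical notebook locations, one per path flavour
posix_root = project_root(PurePosixPath("/repo/notebooks/gurobi"))
windows_root = project_root(PureWindowsPath(r"D:\repo\notebooks\gurobi"))

print(posix_root)
print(windows_root)
```

In the notebook itself the call would be `project_root(Path.cwd())`, with the result passed to `os.chdir`.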


In [2]:
import os
from dotenv import load_dotenv, find_dotenv # package used in jupyter notebooks to read the variables in the .env file

# load the environment variables defined in .env
load_dotenv(find_dotenv())

# read the environment variables and store them as Python variables
PROJECT_GCP = os.environ.get("PROJECT_GCP", "")

## Load the model that predicts output Y2 of process B: Y2 = f(Z1, X2, O5, O6)

In [3]:
import pickle
import pandas as pd
import numpy as np

#gurobi
import gurobipy_pandas as gppd
from gurobi_ml import add_predictor_constr
import gurobipy as gp

### 0. Load data
This data will be used to obtain the values needed to build an instance of the ML model.

In [4]:
name_process = 'process_b_y2'  # model that predicts output Y2 of process B: Y2 = f(Z1, X2, O5, O6)

# load X_test
path_X_test = f'artifacts/data_training/{name_process}/X_test.pkl'
X_test = pd.read_pickle(path_X_test)

# load y_test
path_y_test = f'artifacts/data_training/{name_process}/y_test.pkl'
y_test = pd.read_pickle(path_y_test)

### 1. Load Artifacts to connect ML to gurobi

#### 1.1 pkl model

In [5]:
path_model_to_test = f'artifacts/models/{name_process}/lr.pkl'
model_ml_to_test = pd.read_pickle(path_model_to_test)
model_ml_to_test

#### 1.2 Define the list of features and target for each model

In [6]:
X_test

| | Z1 | X2 | O5 | O6 |
|---|---|---|---|---|
| 521 | 125.827767 | 1.600197 | 4.554699 | 0.088144 |
| 737 | 86.789512 | 0.578117 | 7.962496 | 3.903218 |
| 740 | 91.836144 | 6.023679 | 0.332634 | 2.712363 |
| 660 | 85.987786 | 3.098621 | 3.222811 | 0.905589 |
| 411 | 85.749421 | 5.509697 | 3.307710 | 4.956510 |
| ... | ... | ... | ... | ... |
| 408 | 110.079863 | 6.547218 | 9.923618 | 8.871421 |
| 332 | 110.343326 | 3.887887 | 5.537505 | 4.707576 |
| 208 | 94.335957 | 7.043643 | 3.328544 | 0.851833 |
| 613 | 118.819866 | 2.913836 | 4.735953 | 0.224784 |


In [7]:
######################## model  ########################

list_features = ['Z1', 'X2', 'O5', 'O6']

list_features_controlables = ['Z1', 'X2']

list_target = ['Y2']

#### 1.3 Read the master tag table and sort the features according to its order

In [8]:
# read the master tag table
path_list_features_target_to_optimization = f'config/config_ml_models_development/MasterTable_{name_process}.xlsx'
maestro_tags = pd.read_excel(path_list_features_target_to_optimization)

# sort the feature lists according to the order in the master table
list_features = [tag for tag in maestro_tags['TAG'].tolist() if tag in list_features]
list_features_controlables = [tag for tag in maestro_tags['TAG'].tolist() if tag in list_features_controlables]
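
The list comprehensions above keep only the tags that appear in the master table, so a feature missing from it would be dropped without any warning. Below is a small sketch of the same ordering step with an explicit check. The tag order in `master_tags` is a hypothetical stand-in for `maestro_tags['TAG'].tolist()`, and `sort_by_master` is an illustrative helper that is not part of the notebook.

```python
def sort_by_master(features, master_tags):
    """Order `features` as they appear in `master_tags`, failing loudly on unknown tags."""
    missing = [tag for tag in features if tag not in master_tags]
    if missing:
        raise ValueError(f"tags missing from master table: {missing}")
    return [tag for tag in master_tags if tag in features]


# hypothetical master-table order (assumption, the real order lives in the Excel file)
master_tags = ["Z1", "X2", "O5", "O6", "Y2"]

ordered_features = sort_by_master(["O6", "X2", "Z1", "O5"], master_tags)
ordered_controllables = sort_by_master(["X2", "Z1"], master_tags)

print(ordered_features)
print(ordered_controllables)
```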

### 2. Create Gurobi model

In [9]:
# create model
m = gp.Model('modelo')

Restricted license - for non-production use only - expires 2025-11-24


### 3. Create decision variables
- Decision variables that are features of the ML model
- Decision variable that is the output of the ML model

In [10]:
# define set
list_set_time = ['t0', 't1', 't2', 't3', 't4', 't5', 't6']
index_set_time = pd.Index(list_set_time)
index_set_time

Index(['t0', 't1', 't2', 't3', 't4', 't5', 't6'], dtype='object')

In [11]:
# create decision variables - features of the ML model
var_Z1 = gppd.add_vars(m, index_set_time, name="decision variable Z1")

var_X2 = gppd.add_vars(m, index_set_time, name="decision variable X2")

In [12]:
# create decision variables - output of the ML model
var_Y2 = gppd.add_vars(m, index_set_time, name="decision variable Y2")

In [13]:
# update the model so the newly created variables are registered
m.update()

In [14]:
# see decision var created
var_Y2

t0    <gurobi.Var decision variable Y2[t0]>
t1    <gurobi.Var decision variable Y2[t1]>
t2    <gurobi.Var decision variable Y2[t2]>
t3    <gurobi.Var decision variable Y2[t3]>
t4    <gurobi.Var decision variable Y2[t4]>
t5    <gurobi.Var decision variable Y2[t5]>
t6    <gurobi.Var decision variable Y2[t6]>
Name: decision variable Y2, dtype: object

### 4. Create an instance of the machine learning model using Gurobi decision variables
The observed variables have fixed values, so these values do not change across time.

In [15]:
######################## generate instance of non-controllable features for the model ########################

# list of non-controllable features
list_features_no_vc = list(set(list_features) - set(list_features_controlables))

# generate the input values; in this example, the mean value of each feature
#df_input_values = X_test[list_features_no_vc].mean().to_frame().T
df_input_values = np.array(X_test[list_features_no_vc].mean().to_frame().T).tolist()

# generate the dataframe instance_no_controlables, repeating the values over the time set
instance_no_controlables = pd.DataFrame(df_input_values, index = index_set_time, columns = list_features_no_vc)
instance_no_controlables

| | O6 | O5 |
|---|---|---|
| t0 | 4.53373 | 5.216518 |
| t1 | 4.53373 | 5.216518 |
| t2 | 4.53373 | 5.216518 |
| t3 | 4.53373 | 5.216518 |
| t4 | 4.53373 | 5.216518 |
| t5 | 4.53373 | 5.216518 |
| t6 | 4.53373 | 5.216518 |
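
The cell above passes a single row of values together with a seven-entry index and relies on pandas repeating that row over the index. A sketch of a more explicit alternative follows, assuming a small hypothetical stand-in for `X_test` (`X_demo`) and a shorter time index. It also derives the fixed features with a list comprehension, so the column order does not depend on set ordering.

```python
import numpy as np
import pandas as pd

# hypothetical stand-in for X_test (assumption, not the notebook's data)
X_demo = pd.DataFrame(
    {"Z1": [100.0, 90.0], "X2": [2.0, 4.0], "O5": [4.0, 6.0], "O6": [1.0, 3.0]}
)
list_features = ["Z1", "X2", "O5", "O6"]
list_features_controlables = ["Z1", "X2"]
index_set_time = pd.Index(["t0", "t1", "t2"])

# keep the feature order instead of going through a set
fixed_features = [f for f in list_features if f not in list_features_controlables]

# one mean value per fixed feature
mean_values = X_demo[fixed_features].mean()

# repeat the row of means once per time step, explicitly
fixed_block = pd.DataFrame(
    np.tile(mean_values.to_numpy(), (len(index_set_time), 1)),
    index=index_set_time,
    columns=fixed_features,
)
print(fixed_block)
```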


In [16]:
######################## generate instance - non-controllable features + decision vars ########################

# create the instance with the controllable variables, sorted according to the list of features.
# IT IS VERY IMPORTANT THAT THE DECISION VARIABLES ARE ORDERED ACCORDING TO THE LIST OF FEATURES
instance_controlables = pd.DataFrame([var_Z1, var_X2]).T # add decision variables
instance_controlables.columns = list_features_controlables # rename columns

# concatenate the controllable and non-controllable features
instance = pd.concat([instance_no_controlables, instance_controlables], axis = 1)
instance = instance[list_features] # sort features

In [17]:
instance

| | Z1 | X2 | O5 | O6 |
|---|---|---|---|---|
| t0 | `<gurobi.Var decision variable Z1[t0]>` | `<gurobi.Var decision variable X2[t0]>` | 5.216518 | 4.53373 |
| t1 | `<gurobi.Var decision variable Z1[t1]>` | `<gurobi.Var decision variable X2[t1]>` | 5.216518 | 4.53373 |
| t2 | `<gurobi.Var decision variable Z1[t2]>` | `<gurobi.Var decision variable X2[t2]>` | 5.216518 | 4.53373 |
| t3 | `<gurobi.Var decision variable Z1[t3]>` | `<gurobi.Var decision variable X2[t3]>` | 5.216518 | 4.53373 |
| t4 | `<gurobi.Var decision variable Z1[t4]>` | `<gurobi.Var decision variable X2[t4]>` | 5.216518 | 4.53373 |
| t5 | `<gurobi.Var decision variable Z1[t5]>` | `<gurobi.Var decision variable X2[t5]>` | 5.216518 | 4.53373 |
| t6 | `<gurobi.Var decision variable Z1[t6]>` | `<gurobi.Var decision variable X2[t6]>` | 5.216518 | 4.53373 |
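
The comment in the cell above stresses that the columns must follow the feature order. A quick check before adding the predictor constraint can catch ordering mistakes early. The sketch below uses plain strings as stand-ins for the `gurobi.Var` objects so it runs without Gurobi; the fixed values and the shorter time index are illustrative assumptions.

```python
import pandas as pd

list_features = ["Z1", "X2", "O5", "O6"]
index_set_time = pd.Index(["t0", "t1", "t2"])

# strings stand in for the gurobi.Var objects (assumption for illustration)
controllable_block = pd.DataFrame(
    {
        "X2": [f"X2[{t}]" for t in index_set_time],
        "Z1": [f"Z1[{t}]" for t in index_set_time],
    },
    index=index_set_time,
)

# illustrative fixed values, repeated for every time step
fixed_block = pd.DataFrame({"O6": 4.5, "O5": 5.2}, index=index_set_time)

# concatenate, then reorder the columns to the feature order
instance_demo = pd.concat([fixed_block, controllable_block], axis=1)[list_features]

# the check itself: same columns in the same order, one row per time step
assert list(instance_demo.columns) == list_features
assert instance_demo.index.equals(index_set_time)
print(instance_demo)
```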


In [18]:
###### add the ML model as a constraint ######
pred_constr = add_predictor_constr(
    gp_model=m,
    predictor=model_ml_to_test,
    input_vars=instance,  # dataframe mixing Gurobi variables and fixed values
    output_vars=var_Y2,   # target
    name='model_predict',
)
pred_constr.print_stats()

Model for model_predict:
147 variables
35 constraints
105 quadratic constraints
Input has shape (7, 4)
Output has shape (7, 1)

Pipeline has 3 steps:

--------------------------------------------------------------------------------
Step            Output Shape    Variables              Constraints              
                                                Linear    Quadratic      General
std_scaler            (7, 4)           42           28            0            0

poly_feat            (7, 15)          105            0          105            0

lin_reg               (7, 1)            0            7            0            0

--------------------------------------------------------------------------------
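
The counts in the table can be reproduced by hand. The sketch below assumes that the `poly_feat` step is a degree-2 polynomial expansion that includes a bias column; this is inferred from its 15 output columns and is not stated in the notebook.

```python
from math import comb

n_time_steps = 7   # rows of the input, from "Input has shape (7, 4)"
n_features = 4     # Z1, X2, O5, O6
degree = 2         # assumption, inferred from the 15 output columns

# number of monomials of total degree <= 2 in 4 variables (bias included)
poly_columns = comb(n_features + degree, degree)

# one variable and one quadratic constraint per polynomial term and time step
poly_variables = n_time_steps * poly_columns

# one linear constraint per scaled feature and per prediction, for each time step
scaler_constraints = n_time_steps * n_features
regression_constraints = n_time_steps * 1
linear_constraints = scaler_constraints + regression_constraints

print(poly_columns, poly_variables, linear_constraints)
```

Every time step gets its own copy of the pipeline constraints, so these sizes grow linearly with the length of the time set.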


#### Note: this step shows whether the ML model can be connected to Gurobi

### 5. Define the optimization objective
An objective that does not cause infeasibility.

In [19]:
var_Y2.sum() # sum across time

<gurobi.LinExpr: decision variable Y2[t0] + decision variable Y2[t1] + decision variable Y2[t2] + decision variable Y2[t3] + decision variable Y2[t4] + decision variable Y2[t5] + decision variable Y2[t6]>

In [20]:
m.setObjective(var_Y2.sum(),
               gp.GRB.MINIMIZE)
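
Written out, the model built in steps 3 to 5 is the following, where $T = \{t_0, \dots, t_6\}$ is the time set, $f$ is the loaded ML pipeline, and $\bar{O}_5$, $\bar{O}_6$ denote the mean values used for the observed variables (this notation is introduced here only for illustration). The nonnegativity bounds come from Gurobi's default lower bound of 0, since no bounds are passed when the variables are created.

$$
\begin{aligned}
\min \quad & \sum_{t \in T} \mathrm{Y2}_t \\
\text{s.t.} \quad & \mathrm{Y2}_t = f(\mathrm{Z1}_t,\ \mathrm{X2}_t,\ \bar{O}_5,\ \bar{O}_6) && \forall t \in T \\
& \mathrm{Z1}_t,\ \mathrm{X2}_t,\ \mathrm{Y2}_t \ge 0 && \forall t \in T
\end{aligned}
$$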

### 6. Optimize and get optimal values

In [21]:
# solve
m.optimize()

Gurobi Optimizer version 11.0.0 build v11.0.0rc2 (win64 - Windows 10.0 (19043.2))

CPU model: Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz, instruction set [SSE2|AVX|AVX2]
Thread count: 6 physical cores, 12 logical processors, using up to 12 threads

Optimize a model with 35 rows, 168 columns and 161 nonzeros
Model fingerprint: 0x88ec3882
Model has 105 quadratic constraints
Coefficient statistics:
  Matrix range     [4e-03, 2e+01]
  QMatrix range    [1e+00, 1e+00]
  QLMatrix range   [1e+00, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [5e+00, 5e+00]
  RHS range        [5e+00, 1e+02]
  QRHS range       [1e+00, 1e+00]
Presolve removed 28 rows and 49 columns

Continuous model is non-convex -- solving as a MIP

Presolve removed 34 rows and 163 columns
Presolve time: 0.15s
Presolved: 11 rows, 6 columns, 26 nonzeros
Presolved model has 3 bilinear constraint(s)
         in product terms.
         Presolve was not able to compute smaller bounds for these variables.
         Conside

In [22]:
# check the model status: 2 means an optimal solution was found
# docs: https://www.gurobi.com/documentation/current/refman/optimization_status_codes.html#sec:StatusCodes
m.Status

2
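
A bare `2` is easy to misread. Below is a small sketch that maps a few of Gurobi's documented status codes to their names. The dictionary is deliberately partial, and `describe_status` is a hypothetical helper that is not part of gurobipy.

```python
# partial mapping of Gurobi optimization status codes to their names
STATUS_NAMES = {
    2: "OPTIMAL",
    3: "INFEASIBLE",
    4: "INF_OR_UNBD",
    5: "UNBOUNDED",
    9: "TIME_LIMIT",
}


def describe_status(code):
    """Return a readable name for a status code, or a generic label if unmapped."""
    return STATUS_NAMES.get(code, f"OTHER({code})")


status_text = describe_status(2)
print(status_text)
```

In the notebook the call would be `describe_status(m.Status)`; gurobipy also exposes the same codes as named constants such as `gp.GRB.OPTIMAL`.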

In [23]:
# get the optimal values and save them in a dataframe
######## create a dataframe with the time set as index
solution = pd.DataFrame(index = index_set_time)

######################## save optimal values - features of the model (only the features) ########################

# model
solution["var_Z1"] = var_Z1.gppd.X
solution["var_X2"] = var_X2.gppd.X


######################## save optimal values - targets of the model (some targets are features of the next model in the chain) ########################
solution["var_Y2"] = var_Y2.gppd.X  # model


######################## get the objective function value ########################
opt_objetive_function = m.ObjVal

In [24]:
# show the objective function value
opt_objetive_function

0.0

In [25]:
# show the values of the decision variables
solution

| | var_Z1 | var_X2 | var_Y2 |
|---|---|---|---|
| t0 | 4.156942 | 20.466896 | 0.0 |
| t1 | 4.156942 | 20.466896 | 0.0 |
| t2 | 4.156942 | 20.466896 | 0.0 |
| t3 | 4.156942 | 20.466896 | 0.0 |
| t4 | 4.156942 | 20.466896 | 0.0 |
| t5 | 4.156942 | 20.466896 | 0.0 |
| t6 | 0.0 | 20.294378 | 0.0 |
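
The optimal Z1 and X2 values above fall outside the range of the `X_test` rows shown earlier, and the solver log notes that presolve could not compute smaller bounds for some variables. One possible remedy, sketched here under assumptions, is to derive bounds for the controllable features from the data and pass them through the `lb` and `ub` arguments of `gppd.add_vars`. `X_demo` is a hypothetical stand-in for `X_test`, and whether data-driven bounds are appropriate depends on the process being modeled.

```python
import pandas as pd

# hypothetical stand-in for X_test, restricted to the controllable features
X_demo = pd.DataFrame({"Z1": [125.8, 86.8, 91.8], "X2": [1.6, 0.6, 6.0]})
list_features_controlables = ["Z1", "X2"]

# one row of minima and one row of maxima per controllable feature
bounds = X_demo[list_features_controlables].agg(["min", "max"])

lower_Z1, upper_Z1 = bounds.loc["min", "Z1"], bounds.loc["max", "Z1"]
lower_X2, upper_X2 = bounds.loc["min", "X2"], bounds.loc["max", "X2"]
print(bounds)

# In the notebook these would be passed when the variables are created, e.g.
# var_Z1 = gppd.add_vars(m, index_set_time, name="decision variable Z1",
#                        lb=lower_Z1, ub=upper_Z1)
```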
