# Integrate a linear regression in a Gurobi model

We take the model from the Janos example:

$$
\begin{aligned}
&\max \sum_i y_i \\
&\text{subject to:}\\
&\sum_i x_i \le 100,\\
&y_i = g(x_i, \psi),\\
&0 \le x_i \le 2.5.
\end{aligned}
$$

Here $\psi$ is a vector of fixed features and $g$ is an affine function obtained from a scikit-learn linear regression.

Note that, unlike Janos, we scale the feature corresponding to $x$ before fitting the linear regression.
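
To make the role of the scaling explicit, the regression can be written out as below. This is a sketch that assumes each feature is standardized as $(v - \mu)/\sigma$, which is the default behaviour of scikit-learn's `StandardScaler`. The symbols $w_0$, $w_\psi$, $w_x$, $\mu_x$, $\sigma_x$, $\tilde{x}_i$ and $\tilde{\psi}$ are notation introduced here for illustration, with $\tilde{\psi}$ denoting the standardized fixed features of student $i$.

$$
\begin{aligned}
\tilde{x}_i &= \frac{x_i - \mu_x}{\sigma_x}, \\
y_i = g(x_i, \psi) &= w_0 + w_\psi^\top \tilde{\psi} + w_x \tilde{x}_i .
\end{aligned}
$$

Since $\tilde{x}_i$ is affine in $x_i$, $g$ remains affine in the decision variable, so the whole problem stays a linear program.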

In [1]:
import gurobipy as gp
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

In [2]:
# Split the features into those that are fixed and those that are
# decision variables in the optimization problem

fixed_features = ['SAT', 'GPA']
opt_features = ['scholarship']
features = fixed_features + opt_features

### Do our linear regression

Note that we scale the fixed features and the features that will be controlled
by variables in the optimization problem separately, with one scaler for each group.
This makes the formulation simpler afterwards.
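
As a small illustration of why splitting the scalers is harmless, the sketch below standardizes a toy matrix both ways. It uses plain NumPy in place of `StandardScaler` (assuming the default column-wise mean and population standard deviation), and the data values and the helper `standardize` are hypothetical.

```python
import numpy as np

# Hypothetical toy data: columns are [SAT, GPA, scholarship]
data = np.array([
    [1200.0, 3.1, 0.0],
    [1350.0, 3.6, 1.5],
    [1100.0, 2.8, 2.5],
    [1500.0, 3.9, 1.0],
])


def standardize(block):
    """Column-wise (v - mean) / std, using the population std (ddof=0)."""
    mean = block.mean(axis=0)
    scale = block.std(axis=0)
    return (block - mean) / scale, mean, scale


# Scale the fixed columns and the optimization column with separate statistics
fixed_scaled, fixed_mean, fixed_scale = standardize(data[:, :2])
opt_scaled, opt_mean, opt_scale = standardize(data[:, 2:])

# Scale everything in one go for comparison
joint_scaled, _, _ = standardize(data)

separate_scaled = np.hstack([fixed_scaled, opt_scaled])
same = np.allclose(separate_scaled, joint_scaled)
print(same)
```

Because standardization works column by column, both routes give the same matrix. The benefit of the split is only that the mean and scale of the optimization feature are available on their own when building the model.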

In [3]:
historical_data = pd.read_csv(
    'data/college_student_enroll-s1-1.csv', index_col=0)
# Cast to float (this also copies) so the scaled values can be written back cleanly
X = historical_data.loc[:, features].astype(float)
Y = historical_data.loc[:, 'enroll']
scale_fixfeat = StandardScaler()
scale_optfeat = StandardScaler()
X.loc[:, fixed_features] = scale_fixfeat.fit_transform(
    historical_data.loc[:, fixed_features])
X.loc[:, opt_features] = scale_optfeat.fit_transform(
    historical_data.loc[:, opt_features])

regression = LinearRegression()
regression.fit(X=X, y=Y)

LinearRegression()

In [4]:
# Get the column indices of the fixed features and of the optimization features
fixed_idx = X.columns.get_indexer(fixed_features)
opt_idx = X.columns.get_indexer(opt_features)

### Now start with the optimization model

- Read in our data
- Add the x and y variables, the objective, and the budget constraint

In [5]:
studentsdata = pd.read_csv('data/admissions500.csv', index_col=0)
assert (studentsdata.columns.get_indexer(fixed_features) == fixed_idx).all()

nstudents = studentsdata.shape[0]

m = gp.Model()

x = m.addMVar(nstudents, lb=0, ub=2.5, name=[
              'x[{}]'.format(n) for n in studentsdata.index])
y = m.addMVar(nstudents, lb=-gp.GRB.INFINITY,
              name=['y[{}]'.format(n) for n in studentsdata.index])

m.setObjective(y.sum(), gp.GRB.MAXIMIZE)
m.addConstr(x.sum() <= 100)

Using license file /Users/bonami/gurobi.lic


<(1,) matrix constraint>

### Add the constraint corresponding to the linear regression

- Retrieve the coefficients of the linear regression
- Add the scaled x variables
- Scale the remaining (fixed) features
- Add the constraint defining y
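
The scaling step for the x variables can be sketched numerically as follows. The `mean` and `scale` values are hypothetical placeholders for `scale_optfeat.mean_` and `scale_optfeat.scale_`, not the statistics of the actual data set.

```python
import numpy as np

# Hypothetical statistics of the 'scholarship' column (not taken from the data)
mean = 1.2
scale = 0.8

# Bounds on the original variable x
lb, ub = 0.0, 2.5

# Bounds on the scaled variable: apply the same transform as the scaler
lb_scaled = (lb - mean) / scale
ub_scaled = (ub - mean) / scale

# Any scaled value within its bounds maps back into [lb, ub] through
# x = mean + scale * x_scaled, which is the linking constraint rearranged
x_scaled = np.linspace(lb_scaled, ub_scaled, 5)
x = mean + scale * x_scaled

print(lb_scaled, ub_scaled)
print(x)
```

The linking constraint `x - xscale * scale == mean` is this same relation written with all variables on the left-hand side.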

In [6]:
# coefficients of the regression
w = regression.coef_
w0 = regression.intercept_

# Add scaled variables corresponding to x
bounds = scale_optfeat.transform(np.array([[0], [2.5]]))[:, -1]
xscale = m.addMVar(nstudents, lb=bounds[0], ub=bounds[1], name=[
                   'x_scale[{}]'.format(n) for n in studentsdata.index])
m.addConstr(x - xscale*scale_optfeat.scale_ == scale_optfeat.mean_)

# Scale remainder of features
A = scale_fixfeat.transform(studentsdata.loc[:, fixed_features].to_numpy())

# Constraints defining y in terms of the linear regression
m.addConstr(y == A@w[fixed_idx] + w[opt_idx]*xscale + w0)

<(500,) matrix constraint *awaiting model update*>
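
As a sanity check on the constraint defining y, the sketch below confirms that splitting the weight vector by `fixed_idx` and `opt_idx` reproduces a direct prediction. It is self-contained and uses synthetic data, with `np.linalg.lstsq` standing in for scikit-learn's `LinearRegression`. All numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, already-scaled training data: columns [SAT, GPA, scholarship]
X_train = rng.normal(size=(50, 3))
y_train = 0.3 + X_train @ np.array([0.2, 0.1, 0.4]) + 0.01 * rng.normal(size=50)

# Ordinary least squares with an intercept column (stand-in for LinearRegression)
design = np.hstack([np.ones((50, 1)), X_train])
coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
w0, w = coef[0], coef[1:]

fixed_idx = np.array([0, 1])
opt_idx = np.array([2])

# New students: scaled fixed features A and a candidate value for xscale
A = rng.normal(size=(5, 2))
xscale = rng.normal(size=5)

# Right-hand side of the constraint defining y
y_constraint = A @ w[fixed_idx] + w[opt_idx] * xscale + w0

# Direct prediction on the full feature matrix
X_new = np.column_stack([A, xscale])
y_direct = X_new @ w + w0

print(np.allclose(y_constraint, y_direct))
```

Note that `w[opt_idx]` has shape `(1,)`, so multiplying it by a length-`n` vector broadcasts to one term per student.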

### Finally optimize it

In [7]:
m.optimize()

Gurobi Optimizer version 9.1.2 build v9.1.2rc0 (mac64)
Thread count: 4 physical cores, 8 logical processors, using up to 8 threads
Optimize a model with 1001 rows, 1500 columns and 2500 nonzeros
Model fingerprint: 0x5c187418
Coefficient statistics:
  Matrix range     [1e-01, 1e+00]
  Objective range  [1e+00, 1e+00]
  Bounds range     [1e+00, 2e+00]
  RHS range        [1e-03, 1e+02]
Presolve removed 1001 rows and 1500 columns
Presolve time: 0.01s
Presolve: All rows and columns removed
Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    2.0494642e+02   0.000000e+00   0.000000e+00      0s

Solved in 0 iterations and 0.01 seconds
Optimal objective  2.049464200e+02
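
One way to see why this model needs no simplex iterations: every student shares the same regression coefficient for the scaled x feature, so the objective depends on x only through the total $\sum_i x_i$. The sketch below illustrates this with hypothetical numbers (`w_x`, `mean`, `scale` and `base` are made up, not the fitted values). It is not a description of what Gurobi's presolve does internally.

```python
import numpy as np

# Hypothetical values, not the fitted ones from the notebook
w_x = 0.5                  # coefficient of the scaled scholarship feature
mean, scale = 1.2, 0.8     # scaler statistics for 'scholarship'
budget, ub = 100.0, 2.5
rng = np.random.default_rng(1)
base = rng.uniform(0.0, 0.5, size=500)  # stands in for A @ w[fixed_idx] + w0

n = base.size
slope = w_x / scale        # change in y_i per unit of x_i


def objective(x):
    """Sum of y_i for a given allocation x."""
    return np.sum(base + w_x * (x - mean) / scale)


# Closed form: spend the whole budget if the slope is positive, nothing otherwise
total = min(budget, n * ub) if slope > 0 else 0.0
closed_form = base.sum() - n * w_x * mean / scale + slope * total

# Two different feasible allocations that both use the full budget
x_even = np.full(n, budget / n)
x_front = np.zeros(n)
x_front[:40] = ub          # 40 * 2.5 = 100

print(np.isclose(objective(x_even), closed_form))
print(np.isclose(objective(x_front), closed_form))
```

Under these assumptions any allocation that exhausts the budget is optimal, so the optimal x reported by the solver is only one of many.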


Copyright © 2020 Gurobi Optimization, LLC