This exercise shows how linear programming can be used to estimate the coefficients and the intercept of a regression line by minimizing the sum of absolute deviations.

For comparison, we also fit the same model with scikit-learn, whose LinearRegression uses ordinary least squares, and compare the resulting coefficients with those obtained from the linear program.

We use the PuLP package to formulate and solve the linear program.

A fuller description of the problem is given in the readme file.


In [7]:
!pip install pulp
import pandas as pd
import numpy as np
from sklearn import linear_model



Loading Data - We use a small stock market dataset and assume that Stock_Index_Price = f(Interest_Rate, Unemployment_Rate).
The model is linear in both predictors:
Stock_Index_Price = b1 * Interest_Rate + b2 * Unemployment_Rate + b0

In [8]:
Stock_Market = {'Year': [2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016],
                'Month': [12, 11,10,9,8,7,6,5,4,3,2,1,12,11,10,9,8,7,6,5,4,3,2,1],
                'Interest_Rate': [2.75,2.5,2.5,2.5,2.5,2.5,2.5,2.25,2.25,2.25,2,2,2,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75],
                'Unemployment_Rate': [5.3,5.3,5.3,5.3,5.4,5.6,5.5,5.5,5.5,5.6,5.7,5.9,6,5.9,5.8,6.1,6.2,6.1,6.1,6.1,5.9,6.2,6.2,6.1],
                'Stock_Index_Price': [1464,1394,1357,1293,1256,1254,1234,1195,1159,1167,1130,1075,1047,965,943,958,971,949,884,866,876,822,704,719]        
                }
df = pd.DataFrame(Stock_Market,columns=['Year','Month','Interest_Rate','Unemployment_Rate','Stock_Index_Price']) 
X = df[['Interest_Rate','Unemployment_Rate']] 
Y = df['Stock_Index_Price']
X = X.to_numpy()
Y = Y.to_numpy()
df

Unnamed: 0,Year,Month,Interest_Rate,Unemployment_Rate,Stock_Index_Price
0,2017,12,2.75,5.3,1464
1,2017,11,2.5,5.3,1394
2,2017,10,2.5,5.3,1357
3,2017,9,2.5,5.3,1293
4,2017,8,2.5,5.4,1256
5,2017,7,2.5,5.6,1254
6,2017,6,2.5,5.5,1234
7,2017,5,2.25,5.5,1195
8,2017,4,2.25,5.5,1159
9,2017,3,2.25,5.6,1167


### Regression coefficients using scikit-learn:

In [9]:
from sklearn.linear_model import LinearRegression
lr = LinearRegression().fit(X,Y)

In [27]:
print("Value obtained using Scikit Learn \n\nb1:{} \nb2:{} \nb0:{}".format(lr.coef_[0],lr.coef_[1],lr.intercept_))

Value obtained using Scikit Learn 

b1:345.54008701056574 
b2:-250.14657136938058 
b0:1798.4039776258546


### Using linear optimization to minimize the sum of absolute deviations:
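
For reference, the least-absolute-deviations fit can be written as the linear program sketched here, which is the formulation the PuLP cells in this section build. The symbols $y_i$, $x_{i1}$ and $x_{i2}$ (stock index price, interest rate and unemployment rate of observation $i$) and $n = 24$ are notation introduced for this sketch, not names from the notebook; the code indexes observations from 0 rather than 1.

```latex
\begin{aligned}
\min_{b_0,\, b_1,\, b_2,\, Z}\quad & \sum_{i=1}^{n} Z_i \\
\text{s.t.}\quad & Z_i \ge y_i - b_0 - b_1 x_{i1} - b_2 x_{i2}, && i = 1, \dots, n, \\
& Z_i \ge -\left(y_i - b_0 - b_1 x_{i1} - b_2 x_{i2}\right), && i = 1, \dots, n.
\end{aligned}
```

Each $Z_i$ is bounded below by both the residual and its negative, and the objective pushes it down, so at the optimum $Z_i$ equals the absolute residual and the objective equals the sum of absolute deviations.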

In [12]:
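# Import only the PuLP names used below instead of a wildcard import.
from pulp import LpProblem, LpMinimize, LpVariable, LpContinuous, lpSum, value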

In [28]:
# Use underscores in the problem name; PuLP does not accept spaces in names.
prob = LpProblem(name="LP_Sum_of_Absolute_Deviations", sense=LpMinimize)



In [29]:
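# Regression parameters: continuous variables with no bounds, so they may be negative.
b0 = LpVariable(name = "b0", cat = LpContinuous)
b1 = LpVariable(name = "b1", cat = LpContinuous)
b2 = LpVariable(name = "b2", cat = LpContinuous)
# Z[i] is an upper bound on the absolute residual of observation i.
Z = {i: LpVariable(name = "Z_{}".format(i), cat = LpContinuous) for i in range(24)}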

In [30]:
# Z[i] must be at least the residual of observation i ...
for i in range(24):
    prob += Z[i] >= Stock_Market['Stock_Index_Price'][i] - b0 - b1*Stock_Market['Interest_Rate'][i] - b2*Stock_Market['Unemployment_Rate'][i]
# ... and at least its negative, so together Z[i] >= |residual_i|.
for i in range(24):
    prob += Z[i] >= -(Stock_Market['Stock_Index_Price'][i] - b0 - b1*Stock_Market['Interest_Rate'][i] - b2*Stock_Market['Unemployment_Rate'][i])

In [31]:
Objective = lpSum(Z[i] for i in range(24))
prob.setObjective(Objective)
prob.solve()

1

In [32]:
print("Objective Function Value = ", value(prob.objective) )
print("\n")
print("Optimal Solution:")
for v in [b1, b2, b0]:
    print(v.name, "=", v.varValue)

Objective Function Value =  1226.6


Optimal Solution:
b1 = 348.0
b2 = -218.0
b0 = 1604.8
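
As a quick sanity check, the sketch below recomputes the sum of absolute deviations and the sum of squared errors for both fits, using the coefficients printed above. The helper names (`residuals`, `sum_abs_dev`, `sum_sq_err`) are illustrative and not part of the notebook, and the scikit-learn coefficients are hard-coded from the earlier output rather than refitted.

```python
# Data copied from the Stock_Market dictionary defined earlier in the notebook.
interest_rate = [2.75, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.25, 2.25, 2.25, 2, 2,
                 2, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75]
unemployment_rate = [5.3, 5.3, 5.3, 5.3, 5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7, 5.9,
                     6, 5.9, 5.8, 6.1, 6.2, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2, 6.1]
stock_index_price = [1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159, 1167, 1130, 1075,
                     1047, 965, 943, 958, 971, 949, 884, 866, 876, 822, 704, 719]

# (b0, b1, b2) as printed by the notebook for each method (hard-coded, not refitted).
ols = (1798.4039776258546, 345.54008701056574, -250.14657136938058)  # scikit-learn
lad = (1604.8, 348.0, -218.0)                                        # PuLP


def residuals(params):
    """Return y_i - (b0 + b1*x1_i + b2*x2_i) for every observation."""
    b0, b1, b2 = params
    return [y - (b0 + b1 * x1 + b2 * x2)
            for x1, x2, y in zip(interest_rate, unemployment_rate, stock_index_price)]


def sum_abs_dev(params):
    """Sum of absolute deviations: the loss minimized by the linear program."""
    return sum(abs(r) for r in residuals(params))


def sum_sq_err(params):
    """Sum of squared errors: the loss minimized by ordinary least squares."""
    return sum(r * r for r in residuals(params))


for label, params in (("scikit-learn (least squares)", ols),
                      ("PuLP (least absolute deviations)", lad)):
    print(label)
    print("  sum of absolute deviations:", round(sum_abs_dev(params), 2))
    print("  sum of squared errors:     ", round(sum_sq_err(params), 2))
```

The PuLP fit should show the smaller sum of absolute deviations, matching the objective value of 1226.6 reported above, while the scikit-learn fit should show the smaller sum of squared errors. The two sets of coefficients differ because each method minimizes a different loss.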
