# Advanced Linear Regression (statsmodels)

### Import Libraries

In [1]:
import numpy as np
import statsmodels.api as sm

### Define Data and Transform Input

In [2]:
x = [[0,1],[5,1],[15,2],[25,5],[35,11],[45,15],[55,34],[60,35]]
y = [4,5,20,14,32,22,38,43]
x,y = np.array(x), np.array(y)

### Add a Column of 1s to Calculate the Intercept (b0)

In [3]:
x = sm.add_constant(x) # Intercept
print(f'x : \n{x}')
print(f'\ny : {y}')

x : 
[[ 1.  0.  1.]
 [ 1.  5.  1.]
 [ 1. 15.  2.]
 [ 1. 25.  5.]
 [ 1. 35. 11.]
 [ 1. 45. 15.]
 [ 1. 55. 34.]
 [ 1. 60. 35.]]

y : [ 4  5 20 14 32 22 38 43]
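
As a rough illustration of what the added column looks like, the same design matrix can be built by hand with plain NumPy. This is only a sketch of the layout shown in the output above, not of how `sm.add_constant` is implemented; the names `x_raw`, `ones` and `x_design` are hypothetical.

```python
import numpy as np

# The two input features from the cell above, before the constant is added.
x_raw = np.array([[0, 1], [5, 1], [15, 2], [25, 5],
                  [35, 11], [45, 15], [55, 34], [60, 35]], dtype=float)

# One column of 1s, with one row per observation.
ones = np.ones((x_raw.shape[0], 1))

# Put the constant first, so the first coefficient is the intercept b0.
x_design = np.column_stack([ones, x_raw])
print(x_design)
```

The first column multiplies b0 in every row, which is why the fitted model can have a non-zero intercept.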


### Create and Train Model

**OLS** : Ordinary Least Squares 

**Important** : The first argument to `sm.OLS` is the dependent variable (output, `y`); the second is the input matrix (`x`).

In [4]:
model = sm.OLS(y,x).fit()
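
To make "least squares" concrete, the same kind of fit can be sketched with NumPy alone. The block below assumes the data defined earlier and uses `np.linalg.lstsq` as a stand-in solver; it is a cross-check, not a description of what statsmodels does internally, and the names `X` and `beta` are hypothetical.

```python
import numpy as np

x_raw = np.array([[0, 1], [5, 1], [15, 2], [25, 5],
                  [35, 11], [45, 15], [55, 34], [60, 35]], dtype=float)
y = np.array([4, 5, 20, 14, 32, 22, 38, 43], dtype=float)

# Design matrix: a column of 1s for the intercept, then the two features.
X = np.column_stack([np.ones(len(x_raw)), x_raw])

# Find beta that minimizes the sum of squared residuals ||y - X @ beta||^2.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f'b0, b1, b2 : {beta}')
```

The three values should agree, up to rounding, with the regression coefficients that `model.params` reports.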

In [5]:
print(model.summary())

                            OLS Regression Results                            
Dep. Variable:                      y   R-squared:                       0.862
Model:                            OLS   Adj. R-squared:                  0.806
Method:                 Least Squares   F-statistic:                     15.56
Date:                Sun, 14 Feb 2021   Prob (F-statistic):            0.00713
Time:                        11:54:51   Log-Likelihood:                -24.316
No. Observations:                   8   AIC:                             54.63
Df Residuals:                       5   BIC:                             54.87
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          5.5226      4.431      1.246      0.2



**Adjusted R<sup>2</sup>** : R<sup>2</sup> corrected for the number of input features, so that features which add little explanatory power are penalized.

In [6]:
print(f'Coefficient of Determination : {model.rsquared}')
print(f'Adjusted Coefficient of Determination : {model.rsquared_adj}')
print(f'Regression Coefficients : {model.params}')

Coefficient of Determination : 0.8615939258756777
Adjusted Coefficient of Determination : 0.8062314962259488
Regression Coefficients : [5.52257928 0.44706965 0.25502548]
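
The correction can be checked by hand with the usual formula, 1 - (1 - R<sup>2</sup>)(n - 1)/(n - p - 1), where n is the number of observations and p the number of input features (excluding the constant). A minimal sketch, taking R<sup>2</sup>, n = 8 and p = 2 from the output above; the variable names are hypothetical.

```python
# Values taken from the model output above.
r2 = 0.8615939258756777   # coefficient of determination
n = 8                     # No. Observations
p = 2                     # Df Model (input features, constant excluded)

# Adjusted R^2 shrinks R^2 according to the degrees of freedom used.
r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(f'Adjusted R^2 (by hand) : {r2_adj}')
```

With only 8 observations and 2 features, the penalty is noticeable: the score drops from about 0.862 to about 0.806.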


### Predictions
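There are two ways to get predictions for the training inputs: `model.fittedvalues` and `model.predict(x)`. Both return the same values.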

In [7]:
print(f'Predictions : \n{model.fittedvalues}\n')
print(f'Predictions : \n{model.predict(x)}')

Predictions : 
[ 5.77760476  8.012953   12.73867497 17.9744479  23.97529728 29.4660957
 38.78227633 41.27265006]

Predictions : 
[ 5.77760476  8.012953   12.73867497 17.9744479  23.97529728 29.4660957
 38.78227633 41.27265006]
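
Both results are the design matrix multiplied by the coefficient vector. The sketch below reproduces them with NumPy, assuming the rounded coefficients printed earlier; `params`, `X` and `fitted` are hypothetical names used only for illustration.

```python
import numpy as np

# Coefficients as printed above (b0, b1, b2), rounded to 8 decimals.
params = np.array([5.52257928, 0.44706965, 0.25502548])

x_raw = np.array([[0, 1], [5, 1], [15, 2], [25, 5],
                  [35, 11], [45, 15], [55, 34], [60, 35]], dtype=float)
X = np.column_stack([np.ones(len(x_raw)), x_raw])

# Each prediction is b0 + b1*x1 + b2*x2 for one row of X.
fitted = X @ params
print(f'Predictions : \n{fitted}')
```

Because the coefficients are rounded, the result may differ from the model's output in the last printed digits.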


### Predictions on New Data

In [8]:
x_new = sm.add_constant(np.arange(10).reshape((-1,2)))
print(x_new)

[[1. 0. 1.]
 [1. 2. 3.]
 [1. 4. 5.]
 [1. 6. 7.]
 [1. 8. 9.]]


In [9]:
y_new = model.predict(x_new)
print(f'Predictions : {y_new}')

Predictions : [ 5.77760476  7.18179502  8.58598528  9.99017554 11.3943658 ]
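
The new inputs must have the same column layout as the training data: constant first, then the two features. A small sketch of how `reshape((-1, 2))` produces that layout and how the predictions follow from it, again assuming the rounded coefficients printed earlier; all names are hypothetical.

```python
import numpy as np

params = np.array([5.52257928, 0.44706965, 0.25502548])

# 0..9 reshaped into rows of 2 features; -1 lets NumPy infer the row count (5).
features_new = np.arange(10).reshape((-1, 2))

# Same layout as the training matrix: constant column first.
X_new = np.column_stack([np.ones(len(features_new)), features_new])

y_new = X_new @ params
print(f'Predictions : {y_new}')
```

Each new row increases both features by 2, so consecutive predictions differ by a constant step of 2 * (b1 + b2).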
