https://www.datascience.com/blog/7-methods-to-fit-linear-model-python

# Several Methods for Multiple Linear Regression in Python

In [None]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
%matplotlib inline

## Read in Data (Advertising Data Set)

In [2]:
# read data into pandas DataFrame
data = pd.read_csv('http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv', index_col=0)
data.head()

|   | TV | radio | newspaper | sales |
|---|---|---|---|---|
| 1 | 230.1 | 37.8 | 69.2 | 22.1 |
| 2 | 44.5 | 39.3 | 45.1 | 10.4 |
| 3 | 17.2 | 45.9 | 69.3 | 9.3 |
| 4 | 151.5 | 41.3 | 58.5 | 18.5 |
| 5 | 180.8 | 10.8 | 58.4 | 12.9 |


## Using Matrix Method

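See the [Wikipedia page on ordinary least squares](https://en.wikipedia.org/wiki/Ordinary_least_squares), section "Linear model", subsection "Matrix/vector formulation". The least-squares estimate of the coefficients has the closed form below. Appending a column of ones to $\mathbf{X}$ makes the matching entry of $\hat{\boldsymbol{\beta}}$ the intercept.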

${\displaystyle {\hat {\boldsymbol {\beta }}}=(\mathbf {X} ^{\rm {T}}\mathbf {X} )^{-1}\mathbf {X} ^{\rm {T}}\mathbf {y} .}$

${\displaystyle \mathbf {X} ={\begin{bmatrix}X_{11}&X_{12}&\cdots &X_{1p}\\X_{21}&X_{22}&\cdots &X_{2p}\\\vdots &\vdots &\ddots &\vdots \\X_{n1}&X_{n2}&\cdots &X_{np}\end{bmatrix}},\qquad {\boldsymbol {\beta }}={\begin{bmatrix}\beta _{1}\\\beta _{2}\\\vdots \\\beta _{p}\end{bmatrix}},\qquad \mathbf {y} ={\begin{bmatrix}y_{1}\\y_{2}\\\vdots \\y_{n}\end{bmatrix}}.}$

In [114]:
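# Design matrix: one column per predictor plus a column of ones for the intercept
X = np.matrix([ np.asarray(data.TV),
                np.asarray(data.radio),
                np.asarray(data.newspaper),
                np.ones(len(data))
              ]).T

# Response vector
y = np.asarray(data.sales)

In [115]:
# With np.matrix, * is a matrix product and **-1 is a matrix inverse;
# np.vstack(y) turns the 1-D response into a column vector
beta = ((X.T * X)**-1) * X.T * np.vstack(y)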
beta

matrix([[ 4.57646455e-02],
        [ 1.88530017e-01],
        [-1.03749304e-03],
        [ 2.93888937e+00]])
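
The same formula can be written with plain arrays and the `@` operator. The sketch below is only an illustration on synthetic data (the values, seed, and variable names are assumptions, not the advertising data set); it solves the normal equations with `np.linalg.solve` and compares the result with the explicit inverse from the formula above.

```python
import numpy as np

# Synthetic stand-in for the data (hypothetical values, not Advertising.csv)
rng = np.random.default_rng(0)
n = 200
features = rng.uniform(0, 100, size=(n, 3))       # three predictors
true_beta = np.array([0.05, 0.2, 0.0, 3.0])       # last entry is the intercept
X = np.column_stack([features, np.ones(n)])       # append the column of ones
y = X @ true_beta + rng.normal(scale=1.5, size=n)

# Solve (X^T X) beta = X^T y without forming the inverse explicitly
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Same estimate via the explicit inverse, as written in the formula
beta_inv = np.linalg.inv(X.T @ X) @ X.T @ y

print(beta_hat)
```

Solving the linear system directly is usually preferred over computing the inverse, since it tends to be cheaper and numerically better behaved.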

## Numpy's linalg.lstsq

https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.linalg.lstsq.html

In [117]:
A = np.matrix([ np.asarray(data.TV), 
                np.asarray(data.radio), 
                np.asarray(data.newspaper),
                np.ones(len(data))
              ]).T

B = np.asarray(data.sales)


In [119]:
A.shape  # (200, 4)
B.shape  # (200,)

# rcond=None opts in to the current default cutoff and avoids the FutureWarning
np.linalg.lstsq(a=A, b=B, rcond=None)



(array([ 4.57646455e-02,  1.88530017e-01, -1.03749304e-03,  2.93888937e+00]),
 array([556.8252629]),
 4,
 array([2455.16525463,  398.53142158,  194.499542  ,    5.40339101]))
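
The four items in the returned tuple are the coefficients, the sum of squared residuals, the rank of the design matrix, and its singular values. A minimal sketch on synthetic data (the data and variable names are assumptions made for illustration) that unpacks the tuple and checks the residual entry by hand:

```python
import numpy as np

# Synthetic data: two predictors plus a column of ones (hypothetical values)
rng = np.random.default_rng(1)
n = 50
A = np.column_stack([rng.normal(size=(n, 2)), np.ones(n)])
b = A @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=n)

# Passing rcond=None selects the current default cutoff for small singular values
coef, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)

# For a 1-D b, residuals is a length-1 array holding the sum of squared residuals
rss = np.sum((b - A @ coef) ** 2)

print(coef, rank)
```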

## Statsmodels.ols
http://www.statsmodels.org/dev/index.html

In [120]:
import statsmodels.formula.api as smf

In [121]:
data.head()

|   | TV | radio | newspaper | sales |
|---|---|---|---|---|
| 1 | 230.1 | 37.8 | 69.2 | 22.1 |
| 2 | 44.5 | 39.3 | 45.1 | 10.4 |
| 3 | 17.2 | 45.9 | 69.3 | 9.3 |
| 4 | 151.5 | 41.3 | 58.5 | 18.5 |
| 5 | 180.8 | 10.8 | 58.4 | 12.9 |


In [122]:
result = smf.ols(formula='sales ~ TV + radio + newspaper', data=data).fit()

In [123]:
result.params

Intercept    2.938889
TV           0.045765
radio        0.188530
newspaper   -0.001037
dtype: float64

In [124]:
print(result.summary())

                            OLS Regression Results                            
Dep. Variable:                  sales   R-squared:                       0.897
Model:                            OLS   Adj. R-squared:                  0.896
Method:                 Least Squares   F-statistic:                     570.3
Date:                Fri, 15 Mar 2019   Prob (F-statistic):           1.58e-96
Time:                        09:45:23   Log-Likelihood:                -386.18
No. Observations:                 200   AIC:                             780.4
Df Residuals:                     196   BIC:                             793.6
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      2.9389      0.312      9.422      0.0

In [125]:
print(result.summary2())

                 Results: Ordinary least squares
Model:              OLS              Adj. R-squared:     0.896   
Dependent Variable: sales            AIC:                780.3622
Date:               2019-03-15 09:45 BIC:                793.5555
No. Observations:   200              Log-Likelihood:     -386.18 
Df Model:           3                F-statistic:        570.3   
Df Residuals:       196              Prob (F-statistic): 1.58e-96
R-squared:          0.897            Scale:              2.8409  
------------------------------------------------------------------
                Coef.   Std.Err.     t     P>|t|    [0.025  0.975]
------------------------------------------------------------------
Intercept       2.9389    0.3119   9.4223  0.0000   2.3238  3.5540
TV              0.0458    0.0014  32.8086  0.0000   0.0430  0.0485
radio           0.1885    0.0086  21.8935  0.0000   0.1715  0.2055
newspaper      -0.0010    0.0059  -0.1767  0.8599  -0.0126  0.0105
--------------------
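
Several columns of the summary can be reproduced from the matrix formulation. The sketch below assumes the usual OLS definitions: the scale is the residual sum of squares divided by the residual degrees of freedom, the standard errors are the square roots of the diagonal of `scale * (X^T X)^-1`, and each t value is a coefficient divided by its standard error. It uses synthetic data (hypothetical values) and then checks the scale against the two numbers reported in this notebook: the residual sum of squares 556.8252629 from `lstsq` and the 196 residual degrees of freedom.

```python
import numpy as np

# Synthetic data with three predictors and an intercept column (hypothetical)
rng = np.random.default_rng(2)
n, p = 200, 4
X = np.column_stack([rng.uniform(0, 100, size=(n, p - 1)), np.ones(n)])
y = X @ np.array([0.05, 0.2, 0.0, 3.0]) + rng.normal(scale=1.7, size=n)

beta, rss, _, _ = np.linalg.lstsq(X, y, rcond=None)

df_resid = n - p                           # "Df Residuals"
scale = rss[0] / df_resid                  # residual variance estimate ("Scale")
cov = scale * np.linalg.inv(X.T @ X)       # covariance of the coefficient estimates
std_err = np.sqrt(np.diag(cov))            # "Std.Err."
t_stats = beta / std_err                   # "t"

# Scale recomputed from the values printed earlier in this notebook
notebook_scale = 556.8252629 / 196
print(round(notebook_scale, 4))            # 2.8409
```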

## Scikit-learn
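The `np.ones(len(data))` column is not needed here: `LinearRegression` fits the intercept itself by default (`fit_intercept=True`) and reports it separately as `intercept_`.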

In [126]:
from sklearn.linear_model import LinearRegression

In [127]:
linreg = LinearRegression()

X = data.loc[:, ['TV', 'radio', 'newspaper']]
y = np.asarray(data.sales)

In [128]:
X.shape, y.shape

((200, 3), (200,))

In [129]:
result = linreg.fit(X, y) 

In [130]:
# print the intercept and coefficients
print(result.intercept_)
print(result.coef_)

2.9388893694594085
[ 0.04576465  0.18853002 -0.00103749]
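
One way to obtain an intercept without a column of ones is to center the predictors and the response, fit the slopes, and recover the intercept from the means. The sketch below illustrates that idea with NumPy on synthetic data (hypothetical values); it is an assumption-level illustration of the technique, not a statement of what scikit-learn does internally.

```python
import numpy as np

# Synthetic predictors without a column of ones (hypothetical values)
rng = np.random.default_rng(3)
n = 200
X = rng.uniform(0, 100, size=(n, 3))
y = X @ np.array([0.05, 0.2, 0.0]) + 3.0 + rng.normal(scale=1.5, size=n)

# Center the data, fit slopes only, then recover the intercept from the means
X_mean, y_mean = X.mean(axis=0), y.mean()
coef, *_ = np.linalg.lstsq(X - X_mean, y - y_mean, rcond=None)
intercept = y_mean - X_mean @ coef

# Reference fit using an explicit column of ones, as in the earlier sections
X1 = np.column_stack([X, np.ones(n)])
beta, *_ = np.linalg.lstsq(X1, y, rcond=None)

print(intercept, coef)
```

Both routes give the same slopes and intercept up to floating-point error.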


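Methods that do not offer multiple (multi-predictor) linear regression:

- SciPy's `polyfit` with `deg=1` (a first-order polynomial is a line in a single predictor)
- NumPy's `polyfit` (the same function; older SciPy releases re-exported it)
- SciPy's `linregress` (`scipy.stats.linregress`)
- SciPy's `curve_fit` (`scipy.optimize.curve_fit`), a general least-squares fitter that needs a user-written model function and has no dedicated multiple-regression interface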
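
For a single predictor, `np.polyfit` with `deg=1` gives the same line as the least-squares fit with a column of ones. A minimal sketch on synthetic data (the values and names are assumptions for illustration):

```python
import numpy as np

# One predictor only (hypothetical values)
rng = np.random.default_rng(4)
x = rng.uniform(0, 100, size=100)
y = 0.05 * x + 7.0 + rng.normal(scale=1.0, size=100)

# polyfit returns coefficients from the highest power down: [slope, intercept]
slope, intercept = np.polyfit(x, y, deg=1)

# Equivalent least-squares fit with an explicit column of ones
A = np.column_stack([x, np.ones_like(x)])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

print(slope, intercept)
```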