https://www.datascience.com/blog/7-methods-to-fit-linear-model-python

# Several Methods for Multiple Linear Regression in Python

Reorganize:

Start with true linear least squares, then polyfit, then scipy.optimize.curve_fit, then scipy.optimize with an arbitrary function.

Or organize by package.

In [166]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
# allow plots to appear directly in the notebook
%matplotlib inline

## read in data

In [167]:
# read data into a DataFrame
data = pd.read_csv('http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv', index_col=0)
data.head()

|   | TV | radio | newspaper | sales |
|---|---|---|---|---|
| 1 | 230.1 | 37.8 | 69.2 | 22.1 |
| 2 | 44.5 | 39.3 | 45.1 | 10.4 |
| 3 | 17.2 | 45.9 | 69.3 | 9.3 |
| 4 | 151.5 | 41.3 | 58.5 | 18.5 |
| 5 | 180.8 | 10.8 | 58.4 | 12.9 |


## Scikit-learn

In [168]:
from sklearn.linear_model import LinearRegression

linreg = LinearRegression( )

X = data.loc[:, ['TV', 'radio', 'newspaper']]
y = np.asarray(data.sales)

result = linreg.fit(X, y) 

# print the intercept and coefficients
print(result.intercept_)
print(result.coef_)

2.9388893694594085
[ 0.04576465  0.18853002 -0.00103749]


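Like numpy.linalg.lstsq, scikit-learn expects X to be 2D. Unlike numpy.linalg.lstsq, no column of ones is attached here: LinearRegression fits the intercept itself (fit_intercept=True by default) and reports it separately as intercept_.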
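
As a small check of how LinearRegression treats its inputs, the sketch below uses synthetic, noise-free data (an assumption for illustration, not the Advertising dataset). It shows that the intercept is recovered without an explicit column of ones, and that a single predictor has to be reshaped to 2D before fitting. The names `X_demo`, `y_demo`, and `true_coef` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data, made up for illustration only
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 100, size=(50, 3))   # 50 rows, 3 predictors
true_coef = np.array([0.5, -1.0, 2.0])
y_demo = 4.0 + X_demo @ true_coef            # exact linear relationship, no noise

# fit_intercept=True is the default, so no column of ones is needed
model = LinearRegression()
model.fit(X_demo, y_demo)
print(model.intercept_, model.coef_)

# A single predictor is 1D after slicing; reshape it to (n_samples, 1)
x_single = X_demo[:, 0].reshape(-1, 1)
single_model = LinearRegression().fit(x_single, y_demo)
print(single_model.coef_.shape)
```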

## Using Matrices/Vector Algebra

See [Wikipedia page](https://en.wikipedia.org/wiki/Ordinary_least_squares) under the Linear Model section and Matrix/vector formulation subsection. 

${\displaystyle {\hat {\boldsymbol {\beta }}}=(\mathbf {X} ^{\rm {T}}\mathbf {X} )^{-1}\mathbf {X} ^{\rm {T}}\mathbf {y} .}$

${\displaystyle \mathbf {X} ={\begin{bmatrix}X_{11}&X_{12}&\cdots &X_{1p}\\X_{21}&X_{22}&\cdots &X_{2p}\\\vdots &\vdots &\ddots &\vdots \\X_{n1}&X_{n2}&\cdots &X_{np}\end{bmatrix}},\qquad {\boldsymbol {\beta }}={\begin{bmatrix}\beta _{1}\\\beta _{2}\\\vdots \\\beta _{p}\end{bmatrix}},\qquad \mathbf {y} ={\begin{bmatrix}y_{1}\\y_{2}\\\vdots \\y_{n}\end{bmatrix}}.}$

In [169]:
# Append a column of ones so that the last coefficient is the intercept.
X = np.matrix([ np.asarray(data.TV), 
                np.asarray(data.radio), 
                np.asarray(data.newspaper), 
                np.ones(len(data))]).T    # shape (200, 4)


y = np.asarray(data.sales)

beta = ((X.T * X)**-1) * X.T * np.vstack(y)
beta

matrix([[ 4.57646455e-02],
        [ 1.88530017e-01],
        [-1.03749304e-03],
        [ 2.93888937e+00]])
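
The same normal-equation estimate can also be sketched with plain arrays. The example below is an illustration on synthetic data (an assumption, not the Advertising dataset) and swaps the explicit inverse for `np.linalg.solve`, which solves $(X^T X)\beta = X^T y$ directly. The names `X_design`, `beta_true`, and `y_synth` are hypothetical.

```python
import numpy as np

# Synthetic data, made up for illustration only
rng = np.random.default_rng(1)
n = 100
features = rng.normal(size=(n, 3))

# Column of ones goes last, matching the layout used in the cell above
X_design = np.column_stack([features, np.ones(n)])
beta_true = np.array([1.5, -2.0, 0.25, 3.0])
y_synth = X_design @ beta_true + rng.normal(scale=0.01, size=n)

# Solve (X^T X) beta = X^T y without forming the inverse explicitly
beta_hat = np.linalg.solve(X_design.T @ X_design, X_design.T @ y_synth)
print(beta_hat)
```

Solving the linear system is generally preferred over computing the inverse, since it does less work and is less sensitive to rounding error.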

## 1. Scipy's polyfit, 1st order polynomial is a line.

    scipy.polyfit(x = X , y = y , deg=1)

Scipy's polyfit does not work for multivariate linear regression because it expects x to be a 1D vector. Passing a multi-dimensional array or a matrix raises the following error:
    
    TypeError: expected 1D vector for x
    
The bottom of this function's docstring shows the path to a file named "polynomial.py" inside the numpy package. In other words, scipy.polyfit is numpy.polyfit, so numpy.polyfit fails in the same way.

    ~/anaconda3/lib/python3.7/site-packages/numpy/lib/polynomial.py

In [170]:
import scipy

In [177]:
scipy.polyfit?

In [174]:
# STILL NEED AN ADDITIONAL COLUMN OF 1'S HERE. 
X = np.matrix([ np.asarray(data.TV), 
                np.asarray(data.radio), 
                np.asarray(data.newspaper), 
                np.ones(len(data))]).T    # shape (200, 4)

y = np.asarray(data.sales)

scipy.polyfit(x = X , y = y , deg=1)

TypeError: expected 1D vector for x

In [175]:
X = np.vstack( [np.asarray(data.TV), 
                np.asarray(data.radio), 
                np.asarray(data.newspaper), 
                np.ones(len(data))] ).T

y = np.asarray(data.sales)

scipy.polyfit(x = X , y = y , deg=1)

TypeError: expected 1D vector for x

## 2. Numpy's polyfit

scipy.polyfit is the same function as numpy.polyfit.

Since scipy's version does not work for multivariate linear regression, neither does numpy's.
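
For contrast, a minimal sketch of what numpy.polyfit does handle: one 1D predictor. The data here are made up for illustration, and the names `x_1d` and `y_1d` are hypothetical. The second call reproduces the 1D-only restriction with a two-column input.

```python
import numpy as np

# Made-up points lying exactly on y = 2x + 1
x_1d = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_1d = 2.0 * x_1d + 1.0

# deg=1 returns [slope, intercept], highest power first
slope, intercept = np.polyfit(x_1d, y_1d, deg=1)
print(slope, intercept)

# A 2D x is rejected with a TypeError
try:
    np.polyfit(np.column_stack([x_1d, x_1d]), y_1d, deg=1)
    raised = False
except TypeError as err:
    raised = True
    print(err)
```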

## 3. Scipy's linregress
scipy.stats.linregress does not support multi-variable linear regression; it fits a single x against a single y.
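
A minimal sketch of the single-variable case that linregress is designed for, using made-up data (the names `x_vals` and `y_vals` are hypothetical):

```python
import numpy as np
from scipy import stats

# Made-up points lying exactly on y = 3x - 2
x_vals = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_vals = 3.0 * x_vals - 2.0

# The result exposes slope, intercept, rvalue, pvalue and stderr
res = stats.linregress(x_vals, y_vals)
print(res.slope, res.intercept, res.rvalue)
```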

## 4. Scipy's optimize.curve_fit

In [79]:
import scipy

In [96]:
scipy.optimize.curve_fit?

In [120]:
def line(x, m, b):
    x = np.asarray(x)
    return m*x+b

line_v = np.vectorize(line)

In [139]:
# np.ones(len(data))
# A = np.asarray(list(zip(    np.asarray(data.TV), 
#                             np.asarray(data.radio), 
#                             np.asarray(data.newspaper), 
#                             np.ones(len(data))    )))

# A = np.vstack( [np.asarray(data.TV), 
#                 np.asarray(data.radio), 
#                 np.asarray(data.newspaper), 
#                 np.ones(len(data)) ] ).T

# STILL NEED AN ADDITIONAL COLUMN OF 1'S HERE. 
A = np.matrix([ np.asarray(data.TV), 
                np.asarray(data.radio), 
                np.asarray(data.newspaper), 
                np.ones(len(data))]).T    # shape (200, 4)



B = np.asarray(data.sales)

In [140]:
A.shape, B.shape

((200, 4), (200,))

In [141]:
scipy.optimize.curve_fit(f=line, xdata=A, ydata= B)

ValueError: operands could not be broadcast together with shapes (200,4) (200,) 

In [142]:
scipy.optimize.curve_fit(f=line, xdata=A, ydata= np.vstack(B))

ValueError: object too deep for desired array

error: Result from function call is not a proper array of floats.

In [151]:
scipy.optimize.curve_fit(f=line, 
                         xdata = np.matrix([data.TV, data.radio, data.newspaper]), 
                         ydata = np.vstack( np.asarray(data.sales) ), 
                         p0 = np.asarray([1,1]))

ValueError: operands could not be broadcast together with shapes (3,200) (200,1) 

In [143]:
# with initial estimates
scipy.optimize.curve_fit(f=line, xdata=data.TV, ydata=data.sales, p0 = np.asarray([1,1]))

(array([0.04753664, 7.03259358]), array([[ 7.23936702e-06, -1.06449463e-03],
        [-1.06449463e-03,  2.09620159e-01]]))
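
The calls above that fail all pass several predictors to a model function with only a slope and an intercept. One possible way to fit several predictors with curve_fit, sketched here under assumptions, is to give the model function one parameter per predictor and to pass xdata with shape (k, M), that is, one row per predictor. The function `plane` and the synthetic data are hypothetical and are not taken from the Advertising dataset.

```python
import numpy as np
from scipy.optimize import curve_fit

def plane(X, b0, b1, b2, intercept):
    """Linear model in three predictors; X has shape (3, M)."""
    x0, x1, x2 = X
    return b0 * x0 + b1 * x1 + b2 * x2 + intercept

# Synthetic, noise-free data, made up for illustration only
rng = np.random.default_rng(2)
X_k_by_m = rng.uniform(0, 10, size=(3, 80))   # 3 predictors, 80 observations
params_true = (0.5, 1.5, -0.75, 2.0)
y_obs = plane(X_k_by_m, *params_true)

# p0 gives one starting value per fitted parameter
popt, pcov = curve_fit(plane, X_k_by_m, y_obs, p0=np.ones(4))
print(popt)
```

Note that ydata stays 1D with length M; only xdata carries the extra dimension.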

## 5. Numpy's linalg.lstsq

https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.linalg.lstsq.html

Both ways of building A, the np.vstack array shown here and the np.matrix used in the earlier sections, work.

In [74]:
A = np.vstack( [np.asarray(data.TV), 
                np.asarray(data.radio), 
                np.asarray(data.newspaper), 
                np.ones(len(data))] ).T

In [75]:
B = np.asarray(data.sales)

In [76]:
A.shape, B.shape

((200, 4), (200,))

In [77]:
# rcond=None avoids the FutureWarning raised by older NumPy versions
np.linalg.lstsq(a=A, b=B, rcond=None)


(array([ 4.57646455e-02,  1.88530017e-01, -1.03749304e-03,  2.93888937e+00]),
 array([556.8252629]),
 4,
 array([2455.16525463,  398.53142158,  194.499542  ,    5.40339101]))
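
The four returned values are, in order, the coefficients, the sum of squared residuals, the rank of A, and the singular values of A. The sketch below unpacks them on synthetic data (an assumption for illustration; the names `A_demo`, `b_demo`, and `coef_true` are hypothetical) and checks the residual term by hand.

```python
import numpy as np

# Synthetic data, made up for illustration only
rng = np.random.default_rng(3)
A_demo = np.column_stack([rng.normal(size=(30, 2)), np.ones(30)])
coef_true = np.array([2.0, -1.0, 0.5])
b_demo = A_demo @ coef_true + rng.normal(scale=0.1, size=30)

# Unpack the four return values
coef, residual_ss, rank, singular_values = np.linalg.lstsq(A_demo, b_demo, rcond=None)

# The second value is the sum of squared residuals of the fit
manual_rss = np.sum((b_demo - A_demo @ coef) ** 2)
print(coef, residual_ss, manual_rss, rank)
```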

## Statsmodels.ols
http://www.statsmodels.org/dev/index.html

In [47]:
import statsmodels.formula.api as smf

In [48]:
?smf.ols

In [52]:
data.head()

|   | TV | radio | newspaper | sales |
|---|---|---|---|---|
| 1 | 230.1 | 37.8 | 69.2 | 22.1 |
| 2 | 44.5 | 39.3 | 45.1 | 10.4 |
| 3 | 17.2 | 45.9 | 69.3 | 9.3 |
| 4 | 151.5 | 41.3 | 58.5 | 18.5 |
| 5 | 180.8 | 10.8 | 58.4 | 12.9 |


In [66]:
result = smf.ols(formula='sales ~ TV + radio + newspaper', data=data).fit()

In [67]:
result.params

Intercept    2.938889
TV           0.045765
radio        0.188530
newspaper   -0.001037
dtype: float64
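
Beyond `params`, the fitted result exposes the other quantities that appear in the summary tables as attributes. The sketch below is an illustration on synthetic data (an assumption, not the Advertising dataset; the column names `x1`, `x2`, and `y` are hypothetical) and pulls out the coefficients, R-squared, and confidence intervals directly.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data, made up for illustration only
rng = np.random.default_rng(4)
df = pd.DataFrame({"x1": rng.normal(size=60), "x2": rng.normal(size=60)})
df["y"] = 1.0 + 2.0 * df["x1"] - 3.0 * df["x2"] + rng.normal(scale=0.05, size=60)

# The formula interface adds the intercept term automatically
fit = smf.ols("y ~ x1 + x2", data=df).fit()

print(fit.params)        # coefficients, indexed by term name
print(fit.rsquared)      # R-squared
conf_int = fit.conf_int()  # 95% confidence intervals by default
print(conf_int)
```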

In [68]:
print(result.summary())

                            OLS Regression Results                            
Dep. Variable:                  sales   R-squared:                       0.897
Model:                            OLS   Adj. R-squared:                  0.896
Method:                 Least Squares   F-statistic:                     570.3
Date:                Fri, 11 Jan 2019   Prob (F-statistic):           1.58e-96
Time:                        16:23:27   Log-Likelihood:                -386.18
No. Observations:                 200   AIC:                             780.4
Df Residuals:                     196   BIC:                             793.6
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      2.9389      0.312      9.422      0.0

In [69]:
print(result.summary2())

                 Results: Ordinary least squares
Model:              OLS              Adj. R-squared:     0.896   
Dependent Variable: sales            AIC:                780.3622
Date:               2019-01-11 16:23 BIC:                793.5555
No. Observations:   200              Log-Likelihood:     -386.18 
Df Model:           3                F-statistic:        570.3   
Df Residuals:       196              Prob (F-statistic): 1.58e-96
R-squared:          0.897            Scale:              2.8409  
------------------------------------------------------------------
                Coef.   Std.Err.     t     P>|t|    [0.025  0.975]
------------------------------------------------------------------
Intercept       2.9389    0.3119   9.4223  0.0000   2.3238  3.5540
TV              0.0458    0.0014  32.8086  0.0000   0.0430  0.0485
radio           0.1885    0.0086  21.8935  0.0000   0.1715  0.2055
newspaper      -0.0010    0.0059  -0.1767  0.8599  -0.0126  0.0105
--------------------