# Linear Regression

**References:**
- https://www.coursera.org/learn/guided-tour-machine-learning-finance/notebook/7Z9sN/linear-regression

## Least-Squares solution

The least-squares estimate of the coefficients is given by the normal equation:

\begin{equation}
    \mathbf{\hat{\beta}}
    =
    \left( \mathbf{X}^{T} \mathbf{X} \right)^{-1} \mathbf{X}^{T} \mathbf{y}
\end{equation}
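
For reference, the normal equation follows from minimizing the residual sum of squares. A short derivation, assuming $ \mathbf{X}^{T} \mathbf{X} $ is invertible (that is, $ \mathbf{X} $ has full column rank); the symbol $ L $ for the loss is introduced here only for illustration:

\begin{equation}
    L(\mathbf{\beta})
    =
    \left\| \mathbf{y} - \mathbf{X} \mathbf{\beta} \right\|^2,
    \qquad
    \nabla_{\mathbf{\beta}} L
    =
    -2 \, \mathbf{X}^{T} \left( \mathbf{y} - \mathbf{X} \mathbf{\beta} \right)
    =
    0
    \quad \Longrightarrow \quad
    \mathbf{X}^{T} \mathbf{X} \, \mathbf{\hat{\beta}}
    =
    \mathbf{X}^{T} \mathbf{y}
\end{equation}

Multiplying both sides by $ \left( \mathbf{X}^{T} \mathbf{X} \right)^{-1} $ gives the closed form above.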

## Linear Regression on data with linear features

\begin{equation}
    y(x)
    =
    a + b_1 \cdot X_1 + b_2 \cdot X_2 + b_3 \cdot X_3 + \sigma \cdot \varepsilon
\end{equation}

where $ \varepsilon \sim N(0, 1) $ is Gaussian noise and $ \sigma $ is its volatility (standard deviation),
with the following choice of parameters:

- $ a = 1.0 $

- $ (b_1, b_2, b_3) = (1.0, 0.5, 0.2) $

- $ \sigma = 0.1 $

- $ X_1, X_2, X_3 $ are uniformly distributed in $ [-1,1] $

## Linear Regression on data with non-linear features

\begin{equation}
    y(x)
    =
    a + w_{00} \cdot X_1 + w_{01} \cdot X_2 + w_{02} \cdot X_3
    + w_{10} \cdot X_1^2 + w_{11} \cdot X_2^2 + w_{12} \cdot X_3^2
    + \sigma \cdot \varepsilon
\end{equation}

where

- $ w = [[1.0, 0.5, 0.2],[0.5, 0.3, 0.15]]  $

- the remaining parameters are as above, with the same values of $ X_i $
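
Although this model is non-linear in $ X_i $, it is still linear in the weights, so the same least-squares machinery applies once the squared features are added as extra columns. A minimal sketch under that assumption; the names `rng`, `Phi` and `beta_hat`, the seed and the sample size are illustrative choices, not part of the notebook:

```python
import numpy as np

rng = np.random.RandomState(0)  # seed chosen only for this illustration
n_points = 5000

# Features uniform in [-1, 1], as described in the text
X = rng.uniform(-1.0, 1.0, size=(n_points, 3))

# Parameters from the text: intercept, first-order and second-order weights
a = 1.0
w = np.array([[1.0, 0.5, 0.2], [0.5, 0.3, 0.15]])
sigma = 0.1

# Targets from the non-linear model with Gaussian noise
y = a + X @ w[0] + (X ** 2) @ w[1] + sigma * rng.normal(size=n_points)

# Design matrix: column of ones, then X_i, then X_i^2
Phi = np.column_stack([np.ones(n_points), X, X ** 2])

# Least-squares fit; lstsq avoids forming an explicit inverse
beta_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(beta_hat.round(2))
```

The seven entries of `beta_hat` should land close to the intercept followed by the two rows of $ w $, up to noise.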

In [60]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
import tensorflow as tf
from tensorflow.python.layers import core as core_layers
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
%matplotlib inline

In [4]:
def reset_graph(seed=42):
    """
    Utility function to reset current tensorflow computation graph
    and set the random seed 
    """
    # to make results reproducible across runs
    tf.reset_default_graph()
    tf.set_random_seed(seed)
    np.random.seed(seed)

In [13]:
def generate_data(n_points=10000, n_features=3, use_nonlinear_data=True, 
                    noise_std=0.1, train_test_split = 4):
    """
    Arguments:
    n_points - number of data points to generate
    n_features - a positive integer - number of features (the hard-coded weights assume 3)
    use_nonlinear_data - if True, generate non-linear data
    noise_std - standard deviation of the Gaussian noise
    train_test_split - an integer k - the last 1/k of the data is held out for testing
    
    Return:
    X_train, Y_train, X_test, Y_test, n_train, n_features
    """
    
    # Linear data or non-linear data?
    if use_nonlinear_data:
        weights = np.array([[1.0, 0.5, 0.2],[0.5, 0.3, 0.15]])
    else:
        weights = np.array([1.0, 0.5, 0.2])
        
    
    bias =   np.ones(n_points).reshape((-1,1))
    low  = - np.ones((n_points,n_features),'float')
    high =   np.ones((n_points,n_features),'float')
        
    np.random.seed(42)
    X = np.random.uniform(low=low, high=high)
    
    np.random.seed(42)
    # standard normal noise; it is scaled by the noise_std argument below
    noise = np.random.normal(size=(n_points, 1))
    
    if use_nonlinear_data:
        Y = (weights[0,0] * bias + np.dot(X, weights[0, :]).reshape((-1,1)) + 
             np.dot(X*X, weights[1, :]).reshape([-1,1]) +
             noise_std * noise)
    else:
        Y = (weights[0] * bias + np.dot(X, weights[:]).reshape((-1,1)) + 
             noise_std * noise)
    
    n_test = int(n_points/train_test_split)
    n_train = n_points - n_test
    
    X_train = X[:n_train,:]
    Y_train = Y[:n_train].reshape((-1,1))

    X_test = X[n_train:,:]
    Y_test = Y[n_train:].reshape((-1,1))
    
    return X_train, Y_train, X_test, Y_test, n_train, n_features

In [52]:
X_train, Y_train, X_test, Y_test, n_train, n_features = generate_data(use_nonlinear_data=False)
X_train.shape, Y_train.shape

((7500, 3), (7500, 1))

In [53]:
X_train = np.c_[np.ones(len(X_train)) , X_train ]

## Linear Regression with `numpy`

In [59]:
betahat_np = np.matmul(np.matmul( np.linalg.inv(np.matmul(X_train.T,X_train)), X_train.T), Y_train)
betahat_np

array([[ 0.99946227],
       [ 0.99579039],
       [ 0.499198  ],
       [ 0.20019798]])
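
Forming the explicit inverse is fine for a 4x4 matrix, but solving the linear system directly is generally the more numerically robust habit. A sketch on synthetic stand-in data; `X_demo`, `y_demo` and `true_beta` are hypothetical names, and the coefficients are chosen to resemble the estimates printed above:

```python
import numpy as np

rng = np.random.RandomState(0)  # seed chosen only for this illustration

# Stand-in design matrix: column of ones plus three uniform features
X_demo = np.c_[np.ones(1000), rng.uniform(-1.0, 1.0, size=(1000, 3))]
true_beta = np.array([[1.0], [1.0], [0.5], [0.2]])
y_demo = X_demo @ true_beta + 0.1 * rng.normal(size=(1000, 1))

# Explicit inverse, as in the cell above
beta_inv = np.linalg.inv(X_demo.T @ X_demo) @ X_demo.T @ y_demo

# Solve (X^T X) beta = X^T y without forming the inverse
beta_solve = np.linalg.solve(X_demo.T @ X_demo, X_demo.T @ y_demo)

# Both routes should agree to floating-point tolerance
print(np.allclose(beta_inv, beta_solve))
```

For a well-conditioned problem like this one the two results should be indistinguishable; the difference matters mainly when $ \mathbf{X}^{T} \mathbf{X} $ is close to singular.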

Sanity checks on the design matrix and on $ \mathbf{X}^{T} \mathbf{X} $:

In [54]:
X_train.shape

(7500, 4)

In [55]:
X_train.T.shape

(4, 7500)

In [56]:
X_train.T @ X_train

array([[  7.50000000e+03,   7.63529630e+01,   1.10196223e+01,
         -4.91540004e+01],
       [  7.63529630e+01,   2.46507830e+03,   6.49185120e+00,
          2.66539829e+01],
       [  1.10196223e+01,   6.49185120e+00,   2.53214517e+03,
         -2.81560831e+01],
       [ -4.91540004e+01,   2.66539829e+01,  -2.81560831e+01,
          2.51282096e+03]])

In [57]:
X_train

array([[ 1.        , -0.25091976,  0.90142861,  0.46398788],
       [ 1.        ,  0.19731697, -0.68796272, -0.68801096],
       [ 1.        , -0.88383278,  0.73235229,  0.20223002],
       ..., 
       [ 1.        , -0.41969273,  0.7395476 ,  0.49515624],
       [ 1.        , -0.15112479,  0.42105835,  0.73537775],
       [ 1.        ,  0.95130426,  0.90470301, -0.34934435]])

## Linear Regression with `sklearn`

In [72]:
help(LinearRegression)

Help on class LinearRegression in module sklearn.linear_model.base:

class LinearRegression(LinearModel, sklearn.base.RegressorMixin)
 |  Ordinary least squares Linear Regression.
 |  
 |  Parameters
 |  ----------
 |  fit_intercept : boolean, optional
 |      whether to calculate the intercept for this model. If set
 |      to false, no intercept will be used in calculations
 |      (e.g. data is expected to be already centered).
 |  
 |  normalize : boolean, optional, default False
 |      If True, the regressors X will be normalized before regression.
 |      This parameter is ignored when `fit_intercept` is set to False.
 |      When the regressors are normalized, note that this makes the
 |      hyperparameters learnt more robust and almost independent of the number
 |      of samples. The same property is not valid for standardized data.
 |      However, if you wish to standardize, please use
 |      `preprocessing.StandardScaler` before calling `fit` on an estimator
 |      with

In [108]:
# X_train already contains a column of ones, so the intercept is fitted as an ordinary coefficient
lr = LinearRegression(fit_intercept=False)

In [109]:
lr.fit(X_train, Y_train)

LinearRegression(copy_X=True, fit_intercept=False, n_jobs=1, normalize=False)

In [110]:
betahat_sklearn = lr.coef_
betahat_sklearn

array([[ 0.99946227,  0.99579039,  0.499198  ,  0.20019798]])
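
The fit above passes `fit_intercept=False` because the column of ones was appended by hand. An alternative is to leave the features untouched and let `sklearn` estimate the intercept itself. A sketch on synthetic stand-in data; `X_raw`, `y_demo` and `lr_demo` are hypothetical names, not notebook variables:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)  # seed chosen only for this illustration

# Raw features, with no column of ones
X_raw = rng.uniform(-1.0, 1.0, size=(1000, 3))
y_demo = 1.0 + X_raw @ np.array([1.0, 0.5, 0.2]) + 0.1 * rng.normal(size=1000)

# fit_intercept=True is the default; it is spelled out here for clarity
lr_demo = LinearRegression(fit_intercept=True).fit(X_raw, y_demo)

# The intercept and the slopes are stored in separate attributes
print(lr_demo.intercept_, lr_demo.coef_)
```

With this variant the intercept is read from `intercept_` rather than from the first entry of `coef_`.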

## Linear Regression with `tensorflow`

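In [None]:
# Assumed completion: the normal equation in the TF1 graph API
# (X_train already includes the column of ones)
X = tf.constant(X_train, dtype=tf.float32)
y = tf.constant(Y_train, dtype=tf.float32)
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)

In [None]:
with tf.Session() as sess:
    theta_value = theta.eval()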

In [112]:
x = tf.Variable(3, name='x')

In [118]:
y = tf.Variable(1)

In [119]:
x.graph

<tensorflow.python.framework.ops.Graph at 0x7fddc78f5b70>

In [120]:
y.graph

<tensorflow.python.framework.ops.Graph at 0x7fddc78f5b70>

In [113]:
x.name

'x:0'

In [116]:
y.name

'Variable:0'