
## Linear Regression: Theory







Given n observations {**x<sub>i</sub>**, y<sub>i</sub>}<sub>i=1:n</sub>, where y<sub>i</sub> is a scalar response and **x<sub>i</sub>** is a column vector of p predictors (an intercept is included by fixing x<sub>i1</sub> = 1), the linear model is:
* y<sub>i</sub> = β<sub>1</sub>x<sub>i1</sub> + ... + β<sub>p</sub>x<sub>ip</sub> + ε<sub>i</sub>

* or, in vector form: y<sub>i</sub> = **x<sub>i</sub>**<sup>T</sup>**β** + ε<sub>i</sub>

* or, stacking the n equations together in matrix notation: **y** = X**β** + **ε**
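
As a small illustration of the stacking step, the sketch below builds X row by row from the individual **x<sub>i</sub>** and checks that the per-observation and matrix forms give the same **y**. All numbers (observations, coefficients, noise) are made up for the example.

```python
import numpy as np

# Three observations (n = 3) with two predictors each (p = 2); values are invented
x_rows = [np.array([1.0, 2.0]),
          np.array([3.0, 4.0]),
          np.array([5.0, 6.0])]
beta = np.array([0.5, -1.0])         # hypothetical coefficients
eps = np.array([0.1, -0.2, 0.05])    # hypothetical noise terms

# Per-observation form: y_i = x_i^T beta + eps_i
y_loop = np.array([x_i @ beta + e_i for x_i, e_i in zip(x_rows, eps)])

# Matrix form: each x_i^T becomes one row of X, so X has shape (n, p)
X = np.vstack(x_rows)
y_matrix = X @ beta + eps

print(X.shape)
print(np.allclose(y_loop, y_matrix))
```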

### Ordinary Least Squares Solution
Find the coefficients **β** which fit the equations "best", **$\hat{β}$** = arg min S(**β**), where

* S(**β**) = ||**y** - X**β**||<sup>2</sup>

* The solution is: **$\hat{β}$** = (X<sup>T</sup>X)<sup>-1</sup> X<sup>T</sup>**y**

where (X<sup>T</sup>X)<sup>-1</sup> X<sup>T</sup> is the Moore–Penrose pseudoinverse of X, provided X has full column rank (so that X<sup>T</sup>X is invertible).
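
The closed form follows from setting the gradient of S(**β**), which is -2X<sup>T</sup>(**y** - X**β**), to zero. The sketch below checks this numerically on synthetic data: it solves the normal equations, compares the result with the pseudoinverse, and confirms that the residual is orthogonal to the columns of X. The data, seed, and variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
# Design matrix with a column of ones (intercept) and two random predictors
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
beta_true = np.array([2.0, -1.0, 0.5])          # invented coefficients
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Normal equations: solve (X^T X) beta = X^T y without forming the inverse
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Same estimate via the Moore-Penrose pseudoinverse
beta_pinv = np.linalg.pinv(X) @ y

# At the minimum the gradient vanishes, i.e. X^T (y - X beta) = 0
residual = y - X @ beta_normal
print(np.allclose(beta_normal, beta_pinv))
print(np.allclose(X.T @ residual, 0.0, atol=1e-8))
```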

### Reference
* https://en.wikipedia.org/wiki/Ordinary_least_squares#Linear_model


## Visualisation and data functions

In [None]:
#@title plot2D() plot3D() show()
import numpy as np
import matplotlib.pyplot as plt

def plot2D(x, y, beta):
  plt.plot(x, y, 'o', label='data')
  xx = np.linspace(min(x), max(x), 101)
  yy = beta[0] + beta[1]*xx
  plt.plot(xx, yy, label='least squares fit, $y = a + bx$')
  plt.xlabel('x')
  plt.ylabel('y')
  plt.legend(framealpha=1, shadow=True)
  plt.grid(alpha=0.25)
  plt.show()


def scale(x, a, b):
  '''Linearly rescale a numpy array so that its values span [a, b]'''
  x_min = np.amin(x)
  span = np.amax(x) - x_min
  return (x - x_min) / span * (b - a) + a


def plot3D(x, y, beta):
  # https://www.kaggle.com/code/spidy20/3d-visualization-of-multiple-linear-regression/notebook
  # https://gist.github.com/aricooperdavis/c658fc1c5d9bdc5b50ec94602328073b
  fig = plt.figure(figsize=(10,10))
  ax = fig.add_subplot(111, projection='3d')
  ax.set_xlabel("X1")
  ax.set_ylabel("X2")
  ax.set_zlabel("y")
  ax.scatter(x[:,0], x[:,1], y, marker='.', color='red')
  
  x1_min, x1_max = min(x[:,0]), max(x[:,0])
  x2_min, x2_max = min(x[:,1]), max(x[:,1])
  xs = scale(np.tile(np.arange(101), (101,1)), x1_min, x1_max)
  ys = scale(np.tile(np.arange(101), (101,1)).T, x2_min, x2_max)
  zs = beta[0] + xs*beta[1]+ ys*beta[2]
  ax.plot_surface(xs,ys,zs, alpha=0.3)
  plt.show()


def show(x, y, beta):
  print(beta)

  if len(beta) == 2:
    plot2D(x, y, beta)
  elif len(beta) == 3:
    plot3D(x, y, beta)

In [None]:
#@title randomLinearData() make_regression()
# https://colab.research.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/validation_and_test_sets.ipynb?utm_source=mlcc&utm_campaign=colab-external&utm_medium=referral&utm_content=validation_tf2-colab&hl=en
import numpy as np

def randomLinearData(beta, n):
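  np.random.seed(100) # seed the random number generator for reproducible data
  x = 1000 * np.random.rand(n)
  # Noise is uniform on [0, 100), so its mean of 50 shifts the fitted intercept upward
  noise = 100 * np.random.rand(len(x))
  y = beta[0] + x*beta[1] + noise
  return (x.reshape(-1, 1), y)  # x as an (n, 1) column, matching make_regression()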

from sklearn import datasets

def make_regression(n_samples, n_features, intercept=10):
  '''Wrap sklearn's make_regression; returns (x, y, coef)'''
  return datasets.make_regression(
      n_samples=n_samples,       # number of samples
      n_features=n_features,     # number of features
      n_informative=n_features,  # number of features that actually influence y
      bias=intercept,            # intercept of the underlying linear model
      noise=10,                  # standard deviation of the Gaussian noise
      coef=True,                 # also return the true coefficients used to generate the data
      random_state=0)            # fixed seed so every run yields the same data

## Linear Regression: OLS methods

1. **$\hat{β}$** = (X<sup>T</sup>X)<sup>-1</sup> X<sup>T</sup>**y**. https://cmdlinetips.com/2020/03/linear-regression-using-matrix-multiplication-in-python-using-numpy/
2. **$\hat{β}$** = X<sup>+</sup>**y**, where X<sup>+</sup> is the pseudoinverse of X (`np.linalg.pinv`)
3. scipy.linalg.lstsq https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lstsq.html#scipy.linalg.lstsq
4. sklearn LinearRegression https://cmdlinetips.com/2020/03/linear-regression-using-matrix-multiplication-in-python-using-numpy/
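
Methods 1 and 2 agree when X has full column rank, but they differ when it does not: X<sup>T</sup>X is then singular, so the explicit inverse in method 1 is not defined, while the pseudoinverse still returns the minimum-norm least-squares solution. The sketch below illustrates this with a deliberately duplicated column. The data and the `rcond` cutoff are assumptions chosen for the example.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x                               # exact line, no noise
X = np.column_stack([np.ones_like(x), x, x])    # last two columns are identical

# Only two of the three columns are linearly independent
print(np.linalg.matrix_rank(X))

# The pseudoinverse picks the solution with the smallest norm, which splits
# the slope of 2 evenly across the two identical columns
beta = np.linalg.pinv(X, rcond=1e-10) @ y
print(np.round(beta, 6))
print(np.allclose(X @ beta, y))
```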


In [None]:
import numpy as np
from scipy.linalg import lstsq
from sklearn.linear_model import LinearRegression

def make_X_mat(x):
  '''Create the design matrix: a leading column of ones (intercept) followed by the columns of x'''
  return np.column_stack((np.ones(len(x)), np.array(x)))

# 1. https://cmdlinetips.com/2020/03/linear-regression-using-matrix-multiplication-in-python-using-numpy/
def ols_with_inverse(X_mat, y):
  return np.linalg.inv(X_mat.T.dot(X_mat)).dot(X_mat.T).dot(y)

# 2.
def ols_with_pseudoinverse(X_mat, y):
  return np.linalg.pinv(X_mat).dot(y)

# 3. Despite the function name, this calls scipy.linalg.lstsq, not numpy's lstsq.
# https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lstsq.html#scipy.linalg.lstsq
def ols_with_numpy_lstsq(X_mat, y):
  beta_hat, res, rnk, s = lstsq(X_mat, y)
  return beta_hat
  
# 4. sklearn.linear_model.LinearRegression https://cmdlinetips.com/2020/03/linear-regression-using-matrix-multiplication-in-python-using-numpy/
def ols_with_sklearn(X_mat, y):
  # LinearRegression fits its own intercept, so drop the leading column of ones
  lr = LinearRegression().fit(X_mat[:, 1:], y)
  beta_hat = np.insert(lr.coef_, 0, lr.intercept_, axis=0)
  return beta_hat

def show_olss(x, y, inv=False, pinv=False, lstsq=False, sklearn=True):
  X_mat=make_X_mat(x)
  if inv:
    show(x, y, ols_with_inverse(X_mat, y))
  if pinv:
    show(x, y, ols_with_pseudoinverse(X_mat, y))
  if lstsq:
    show(x, y, ols_with_numpy_lstsq(X_mat, y))
  if sklearn:
    show(x, y, ols_with_sklearn(X_mat, y))

In [None]:
(x, y) = randomLinearData([50, 3], 100)
show_olss(x, y, inv=True, pinv=True, lstsq=True, sklearn=True)

In [None]:
x, y, coef = make_regression(n_samples=100, n_features=1)
print(coef)
show_olss(x, y)

In [None]:
x, y, coef = make_regression(n_samples=1000, n_features=2, intercept=3)
print(coef)
show_olss(x, y)

# Linear Regression: Gradient descent



https://towardsdatascience.com/calculating-gradient-descent-manually-6d9bee09aa0b

https://www.geeksforgeeks.org/ml-mini-batch-gradient-descent-with-python/

https://towardsdatascience.com/linear-regression-using-gradient-descent-97a6c8700931
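
A minimal sketch of batch gradient descent for the linear model, assuming the mean squared error ||**y** - X**β**||<sup>2</sup> / n as the loss, whose gradient is -(2/n) X<sup>T</sup>(**y** - X**β**). The function name `gradient_descent`, the learning rate, the iteration count, and the synthetic data are illustrative choices and are not part of this notebook's code.

```python
import numpy as np

def gradient_descent(X_mat, y, learning_rate=0.1, n_iters=2000):
    """Batch gradient descent on the mean squared error ||y - X beta||^2 / n."""
    n, p = X_mat.shape
    beta = np.zeros(p)
    for _ in range(n_iters):
        residual = y - X_mat @ beta
        grad = -2.0 / n * (X_mat.T @ residual)   # gradient of the MSE w.r.t. beta
        beta = beta - learning_rate * grad       # step against the gradient
    return beta

# Synthetic data with roughly unit-scale features (invented for the example)
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
X_mat = np.column_stack([np.ones(len(x)), x])
y = X_mat @ np.array([3.0, 1.5, -2.0]) + 0.1 * rng.normal(size=len(x))

beta_gd = gradient_descent(X_mat, y)
beta_ols = np.linalg.pinv(X_mat) @ y
print(beta_gd)
print(np.allclose(beta_gd, beta_ols, atol=1e-6))
```

With features on very different scales a fixed learning rate like this one may diverge or converge slowly, which is one reason to scale features before training.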



In [None]:
def show_gd(x, y):
  # Placeholder: gradient descent is not implemented yet, so this shows the OLS fit
  show_olss(x, y)


In [None]:
x, y, coef = make_regression(n_samples=1000, n_features=10)
print(coef)
show_gd(x, y)

# Linear Regression: Gradient descent with automatic differentiation


* https://mdrk.io/introduction-to-automatic-differentiation/
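
To make the idea concrete, here is a toy forward-mode automatic differentiation sketch using dual numbers, applied to gradient descent on the line y = a + bx. The `Dual` class, the `mse` and `gradient` helpers, and the step size are hypothetical names and values invented for this example. Frameworks such as TensorFlow typically use reverse mode instead, so this only illustrates the principle.

```python
class Dual:
    """Number value + deriv * eps with eps**2 == 0 (forward-mode AD)."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants (derivative 0)
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)
    __radd__ = __add__

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)
    __rmul__ = __mul__


def mse(a, b, xs, ys):
    """Mean squared error of the line y = a + b*x; a and b may be Dual."""
    total = 0.0
    for x, y in zip(xs, ys):
        err = y - (a + b * x)
        total = total + err * err
    return total * (1.0 / len(xs))


def gradient(a, b, xs, ys):
    # Seed one parameter at a time with derivative 1 to get each partial derivative
    d_a = mse(Dual(a, 1.0), Dual(b, 0.0), xs, ys).deriv
    d_b = mse(Dual(a, 0.0), Dual(b, 1.0), xs, ys).deriv
    return d_a, d_b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]   # noise-free line y = 1 + 2x

a, b = 0.0, 0.0
for _ in range(5000):
    d_a, d_b = gradient(a, b, xs, ys)
    a -= 0.05 * d_a
    b -= 0.05 * d_b

print(round(a, 4), round(b, 4))
```

No derivative formula is written by hand here: the partial derivatives fall out of the overloaded arithmetic while the loss is being evaluated.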


## Linear Regression: TensorFlow


References

* https://colab.research.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/validation_and_test_sets.ipynb?utm_source=mlcc&utm_campaign=colab-external&utm_medium=referral&utm_content=validation_tf2-colab&hl=en#scrollTo=FBhNIdUatOU6

* https://colab.research.google.com/github/kaustubholpadkar/Predicting-House-Price-using-Multivariate-Linear-Regression/blob/master/Multivariate_Linear_Regression_Python.ipynb

* https://www.coursera.org/projects/regression-automatic-differentiation-tensorflow



In [None]:
#@title Run on TensorFlow 2.x
%tensorflow_version 2.x

In [None]:
#@title Import modules
import numpy as np
import pandas as pd
import tensorflow as tf
from matplotlib import pyplot as plt

pd.options.display.max_rows = 10
pd.options.display.float_format = "{:.1f}".format

# ML

## Feature Engineering

* Load data
* Shuffle
* Split into training, validation and test sets
* Scale label column to meaningful values
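
The steps above can be sketched with NumPy as follows. The dataset is randomly generated, and the 60/20/20 split and the division of the label by 1000 are arbitrary assumptions made for the example, not values taken from this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dataset: 10 rows, 2 features, label in large raw units
features = rng.normal(size=(10, 2))
labels = rng.uniform(100_000, 500_000, size=10)

# Shuffle features and labels with the same permutation so rows stay aligned
order = rng.permutation(len(labels))
features, labels = features[order], labels[order]

# Scale the label to a friendlier range (here: thousands)
labels_k = labels / 1000.0

# Split 60/20/20 into training, validation and test sets
n_train = len(labels) * 6 // 10
n_val = len(labels) * 2 // 10
train_x, val_x, test_x = np.split(features, [n_train, n_train + n_val])
train_y, val_y, test_y = np.split(labels_k, [n_train, n_train + n_val])

print(len(train_y), len(val_y), len(test_y))
```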

Features
* Bucketize and cross features
* Scale features: min-max or z-score
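
A small NumPy sketch of these feature transforms, assuming a single numeric feature and a second feature that is already bucketized. The values, bin boundaries, and variable names are invented for the example.

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Min-max scaling to [0, 1]
x_minmax = (x - x.min()) / (x.max() - x.min())

# Z-score scaling: zero mean, unit standard deviation
x_z = (x - x.mean()) / x.std()

# Bucketize: map each value to the index of the bin it falls in
boundaries = np.array([5.0, 9.0])        # hypothetical bin edges
buckets = np.digitize(x, boundaries)     # 0: x < 5, 1: 5 <= x < 9, 2: x >= 9

# Feature cross: one id per (bucket, other) pair
other = np.array([0, 1, 0, 1, 1])        # second feature with 2 buckets
cross = buckets * 2 + other

print(x_minmax)
print(buckets)
print(cross)
```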


References

* https://colab.research.google.com/github/google/eng-edu/blob/main/ml/cc/exercises/representation_with_a_feature_cross.ipynb?utm_source=mlcc&utm_campaign=colab-external&utm_medium=referral&utm_content=representation_tf2-colab&hl=en


## Define and Train Models
* create model
* train model
* plot loss curves
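
A minimal Keras sketch of these three steps, assuming a single-feature linear model trained with plain SGD on synthetic data. The data, learning rate, epoch count, and batch size are assumptions made for the example. The per-epoch losses in `history.history["loss"]` are the values one would pass to `plt.plot` to draw the loss curve.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)
rng = np.random.default_rng(0)

# Synthetic data for y = 3 + 2x plus small Gaussian noise (invented)
x = rng.normal(size=(200, 1)).astype("float32")
y = (3.0 + 2.0 * x[:, 0] + 0.1 * rng.normal(size=200)).astype("float32")

# Create model: a single Dense unit is the linear model y = w*x + b
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
              loss="mean_squared_error")

# Train model
history = model.fit(x, y, epochs=20, batch_size=32, verbose=0)

# Loss curve: one mean-squared-error value per epoch
losses = history.history["loss"]
print(losses[0], losses[-1])
```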
