<a href="https://colab.research.google.com/github/M-H-Amini/MachineLearningMiniCourse/blob/master/MLmini_LinearRegression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# In The Name Of ALLAH
# Machine Learning *mini* Course
### PythonChallenge.ir
### Mohammad Hossein Amini (mhamini@aut.ac.ir)

# Linear Regression

# Introduction

The theory was covered in the video lectures. Here we put it into practice.

First, we import the modules we need.

In [0]:
try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass
import tensorflow as tf
print(tf.__version__)
import numpy as np
import matplotlib.pyplot as plt
import os
import pandas as pd
import glob

# Creating Dataset

In [0]:
x = np.array([np.linspace(10, 50, 15)])
y = 2 + 1.5 * x +  10 * np.random.normal(0, 1, x.shape)
print(x.shape, y.shape)

In [0]:
plt.figure()
plt.plot(x, y, 'rx')
plt.show()

# Linear Regression (Hand Coding!)
Now we implement our estimator using only **numpy**. In this method, we code the gradient calculation and the weight updates (gradient descent) by hand!
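As a reminder of what we are about to code, here is a sketch of the usual single-feature setup. The symbols $m$ (number of samples), $\alpha$ (learning rate), and the superscript $(i)$ for the $i$-th sample are notation assumed here, not taken from the lectures.

$$
h_w(x) = w_0 + w_1 x, \qquad
J(w) = \frac{1}{2m} \sum_{i=1}^{m} \left( y^{(i)} - h_w(x^{(i)}) \right)^2
$$

A stochastic gradient descent step on one sample $(x^{(i)}, y^{(i)})$ then reads:

$$
w \leftarrow w + \alpha \left( y^{(i)} - h_w(x^{(i)}) \right) \begin{bmatrix} 1 \\ x^{(i)} \end{bmatrix}
$$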

Let's implement the estimator (hypothesis) function.

In [0]:
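def h(x, w):
  # x: (1, m) row of inputs, w: (2, 1) column [bias, slope]; returns (m, 1) predictions.
  x = np.concatenate((np.ones((1, x.shape[1])), x))  # prepend a row of ones for the bias
  return np.dot(np.transpose(x), w)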

w = np.array([[1], [0.5]])
print(h(np.array([[1]]), w))
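
To make the shapes concrete, the following minimal sketch repeats `h` and evaluates it on three inputs at once. The input values in `x_demo` are made up for illustration.

```python
import numpy as np

def h(x, w):
  # Prepend a row of ones so that w[0] acts as the bias term.
  x = np.concatenate((np.ones((1, x.shape[1])), x))
  return np.dot(np.transpose(x), w)

w_demo = np.array([[1.0], [0.5]])     # bias = 1, slope = 0.5
x_demo = np.array([[0.0, 1.0, 2.0]])  # three samples stored as one row
pred = h(x_demo, w_demo)
print(pred.shape)    # one prediction per sample, as a column: (3, 1)
print(pred.ravel())  # values 1.0, 1.5, 2.0
```

Samples are stored as columns of `x`, while predictions come back as rows, which is why the later cells transpose `y` before subtracting.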

## Visualizing Data and Estimator Result
A simple plotting function lets us see how well the estimator fits the data.

In [0]:
def show(x, y, w):
  predicted = np.transpose(h(x, w))
  plt.figure()
  plt.plot(x, y, 'rx')
  plt.plot(x, predicted, 'bo')
  plt.show()

w = np.array([[1.88], [1.54]])
show(x, y, w)

In [0]:
alpha = 0.0001

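def train_step(x, y, w):
  # The gradient needs the same row of ones that h adds, so the bias and the
  # slope each get their own update instead of sharing one value.
  x_aug = np.concatenate((np.ones((1, x.shape[1])), x))
  delta_w = -np.dot(x_aug, np.transpose(y) - h(x, w))
  w = w - alpha * delta_w
  return w

def cost(x, y, w):
  error = np.transpose(y) - h(x, w)
  # .item() turns the (1, 1) result into a plain Python float.
  return (np.dot(np.transpose(error), error) / (2*y.shape[1])).item()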

def train(x, y, max_iters=1000, min_cost=0.1, w=None, verbose=0):
  if w is None:
    w = np.random.rand(2, 1)
  for i in range(max_iters):
    index = np.random.randint(0, x.shape[1])
    w = train_step(x[:, index:index+1], y[:, index:index+1], w)
    if cost(x, y, w) < min_cost:
      break
    if verbose:
      print('Iteration {}: W = '.format(i+1),np.transpose(w), 'Cost = ', cost(x, y, w))
  print("Training Done...")
  print("Cost: {}".format(cost(x, y, w)))
  print("w = ", w.T)
  return w

w = train(x, y, max_iters=1000, min_cost=10 ,verbose=1)
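
A common way to sanity-check a hand-derived gradient is to compare it with a finite-difference estimate. The sketch below does this for the half squared error of a single sample. The names `sample_loss`, `analytic_grad`, and `numeric_grad`, and the sample values, are hypothetical and chosen only for illustration.

```python
import numpy as np

def sample_loss(x, y, w):
  # Half squared error for one sample; x and y are scalars, w is (2, 1).
  pred = w[0, 0] + w[1, 0] * x
  return 0.5 * (y - pred) ** 2

def analytic_grad(x, y, w):
  # d(loss)/dw = -(y - pred) * [1, x]^T
  pred = w[0, 0] + w[1, 0] * x
  return -(y - pred) * np.array([[1.0], [x]])

def numeric_grad(x, y, w, eps=1e-6):
  # Central differences, perturbing one weight at a time.
  grad = np.zeros_like(w)
  for k in range(w.shape[0]):
    step = np.zeros_like(w)
    step[k, 0] = eps
    grad[k, 0] = (sample_loss(x, y, w + step) - sample_loss(x, y, w - step)) / (2 * eps)
  return grad

w_demo = np.array([[1.0], [0.5]])
g_analytic = analytic_grad(3.0, 4.0, w_demo)
g_numeric = numeric_grad(3.0, 4.0, w_demo)
print(g_analytic.ravel(), g_numeric.ravel())
```

The bias component of the gradient does not involve the input value, so the two weights should receive different updates.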

## Visualizing Performance
Let's see the result.

In [0]:
for i in range(x.shape[1]):
  print('Input: {}, Target: {}, Output: {}'.format(np.transpose(x[:,i:i+1]), np.transpose(y[:, i:i+1]), np.transpose(h(x[:, i:i+1], w))))

show(x, y, w)

# Linear Regression (Using Tensorflow 2)
Now let's use **tensorflow**. Some benefits of using tensorflow:


*   We can build more complex models without deriving gradients by hand.
*   It ships with powerful optimizers (such as Adam, used below).
*   It is widely used in deep learning.



In [0]:
X = tf.constant(x, dtype=tf.float32)
Y = tf.constant(y, dtype=tf.float32)

## Visualizing Data and Estimator Result

In [0]:
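def tf_h(x, w):
  # Defined here as well, because tf_show needs it before the training cell runs.
  x = tf.concat((tf.ones((1, x.shape[1])), x), 0)
  return tf.matmul(x, w, True)

def tf_show(x, y, w):
  predicted = tf.transpose(tf_h(x, w))
  plt.figure()
  plt.plot(np.asarray(x), np.asarray(y), 'rx')
  plt.plot(np.asarray(x), np.asarray(predicted), 'bo')
  plt.show()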

W = tf.Variable(np.random.rand(2, 1), dtype=tf.float32)
tf_show(X, Y, W)

In [0]:
optimizer = tf.optimizers.Adam()

def tf_h(x, w):
  x = tf.concat((tf.ones((1, x.shape[1])), x), 0)
  return tf.matmul(x, w, True)

def tf_loss(x, y, w):
  # y is a (1, m) row while tf_h returns an (m, 1) column, so transpose y first.
  c = tf.transpose(y) - tf_h(x, w)
  return tf.matmul(c, c, True)

def tf_cost(X, Y, W):
  a = tf.transpose(Y) - tf_h(X, W)
  return tf.matmul(a, a, True)/(2*X.shape[1])

def tf_train_step(x, y, w, verbose=0):
  with tf.GradientTape() as t:
    J = tf_loss(x, y, w)
    if verbose:
      print('Loss: ',J)
  w_grads = t.gradient(J, w)
  optimizer.apply_gradients(zip([w_grads], [w]))
  return w

def tf_train(X, Y, max_iters=1000, min_cost=0.01, W=None, verbose=0):
  if W is None:
    W = tf.Variable(np.random.rand(2, 1), dtype=tf.float32)
  for i in range(max_iters):
    index = np.random.randint(0, X.shape[1])  # sample from the X passed in, not the global x
    tf_train_step(X[:, index:index+1], Y[:, index:index+1], W)
    cost_value = tf_cost(X, Y, W).numpy()[0][0]
    if verbose:
      print('Cost: ', cost_value)
    if cost_value < min_cost:
      break
  print("Training Done...")
  print("Cost: {}".format(tf_cost(X, Y, W).numpy()[0][0]))
  print("W = ", W)
  return W
  
W = tf.Variable(np.random.rand(2, 1), dtype=tf.float32)
index = np.random.randint(0, x.shape[1])
print('Before: ', W.numpy().T)
tf_train_step(X[:, index:index+1], Y[:, index:index+1], W)
print('After: ', W.numpy().T)
print('Cost: ', tf_cost(X, Y, W).numpy())


In [0]:
W = tf_train(X, Y, W=W)

## Visualizing Performance


In [0]:
tf_show(X, Y, W)

# Linear Regression (Normal Equation)

Using the normal equation, where possible, makes our life a lot easier: it gives the least-squares weights in closed form, with no learning rate and no iterations.
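
For reference, with the inputs stacked as rows of a matrix $X$ (including a leading column of ones) and the targets in a column vector $y$, the normal equation is usually written as below. The symbols are assumed notation, and the formula applies when $X^\top X$ is invertible.

$$
w = (X^\top X)^{-1} X^\top y
$$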


In [0]:
print(x.shape, y.shape)
x1 = np.concatenate((np.ones((1, x.shape[1])), x))
print(x1.shape)
x1 = np.transpose(x1)
y1 = np.transpose(y)
# Solve (x1^T x1) w = x1^T y1 directly instead of forming the inverse explicitly.
w = np.linalg.solve(np.dot(x1.T, x1), np.dot(x1.T, y1))
print(w)
show(x, y, w)
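
As a cross-check, `np.linalg.lstsq` solves the same least-squares problem without forming an inverse, which tends to be more robust numerically when features are strongly correlated. The sketch below compares both on noise-free toy data, so the expected weights are known. The names `x_demo`, `y_demo`, and `A` are hypothetical.

```python
import numpy as np

# Noise-free toy data generated from y = 2 + 1.5 * x,
# so a correct fit should recover bias 2 and slope 1.5.
x_demo = np.linspace(10, 50, 15)
y_demo = 2 + 1.5 * x_demo

# Design matrix: a column of ones for the bias, then the feature.
A = np.column_stack((np.ones_like(x_demo), x_demo))

w_normal = np.linalg.inv(A.T @ A) @ A.T @ y_demo      # normal equation
w_lstsq, *_ = np.linalg.lstsq(A, y_demo, rcond=None)  # least-squares solver

print(w_normal, w_lstsq)
```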

# California Housing Dataset


In [0]:
ds = pd.read_csv('sample_data/california_housing_train.csv')

In [0]:
ds.head()

In [0]:
print(ds.columns)

In [0]:
ds.describe()

In [0]:
m = ds.mean()
s = ds.std()
print('Mean:\n', m)
print('Standard Deviation:\n', s)

## Method 1
In this method, we simply load the dataset and convert it to *tensors*. After that, life is easy: we just reuse the functions we implemented with tensorflow.

Now we separate the inputs from the targets (the last column).

In [0]:
ds_arr = np.transpose(np.array(ds))
y = ds_arr[-1:, :]
X = ds_arr[:-1, :]
print(X.shape, y.shape)

Next, we convert the data to *tensors*.

In [0]:
X = tf.constant(X, dtype=tf.float32)
y = tf.constant(y, dtype=tf.float32)
W = tf.Variable(np.random.rand(X.shape[0]+1, 1), dtype=tf.float32)

Let's do the training now.

In [0]:
W = tf_train(X, y, max_iters=10000, W=W, verbose=0)

Finally, we can see how well we did!

In [0]:
o = tf_h(X, W).numpy().T
for i in range(10):
  print('No: {}'.format(i+1), '\tTarget: {}'.format(ds_arr[-1, i]), '\tPredicted: {}'.format(o[0, i]))

## Method 2
In this method we do a little **preprocessing**. We first **normalize** the dataset: subtract each column's mean and divide by its standard deviation. This helps gradient descent converge faster.

The other steps are the same as in *Method 1*.

In [0]:
def normalize(ds):
  mean = np.array(ds.mean())
  mean = mean[np.newaxis, :]
  std = np.array(ds.std())
  std = std[np.newaxis, :]
  X = np.array(ds)
  X = (X-mean)/std
  return X

def invert(X, ds):
  mean = np.array(ds.mean())
  mean = mean[np.newaxis, :]
  std = np.array(ds.std())
  std = std[np.newaxis, :]
  X = (X*std) + mean
  return X
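
A quick way to convince yourself that normalizing and inverting undo each other is a round trip on a tiny array. The sketch below uses plain numpy on made-up numbers. Note that pandas `.std()` uses `ddof=1` by default while `np.std` uses `ddof=0`, so `ddof=1` is passed explicitly here to mirror the pandas behaviour.

```python
import numpy as np

# Made-up table: 4 samples (rows), 2 columns on very different scales.
data = np.array([[1.0, 1000.0],
                 [2.0, 3000.0],
                 [3.0, 5000.0],
                 [4.0, 7000.0]])

mean = data.mean(axis=0, keepdims=True)
std = data.std(axis=0, ddof=1, keepdims=True)  # ddof=1 matches the pandas default

normalized = (data - mean) / std     # each column now has mean 0 and std 1
restored = normalized * std + mean   # inverse transform

print(normalized.mean(axis=0))         # approximately 0 for each column
print(normalized.std(axis=0, ddof=1))  # approximately 1 for each column
print(np.allclose(restored, data))
```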

In [0]:
ds_arr = np.transpose(np.array(ds))
normalized_ds_arr = np.transpose(normalize(ds))
y = normalized_ds_arr[-1:, :]
X = normalized_ds_arr[:-1, :]

In [0]:
X = tf.constant(X, dtype=tf.float32)
y = tf.constant(y, dtype=tf.float32)
W = tf.Variable(np.random.rand(X.shape[0]+1, 1), dtype=tf.float32)

In [0]:
W = tf_train(X, y, max_iters=10000, W=W, verbose=0)

In [0]:
mean = np.array(ds.mean())
mean = mean[np.newaxis, :]
std = np.array(ds.std())
std = std[np.newaxis, :]
o = tf_h(X, W).numpy().T
# Map the predictions back to the original scale using the target column's mean and std.
o = (o*std[0,-1])+mean[0,-1]
for i in range(10):
  print('No: {}'.format(i+1), '\tTarget: {}'.format(ds_arr[-1, i]), '\tPredicted: {}'.format(o[0, i]))
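
Printing a few rows gives a feel for the fit, but a single number is easier to compare between Method 1 and Method 2. One option is the root-mean-square error, sketched below. The helper `rmse` and the sample values are hypothetical and not part of the notebook above.

```python
import numpy as np

def rmse(targets, predictions):
  # Root-mean-square error between two arrays of the same shape.
  targets = np.asarray(targets, dtype=float)
  predictions = np.asarray(predictions, dtype=float)
  return float(np.sqrt(np.mean((targets - predictions) ** 2)))

demo_targets = np.array([100.0, 200.0, 300.0])
demo_predictions = np.array([110.0, 190.0, 300.0])
demo_error = rmse(demo_targets, demo_predictions)
print(demo_error)
```

In this notebook it could be called as `rmse(ds_arr[-1, :], o[0, :])` after each method, assuming `o` holds predictions on the original scale.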