# Binary Variable Linear Regression

In [2]:
# Let's import some crucial libraries for building our Univariate Linear Regression Model
import numpy as np                # for some crucial (linear algebra) computation
import pandas as pd               # for data structures and data analysis
import matplotlib.pyplot as plt  # for plotting our results

In [3]:
# For our data, we are going to use our Kaggle Dataset :
data = pd.read_csv('restaurant-univariate.csv')
# For now, we will use 'pandas' to do the reading of data for us.
# In the future, we will have a coding exercise where you should be familiar with
# creating a csv reader and writer from scratch

## Function for Univariate Linear Regression

In Grade School, we learned that a Univariate Linear Regression is written as :

$y = mx + b$ where :
1. $y = $ the output
2. $m$ (the slope) and $b$ (the intercept) are 'weights'
3. $x$ is the feature

#### In Machine Learning, we write the same function as :

$h(x)=\theta_1 * x_1 + \theta_2 * x_2$ where :
1. $\theta_1$ and $\theta_2$ are 'weights'
2. $x_1$ is the default feature and equal to $1$
3. $x_2$ is the single feature for the univariate linear regression.

#### We can re-write this same function into a vectorized format
$h_\theta(x) = \theta_1 * x_1 + \theta_2 * x_2$

$h_\theta(X) = \theta^{T}X$ where :
1. $\theta^{T}$ is the transposed weight vector (a row vector) : $\begin{bmatrix}\theta_1 & \theta_2 \end{bmatrix}$
2. $X$ is the feature vector (a column vector) : $\begin{bmatrix}x_1 \\ x_2 \end{bmatrix}$
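
As a quick check of the vectorized form, the sketch below computes the hypothesis both ways with NumPy. The weight and feature values are made up for illustration and are not taken from the restaurant dataset.

```python
import numpy as np

# Illustrative values only (assumed, not from the dataset).
theta = np.array([2.0, 0.5])   # [theta_1, theta_2]
x = np.array([1.0, 10.0])      # [x_1, x_2], with the default feature x_1 fixed to 1

# Expanded form: theta_1 * x_1 + theta_2 * x_2
h_expanded = theta[0] * x[0] + theta[1] * x[1]

# Vectorized form: theta^T X, which for 1-D arrays is a dot product
h_vectorized = float(np.dot(theta, x))

print(h_expanded, h_vectorized)  # both forms give the same number
```

Both expressions evaluate the same sum; the dot product simply hides the bookkeeping, which is why it scales to more features without changing the code.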

## Questions :

$Q_1$ : During the training phase, which of the following are we trying to tweak?

a.) the weights, b.) the features, c.) the output

$Q_2$ : What is the difference between 'dimensions' and 'features'? (This is a trick question)

$Q_3$ : What is 'Regression'?

$Q_4$ : In a regression line, when feature $x$ increases or decreases by one standard deviation, by how much does the output $y$ change?

In [4]:
# Display the 'head' of the data so you can inspect the column keys and values.
# --> INSERT BELOW

# Let's first collect the values for x_1 and x_2 and y.
# Make sure to specify the keys of the data!
# By default, all the weights in Theta should be 0!
# UNCOMMENT WHEN READY!
# X = 
# Y = 
# Theta = 

## Quest for the General/Multivariate Linear Regression Function

Now, try to implement a linear regression function for 2 features. After that, try a more general one that works for 2 or more features.

When you are ready, read on.

>The General Multivariate Linear Regression : $h_\theta(X) = \theta^{T}X$ where :
1. $\theta^{T}$ is the transposed weight vector (a row vector) : $\begin{bmatrix}\theta_1 & \theta_2 & \theta_3 & ... & \theta_n \end{bmatrix}$ where n is the number of features
2. $X$ is the feature vector (a column vector) : $\begin{bmatrix}x_1 \\ x_2 \\x_3 \\ ... \\x_n \end{bmatrix}$ where n is the number of features.

#### Wait...what?? Why does the general linear regression function look the same??
Well yes, it's the same function, if you think about it. All we really need to do is just add more features and weights into both $\theta$ and $X$.

In other words, this general linear regression function is simply a compact form of the linear regression function you learned in grade school, extended to any number of features.

Really, it's just the power of Linear Algebra...

In [5]:
# HERE, implement the hypothesis we have come up with
# h = Theta * X
# 1. Make sure 'Theta' is a transpose vector and X is a regular vector
# 2. Make sure you round the output to the second decimal position
# example  : y = 1.27898 -> y = 1.28
# @param Theta : the weights. This is a numpy array (not yet transposed)
# @param X : the features. This is a numpy array
# @return prediction of price, in float
#-------------------------
# UNCOMMENT WHEN READY
#-------------------------
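def hypothesis(Theta, X):
    # Transposing a 1-D numpy array is a no-op, but it mirrors the theta^T X notation
    tTheta = np.transpose(Theta)
    # Dot product: multiply each weight by its feature and sum the results
    result = np.dot(tTheta, X)
    return round(float(result), 2)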

# TESTS (Not used for grading)---------------
tTheta = np.array([0.5, 1.7, -2.1])
tX = np.array([1, 121.3, 14.57])
assert(hypothesis(tTheta, tX) == 176.11)
tTheta = np.array([-321.3, 4.3, 7.9])
tX = np.array([1, -49.89, -15.84])
assert(hypothesis(tTheta, tX) == -660.96)
tTheta = np.array([213, 23, 3])
tX = np.array([1, -23, 51])
assert(hypothesis(tTheta, tX) == -163.0)
# Write more of your tests to make sure the hypothesis works!

## Training : The Quest for the Ideal 'Weights'
As mentioned earlier, the goal of 'training' in machine learning is to find values for the weights such that the function, given the features of an example it has never seen, predicts the outcome accurately.

In training, we use the 'cost function' to measure how well the predictor line fits the data we are seeing (known as 'observations' or the 'training set examples').

### The Cost Function
Essentially, the cost of the regression line is built from the vertical distance between each observation (i.e. each training example) and the regression line.

The regression line with the least amount of error is the one we call the 'best fit' line or the final regression line at the end of training.

#### So What is the Cost Function?
As we said, the vertical distance from each point to the line is what makes up the 'cost' of the regression line.

So, we need :

- Summation to compute the total distance by adding up every point's distance to the regression line
- For each value in the summation, an equation to compute the 'distance'

Let's see the function for a univariate linear regression line.

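$J(\theta_1, \theta_2) = \frac{1}{2m} \sum_{i=1}^{m}(h_{\theta_1,\theta_2}(x_i) - y_i)^{2}$ where $m$ is the number of training examples, and $x_i$ and $y_i$ are the feature value and the actual output of the $i$-th training example.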
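
To make the formula above concrete, here is a minimal sketch that evaluates it with an explicit loop for the one-feature case. The function name `univariate_cost`, its parameter names, and the three data points are assumptions made up for illustration. It deliberately stays with a single feature, so the general vectorized version remains the exercise.

```python
import numpy as np

def univariate_cost(intercept, slope, x, y):
    """Half the mean of the squared vertical distances from the line to the points."""
    m = len(y)                                  # number of training examples
    total = 0.0
    for x_i, y_i in zip(x, y):
        prediction = intercept + slope * x_i    # value of the line at x_i
        total += (prediction - y_i) ** 2        # squared vertical distance
    return total / (2 * m)

# Made-up observations that lie exactly on the line y = 2x.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

perfect = univariate_cost(0.0, 2.0, x, y)   # candidate line y = 2x
flat = univariate_cost(0.0, 0.0, x, y)      # candidate line y = 0

print(perfect, flat)  # the line that fits better has the lower cost
```

Comparing candidate lines by this number is what training does: the line with the smaller cost is the better fit.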

The factor of $\frac{1}{2}$ is there for convenience: it cancels the 2 that appears when we later take the partial derivative of the squared term.
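
As a sketch of why the '2' helps, take the partial derivative of the cost with respect to any single weight $\theta_j$, writing $m$ for the number of training examples. By the chain rule, each squared term contributes a factor of 2:

$\frac{\partial}{\partial \theta_j}(h_\theta(x_i) - y_i)^{2} = 2\,(h_\theta(x_i) - y_i)\,\frac{\partial h_\theta(x_i)}{\partial \theta_j}$

Substituting this into the cost function, the 2 cancels against the $\frac{1}{2m}$ in front:

$\frac{\partial J}{\partial \theta_j} = \frac{1}{2m}\sum_{i=1}^{m} 2\,(h_\theta(x_i) - y_i)\,\frac{\partial h_\theta(x_i)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}(h_\theta(x_i) - y_i)\,\frac{\partial h_\theta(x_i)}{\partial \theta_j}$

Because $h_\theta$ is linear in the weights, $\frac{\partial h_\theta(x_i)}{\partial \theta_j}$ is simply the feature value that $\theta_j$ multiplies (which is $1$ for the weight on the default feature).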

Now, try on your own to construct a general cost function for a multivariate linear regression line. Use your knowledge of Linear Algebra to make this work.

In [None]:
# Implement the Cost Function Here.
# Your Cost Function will take a numpy array of the weights 'Theta',
# a numpy array of features, and a numpy array of actual outputs associated with the features
# and output a scalar value, in float.
#
# NOTE : the size of Y will give you the # of training examples in your training set
#
# WHEN YOU ARE READY, UNCOMMENT THE FUNCTION BELOW.
# def cost(Theta, X, Y):