# Lecture One
---

In this notebook we'll use a [dataset](https://www.kaggle.com/kennethjohn/housingprice) giving the living area, number of bedrooms, and price of $47$ houses from Portland, Oregon.

## Libraries

These are the libraries used in this notebook:

* [pandas](https://pandas.pydata.org/)
* [numpy](https://numpy.org/)
* [seaborn](https://seaborn.pydata.org/)
* [matplotlib](https://matplotlib.org/)
* [scikit-learn](https://scikit-learn.org/)

In [62]:
%matplotlib notebook

import pandas as pd
import numpy as np

# -- Plots
import seaborn as sb
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter

## Supervised Learning

In this section we'll explain what **supervised learning** means. Imagine that you want to predict the price of a house using only its living area as the input variable, i.e., given the area of a house, you'll estimate its price with some degree of confidence.

### Some notations

* $x^{(i)}$: the _i-th_ input variable, in this case the area of the _i-th_ house. Input variables are also called **features**.
* $y^{(i)}$: the variable that we want to predict, i.e., the output variable, also called the **target**. In our case the target is the price.
* $(x^{(i)},y^{(i)})$: the _i-th_ **training example**.
* $\{(x^{(i)},y^{(i)});\; i=1,\dots,m\}$: the **training dataset** of $m$ examples that we'll use to learn$^{(1)}$.
* $\mathcal{X}$: the space of input values. In this case $\mathcal{X} = \mathbb{R}$.
* $\mathcal{Y}$: the space of output values. In this case $\mathcal{Y} = \mathbb{R}$.
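
As a rough illustration of this notation, the sketch below stores the five rows that appear in this notebook's `dataset.head()` output as NumPy arrays. The names `x`, `y`, `m` and `training_set` are chosen here only to mirror the symbols above; they are not part of the original notebook.

```python
import numpy as np

# First five rows of the dataset (area in square feet, price in dollars),
# copied from the dataset.head() output in this notebook.
x = np.array([2104, 1600, 2400, 1416, 3000], dtype=float)  # inputs x^(i)
y = np.array([399900, 329900, 369000, 232000, 539900], dtype=float)  # targets y^(i)

m = x.shape[0]  # number of training examples in this small subset

# The i-th training example is the pair (x^(i), y^(i)).
# Python indexes from 0, so i = 1 corresponds to index 0.
training_set = list(zip(x.tolist(), y.tolist()))
first_example = training_set[0]
print(m, first_example)
```

Here both $\mathcal{X}$ and $\mathcal{Y}$ are $\mathbb{R}$, so each example is simply a pair of real numbers.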

### Definition

Thus, supervised learning consists of, given a training set, learning (finding) a function $h:\mathcal{X}\to \mathcal{Y}$ such that $h(x)$ is a _good_ predictor of the corresponding value of $y$. The function $h$ is called a **hypothesis**.

> The image illustrates the scenario where a training set _is fed_ to a learning algorithm, which learns a function $h(x)$ that tries to predict the value of $y$ given $x$.

<img src="https://raw.githubusercontent.com/antonioMoreira/Andrew-Ng-Lectures/master/img/1.png" width=300px>

> The image shows some _classes_ of problems separated by conditions on the target variable $y$.

<img src="https://raw.githubusercontent.com/antonioMoreira/Andrew-Ng-Lectures/master/img/2.png" >

---

$^{(1)}$The concept (definition) of **learning** in _machine learning_ is much more rigorous than this; for now we won't address this topic.

---

In the next cells we'll show some visualizations of the dataset to get an idea of how the data are distributed.

In [4]:
dataset = pd.read_csv("./datasets/ex1data2.txt")

print("Number of samples on dataset: ", dataset.shape[0])

dataset.head()

Number of samples on dataset:  47


|   | Area | Bedrooms | Price |
|---|------|----------|--------|
| 0 | 2104 | 3 | 399900 |
| 1 | 1600 | 3 | 329900 |
| 2 | 2400 | 3 | 369000 |
| 3 | 1416 | 2 | 232000 |
| 4 | 3000 | 4 | 539900 |


Plotting the target **price** as a function of the feature **area**, i.e.:

$$\text{price}(\text{area})$$

This gives a two-dimensional plot.

In [5]:
figure = plt.figure()
plt.xlabel("Area (ft²)", fontsize=20)
plt.ylabel("Price ($1000)", fontsize=20)
# Select columns by name; prices are shown in thousands of dollars
plt.scatter(dataset["Area"].values, dataset["Price"].values/1e3, marker='x')
plt.show()


Plotting **price** as a function of **area** and **number of bedrooms**, i.e.:

$$\text{price}(\text{area},\text{bedrooms})$$

This gives a three-dimensional plot.

In [6]:
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Select columns by name; prices are shown in thousands of dollars
xs = dataset["Area"].values
ys = dataset["Bedrooms"].values
zs = dataset["Price"].values/1e3
ax.scatter(xs, ys, zs, marker='x', color='r')

ax.set_xlabel('Area (ft²)', fontsize=18)
ax.set_ylabel('# bedrooms', fontsize=18)
ax.set_zlabel('Price ($1000)', fontsize=18)

plt.show()


The next figure shows how our variables (features and target) are distributed, each in its own histogram.

In [7]:
dataset.hist()
plt.show()


## Linear Regression

Now consider both features, **area** and **# of bedrooms**. The input space is then $\mathcal{X} = \mathbb{R}^2$ and:

* $x^{(i)}_1$: the **area** of the *i-th* house in the training set.
* $x^{(i)}_2$: the **# of bedrooms** of the *i-th* house in the training set.

To perform supervised learning, we must decide how to represent the functions/hypotheses $h$. As you'll see later, this choice is tied to the **bias-variance tradeoff** (an important concept in statistical learning theory).

> The goal of regression is to predict the value of one or more **continuous** _target_ variables **t** given the value of a D-dimensional vector **x** of _input_ variables.

As an initial choice, let's say we decide to approximate $y$ as a linear function of $x$:

$$h_{\theta}(x^{(i)}) = \theta_0 + \theta_1 x_1^{(i)} + \theta_2 x_2^{(i)}$$
$$x^{(i)} = (x_1^{(i)} \,\, x_2^{(i)})$$

* $\theta_j$: the **parameters**, also called **weights**, which parameterize the space of linear functions mapping $\mathcal{X}$ to $\mathcal{Y}$.
* $\theta_0$: the **intercept term**.
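
To make the linear hypothesis concrete, here is a minimal sketch that evaluates it for the first house in the dataset preview (2104 square feet, 3 bedrooms). The parameter values in `theta` are hypothetical, picked by hand for illustration only; nothing has been learned yet, and the function name `h` simply mirrors the symbol above.

```python
import numpy as np

def h(theta, x):
    """Linear hypothesis h_theta(x) = theta_0 + theta_1 * x_1 + theta_2 * x_2."""
    theta_0, theta_1, theta_2 = theta
    x_1, x_2 = x
    return theta_0 + theta_1 * x_1 + theta_2 * x_2

# Hypothetical parameters, picked by hand for illustration (not learned).
theta = np.array([50000.0, 150.0, 10000.0])

# First house from the dataset preview: 2104 square feet, 3 bedrooms.
x_first = np.array([2104.0, 3.0])

prediction = h(theta, x_first)
print(prediction)
```

Different choices of `theta` give different linear functions, which is what it means for the parameters to span the space of hypotheses.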

### Families of hyperplanes

Before continuing our study, let's take a moment to understand families of hyperplanes.

For example, we know that $$a x_1 + b x_2 + c = 0 \;\; \text{where} \; a,b,c \in \mathbb{R}$$ represents the family of all straight lines in $\mathbb{R}^2$ (with $a$ and $b$ not both zero).

In the cell below, we generate $10$ random parameter tuples for the equation above, $t_i = (a \; b \; c)$ with $i \in \{1,\dots,10\}$ and $a,b,c \in \mathbb{R}$, and then plot the corresponding lines.

In [8]:
# Generate straight lines by randomly varying the parameters a, b and c
# (!) Careful with the case b = 0 (vertical line): the division below is undefined
np.random.seed(42)

n_lines = 10  # number of random lines
lines = np.random.normal(size=(n_lines, 3))  # one (a, b, c) tuple per row
x = np.arange(-300, 300)/100

plt.figure()
for a, b, c in lines:
    # Solve a*x + b*y + c = 0 for y
    y = -(a*x + c)/b
    plt.plot(x, y)

plt.ylim(-1, 1)
plt.xlim(-1, 1)
plt.grid(True)
plt.show()


A straight line is the specific case of a **hyperplane** in $\mathbb{R}^2$, just as a plane is in $\mathbb{R}^3$:

$$ a x_1 + b x_2 + c x_3 + d = 0 \;\; \text{where}  \; a,b,c,d \in \mathbb{R} $$


A more formal definition of a **hyperplane**:

> A hyperplane of an $n$-dimensional space $\mathcal{V}^{(1)}$ is a subspace of dimension $n-1$ (or [codimension](https://en.wikipedia.org/wiki/Codimension) $1$ in $\mathcal{V}$).

$$0 = \theta_0 + \sum_{i=1}^{n} \theta_i x_i$$
$$\mathcal{H}(\theta_0, \theta_1,\dots,\theta_{n}) = \left\{x \in \mathcal{V} : \theta_0 + \sum_{i=1}^{n} \theta_i x_i = 0\right\}$$
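
As a small sketch of what this equation says, the code below evaluates $\theta_0 + \sum_i \theta_i x_i$ at a few points for one particular line in $\mathbb{R}^2$. The helper name `hyperplane_value` and the chosen line $x_1 + x_2 - 1 = 0$ are illustrative assumptions, not part of the original notebook. A point lies on the hyperplane when the value is zero, and the sign of the value tells which side of the hyperplane the point is on.

```python
import numpy as np

def hyperplane_value(theta_0, theta, x):
    """Evaluate theta_0 + sum_i theta_i * x_i at the point x."""
    return theta_0 + np.dot(theta, x)

# The line x_1 + x_2 - 1 = 0 in R^2, written as theta_0 = -1, theta = (1, 1).
theta_0 = -1.0
theta = np.array([1.0, 1.0])

on_line = hyperplane_value(theta_0, theta, np.array([0.5, 0.5]))  # lies on the line
above = hyperplane_value(theta_0, theta, np.array([2.0, 2.0]))    # positive side
below = hyperplane_value(theta_0, theta, np.array([0.0, 0.0]))    # negative side
print(on_line, above, below)
```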


**But why hyperplanes?**  
1. To model the data with a hyperplane, e.g., in linear **regression**.

![img_3.png](./img/3.png)

2. In **classification** problems, to separate the input space according to the output classes.  

![img_4.png](./img/4.png)

---
$^{(1)}$ $\mathcal{V}$ must be a _Euclidean space_.

### Continuing linear regression


To simplify our notation, let's adopt the convention $x_0 = 1$ (the constant that multiplies the **intercept term** $\theta_0$), so:
$$h(x^{(i)}) = \sum_{j=0}^{n} \theta_j x_j^{(i)} = \theta^T x^{(i)}$$
$$x^{(i)} = (1 \; x^{(i)}_1 \; \dots \; x^{(i)}_n) \;\;\; i = 1,\dots,m$$
$$\theta = (\theta_0 \; \theta_1 \; \dots \; \theta_n ) $$
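
The $x_0 = 1$ convention can be sketched in NumPy by prepending a column of ones to the feature matrix, so that the hypothesis for all examples becomes a single matrix-vector product. The feature values below are the five rows from the dataset preview; the name `X` and the values in `theta` are assumptions made for illustration (the parameters are hypothetical, not learned).

```python
import numpy as np

# Features (area, bedrooms) for the five houses in the dataset preview.
features = np.array([
    [2104.0, 3.0],
    [1600.0, 3.0],
    [2400.0, 3.0],
    [1416.0, 2.0],
    [3000.0, 4.0],
])
m, n = features.shape

# Prepend the constant feature x_0 = 1 to every example.
X = np.hstack([np.ones((m, 1)), features])  # shape (m, n + 1)

# Hypothetical parameters (theta_0, theta_1, theta_2), not learned.
theta = np.array([50000.0, 150.0, 10000.0])

# h(x) = theta^T x for all m examples at once.
predictions = X @ theta
print(predictions)
```

With this layout the intercept needs no special handling: it is just the weight of the constant column.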

Now, given a training dataset, how do we pick (or **learn**) the parameters $\theta$?


## LMS Algorithm

One way 

In [122]:
from sklearn.preprocessing import StandardScaler

# Keep only area and price (drop the number of bedrooms)
data = dataset[["Area", "Price"]].values

# Standardize both columns to zero mean and unit variance
scaler = StandardScaler()
data_t = scaler.fit_transform(data)

x = np.arange(-2,2,4/200) # candidate values for theta 0 (not used yet)
y = np.tan(np.arange(0,np.pi/2,(np.pi)/(2*200))) # candidate values for theta 1 (slopes)

# Residuals theta_1 * area - price: one row per candidate slope, one column per sample
z = np.transpose([y])*data_t[...,0] - data_t[...,1]
np.shape(z)

(200, 47)
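
One possible next step, sketched here under the assumption that the residuals above are meant to feed the usual least-squares cost $J(\theta_1) = \frac{1}{2m}\sum_{i=1}^{m}(\theta_1 x^{(i)} - y^{(i)})^2$, is to evaluate that cost for every candidate slope and pick the smallest. The sketch uses only the five rows from the dataset preview, standardizes them by hand, and fixes the intercept at zero; the names `slopes`, `residuals` and `cost` are illustrative and not part of the original notebook.

```python
import numpy as np

# Areas and prices from the dataset preview (five rows only).
area = np.array([2104.0, 1600.0, 2400.0, 1416.0, 3000.0])
price = np.array([399900.0, 329900.0, 369000.0, 232000.0, 539900.0])

# Standardize by hand (zero mean, unit variance), mirroring StandardScaler.
x = (area - area.mean()) / area.std()
y = (price - price.mean()) / price.std()
m = x.shape[0]

# Candidate slopes theta_1; the intercept is fixed at 0 for simplicity.
slopes = np.linspace(0.0, 2.0, 201)

# Residuals for every slope (rows) and example (columns), shape (201, m).
residuals = slopes[:, np.newaxis] * x - y

# Assumed least-squares cost: J(theta_1) = 1/(2m) * sum_i residual_i^2.
cost = (residuals ** 2).sum(axis=1) / (2 * m)

best_slope = slopes[np.argmin(cost)]
print(best_slope)
```

A grid search like this only works for one or two parameters; it is meant to show the shape of the cost, not to replace a proper learning algorithm.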

In [82]:
from mpl_toolkits import mplot3d  # needed by older matplotlib versions to enable projection='3d'

# Example surface z = cos(x² + y²) evaluated on a 30 x 30 grid
x = np.outer(np.linspace(-2, 2, 30), np.ones(30))
y = x.copy().T # transpose
z = np.cos(x ** 2 + y ** 2)

fig = plt.figure()
ax = plt.axes(projection='3d')

ax.plot_surface(x, y, z, cmap='viridis', edgecolor='none')
ax.set_title('Gradient descent')
plt.show()
