# Before you start


**First, download `USA_Housing.csv` and `checker.py` from GitHub:** https://github.com/UCLA-LACC-21/M4-ML-AI
**Then upload the two files to this Colab session (click the Files button on the left of this notebook and drag the files onto it).**

There are three examples in this lab, which is designed to familiarize you with implementing basic ML algorithms in Python, along with several useful libraries. For two of the examples, you need to write one or two lines of code yourself, and the checker function will tell you whether your implementation is correct.

# Linear regression 

First, let's start with a very basic example and implement linear regression from scratch using the formula. Let X be random numbers that follow a normal distribution with mean = 1.5 and stddev = 2.5, and let the target y be computed using the equation y = 2 + 0.3 * X + res, where res is the residual term, which also follows a normal distribution (mean = 0, stddev = 0.5).

In [None]:
import checker
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt

# Generate 'random' data
np.random.seed(0)
X = 2.5 * np.random.randn(100) + 1.5   # Array of 100 values with mean = 1.5, stddev = 2.5
res = 0.5 * np.random.randn(100)       # Generate 100 residual terms
y = 2 + 0.3 * X + res                  # Actual values of Y

# Create pandas dataframe to store our X and y values
df = pd.DataFrame(
    {'X': X,
     'y': y}
)

# Show the first five rows of our dataframe
df.head()

Now we have generated our own data for linear regression. Let's calculate the alpha and beta terms using the equations. Here alpha is the beta0 term in the lecture slides, i.e. the intercept of the line, and beta is the slope.

**In this block there is one line of code that you need to implement yourself**, which calculates the alpha value based on the formula. Once you finish the implementation, run this block and the checker function will tell you whether you implemented it correctly.

In [None]:
# Calculate the mean of X and y
xmean = np.mean(X)
ymean = np.mean(y)

# Calculate the terms needed for the numerator and denominator of beta
df['xycov'] = (df['X'] - xmean) * (df['y'] - ymean)
df['xvar'] = (df['X'] - xmean)**2

# Calculate beta and alpha; no multi-dimensional matrix operations are needed here since we are working with 1d data.
beta = df['xycov'].sum() / df['xvar'].sum()
#######################################
#your code here to calculate alpha, remember alpha is the intercept of the line
alpha = 
########################################
print(f'alpha = {alpha}')
print(f'beta = {beta}')
checker.check_alpha(alpha)
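
As an independent sanity check on the from-scratch result, one option is to fit the same line with NumPy's built-in polynomial fit. The sketch below is only a cross-check, not part of the lab's checker: it regenerates the toy data with the same seed, and the names `X_check`, `y_check`, `slope_check` and `intercept_check` are made up for this illustration.

```python
import numpy as np

# Regenerate the same toy data as above (same seed, same order of random calls)
np.random.seed(0)
X_check = 2.5 * np.random.randn(100) + 1.5
res_check = 0.5 * np.random.randn(100)
y_check = 2 + 0.3 * X_check + res_check

# np.polyfit with degree 1 fits a straight line and returns [slope, intercept]
slope_check, intercept_check = np.polyfit(X_check, y_check, 1)
print(f'intercept (alpha) ~ {intercept_check:.3f}')
print(f'slope (beta) ~ {slope_check:.3f}')
```

If your own alpha and beta agree with these two numbers to several decimal places, the formulas were most likely applied correctly.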

Generate the predicted y values based on our alpha and beta values, and the input X.

**This one line of code essentially implements the equation of a straight line; try to implement it yourself.**

In [None]:
###############################
# Implement the one line of code that generates y (the prediction) from the calculated alpha and beta values and the input X
ypred = 
###############################
checker.check_ypred(ypred)

To visualize how our regressor performs, plot the regression line against the actual y values we generated. The blue line is our line of best fit, Yₑ = 2.003 + 0.323 X, and the red dots are the actual data. We can see from this graph that there is a positive linear relationship between X and y. Using our model, we can predict y for any value of X!

In [None]:
# Plot regression against actual data
plt.figure(figsize=(12, 6))
plt.plot(X, ypred)     # regression line
plt.plot(X, y, 'ro')   # scatter plot showing actual data
plt.title('Actual vs Predicted')
plt.xlabel('X')
plt.ylabel('y')

plt.show()

Now we have finished the first example! It's a toy example that helps you better understand what's happening.

Now let's try another example. Unlike the first one, this time we use a real-world dataset: USA Housing. The dataset contains several pieces of information about housing, including area income, house age, number of rooms, etc., and the goal is to use this information to predict the price of a house.
In the real world, data is usually not 1d like the data we generated ourselves, but has many dimensions; this USA Housing dataset has 6 dimensions, for example. In such cases, computing the beta values from scratch becomes more complex.
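
To give a rough idea of what the multi-dimensional case involves, here is a minimal sketch on hypothetical synthetic data (not the housing dataset). It assumes the usual setup in which a column of ones is prepended to the inputs so that the intercept becomes one more coefficient, and it uses NumPy's least-squares solver rather than hand-written matrix algebra. All names (`X_demo`, `true_coefs`, and so on) are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 200, 3

# Synthetic inputs and a known linear relationship plus a little noise
X_demo = rng.normal(size=(n_samples, n_features))
true_coefs = np.array([1.5, -2.0, 0.5])
true_intercept = 4.0
y_demo = true_intercept + X_demo @ true_coefs + 0.1 * rng.normal(size=n_samples)

# Prepend a column of ones so the intercept is learned as an extra coefficient
X_design = np.column_stack([np.ones(n_samples), X_demo])

# Solve the least-squares problem: minimize ||X_design @ w - y_demo||^2
solution, *_ = np.linalg.lstsq(X_design, y_demo, rcond=None)
intercept_hat, coefs_hat = solution[0], solution[1:]

print('estimated intercept:', intercept_hat)
print('estimated coefficients:', coefs_hat)
```

The estimates should land close to the values used to generate the data, but even this short version already needs a design matrix and a linear solver.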

Luckily, in the real world we usually don't implement everything from scratch, but use libraries (where others have implemented the functions and all you need to do is import and call them). In this example we will use the sklearn library, which contains many machine learning models and algorithms (if you want to do data-science-related work in the future, you will be using this library a lot!). Besides sklearn, there are also many useful libraries (e.g., seaborn) that help you visualize and understand the data better, so that you know which kinds of algorithms may work well. In this example we will walk you through these libraries to make you familiar with them. **You don't need to implement anything in this example.**

First, let's import these libraries and load the dataset from the CSV file you uploaded to Colab.

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
import os

df = pd.read_csv('./USA_Housing.csv')
df.head()

Use df.info() to print a summary of all data columns (each column is one dimension).
Why are there 7 columns here when I said this dataset contains 6 dimensions?
A: The Price column is y, which is the output of our prediction and should not be counted as an input dimension.

In [None]:
df.info()

df.describe() gives summary statistics for all the numeric columns.

In [None]:
df.describe()

Print out all the column names.

In [None]:
df.columns

seaborn (imported as sns) is a very powerful library for visualizing data; used well, it can help you understand the data much better.
Here we call pairplot to plot the pairwise relationships between the data columns. As you can see from the plot, some columns have a strong correlation while others seem uncorrelated.

In [None]:
sns.pairplot(df)

Now we compute the correlation between the numeric columns using df.corr(), and then plot the result as a heat map using sns.heatmap for better visualization. Comparing this plot with the previous one, you can see that the correlation numbers confirm the pairwise relationships.
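
To make the correlation numbers concrete, here is a small sketch on a hypothetical toy DataFrame (the column names and values are invented, not taken from the housing data). It shows a perfectly linear pair of columns, a weakly related pair, and a text column that has to be left out because correlation is only defined for numeric data.

```python
import pandas as pd

toy = pd.DataFrame({
    'rooms':   [3, 4, 5, 6, 7],
    'price':   [150, 200, 250, 300, 350],   # exactly linear in rooms
    'noise':   [5, 1, 4, 2, 3],             # only weakly related to rooms
    'address': ['a', 'b', 'c', 'd', 'e'],   # text column, no correlation defined
})

# Keep only the numeric columns, then compute pairwise Pearson correlations
corr = toy.select_dtypes(include='number').corr()
print(corr.round(2))
```

A value near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates little linear relationship. These are the same numbers the heat map annotates.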

In [None]:
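# Address is a text column, so restrict to numeric columns first (recent pandas versions raise an error otherwise)
sns.heatmap(df.select_dtypes(include='number').corr(), annot=True)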

Now let's start the actual regression implementation. First we need to define the input data and the target. Here the target y is the Price column. Only 5 of the 6 remaining columns are selected as input data; the Address column is excluded because it is text and contains no directly usable information about the housing price.

In [None]:
X = df[['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
       'Avg. Area Number of Bedrooms', 'Area Population']]
y = df['Price']

Now let's import the sklearn library. We then use train_test_split to automatically split the data we read from the CSV into a training set and a test set (the test set is used to verify our predictions and must not be seen during training).
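
As a small illustration of what the split does, the sketch below applies train_test_split to hypothetical toy arrays (`X_toy` and `y_toy` are invented names and data) with the same `test_size` and `random_state` values used in this lab.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples, 2 features, one target value per sample
X_toy = np.arange(20).reshape(10, 2)
y_toy = np.arange(10)

# test_size=0.4 holds out 40% of the rows; random_state makes the shuffle repeatable
X_tr, X_te, y_tr, y_te = train_test_split(X_toy, y_toy, test_size=0.4, random_state=101)

print(X_tr.shape, X_te.shape)        # sizes of the training and test inputs
print(sorted(y_tr), sorted(y_te))    # each sample ends up in exactly one of the two sets
```

The rows are shuffled before splitting, and inputs and targets stay aligned, so every test input still has its matching target.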

Then we implement the linear regression by calling the library. We define the regressor using `regressor = LinearRegression()` and use `regressor.fit(X_train,  y_train)` to compute the alpha and beta values from the training data.

Yes, it is that simple: with two lines of code, the regression model is already trained.



In [None]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.4, random_state=101)

regressor = LinearRegression()
regressor.fit(X_train,  y_train)

To predict, call `predictions = regressor.predict(X_test)` to get the predicted housing prices (ypred).

In [None]:
predictions = regressor.predict(X_test)
print(predictions)

Now let's compare the predicted results with the actual prices using a scatter plot.

In [None]:
plt.scatter(y_test,predictions)

Seems pretty good: the closer the points lie to a straight diagonal line, the better the predictions match the actual prices. This example shows that linear regression works on multi-dimensional data as well and, more importantly, that linear regression is good enough for some real-world datasets like this one.
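
A scatter plot gives a visual impression; one way to put a number on "pretty good" is to compute error metrics on the test set. The sketch below is an illustration on hypothetical synthetic data, not the housing results. It uses `r2_score` and `mean_absolute_error` from sklearn.metrics, and all variable names are invented for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data with a known linear relationship plus noise
rng = np.random.default_rng(0)
X_syn = rng.normal(size=(500, 3))
y_syn = 10 + X_syn @ np.array([3.0, -1.0, 2.0]) + 0.5 * rng.normal(size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.4, random_state=101)
model = LinearRegression().fit(X_tr, y_tr)
pred = model.predict(X_te)

# R^2 close to 1 means most of the variance is explained; MAE is the average absolute error
r2 = r2_score(y_te, pred)
mae = mean_absolute_error(y_te, pred)
print(f'R^2 = {r2:.3f}, MAE = {mae:.3f}')
```

For the housing model, the same two functions could be called on `y_test` and `predictions`.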

Also, we can see that libraries make our lives a lot easier: with sklearn, we can implement linear regression (or many other algorithms) and get predictions with 3 lines of code. Libraries are great, but use them with caution, as they can hurt you as well. Many people only know how to call libraries without understanding what happens inside at all, and that's what we should avoid.





# Perceptron

Now let's try an example with the perceptron, which will be a classification task. In this example I will show you how to implement a perceptron from scratch, which is more complex than linear regression. It is designed to help you better understand how a perceptron works; you don't have to know how to implement one from scratch yourself. Just trying to understand the code will already be useful.
Reference: https://pythonmachinelearning.pro/perceptrons-the-first-neural-networks/
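
Before reading the class, it may help to see the forward pass on its own. The sketch below uses hypothetical, hand-picked weights (`W_demo` is not a trained result) and assumes the same conventions as the class: a constant 1 is prepended to each input for the bias, and the step activation returns 1 when the weighted sum is greater than or equal to 0.

```python
import numpy as np

# Hypothetical, hand-picked weights: [bias, w1, w2]
W_demo = np.array([-0.5, 1.0, 1.0])

def step(z):
    """Step activation: 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
outputs = []
for x_raw in inputs:
    x = np.insert(x_raw, 0, 1)   # prepend the constant bias input 1
    z = W_demo.dot(x)            # weighted sum, bias included
    outputs.append(step(z))

print(outputs)
```

With these particular weights the perceptron outputs 1 whenever at least one input is 1. Training is the process of finding such weights automatically.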

In [None]:
class Perceptron(object):
    """Implements a perceptron network"""
    '''
    We’ll use object-oriented principles and create a class. In order to construct our perceptron,
    we need to know how many inputs there are to create our weight vector.
    The reason we add one to the input size is to include the bias in the weight vector.
    '''
    def __init__(self, input_size, lr=1, epochs=100):
        self.W = np.zeros(input_size+1) # the weight of the perceptron
        # add one for bias
        self.epochs = epochs #number of epochs for training
        self.lr = lr # this is the learning rate for training

    '''
    We’ll also need to implement our activation function. We can simply return 1 if the input is greater than or equal to 0 and 0 otherwise.
    '''
    def activation_fn(self, x):
        return 1 if x >= 0 else 0
    '''
    Besides, we need a function to run an input through the perceptron and return an output.
    Conventionally, this is called the prediction. The input vector x is expected to already contain the bias input (a leading 1), which fit and inference insert before calling this function.
    Then we can simply compute the inner product and apply the activation function.
    '''

    def predict(self, x):
        z = self.W.T.dot(x)
        a = self.activation_fn(z)
        return a

    '''
    The previous functions are enough for a forward pass through a perceptron model (given an input and the weights, the model produces an output).
    However, we still need a function to update the weights, i.e., train the network.
    With the update rule from lecture in mind, we can create a function to keep applying this update rule until our perceptron can correctly classify all of our inputs.
    We need to keep iterating through our training data until this happens; one epoch is when our perceptron has seen all of the training data once.
    Usually, we run our learning algorithm for multiple epochs.
    '''
    def fit(self, X, d):
        for _ in range(self.epochs):
            for i in range(d.shape[0]):
                x = np.insert(X[i], 0, 1)
                y = self.predict(x)
                ########################
                #your code here (should be just one line)
                #how to compute the error? remember here d is the target data for training (fitting)
                #you will know whether the implementation is correct or not after running the next block
                e = 
                ###########################
                #Then update the weights according to the error, note how learning rate is integrated into it
                self.W = self.W + self.lr * e * x 
    '''
    We also want a function to do the actual prediction (inference) which takes an input array.
    We can adapt the code from self.predict
    '''
    def inference(self, X):
        ypred = np.zeros(X.shape[0])
        for i in range(X.shape[0]):
            x = np.insert(X[i], 0, 1)
            ypred[i] = self.predict(x)
        return ypred
    '''
    Now that we have finished our own perceptron implementation, let's run a simple test to verify it.
    '''

In [None]:
#Let's define some simple input and target data (the targets below follow the logical OR of the two inputs)
X = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 0],
    [1, 1]
])
#y = np.array([0, 0, 0, 0, 1])  # alternative targets: logical AND
y = np.array([0, 1, 1, 1, 1])
#Then call our implemented perceptron class
perceptron = Perceptron(input_size=2)
perceptron.fit(X, y)

checker.check_perceptron(perceptron.W)

#print the weight of trained perceptron
print(perceptron.W)

#print the predicted y
ypred = perceptron.inference(X)
print(ypred)