<a href="https://colab.research.google.com/github/ChristianaKiervin/MachineLearning_GirlScriptScholarship/blob/main/Perceptron_Training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Date Created**: May 23, 2021


**Author**: [Christiana Kiervin](https://www.linkedin.com/in/christianakiervin/)

**Instruction by**: [Shivani Shimpi](https://www.linkedin.com/in/shivani-shimpi-5113a8170/)


In [None]:
from google.colab import drive
drive.mount('/content/drive')

# Problem Statement

Train a model to predict a basic linear equation.

Equation:  `y = 10*x` 




## Data Creation

The dataset we will use to teach our model pairs each input `x` with its target `y`:


~~~
x = [0, 1, 2, 3, 4, 5,...]
y = [0, 10, 20, 30, 40, 50,...]
~~~


In [4]:
x = [i for i in range(21)]  # inputs 0..20, built with a list comprehension
print('x is: ', x)


y = [10 * i for i in x]  # targets follow the equation y = 10*x
print('y is: ', y)

x is:  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
y is:  [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200]


## Approach 1

Defining a function for the equation `y = 10*x` shows that ML is actually redundant for this problem.



In [5]:
def tempFunc(x):
  y = 10*x
  return y

for value in x:
  print(tempFunc(value))

0
10
20
30
40
50
60
70
80
90
100
110
120
130
140
150
160
170
180
190
200


This shows that learning the relationship between x and y isn't necessary for such a simple task. In ML, it is important to recognize which problems actually require a learned model and which are better solved with an ordinary function, because that saves time and resources.

## Approach 2

For the sake of learning how to create simple models, though, we will build an ML model that reproduces the table.

Let's start by splitting our dataset into four parts to represent the training and testing data and their labels.

As a refresher, since we are providing labels with our training data, this is an example of supervised learning.


- `xTrain` for training data
- `yTrain` for training labels
- `xTest` for testing data
- `yTest` for testing labels

In [6]:
print(f'This is x: {x}')
print(f'This is y: {y}')

This is x: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
This is y: [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200]


In [7]:
xTrain = x[:-5]  # slice: keep every item of x except the last 5

yTrain = y[:-5]  # same slice for the labels; to keep only the last 5 items, use [-5:] instead

xTest = x[-5:]  # test data
yTest = y[-5:]  # test labels

print(f'''
Training Data: 

xTrain : {xTrain}
yTrain : {yTrain}


Testing Data:

xTest : {xTest}
yTest : {yTest}

''')




Training Data: 

xTrain : [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
yTrain : [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150]


Testing Data:

xTest : [16, 17, 18, 19, 20]
yTest : [160, 170, 180, 190, 200]
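The negative-index slicing above can be wrapped in a small helper. This is only a sketch: `holdout_split` and its `n_test` parameter are hypothetical names, not part of the notebook. It computes the cut index explicitly, because `values[:-0]` would return an empty list rather than the whole list.

```python
def holdout_split(values, n_test):
    """Split a list into (train, test), holding out the last n_test items.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    cut = len(values) - n_test  # explicit index avoids the values[:-0] pitfall
    return values[:cut], values[cut:]


x = list(range(21))
y = [10 * i for i in x]

xTrain, xTest = holdout_split(x, 5)
yTrain, yTest = holdout_split(y, 5)

print(len(xTrain), len(xTest))  # 16 5
print(xTest, yTest)
```

Holding out the last items, as the notebook does, means the model is tested on inputs larger than any it saw during training.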




In [8]:
import tensorflow as tf
from tensorflow import keras

Next, build and compile the model.

In [9]:
# perceptron model

model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=1, input_shape=[1])  # units is the number of neurons; input_shape is the shape of one input sample (a single value here)
])

# Other optimizers: sgd, rmsprop, adamax, adagrad. 'mse' would also work as the loss,
# because this is a regression problem: we are learning to predict y given only x.
model.compile(optimizer='adam', loss='mae')
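
A `Dense` layer with one unit and one input computes `w*x + b`, where the weight `w` and bias `b` are what training adjusts. The plain-Python sketch below mirrors that computation. The function name `dense_unit` and the example values of `w` and `b` are assumptions chosen for illustration, not values taken from the trained model.

```python
def dense_unit(x, w, b):
    """Output of a single linear neuron: weight times input plus bias."""
    return w * x + b


# With the ideal parameters w = 10 and b = 0 the neuron reproduces y = 10*x.
print([dense_unit(x, w=10.0, b=0.0) for x in range(4)])  # [0.0, 10.0, 20.0, 30.0]

# An untrained neuron starts from other values (hypothetical ones here) and is far off.
print(dense_unit(10, w=0.5, b=0.0))  # 5.0
```

Training, then, amounts to moving `w` toward 10 and `b` toward 0.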

Error is the difference between the predicted labels and the actual labels. It measures how well the model is performing.


### Mean Absolute Error

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n}\left|y_{i}-\hat{y}_{i}\right|}{n}
$$

where,

- $y_{i}$ denotes the true label (the labels you have)
- $\hat{y}_{i}$ denotes the predicted label that the model outputs
- $n$ denotes the number of examples


---

This equation takes the difference between the real and predicted labels for every example, sums the absolute values of those differences, and divides by the number of examples to get the mean loss/cost. The absolute value is needed so that positive and negative errors do not cancel each other out.

The closer the cost is to 0, the better the model's predictions match the labels.
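
As a worked example of the formula, the sketch below computes MAE by hand for a few labels. The predictions are made up for illustration, and `mean_absolute_error` is a hypothetical helper name, not a Keras function.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all examples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [160, 170, 180]
y_pred = [150, 175, 180]  # made-up predictions: off by 10, 5 and 0

print(mean_absolute_error(y_true, y_pred))  # (10 + 5 + 0) / 3 = 5.0
```

Without the absolute value, the first two errors would partly cancel and the mean would understate how far off the predictions are.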


In [10]:
model.fit(x=xTrain, y=yTrain, validation_data=(xTest, yTest), epochs=5000)  # epochs is the number of full passes over the training data

Streaming output truncated to the last 5000 lines.
Epoch 2501/5000
Epoch 2502/5000
Epoch 2503/5000
...

<tensorflow.python.keras.callbacks.History at 0x7f3580289910>
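
To make the idea of an epoch concrete, here is a minimal plain-Python training loop for the same `w*x + b` model. It is a sketch under stated assumptions, not what `model.fit` does internally: it uses full-batch gradient descent on mean squared error instead of Adam on MAE, and the learning rate `lr = 0.005` is a hand-picked value that is stable for this data.

```python
xTrain = list(range(16))
yTrain = [10 * v for v in xTrain]

w, b = 0.0, 0.0   # start from an untrained neuron
lr = 0.005        # assumed learning rate, chosen by hand for this data
n = len(xTrain)

for epoch in range(5000):  # one epoch = one full pass over the training data
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xTrain, yTrain)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xTrain, yTrain)) / n
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 3), round(b, 3))  # close to 10 and 0
```

Each pass nudges `w` and `b` in the direction that lowers the loss, which is why more epochs generally bring the predictions closer to the labels.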

## Notes from Shivani:

`val_loss` denotes how far, on average, your model's predictions on the validation data were from the actual labels.

Say you give the input `x = 10` to your model. You expect the ideal output to be 100, because `y = 10*x = 10*10 = 100`.

If you use Approach 1 (an explicit function, which works on crisp / Boolean logic), you get exactly 100. If you use machine learning, which learns approximate weights from data, you will usually get a value close to 100 but rarely exactly 100.

In the current scenario, the prediction would be roughly `10*x ± val_loss = 10*10 ± 97.4830`.

The validation loss is 97.4830, so the model is still far from the target. Our intention is to reduce the loss and bring it as close to zero as we can.
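
The difference between the two approaches can be sketched in plain Python. The weight and bias below are hypothetical stand-ins for a partially trained model (they are not the values behind the 97.4830 figure). The point is only that the learned model lands near the exact answer, and that the validation loss is the average gap over the test points.

```python
def exact(x):
    """Approach 1: the explicit function."""
    return 10 * x


def learned(x, w=9.9, b=0.5):
    """Approach 2: a single neuron with hypothetical learned parameters."""
    return w * x + b


xTest = [16, 17, 18, 19, 20]
yTest = [exact(v) for v in xTest]

print(exact(10))    # 100
print(learned(10))  # near 100, but not exactly 100

# Validation loss (MAE): average absolute gap on the held-out points.
val_loss = sum(abs(y - learned(v)) for v, y in zip(xTest, yTest)) / len(xTest)
print(val_loss)     # about 1.3 for these hypothetical parameters
```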