# Intro to Neural Networks Assignment

## Define the Following:
You can add images, diagrams, or whatever else you need to ensure that you understand the concepts below.

### Input Layer: 
The input layer (also known as the visible layer) receives the input from our dataset. It is the only layer exposed to the data and interacts with it directly.
### Hidden Layer:
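The hidden layers sit between the input layer and the output layer. They are not exposed to the data directly and can only be reached through the input layer. This is where the weighted sums and activation functions are applied.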
### Output Layer:
The output layer is the final layer. Its result is passed through an activation function that transforms it into a format that makes sense for the context of our problem.
### Neuron:
A neuron is a single node within our neural network. By analogy with a biological neuron, the inputs on the left play the role of the dendrites, the weights and bias in the center play the role of the axon, and the output on the right plays the role of the axon terminal.
### Weight:
A weight is a multiplier applied to an input before the activation function is applied. The weighted inputs are summed, and a bias (a constant value) is often added to that sum.
### Activation Function:
The activation function is found in each node and decides how much signal is passed on to the next layer. This is especially important in neural networks because the weighted input has to pass through the activation function to make it to the next step.
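
As a rough illustration, here are three common activation functions applied to the same weighted sums. The function names and the sample values in `z` are assumptions chosen for this sketch, not part of the assignment.

```python
import numpy as np

def step(z):
    # Heaviside step: outputs 1 when the weighted sum is non-negative, else 0
    return np.where(z >= 0, 1.0, 0.0)

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1 / (1 + np.exp(-z))

def relu(z):
    # Passes positive values through unchanged and zeroes out negatives
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])  # example weighted sums (made up)
print(step(z))     # [0. 1. 1.]
print(sigmoid(z))  # approximately [0.119 0.5 0.881]
print(relu(z))     # [0. 0. 2.]
```

The step function makes a hard yes/no decision, while the sigmoid and ReLU pass along a graded amount of signal.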
### Node Map:
A node map can be thought of as a visual diagram of the architecture or 'topology' of our neural network. It's like a flow chart that shows the path from inputs to outputs. 
### Perceptron:
The first and simplest kind of neural network, represented by a single node (neuron) with nothing else.
It can take any number of inputs and produce a single output. It takes each of the input values, multiplies each by a weight, sums these products, and passes the sum through an activation function. The result is the final value.
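
The steps above can be sketched as a small function. This is only an illustration: the name `perceptron_output`, the step activation, and the hand-picked weights and bias are assumptions, and the bias term follows the description under Weight.

```python
import numpy as np

def perceptron_output(inputs, weights, bias):
    # Multiply each input by its weight, sum the products, then add the bias
    weighted_sum = np.dot(inputs, weights) + bias
    # Step activation: fire (1) if the sum is non-negative, otherwise 0
    return 1 if weighted_sum >= 0 else 0

# Hand-picked (assumed) values that happen to reproduce a NAND gate
weights = np.array([-2.0, -2.0])
bias = 3.0

for x1, x2 in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(x1, x2, perceptron_output(np.array([x1, x2]), weights, bias))
```

With these values the weighted sums are 3, 1, 1, and -1, so only the input (1, 1) produces 0.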


## Inputs -> Outputs

### Explain the flow of information through a neural network from inputs to outputs. Be sure to include: inputs, weights, bias, and activation functions. How does it all flow from beginning to end?

#### Your Answer Here

We start with inputs, which are the values from our dataset. At each node, every input is multiplied by a weight, the products are summed, and a bias is added. That sum is passed through an activation function, which determines how much signal is passed on to the next layer. This repeats layer by layer until the output layer, whose activated values are our predictions for whatever we are trying to predict. The weights and biases are the values that get adjusted during training so that the outputs line up with the labels.
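
A minimal sketch of that flow for a network with one hidden layer is shown here. The layer sizes, the random weights, the zero biases, and the use of a sigmoid at both layers are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)

# Assumed shapes: 4 examples, 2 input features, 3 hidden nodes, 1 output node
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
W_hidden = rng.normal(size=(2, 3))   # weights from inputs to hidden layer
b_hidden = np.zeros(3)               # one bias per hidden node
W_output = rng.normal(size=(3, 1))   # weights from hidden layer to output
b_output = np.zeros(1)               # bias for the output node

# Inputs -> weighted sum plus bias -> activation, repeated for each layer
hidden = sigmoid(X @ W_hidden + b_hidden)
output = sigmoid(hidden @ W_output + b_output)

print(hidden.shape)  # (4, 3)
print(output.shape)  # (4, 1)
```

Each row of `output` is the network's untrained prediction for the matching row of `X`.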

## Write your own perceptron code that can correctly classify a NAND gate. 

| x1 | x2 | y |
|----|----|---|
| 0  | 0  | 1 |
| 1  | 0  | 1 |
| 0  | 1  | 1 |
| 1  | 1  | 0 |

In [1]:
##### Your Code Here #####

import numpy as np
np.random.seed(1)

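# Each row is one training example. Columns are x1, x2, and a third input
# that here repeats y from the table above (a bias input would be a constant 1).
inputs = np.array([
    [0,0,1],
    [1,0,1],
    [0,1,1],
    [1,1,0]
])

# Target values, one per row. These follow the pattern 0, 1, 1, 0;
# the NAND column in the table is 1, 1, 1, 0.
correct_outputs = [[0],[1],[1],[0]]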

In [2]:
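def sigmoid(x):
    # Squash any real number into the range (0, 1)
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Slope of the sigmoid at x: sigmoid(x) * (1 - sigmoid(x))
    return sigmoid(x) * (1 - sigmoid(x))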
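
As a sanity check, the sigmoid derivative can be compared with a finite-difference estimate. The sketch redefines both functions so it runs on its own. The sample points and step size `h` are assumptions. Note that `sigmoid_derivative` written this way expects the pre-activation value (the weighted sum), not an already-activated output.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # x is the pre-activation value (the weighted sum)
    s = sigmoid(x)
    return s * (1 - s)

x = np.array([-2.0, 0.0, 2.0])
h = 1e-5
# Central finite difference: (f(x + h) - f(x - h)) / (2h)
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)

print(np.allclose(numeric, sigmoid_derivative(x)))  # True
print(sigmoid_derivative(0.0))                      # 0.25
```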

In [3]:
weights = 2 * np.random.random((3,1)) - 1

weights

array([[-0.16595599],
       [ 0.44064899],
       [-0.99977125]])

In [4]:
weighted_sum = np.dot(inputs,weights)
weighted_sum

array([[-0.99977125],
       [-1.16572724],
       [-0.55912226],
       [ 0.274693  ]])

In [5]:
activated_outputs = sigmoid(weighted_sum)
activated_outputs

array([[0.2689864 ],
       [0.23762817],
       [0.36375058],
       [0.56824466]])

In [6]:
error = correct_outputs - activated_outputs
error

array([[-0.2689864 ],
       [ 0.76237183],
       [ 0.63624942],
       [-0.56824466]])

In [7]:
adjustments = error * sigmoid_derivative(activated_outputs)
adjustments

array([[-0.06604473],
       [ 0.18792752],
       [ 0.15391469],
       [-0.13118329]])

In [8]:
# Update weights
weights += np.dot(inputs.T, adjustments)
weights

array([[-0.10921176],
       [ 0.46338038],
       [-0.72397378]])

In [9]:
for iteration in range(10000):
    weighted_sum = np.dot(inputs,weights)
    
    # activate!
    activated_outputs = sigmoid(weighted_sum)
    
    # calculate the error
    error = correct_outputs - activated_outputs
    
    # calculate weight adjustments - watch linked videos for more intuition!
    # (this loop scales the error by sigmoid(...); cell In [7] used sigmoid_derivative(...))
    adjustments = error * sigmoid(activated_outputs)
    
    # Update weights
    weights += np.dot(inputs.T, adjustments)

print('Weights after training')
print(weights)

print('Output after training')
print(activated_outputs)


Weights after training
[[-0.22491407]
 [-0.22491407]
 [ 0.82736567]]
Output after training
[[0.69579763]
 [0.64621699]
 [0.64621699]
 [0.38940163]]
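
A minimal sketch of the same training loop for the NAND table makes three assumptions: the third input column is a constant 1, so the third weight acts as the bias; the targets are the y column of the table; and the error is scaled by the sigmoid slope written as `a * (1 - a)` for an activated output `a`. The `nand_` names are introduced here only to avoid overwriting the variables above.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

np.random.seed(1)

# Columns: x1, x2, and a constant 1 so the third weight acts as the bias
nand_inputs = np.array([
    [0, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
])
# Targets taken from the y column of the NAND table
nand_targets = np.array([[1], [1], [1], [0]])

nand_weights = 2 * np.random.random((3, 1)) - 1

for iteration in range(10000):
    weighted_sum = np.dot(nand_inputs, nand_weights)
    activated = sigmoid(weighted_sum)
    error = nand_targets - activated
    # Sigmoid slope written in terms of the activated output: a * (1 - a)
    adjustments = error * activated * (1 - activated)
    nand_weights += np.dot(nand_inputs.T, adjustments)

# Threshold the sigmoid outputs at 0.5 to get 0/1 gate outputs
predictions = (activated > 0.5).astype(int)
print(predictions.ravel())  # [1 1 1 0]
```

Thresholding at 0.5 turns the continuous sigmoid values into the 0/1 outputs of the gate.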


## Implement your own Perceptron Class and use it to classify a binary dataset like: 
- [The Pima Indians Diabetes dataset](https://raw.githubusercontent.com/ryanleeallred/datasets/master/diabetes.csv) 
- [Titanic](https://raw.githubusercontent.com/ryanleeallred/datasets/master/titanic.csv)
- [A two-class version of the Iris dataset](https://raw.githubusercontent.com/ryanleeallred/datasets/master/Iris.csv)

You may need to search for others' implementations in order to get inspiration for your own. There are *lots* of perceptron implementations on the internet with varying levels of sophistication and complexity. Whatever your approach, make sure you understand **every** line of your implementation and what its purpose is.

In [0]:
##### Your Code Here #####
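
One possible starting point is a minimal class built on the classic perceptron learning rule with a step activation. The class and method names, the learning rate, the epoch count, and the toy data are all assumptions made for this sketch. For one of the datasets listed above, `X` would hold the numeric feature columns and `y` the 0/1 labels.

```python
import numpy as np

class Perceptron:
    """Single-neuron binary classifier trained with the perceptron learning rule."""

    def __init__(self, learning_rate=0.1, n_epochs=50):
        self.learning_rate = learning_rate
        self.n_epochs = n_epochs
        self.weights = None
        self.bias = 0.0

    def predict(self, X):
        # Weighted sum plus bias, followed by a step activation
        weighted_sum = np.dot(X, self.weights) + self.bias
        return np.where(weighted_sum >= 0, 1, 0)

    def fit(self, X, y):
        self.weights = np.zeros(X.shape[1])
        self.bias = 0.0
        for _ in range(self.n_epochs):
            for xi, target in zip(X, y):
                # update is 0 when the prediction is right,
                # +/- learning_rate when it is wrong
                update = self.learning_rate * (target - self.predict(xi))
                self.weights += update * xi
                self.bias += update
        return self

# Toy, linearly separable data (made up): label is 1 when x1 + x2 > 1
X = np.array([[0.0, 0.0], [0.2, 0.5], [0.4, 0.3],
              [1.0, 0.8], [0.9, 0.9], [0.7, 1.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = Perceptron().fit(X, y)
print(model.predict(X))  # [0 0 0 1 1 1]
```

The learning rule only changes the weights when a prediction is wrong, so on linearly separable data like this toy set the updates stop once every example is classified correctly.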

## Stretch Goals:

- Research "backpropagation" to learn how weights get updated in neural networks (tomorrow's lecture). 
- Implement a multi-layer perceptron. (for non-linearly separable classes)
- Try and implement your own backpropagation algorithm.
- What are the pros and cons of the different activation functions? How should you decide between them for the different layers of a neural network?