In order to successfully complete this assignment you must do the required reading, watch the provided videos and complete all instructions.  The embedded survey form must be entirely filled out and submitted on or before **11:59pm on Monday November 2**.  Students must come to class the next day prepared to discuss the material covered in this assignment. 

# Pre-Class Assignment: Artificial Neural Networks

This entire Artificial Neural Networks module is from Neural Networks Demystified by @stephencwelch.  We have streamlined the content to better fit the format of the class. However, if you have questions or are just curious I highly recommend downloading everything from the following git repository.  It is a great reference to have:

    git clone https://github.com/stephencwelch/Neural-Networks-Demystified


### Goals for today's pre-class assignment 

1. [The architecture of Artificial Neural Networks](#The_architecture_of_Artificial_Neural_Networks)
2. [Data flow: forward propagation](#forward_propagation)
3. [Exploring A Neural Network](#Exploring_A_Neural_Network)
4. [The Universal Approximation Theorem](#The_Universal_Approximation_Theorem)
5. [Assignment wrap-up](#Assignment_wrap-up)

-----
<a name="The_architecture_of_Artificial_Neural_Networks"></a>

# 1. The architecture of Artificial Neural Networks

Watch the following video:

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo('bxe2T-V8XRs',width=640,height=360, cc_load_policy=True)

We will use the data from the video above:
$$X = \left[\begin{matrix} 3 & 5 \\ 5 & 1 \\ 10 & 2 \end{matrix}\right] \hspace{1cm} , \hspace{1cm}y = \left[ \begin{matrix} 75 \\ 82 \\ 93 \end{matrix}\right] $$


### Step 1: Initialize your inputs


&#9989; **<font color=red>DO THIS:</font>** Create two numpy arrays to store the values of the variables $X$ and $y$, as well as their normalized counterparts 
$X_{norm}$ and $y_{norm}$. Call these python variables ```X```, ```X_norm```, ```y```, and ```y_norm```

In [None]:
import numpy as np

In [None]:
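X = np.array(([3,5],[5,1],[10,2]),dtype=float)
y = np.array(([75],[82],[93]),dtype=float)
# Normalize element-wise so every value lies in [0, 1]
# (np.linalg.norm would collapse each array to a single scalar).
X_norm = X / np.amax(X, axis=0)  # divide each column by its maximum
y_norm = y / 100                 # assumes scores are out of a maximum of 100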
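As a sanity check on the normalization step, here is a minimal sketch that prints the scaled values for the data above. It assumes each input column is divided by its own maximum and the scores are divided by 100; the `_demo` names are hypothetical and chosen so they do not overwrite your own variables.

```python
import numpy as np

# Data from the video.
X_demo = np.array([[3, 5], [5, 1], [10, 2]], dtype=float)
y_demo = np.array([[75], [82], [93]], dtype=float)

# Assumption: scale each input column by its maximum, and the scores by 100.
X_norm_demo = X_demo / np.amax(X_demo, axis=0)
y_norm_demo = y_demo / 100

print(X_norm_demo)  # rows: [0.3, 1.0], [0.5, 0.2], [1.0, 0.4]
print(y_norm_demo)  # rows: [0.75], [0.82], [0.93]
```

Note that the normalized arrays keep the shapes of the originals, (3, 2) and (3, 1), which is what the matrix products in the later steps rely on.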

-----
<a name="forward_propagation"></a>

# 2. Data flow: forward propagation

Data in a neural network flows via a process called **forward propagation**. Watch the following video:

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo('UJwK6jAStmg',width=640,height=360, cc_load_policy=True)

&#9989; **<font color=red>QUESTION:</font>** How many neurons are in the input layer, the hidden layer, and the output layer of the neural network shown in the video? Modify the following variables to have their correct values.

In [None]:
# Put your answer here
inputLayerSize = 2
outputLayerSize = 1
hiddenLayerSize = 3

## Step 2:  Initialize random weights

&#9989; **<font color=red>DO THIS:</font>** Randomly initialize two numpy arrays ```W1``` and ```W2```, of the right dimensions, to store the weights (random values between 0 and 1) for the synapses between the input layer --> hidden layer and the hidden layer --> output layer. 

In [None]:
W1 = np.random.rand(2,3)
W2 = np.random.rand(3,1)

## Step 3: Multiply the normalized input matrix by $W^{(1)}$
$$Z^{(2)} = X  W^{(1)} $$ 
Here is the code using the numpy ```dot``` function. If you get an error, you may have initialized your variables with the wrong sizes. Make sure the second dimension of ```X_norm``` matches the first dimension of ```W1```:

In [None]:
Z2 = np.dot(X_norm, W1)
Z2
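The dimension rule for this product can be checked in isolation. The sketch below uses arrays of ones (hypothetical `_demo` names, not your data) to show the shape of the result and what happens when the inner dimensions do not match.

```python
import numpy as np

X_demo = np.ones((3, 2))    # 3 examples, 2 input features
W1_demo = np.ones((2, 3))   # 2 inputs feeding 3 hidden units

Z2_demo = np.dot(X_demo, W1_demo)
print(Z2_demo.shape)        # (3, 3): one row per example, one column per hidden unit

# If the inner dimensions disagree, np.dot raises a ValueError.
W_bad = np.ones((3, 1))
try:
    np.dot(X_demo, W_bad)
    mismatch_detected = False
except ValueError as err:
    mismatch_detected = True
    print("Shape error:", err)
```

If your own ```Z2``` is a single number or has an unexpected shape, checking ```X_norm.shape``` and ```W1.shape``` is a good first step.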

&#9989; **<font color=red>DO THIS:</font>** Implement and test the sigmoid function 

$$f(z) = \frac{1}{1 + e^{-z}} $$ 

The implemented  sigmoid function should take as input a numpy array and return a numpy array of the same dimension, with the function $f$ applied to each entry.

In [None]:
# your code here:
def sigmoid(z):
    # apply sigmoid activation function
    return 1/(1+np.exp(-1*z))

Test your sigmoid function using the following testing code:

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

In [None]:
testInput = np.arange(-6,6,0.01)
plt.plot(testInput, sigmoid(testInput), linewidth= 2)
plt.grid(1)
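Besides looking at the plot, a few numerical checks can catch common mistakes. The sketch below assumes a sigmoid defined as in the formula above (here under the hypothetical name `sigmoid_demo`) and checks three properties: the output has the same shape as the input, the value at zero is 0.5, and the curve is symmetric about that point.

```python
import numpy as np

def sigmoid_demo(z):
    """Element-wise logistic sigmoid, 1 / (1 + exp(-z))."""
    return 1 / (1 + np.exp(-z))

z = np.array([[-2.0, 0.0, 2.0]])
out = sigmoid_demo(z)

print(out.shape == z.shape)                     # shape is preserved
print(sigmoid_demo(0.0))                        # 0.5
print(np.allclose(sigmoid_demo(-z), 1 - out))   # f(-z) = 1 - f(z)
```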

## Step 4: Apply the sigmoid function to  $Z^{(2)}$
$$a^{(2)} = f({Z^{(2)}})$$ 
Here is the code to apply the sigmoid function to $Z^{(2)}$ and display the results:

In [None]:
a2 = sigmoid(Z2)
a2

## Step 5:  Multiply $a^{(2)}$ by $W^{(2)}$ to get $Z^{(3)}$
$$Z^{(3)} = a^{(2)}  W^{(2)} $$ 

In [None]:
Z3 = np.dot(a2, W2)
Z3

## Step 6: Apply the sigmoid function

&#9989; **<font color=red>DO THIS:</font>** Apply the sigmoid function again to $Z^{(3)}$ to produce $\hat{y}$
$$\hat{y} = f({Z^{(3)}})$$ 

In [None]:
# your code here:
yHat = sigmoid(Z3)

## Final Comparison
&#9989; **<font color=red>DO THIS:</font>** Now compare the estimation output ($\hat{y}$) to the actual output ```y_norm```.  

In [None]:
y_norm

In [None]:
yHat

Of course the results from forward propagation are poor; no surprises here, since the weights were chosen at random. That's what training a network does: the goal is to find a combination of weights so that the result of forward propagation fits the intended output data as closely as possible. 

We will be covering this topic in class.
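The forward-propagation steps in this section can be collected into a single function, sketched below. The function and variable names are hypothetical, and the squared-error `cost` is an assumption added only to put a number on how far $\hat{y}$ is from the target; it is one common choice, not necessarily the one we will use in class.

```python
import numpy as np

def sigmoid_demo(z):
    """Element-wise logistic sigmoid."""
    return 1 / (1 + np.exp(-z))

def forward(X, W1, W2):
    """Propagate inputs through one hidden layer and return yHat."""
    Z2 = np.dot(X, W1)       # input layer -> hidden layer
    a2 = sigmoid_demo(Z2)    # hidden-layer activations
    Z3 = np.dot(a2, W2)      # hidden layer -> output layer
    return sigmoid_demo(Z3)  # network estimate

def cost(y, yHat):
    """Assumed error measure: half the sum of squared differences."""
    return 0.5 * np.sum((y - yHat) ** 2)

# Normalized data (assumes column-max scaling for X and y / 100).
X_norm_demo = np.array([[0.3, 1.0], [0.5, 0.2], [1.0, 0.4]])
y_norm_demo = np.array([[0.75], [0.82], [0.93]])

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
W1_demo = rng.random((2, 3))
W2_demo = rng.random((3, 1))

yHat_demo = forward(X_norm_demo, W1_demo, W2_demo)
print(yHat_demo)
print(cost(y_norm_demo, yHat_demo))
```

Training amounts to adjusting ```W1``` and ```W2``` so that a quantity like this cost becomes as small as possible.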

----
<a name="Exploring_A_Neural_Network"></a>

# 3. Exploring A Neural Network

Please go to the following website : http://playground.tensorflow.org/

There, you'll have the opportunity to play with an actual neural network (e.g., choosing its architecture and the type of activation function) for classification purposes. 

---
<a name="The_Universal_Approximation_Theorem"></a>
# 4. The Universal Approximation Theorem

Please think about how the following theorem relates to the topic at hand.


> In the mathematical theory of artificial neural networks, the universal approximation theorem states that a feed-forward network with a single hidden layer containing a finite number of neurons (i.e., a multilayer perceptron), can approximate continuous functions on compact subsets of $R^n$, under mild assumptions on the activation function. The theorem thus states that simple neural networks can represent a wide variety of interesting functions when given appropriate parameters; however, it does not touch upon the algorithmic learnability of those parameters.

>One of the first versions of the theorem was proved by George Cybenko in 1989 for sigmoid activation functions.

>Kurt Hornik showed in 1991 that it is not the specific choice of the activation function, but rather the multilayer feedforward architecture itself which gives neural networks the potential of being universal approximators.
<p style="text-align: right;">From: Wikipedia - https://en.wikipedia.org/wiki/Universal_approximation_theorem</p>

### Some Math
Let $\varphi(\cdot)$ be a nonconstant, bounded, and monotonically-increasing continuous function. Let $I_m$ denote the $m$-dimensional unit hypercube $[0,1]^m$. The space of continuous functions on $I_m$ is denoted by $C(I_m)$. Then, given any function $f\in C(I_m)$ and $\epsilon>0$, there exists an integer $N$, real constants $v_i, b_i \in \mathbb{R}$ and real vectors $\mathbf{w}_i \in \mathbb{R}^m$, where $i = 1, \ldots, N$ such that if we define: 
$$F(\mathbf{x}) = \sum_{i=1}^N v_i \cdot\varphi \big(\langle\mathbf{w}_i , \mathbf{x}\rangle + b_i\big) $$
then 
$$|F(\mathbf{x}) - f(\mathbf{x})| < \epsilon $$

for all $\mathbf{x}\in I_m$. In other words, functions of the form $F(\mathbf{x})$ are dense in $C(I_m)$.
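To make the statement more concrete, here is a small numerical illustration for $m = 1$. It is not a proof and not how networks are trained: as a simplifying assumption, the $w_i$ and $b_i$ are drawn at random and only the $v_i$ are fitted by least squares. The target function $\sin(2\pi x)$ and all names are hypothetical choices for the demo.

```python
import numpy as np

def sigmoid_demo(z):
    """Element-wise logistic sigmoid, playing the role of phi."""
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)            # sample points in I_1 = [0, 1]
target = np.sin(2 * np.pi * x)        # continuous function to approximate

N_max = 50
w = rng.uniform(-20, 20, size=N_max)  # random slopes w_i
centers = rng.uniform(0, 1, size=N_max)
b = -w * centers                      # biases b_i place each transition in [0, 1]
Phi = sigmoid_demo(np.outer(x, w) + b)  # column i holds phi(w_i * x + b_i)

errors = {}
for N in (2, 10, 50):
    # Fit only the output weights v_i for the first N hidden units.
    v, *_ = np.linalg.lstsq(Phi[:, :N], target, rcond=None)
    F = Phi[:, :N] @ v
    errors[N] = np.sqrt(np.mean((F - target) ** 2))
    print(N, errors[N])
```

The printed root-mean-square error should shrink as $N$ grows, which is the behavior the theorem describes: with enough hidden units, $F$ can get as close to $f$ as we like.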

&#9989; **<font color=red>QUESTION:</font>** In simplest terms, why do we care about the Universal Approximation Theorem?

Put your answer to the above question here.

----
<a name="Assignment_wrap-up"></a>
# 5. Assignment wrap-up

Please fill out the form that appears when you run the code below.  **You must completely fill this out in order to receive credit for the assignment!**

[Direct Link to Google Form](https://cmse.msu.edu/cmse802-pc-survey)


If you have trouble with the embedded form, please make sure you log on with your MSU google account at [googleapps.msu.edu](https://googleapps.msu.edu) and then click on the direct link above.

&#9989; **<font color=red>Assignment-Specific QUESTION:</font>** In simplest terms, why do we care about the Universal Approximation Theorem?

Put your answer to the above question here

&#9989; **<font color=red>QUESTION:</font>**  Summarize what you did in this assignment.

Put your answer to the above question here

&#9989; **<font color=red>QUESTION:</font>**  What questions do you have, if any, about any of the topics discussed in this assignment after working through the jupyter notebook?

Put your answer to the above question here

&#9989; **<font color=red>QUESTION:</font>**  How well do you feel this assignment helped you to achieve a better understanding of the above mentioned topic(s)?

Put your answer to the above question here

&#9989; **<font color=red>QUESTION:</font>** What was the **most** challenging part of this assignment for you? 

Put your answer to the above question here

&#9989; **<font color=red>QUESTION:</font>** What was the **least** challenging part of this assignment for you? 

Put your answer to the above question here

&#9989; **<font color=red>QUESTION:</font>**  What kind of additional questions or support, if any, do you feel you need to have a better understanding of the content in this assignment?

Put your answer to the above question here

&#9989; **<font color=red>QUESTION:</font>**  Do you have any further questions or comments about this material, or anything else that's going on in class?

Put your answer to the above question here

&#9989; **<font color=red>QUESTION:</font>** Approximately how long did this pre-class assignment take?

Put your answer to the above question here

In [None]:
from IPython.display import HTML
HTML(
"""
<iframe 
	src="https://cmse.msu.edu/cmse802-pc-survey?embedded=true" 
	width="100%" 
	height="1200px" 
	frameborder="0" 
	marginheight="0" 
	marginwidth="0">
	Loading...
</iframe>
"""
)

---------
### Congratulations, we're done!

To get credit for this assignment you must fill out and submit the above Google Form on or before the assignment due date.

### Course Resources:


- [Website](https://msu-cmse-courses.github.io/cmse802-f20-student/)
- [ZOOM](https://msu.zoom.us/j/97272546850)
- [Syllabus](https://docs.google.com/document/d/e/2PACX-1vT9Wn11y0ECI_NAUl_2NA8V5jcD8dXKJkqUSWXjlawgqr2gU5hII3IsE0S8-CPd3W4xsWIlPAg2YW7D/pub)
- [Schedule](https://docs.google.com/spreadsheets/d/e/2PACX-1vQRAm1mqJPQs1YSLPT9_41ABtywSV2f3EWPon9szguL6wvWqWsqaIzqkuHkSk7sea8ZIcIgZmkKJvwu/pubhtml?gid=2142090757&single=true)



Written by Dirk Colbry, Michigan State University
<a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/">Creative Commons Attribution-NonCommercial 4.0 International License</a>.