In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from data_as_geometry_src import *

***
# Notebook: Data as geometry
*Introduction to intelligent systems - the toolbox* 

**Objectives**: The objectives of this notebook are to understand (i) ..., (ii) ... and (iii) ...

***

## Introduction
In this notebook we will investigate how the following simple neural network works:

<img src="src/figures/network.PNG" width=500 height=500>

The network is called a feedforward network because information flows only from left to right. It has two input neurons, three neurons in the hidden layer and a single output unit.
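
As a rough illustration of the architecture described above, the following NumPy sketch pushes a few 2D points through a 2-3-1 network. The function name `forward`, the weight layout (one row per hidden neuron, ordered as x-weight, y-weight, bias) and the thresholded linear output unit are assumptions made for this example; they are not taken from `data_as_geometry_src`.

```python
import numpy as np

def forward(points, W_hidden, w_out):
    """Forward pass of a 2-3-1 network (illustrative sketch).

    points   : (n, 2) array of 2D inputs
    W_hidden : (3, 3) array, row j = [weight for x, weight for y, bias] of hidden neuron j (assumed layout)
    w_out    : (4,) array, [weight for h1, weight for h2, weight for h3, bias] of the output unit (assumed layout)
    """
    points = np.asarray(points, dtype=float)
    ones = np.ones((points.shape[0], 1))
    inputs = np.hstack([points, ones])           # append a constant 1 for the bias term
    hidden = np.tanh(inputs @ W_hidden.T)        # (n, 3) hidden activations
    output = np.hstack([hidden, ones]) @ w_out   # (n,) linear output before thresholding
    return hidden, output

# Hypothetical weights, chosen only to make the example easy to follow
W_hidden = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [1.0, 1.0, -1.0]])
w_out = np.array([1.0, 1.0, 1.0, 0.0])
points = np.array([[0.0, 0.0], [1.0, 0.0]])

hidden, output = forward(points, W_hidden, w_out)
predicted = np.sign(output)  # assumed convention: the sign of the output decides the class
print(hidden)
print(predicted)
```

Information only moves forward here: the inputs determine the hidden activations, and the hidden activations determine the output.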

## Data
We will adjust our network so that it can distinguish between two classes of 2D points: an inner class of red points and an outer ring of blue points.

In [None]:
# Let's generate 500 points from two classes
N = 500
X, Y = generate_circle_data(N)
plot_data(X, Y)

### Exercise
* Can you explain how the 2D points were generated?
* Adjust the weights below such that the three lines together separate the region of the red points from the region of the blue points as well as possible. Hint: (1) What is the connection between the weights and the three black arrows in the plot? (2) What is the connection between the lines and the black arrows?

In [None]:
widget = widget1()

In [None]:
# Rerun this cell each time you update a value in the above widget
W = np.array([[w.value for w in widget[0]], 
              [w.value for w in widget[1]],
              [w.value for w in widget[2]]])

print('Input weights')
print(W)

# Plot the decision boundaries
plot_decision_lines(X, Y, W)
plot_decision_contour(X, Y, W)

### Exercise
* The three plots above show the activations of the three neurons in the hidden layer. We can compute the activation of the hidden neuron $h_1$ as:
$$ h_1 = \tanh(w_{11}^{(1)} \cdot x + w_{21}^{(1)} \cdot y + w_{10}^{(1)}) $$
Write down the activations of the other two hidden units.

* The expressions you just wrote down can be simplified using vector notation and inner products. Given two vectors $a=[a_1, a_2]$ and $b=[b_1, b_2]$, their inner product is $a\cdot b = a^T b = a_1 b_1 + a_2 b_2$. Define a vector $\textbf{x}$ and a vector $\textbf{w}$ and write down the activations of the three hidden neurons using vector notation and inner products.
 
* Adjust the components of the normal vector to find the plane that best separates the red and the blue points in the 3D space.

* Based on the results printed at the bottom of the cell, compute the accuracy and error rate:
$$ \text{accuracy} = \dfrac{\text{number of correctly classified points}}{\text{total number of points}} $$
$$ \text{error rate} = \dfrac{\text{number of wrongly classified points}}{\text{total number of points}} $$
What is the connection between the accuracy and the error rate of a classifier?

* Go back and try adjusting some of the weights. Can you improve your classifier, i.e. get a higher accuracy?
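
The accuracy and error-rate formulas above can also be evaluated in code. The sketch below uses small made-up label arrays (`y_true` and `y_pred` are hypothetical names and values, not the output of this notebook) purely to show the counting involved.

```python
import numpy as np

# Hypothetical labels for eight points: 1 = red class, 0 = blue class
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred = np.array([1, 1, 0, 0, 0, 1, 0, 1])

n_total = y_true.size
n_correct = int(np.sum(y_true == y_pred))   # points where prediction and label agree
n_wrong = int(np.sum(y_true != y_pred))     # points where they disagree

accuracy = n_correct / n_total
error_rate = n_wrong / n_total

print("accuracy:", accuracy)
print("error rate:", error_rate)
```

Replacing the two arrays with the true labels and your classifier's predictions gives the numbers asked for in the exercise.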

In [None]:
%matplotlib notebook
%matplotlib notebook
widget = widget2()

In [None]:
# Rerun this cell each time you update a value in the above widget
N = np.array([n.value for n in widget])
plot3d_space(X, Y, W, N)

### Optional exercise: TensorFlow Playground
The neural network presented here is very simple, but the principles hold for more complicated networks. Try opening [TensorFlow Playground](https://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.30449&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false), where we can play around with the number of neurons in the network, the activation function, etc. Try out the following:
- Can you recreate the network we have just investigated?
- What happens to the decision boundary when we change the activation function from $\tanh(\cdot)$ to $\text{ReLU}(\cdot)$?
- Try changing the dataset to the spiral. Can you find a network that can classify this data?
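
For reference while experimenting, the two activation functions mentioned above can be written in NumPy as in the sketch below. The helper name `relu` is introduced here for illustration only; `np.tanh` is the standard NumPy function.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: returns z where z is positive and 0 elsewhere."""
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

tanh_values = np.tanh(z)   # smooth, bounded between -1 and 1
relu_values = relu(z)      # piecewise linear, unbounded for positive inputs

print("tanh:", tanh_values)
print("relu:", relu_values)
```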