<img style="text-align:left;" src="https://github.com/Microsoft/sqlworkshops/blob/master/graphics/solutions-microsoft-logo-small.png?raw=true" alt="Microsoft">
<br>

# Neural Networks and Deep Learning Example
## Text Recognition with Neural Networks

In this example you will get a brief introduction to Deep Learning in Python, using only the widely used `numpy` library.

First, let's import the `numpy` library and define the sigmoid function and its derivative, which we'll use throughout the example:


In [2]:
import numpy as np

def sigmoid(x):
    return 1.0/(1 + np.exp(-x))

def sigmoid_derivative(x):
    # x is expected to already be a sigmoid output s, so s * (1 - s) is the slope
    return x * (1.0 - x)

print("Libraries and initial functions defined.") 

Libraries and initial functions defined.
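
One detail worth checking is that `sigmoid_derivative` takes a value that has already passed through `sigmoid`, not the raw input. The sketch below compares it against a numerical slope estimate; the sample points `z` and the step size `eps` are arbitrary choices made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

def sigmoid_derivative(s):
    # s is assumed to be a value already returned by sigmoid()
    return s * (1.0 - s)

z = np.array([-2.0, 0.0, 2.0])   # arbitrary sample inputs
s = sigmoid(z)                   # sigmoid outputs, each between 0 and 1

# Central-difference estimate of the slope of sigmoid at each z
eps = 1e-6
numeric_slope = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

print("sigmoid(z):           ", s)
print("sigmoid_derivative(s):", sigmoid_derivative(s))
print("numeric slope:        ", numeric_slope)
```

The last two printed rows should agree closely, which is why the later code passes layer outputs, rather than raw inputs, to `sigmoid_derivative`.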


## Create Neural Network Class

The next thing we need is a Neural Network (*sometimes called an Artificial Neural Network or ANN*) to perform the shape recognition. ([You can learn more about Neural Networks here](http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html)). 

Basically, a Neural Network takes inputs, moves those values through *layers*, and then sends along the output as a prediction. Another important concept is that the `Nodes` (functions within the layers) can have `weights`, which are values that affect the calculations done at the Node. ([More on Weights and Bias here](https://medium.com/fintechexplained/neural-networks-bias-and-weights-10b53e6285da))

<img src="https://upload.wikimedia.org/wikipedia/en/5/54/Feed_forward_neural_net.gif" height="152" alt="Layers">

*Image credit: Wikipedia*

Let's set up a Class in Python for our `ANN` code:


In [2]:
class NeuralNetwork:
    def __init__(self, x, y):
        self.input      = x
        self.weights1   = np.random.rand(self.input.shape[1],4) 
        self.weights2   = np.random.rand(4,1)                 
        self.y          = y
        self.output     = np.zeros(y.shape)

print("Initial Neural Network Class created.")

Initial Neural Network Class created.


## Adding feedforward to a 2-Layer ANN 

Now we'll add some functionality to our `ANN` Class. One of the earliest types of ANNs was the [feedforward neural network](https://en.wikipedia.org/wiki/Feedforward_neural_network). This network moves the data from the input (*input nodes*) through various layers of algorithms (*hidden nodes*) and then on to the output (*output nodes*) in only one direction; it does not cycle or loop back to any nodes.

The simplest version of these moves directly from the input to one layer of Nodes and then on to the output. We'll need a more complex ANN, [called a 2-Layer (or multi-layer) ANN](https://youtu.be/s8pDf2Pt9sc).

Using this kind of ANN allows us to evaluate the shapes in the inputs we'll send, along with the Weights we want to set up: 

In [3]:
class NeuralNetwork:
    def __init__(self, x, y):
        self.input      = x
        self.weights1   = np.random.rand(self.input.shape[1],4) 
        self.weights2   = np.random.rand(4,1)                 # rows must match the 4 hidden nodes in weights1
        self.y          = y
        self.output     = np.zeros(y.shape)
    def feedforward(self):
        # assumes bias is zero
        # 2-layer network
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

print("feedforward and 0-bias added to the ANN.")

feedforward and 0-bias added to the ANN.
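
To see how the array shapes line up in a feedforward pass, here is a small standalone sketch. The input `X` is random placeholder data (5 samples with 9 features each, an assumed size), and the seeded generator is used only so the sketch is repeatable; neither comes from the class above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1 + np.exp(-x))

rng = np.random.default_rng(0)   # seeded only so the sketch is repeatable

# Placeholder data: 5 hypothetical samples, 9 binary features each
X  = rng.integers(0, 2, size=(5, 9)).astype(float)
W1 = rng.random((9, 4))          # input (9 features) -> hidden layer (4 nodes)
W2 = rng.random((4, 1))          # hidden layer (4 nodes) -> single output node

layer1 = sigmoid(X @ W1)         # shape (5, 4): one row of hidden values per sample
output = sigmoid(layer1 @ W2)    # shape (5, 1): one prediction per sample

print(X.shape, "->", layer1.shape, "->", output.shape)
```

The inner dimensions of each matrix product have to match, which is why the number of rows in the second weight matrix must equal the number of hidden nodes.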


## Setting up the Loss Functions, Back-Propagation, Gradient Descent, and the Chain Rule

A `Loss Function` is another important concept in `ANNs`. A [Loss Function](https://machinelearningmastery.com/loss-and-loss-functions-for-training-deep-learning-neural-networks/) measures the difference between the predicted value of the function and the `label` or true value. The lower the value, the better the model. 
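
As a minimal sketch, assume a sum-of-squares loss (one common choice; the function name `sum_of_squares_loss` and the sample values below are made up for illustration). A set of predictions close to the labels gives a smaller loss than a set of undecided guesses:

```python
import numpy as np

def sum_of_squares_loss(y_true, y_pred):
    # Sum of squared differences between labels and predictions
    return np.sum((y_true - y_pred) ** 2)

labels     = np.array([[1.0], [1.0], [0.0]])   # hypothetical true values
poor_guess = np.array([[0.5], [0.5], [0.5]])   # undecided predictions
good_guess = np.array([[0.9], [0.8], [0.1]])   # predictions close to the labels

poor_loss = sum_of_squares_loss(labels, poor_guess)
good_loss = sum_of_squares_loss(labels, good_guess)

print("Loss for the poor guess:", poor_loss)
print("Loss for the good guess:", good_loss)
```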

After the data moves through the model, the error at the output can be passed backward through the layers so that the weights each layer receives are adjusted to reduce it. This is called *Back-Propagation*.  ([Learn more about Back-Propagation here](https://machinelearningmastery.com/loss-and-loss-functions-for-training-deep-learning-neural-networks/))

To get to a good prediction efficiently, we'll use `Gradient Descent` to minimize the Loss Function. It repeatedly nudges each weight in the direction that lowers the loss, following the slope (gradient) of the Loss Function downhill. [You can read more about Gradient Descent here.](https://peterroelants.github.io/posts/neural-network-implementation-part01/)
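
Here is a one-variable sketch of that idea. The toy loss `(w - 3)^2`, the starting point, the `learning_rate`, and the number of steps are all assumptions chosen for illustration, not values used by the network in this example:

```python
def loss(w):
    # Toy loss with its minimum at w = 3
    return (w - 3.0) ** 2

def loss_slope(w):
    # Derivative of the toy loss with respect to w
    return 2.0 * (w - 3.0)

w = 0.0               # arbitrary starting weight
learning_rate = 0.1   # assumed step size

for step in range(50):
    # Step against the slope, so the loss shrinks on each pass
    w -= learning_rate * loss_slope(w)

print("Final weight:", w)
print("Final loss:  ", loss(w))
```

Each pass moves `w` a fraction of the way toward the minimum, so the final weight ends up very close to 3.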

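Putting this all together, we use the `Chain Rule`, a rule from calculus for taking the derivative of nested functions. Because the output of one layer feeds the next, it lets us "chain" the derivatives layer by layer to find how the loss changes with each weight. It's a bit complex, so [here's a fun video that helps with understanding it](https://www.youtube.com/watch?v=sDv4f4s2SB8). 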
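
As a hedged sketch of that chaining, assume a sum-of-squares loss (an assumption; the text does not name the loss, though the factor `2*(self.y - self.output)` in the `backprop` code is consistent with it). Write $X$ for the input, $W_1$ and $W_2$ for `weights1` and `weights2`, $\sigma$ for the sigmoid, $A_1 = \sigma(X W_1)$ for `layer1`, and $\hat{y} = \sigma(A_1 W_2)$ for the output:

$$
\begin{aligned}
L &= \sum (y - \hat{y})^2 \\
\frac{\partial L}{\partial W_2} &= A_1^{\top}\left[-2\,(y - \hat{y}) \odot \hat{y} \odot (1 - \hat{y})\right] \\
\frac{\partial L}{\partial W_1} &= X^{\top}\left[\left(\left(-2\,(y - \hat{y}) \odot \hat{y} \odot (1 - \hat{y})\right) W_2^{\top}\right) \odot A_1 \odot (1 - A_1)\right]
\end{aligned}
$$

Here $\odot$ is element-wise multiplication. The `d_weights2` and `d_weights1` terms in the `backprop` code drop the leading minus sign, so adding them to the weights moves against the gradient, which amounts to a gradient descent step with a step size of 1.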

OK - putting that into code, let's create our final Neural Network class:


In [4]:
class NeuralNetwork:
    def __init__(self, x, y):
        self.input      = x
        self.weights1   = np.random.rand(self.input.shape[1],9) 
        self.weights2   = np.random.rand(9,1)                 
        self.y          = y
        self.output     = np.zeros(y.shape)
    
    def feedforward(self):
        # assumes bias is zero
        # 2-layer network
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))
    
    def backprop(self):
        # application of the chain rule to find derivative of the loss function with respect to weights2 and weights1
        d_weights2 = np.dot(self.layer1.T, (2*(self.y - self.output) * sigmoid_derivative(self.output)))
        d_weights1 = np.dot(self.input.T,  (np.dot(2*(self.y - self.output) * sigmoid_derivative(self.output), self.weights2.T) * sigmoid_derivative(self.layer1)))
        # d_weights1 and d_weights2 already point downhill (negative slope of a sum-of-squares loss), so add them to the weights
        self.weights1 += d_weights1
        self.weights2 += d_weights2

print("Chain Rule created for ANN with backprop added.")

Chain Rule created for ANN with backprop added.


## Using the ANN to Predict a Graphical Letter

Our Neural Network is ready to work - let's take a typical example for Deep Learning: Image Processing. One of the earliest experiments was to have a set of hand-written letters and numbers (like those on an envelope) and attempt to have the computer figure out the text even though it is written in several styles. The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used to train this kind of ANN. In this example we'll use a much smaller, hand-built set of images to figure out whether a letter is an "L" or not. 

<img src="https://upload.wikimedia.org/wikipedia/commons/2/27/MnistExamples.png" height="152" alt="Layers">



We've taken the first step and "cut out" the image of the letter "L" below - and some that are not L's. Visually, you can instantly tell whether the image is an L or not, but your brain is using several systems (ocular, brain, training and others) to know that. A computer has to do it with math, specifically with the ANN we built. To do that, we'll have to take these images and convert them to numbers.

Here are the raw images, cut out and boxed:

<img src="https://github.com/microsoft/sqlworkshops/blob/master/graphics/ML-AI-DL-Letters.png?raw=TRUE" height="152" alt="Layers">

To make numbers out of them, let's first put them in a grid:

<img src="https://github.com/microsoft/sqlworkshops/blob/master/graphics/ML-AI-DL-Letters-Blocked.png?raw=TRUE" height="152" alt="Layers">

So now we have five two-dimensional arrays of numbers we could put in the blocks. But we would like a single array, to have the ANN guess at all of them at once.  Here's what we can do: 

We can "unroll" each block top to bottom, and put them on a single line. Then we can take that line, and put numbers showing whether a block is filled or not. Then we can stack all of the lines of unrolled boxes into a single array. Here's what those three steps look like:

<img src="https://github.com/microsoft/sqlworkshops/blob/master/graphics/ML-AI-DL-Letters-Array.png?raw=TRUE" height="350" alt="Layers">

Now we have a single array (in some programs, a *tensor*) that we can send to the ANN as an input. 
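
The unroll-and-stack steps above can be sketched with NumPy. The two 3x3 grids below are illustrative values typed in by hand (1 for a filled block, 0 for an empty one), not data read from the images, and the sketch assumes the blocks are unrolled row by row:

```python
import numpy as np

# Hypothetical 3x3 grids: 1 = filled block, 0 = empty block
letter_a = np.array([[0, 1, 0],
                     [0, 1, 1],
                     [0, 0, 0]])
letter_b = np.array([[0, 0, 1],
                     [0, 1, 1],
                     [0, 0, 0]])

# Unroll each grid row by row into a single line of 9 numbers
rows = [grid.flatten() for grid in (letter_a, letter_b)]

# Stack the unrolled lines into one 2-D array: one row per image
features = np.vstack(rows)

print(features)
print("Shape:", features.shape)
```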

## Setting up the data to send to the ANN

So now we have our letters converted to numbers for training, and we can add one additional piece of information: We know whether each one is really an **L** or not. This is the "Label" that we want to predict. Now we can train the network for what we want to predict. 

In [5]:
dataset = [[0,1,0,0,1,1,0,0,0, 1],
           [1,0,0,1,1,0,0,0,0, 1],
           [0,0,1,0,1,1,0,0,0, 0],
           [0,0,0,0,1,0,1,1,0, 0],
           [0,0,0,1,0,0,1,1,0, 1]
          ]
dataset = np.array(dataset)
print('Here is the full set of data, along with the "Label" at the end: \n', dataset)

X = dataset[:,:9]
print('Here are the "Features" in the data set: \n', X)

Y = np.array([[d] for d in dataset[:,-1]])
print('And here are the "Labels" in the data set: \n', Y)

Here is the full set of data, along with the "Label" at the end: 
 [[0 1 0 0 1 1 0 0 0 1]
 [1 0 0 1 1 0 0 0 0 1]
 [0 0 1 0 1 1 0 0 0 0]
 [0 0 0 0 1 0 1 1 0 0]
 [0 0 0 1 0 0 1 1 0 1]]
Here are the "Features" in the data set: 
 [[0 1 0 0 1 1 0 0 0]
 [1 0 0 1 1 0 0 0 0]
 [0 0 1 0 1 1 0 0 0]
 [0 0 0 0 1 0 1 1 0]
 [0 0 0 1 0 0 1 1 0]]
And here are the "Labels" in the data set: 
 [[1]
 [1]
 [0]
 [0]
 [1]]


## Instantiate the ANN using the Data

With the X's and Y's defined as input, we can now use the Neural Network. Let's send in the data:

In [6]:
nn = NeuralNetwork(X,Y)
print("Created", nn)

Created <__main__.NeuralNetwork object at 0x000002952D1B02E8>


## Add in feedforward and back-propagation

As our final step, we will iterate the feedforward steps and the back-propagation we saw earlier. We'll then print out the predictions:

In [14]:
for i in range(1500):
    nn.feedforward()
    nn.backprop()

for i in range(0,5):
    print('The 2-layer Neural Network is',  round(nn.output[i][0]*100,2), '% sure that image', i, 'is an "L"')

The 2-layer Neural Network is 99.75 % sure that image 0 is an "L"
The 2-layer Neural Network is 99.95 % sure that image 1 is an "L"
The 2-layer Neural Network is 0.23 % sure that image 2 is an "L"
The 2-layer Neural Network is 0.31 % sure that image 3 is an "L"
The 2-layer Neural Network is 99.75 % sure that image 4 is an "L"
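
To turn these percentages into yes/no answers, one option is to apply a cut-off. The sketch below copies the printed outputs and the labels from above into arrays; the 0.5 `threshold` is an assumed choice, not something the network defines.

```python
import numpy as np

# Outputs printed above, written as fractions rather than percentages
outputs = np.array([[0.9975], [0.9995], [0.0023], [0.0031], [0.9975]])
# Labels from the data set: 1 = "L", 0 = not an "L"
labels  = np.array([[1], [1], [0], [0], [1]])

threshold = 0.5   # assumed cut-off between "L" and "not L"
predicted = (outputs >= threshold).astype(int)

# Fraction of images where the thresholded prediction matches the label
accuracy = np.mean(predicted == labels)

print("Predicted labels:", predicted.ravel())
print("Accuracy on the training images:", accuracy)
```

Because these are the same five images the network was trained on, this only shows that the network has fit the training data.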


Reference: [How to build your own Neural Network from scratch in Python](https://towardsdatascience.com/how-to-build-your-own-neural-network-from-scratch-in-python-68998a08e4f6)