# Building a Neural Network with Python

### Three basic functions of a neural network

More functions may be needed later, but for now we will start with the following three:

- Initialization : to set the number of input, hidden and output nodes
- Train : refine the weights after being given a training set example to learn from
- Query : give an answer from the output nodes after being given an input

In [1]:
# neural network class definition
class Neural_Network():
    
    # initialize the neural network
    def __init__(self):
        pass
    
    # train the neural network
    def train(self):
        pass
    
    # query the neural network
    def query(self):
        pass

### Initializing the Network

We need to set the number of input, hidden and output layer nodes, which defines the shape and size of the neural network. Let's set them through parameters of the neural network object. This way, we retain the option to create new neural networks of different sizes in other situations.

We will try to develop code that keeps as many useful options open, and as few assumptions built in, as possible, so that it can easily be reused for different needs. The same class should be able to create a small network as well as a large one.

Parameters are how we set the size of each layer, the learning rate, and so on.

In [2]:
# neural network class definition
class Neural_Network():
    
    # initialize the neural network
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):

        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes

        # learning rate
        self.lr = learningrate
        pass
    
    
    # train the neural network
    def train(self):
        pass
    
    # query the neural network
    def query(self):
        pass

For example, let's create a small neural network object with 3 nodes in each layer and a learning rate of 0.3.

In [3]:
# number of input, hidden and output nodes
input_nodes = 3
hidden_nodes = 3
output_nodes = 3

# learning rate
learning_rate = 0.3

In [4]:
# create instance of neural network
n = Neural_Network(input_nodes, hidden_nodes, output_nodes, learning_rate)

So far, we've told the neural network object how many input, hidden and output layer nodes we want, but nothing has really been done with that information yet.

### Weights, the Heart of the Network

The next step is to create the network of nodes and links. The most important part of the network is the link weights. **Link weights** are used to calculate the signal being fed forward and the error as it's propagated backwards, and it is the link weights themselves that are refined in an attempt to improve the network.

In matrix form, we can create:

$ \text{W}_{\text{input hidden}} $ : (hidden_nodes by input_nodes) matrix for the weights for links between the input and hidden layers.

$ \text{W}_{\text{hidden output}} $ : (output_nodes by hidden_nodes) matrix for the links between the hidden and output layers.

The initial values of the link weights should be small and random. The numpy function `np.random.rand(rows, columns)` generates an array of values selected randomly between 0 and 1, where the size is (rows by columns).

Weights can be negative as well as positive, for example anywhere in the range -1.0 to +1.0.

Here, for simplicity, we'll just subtract 0.5 from each of the values to get a range of -0.5 to +0.5.

In [7]:
import numpy as np

In [9]:
np.random.rand(3, 3) - 0.5

array([[ 0.05860441,  0.28547311,  0.27055259],
       [-0.13555499,  0.28206397, -0.46915288],
       [-0.05571964,  0.28739358,  0.3721042 ]])

The two link weight matrices are created the same way, using `self.inodes`, `self.hnodes` and `self.onodes` to set the right sizes.
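
As a minimal sketch of that step, assuming hypothetical layer sizes of 3, 4 and 2 nodes (chosen here only so that the two shapes differ) and plain variables instead of the `self.` attributes, the matrices could be built like this:

```python
import numpy as np

# Hypothetical layer sizes, for illustration only
inodes, hnodes, onodes = 3, 4, 2

# Uniform random weights shifted into the range [-0.5, 0.5)
wih = np.random.rand(hnodes, inodes) - 0.5  # input -> hidden, shape (4, 3)
who = np.random.rand(onodes, hnodes) - 0.5  # hidden -> output, shape (2, 4)

print(wih.shape, who.shape)
```

Note that each matrix has one row per node in the layer it feeds into and one column per node in the layer it reads from.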

#### Slightly more sophisticated weights

Weights can be sampled from a normal probability distribution centered around zero and with a standard deviation that is related to the number of incoming links into a node, $1 / \sqrt{\text{(number of incoming links)}}$

We've set the center of the normal distribution to 0.0. The standard deviation is the number of incoming links raised to the power of -0.5, which is the reciprocal of the square root.
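
A rough sketch of this sampling is shown below. The layer sizes (100 inputs, 50 hidden nodes) and the seeded generator are assumptions made here so that the result is reproducible and the sample is large enough to check the spread.

```python
import numpy as np

# Seeded generator: an assumption, used only to make the sketch reproducible
rng = np.random.default_rng(0)

# Hypothetical sizes: each hidden node has 100 incoming links
inodes, hnodes = 100, 50

# Mean 0.0, standard deviation 1/sqrt(incoming links) = 100 ** -0.5 = 0.1
wih = rng.normal(0.0, pow(inodes, -0.5), (hnodes, inodes))

print(wih.shape)   # (50, 100)
print(wih.std())   # should come out close to 0.1
```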

### Querying the Network

The `query()` function takes the input to a neural network and returns the network's output. We need to pass the input signals from the input layer of nodes, through the hidden layer and out of the final output layer. We also need to use the link weights to moderate the signals as they feed into any given hidden or output node, and use the sigmoid activation function to squash the signal coming out of those nodes.

The following shows how the matrix of weights for the links between the input and hidden layers can be combined with the matrix of inputs to give the signals into the hidden layer nodes:

- $\text{X}_{\text{hidden}}  = \text{W}_{\text{input hidden}} \cdot \text{I}$

In Python, NumPy's `np.dot()` performs this matrix multiplication: `hidden_inputs = np.dot(self.wih, inputs)`.

To get the signals emerging from the hidden nodes, we simply apply the sigmoid squashing function to each of these combined signals.

- $\text{O}_\text{hidden}  = \text{sigmoid} \; (\text{X}_\text{hidden})$

The scipy Python library has a set of special functions, and the sigmoid function is called `expit()`

In [13]:
import scipy.special
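
As a quick illustration of what `expit()` does, the sketch below applies it to a single value and to a small array. The input values are arbitrary and were picked here only to show the squashing behaviour.

```python
import numpy as np
import scipy.special

# The sigmoid of 0 sits exactly in the middle of its (0, 1) range
print(scipy.special.expit(0.0))  # 0.5

# expit() works element by element on arrays
signals = np.array([-2.0, 0.0, 2.0])
squashed = scipy.special.expit(signals)
print(squashed)  # roughly [0.119, 0.5, 0.881]
```

Large negative inputs are pushed towards 0 and large positive inputs towards 1, which is why the function is described as squashing the signal.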

Inside the neural network's initialization section we define the activation function we want to use. It is a lambda function that takes x and returns `scipy.special.expit(x)`, which is the sigmoid function.

Now we want to apply the activation function to the combined and moderated signals into the hidden nodes. The signals emerging from the hidden layer nodes are in the matrix called `hidden_outputs`.

The final output layer works the same way: the hidden outputs are combined with the hidden-to-output weights, and the activation function is applied to the result.
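
To see both layers at work, here is a small worked sketch with fixed weights instead of random ones. The weight and input values are made up for illustration, and the variables are plain names rather than attributes of the class.

```python
import numpy as np
import scipy.special

# Same activation function as in the class: the sigmoid
activation_function = lambda x: scipy.special.expit(x)

# Illustrative fixed weights (assumed values, not from a trained network)
wih = np.array([[0.9, 0.3, 0.4],
                [0.2, 0.8, 0.2],
                [0.1, 0.5, 0.6]])
who = np.array([[0.3, 0.7, 0.5],
                [0.6, 0.5, 0.2],
                [0.8, 0.1, 0.9]])

# Inputs as a column vector, shape (3, 1)
inputs = np.array([0.9, 0.1, 0.8], ndmin=2).T

# Hidden layer: combine, then squash
hidden_inputs = np.dot(wih, inputs)  # approximately [[1.16], [0.42], [0.62]]
hidden_outputs = activation_function(hidden_inputs)

# Output layer: same two steps with the next weight matrix
final_inputs = np.dot(who, hidden_outputs)
final_outputs = activation_function(final_inputs)

print(final_outputs)
```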

### Code so far

In [29]:
import numpy as np
import scipy.special

# neural network class definition
class Neural_Network():
    
    # initialize the neural network
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):

        
        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes

        
        # link weight matrices, wih and who
        # weights inside the arrays are w_i_j, where link is from node_i to node_j in the next layer
        # w11, w21
        # w12, w22 etc
        # standard deviation is 1/sqrt(number of incoming links into a node)
        self.wih = np.random.normal(0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
        self.who = np.random.normal(0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))
        
                    
        # learning rate
        self.lr = learningrate
                    
                    
        # activation function is the sigmoid function
        self.activation_function = lambda x: scipy.special.expit(x)
        
        pass
    
    
    # train the neural network
    def train(self):
        pass
    
                    
                    
    # query the neural network
    def query(self, inputs_list):
        
        # convert inputs list to 2d array
        inputs = np.array(inputs_list, ndmin=2).T
        
           
        #calculate signals into hidden layer
        hidden_inputs = np.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)          
        
        # calculate signals into final output layer
        final_inputs = np.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
                    
        return final_outputs

Let's test the code. The following creates a small network with 3 nodes in each of the input, hidden and output layers, and queries it with a randomly chosen input of (1.0, 0.5, -1.5).

In [30]:
input_nodes = 3
hidden_nodes = 3
output_nodes = 3

learning_rate = 0.3

n = Neural_Network(input_nodes, hidden_nodes, output_nodes, learning_rate)

In [31]:
n.query([1.0, 0.5, -1.5])

array([[0.45528079],
       [0.47809968],
       [0.47913207]])

### Training the Network

There are two phases of training:
1. Calculating the output for a given training example, just as `query()` does.
2. Taking the calculated output, comparing it with the desired output, and using the difference to guide the updating of the network weights. The errors are backpropagated to inform how the link weights are refined.

The code for the first part is almost exactly the same as that in the `query()` function, because we're feeding forward the signal from the input layer to the final output layer in exactly the same way.

The only difference is that we have an additional parameter, `targets_list`. You can't train the network without the training examples which include the desired or target answer.

Now we are going to improve the weights based on the error between the calculated and target output.

First, we need to calculate the error, which is the difference between the desired target output provided by the training example, and the actual calculated value.

Next, we can calculate the back-propagated errors for the hidden layer nodes. Remember how we split the errors according to the connected weights, and recombine them for each hidden layer node. We worked out the matrix form of this calculation as:

$$\text{errors}_{\text{hidden}} = \text{W}^T_{\text{hidden output}} \cdot \text{errors}_{\text{output}} $$

or in Python, `hidden_errors = np.dot(self.who.T, output_errors)`.
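
To make the matrix form concrete, here is a small worked sketch with two hidden nodes and two output nodes. The weights, targets and outputs are made-up numbers chosen so the arithmetic is easy to follow by hand.

```python
import numpy as np

# Illustrative hidden -> output weights, one row per output node (assumed values)
who = np.array([[2.0, 3.0],
                [1.0, 4.0]])

# Assumed targets and calculated outputs, as column vectors
targets = np.array([0.9, 0.6], ndmin=2).T
final_outputs = np.array([0.1, 0.1], ndmin=2).T

# Output layer error is (target - actual): about 0.8 and 0.5
output_errors = targets - final_outputs

# Split the output errors by weight and recombine them at each hidden node:
#   hidden node 1: 2.0 * 0.8 + 1.0 * 0.5 = 2.1
#   hidden node 2: 3.0 * 0.8 + 4.0 * 0.5 = 4.4
hidden_errors = np.dot(who.T, output_errors)

print(hidden_errors)
```

The transpose is what lets each hidden node collect the errors from all the output nodes it links to.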

We now have what we need to refine the weights at each layer.

For the weights between the hidden and final layers, we use the `output_errors`. For the weights between the input and hidden layers, we use the `hidden_errors` we just calculated.

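The update for the weights on the links between a node j and a node k in the next layer is, in matrix form:

$$ \Delta W_{jk} = \alpha * E_k * \text{sigmoid}(X_k) * (1-\text{sigmoid}(X_k)) \cdot O^T_j $$

Here $\alpha$ is the learning rate and sigmoid is the squashing activation function. $X_k$ is the combined signal into node k, so $\text{sigmoid}(X_k)$ is simply that node's output $O_k$. The `*` multiplication is the normal element by element multiplication, and the `.` dot is the matrix dot product.

The matrix of outputs from the previous layer is transposed, which means the column of outputs becomes a row of outputs.

In Python, for the weights between the hidden and output layers, this becomes `self.who += self.lr * np.dot((output_errors * final_outputs * (1.0 - final_outputs)), np.transpose(hidden_outputs))`.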

The code for the other weights between the input and hidden layers will be very similar. We just exploit the symmetry and rewrite the code, replacing the names so that they refer to the previous layers.
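
As a sketch of how the shapes work out in the update, the example below computes the weight change for a layer with two hidden nodes and two output nodes. All the numbers are assumed values chosen for easy arithmetic, and `delta_who` is a name introduced here for illustration.

```python
import numpy as np

lr = 0.1  # learning rate (assumed value)

# Assumed column vectors for one training example
output_errors = np.array([0.8, 0.5], ndmin=2).T
final_outputs = np.array([0.5, 0.5], ndmin=2).T   # so O * (1 - O) = 0.25
hidden_outputs = np.array([0.4, 0.6], ndmin=2).T

# Element by element part: E_k * O_k * (1 - O_k), shape (2, 1)
scaled_errors = output_errors * final_outputs * (1.0 - final_outputs)

# Dot product with the transposed previous-layer outputs, shape (2, 2)
delta_who = lr * np.dot(scaled_errors, np.transpose(hidden_outputs))

print(delta_who)
# approximately [[0.008, 0.012],
#                [0.005, 0.0075]]
```

The result has the same shape as the hidden-to-output weight matrix, so it can be added to it directly.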

### The Final Neural Network

In [33]:
import numpy as np
import scipy.special

# neural network class definition
class Neural_Network():
    
    # initialize the neural network
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        
        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes
        
        # link weight matrices, wih and who
        # weights inside the arrays are w_i_j, where link is from node_i to node_j in the next layer
        # w11, w21
        # w12, w22 etc
        # standard deviation is 1/sqrt(number of incoming links into a node)
        self.wih = np.random.normal(0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
        self.who = np.random.normal(0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))
    
        # learning rate
        self.lr = learningrate
                    
        # activation function is the sigmoid function
        self.activation_function = lambda x: scipy.special.expit(x)
        
        pass
    
    
    # train the neural network
    def train(self, inputs_list, targets_list):
        #convert inputs list to 2d array
        inputs = np.array(inputs_list, ndmin=2).T
        targets = np.array(targets_list, ndmin=2).T

        # calculate signals into hidden layer
        hidden_inputs = np.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)

        # calculate signals into final output layer
        final_inputs = np.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layers
        final_outputs = self.activation_function(final_inputs)

        # error is the (target - actual)
        output_errors = targets - final_outputs
        
        # hidden layer error is the output_errors, split by weights, recombined at hidden nodes
        hidden_errors = np.dot(self.who.T, output_errors)
        
        # update the weights for the links between the hidden and output layers
        self.who += self.lr * np.dot((output_errors * final_outputs * (1.0 - final_outputs)), np.transpose(hidden_outputs))
        
        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * np.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), np.transpose(inputs))
        
        pass

                    
                    
    # query the neural network
    def query(self, inputs_list):
        
        # convert inputs list to 2d array
        inputs = np.array(inputs_list, ndmin=2).T
        
           
        #calculate signals into hidden layer
        hidden_inputs = np.dot(self.wih, inputs)
        # calculate the signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)          
        
        # calculate signals into final output layer
        final_inputs = np.dot(self.who, hidden_outputs)
        # calculate the signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
                    
        return final_outputs
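
To check that the update rule actually moves the outputs towards the targets, here is a rough sketch that repeats the same calculations as `train()` and `query()` in two standalone functions. The function names `forward` and `train_step`, the seed, the target values and the number of repetitions are all assumptions made for this illustration.

```python
import numpy as np
import scipy.special

def forward(wih, who, inputs):
    """Feed a column vector of inputs through both layers, as query() does."""
    hidden_outputs = scipy.special.expit(np.dot(wih, inputs))
    final_outputs = scipy.special.expit(np.dot(who, hidden_outputs))
    return hidden_outputs, final_outputs

def train_step(wih, who, inputs, targets, lr):
    """Apply one weight update in place, using the same steps as train()."""
    hidden_outputs, final_outputs = forward(wih, who, inputs)
    output_errors = targets - final_outputs
    hidden_errors = np.dot(who.T, output_errors)
    who += lr * np.dot(output_errors * final_outputs * (1.0 - final_outputs),
                       np.transpose(hidden_outputs))
    wih += lr * np.dot(hidden_errors * hidden_outputs * (1.0 - hidden_outputs),
                       np.transpose(inputs))

# Seeded random weights for a 3-3-3 network (seed is an assumption)
rng = np.random.default_rng(42)
wih = rng.normal(0.0, pow(3, -0.5), (3, 3))
who = rng.normal(0.0, pow(3, -0.5), (3, 3))

# One made-up training example, as column vectors
inputs = np.array([1.0, 0.5, -1.5], ndmin=2).T
targets = np.array([0.9, 0.1, 0.5], ndmin=2).T

error_before = np.abs(targets - forward(wih, who, inputs)[1]).sum()
for _ in range(500):
    train_step(wih, who, inputs, targets, lr=0.3)
error_after = np.abs(targets - forward(wih, who, inputs)[1]).sum()

# The total absolute error should be smaller after training
print(error_before, error_after)
```

Training on a single example like this only shows that the weights are moving in the right direction. A real training set would contain many examples.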