This repository has been archived by the owner on May 28, 2019. It is now read-only.

Introduction to Neural Network: Feedforward

Ji Yang edited this page Feb 11, 2017 · 3 revisions

You can find this article and source code at my GitHub repo

A Dead Simple Neural Network

If you have ever read any material about neural networks, a model like the one in the figure below should look familiar.

An artificial neural network model

Well, that's a little bit complicated, so what about this one?

Diagram of a simple neural network

Let's briefly explain each component in the figure. Each circle represents a unit (or neuron), and each square represents a calculation. The three leftmost units form the input layer. The neuron with an h inside is the only neuron in the output layer of this network.

The input to the output unit
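As a sketch consistent with the Python code later in this section (a weight vector, an input vector, and a bias), the input to the output unit can be written as a weighted sum. The symbols w_i, x_i, and b are assumed names for the weights, inputs, and bias:

```latex
h = \sum_i w_i x_i + b
```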

Recall that a biological neuron is activated only when its input passes a threshold. In our neural network, a neuron applies an activation function to its input and sends the result on as its output. A major advantage of this design is flexibility: the activation function can be any function you choose, such as a step, polynomial, or sigmoid function. The output unit returns f(h), where h is the input to the output unit. This value is y, the output of the neural network.
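To make the choice of activation function concrete, here is a minimal sketch that applies three candidate functions to the same inputs. The helper names `step` and `identity` and the threshold of 0 are assumptions made for illustration, not part of the original code.

```python
import numpy as np

def step(h, threshold=0.0):
    # Hypothetical step activation: outputs 1.0 only when h exceeds the threshold
    return np.where(h > threshold, 1.0, 0.0)

def identity(h):
    # f(h) = h, the activation that leaves the input unchanged
    return h

def sigmoid(h):
    # Smooth activation that squashes any input into the range (0, 1)
    return 1 / (1 + np.exp(-h))

# Three sample values for h, the input to the output unit
h = np.array([-2.0, 0.0, 2.0])

step_out = step(h)
identity_out = identity(h)
sigmoid_out = sigmoid(h)

print('step:    ', step_out)
print('identity:', identity_out)
print('sigmoid: ', sigmoid_out)
```

The step function jumps abruptly at the threshold, while the sigmoid changes smoothly, which matters once training with gradient descent comes into play.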

If you let f(h) = h be your activation function, then the output of the network is the following (note that y = f(h)).

The output
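With the identity activation, the output is simply the weighted sum itself. A sketch of the expression, assuming the symbols w_i, x_i, and b for the weights, inputs, and bias:

```latex
y = f(h) = h = \sum_i w_i x_i + b
```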

You are correct if you think this is just the linear regression model. Once you start using activation functions that are continuous and differentiable, it becomes possible to train the network using gradient descent, which requires the first derivative of the activation function. Before we dive into the training process, let's code this dead simple neural network in Python. For the activation function, we will use the sigmoid function. Don't worry that, for now, this network can only make predictions by feedforward; it learns (gets trained) through backpropagation.

The sigmoid function

import numpy as np

def sigmoid(x):
    # sigmoid function
    return 1/(1 + np.exp(-x))

inputs = np.array([0.7, -0.3])
weights = np.array([0.1, 0.8])
bias = -0.1

# calculate the output
output = sigmoid(np.dot(weights, inputs) + bias)

print('Output:')
print(output)

You can find the code here.
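Since gradient descent needs the first derivative of the activation function, here is a small sketch of the sigmoid derivative, which equals sigmoid(x) * (1 - sigmoid(x)). The name `sigmoid_prime` and the finite-difference check are additions for illustration.

```python
import numpy as np

def sigmoid(x):
    # sigmoid function
    return 1 / (1 + np.exp(-x))

def sigmoid_prime(x):
    # Derivative of the sigmoid, expressed through the sigmoid itself
    s = sigmoid(x)
    return s * (1 - s)

x = np.array([-1.0, 0.0, 1.0])

# Central finite difference as an independent numerical check
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
analytic = sigmoid_prime(x)

print('Analytic derivative:', analytic)
print('Matches numeric estimate:', np.allclose(analytic, numeric))
```

The derivative peaks at x = 0 and shrinks toward zero for large positive or negative inputs.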

Your First 2-Layer NN

By now, you should have a basic idea of how a neural network makes predictions. For a real-world problem, such a simple network may not be very helpful. A new concept needs to be introduced here: the hidden layer.

A network with 3 input units, 2 hidden units and 1 output unit

In the previous simple network, the weights formed a vector. In the more common case, the weights form a matrix like the one below (this is the weight matrix for the network in the figure above).

Weights matrix for 3 input units and 2 hidden units

You may already see how to calculate h1 from the structure of the 2-layer network. Let's write it as a mathematical formula.

The formula of calculating hidden layer inputs
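A sketch of the formula, assuming w_{ij} denotes the weight from input unit i to hidden unit j (the same layout the code below uses for its weight matrix):

```latex
h_j = \sum_i w_{ij} x_i
```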

And for our case,

The matrix multiplication for calculating hidden layer inputs of our network
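For 3 input units and 2 hidden units, the same calculation can be sketched as a row vector of inputs multiplied by a 3 × 2 weight matrix, assuming the row index refers to the input unit and the column index to the hidden unit:

```latex
\begin{bmatrix} h_1 & h_2 \end{bmatrix}
=
\begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}
\begin{bmatrix}
w_{11} & w_{12} \\
w_{21} & w_{22} \\
w_{31} & w_{32}
\end{bmatrix},
\qquad
h_1 = x_1 w_{11} + x_2 w_{21} + x_3 w_{31}
```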

Note: The weight indices have changed in the above image and no longer match up with the labels used in the earlier diagrams. That's because, in matrix notation, the row index always precedes the column index, so it would be misleading to label them the way we did in the neural net diagram.

Weight matrix shown with labels matching earlier diagrams.

Remember, the above is not a correct view of the indices, but it uses the labels from the earlier neural net diagrams to show you where each weight ends up in the matrix.
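The row-before-column convention is easy to check in NumPy. In this sketch, the values of `x` and `W` are made up for illustration. `W[i, j]` holds the weight from input unit i to hidden unit j (zero-based), so the first column of `W` contains every weight that feeds the first hidden unit.

```python
import numpy as np

# Made-up inputs and weights; W has one row per input and one column per hidden unit
x = np.array([1.0, 2.0, 3.0])
W = np.array([[0.1, 0.2],
              [0.3, 0.4],
              [0.5, 0.6]])

# Matrix form: all hidden layer inputs at once
h = np.dot(x, W)

# The same value for the first hidden unit, written out term by term
h1_by_hand = x[0] * W[0, 0] + x[1] * W[1, 0] + x[2] * W[2, 0]

print('W shape:', W.shape)
print('h:', h)
print('h1 by hand:', h1_by_hand)
```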

Combining this with the formula we learned in the first section, we can implement the 2-layer neural network! The activation function used here is the sigmoid function.

Things to do:

  • Calculate the input to the hidden layer.
  • Calculate the hidden layer output.
  • Calculate the input to the output layer.
  • Calculate the output of the network.

import numpy as np

def sigmoid(x):
    # sigmoid function
    return 1/(1+np.exp(-x))

# Network size
N_input = 3
N_hidden = 2
N_output = 1

np.random.seed(42)
# Make some fake data: one value per input unit
X = np.random.randn(N_input)

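# Small random initial weights, shaped (inputs, hidden) and (hidden, outputs)
weights_in_hidden = np.random.normal(0, scale=0.1, size=(N_input, N_hidden))
weights_hidden_out = np.random.normal(0, scale=0.1, size=(N_hidden, N_output))

# Input to the hidden layer, then its output after the activation
hidden_layer_in = np.dot(X, weights_in_hidden)
hidden_layer_out = sigmoid(hidden_layer_in)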

print('Hidden-layer Output:')
print(hidden_layer_out)

# Input to the output layer, then the output of the network
output_layer_in = np.dot(hidden_layer_out, weights_hidden_out)
output_layer_out = sigmoid(output_layer_in)

print('Output-layer Output:')
print(output_layer_out)

You can find the code here.
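If you want to reuse the four steps, one option is to wrap them in a function. This is only a sketch: the name `feedforward`, the use of `np.random.default_rng`, and the seed are assumptions for illustration, not part of the original code.

```python
import numpy as np

def sigmoid(x):
    # sigmoid function
    return 1 / (1 + np.exp(-x))

def feedforward(x, weights_in_hidden, weights_hidden_out):
    # Hypothetical helper that runs the four feedforward steps in order
    hidden_layer_in = np.dot(x, weights_in_hidden)
    hidden_layer_out = sigmoid(hidden_layer_in)
    output_layer_in = np.dot(hidden_layer_out, weights_hidden_out)
    output_layer_out = sigmoid(output_layer_in)
    return hidden_layer_out, output_layer_out

# Fake data and small random weights for a 3-2-1 network
rng = np.random.default_rng(0)
x = rng.normal(size=3)
w_ih = rng.normal(0, 0.1, size=(3, 2))
w_ho = rng.normal(0, 0.1, size=(2, 1))

hidden, output = feedforward(x, w_ih, w_ho)
print('Hidden-layer Output:', hidden)
print('Output-layer Output:', output)
```

Returning the hidden layer output alongside the final output keeps the intermediate value available for inspection.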

Thanks for reading. If you find any mistake or typo in this blog, please don't hesitate to let me know. You can reach me by email: jyang7[at]ualberta.ca