We start building a **Dense Layer Class**, also known as a fully connected layer, with two methods.

In [1]:
# Dense layer
class Layer_Dense:
    
    # Layer initialization
    def __init__(self, n_inputs, n_neurons):
        # initialize weights and biases
        pass # using pass statement as placeholder
    
    # Forward pass
    def forward(self, inputs):
        # calculate output values from inputs, weights and biases
        pass # using pass statement as placeholder

Weights are often initialized randomly, but not always: you can also start from something else, such as the weights of a pre-trained model. For now we will use random initialization.  
  
Next is the `forward` method. Passing data through a model from start to end is called a forward pass. Some architectures also loop data back, but we will perform a regular forward pass.  
  
Next, we add the random initialization of weights and biases to the `Layer_Dense` class:

In [4]:
import numpy as np
import nnfs

In [3]:
# Layer initialization
def __init__(self, n_inputs, n_neurons):
    self.weights = 0.01 * np.random.randn(n_inputs, n_neurons)
    self.biases = np.zeros((1, n_neurons))

We set the weights to random values and the biases to 0. The weights are initialized with shape (inputs, neurons) rather than (neurons, inputs), so we do not have to transpose them every time we perform a forward pass. Zero is the most common initialization for biases, although sometimes you may want to try something else, for example when you have dead neurons. This is related to activation functions: if weights * inputs + biases does not meet the threshold of the step function, the neuron outputs 0. For a single neuron this is not a big issue, but as an increasing number of neurons output 0, the network becomes essentially untrainable, or "dead".
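To make the shape choice concrete, here is a minimal sketch comparing the two weight layouts. The batch size, the layer sizes, and the names `out_direct` and `weights_alt` are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_neurons = 4, 3
inputs = rng.standard_normal((5, n_inputs))  # a batch of 5 samples

# Layout used in this section: (n_inputs, n_neurons), no transpose needed
weights = 0.01 * rng.standard_normal((n_inputs, n_neurons))
out_direct = np.dot(inputs, weights)

# Alternative layout: (n_neurons, n_inputs), transposed on every forward pass
weights_alt = weights.T.copy()
out_transposed = np.dot(inputs, weights_alt.T)

print(out_direct.shape)                         # (5, 3)
print(np.allclose(out_direct, out_transposed))  # True
```

Both layouts give the same result; storing the weights as (inputs, neurons) simply removes the transpose from the forward pass.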

`np.random.randn` and `np.zeros` are functions that create arrays. `np.random.randn` draws random numbers from a normal distribution with mean 0 and standard deviation 1. In general, neural networks work best with values between -1 and +1, as we will see later. We multiply the normally distributed weights by the scalar 0.01 to get values a couple of orders of magnitude smaller. Otherwise the model takes longer to fit the data during training, because the starting values are disproportionately large compared to the updates made during training. The idea is to start the model with small non-zero values that won't affect the training. We can experiment with other values of the scalar.  
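As a quick sanity check on the distribution, the sketch below draws a large sample from `np.random.randn` and looks at its statistics before and after scaling. The sample size and the seed are arbitrary choices for this illustration.

```python
import numpy as np

np.random.seed(0)

# Draw many values so the sample statistics are close to the true ones
samples = np.random.randn(100_000)
print(samples.mean(), samples.std())  # close to 0 and 1

# Scaling by 0.01 shrinks the spread by the same factor
scaled = 0.01 * samples
print(scaled.std())  # close to 0.01
```

The scaled values keep a mean near 0, but almost all of them now fall within a few hundredths of it.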
  
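`np.random.rand`, used in the demonstration below, takes the dimension sizes as separate arguments and returns an array of that shape. Unlike `np.random.randn`, it samples uniformly from [0, 1), but the calling convention is the same.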

In [5]:
nnfs.init()

print(np.random.rand(2,5))

[[0.5488135  0.71518937 0.60276338 0.54488318 0.4236548 ]
 [0.64589411 0.43758721 0.891773   0.96366276 0.38344152]]


In [6]:
print(np.random.rand(2,5))

[[0.79172504 0.52889492 0.56804456 0.92559664 0.07103606]
 [0.0871293  0.0202184  0.83261985 0.77815675 0.87001215]]


The printout is a 2x5 array, that is, an array of shape (2, 5).  
  
`np.zeros()` takes an array shape as its argument and returns an array of that shape filled with 0s.

In [7]:
print(np.zeros((2,5)))

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]


We will use this to initialize the biases with shape (1, n_neurons), a row vector, so we can add it to the dot product later without transposing it.
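The following sketch shows why the (1, n_neurons) shape adds cleanly: NumPy broadcasts the single bias row across every row of the batch. The array `products` is a hypothetical stand-in for the dot product result, and the biases are non-zero here only to make the effect visible.

```python
import numpy as np

n_neurons = 4

# Stand-in for np.dot(inputs, weights) with a batch of 3 samples
products = np.ones((3, n_neurons))

# Row vector of shape (1, n_neurons); non-zero values for illustration only
biases = np.array([[1.0, 2.0, 3.0, 4.0]])

# The single bias row is broadcast onto each of the 3 rows
output = products + biases

print(output.shape)  # (3, 4)
print(output[0])     # [2. 3. 4. 5.]
```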

In [8]:
# Example of how to initialize weights and biases
import numpy as np
import nnfs

nnfs.init()

n_inputs = 2
n_neurons = 4

weights = 0.01 * np.random.randn(n_inputs, n_neurons)
biases = np.zeros((1, n_neurons))

print(weights)
print(biases)

[[ 0.01764052  0.00400157  0.00978738  0.02240893]
 [ 0.01867558 -0.00977278  0.00950088 -0.00151357]]
[[0. 0. 0. 0.]]


**Let's run all the code at once!**

In [13]:
# Full Layer_Dense class so far:
import numpy as np
import nnfs
from nnfs.datasets import spiral_data

nnfs.init()

class Layer_Dense:
    
    # Layer initialization
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.01 * np.random.rand(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))
    
    # Forward pass
    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases
        
# Generate some data to use the new class instead of our hardcoded calculations
# to perform a forward pass

# dataset creation
X, y = spiral_data(samples=100, classes=3)

# create Dense Layer with 2 input features and 3 output values
dense1 = Layer_Dense(2, 3)

# perform a forward pass of our training data through this layer
dense1.forward(X)

# check the output of the first few samples
print(dense1.output[:5])

[[0.00000000e+00 0.00000000e+00 0.00000000e+00]
 [5.97437393e-05 3.68895635e-05 8.37819971e-05]
 [1.46999708e-04 9.15808050e-05 1.40212578e-04]
 [2.07372344e-04 1.30936343e-04 5.65641167e-05]
 [2.86790368e-04 1.80765084e-04 1.03854705e-04]]


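The output shows the first 5 samples, each with 3 values: one output from each of the 3 neurons in the `dense1` layer. Note that this listing draws its weights with `np.random.rand` (uniform on [0, 1)), so every weight is non-negative; `np.random.randn`, as in the earlier initialization snippet, would produce weights of both signs. What the model still lacks is an activation function.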
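To check the shapes without the `nnfs` dependency, the sketch below runs the same layer on random stand-in data. `X_demo` is a hypothetical replacement for the spiral data, sized at 300 rows on the assumption that `spiral_data(samples=100, classes=3)` returns 100 samples for each of the 3 classes. The weights here use `np.random.randn`, as in the earlier initialization snippet.

```python
import numpy as np


class Layer_Dense:

    # Layer initialization
    def __init__(self, n_inputs, n_neurons):
        self.weights = 0.01 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))

    # Forward pass
    def forward(self, inputs):
        self.output = np.dot(inputs, self.weights) + self.biases


np.random.seed(0)

# Hypothetical stand-in for the spiral data: 300 samples, 2 features each
X_demo = np.random.randn(300, 2)

layer = Layer_Dense(2, 3)
layer.forward(X_demo)

print(layer.output.shape)  # (300, 3)

# Recompute the first sample by hand to confirm what forward() does
manual = np.dot(X_demo[0], layer.weights) + layer.biases[0]
print(np.allclose(layer.output[0], manual))  # True
```

Each row of the output corresponds to one input sample, and each column to one neuron.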