# Building a Neural Network From Scratch: Layers

In this notebook, we'll build a computational neuron and a neural network layer from scratch. (Note: we will not implement training or backpropagation from scratch.)

1. Neuron implementation in vanilla Python - scalar input, scalar output
2. Activation functions - how to implement, sigmoid, ReLU
3. Neuron layers - properly parallelized neural network layers in NumPy

## Neurons in Vanilla Python

A neuron includes the following:
1. A weight - the slope, or multiplier, applied to the input (a matrix in multidimensional settings)
2. A bias - the y-intercept of the linear equation (a vector in multidimensional settings)
3. An activation function - a nonlinear function applied to the linear output, which lets the network approximate nonlinear relationships
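
Written as an equation, the three pieces above combine as follows. The symbols ($x$ for the input, $w$ for the weight, $b$ for the bias, $f$ for the activation, $y$ for the output) are notation chosen here for illustration, not names taken from the code.

$$
y = f(w x + b)
\qquad \text{and, in the multidimensional case,} \qquad
\mathbf{y} = f(W \mathbf{x} + \mathbf{b})
$$

For example, with $w = 2$, $b = 1$, $x = 2$ and no activation (that is, $f(z) = z$), this gives $y = 2 \cdot 2 + 1 = 5$.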

In [1]:
class BasicNeuron:
    def __init__(self, weight: float, bias: float) -> None:
        self.weight = weight
        self.bias = bias

    def __call__(self, x: float) -> float:
        return self.weight * x + self.bias # running the linear transform on the input

In [2]:
# creating a neuron
weight = 2
bias = 1

neuron = BasicNeuron(weight, bias)

neuron(2) # should be 5

5

## Adding Activation Functions

There are many different types of activation functions:
- ReLU - a function that returns the input value if it's positive and 0 otherwise
- Sigmoid - an S-shaped curve that, in its standard form, lies between 0 and 1

In [3]:
# implementing ReLU from scratch

def relu(x):
    return max(0, x) # returns 0 if x < 0

# testing relu

print(relu(2)) # should be 2
print(relu(-2)) # should be 0

2
0


In [6]:
# Implementing sigmoid from scratch
from math import e # euler's number, used for the formula

def sigmoid(x):
    return 1 / (1 + e ** (-x))

print(sigmoid(0)) # should be 0.5
print(sigmoid(-10)) # should be near zero
print(sigmoid(10)) # should be near 1

0.5
4.539786870243442e-05
0.9999546021312976
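
One caveat about the formula above: for a very negative input such as `-1000`, `e ** (-x)` becomes too large for a float and Python raises an `OverflowError`. A minimal sketch of a numerically safer variant follows. The name `stable_sigmoid` is our own, and branching on the sign of `x` is one common way to do this, not the only one.

```python
from math import exp

def stable_sigmoid(x: float) -> float:
    """Sigmoid that only ever exponentiates a non-positive number."""
    if x >= 0:
        # exp(-x) is at most 1 here, so it cannot overflow
        return 1.0 / (1.0 + exp(-x))
    # for x < 0, use the algebraically equivalent form exp(x) / (1 + exp(x))
    z = exp(x)  # z is between 0 and 1
    return z / (1.0 + z)

print(stable_sigmoid(0))      # 0.5
print(stable_sigmoid(-1000))  # 0.0 (exp underflows to zero instead of overflowing)
print(stable_sigmoid(1000))   # 1.0
```

For moderate inputs this returns the same values as `sigmoid`; it differs only where the original formula would fail.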


In [7]:
# updated neuron with activation
from typing import Callable # type used for a function passed as an argument

class Neuron:
    def __init__(self, weight: float, bias: float, activation: Callable) -> None:
        self.w = weight
        self.b = bias
        self.activation = activation
    
    def __call__(self, x: float):
        o = self.w * x + self.b
        return self.activation(o)

In [8]:
# test with relu

neuron_relu = Neuron(1.5, 0.2, relu)
neuron_relu(2) # should be 3.2

3.2

In [9]:
# test with sigmoid

neuron_sig = Neuron(1.5, 0.2, sigmoid)
neuron_sig(0.5) # should be ≈0.72

0.7211151780228631

There are many other activation functions to try or implement, but we won't get into all of them here.

Some to look for later:
- Tanh - hyperbolic tangent
- ELU - exponential linear unit
- SiLU - sigmoid linear unit
- GLU - gated linear unit (parameterized)
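
As a starting point, here is a rough sketch of the first three in vanilla Python. The function names and the default `alpha=1.0` for ELU are assumptions made for this example. GLU is left out because it has its own learnable parameters.

```python
from math import exp, tanh

def tanh_activation(x: float) -> float:
    # hyperbolic tangent: squashes the input into the range (-1, 1)
    return tanh(x)

def elu(x: float, alpha: float = 1.0) -> float:
    # identity for positive inputs, smooth exponential curve toward -alpha for negative ones
    return x if x > 0 else alpha * (exp(x) - 1.0)

def silu(x: float) -> float:
    # the input multiplied by its own sigmoid; fine for moderate inputs,
    # but exp(-x) can overflow for very negative x
    return x / (1.0 + exp(-x))

print(tanh_activation(0.0))  # 0.0
print(elu(2.0))              # 2.0
print(silu(0.0))             # 0.0
```

Each of these takes a scalar and returns a scalar, so any of them can be passed as the `activation` argument of the `Neuron` class above.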

## Parallelized Neuron Layers in NumPy

Computing every neuron's output in a layer one at a time is tedious and slow, because each neuron needs its own Python-level call.

Instead, we can pass the inputs to all the neurons in a layer as a single vector. The layer then computes all of its outputs at once and returns them as a vector.
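
To make the idea concrete, the sketch below checks that looping over neurons one at a time gives the same result as a single matrix-vector product. The variable names (`W`, `b`, `x`, `looped`, `vectorized`) and the seeded random generator are choices made for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the sketch is repeatable
input_size, output_size = 5, 3

W = rng.random((output_size, input_size))  # one row of weights per output neuron
b = rng.random(output_size)                # one bias per output neuron
x = rng.random(input_size)                 # a single input vector

# one neuron at a time: each output is a dot product plus that neuron's bias
looped = np.array([np.dot(W[i], x) + b[i] for i in range(output_size)])

# the whole layer at once: a single matrix-vector product
vectorized = W @ x + b

print(np.allclose(looped, vectorized))  # True
```

Both versions compute the same numbers. The second simply hands the loop to NumPy instead of running it in Python.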

In [23]:
import numpy as np

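# defining a relu that works on numpy arrays

def parallelized_relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(0, x) # element-wise max; np.vectorize(relu) is a Python loop underneath and can truncate floats to ints when the first output is the integer 0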

# our new neuron will look very similar to the original, because numpy applies regular operators element-wise across arrays
# the main changes are the types and the use of matrix multiplication

class ParallelizedNeuron:
    def __init__(self, weight: np.ndarray, bias: np.ndarray, activation: Callable) -> None:
        self.w = weight
        self.b = bias
        self.act = activation
    
    def __call__(self, x: np.ndarray) -> np.ndarray: # takes the whole input vector and returns the whole output vector
        o = np.matmul(self.w, x) + self.b
        return self.act(o)

In [24]:
# testing our implementation

# defining how many neurons of input (the input vector size) and how many neurons in output (output vector size)
input_size = 5
output_size = 3

# generating a weight matrix with random values
weight = np.random.rand(output_size, input_size) # this matrix projects from input size to output size; its shape is (output_size, input_size)

# the bias is added after the vector is transformed from input_size to output_size, so it should be of size output_size
bias = np.random.rand(output_size)

neuron = ParallelizedNeuron(weight, bias, parallelized_relu)

In [25]:
# testing our neuron

x = np.random.rand(input_size) # a 1-D vector; an (input_size, 1) column would broadcast against the 1-D bias and give a 3x3 result

neuron(x)

array([1.34583703, 1.58441558, 1.76240472])

Now that we use NumPy, a single matrix operation computes a whole layer, which is much faster than looping over neurons in Python.
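
The same idea extends to several inputs at once. The sketch below is an illustration under assumptions: the `layer` helper, the `batch_size` of 4, and the row-per-sample layout (`x @ W.T`) are choices made here and do not come from the classes above. It also shows a shape pitfall worth knowing about.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the sketch is repeatable
input_size, output_size, batch_size = 5, 3, 4

W = rng.random((output_size, input_size))
b = rng.random(output_size)

def layer(x: np.ndarray) -> np.ndarray:
    """ReLU layer; accepts shape (input_size,) or (batch_size, input_size)."""
    return np.maximum(0, x @ W.T + b)

single = rng.random(input_size)               # one input vector
batch = rng.random((batch_size, input_size))  # four input vectors, one per row

print(layer(single).shape)  # (3,)
print(layer(batch).shape)   # (4, 3)

# pitfall: a (5, 1) column vector gives a (3, 1) product, and adding the
# 1-D bias of shape (3,) broadcasts that into a (3, 3) array
column = single.reshape(input_size, 1)
print((W @ column + b).shape)  # (3, 3)
```

Keeping single inputs 1-D, or giving the bias an explicit column shape, avoids the unintended broadcast.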

We will soon switch to PyTorch, which can automatically compute gradients of vector and matrix operations; the neural network will use those gradients to learn over time.