# Neural Network from Scratch
In this notebook, I implement a configurable "vanilla" fully connected neural network from scratch, then use it to create text embeddings.

I built this project to practice translating mathematical operations (e.g., forward and backward prop, gradient descent) into vectorized code.

Author: [Ryan Parker](https://github.com/rparkr)

## Required packages

In [16]:
# built-in packages
import random  # pseudorandom number generation for setting seed values
from typing import Union, Optional  # type annotations for functions

# third-party packages
import numpy as np  # array-based computation

# Functions


In [176]:
def sigmoid(z: Union[float, np.ndarray]) -> Union[float, np.ndarray]:
    '''Calculate probabilities using the sigmoid function on input data, z.
    The sigmoid, or logistic, function is used as an activation for
    neurons of a neural network because it is non-linear.

    Parameters
    ----------
    z: {float, np.ndarray}
        The input to be passed through the sigmoid function.
        Typically, input values come from a linear combination
        of input values with weights and a bias term. That is,
        some form of y = wx + b.

    Returns
    -------
    sigmoid_value: {float, np.ndarray}
        The output of the sigmoid function, in the same
        shape as the input.
    '''
    return 1 / (1 + np.exp(-z))


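def sigmoid_df(z: Union[float, np.ndarray]) -> Union[float, np.ndarray]:
    '''Compute the derivative of the sigmoid function at z.

    Uses the identity sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)).
    '''
    s = sigmoid(z)
    return s * (1 - s)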


def relu(x_input: Union[float, np.ndarray]) -> Union[float, np.ndarray]:
    '''Compute the Rectified Linear Unit activation function.'''
    return np.maximum(x_input, 0)


def linear(
        x_input: np.ndarray,
        weights: np.ndarray,
        bias: Union[float, np.ndarray] = 0.0) -> np.ndarray:
    '''Compute a linear function: y = xW + b.
    
    Returns
    -------
    y_hat: the computed value(s) after multiplying
        inputs by the weights and adding a bias.
    '''
    # Matrix-multiplication with the @ operator
    y_hat = (x_input @ weights) + bias
    return y_hat
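
Because `linear` uses matrix multiplication, a single call can process a whole batch of samples. The sketch below restates `linear` and `relu` so that it runs on its own, then traces the array shapes. The batch size, feature count, neuron count, and parameter values are arbitrary choices made for illustration.

```python
import numpy as np


def linear(x_input, weights, bias=0.0):
    # Same computation as the notebook's linear(): y = xW + b
    return (x_input @ weights) + bias


def relu(x_input):
    # Same computation as the notebook's relu()
    return np.maximum(x_input, 0)


# Illustrative sizes (assumptions, not taken from the notebook)
m_samples, n_features, n_neurons = 4, 3, 2

x = np.ones((m_samples, n_features))        # one row per sample
w = np.full((n_features, n_neurons), 0.5)   # one column per neuron
b = np.array([1.0, -2.0])                   # one bias per neuron

z = linear(x, w, b)  # the bias broadcasts across all 4 rows
a = relu(z)          # negative entries are clipped to 0

print(z.shape)  # (4, 2): one output row per sample, one column per neuron
print(z[0])     # every row is 1.5 + b, i.e. 2.5 and -0.5
print(a[0])     # after ReLU: 2.5 and 0.0
```

The same code handles one sample or many because the sample axis only ever appears as rows of `x`; the weights and bias never depend on the batch size.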

# Network architecture
The model is a fully connected network with a configurable number of layers and a configurable number of neurons per layer. To enable that flexibility, I implement the model as a class: each instance is a neural network object that can be trained and used for inference.
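
To make the configuration concrete, here is a small sketch of how an input dimension and a list of neurons per layer could translate into weight-matrix shapes and a parameter count. `layer_shapes` and `count_params` are hypothetical helpers written for this illustration; they are not part of the class.

```python
def layer_shapes(input_dim, neurons_per_layer):
    """Return the (rows, columns) shape of each layer's weight matrix."""
    # Each layer's input size is the previous layer's neuron count;
    # the first layer takes the raw input features.
    input_sizes = [input_dim] + list(neurons_per_layer[:-1])
    return list(zip(input_sizes, neurons_per_layer))


def count_params(input_dim, neurons_per_layer, bias=True):
    """Count all weights plus, optionally, one bias per neuron."""
    n_weights = sum(
        rows * cols for rows, cols in layer_shapes(input_dim, neurons_per_layer))
    n_biases = sum(neurons_per_layer) if bias else 0
    return n_weights + n_biases


shapes = layer_shapes(input_dim=10, neurons_per_layer=[3, 4, 6, 2])
print(shapes)  # [(10, 3), (3, 4), (4, 6), (6, 2)]
print(count_params(10, [3, 4, 6, 2], bias=False))  # 78
print(count_params(10, [3, 4, 6, 2], bias=True))   # 93
```

Each layer's column count becomes the next layer's row count, which is what lets the output of one layer feed directly into the next.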

In [158]:
class nn():
    '''Create a fully connected neural network with a configurable number of layers and neurons.'''
    def __init__(
            self,
            input_dim: int,
            n_layers: int = 2, 
            neurons_per_layer: Union[int, list[int]] = [3, 1], 
            bias: Union[bool, list[bool]] = True,
            seed: int = random.randint(0, 99_999)):
        '''Instantiate a neural network as a model object that performs
        computations on input data and returns the outputs
        of those computations. Parameters (weights and biases) are
        initialized to random values drawn from a normal distribution
        with mean 0 and standard deviation 1.

        Parameters
        ----------
        input_dim: int
            The dimension of the inputs to the model (i.e., the number
            of features per input sample).
        n_layers: int, default = 2
            The number of layers in the network including all hidden layers
            and the output layer.
        neurons_per_layer: {int, list[int]}, default = [3, 1]
            The number of neurons in each layer of the network. If an `int`,
            then all layers will have the same number of neurons. If a
            `list`, then its `len()` must equal `n_layers`, where each
            element in the list holds the number of neurons in the 
            corresponding layer of the network.
        bias: {bool, list[bool]}, default = True
            Whether to include a bias term for the neurons in the network.
            If `True`, all neurons in the network will have a bias term.
            If a list of `bool` values is provided, each layer of the network
            will (or won't) have a bias term for its neurons based on the
            corresponding element in the list. If a list, the number of
            elements must equal `n_layers`. 
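        seed: int, default = a random integer from 0 to 99,999
            For reproducibility, set a seed that will be used
            when initializing parameter values. The default is drawn
            once, when the class is defined, so all models created in
            the same session without an explicit seed share it.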

        Returns
        -------
        self: an instance of `nn` with randomly-initialized weights.

        Methods
        -------
        forward(input): compute a forward pass through the network,
            returning the output from the final layer.
        backward(): compute a backward pass through the network,
            storing the partial derivatives of the loss
            with respect to the parameters.
        zero_grad(): clears (zeros-out) the stored gradients.
        step(learning_rate): update the parameters by taking
            a step (scaled by learning_rate) in the negative
            direction of the gradient.

        Attributes
        ----------
        w: a list of the network's layers, in order.
            Each layer is a numpy array of weights.
        b: a list of the bias values in each layer of the network,
            in order.
        n_params: the total number of parameters in the network,
            including weights and biases.
        gradients: a list of the gradients for each of the parameters
            in the network, computed after calling backward().
        input_dim: the size of the input that will be passed into the
            network. The network will expect all inputs to have this
            same dimension. Since each input sample is a vector, this
            is the number of features in an input row.
        n_layers: The number of layers in the network, including hidden
            layers and the output layer (excluding the input).
        neurons_per_layer: An integer or a list of integers representing
            the number of neurons in each layer.
        bias: A boolean value or a list of boolean values representing
            whether each layer has a bias term added to it.
        '''
        
        # Validate arguments
        if type(input_dim) != int:
            raise TypeError(f"input_dim must be an integer, but the provided type was: {type(input_dim)}")

        if type(n_layers) != int:
            raise TypeError(f"n_layers must be an integer, but the provided type was: {type(n_layers)}")
        
        if type(neurons_per_layer) not in (int, list):
            raise TypeError(f"neurons_per_layer must be an int or a list of int, not: {type(neurons_per_layer)}")
        elif type(neurons_per_layer) == list:
            if len(neurons_per_layer) != n_layers:
                raise ValueError(f"If neurons_per_layer is a list, it must have the same number of elements as n_layers ({n_layers}).")
            if any(type(i) != int for i in neurons_per_layer):
                raise TypeError("neurons_per_layer must be an int or a list of int. Not all elements provided in neurons_per_layer were of the int type.")
        
        if type(bias) not in (bool, list):
            raise TypeError(f"bias must be a bool or a list of bool, not {type(bias)}")
        elif type(bias) == list:
            if len(bias) != n_layers:
                raise ValueError(f"If bias is a list, it must have the same number of elements as n_layers ({n_layers}).")
            if any(type(i) != bool for i in bias):
                raise TypeError("bias must be a bool or a list of bool. Not all elements provided in bias were of the bool type.")

        if type(seed) != int:
            raise TypeError(f"Seed must be an int value, not {type(seed)}")

        # Store the constructor arguments as attributes
        self.input_dim = input_dim
        self.n_layers = n_layers
        self.neurons_per_layer = neurons_per_layer
        self.bias = bias
        self.seed = seed

        # If neurons_per_layer and bias are individual values, convert them 
        # to lists to use when creating layers
        if type(neurons_per_layer) == int:
            neurons_per_layer = [neurons_per_layer for _ in range(n_layers)]
        if type(bias) == bool:
            bias = [bias for _ in range(n_layers)]
    
        # Create layers (parameters)
        self.w = []
        self.b = []
        input_sizes = [input_dim]
        input_sizes.extend(neurons_per_layer[:-1])
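        for n in range(n_layers):
            # Every layer receives the same seed, so each layer's
            # generator starts from the same random stream.
            w, b = self._linear_layer(
                input_sizes[n], neurons_per_layer[n], bias[n], seed=seed)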
            self.w.append(w)
            self.b.append(b)
        
        # The second term counts bias terms only for layers with a bias
        self.n_params = (np.dot(input_sizes, neurons_per_layer) 
                         + np.dot(bias, neurons_per_layer))


    def _linear_layer(self, input_size: int, n_neurons: int, bias: bool, seed: int):
        '''Create a linear layer. Used when constructing the network
        at time of instantiation.
        
        The weight array has shape (input_size, n_neurons): one row per
        input feature and one column per neuron. If `bias` is True, the
        bias is a 1D NumPy array of length n_neurons; otherwise it is
        the scalar 0.
        '''
        # set up random number generator
        rng = np.random.default_rng(seed)
        # Generate the weight and bias arrays
        w = rng.normal(size=(input_size, n_neurons))
        if bias:
            b = rng.normal(size=n_neurons)
        else:
            b = 0
        return w, b


    def forward(self, input: np.ndarray, activation: str='relu'):
        '''Pass inputs through the network and return the outputs from
        the final network layer.
        
        Parameters
        ----------
        input: np.ndarray
            Input to the network, either a single sample (a 1D array of
            length n_features) or a batch of samples with shape
            (m_samples, n_features).
        activation: {'sigmoid', 'relu'}, default='relu'
            The activation function to apply after each linear layer,
            including the output layer. Any value other than 'sigmoid'
            falls back to ReLU.
        '''
        # Validate input
        input_size = input.shape[1] if input.ndim > 1 else input.shape[0]
        if input_size != self.input_dim:
            raise ValueError(f"Passed input of size {input_size} does not match"
                             f" expected size of input_dim: {self.input_dim}.")
        
        # Compute at each layer and pass to the next
        x = input
        for n in range(self.n_layers):
            y_hat = linear(x_input=x, weights=self.w[n], bias=self.b[n])
            if activation == 'sigmoid':
                y_hat = sigmoid(y_hat)
            else:
                y_hat = relu(y_hat)
            # Set input for next layer as output of previous
            x = y_hat
        
        return y_hat
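
The class docstring lists `backward()` and `step()`, which this cell does not define. As a rough sketch of the underlying math only, and not the author's implementation, the following runs one gradient descent step for a single sigmoid layer. The toy data, the mean squared error loss, and the learning rate are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (made up for illustration): 8 samples, 3 features, 1 target
x = rng.normal(size=(8, 3))
y = rng.uniform(size=(8, 1))

# Parameters of one sigmoid layer with a bias
w = rng.normal(size=(3, 1))
b = np.zeros(1)


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def forward_and_loss(w, b):
    a = sigmoid(x @ w + b)            # forward pass
    return a, np.mean((a - y) ** 2)   # mean squared error


learning_rate = 0.1
a, loss_before = forward_and_loss(w, b)

# Backward pass: chain rule from the loss back to the parameters
dloss_da = 2 * (a - y) / y.size   # d(MSE) / d(activation)
da_dz = a * (1 - a)               # derivative of the sigmoid
dloss_dz = dloss_da * da_dz
grad_w = x.T @ dloss_dz           # shape (3, 1), same as w
grad_b = dloss_dz.sum(axis=0)     # shape (1,), same as b

# Gradient descent step: move against the gradient
w = w - learning_rate * grad_w
b = b - learning_rate * grad_b

_, loss_after = forward_and_loss(w, b)
print(loss_before, loss_after)  # the loss should decrease slightly
```

A multi-layer version would repeat the chain-rule step once per layer, moving from the output back to the input, which requires keeping each layer's activations from the forward pass.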
        
        
    

In [159]:
model = nn(input_dim=10, n_layers=4, neurons_per_layer=[3, 4, 6, 2], bias=False)

In [175]:
x_input = np.arange(30).reshape(3, 10)
model.forward(x_input, activation='sigmoid')

array([[0.61917585, 0.25379785],
       [0.61917596, 0.25379776],
       [0.61917596, 0.25379776]])
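
The three output rows are nearly identical even though the input rows differ. One plausible reason, not verified against this model's weights, is saturation: inputs as large as 29 multiplied by standard-normal weights give large pre-activations, and the sigmoid is almost flat there. The sketch below shows the saturation and one common remedy, standardizing each feature before the forward pass. The scaling step is an assumption and is not part of the notebook.

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


x_input = np.arange(30).reshape(3, 10)

# Large pre-activations push the sigmoid to its flat extremes
print(sigmoid(np.array([-20.0, 0.0, 20.0])))  # approximately 0, 0.5, and 1

# Standardize each feature (column) to mean 0 and standard deviation 1
x_scaled = (x_input - x_input.mean(axis=0)) / x_input.std(axis=0)
print(x_scaled.mean(axis=0))  # approximately 0 for every feature
print(x_scaled.std(axis=0))   # approximately 1 for every feature
```

Passing `x_scaled` instead of `x_input` to `model.forward` keeps the first layer's pre-activations in a range where the sigmoid still responds to differences between samples.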