Documentation

Documentation of the nano-keras library

Overview

Models

In this section we'll cover the main part of nano-keras, the models available.

Currently there is only one model available, the NN class, which you can import with:

from nano_keras.models import NN

Its design and functionality are very similar to keras.models.Sequential, as it also lets the user stack layers on top of each other

Now with that out of the way let's get straight to what the NN class allows the user to do

Functions available:


NN.__init__()

NN.__init__() is the initializer of the NN class, used to create an instance of it

Usage example:

from nano_keras.models import NN

model = NN()

Parameters the function takes:

name (str): Name of the model. Defaults to NN

NN.get_weights()

NN.get_weights() returns the weights of the model

It does so by looping over each layer and then calling np.copy() on each layer's weights.

We have to use np.copy() because if we simply appended references to the weight arrays, the returned weights would change whenever we modified them elsewhere
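A rough sketch of that copying loop, assuming a hypothetical weights attribute on each layer (illustrative only, not the actual nano-keras source):

import numpy as np

def get_weights_sketch(layers):
    weights = []
    for layer in layers:
        layer_weights = getattr(layer, "weights", None)  # attribute name is assumed
        if layer_weights is not None:
            # np.copy breaks the reference, so editing the returned arrays
            # won't silently change the model's weights
            weights.append(np.copy(layer_weights))
    return weights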

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Dense

model = NN() # Initializing the NN class with default parameters
model.add(Input(5)) # Adding the Input layer with shape of (5,)
model.add(Dense(1, "relu")) # Adding the output layer

model.compile() # Compiling the model and generating its weights

weights = model.get_weights() # Calling NN.get_weights() and assigning the output to the weights variable

The function doesn't take any parameters


NN.set_weights()

NN.set_weights() is used to set the model's weights to values you provide

It loops over the layers and checks whether each layer type has parameters, then tries to set its weights; if it fails to do so it exits the program

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Dense
import numpy as np

model = NN() # Initializing the NN with default parameters
model.add(Input(5)) # Adding the input layer
model.add(Dense(25, "relu")) # Adding the output layer

model.compile() # Compiling the model, which generates the weights

input_weights = np.array([]) # Input layer has no weights so we initialize it with an empty array
dense_weights = np.random.randn(5, 25) # Dense layer weights shape is: (previous layer output, current layer units)

model.set_weights([input_weights, dense_weights]) # Setting the weights to those values

NN.add()

NN.add() adds another layer to the model. Layers are stacked on top of each other, as no other topology has been implemented yet.

Note that the first layer has to be an Input layer

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Dense

model = NN() # Creating the instance of NN class

model.add(Input(5)) # Adding the Input layer. **It has to be the first element** as currently it's the only layer with an input_shape parameter
model.add(Dense(25, "relu")) # Adding the first layer
# You can add as many layers as you want

Parameters the function takes:

Layer (Layer): Instance of the layer you wish to add, with all the required parameters set

NN.summary()

NN.summary() is used to print out the model's architecture, including each layer's output shape, the number of parameters and the model's name

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Dense

model = NN() # Creating the instance of NN class
model.add(Input(5)) # Adding the input layer, it is required as none of the other layers have the input_shape parameter
model.add(Dense(25, "relu", name="Hidden layer")) # Adding the first hidden layer
model.add(Dense(5, "relu", name="Output layer")) # Adding the output layer

model.compile() # Used to compile the model and generate the weights. It is explained later in the documentation

model.summary() # Calling the NN.summary() function with default parameters

"""
It prints out the following:

Model: NN
_________________________________________________________________
Layer (type)                Output Shape              Param #
=================================================================
Input (Input)                (None, 5)                 0

Hidden layer (Dense)         (None, 25)                150

Output layer (Dense)         (None, 5)                 130

=================================================================
Total params: 280 (2.188 kb)
_________________________________________________________________
"""

Parameters the function takes:

line_length (int): How long the printed lines should be. Defaults to 65

NN.compile()

Function to compile the model and generate its weights

It takes in the loss function, optimizer, metrics and weight data type, and assigns them to instance variables

Later it calls self.generate_weights() to create the weights and biases of the model
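Conceptually, compile boils down to something like the following sketch (attribute and helper names are illustrative, not the real source):

import numpy as np

class CompileSketch:
    def compile(self, loss="mse", optimizer="adam", metrics="", weight_data_type=np.float64):
        # Store the training configuration on the instance
        self.loss_function = loss
        self.optimizer = optimizer
        self.metrics = metrics
        self.weight_data_type = weight_data_type
        # Then build the weight and bias arrays for every layer
        self.generate_weights()

    def generate_weights(self):
        # Placeholder: the real library creates each layer's weights here,
        # using its weight initialization strategy and weight_data_type
        pass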

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Dense

model = NN() # Creating the model
model.add(Input(5)) # Adding the input layer with shape (5,)
model.add(Dense(5, "relu")) # Adding the first hidden layer with 5 neurons and relu activation function
model.add(Dense(1 "sigmoid")) # Adding the output layer with 1 neuron and sigmoid activation function

# Compiling the model with the mse loss function and the NAdam optimizer; metrics is set to accuracy so we can see the accuracy during training
model.compile(loss="mse", optimizer="nadam", metrics="accuracy") 

Parameters the function takes:

loss (Loss | str): Loss function the model should use. It can be either an initialized class or the name of the loss
optimizer (Optimizer | str): Optimizer the model should use to update its weights and biases. Once again, it can be either an Optimizer class instance or the name of the optimizer as a str
metrics (str): Metrics of the model. For now the only option is accuracy, but that might change in the future. If it's set to accuracy, the model's accuracy will be displayed during training
weight_data_type (np.float_): Data type of the weights and biases. Setting it to a lower precision can reduce the model's size and training time. Default is np.float64

NN.feed_forward()

NN.feed_forward() is the implementation of the feed-forward algorithm for the whole network

The feed-forward algorithm takes input data and passes it through the layers, each of which computes an output from the data it receives

That output is then passed to the next layer, and this continues up to the last layer - the output layer

In nano-keras this simply calls each layer's feed-forward function in sequence, as sketched below
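A minimal sketch of that loop, with plain functions standing in for layers (the real layer call also receives is_training):

import numpy as np

def feed_forward_sketch(layers, x):
    # Each layer consumes the previous layer's output and passes it forward
    output = x
    for layer in layers:
        output = layer(output)
    return output

# Toy demonstration: a "double" layer followed by a relu-like layer
layers = [lambda v: v * 2, lambda v: np.maximum(v, 0)]
print(feed_forward_sketch(layers, np.array([-1.0, 2.0])))  # [0. 4.]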

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Dense
import numpy as np

x = np.random.randn(25) # Initializing some random values using the normal distribution in the shape of (25,)

model = NN() # Creating the instance of NN class
model.add(Input(25)) # Adding the input layer, it is required as none of the other layers have the input_shape parameter
model.add(Dense(25, "relu")) # Adding the first hidden layer
model.add(Dense(5, "relu")) # Adding the second hidden layer
model.add(Dense(1, "relu")) # Adding the output layer

model.compile() # Compiling the model. Will be explained later in the documentation

model.feed_forward(x) # Calling the feed forward function with x as its input

Parameters the function takes:

X (np.ndarray): X dataset on which we want the model to compute an output
is_training (bool): Parameter to specify whether we are in the training loop or not. It changes the behaviour of a few layers; for example, the Dropout layer doesn't drop connections when it's set to False

NN.backpropagate()

NN.backpropagate() implements the backpropagation algorithm in nano-keras.

Backpropagation is used to update the weights of the model based on the error it made during the feed-forward pass

Similarly to NN.feed_forward, we call each layer's backpropagation function during training; the only difference is that we go from the back to the front, as sketched below
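Schematically, the reversed pass looks something like this (illustrative only; it assumes each layer exposes the backpropagate(gradient, optimizer) method described in the Layers section):

def backpropagate_sketch(layers, gradient, optimizer):
    # Walk the layers from the output back towards the input; each layer
    # updates its own parameters and hands the new gradient to the layer before it
    for layer in reversed(layers):
        gradient = layer.backpropagate(gradient, optimizer)
    return gradient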

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Dense
import numpy as np

X = np.random.randn(25, 5)  # We have 25 elements each with 5 inputs
y = np.zeros((25, 1))

model = NN()  # Creating the instance of NN class
# Adding the input layer, it is required as none of the other layers have the input_shape parameter
model.add(Input(5))
model.add(Dense(1, "relu", name="Output layer"))  # Adding the output layer

model.compile()  # Used to compile the model and generate the weights. It is explained later in the documentation

# Calling the get_weights method as I want to see if the backpropagation updated the weights
previous_weights = model.get_weights()

model.backpropagate(X, y, 2, 1)  # Calling the backpropagation

# Calling get_weights to get the current weights
new_weights = model.get_weights()

# Comparing the old and new weights; this prints False because the weights changed, which means backpropagation works
print(np.array_equal(previous_weights, new_weights))

NN.train()

NN.train() is used to train the model on specified data. It's the equivalent of the keras.models.Sequential.fit() function, although without batches

It works by looping over all the epochs: it calls self.backpropagate() on the data, then calls self.evaluate(), handles the callbacks, prints the progress and continues, roughly as sketched below
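An outline of that loop under these assumptions (the backpropagate argument order is a guess for illustration, and callback handling is omitted here):

def train_sketch(model, X, y, epochs, validation_data=None):
    # Illustrative outline only; the real NN.train() also handles callbacks,
    # verbosity levels and metric selection
    for epoch in range(epochs):
        model.backpropagate(X, y, epochs, epoch)   # update the weights (argument order assumed)
        metrics = model.evaluate(X, y)             # loss, plus accuracy if it's enabled
        if validation_data is not None:
            val_metrics = model.evaluate(*validation_data)  # val_loss / val_accuracy
        print(f"Epoch {epoch + 1}/{epochs}: {metrics}")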

Usage example:

# For the full code see https://github.com/MarcelWinterot/nano-keras/blob/main/demos/demo1.py
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
from nano_keras.losses import MSE
from nano_keras.optimizers import NAdam
from nano_keras.callbacks import EarlyStopping

model = NN() # Initializing the model
model.add(Input(5)) # Adding the input layer with shape of (5,)
model.add(Dense(25, "relu")) # Adding the first hidden layer with 25 neurons and relu activation function
model.add(Dense(10, "relu")) # Adding the second hidden layer with 10 neurons and relu activation function
model.add(Dense(5, "relu")) # Adding the third hidden layer with 5 neurons and relu activation function
model.add(Dense(1, "sigmoid")) # Adding the output layer with 1 neuron and sigmoid activation function

optimizer = NAdam() # Initializing the optimizer
loss = MSE() # Initializing the loss function we'll use

stop = EarlyStopping(5, "val_accuracy", restore_best_weights=True) # Initializing the early stopping callback with a patience of 5; it monitors val_accuracy and restores the best weights when training stops

model.compile(loss, optimizer, metrics="accuracy") # Compiles the model which generates the weights and assigns the loss and optimizer

model.train(X_train, y_train, 50, verbose=2,
            validation_data=(X_test, y_test), callbacks=stop) # Calls the train function

Parameters the function takes:

X (np.ndarray): X dataset on which the model predicts
y (np.ndarray): y dataset, which is the correct answer to each prediction the model made
epochs (int): How many training loops we want the model to make
callbacks (EarlyStopping): Callbacks the model should use. It's not required, and if it isn't set no callbacks are used
verbose (int): Controls what we print out during training. 0 - nothing, 1 - only epoch/epochs with accuracy and loss, 2 - all information
validation_data (tuple[np.ndarray]): What validation data do you want to use to check val_loss and val_accuracy. It should be (X_validation, y_validation). If it isn't set there will be no validation predictions after each epoch.

NN.evaluate()

The NN.evaluate() function is used to check the model's loss and accuracy on given data

It works by calling self.feed_forward() on the given X data, and then self.loss_function.compute_loss() on the predictions and the y data.

Accuracy is calculated by checking whether the argmax of each prediction matches the argmax of the corresponding y value

Note that accuracy is only calculated if the metrics string contains "accuracy". A sketch of these steps follows.
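Put together, the steps above look roughly like this (the compute_loss argument order and attribute names are assumptions, not the library source):

import numpy as np

def evaluate_sketch(model, X, y, metrics="accuracy"):
    predictions = np.array([model.feed_forward(x) for x in X])   # predict every sample
    loss = model.loss_function.compute_loss(y, predictions)      # argument order assumed
    if "accuracy" in metrics:
        # a prediction counts as correct when its argmax matches the label's argmax
        accuracy = np.mean(np.argmax(predictions, axis=1) == np.argmax(y, axis=1))
        return loss, accuracy
    return loss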

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Dense
import numpy as np

model = NN() # Creating the model
model.add(Input(5)) # Adding the input layer with shape (5,)
model.add(Dense(5, "relu")) # Adding the output layer with 5 neurons and relu activation function

model.compile(metrics="accuracy") # Compiling the model with default params and accuracy as its metrics

X = np.random.randn(25, 5) # Creating the X dataset
y = np.random.randn(25, 5) # Creating the y dataset

loss, acc = model.evaluate(X, y) # Calling the evaluate function with X and y as its parameters; the rest are default

NN.save()

Function to save the model's weights and biases to a file

It works by first getting the model's weights and biases, appending them to an array and saving it with np.save(), roughly as sketched below
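A rough sketch of that saving step (not the actual implementation):

import numpy as np

def save_sketch(model, file_path):
    # Collect every layer's parameters into one object array (layers differ in shape)
    params = np.array(model.get_weights(), dtype=object)
    # np.save appends the .npy extension itself, hence no extension in file_path
    np.save(file_path, params)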

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Dense

model = NN()  # Creating the model
model.add(Input(5))  # Adding the input layer into the model
# Adding the dense layer with 25 neurons and relu activation function into the model
model.add(Dense(25, "relu"))
# Adding the output layer with 5 neurons and relu activation function into the model
model.add(Dense(5, "relu"))

model.compile()  # Compiling the model with default arguments

model.save("./saved_model")  # Saving the model at ./saved_model path

Parameters the function takes:

file_path (str): Path where we want to save the model. Don't add the file extension, as numpy handles it

NN.load()

Function to load the model's parameters from a file and set them

It works by opening the file, loading the data, and then looping over the layers and assigning the loaded parameters to their weights and biases, roughly as sketched below
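A rough counterpart to the save sketch above (the layers attribute and per-layer assignment are assumptions for illustration):

import numpy as np

def load_sketch(model, file_path):
    # Here the path must include the .npy extension
    params = np.load(file_path, allow_pickle=True)
    for layer, layer_params in zip(model.layers, params):  # attribute name assumed
        layer.weights = layer_params  # assign the stored parameters back to the layer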

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Dense

model = NN()  # Creating the model
model.add(Input(5))  # Adding the input layer into the model
# Adding the dense layer with 25 neurons and relu activation function into the model
model.add(Dense(25, "relu"))
# Adding the output layer with 5 neurons and relu activation function into the model
model.add(Dense(5, "relu"))

model.compile()  # Compiling the model with default parameters

# Loading the model parameters and assigning them
model.load("./saved_model.npy")

Parameters the function takes:

file_path (str): The path where the model was saved. This time you have to add the extension, as otherwise loading doesn't work

Layers

Information about how all the layers implemented in nano-keras work and how to use them

We'll cover:

Input

Dense

Dropout

Flatten

Reshape

MaxPool1D

MaxPool2D

Conv1D

Conv2D


But before diving into the layers, let me explain what functions each layer class has

Functions of each layer class

__init__()

Initializes the layer using the given parameters. Varies depending on the layer

output_shape()

Computes the output shape of the layer. Its parameters are:
layers[list] - list of all the layers
current_layer_index[int] - which layer we are currently on

__repr__()

Overrides the layer's string representation so that printing it looks good

call()

Call function (feed forward) for the layer. Its parameters are:
X[np.ndarray] - X dataset on which we want to predict
is_training[bool] - specifies whether we are in the training loop or not

backpropagate()

Backpropagation algorithm for each layer. Its parameters are:
gradient[np.ndarray] - input gradient to the layer
optimizer[Optimizer | list[Optimizer]] - optimizer the layer should use to update the model's parameters

And now let's move on to the layers


Input

The simplest layer we will cover here, but one of the most important, if not the most important

The Input layer is responsible for getting the training data and passing it to the next layers

In nano-keras it needs to be the first layer of the network, as the input_shape parameter isn't implemented in any other layer

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input

model = NN() # Creating the model
model.add(Input(input_shape = (28, 28, 1))) # Adding the Input layer with input shape of (28, 28, 1)

Parameters the layer takes:

input_shape (tuple | int): Input shape of the neural network. It's the shape of items in the dataset passed on to the model during training
name (str): Name of the layer, helps with debugging if you run into a problem. Defaults to Input

Dense

The Dense layer, also known as a fully connected layer, connects every neuron of the previous layer to each neuron of the current layer, forming a dense, interconnected network

Each connection is associated with a weight, which the network adjusts during training to learn patterns and make predictions.

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Dense, Input

model = NN() # Creating the model
model.add(Input(input_shape=5)) # Adding the Input layer with input shape of (5,)
model.add(Dense(units=5, activation="relu")) # Adding the Dense layer with 5 neurons and relu activation function; all the other parameters have default values

Parameters the layer takes:

units (int): The number of neurons in the current layer
activation (Activation | str): Activation function the layer should use. Activation functions introduce non-linearity to the model, allowing it to learn complex patterns.
weight_initialization (str): Weight initialization strategy the layer should use in order to generate the weights. Currently there are 3 options, those being: random, xavier, he. Default value is random
regulizer (Regulizer): Regulizer the model should use. We use this to punish the layer for having big weights. This helps if you are struggling with overfitting
name (str): Name of the layer, helps with debugging if you run into a problem. Defaults to Dense

Dropout

The Dropout layer drops connections between neurons to prevent overfitting

In nano-keras it is used the same way as the Dense layer, with the addition of the dropped connections

The dropped connections are set to 0, as illustrated below
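A toy illustration of the idea, zeroing out a random fraction of the outputs during training (this mirrors the concept, not nano-keras' exact code):

import numpy as np

rng = np.random.default_rng(0)
output = rng.standard_normal(10)       # pretend this is a layer's output
dropout_rate = 0.2
mask = rng.random(10) >= dropout_rate  # keep roughly 80% of the connections
dropped_output = output * mask         # dropped positions become 0
print(dropped_output)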

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Dropout

model = NN() # Creating the model
model.add(Input(input_shape=5)) # Adding the input layer with shape of (5,)
# Adding the Dropout layer with 25 neurons, relu activation function and dropout_rate=0.2; all the other parameters have default values
model.add(Dropout(units=25, activation="relu", dropout_rate=0.2))

Parameters the layer takes

units (int): The number of neurons in the current layer
activation (Activation | str): Activation function the layer should use. Activation functions introduce non-linearity to the model, allowing it to learn complex patterns.
dropout_rate (float): The fraction of connections we should drop; for example, 0.2 means 20%
weight_initialization (str): Weight initialization strategy the layer should use in order to generate the weights. Currently there are 3 options, those being: random, xavier, he. Default value is random
regulizer (Regulizer): Regulizer the model should use. We use this to punish the layer for having big weights. This helps if you are struggling with overfitting
name (str): Name of the layer, helps with debugging if you run into a problem. Defaults to Dropout

Flatten

The Flatten layer is used to flatten the output of the previous layer from an n-dimensional array into a 1-dimensional array

One of its uses is in a CNN, to pass the output of a 2d layer to a 1d layer

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Flatten, Dense

model = NN() # Creating the model
model.add(Input((28, 28, 1))) # Adding the input layer with input_shape=(28, 28, 1)
model.add(Flatten()) # Adding the flatten layer
model.add(Dense(120, "relu")) # Adding the output layer with 120 neurons and relu activation function

Parameters the layer takes:

name (str): Name of the layer, helps with debugging if you run into a problem. Defaults to Flatten

Reshape

The Reshape layer is used to change the shape of the previous layer's output into a desired shape without changing its values

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Reshape

model = NN() # Creating the model
model.add(Input(25)) # Adding the input layer with input_shape of (25,)
model.add(Reshape(target_shape=(5, 5, 1))) # Adding the Reshape layer with target_shape of (5, 5, 1)

Parameters the layer takes:

target_shape (tuple): Shape the layer should reshape the input into
name (str): Name of the layer, helps with debugging if you run into a problem. Defaults to Reshape

MaxPool1D

MaxPool1D is a 1d pooling layer used to reduce the size of the data passed to the next layer

It uses a small pooling window which it slides over the input data

It then picks the maximum value in each window and places it in the output array, as sketched below
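A small numpy sketch of that sliding-window maximum (illustrative, not the library's implementation):

import numpy as np

def max_pool_1d_sketch(x, pool_size=2, strides=2):
    # Slide a window of pool_size over x and keep the maximum of each window
    out_len = (len(x) - pool_size) // strides + 1
    return np.array([x[i * strides : i * strides + pool_size].max()
                     for i in range(out_len)])

print(max_pool_1d_sketch(np.array([1, 3, 2, 5, 4, 0, 7, 6, 9, 8])))  # [3 5 4 7 9]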

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, MaxPool1D

model = NN() # Creating the model
model.add(Input(10)) # Adding the input layer with input_shape of (10,)
model.add(MaxPool1D(pool_size=2, strides=2)) # Adding the MaxPool1D layer with pool size of 2 and strides = 2. The output is (5,)

Parameters the layer takes:

pool_size (int): The size of the pooling window. The bigger it is the smaller the output will be. Defaults to 2
strides (int): Step of the pooling window. Also the bigger it is the smaller the output will be. Defaults to 2
name (str): Name of the layer, helps with debugging if you run into a problem. Defaults to MaxPool1D

MaxPool2D

Another max pooling layer, but this time for 2d data

The other change compared to MaxPool1D is that the pooling window is 2d

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, MaxPool2D

model = NN() # Creating the model
model.add(Input((28, 28, 1))) # Adding the input layer with input_shape of (28, 28, 1)
model.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2))) # Adding MaxPool2D layer with pool_size of (2, 2) and strides of (2, 2). Output is (14, 14, 1)

Parameters the layer takes:

pool_size (tuple): The size of the pooling window. The bigger it is the smaller the output will be. Defaults to (2, 2)
strides (tuple): Step of the pooling window. Also the bigger it is the smaller the output will be. Defaults to (2, 2)
name (str): Name of the layer, helps with debugging if you run into a problem. Defaults to MaxPool2D

Conv1D

Conv1D is a 1d convolutional layer used to extract features from the input channels

It uses a 1d kernel and filters to calculate the output. First the kernel slides over the data.

Then each filter performs a dot product with the data under the kernel, as sketched below
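The sliding-and-dot-product idea can be sketched in numpy like this (the shapes and orientation are assumptions, not the library's exact layout):

import numpy as np

def conv1d_sketch(x, kernels, strides=2):
    # x: (length, in_channels), kernels: (kernel_size, in_channels, filters)
    kernel_size, _, filters = kernels.shape
    out_len = (x.shape[0] - kernel_size) // strides + 1
    out = np.zeros((out_len, filters))
    for i in range(out_len):
        window = x[i * strides : i * strides + kernel_size]  # slide the kernel over the data
        # dot product between the window and every filter at once
        out[i] = np.tensordot(window, kernels, axes=([0, 1], [0, 1]))
    return out

x = np.random.randn(28, 1)
kernels = np.random.randn(2, 1, 32)
print(conv1d_sketch(x, kernels).shape)  # (14, 32)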

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Conv1D

model = NN() # Creating the model
model.add(Input((28, 1))) # Adding the input layer with input_shape of (28, 1)
model.add(Conv1D(filters=32, kernel_size=2, strides=2)) # Adding Conv1D layer with 32 filters, kernel_size of 2 and strides of 2. Output is (14, 32)

Parameters the layer takes:

filters (int): Number of filters the layer should use. Defaults to 1
kernel_size (int): Size of the kernel the layer uses. The bigger it is the smaller the output. Defaults to 2
strides (int): By how much the kernel should move. The bigger it is the smaller the output. Defaults to 2
activation (Activation | str): Activation function the layer should use. Activation functions introduce non-linearity to the model, allowing it to learn complex patterns.
weight_initialization (str): Weight initialization strategy the layer should use in order to generate the weights. Currently there are 3 options, those being: random, xavier, he. Default value is he
regulizer (Regulizer): Regulizer the model should use. We use this to punish the layer for having big weights. This helps if you are struggling with overfitting
name (str): Name of the layer, helps with debugging if you run into a problem. Defaults to Conv1D

Conv2D

The last layer currently implemented is Conv2D

It's the second convolutional layer in nano-keras, but this time it uses a 2d kernel

Despite its name, it's actually meant for 3d data, for example (height, width, channels) in the case of an image

Its default implementation uses a kernel that slides over the input data and performs a dot product between the filters and the data under the kernel.

But we can speed it up using the im2col technique, which converts the image into columns and lets us abandon for loops in favour of a single np.dot(weights, x), as sketched below.
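A compact sketch of the im2col idea: unfold every kernel-sized patch into a row, then do the whole convolution as one matrix multiplication (the shapes and the matmul orientation here are illustrative assumptions):

import numpy as np

def im2col_sketch(image, kernel_size, strides):
    h, w, c = image.shape
    kh, kw = kernel_size
    sh, sw = strides
    out_h = (h - kh) // sh + 1
    out_w = (w - kw) // sw + 1
    cols = np.zeros((out_h * out_w, kh * kw * c))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * sh:i * sh + kh, j * sw:j * sw + kw]
            cols[i * out_w + j] = patch.reshape(-1)   # one patch per row
    return cols, (out_h, out_w)

image = np.random.randn(28, 28, 1)
filters = np.random.randn(32, 2 * 2 * 1)          # 32 filters, flattened to rows
cols, (out_h, out_w) = im2col_sketch(image, (2, 2), (2, 2))
output = cols @ filters.T                          # one matrix multiply instead of nested loops
print(output.reshape(out_h, out_w, 32).shape)      # (14, 14, 32)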

Usage example:

from nano_keras.models import NN
from nano_keras.layers import Input, Conv2D

model = NN() # Creating the model
model.add(Input((28, 28, 1))) # Adding the input layer with input_shape = (28, 28, 1)
model.add(Conv2D(32, kernel_size=(2, 2), strides=(2, 2))) # Adding Conv2D layer with 32 filters, kernel_size of (2, 2) and strides = (2, 2). Output is (14, 14, 32)

Parameters the layer takes:

filters (int): Number of filters the layer should use. Defaults to 1
kernel_size (tuple): Size of the kernel the layer uses. The bigger it is the smaller the output. Defaults to (2, 2)
strides (tuple): By how much the kernel should move. The bigger it is the smaller the output. Defaults to (2, 2)
activation (Activation | str): Activation function the layer should use. Activation functions introduce non-linearity to the model, allowing it to learn complex patterns.
weight_initialization (str): Weight initialization strategy the layer should use in order to generate the weights. Currently there are 3 options, those being: random, xavier, he. Default value is he
regulizer (Regulizer): Regulizer the model should use. We use this to punish the layer for having big weights. This helps if you are struggling with overfitting
name (str): Name of the layer, helps with debugging if you run into a problem. Defaults to Conv2D

Optimizers

Loss functions

Activation functions

Weight regulizers
