Documentation
from nano_keras.models import NN
Its design and functionality are very similar to keras.models.Sequential, as it also allows the user to stack layers on top of each other.
from nano_keras.models import NN
model = NN()
We had to use np.copy(), as simply appending the weights would return references that change whenever the underlying arrays are modified.
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
model = NN() # Initializing the NN class with default parameters
model.add(Input(5)) # Adding the Input layer with shape of (5,)
model.add(Dense(1, "relu")) # Adding the output layer
model.compile() # Compiling the model and generating its weights
weights = model.get_weights() # Calling the NN.get_weights() function and assigning the output to the weights variable
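A minimal sketch of why those copies matter; the weight array below is a stand-in, not the library's internals:
import numpy as np
layer_weights = np.ones((5, 1)) # Stand-in for a layer's weight matrix
reference = layer_weights # Appending without np.copy() would share memory like this
copied = np.copy(layer_weights) # np.copy() gives an independent array
reference[0, 0] = 0.0 # Mutating the reference...
print(layer_weights[0, 0]) # ...changes the model's weights: prints 0.0
print(copied[0, 0]) # The copy is unaffected: prints 1.0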
It loops over the layers and checks whether each layer type has parameters, then tries to set the weights; if it fails to do so, it exits the program.
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
import numpy as np
model = NN() # Initializing the NN with default parameters
model.add(Input(5)) # Adding the input layer
model.add(Dense(25, "relu")) # Adding the output layer
model.compile() # Compiling the model, which generates the weights
input_weights = np.array([]) # The Input layer has no weights so we initialize it with an empty array
dense_weights = np.random.randn(5, 25) # Dense layer weights shape is: (previous layer output, current layer units)
model.set_weights([input_weights, dense_weights]) # Setting the weights to those values
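A rough sketch of the loop described above; the layer attribute, the shape check and the error handling are assumptions about the internals, not the library's actual code:
import sys
import numpy as np
def set_weights_sketch(layers, weights):
    for layer, new_weights in zip(layers, weights):
        if not hasattr(layer, "weights"): # This layer type has no parameters, skip it
            continue
        try:
            if np.shape(new_weights) != np.shape(layer.weights):
                raise ValueError("Shape mismatch")
            layer.weights = new_weights # Hypothetical attribute assignment
        except Exception as error:
            print(f"Failed to set the weights: {error}")
            sys.exit(1) # The library exits the program on failure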
NN.add() adds another layer to the model. It stacks them on top of each other, as I haven't implemented anything else yet.
Note that the first layer has to be an Input layer
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
model = NN() # Creating an instance of the NN class
model.add(Input(5)) # Adding the Input layer. **It has to be the first element** as currently it's the only layer with an input_shape parameter
model.add(Dense(25, "relu")) # Adding the first layer
# You can add as many layers as you want
NN.summary() is used to print out the model's architecture with all the output shapes, the number of parameters and the name of the model.
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
model = NN() # Creating an instance of the NN class
model.add(Input(5)) # Adding the input layer; it is required as none of the other layers have the input_shape parameter
model.add(Dense(25, "relu", name="Hidden layer")) # Adding the first hidden layer
model.add(Dense(5, "relu", name="Output layer")) # Adding the output layer
model.compile() # Used to compile the model and generate the weights. It is explained later in the documentation
model.summary() # Calling the NN.summary() function with default parameters
"""
It prints out the following:
Model: NN
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
Input (Input) (None, 5) 0
Hidden layer (Dense) (None, 25) 150
Output layer (Dense) (None, 5) 130
=================================================================
Total params: 280 (2.188 kb)
_________________________________________________________________
"""
It takes in the loss function, optimizer, metrics and the type of the weight data and assigns them to instance variables.
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
model = NN() # Creating the model
model.add(Input(5)) # Adding the input layer with shape (5,)
model.add(Dense(5, "relu")) # Adding the first hidden layer with 5 neurons and relu activation function
model.add(Dense(1 "sigmoid")) # Adding the output layer with 1 neuron and sigmoid activation function
# Compiling the model with the mse loss function and the NAdam optimizer; the metric is accuracy, as we want to see it during training
model.compile(loss="mse", optimizer="nadam", metrics="accuracy")
loss (Loss | str): Loss function the model should use. It can be either an initialized class or the name of the loss
optimizer (Optimizer | str): Optimizer the model should use to update its weights and biases. Once again, it can be either an Optimizer class instance or the name of the optimizer as a str
metrics (str): Metrics of the model. For now the only option is accuracy, but that might change in the future. If it's set to accuracy, the model's accuracy will be displayed during training
weight_data_type (np.float_): Data type of the weights and biases. Setting it to a lower-precision type can reduce the size and training time of the model. Default is np.float64
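A plausible sketch of how compile might resolve the string arguments into class instances; the lookup tables and attribute names are assumptions, not the library's actual internals:
from nano_keras.losses import MSE
from nano_keras.optimizers import NAdam
LOSSES = {"mse": MSE} # Hypothetical name-to-class lookup tables
OPTIMIZERS = {"nadam": NAdam}
def compile_sketch(self, loss, optimizer, metrics, weight_data_type):
    # Accept either a class instance or its string name, then store everything on self
    self.loss_function = LOSSES[loss]() if isinstance(loss, str) else loss
    self.optimizer = OPTIMIZERS[optimizer]() if isinstance(optimizer, str) else optimizer
    self.metrics = metrics
    self.weight_data_type = weight_data_type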
The feed-forward algorithm takes the input data and passes it through the layers, each of which computes its output from the data it receives.
That output is then passed to the next layer, and this continues up until the last layer - the output layer.
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
import numpy as np
x = np.random.randn(25) # Initializing some random values using the normal distribution, in the shape of (25,)
model = NN() # Creating an instance of the NN class
model.add(Input(25)) # Adding the input layer; it is required as none of the other layers have the input_shape parameter
model.add(Dense(25, "relu")) # Adding the first hidden layer
model.add(Dense(5, "relu")) # Adding the second hidden layer
model.add(Dense(1, "relu")) # Adding the output layer
model.compile() # Compiling the model. Will be explained later in the documentation
model.feed_forward(x) # Calling the feed forward function with x as its input
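Internally, a forward pass like this is just a chain of layer calls; a minimal sketch, where the per-layer call signature is an assumption:
def feed_forward_sketch(layers, x, is_training=False):
    for layer in layers: # Each layer consumes the previous layer's output
        x = layer(x, is_training) # Hypothetical per-layer call
    return x # The output of the last layer is the model's prediction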
is_training (bool): Parameter specifying whether we are in the training loop or not. It changes the behaviour of a few layers; for example, the Dropout layer doesn't drop connections when it's set to False
Backpropagation is used to update the weights of the model based on the error it made during feed_forward.
Similarly to NN.feed_forward, we call each layer's backpropagation function during training; the only difference is that we go from the back to the front.
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
import numpy as np
X = np.random.randn(25, 5) # We have 25 elements each with 5 inputs
y = np.zeros((25, 1))
model = NN() # Creating an instance of the NN class
# Adding the input layer; it is required as none of the other layers have the input_shape parameter
model.add(Input(5))
model.add(Dense(1, "relu", name="Output layer")) # Adding the output layer
model.compile() # Used to compile the model and generate the weights. It is explained later in the documentation
# Calling the get_weights method as I want to see if the backpropagation updated the weights
previous_weights = model.get_weights()
model.backpropagate(X, y, 2, 1) # Calling the backpropagation
# Calling get_weights to get the current weights
new_weights = model.get_weights()
# Checking whether the weights are equal; this prints False, meaning they changed and our backpropagation works
print(np.array_equal(previous_weights, new_weights))
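A minimal sketch of the backward pass described above, walking the layers from the back to the front; the method names on the loss and the layers are assumptions:
def backpropagate_sketch(layers, loss_function, prediction, y):
    gradient = loss_function.compute_derivative(prediction, y) # Hypothetical loss gradient
    for layer in reversed(layers): # Back to front, the opposite of feed forward
        gradient = layer.backpropagate(gradient) # Hypothetical per-layer call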
NN.train() is used to train the model on the specified data. It's the equivalent of the keras.models.Sequential.fit() function, although without batches.
It works by looping over all the epochs: it calls self.backpropagate() on the data, then calls self.evaluate(), handles the callbacks, and finally prints the progress and continues.
# For the full code see https://github.com/MarcelWinterot/nano-keras/blob/main/demos/demo1.py
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
from nano_keras.losses import MSE
from nano_keras.optimizers import NAdam
from nano_keras.callbacks import EarlyStopping
model = NN() # Initializing the model
model.add(Input(5)) # Adding the input layer with shape of (5,)
model.add(Dense(25, "relu")) # Adding the first hidden layer with 25 neurons and relu activation function
model.add(Dense(10, "relu")) # Adding the second hidden layer with 10 neurons and relu activation function
model.add(Dense(5, "relu")) # Adding the third hidden layer with 5 neurons and relu activation function
model.add(Dense(1, "sigmoid")) # Adding the output layer with 1 neurons and sigmoid activation function
optimizer = NAdam() # Initializing the optimizer
loss = MSE() # Initializing the loss function we'll use
stop = EarlyStopping(5, "val_accuracy", restore_best_weights=True) # Initializing the early stopping callback with a patience of 5; it monitors val_accuracy and restores the best weights when training stops
model.compile(loss, optimizer, metrics="accuracy") # Compiles the model, which generates the weights and assigns the loss and optimizer
model.train(X_train, y_train, 50, verbose=2,
validation_data=(X_test, y_test), callbacks=stop) # Calls the train function
callbacks (EarlyStopping): Callbacks the model should use. It's not required, and when it's not set we don't use any
verbose (int): Controls what we print out during training. 0 - nothing, 1 - only epoch/epochs with accuracy and loss, 2 - all information
validation_data (tuple[np.ndarray]): Validation data used to compute val_loss and val_accuracy. It should be (X_validation, y_validation). If it isn't set, there are no validation predictions after each epoch.
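Putting the description above together, the training loop roughly looks like this; a sketch only, where the backpropagate arguments and the callback hook are assumptions:
def train_sketch(self, X, y, epochs, validation_data=None, callbacks=None, verbose=1):
    for epoch in range(epochs):
        self.backpropagate(X, y) # Update the weights on the training data (extra arguments omitted)
        loss, accuracy = self.evaluate(X, y) # Measure the training metrics
        if validation_data is not None:
            val_loss, val_accuracy = self.evaluate(*validation_data) # Validation predictions after each epoch
        if callbacks is not None:
            callbacks.on_epoch_end(loss, accuracy) # Hypothetical hook where EarlyStopping checks its monitored metric
        if verbose > 0:
            print(f"Epoch {epoch + 1}/{epochs} - loss: {loss} - accuracy: {accuracy}")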
It works by using self.feed_forward() on the given X data, and then using self.loss_function.compute_loss() on the predicted data and the y data.
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
import numpy as np
model = NN() # Creating the model
model.add(Input(5)) # Adding the input layer with shape (5,)
model.add(Dense(5, "relu")) # Adding the output layer with 5 neurons and relu activation function
model.compile(metrics="accuracy") # Compiling the model with default params and accuracy as it's metrics
X = np.random.randn(25, 5) # Creating X dataset
y = np.random.randn(25, 5) # Creating y dataset
loss, acc = model.evaluate(X, y) # Calling the evaluate function with X and y as its parameters; the rest are default
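Following the description above, evaluate boils down to a forward pass plus a loss computation; a sketch, where the per-sample loop and the accuracy formula are assumptions:
import numpy as np
def evaluate_sketch(self, X, y):
    predictions = np.array([self.feed_forward(x) for x in X]) # Forward pass over every sample
    loss = self.loss_function.compute_loss(predictions, y) # compute_loss as described above
    # With metrics="accuracy", the fraction of matching predictions is also returned
    accuracy = np.mean(np.argmax(predictions, axis=1) == np.argmax(y, axis=1))
    return loss, accuracy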
It works by first getting the weights and biases of the model, appending them to an array, and saving it using np.save().
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
model = NN() # Creating the model
model.add(Input(5)) # Adding the input layer into the model
# Adding the dense layer with 25 neurons and relu activation function into the model
model.add(Dense(25, "relu"))
# Adding the output layer with 5 neurons and relu activation function into the model
model.add(Dense(5, "relu"))
model.compile() # Compiling the model with default arguments
model.save("./saved_model") # Saving the model at ./saved_model path
file_path (str): Path we want to save the model at. Don't add the file extension, as numpy handles it
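A sketch of that save logic, using a NumPy object array so weights of different shapes can live in one file; the exact layout on disk is an assumption:
import numpy as np
def save_sketch(model, file_path):
    params = np.array(model.get_weights(), dtype=object) # Gather every layer's parameters into one array
    np.save(file_path, params) # np.save appends the .npy extension itself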
It works by opening the file, loading the data, and then looping over the layers and assigning the parameters to their weights and biases.
from nano_keras.models import NN
from nano_keras.layers import Input, Dense
model = NN() # Creating the model
model.add(Input(5)) # Adding the input layer into the model
# Adding the dense layer with 25 neurons and relu activation function into the model
model.add(Dense(25, "relu"))
# Adding the output layer with 5 neurons and relu activation function into the model
model.add(Dense(5, "relu"))
model.compile() # Compiling the model with default parameters
# Loading the model parameters and assigning them
model.load("./saved_model.npy")
file_path (str): The path we saved the model at. This time you have to add the extension, as otherwise it doesn't work
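And the mirror-image sketch for loading; allow_pickle is needed to read object arrays back, and reusing NN.set_weights here is an assumption:
import numpy as np
def load_sketch(model, file_path):
    params = np.load(file_path, allow_pickle=True) # Load the saved parameter array
    model.set_weights(list(params)) # Hand each entry back to its layer, as in NN.set_weights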
optimizer (Optimizer | list[Optimizer]): Optimizer the layer should use to update the model's parameters
In nano-keras it needs to be the first layer of the network, as I haven't implemented the input_shape parameter in any other layer.
from nano_keras.models import NN
from nano_keras.layers import Input
model = NN() # Creating the model
model.add(Input(input_shape = (28, 28, 1))) # Adding the Input layer with input shape of (28, 28, 1)
input_shape (tuple | int): Input shape of the neural network. It's the shape of the items in the dataset passed to the model during training
The Dense layer, also known as a fully connected layer, connects every neuron from the previous layer to each neuron in the current layer, forming a dense and interconnected network.
Each connection is associated with a weight, which the network adjusts during training to learn patterns and make predictions.
from nano_keras.models import NN
from nano_keras.layers import Dense, Input
model = NN() # Creating the model
model.add(Input(input_shape=5)) # Adding the Input layer with input shape of (5,)
model.add(Dense(units=5, activation="relu")) # Adding the Dense layer with 5 neurons and relu activation function; all the other parameters have default values
activation (Activation | str): Activation function the layer should use. Activation functions introduce non-linearity to the model, allowing it to learn complex patterns.
weight_initialization (str): Weight initialization strategy the layer should use in order to generate the weights. Currently there are 3 options, those being: random, xavier, he. Default value is random
regulizer (Regulizer): Regularizer the layer should use. We use this to penalize the layer for having big weights, which helps if you are struggling with overfitting
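Numerically, a Dense layer's forward pass is a single matrix product plus a bias; a minimal NumPy sketch (the relu is written out by hand):
import numpy as np
x = np.random.randn(5) # Output of the previous layer, shape (5,)
W = np.random.randn(5, 25) # Weights: (previous layer output, units)
b = np.zeros(25) # One bias per neuron
output = np.maximum(0, x @ W + b) # relu(x @ W + b), shape (25,)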
from nano_keras.models import NN
from nano_keras.layers import Input, Dropout
model = NN() # Creating the model
model.add(Input(input_shape=5)) # Adding the input layer with shape of (5,)
# Adding the Dropout layer with 25 neurons, relu activation function and dropout_rate = 0.2; all the other parameters have default values
model.add(Dropout(units=25, activation="relu", dropout_rate=0.2))
activation (Activation | str): Activation function the layer should use. Activation functions introduce non-linearity to the model, allowing it to learn complex patterns.
dropout_rate (float): The fraction of connections we should drop; for example, 0.2 means 20%
weight_initialization (str): Weight initialization strategy the layer should use in order to generate the weights. Currently there are 3 options, those being: random, xavier, he. Default value is random
regulizer (Regulizer): Regularizer the layer should use. We use this to penalize the layer for having big weights, which helps if you are struggling with overfitting
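A minimal sketch of how such a dropout mask could work during training; whether nano-keras rescales the kept activations is not shown here:
import numpy as np
dropout_rate = 0.2 # Drop 20% of the connections
x = np.random.randn(25) # Activations coming out of the layer
mask = np.random.rand(25) > dropout_rate # Keeps roughly 80% of the values
x_dropped = x * mask # Dropped activations become 0 during training
# With is_training=False the layer would pass x through unchanged
# (many implementations also rescale the kept values by 1 / (1 - dropout_rate))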
The Flatten layer is used to flatten the output of the previous layer from an n-dimensional array into a 1-dimensional array.
from nano_keras.models import NN
from nano_keras.layers import Input, Flatten, Dense
model = NN() # Creating the model
model.add(Input((28, 28, 1))) # Adding the input layer with input_shape=(28, 28, 1)
model.add(Flatten()) # Adding the flatten layer
model.add(Dense(120, "relu")) # Adding the output layer with 120 neurons and relu activation function
The Reshape layer is used to change the shape of the output from the previous layer into a desired shape without changing its values.
from nano_keras.models import NN
from nano_keras.layers import Input, Reshape
model = NN() # Creating the model
model.add(Input(25)) # Adding the input layer with input_shape of (25,)
model.add(Reshape(target_shape=(5, 5, 1))) # Adding the Reshape layer with target_shape of (5, 5, 1)
from nano_keras.models import NN
from nano_keras.layers import Input, MaxPool1D
model = NN() # Creating the model
model.add(Input(10)) # Adding the input layer with input_shape of (10,)
model.add(MaxPool1D(pool_size=2, strides=2)) # Adding the MaxPool1D layer with pool size of 2 and strides = 2. The output is (5,)
pool_size (int): The size of the pooling window. The bigger it is the smaller the output will be. Defaults to 2
strides (int): Step of the pooling window. Also the bigger it is the smaller the output will be. Defaults to 2
name (str): Name of the layer, helps with debugging if you run into a problem. Defaults to MaxPool1D
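The (5,) output in the example above follows the usual pooling formula; a quick check:
input_length, pool_size, strides = 10, 2, 2
output_length = (input_length - pool_size) // strides + 1
print(output_length) # 5, matching the example above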
from nano_keras.models import NN
from nano_keras.layers import Input, MaxPool2D
model = NN() # Creating the model
model.add(Input((28, 28, 1))) # Adding the input layer with input_shape of (28, 28, 1)
model.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2))) # Adding the MaxPool2D layer with pool_size of (2, 2) and strides of (2, 2). Output is (14, 14, 1)
pool_size (tuple): The size of the pooling window. The bigger it is the smaller the output will be. Defaults to (2, 2)
strides (tuple): Step of the pooling window. Also the bigger it is the smaller the output will be. Defaults to (2, 2)
name (str): Name of the layer, helps with debugging if you run into a problem. Defaults to MaxPool2D
from nano_keras.models import NN
from nano_keras.layers import Input, Conv1D
model = NN() # Creating the model
model.add(Input((28, 1))) # Adding the input layer with input_shape of (28, 1)
model.add(Conv1D(filters=32, kernel_size=2, strides=2)) # Adding the Conv1D layer with 32 filters, kernel_size of 2 and strides = 2. The output is (14, 32)
kernel_size (int): Size of the kernel the layer uses. The bigger it is the smaller the output. Defaults to 2
strides (int): By how much should the kernel move. The bigger it is the smaller the output. Defaults to 2
activation (Activation | str): Activation function the layer should use. Activation functions introduce non-linearity to the model, allowing it to learn complex patterns.
weight_initialization (str): Weight initialization strategy the layer should use in order to generate the weights. Currently there are 3 options, those being: random, xavier, he. Default value is he
regulizer (Regulizer): Regulizer the model should use. We use this to punish the layer for having big weights. This helps if you are struggling with overfitting
Despite its name, it is actually meant for 3D data, for example (height, width, channels) when the input is an image.
Its default implementation uses a kernel that slides over the input data and performs a dot operation between the filters and the data under the kernel.
But we can speed it up using the im2col technique, which converts the image into columns and lets us abandon the for loops over the dot products in favour of a single np.dot(weights, x).
from nano_keras.models import NN
from nano_keras.layers import Input, Conv2D
model = NN() # Creating the model
model.add(Input((28, 28, 1))) # Adding the input layer with input_shape = (28, 28, 1)
model.add(Conv2D(32, kernel_size=(2, 2), strides=(2, 2))) # Adding Conv2D layer with 32 filters, kernel_size of (2, 2) and strides = (2, 2). Output is (14, 14, 32)
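A minimal NumPy sketch of the im2col idea for a single-channel image and one 2x2 filter; a simplified illustration under those assumptions, not the library's actual implementation. The patch extraction still loops here, but the convolution itself collapses into one matrix product:
import numpy as np
def im2col_sketch(image, kernel_size, stride):
    # Unfold every kernel-sized patch of the image into a column
    h, w = image.shape
    out_h = (h - kernel_size) // stride + 1
    out_w = (w - kernel_size) // stride + 1
    columns = np.empty((kernel_size * kernel_size, out_h * out_w))
    col = 0
    for i in range(0, h - kernel_size + 1, stride):
        for j in range(0, w - kernel_size + 1, stride):
            columns[:, col] = image[i:i + kernel_size, j:j + kernel_size].ravel()
            col += 1
    return columns, (out_h, out_w)
image = np.random.randn(28, 28) # Single-channel input, like one MNIST image
kernel = np.random.randn(2, 2) # One 2x2 filter
columns, out_shape = im2col_sketch(image, kernel_size=2, stride=2)
output = (kernel.ravel() @ columns).reshape(out_shape) # The whole convolution becomes one np.dot
print(output.shape) # (14, 14), one channel of the (14, 14, 32) output above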