In this project, I build a deep neural network without the aid of any deep learning library (TensorFlow, Keras, PyTorch, …). I set myself this task because nowadays it is effortless to build deep and complex neural networks using the high-level tools provided by some Python libraries. That approach lets machine learning practitioners create powerful models with just a few lines of code, but it has the major downside of leaving the inner workings of those networks unclear.
This implementation differs from others in its strategy for storing the cached values. Also, unlike most implementations, this code lets you compare arbitrarily many network architectures, since the number of layers and the number of activation units per layer are defined by the user.
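To illustrate what a user-defined architecture can look like, here is a minimal sketch that builds weights and biases from a plain list of layer sizes. The function name `init_params`, the dictionary keys, the example sizes and the He-style scaling are assumptions made for illustration; they are not necessarily what `utils.py` does.

```python
import numpy as np

def init_params(layer_sizes, seed=0):
    """Create one weight matrix and one bias vector per pair of consecutive layers."""
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_sizes)):
        n_in, n_out = layer_sizes[l - 1], layer_sizes[l]
        # He-style scaling, a common choice for ReLU layers (assumption)
        params[f"W{l}"] = rng.standard_normal((n_out, n_in)) * np.sqrt(2.0 / n_in)
        params[f"b{l}"] = np.zeros((n_out, 1))
    return params

# Hypothetical architecture: 4 inputs, two hidden layers, 3 output classes
layer_sizes = [4, 5, 6, 3]
params = init_params(layer_sizes)
```

Changing the architecture then only means editing the list, for example `[4, 16, 3]` for a single hidden layer.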
An article describing both the theory and the code in detail has been published on Towards Data Science. You can find it here.
This repository is organized as follows.
train.csv
CSV file containing the training set.
test.csv
CSV file containing the test set.
main.py
Python script to:
- load the training and test sets
- set the network's architecture
- set the hyperparameters (learning step $\alpha$, number of iterations)
- launch the learning process
- save the learned weights and biases
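The steps above can be sketched end to end on a toy problem. This is only an illustration under stated assumptions: the data is synthetic instead of `train.csv`, the model is a single softmax layer rather than a deep network, and the variable names and the `params.npz` file name are hypothetical.

```python
import os
import tempfile
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the training set: 2 features, 200 samples (one per column)
X = rng.standard_normal((2, 200))
y = (X[0] + X[1] > 0).astype(int)
Y = np.eye(2)[y].T                      # one-hot labels, shape (2, 200)

# Hyperparameters
alpha = 0.5                             # learning step
n_iterations = 200

# Parameters of a single softmax layer
W = np.zeros((2, 2))
b = np.zeros((2, 1))

costs = []
for _ in range(n_iterations):
    Z = W @ X + b
    Z = Z - Z.max(axis=0, keepdims=True)            # numerical stability
    A = np.exp(Z) / np.exp(Z).sum(axis=0, keepdims=True)
    costs.append(-np.mean(np.sum(Y * np.log(A + 1e-12), axis=0)))
    dZ = (A - Y) / X.shape[1]                       # gradient of the mean cross entropy
    W -= alpha * (dZ @ X.T)                         # gradient descent update
    b -= alpha * dZ.sum(axis=1, keepdims=True)

# Accuracy of the final parameters on the training data
train_accuracy = np.mean((W @ X + b).argmax(axis=0) == y)

# Save the learned weights and biases, then read them back
path = os.path.join(tempfile.mkdtemp(), "params.npz")
np.savez(path, W=W, b=b)
with np.load(path) as saved:
    W_loaded, b_loaded = saved["W"], saved["b"]
```

A larger `alpha` speeds up learning but can make the cost oscillate, while more iterations give the cost more time to settle.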
utils.py
Python file to:
- shuffle data
- normalize data
- initialize parameters
- compute activation functions (ReLU, Softmax) and their derivatives
- one-hot encode the data
- implement forward propagation
- implement back propagation
- update the network's parameters (weights and biases)
- compute cross entropy
- compute accuracy
- plot the learning curves
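As a rough idea of what some of these helpers can look like in NumPy, here is a minimal sketch. The function names, the signatures and the convention of storing one sample per column are assumptions for illustration and may differ from the actual code in `utils.py`.

```python
import numpy as np

def relu(Z):
    """Element-wise max(0, z)."""
    return np.maximum(0, Z)

def relu_derivative(Z):
    """1 where z > 0, else 0."""
    return (Z > 0).astype(float)

def softmax(Z):
    """Column-wise softmax; subtracting the column max avoids overflow."""
    exp = np.exp(Z - Z.max(axis=0, keepdims=True))
    return exp / exp.sum(axis=0, keepdims=True)

def one_hot(labels, n_classes):
    """Integer labels of shape (m,) -> one-hot matrix of shape (n_classes, m)."""
    return np.eye(n_classes)[labels].T

def cross_entropy(A, Y, eps=1e-12):
    """Mean cross entropy between predictions A and one-hot targets Y."""
    return -np.mean(np.sum(Y * np.log(A + eps), axis=0))

def accuracy(A, Y):
    """Fraction of columns where the predicted class matches the target."""
    return np.mean(A.argmax(axis=0) == Y.argmax(axis=0))

# Small usage example with 3 classes and 2 samples
labels = np.array([0, 2])
Y = one_hot(labels, 3)
A = softmax(np.array([[2.0, 0.1], [0.5, 0.2], [0.1, 3.0]]))
```

Here `cross_entropy(A, Y)` and `accuracy(A, Y)` score the two predictions against their one-hot targets.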