This project aims to develop a complete deep learning library from scratch. Conducted for educational purposes, this project helped me better understand how popular libraries such as PyTorch work, and how learning algorithms are implemented at a lower level.
The programming language used for this project is Julia. Julia is a relatively recent language with a fast-growing community. It is designed for scientific computing and is known for running significantly faster than Python or MATLAB on numerical workloads. This open-source language is dynamically typed and built around multiple dispatch as its core paradigm. Since Julia is designed for high performance, it is well suited to data science, machine learning and deep learning. When I started the project, I knew nothing about Julia, so it was also a good opportunity to learn the language.
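For readers new to Julia, multiple dispatch means that the method executed is selected from the runtime types of all arguments, not just the first one. A tiny illustrative example (the function name is made up for the demonstration):

```julia
# The method that runs is chosen from the types of *all* arguments.
describe(x::Number, y::Number) = "two numbers"
describe(x::Number, y::String) = "a number and a string"
describe(x::String, y::Number) = "a string and a number"

describe(1, 2.0)    # "two numbers"
describe(1, "two")  # "a number and a string"
```

This is also the mechanism that lets a library overload operators such as `+` and `*` for its own types.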
NNJulia is based on the principle of automatic differentiation (AD). AD is a general technique for evaluating the derivative of a function defined by a computer program. It relies on a calculus formula known as the chain rule: by applying the chain rule to every elementary operation and function involved in a complicated mathematical expression, derivatives of arbitrary order can be computed automatically. Fast reverse-mode automatic differentiation is essential during the backpropagation phase, as it efficiently computes the gradients required to update the parameters of a neural network.
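To make the idea concrete, here is a minimal scalar sketch of reverse-mode AD, independent of NNJulia's actual implementation (all names are illustrative, and a real implementation would walk the graph in topological order instead of using this naive recursion):

```julia
# Minimal scalar reverse-mode AD sketch (illustrative, not NNJulia's API).
mutable struct Var
    value::Float64
    grad::Float64
    backward::Function  # propagates this node's gradient to its parents
end

Var(x::Real) = Var(float(x), 0.0, () -> nothing)

function Base.:+(a::Var, b::Var)
    out = Var(a.value + b.value)
    out.backward = () -> begin
        # chain rule: ∂(a+b)/∂a = ∂(a+b)/∂b = 1
        a.grad += out.grad
        b.grad += out.grad
        a.backward(); b.backward()
    end
    return out
end

function Base.:*(a::Var, b::Var)
    out = Var(a.value * b.value)
    out.backward = () -> begin
        # chain rule: ∂(a·b)/∂a = b and ∂(a·b)/∂b = a
        a.grad += b.value * out.grad
        b.grad += a.value * out.grad
        a.backward(); b.backward()
    end
    return out
end

x1, x2 = Var(2.0), Var(3.0)
y = x1 * x2 + x1   # y = 8
y.grad = 1.0       # seed the output gradient
y.backward()       # x1.grad == 4.0 (= x2 + 1), x2.grad == 2.0 (= x1)
```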
To implement AD, a `Tensor` structure is used. A tensor is an N-dimensional array that can be differentiated. By applying mathematical operations and functions to tensors, a computational graph is implicitly created. This graph stores the dependencies between tensors, together with the operations that link them.
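Conceptually, such a structure can be sketched as follows (the field names here are illustrative and do not necessarily match NNJulia's exact definition):

```julia
# Conceptual sketch of a differentiable tensor (field names are illustrative).
mutable struct Tensor
    data::Array{Float64}          # the N-dimensional array of values
    gradient::Array{Float64}      # gradient accumulated during backpropagation
    dependencies::Vector{Tensor}  # parent tensors in the computational graph
    requires_grad::Bool           # whether this tensor should track gradients
end

# Every operation records its inputs, implicitly building the graph:
# c = a + b would store [a, b] as the dependencies of c, along with the
# derivative rule needed to propagate c's gradient back to a and b.
```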
In the end, the NNJulia library offers a simple programming interface for creating and training sequential neural networks composed of dense layers (convolutional layers are not implemented yet). The documentation is also available online.
As you can see in `examples/`, it is possible to classify handwritten digits from the MNIST dataset with decent accuracy. An example of spiral data point classification is also available.
To get a local copy up and running, follow these simple steps.
- Clone the repo and launch Julia:

```sh
git clone https://github.com/Clement-W/NNJulia.jl
cd NNJulia.jl
julia
```
- Instantiate the package:

```julia
julia> using Pkg
julia> Pkg.activate(".")
julia> Pkg.instantiate()
julia> using NNJulia
```
- Now you're ready to use NNJulia!
```julia
# Define a sequential model: flatten the 28×28 input images,
# then apply three fully-connected layers
model = Sequential(
    Flatten(),
    Dense(784, 16, relu),
    Dense(16, 16, relu),
    Dense(16, 10, softmax),
)

# Initialise the optimiser, the loss function and the metrics used to compute accuracy
opt = GradientDescent(0.05)
loss = BinaryCrossentropy()
metrics = BinaryAccuracy()

# Pass them to the TrainParameters struct that will be used during training
trainParams = TrainParameters(opt, loss, metrics)

# Training specifications
batchsize = 64
nbEpochs = 25

# Wrap the training inputs and one-hot labels in a DataLoader
trainData = DataLoader(train_x, train_y_hot, batchsize, shuffle=true)

# Train the model; the returned history records loss and accuracy at each epoch
history = train!(model, trainParams, trainData, nbEpochs)
```
Plot the evolution of accuracy and loss during training:
```julia
using Plots

# Plot both curves side by side
p1 = plot(history["accuracy"], label="Accuracy", legend=:topleft)
p2 = plot(history["loss"], label="Loss")
plot(p1, p2, layout=2)
```
Finally, evaluate the trained model on the test data:

```julia
acc = evaluate(model, metrics, test_x, test_y_hot)
println("accuracy on test data = " * string(acc * 100) * "%")
```