This project aims to implement different neural network configurations without using scientific frameworks like TensorFlow or PyTorch.
Each network/config is implemented in 4 formats, trying to get the best of both worlds of TensorFlow (+ Keras) and PyTorch:
- In Python using NumPy
- In Python without taking advantage of NumPy
- In Java
- In C++
Layers:
- Fully Connected
Activations:
- Linear
- ReLU
Loss Functions:
- MSE
- Cross Entropy
Weight Initializers:
- Xavier Uniform (aka Glorot)
- He Normal (aka Kaiming Normal)
Bias Initializers:
- Constant (zero)
Optimizers (a NumPy sketch of two update rules follows this list):
- SGD
- SGD + Momentum
- RMSProp
- AdaGrad
- Adam
Regularizer:
- l1
- l2
Normalization:
- BatchNorm1d
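For reference, here is a minimal NumPy sketch of two of the update rules listed above (SGD with momentum and Adam), written in their textbook form. It is independent of this repo's own optimizer API, and the function names are only illustrative:

import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.01, beta=0.9):
    # v <- beta * v + grad;  w <- w - lr * v
    velocity = beta * velocity + grad
    return w - lr * velocity, velocity

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # biased first/second moment estimates followed by bias correction (t starts at 1)
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v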
Layers:
- LSTM (LSTMCell respectively)
- Conv1d & Conv2d
- Pool1d & Pool2d
Regularizer:
- Dropout
Normalization:
- Layer Normalization (LayerNorm)
Loss Functions (a NumPy sketch of these follows this list):
- Binary Focal
- Binary Cross Entropy
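Likewise, a minimal NumPy sketch of the two binary losses listed above in their usual textbook form. This is independent of this repo's losses API, whose signatures and reductions may differ:

import numpy as np

def binary_cross_entropy(p, t, eps=1e-7):
    # p: predicted probabilities, t: binary targets; mean over the batch
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(t * np.log(p) + (1 - t) * np.log(1 - p))

def binary_focal_loss(p, t, gamma=2.0, alpha=0.25, eps=1e-7):
    # focal loss (Lin et al., 2017) down-weights easy examples via the (1 - p_t)^gamma factor
    p = np.clip(p, eps, 1 - eps)
    p_t = t * p + (1 - t) * (1 - p)
    alpha_t = t * alpha + (1 - t) * (1 - alpha)
    return -np.mean(alpha_t * (1 - p_t) ** gamma * np.log(p_t))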
Model summaries are printed in the style of the TensorFlow Keras model summary:
Model Summary:
+--------------+----------------+----------+
| Layer | Output shape | Param# |
+==============+================+==========+
| Input | (-1, 1) | 0 |
+--------------+----------------+----------+
| Dense[0] | (-1, 100) | 200 |
+--------------+----------------+----------+
| LayerNorm[1] | (-1, 100) | 200 |
+--------------+----------------+----------+
| Dropout[2] | (-1, 100) | 0 |
+--------------+----------------+----------+
| Dense[3] | (-1, 1) | 101 |
+--------------+----------------+----------+
total trainable parameters: 501
Model Summary:
+-----------+----------------+----------+
| Layer | Output shape | Param# |
+===========+================+==========+
| Input | (-1, 2, 2) | 0 |
+-----------+----------------+----------+
| LSTM[0] | (-1, 100) | 41200 |
+-----------+----------------+----------+
| Conv1d[1] | (-1, 3, 100) | 20100 |
+-----------+----------------+----------+
| Pool1d[2] | (-1, 2, 100) | 0 |
+-----------+----------------+----------+
| Dense[3] | (-1, 1) | 201 |
+-----------+----------------+----------+
total trainable parameters: 61501
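As a sanity check, the parameter counts in the summaries above follow from the usual formulas. The Conv1d and final Dense figures in the second summary additionally assume a kernel size of 2, 100 input/output channels, and a flattened 2 x 100 input, which is one plausible reading of the shapes shown:

# Dense: in_features * out_features + out_features; LayerNorm: 2 * num_features
assert 1 * 100 + 100 == 200          # Dense[0]
assert 2 * 100 == 200                # LayerNorm[1] (gamma and beta)
assert 100 * 1 + 1 == 101            # Dense[3]
assert 200 + 200 + 0 + 101 == 501    # first summary's total

# LSTM: 4 * (hidden * (input + hidden) + hidden), here input=2, hidden=100
assert 4 * (100 * (2 + 100) + 100) == 41200    # LSTM[0]
# Conv1d: in_channels * out_channels * kernel_size + out_channels (assumed 100, 100, 2)
assert 100 * 100 * 2 + 100 == 20100            # Conv1d[1]
assert 200 * 1 + 1 == 201                      # Dense[3] on a flattened 2 x 100 input (assumed)
assert 41200 + 20100 + 0 + 201 == 61501        # second summary's total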
Each directory contains a train.* script that performs correctness and functionality tests for its corresponding format and language. You can run it to get a sense of what is going on.
Define your network
- Python:
from nn_without_frameworks import numpy_nn as nn # from nn_without_frameworks import pure_nn as nn
class MyNet(nn.Module):
    def __init__(self, input_dim, out_dim):
        super().__init__()
        self.input_dim = input_dim
        self.out_dim = out_dim
        self.hidden1 = nn.layers.Dense(in_features=self.input_dim,
                                       out_features=100,
                                       activation=nn.acts.ReLU(),
                                       weight_initializer=nn.inits.HeNormal(nn.acts.ReLU()),
                                       bias_initializer=nn.inits.Constant(0.),
                                       regularizer_type="l2",
                                       lam=1e-3
                                       )
        self.output = nn.layers.Dense(in_features=100,
                                      out_features=self.out_dim,
                                      weight_initializer=nn.inits.XavierUniform(),
                                      bias_initializer=nn.inits.Constant(0.),
                                      regularizer_type="l2",
                                      lam=1e-3
                                      )

    def forward(self, x):
        x = self.hidden1(x)
        return self.output(x)
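A quick forward-pass check with dummy data might look like the following; the shapes are only illustrative, and calling the module directly mirrors the training loop shown further below:

import numpy as np

net = MyNet(input_dim=4, out_dim=2)
x = np.random.randn(8, 4)     # a batch of 8 samples with 4 features each
y = net(x)                    # modules are callable, as in the training section below
print(np.shape(y))            # expected: (8, 2)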
- Java:
import Layers.Dense;
class MyNet extends Module {
    int in_features = 0, out_features = 0;
    Dense hidden1, output;

    public MyNet(int in_features, int out_features) {
        this.in_features = in_features;
        this.out_features = out_features;
        this.hidden1 = new Dense(this.in_features,
                100,
                "relu",
                "he_normal",
                "zeros", // bias initializer
                "l2",
                0.001F);
        this.layers.add(this.hidden1); // the crucial part and the only difference from PyTorch's subclassing
        this.output = new Dense(100,
                out_features,
                "linear",
                "xavier_uniform",
                "zeros", // bias initializer
                "l2",
                0.001F);
        this.layers.add(this.output); // same as above
    }

    public float[][] forward(float[][] x) {
        x = this.hidden1.forward(x);
        x = this.output.forward(x);
        return x;
    }
}
- C++:
#include <iostream>
#include <module.h>
#include <layers.h>
using namespace std;
class MyNet : public Module {
public:
    int in_features = 0, out_features = 0;
    Dense *hidden, *output; // layers should be declared beforehand and, importantly, they should be pointers

    MyNet(int in_features, int out_features) {
        this->in_features = in_features;
        this->out_features = out_features;
        this->hidden = new Dense{this->in_features,
                                 100,
                                 "relu",
                                 "he_normal",
                                 "zeros", // bias initializer
                                 "l2",
                                 0.001};
        this->parameters.push_back(this->hidden); // same as Java
        this->output = new Dense{100,
                                 this->out_features,
                                 "linear",
                                 "xavier_uniform",
                                 "zeros", // bias initializer
                                 "l2",
                                 0.001};
        this->parameters.push_back(this->output); // same as Java
    }

    float_batch forward(const float_batch &input) { // float_batch =: vector<vector<float> >
        float_batch x = this->hidden->forward(input);
        x = this->output->forward(x);
        return x;
    }
};
Train your network
- Python:
my_net = MyNet(num_features, num_classes)
ce_loss = nn.losses.CrossEntropyLoss()
opt = nn.optims.SGD(my_net.parameters, lr=1.)
for step in range(epoch):
    y = my_net(x)
    loss = ce_loss(y, t)
    my_net.backward(loss)
    opt.apply()
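For a complete toy run of the loop above, the data could be set up along these lines; note that one-hot targets are assumed here for CrossEntropyLoss, which is a guess about the expected format rather than something stated in this README:

import numpy as np

num_features, num_classes, batch_size = 4, 3, 32
epoch = 100                                                # number of training steps for the loop above
x = np.random.randn(batch_size, num_features)              # toy inputs
labels = np.random.randint(num_classes, size=batch_size)   # toy integer class labels
t = np.eye(num_classes)[labels]                            # one-hot targets (assumed format)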
- Java:
import Losses.*;
import Optimizers.*;
MyNet my_net = new MyNet(num_features, num_classes);
CrossEntropyLoss celoss = new CrossEntropyLoss();
SGD opt = new SGD(1.0F, my_net.layers);
float[][] y;
for (int epoch = 0; epoch < num_epoch; epoch++) {
    y = my_net.forward(x);
    Loss loss = celoss.apply(y, t);
    my_net.backward(loss);
    opt.apply();
}
- C++:
#include <losses.h>
#include <optimizers.h>
MyNet my_net = MyNet{num_features, num_classes};
CrossEntropyLoss celoss;
SGD opt(1, my_net.parameters);
float_batch y; // float_batch =: vector<vector<float> >
for (int step = 0; step < num_epoch; step++) {
    y = my_net.forward(x);
    Loss loss = celoss.apply(y, t);
    my_net.backward(loss);
    opt.apply();
}
- The current code is inspired by the elegant and simple repository Simple Neural Networks by @MorvanZhou.
- The mathematical foundations of the different parts are based on the slides of the CS W182 / 282A and CS231n courses.
- The current code is far from done, and any fix, suggestion, pull request, issue, etc. is highly appreciated and welcome. 🤗
- The current work focuses on discovering what is under the hood rather than on optimizing the implementations, so feel free to sanity-check the math and correctness of each part, and if you come up with a better or more optimized solution, please don't hesitate to open a PR. [Thank you in advance. 😊]
- You can take a look at todo.md.