NN-Without-Frameworks

This project aims to implement different Neural Network configurations without using scientific frameworks like TensorFlow or PyTorch.

Each network/config is implemented in 4 formats while trying to get the best of both worlds of TensorFlow (+ Keras) and PyTorch:

  1. In Python using NumPy
  2. In Python without taking advantage of NumPy
  3. In Java
  4. In C++

What is currently supported? (equally in all languages and formats)

Layers:

  • Fully Connected

Activations:

  • Linear
  • ReLU

Loss Functions:

  • MSE
  • Cross Entropy

Weight Initializers:

  • Xavier Uniform (aka Glorot)
  • He Normal (aka Kaiming Normal)

Bias Initializer:

  • Constant (zero)

Optimizers:

  • SGD
  • SGD + Momentum
  • RMSProp
  • AdaGrad
  • Adam

Regularizers:

  • l1
  • l2

Normalization:

  • BatchNorm1d
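
As a rough illustration of how these building blocks combine in the NumPy version, the sketch below stacks a Dense layer with BatchNorm1d and sets up an Adam optimizer. It is a sketch only: the BatchNorm1d and Adam class names and constructors are assumptions inferred from the naming pattern of the snippets further down, not verified API.

from nn_without_frameworks import numpy_nn as nn

class TinyNet(nn.Module):  # sketch only; see the full snippets below for the verified API
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.hidden = nn.layers.Dense(in_features=in_dim,
                                      out_features=32,
                                      activation=nn.acts.ReLU(),
                                      weight_initializer=nn.inits.HeNormal(nn.acts.ReLU()),
                                      bias_initializer=nn.inits.Constant(0.))
        self.norm = nn.layers.BatchNorm1d(32)  # class name/constructor assumed
        self.out = nn.layers.Dense(in_features=32,
                                   out_features=out_dim,
                                   weight_initializer=nn.inits.XavierUniform(),
                                   bias_initializer=nn.inits.Constant(0.))

    def forward(self, x):
        return self.out(self.norm(self.hidden(x)))

net = TinyNet(in_dim=8, out_dim=2)
opt = nn.optims.Adam(net.parameters, lr=1e-3)  # class name assumed; SGD usage is shown below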

What is only supported in NumPy? (in addition to the aforementioned settings)

Layers:

  • LSTM (and LSTMCell)
  • Conv1d & Conv2d
  • Pool1d & Pool2d

Regularizer:

  • Dropout

Normalization:

  • Layer Normalization (LayerNorm)

Loss Functions:

  • Binary Focal
  • Binary Cross Entropy
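
As an example, the first summary below corresponds to a model that stacks Dense, LayerNorm, and Dropout layers. Here is a sketch of how such a model might be defined; the LayerNorm and Dropout class names and constructors are assumptions inferred from the layer names printed in that summary:

from nn_without_frameworks import numpy_nn as nn

class SummaryNet(nn.Module):  # matches the first summary below: 1 -> 100 -> 1
    def __init__(self):
        super().__init__()
        self.dense1 = nn.layers.Dense(in_features=1,
                                      out_features=100,
                                      activation=nn.acts.ReLU(),
                                      weight_initializer=nn.inits.HeNormal(nn.acts.ReLU()),
                                      bias_initializer=nn.inits.Constant(0.))
        self.norm = nn.layers.LayerNorm(100)   # class name/constructor assumed
        self.drop = nn.layers.Dropout(p=0.5)   # class name/constructor assumed
        self.dense2 = nn.layers.Dense(in_features=100,
                                      out_features=1,
                                      weight_initializer=nn.inits.XavierUniform(),
                                      bias_initializer=nn.inits.Constant(0.))

    def forward(self, x):
        return self.dense2(self.drop(self.norm(self.dense1(x))))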

Model Summary, in the style of TensorFlow Keras:

Model Summary:
+--------------+----------------+----------+
| Layer        | Output shape   |   Param# |
+==============+================+==========+
| Input        | (-1, 1)        |        0 |
+--------------+----------------+----------+
| Dense[0]     | (-1, 100)      |      200 |
+--------------+----------------+----------+
| LayerNorm[1] | (-1, 100)      |      200 |
+--------------+----------------+----------+
| Dropout[2]   | (-1, 100)      |        0 |
+--------------+----------------+----------+
| Dense[3]     | (-1, 1)        |      101 |
+--------------+----------------+----------+
total trainable parameters: 501
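
For reference, the Param# column counts weights plus biases: Dense[0] maps 1 input to 100 units (1 × 100 weights + 100 biases = 200), LayerNorm[1] keeps one gain and one bias per feature (2 × 100 = 200), Dropout[2] has no parameters, and Dense[3] maps 100 units back to 1 output (100 weights + 1 bias = 101), for 501 trainable parameters in total.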


Model Summary:
+-----------+----------------+----------+
| Layer     | Output shape   |   Param# |
+===========+================+==========+
| Input     | (-1, 2, 2)     |        0 |
+-----------+----------------+----------+
| LSTM[0]   | (-1, 100)      |    41200 |
+-----------+----------------+----------+
| Conv1d[1] | (-1, 3, 100)   |    20100 |
+-----------+----------------+----------+
| Pool1d[2] | (-1, 2, 100)   |        0 |
+-----------+----------------+----------+
| Dense[3]  | (-1, 1)        |      201 |
+-----------+----------------+----------+
total trainable parameters: 61501
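
In the second summary, the LSTM count follows the usual four-gate formula 4 × ((input_size + hidden_size) × hidden_size + hidden_size): with input_size = 2 and hidden_size = 100 that is 4 × (102 × 100 + 100) = 41,200. The other rows can be read off the same way; for instance, Dense[3] presumably takes the flattened (2, 100) pooling output, i.e. 200 features, to 1 output (200 weights + 1 bias = 201).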

Examples

Each directory contains a train.* file that performs correctness and functionality tests for its corresponding format and language. You can run it to get a sense of what is going on.

DQN

Snippet

Define your network

  • Python:
from nn_without_frameworks import numpy_nn as nn  # or: from nn_without_frameworks import pure_nn as nn


class MyNet(nn.Module):
    def __init__(self, input_dim, out_dim):
        super().__init__()
        self.input_dim = input_dim
        self.out_dim = out_dim
        self.hidden1 = nn.layers.Dense(in_features=self.input_dim,
                                       out_features=100,
                                       activation=nn.acts.ReLU(),
                                       weight_initializer=nn.inits.HeNormal(nn.acts.ReLU()),
                                       bias_initializer=nn.inits.Constant(0.),
                                       regularizer_type="l2",
                                       lam=1e-3
                                       )

        self.output = nn.layers.Dense(in_features=100,
                                      out_features=self.out_dim,
                                      weight_initializer=nn.inits.XavierUniform(),
                                      bias_initializer=nn.inits.Constant(0.),
                                      regularizer_type="l2",
                                      lam=1e-3
                                      )

    def forward(self, x):
        x = self.hidden1(x)
        return self.output(x)
  • Java:
import Layers.Dense;

class MyNet extends Module{
    int in_features = 0, out_features = 0;
    Dense hidden1, output;
    public MyNet(int in_features, int out_features){
        this.in_features = in_features;
        this.out_features = out_features;
        this.hidden1 = new Dense(this.in_features,
                100,
                "relu",
                "he_normal",
                "zeros",  // bias initializer
                "l2",
                0.001F);
        this.layers.add(this.hidden1); // the crucial part that differs from PyTorch's subclassing: layers must be registered explicitly
        
        this.output = new Dense(100,
                out_features,
                "linear",
                "xavier_uniform",
                "zeros",  // bias initializer
                "l2",
                0.001F);
        this.layers.add(this.output); // again, layers must be registered explicitly, unlike PyTorch's subclassing
    }

    public float[][] forward(float[][] x){
        x = this.hidden1.forward(x);
        x = this.output.forward(x);
        return x;
    }
}
  • C++:
#include <iostream>
#include <module.h>
#include <layers.h>

using namespace std;

class MyNet : public Module{
public:
    int in_features = 0, out_features = 0;
    Dense *hidden, *output; // layers should be declared up front and, importantly, as pointers

    MyNet(int in_features, int out_features){
        this->in_features = in_features;
        this->out_features = out_features;

        this->hidden = new Dense{this->in_features,
                100,
                "relu",
                "he_normal",
                "zeros", // bias initializer
                "l2",
                0.001};
        this->parameters.push_back(this->hidden); // same as java

        this->output = new Dense{100,
                this->out_features,
                "linear",
                "xavier_uniform",
                "zeros", // bias initializer
                "l2",
                0.001};
        this->parameters.push_back(this->output); // same as Java
    }
    float_batch forward(const float_batch &input){ // float_batch =: vector<vector<float> >
        float_batch x = this->hidden->forward(input);
        x = this->output->forward(x);
        return x;

    }
};
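
Once defined, the Python network can be instantiated and called directly on a batch, as in the training snippets below. A quick sketch (the batch-first NumPy input shape is an assumption based on the summaries above):

import numpy as np

my_net = MyNet(input_dim=4, out_dim=2)  # MyNet as defined in the Python snippet above
x = np.random.randn(8, 4)               # a batch of 8 samples with 4 features each (shape assumed)
y = my_net(x)                           # forward pass; y is expected to have shape (8, 2)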

Train your network

  • Python
my_net = MyNet(num_features, num_classes)
ce_loss = nn.losses.CrossEntropyLoss()
opt = nn.optims.SGD(my_net.parameters, lr=1.)
for step in range(num_epoch):
    y = my_net(x)
    loss = ce_loss(y, t)
    my_net.backward(loss)
    opt.apply()
  • Java:
import Losses.*;
import Optimizers.*;

MyNet my_net = new MyNet(num_features, num_classes);
CrossEntropyLoss celoss = new CrossEntropyLoss();
SGD opt = new SGD(1.0F, my_net.layers);

for (int epoch = 0; epoch < num_epoch; epoch++) {
    float[][] y = my_net.forward(x);
    Loss loss = celoss.apply(y, t);
    my_net.backward(loss);
    opt.apply();
}
  • C++:
#include <losses.h>
#include <optimizers.h>

MyNet my_net = MyNet{num_features, num_classes};
CrossEntropyLoss celoss;
SGD opt(1, my_net.parameters);
float_batch y;  // float_batch =: vector<vector<float> >
for(int step = 0; step < num_epoch; step++){
    y = my_net.forward(x);
    Loss loss = celoss.apply(y, t);
    my_net.backward(loss);
    opt.apply();
}
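
Putting the Python pieces together, a minimal end-to-end sketch on synthetic data might look like the following. The input shape and the one-hot target encoding are assumptions, so the repository's train.* scripts remain the authoritative reference:

import numpy as np
from nn_without_frameworks import numpy_nn as nn

num_features, num_classes, num_epoch = 4, 3, 100

# Synthetic classification data (shapes and target encoding are assumptions).
x = np.random.randn(64, num_features)
labels = np.random.randint(0, num_classes, size=64)
t = np.eye(num_classes)[labels]            # one-hot targets (encoding assumed)

my_net = MyNet(num_features, num_classes)  # MyNet from the Python snippet above
ce_loss = nn.losses.CrossEntropyLoss()
opt = nn.optims.SGD(my_net.parameters, lr=1.)

for step in range(num_epoch):
    y = my_net(x)
    loss = ce_loss(y, t)
    my_net.backward(loss)
    opt.apply()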

Acknowledgement

Contributing

  • The current code is far from done and any fix, suggestion, pull request, issue, etc. is highly appreciated and welcome. 🤗
  • Current work focuses on discovering what is under the hood rather than on optimizing the implementation, so feel free to sanity-check the math and correctness of each part, and if you come up with a better or more optimized solution, please don't hesitate to open a PR. [Thank you in advance. 😊]
  • You can take a look at todo.md.