An implementation of a neural network with one hidden layer, written from scratch using only NumPy. You can choose the number of input, hidden, and output neurons.
Momentum is implemented to improve stochastic gradient descent, and activations from the forward pass are cached to speed up backpropagation.
Backpropagation and cross-entropy loss are implemented as well.
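A minimal sketch of the ideas above, not the repository's actual code: one training step with a single ReLU hidden layer, softmax output with cross-entropy loss, backprop reusing the cached hidden activation, and a momentum update. All names (`W1`, `vW1`, `lr`, `beta`, layer sizes) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 4, 8, 3               # pick the layer sizes freely
W1 = rng.normal(0, 0.1, (n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_hidden, n_out))
b2 = np.zeros(n_out)

# momentum buffers, one per parameter
vW1, vb1, vW2, vb2 = [np.zeros_like(p) for p in (W1, b1, W2, b2)]
lr, beta = 0.05, 0.9

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)      # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def step(X, Y):
    """One SGD-with-momentum step on batch (X, Y); Y is one-hot."""
    # forward pass -- cache z1 and a1 so backprop can reuse them
    z1 = X @ W1 + b1
    a1 = np.maximum(z1, 0)                    # ReLU hidden layer
    probs = softmax(a1 @ W2 + b2)
    loss = -np.mean(np.sum(Y * np.log(probs + 1e-12), axis=1))

    # backprop: softmax + cross-entropy combine to dz2 = probs - Y
    n = X.shape[0]
    dz2 = (probs - Y) / n
    dW2, db2 = a1.T @ dz2, dz2.sum(axis=0)
    dz1 = (dz2 @ W2.T) * (z1 > 0)             # ReLU derivative
    dW1, db1 = X.T @ dz1, dz1.sum(axis=0)

    # momentum update: v <- beta*v + grad; param <- param - lr*v
    for p, v, g in ((W1, vW1, dW1), (b1, vb1, db1),
                    (W2, vW2, dW2), (b2, vb2, db2)):
        v *= beta
        v += g
        p -= lr * v
    return loss

X = rng.normal(size=(32, n_in))
Y = np.eye(n_out)[rng.integers(0, n_out, 32)]
losses = [step(X, Y) for _ in range(50)]
print(losses[0], losses[-1])   # loss on this fixed batch should decrease
```

The parameter and velocity arrays are updated in place inside `step`, which is why no values need to be returned besides the loss.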
pip install -r requirements.txt
python main.py
- Memory issues with large matrix sizes
- Convolutional NN
- LSTM
- Transformers
All code is under the MIT License.