- Use this repo to train and test your own CNN. Feel free to browse the classes in layers.py, which contain a full CNN implementation in fewer than 50 lines of code.
- models/: Folder that stores the saved models. Further explanation in section 2.
- layers.py: File containing every layer of the CNN. Each layer is a class with a .forward and a .backward method.
- model.py: File with the Model class.
- run.py: Script run by the ./run.sh command. Trains the model.
- utils.py: File with helper functions and classes.
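To illustrate the layer convention described above (a class with a forward and a backward method), here is a minimal sketch of a fully connected layer in numpy. The class name `SketchDense`, its constructor arguments, and the `lr` parameter are assumptions made for illustration; the actual classes in layers.py may be organised differently.

```python
import numpy as np


class SketchDense:
    """Hypothetical fully connected layer; not the repo's actual class."""

    def __init__(self, n_in, n_out, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        # He-style initialisation (an assumption, chosen for illustration).
        self.W = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def forward(self, x):
        # Cache the input: the backward pass needs it for the weight gradient.
        self.x = x
        return x @ self.W + self.b

    def backward(self, grad_out, lr=0.01):
        # Gradients of the loss with respect to weights, bias and input.
        grad_W = self.x.T @ grad_out
        grad_b = grad_out.sum(axis=0)
        grad_in = grad_out @ self.W.T
        # Plain gradient-descent update.
        self.W -= lr * grad_W
        self.b -= lr * grad_b
        return grad_in


layer = SketchDense(4, 3)
out = layer.forward(np.ones((2, 4)))         # batch of 2 samples, 4 features
grad_in = layer.backward(np.ones((2, 3)))    # gradient flowing back to the input
```

The backward method returns the gradient with respect to its input so that the previous layer can continue the chain.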
- The only package used for the model is numpy. Other libraries are listed in requirements.txt.
- To set up and activate a miniconda virtual environment, run in a terminal:
conda create -n environment_name python=3.8
conda activate environment_name
- The requirements can be installed in the virtual environment with the command:
pip install -r requirements.txt
- To run, install the necessary requirements and an image dataset (.csv format).
- There must be a training file and a test file. Each file must have the label as the first column and the features as the remaining columns.
- You can download your image files into the data directory.
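The expected file layout (label in the first column, features in the remaining columns) can be sketched as follows. The helper name `load_csv` is hypothetical, and the sketch assumes a comma-separated file with no header row; the repo's own loader may handle these details differently.

```python
import io

import numpy as np


def load_csv(source, delimiter=","):
    """Split a label-first CSV into (features, labels). Assumes no header row."""
    data = np.loadtxt(source, delimiter=delimiter, ndmin=2)
    labels = data[:, 0].astype(int)   # first column: class label
    features = data[:, 1:]            # remaining columns: pixel values
    return features, labels


# Two tiny made-up "images" of 4 pixels each, with labels 7 and 2.
sample = io.StringIO("7,0,128,255,0\n2,255,64,0,0\n")
X, y = load_csv(sample)
```

In practice `source` would be the path passed as the training or test data file.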
Notes:
1: Training is only implemented on CPU (no torch, tensorflow or CUDA support).
2: Scipy is used for a faster implementation of correlation and convolution. I also wrote fully numpy-based implementations; they work and are in the functions.py file. The scipy implementation is used only for its efficiency gains in training.
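For readers unfamiliar with the two operations, here is a minimal numpy-only sketch of 2D "valid" cross-correlation and convolution. The function names are hypothetical and this is not the code in functions.py; it only illustrates that convolution is correlation with a flipped kernel.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Slide the kernel over the image without padding ('valid' mode)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def convolve2d_valid(image, kernel):
    """Convolution is correlation with the kernel flipped along both axes."""
    return correlate2d_valid(image, np.flip(kernel))


image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])
corr = correlate2d_valid(image, kernel)
conv = convolve2d_valid(image, kernel)
```

With scipy installed, `scipy.signal.correlate2d(image, kernel, mode="valid")` should return the same values as `corr`, without the Python-level double loop that makes the sketch above slow.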
- To train a CNN on your image dataset, go into run.sh, set the flag to --train and choose the following arguments:
  - --train_data (full path to your training data file) [OPTIONAL]
  - --test_data (full path to your test data file) [OPTIONAL]
  - --epochs (number of full passes through the training data at training time) [OPTIONAL]
  - --batch_size (size of the batch, i.e. number of images per batch) [OPTIONAL]
  - --augmenter_ratio (1 or 4; 1:ratio is how many times the training dataset will be augmented) [OPTIONAL]
  - --to_path (path to the .json file where the model parameters will be stored for later use) [OPTIONAL]
python3 run.py --train --train_data=path_to_train_data --test_data=path_to_test_data --to_path=name_of_json_that_will_store_model.json
- Run in a terminal:
./run.sh
- Whenever you feel the printed validation accuracy is good enough, you can kill the training. This will NOT corrupt the model saved in the given .json file, and you may proceed to testing and using the model :).
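One common way to keep a saved file intact when the process is killed mid-write is to write to a temporary file and then rename it over the target. The sketch below shows that pattern; it is an assumption about how such a guarantee can be achieved, not a description of what this repo does, and the function name `save_params_atomic` and the `"W"` parameter are hypothetical.

```python
import json
import os
import tempfile

import numpy as np


def save_params_atomic(params, to_path):
    """Write params to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(to_path))
    serializable = {name: value.tolist() for name, value in params.items()}
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(serializable, handle)
        # os.replace is atomic on the same filesystem: readers see either
        # the old file or the new one, never a half-written file.
        os.replace(tmp_path, to_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "model.json")
    save_params_atomic({"W": np.ones((2, 2))}, path)
    with open(path) as handle:
        restored = json.load(handle)
```

Saving once per epoch with this pattern means an interrupted run leaves the parameters from the last completed save.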
Note: If you want to alter layers/dimensions, do so in the run.py file, with the .add(Layer) method.
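As a rough picture of how an .add(Layer) style of model building can work, here is a minimal sketch. `SketchModel`, `Flatten`, and `ReLU` are hypothetical stand-ins written for this example; the real Model class and layer classes live in model.py and layers.py and may differ.

```python
import numpy as np


class Flatten:
    def forward(self, x):
        self.shape = x.shape                  # remember shape for backward
        return x.reshape(x.shape[0], -1)

    def backward(self, grad_out):
        return grad_out.reshape(self.shape)


class ReLU:
    def forward(self, x):
        self.mask = x > 0
        return np.where(self.mask, x, 0.0)

    def backward(self, grad_out):
        return np.where(self.mask, grad_out, 0.0)


class SketchModel:
    """Hypothetical container: layers run in the order they were added."""

    def __init__(self):
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)

    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x

    def backward(self, grad):
        for layer in reversed(self.layers):
            grad = layer.backward(grad)
        return grad


model = SketchModel()
model.add(Flatten())
model.add(ReLU())
out = model.forward(np.array([[[-1.0, 2.0], [3.0, -4.0]]]))  # one 2x2 "image"
grad = model.backward(np.ones((1, 4)))
```

Changing the architecture then amounts to changing which layers are added and in what order.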
- To test a CNN on your image dataset, go into run.sh, set the flag to --test and choose the following arguments:
  - --test_data (full path to your test data file)
  - --from_path (path to the file with the model parameters to be loaded)
python3 run.py --test --test_data=path_to_test_data --from_path=name_of_json_with_model.json
- Run in a terminal:
./run.sh
Note: The accuracy score for these tests will usually be lower than the accuracy scores achieved with the training and validation sets.
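The training and testing flags described in this README could be parsed along the following lines. This is only a sketch with argparse: the flag names come from the text above, but the mutual exclusion of --train and --test, the argument types, the absence of defaults, and the example paths are all assumptions, and run.py may parse its arguments differently.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of a run.py-style CLI")
    # Assumption: exactly one of --train / --test is given per run.
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--train", action="store_true")
    mode.add_argument("--test", action="store_true")
    parser.add_argument("--train_data")
    parser.add_argument("--test_data")
    # Defaults are left as None here because the README does not state them.
    parser.add_argument("--epochs", type=int)
    parser.add_argument("--batch_size", type=int)
    parser.add_argument("--augmenter_ratio", type=int, choices=[1, 4])
    parser.add_argument("--to_path")
    parser.add_argument("--from_path")
    return parser


# Hypothetical paths, used only to show the --flag=value style.
args = build_parser().parse_args(
    ["--test", "--test_data=data/test.csv", "--from_path=models/model.json"]
)
```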
- The full Convolutional Neural Network implementation achieved 99.36% accuracy on the validation set of the MNIST handwritten digit dataset.
- This implementation is NOT the one presented in the run.py file.
- The 99.36% implementation used 5 kernels and 256-dimensional Dense layers.
- The training time was ~25h on my M2 CPU.