
TyNet: A Custom Object Detector with a PyTorch Lightning Training Pipeline

TyNet is a lightweight, powerful, and scalable CNN-based object detector. You can choose among many backbones, or swap in a model of your own and train it with the same pipeline. The project is an ongoing effort to design a custom object detector from scratch.

Project Structure

object-detection/
├── Dockerfile
├── LICENSE
├── README.md
├── coco.yml
├── get_coco128.sh
├── requirements.txt
├── train.py
├── training.yaml
├── data/
│   ├── augmentations.py
│   ├── coco_dataset.py
│   ├── process_box.py
│   └── yolo_to_coco.py
├── datasets/
│   └── coco2017
└── models/
    ├── detector.py
    ├── loss.py
    ├── model.py
    ├── model_bn.py
    └── utils.py

Getting Started

To get started with TyNet, follow these steps:

1. Clone this repository to your local machine:

git clone https://github.com/taylanates24/object-detection.git

2. Build a Docker image and start a container from it (recommended):

docker build -t tynet:v1 -f Dockerfile .
docker run -v $(pwd):/workspace -it --rm --ipc host tynet:v1

3. (Optional) If you prefer a Python virtual environment instead, create one and install the requirements:

python3 -m venv tynet
source tynet/bin/activate
pip3 install -r requirements.txt

4. Prepare your dataset in COCO format. If you just want a small dataset to set up the environment quickly, run the following to download the coco128 dataset (the first 128 images and labels of the COCO 2017 training set):

chmod +x get_coco128.sh
./get_coco128.sh

coco128 ships with YOLO-format labels, so convert them to COCO format by running:

python3 data/yolo_to_coco.py --yolo_annotations /workspaces/object-detection/datasets/coco128/labels/train2017 --yolo_img /workspaces/object-detection/datasets/coco128/images/train2017 --coco_names /workspaces/object-detection/datasets/coco.names --out_file_name /workspaces/object-detection/datasets/coco128_train.json --check True

5. Edit the configuration file training.yaml to match your dataset, hyperparameters, and data augmentations.

6. Start training:

python3 train.py --train_cfg training.yaml --dataset_cfg coco.yml

Data Preprocessing

TyNet uses the COCO annotation format. The data/coco_dataset.py script loads the images and annotations and preprocesses them for training.
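For orientation, here is a minimal sketch of how COCO-format loading typically looks with pycocotools; the repository's actual logic lives in data/coco_dataset.py and may differ:

import cv2
from pycocotools.coco import COCO
from torch.utils.data import Dataset

class CocoDetectionSketch(Dataset):
    # Illustrative only; see data/coco_dataset.py for the real implementation.
    def __init__(self, img_dir, ann_file, transforms=None):
        self.coco = COCO(ann_file)
        self.img_ids = sorted(self.coco.getImgIds())
        self.img_dir = img_dir
        self.transforms = transforms

    def __len__(self):
        return len(self.img_ids)

    def __getitem__(self, idx):
        info = self.coco.loadImgs(self.img_ids[idx])[0]
        image = cv2.imread(f"{self.img_dir}/{info['file_name']}")
        anns = self.coco.loadAnns(self.coco.getAnnIds(imgIds=self.img_ids[idx]))
        boxes = [a['bbox'] for a in anns]           # COCO boxes are [x, y, width, height]
        labels = [a['category_id'] for a in anns]
        if self.transforms is not None:
            image, boxes, labels = self.transforms(image, boxes, labels)
        return image, boxes, labels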

Augmentations

imgaug

I added some geometric and color augmentations from the imgaug library. You can change how many augmentations are applied in each iteration, as well as their values and variation, via the imgaug values in the training.yaml file.

For example, if you want to use only random horizontal flip and scale, num_aug should be at most 2 (if num_aug is less than the total number of enabled augmentations, a random subset of that size is applied in each iteration), fliplr should be the probability of the random horizontal flip, scale should be a [min, max] range, and all other imgaug augmentations should be null, as in the fragment below.
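A hypothetical training.yaml fragment for that setup (the key nesting and the rotate key are illustrative; check the shipped training.yaml for the exact schema):

imgaug:
  num_aug: 2          # apply at most two augmentations per iteration
  fliplr: 0.5         # probability of a random horizontal flip
  scale: [0.8, 1.2]   # scale range
  rotate: null        # every other augmentation stays disabled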

CutOut

It is my implementation of "Improved Regularization of Convolutional Neural Networks with Cutout" (https://arxiv.org/abs/1708.04552) paper with some improvements. In my implementation, you can change the filled boxes which cutted out. You can fill it with gaussian noise, random colors, white, black, and gray boxes to the cutting area. You can also change the number of cutouts and their scale with respect to the height of the image by changing cutout percentages values in the training.yaml file. The lenght of percentages is the number of cutting boxes.

Copy Paste

It is my implementation of "Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation" (https://arxiv.org/abs/2012.07177) paper with some improvements. In this implementation, you can apply bounding box augmentations by changing copy_paste: box_augments: variable in training.yaml file. The bounding box augmentations only applied to the pasted boxes and at most 1 augmentation is applied at one time. You can also change the number of pasted boxes by changing pasted_bbox_number value in training.yaml

Model Architecture

Backbone

As a backbone, I implemented an interchangeable structure. You can use any backbone from the timm library, pre-trained or not. For example, the list of pre-trained backbones can be obtained with the following code:

import timm

available_backbones = timm.list_models(pretrained=True)
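To actually use a backbone for detection, timm's features_only mode returns multi-scale feature maps that the neck can consume; a hedged example (the model name and indices here are arbitrary):

import timm
import torch

backbone = timm.create_model('resnet50', pretrained=True, features_only=True, out_indices=(2, 3, 4))
feats = backbone(torch.randn(1, 3, 512, 512))
print([f.shape for f in feats])  # three feature maps at strides 8, 16, and 32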

Neck

As a neck, I designed an FPN structure that fuses feature levels by addition.

[Figure: FPN architecture diagram]
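A minimal sketch of additive FPN fusion, assuming 1x1 lateral convolutions and nearest-neighbor upsampling (the repository's actual neck in models/model.py may differ in detail):

import torch.nn as nn
import torch.nn.functional as F

class AdditiveFPN(nn.Module):
    # Illustrative top-down FPN that fuses levels by addition.
    def __init__(self, in_channels, out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1) for c in in_channels)
        self.smooth = nn.ModuleList(nn.Conv2d(out_channels, out_channels, 3, padding=1) for _ in in_channels)

    def forward(self, feats):
        # feats: backbone maps ordered from highest to lowest resolution
        x = [lat(f) for lat, f in zip(self.lateral, feats)]
        for i in range(len(x) - 1, 0, -1):
            # upsample the coarser map and fuse by addition
            x[i - 1] = x[i - 1] + F.interpolate(x[i], size=x[i - 1].shape[-2:], mode='nearest')
        return [s(m) for s, m in zip(self.smooth, x)]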

Additionally, I designed a simple yet effective and computationally cheap, scalable block: ScalableCSPResBlock. It is scalable and designed by applying CSPNet ideas on top of ResNet.

[Figure: ScalableCSPResBlock diagram]
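In broad strokes, a CSP block splits the channels into two branches, processes one through residual convolutions, and concatenates the results. A sketch of that idea, under those assumptions (the actual ScalableCSPResBlock may differ):

import torch
import torch.nn as nn

class CSPResBlockSketch(nn.Module):
    # Illustrative CSP-over-ResNet block; depth scales with num_blocks.
    def __init__(self, channels, num_blocks=2):
        super().__init__()
        half = channels // 2  # assumes an even channel count
        def res_unit():
            return nn.Sequential(
                nn.Conv2d(half, half, 3, padding=1, bias=False),
                nn.BatchNorm2d(half),
                nn.ReLU(inplace=True),
            )
        self.blocks = nn.ModuleList(res_unit() for _ in range(num_blocks))
        self.fuse = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        a, b = torch.chunk(x, 2, dim=1)  # CSP split: half the channels per branch
        for block in self.blocks:
            a = a + block(a)             # residual units on one branch only
        return self.fuse(torch.cat([a, b], dim=1))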

Head

This is my implementation of the bounding box regression head. Each Conv2d block consists of one 3x3 Conv2d and one 1x1 Conv2d. The output is a 1xNx4 tensor: N proposals, each with four bounding box coordinates.

[Figure: bounding box regression head]

This is my implementation of the bounding box classification head. Similarly, each Conv2d block consists of one 3x3 Conv2d and one 1x1 Conv2d. The output is a 1xNx80 tensor: for each of the N proposals, a probability for each of the 80 COCO classes.

[Figure: bounding box classification head]
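A minimal sketch of both heads, assuming one proposal per spatial location and a 256-channel input (the repository's actual layers, presumably in models/detector.py, may differ):

import torch
import torch.nn as nn

def conv_pair(in_ch, out_ch):
    # each head "Conv2d" is a 3x3 convolution followed by a 1x1 convolution
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_ch, out_ch, 1),
    )

class HeadSketch(nn.Module):
    def __init__(self, in_ch=256, num_classes=80):
        super().__init__()
        self.num_classes = num_classes
        self.reg = conv_pair(in_ch, 4)
        self.cls = conv_pair(in_ch, num_classes)

    def forward(self, x):
        b = x.shape[0]
        boxes = self.reg(x).permute(0, 2, 3, 1).reshape(b, -1, 4)                  # 1xNx4
        scores = self.cls(x).permute(0, 2, 3, 1).reshape(b, -1, self.num_classes)  # 1xNx80
        return boxes, scores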

Training

To train TyNet, you can edit the training pipeline, such as the learning rate scheduler, the optimizer, and so on.

Optimizer

I have implemented four optimizers in this repository: Adam, AdamW, SGD, and ASGD. You can choose one by changing training: optimizer: in the training.yaml file.
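A hypothetical fragment (the exact option spellings may differ from the shipped file):

training:
  optimizer: adamw   # one of: adam, adamw, sgd, asgd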

Learning Rate Scheduler

There are three learning rate schedulers in this repository: cosine, multistep_lr, and cosine_annealing. You can choose one by changing training: lr_scheduler: in the training.yaml file.
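For example:

training:
  lr_scheduler: cosine   # one of: cosine, multistep_lr, cosine_annealing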

Validation

There is also a validation frequency that controls how often the model is validated. It is set to 1 by default, meaning validation runs after every epoch.
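In training.yaml this might look like the following; the key name here is a guess, so check the shipped file:

training:
  val_freq: 1   # hypothetical key name: validate once per epoch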

Logging

PyTorch Lightning supports several loggers; in this repository, TensorBoard is used. After training starts, you can view the logs by opening a new terminal and running:

tensorboard --bind_all --logdir tb_logs

Shortly after, the TensorBoard dashboard will be available on localhost.

tb_logs is the default logging directory of this repository. You can change it by editing the

logger = TensorBoardLogger("tb_logs", name="my_model")

line in train.py.
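For context, a TensorBoardLogger is typically passed to the Lightning Trainer like this (a sketch; see train.py for the repository's actual wiring):

from pytorch_lightning import Trainer
from pytorch_lightning.loggers import TensorBoardLogger

logger = TensorBoardLogger('tb_logs', name='my_model')
trainer = Trainer(logger=logger)  # trainer.fit(...) then writes logs under tb_logs/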

Testing and Evaluation

Coming soon

Conclusion and Future Work

The head of the model will be rewritten using CSPNet to make it more accurate and faster.

You can read my blog for more insight into how to design a custom object detector from scratch. The first part is "How To Design An Object Detector Part 1: Choosing A Backbone"; the other parts will be available soon.
