PyAutoLabeler

PyAutoLabeler is a simple tool that speeds up image labeling for custom mobile detection models. The idea is to train a slower but highly accurate model on the labeled data you already have, then use that model to label additional images for training the faster but less accurate mobile model. Annotations follow the XML format used by tzutalin/labelImg, so newly labeled images can be validated in LabelImg.

Single Shot MultiBox Detector (SSD) implementation in PyTorch borrowed from:

qfgaohao/pytorch-ssd

Preparation

First, clone this repository:

git clone https://github.com/RishavRajendra/pyAutoLabeler.git
cd pyAutoLabeler

Prerequisites

Tested on Python 3.7.1.

Install the required packages:

pip3 install torch torchvision
pip3 install opencv-python
pip3 install pandas
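
To confirm that the packages above are visible to your interpreter, a small check like the following can help. It is only a sketch: the helper name missing_packages is made up for this example, and the list uses import names, so opencv-python appears as cv2.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Import names for the packages installed above (opencv-python is imported as cv2).
REQUIRED = ["torch", "torchvision", "cv2", "pandas"]

print("Missing packages:", missing_packages(REQUIRED) or "none")
```

An empty result means every listed module can be located; it does not verify versions or GPU support.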

Data Preparation

The dataset follows the Pascal VOC format, so the images and annotations need to be arranged in the following structure:

|-pytorch
    |-datasets
        |-images
            |-test
                |-Annotations
                |-JPEGImages
                |-test.txt
            |-train
                |-Annotations
                |-JPEGImages
                |-train.txt

Populate test.txt and train.txt with the names of the images in the test and train folders respectively, one name per line and without the file extension. For example, image01.jpg should be listed as image01. I recommend running ls Annotations/*.xml > train.txt and then removing the Annotations/ prefix and the .xml suffix using a text editor.
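
If you would rather skip the manual editing, the list files can be generated with a few lines of Python. This is a minimal sketch, not part of the repository: the function name write_image_list is hypothetical, and it assumes every image has a matching .xml file in the Annotations folder.

```python
from pathlib import Path


def write_image_list(split_dir, list_name):
    """Write the extension-free name of every XML file in
    <split_dir>/Annotations to <split_dir>/<list_name>, one per line."""
    split_dir = Path(split_dir)
    stems = sorted(p.stem for p in (split_dir / "Annotations").glob("*.xml"))
    (split_dir / list_name).write_text("".join(stem + "\n" for stem in stems))
    return stems
```

For the layout above, you would call write_image_list("pytorch/datasets/images/train", "train.txt") and write_image_list("pytorch/datasets/images/test", "test.txt") from the repository root.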

Replace the class names in pytorch/voc_dataset.py and pytorch/models/voc-model-labels.txt with your own class labels.

Pre-trained Model

Download pre-trained models:

wget -P pytorch/models https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
wget -P pytorch/models https://storage.googleapis.com/models-hao/mb2-imagenet-71_8.pth

Train

Train the VGG-based SSD model that will label our images for us:

cd pytorch
python train_ssd.py --train_dataset datasets/images/train/ --test_dataset datasets/images/test/ --net vgg16-ssd --base_net models/vgg16_reducedfc.pth --batch_size 24 --num_epochs 200 --scheduler "multi-step" --milestones "120,160"
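
To get a feel for what the multi-step scheduler with milestones 120 and 160 does over 200 epochs, here is a small pure-Python sketch. It mirrors the usual multi-step rule (multiply the learning rate by a factor gamma each time a milestone is reached). The base learning rate of 1e-3 and gamma of 0.1 are assumptions chosen for illustration; check train_ssd.py for the values it actually uses.

```python
from bisect import bisect_right


def multistep_lr(base_lr, milestones, gamma, epoch):
    """Learning rate at `epoch`: base_lr decayed by gamma once per milestone reached."""
    return base_lr * gamma ** bisect_right(milestones, epoch)


# Parse the milestones the same way they are passed on the command line.
milestones = [int(m) for m in "120,160".split(",")]

# base_lr=1e-3 and gamma=0.1 are illustrative assumptions, not values from this repo.
for epoch in (0, 119, 120, 159, 160, 199):
    print(epoch, multistep_lr(1e-3, milestones, 0.1, epoch))
```

With these settings the rate stays constant for the first 120 epochs and then drops twice, which is why the milestones should sit below --num_epochs.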

Evaluate Training

Highest per-class average precision I achieved:

Average Precision Per-class:
slope: 0.8559686187901899
start: 0.8570291777188328
blocka: 0.9020976917186305
blockb: 0.869311086160904
blockc: 0.8974831184775937
blockd: 0.879126840705184
blocke: 0.8879779277601088
blockf: 0.8877579605029737
obstacle: 0.8977965373066247
side: 0.7882676643651874
corner: 0.8587204123994634

Average Precision Across All Classes: 0.8710488214459722

Code to evaluate the model:

python eval_ssd.py --net vgg16-ssd --dataset pytorch/datasets/images/test/ --trained_model pytorch/models/vgg16-trained.pth --label_file pytorch/models/voc-model-labels.txt
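
The overall figure reported above is consistent with the unweighted mean of the per-class values. The short sketch below recomputes it from the numbers in this section and also picks out the weakest class, which is a reasonable place to focus when validating new labels.

```python
# Per-class average precision, copied from the results listed above.
average_precision = {
    "slope": 0.8559686187901899,
    "start": 0.8570291777188328,
    "blocka": 0.9020976917186305,
    "blockb": 0.869311086160904,
    "blockc": 0.8974831184775937,
    "blockd": 0.879126840705184,
    "blocke": 0.8879779277601088,
    "blockf": 0.8877579605029737,
    "obstacle": 0.8977965373066247,
    "side": 0.7882676643651874,
    "corner": 0.8587204123994634,
}

# Mean average precision: the plain average over all classes.
mean_ap = sum(average_precision.values()) / len(average_precision)

# The class with the lowest average precision.
weakest = min(average_precision, key=average_precision.get)

print(f"mAP: {mean_ap:.4f}, weakest class: {weakest}")
```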

Label new images

Use the newly trained model to label new images and grow your dataset:

python img_to_xml.py <path_to_images> <path_to_annotations> <model_path> <label_path>

I highly recommend going through the labeled images with LabelImg to validate the bounding boxes and correct them where needed, since the quality of these labels directly affects the accuracy of your mobile model.
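
For orientation, here is roughly what a Pascal VOC style annotation of the kind LabelImg reads looks like when built in Python. This is an illustrative sketch and not the code in img_to_xml.py: the function name build_voc_annotation is hypothetical, the sample box is made up, and only a subset of the fields LabelImg normally writes is included.

```python
import xml.etree.ElementTree as ET


def build_voc_annotation(filename, width, height, boxes, depth=3):
    """Build a VOC-style annotation tree.

    boxes is a list of (label, xmin, ymin, xmax, ymax) tuples in pixel coordinates.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename

    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)

    for label, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", xmin), ("ymin", ymin), ("xmax", xmax), ("ymax", ymax)):
            # VOC stores integer pixel coordinates.
            ET.SubElement(bndbox, tag).text = str(int(value))

    return ET.ElementTree(root)


# Made-up detection, used only to show the resulting structure.
tree = build_voc_annotation("image01.jpg", 640, 480, [("blocka", 48, 60, 210, 300)])
print(ET.tostring(tree.getroot(), encoding="unicode"))
```

Calling tree.write("image01.xml") would save the annotation next to the other files in the Annotations folder.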

Tensorflow Support

If you want to use the labeled images with the TensorFlow Object Detection API, convert the XML annotations to CSV:

python xml_to_csv.py <dataset_path>
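
As a rough illustration of what an XML-to-CSV conversion involves, the sketch below flattens every bounding box in a folder of VOC annotations into one CSV row. It is not the repository's xml_to_csv.py: the function names are hypothetical, and the column layout is an assumption based on what TensorFlow Object Detection API tutorials commonly use.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed column layout; adjust it to whatever your TFRecord script expects.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def annotation_rows(xml_path):
    """Return one row per <object> found in a single VOC annotation file."""
    root = ET.parse(str(xml_path)).getroot()
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = [int(float(box.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        rows.append([filename, width, height, obj.findtext("name")] + coords)
    return rows


def annotations_to_csv(annotations_dir, csv_path):
    """Write all boxes from every XML file in annotations_dir to csv_path."""
    rows = []
    for xml_path in sorted(Path(annotations_dir).glob("*.xml")):
        rows.extend(annotation_rows(xml_path))
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
    return len(rows)
```

The return value is the number of boxes written, which is a quick sanity check against the number of objects you expect in the split.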

The dataset at <dataset_path> should follow the directory structure shown in Data Preparation. After converting the generated XML to CSV, you can use generate_tfrecord.py to create your TFRecord files. Some TensorFlow Object Detection API tutorials I love are:

Authors

Rishav Rajendra - Website

License

This project is licensed under the MIT License - see the LICENSE file for details.
