LabelDetection is a graphical tool that facilitates all the steps required in the pipeline to build and use a deep-learning-based object detection model. This includes the functionality to annotate a dataset of images, train a model, and use it for prediction on new images. LabelDetection is based on the LabelImg tool, and its key features are:
- Facilitates the annotation of images.
- Detects objects using deep-learning-based detection models trained with different algorithms and libraries.
- Applies test-time augmentation to improve the performance of models.
- Generates the necessary code and folder structure to train a model using different algorithms.
- Includes data distillation in the training code to employ unlabelled images when constructing the detection model.
- Generates a report of the detected objects.
- Installation
- How to use
- Using detection models
- Test-time augmentation for object detection
- Training new models
- Citation
- Acknowledgements
LabelDetection can be run on Linux and Windows.
LabelDetection can be installed on both Linux and Windows using pip:
pip install labeldetection
labelDetection
This tool requires Python 3.6, Qt5, and the packages listed in the requirements.txt file.
- Clone this repository.
git clone https://github.com/ancasag/LabelDetection
- Install the necessary dependencies.
cd LabelDetection
sudo apt-get install pyqt5-dev-tools
pip3 install -r requirements.txt
make qt5py3
- Run LabelDetection:
python3 labelDetection.py
Here is the manual explaining how to use LabelDetection. Below, you can also watch a video showing how LabelDetection works with a series of examples.
LabelDetection can employ models trained with different algorithms and libraries to detect objects in images. Currently, LabelDetection supports models trained with the following libraries (a small sketch for checking the expected folder contents is shown after the list):
- YOLO v3 models trained with the Darknet library. To use this kind of model, you need a folder with three files: a .weights file (that contains the weights of the model), a .cfg file (that contains the configuration of the network), and a .names file (that contains the classes that can be detected with the model). Files for COCO dataset: download.
- SSD512 models trained with the MXNet library. To use this kind of model, you need a folder with two files: a .params file (that contains the weights of the model) and a .txt file (that contains the classes that can be detected with the model). Files for COCO dataset: download.
- Faster RCNN models trained with the MXNet library. To use this kind of model, you need a folder with two files: a .params file (that contains the weights of the model) and a .txt file (that contains the classes that can be detected with the model). Files for COCO dataset: download.
- YOLO models trained with the MXNet library. To use this kind of model, you need a folder with two files: a .params file (that contains the weights of the model) and a .txt file (that contains the classes that can be detected with the model). Files for COCO dataset: download.
- RetinaNet-Resnet50 models trained with Keras. To use this kind of model, you need a folder with two files: a .h5 file (that contains the weights of the model) and a .csv file (that contains the classes that can be detected with the model). Files for COCO dataset: download.
- Mask-RCNN-Resnet50 models trained with Keras. To use this kind of model, you need a folder with two files: a .h5 file (that contains the weights of the model) and a .names file (that contains the classes that can be detected with the model). Files for COCO dataset: download.
- EfficientDet-B0 models trained with Keras. To use this kind of model, you need a folder with two files: a .h5 file (that contains the weights of the model) and a .names file (that contains the classes that can be detected with the model). Files for fruit dataset: download.
- FSAF-Resnet50 models trained with Keras. To use this kind of model, you need a folder with two files: a .h5 file (that contains the weights of the model) and a .names file (that contains the classes that can be detected with the model). Files for fruit dataset: download.
- FCOS-Resnet50 models trained with Keras. To use this kind of model, you need a folder with two files: a .h5 file (that contains the weights of the model) and a .names file (that contains the classes that can be detected with the model). Files for fruit dataset: download.
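As a quick reference, the following sketch (purely illustrative; the dictionary and function name are not part of LabelDetection's API) checks that a model folder contains the file types listed above before it is loaded:

```python
# Illustrative sketch only (not LabelDetection's API): check that a model
# folder contains the file types expected for a given kind of model.
from pathlib import Path

# Expected file extensions, taken from the list above.
EXPECTED_FILES = {
    "yolo-darknet": {".weights", ".cfg", ".names"},
    "ssd512-mxnet": {".params", ".txt"},
    "faster-rcnn-mxnet": {".params", ".txt"},
    "yolo-mxnet": {".params", ".txt"},
    "retinanet-keras": {".h5", ".csv"},
    "mask-rcnn-keras": {".h5", ".names"},
    "efficientdet-keras": {".h5", ".names"},
    "fsaf-keras": {".h5", ".names"},
    "fcos-keras": {".h5", ".names"},
}

def missing_model_files(folder, model_type):
    """Return the file extensions missing from the model folder."""
    present = {p.suffix for p in Path(folder).iterdir() if p.is_file()}
    return EXPECTED_FILES[model_type] - present

# Example: missing_model_files("cocoYolo", "yolo-darknet") returns set() if complete.
```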
Test-time augmentation (TTA) is an ensemble technique that can be applied to increase the performance of a model. This functionality is available thanks to the ensemble repository and can be applied to any of the models described in Using detection models.
Three different voting strategies can be applied for TTA (a minimal sketch of the voting logic is shown after the list):
- Affirmative. This means that whenever one of the techniques that produce the initial predictions says that a region contains an object, that detection is considered valid.
- Consensus. This means that the majority of the initial methods must agree to consider that a region contains an object. The consensus strategy is analogous to the majority voting strategy commonly applied in ensemble methods for image classification.
- Unanimous. This means that all the methods must agree to consider that a region contains an object.
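The following minimal sketch (a simplification, not the actual code of the ensemble repository) illustrates how the three strategies decide whether a region is kept, given one list of boxes per TTA pass:

```python
# Simplified sketch of the three voting strategies (affirmative, consensus,
# unanimous); not the implementation used by LabelDetection. Boxes are
# (x1, y1, x2, y2) tuples; `detections` holds one list of boxes per TTA pass.

def iou(a, b):
    """Intersection over union of two boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def vote(detections, strategy="consensus", iou_thr=0.5):
    """Keep a region depending on how many TTA passes detected it."""
    n = len(detections)
    required = {"affirmative": 1,           # one pass is enough
                "consensus": n // 2 + 1,    # a majority must agree
                "unanimous": n}[strategy]   # every pass must agree
    kept = []
    for boxes in detections:
        for box in boxes:
            votes = sum(any(iou(box, other) >= iou_thr for other in passes)
                        for passes in detections)
            already = any(iou(box, k) >= iou_thr for k in kept)
            if votes >= required and not already:
                kept.append(box)
    return kept
```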
These are all the techniques that we have defined for the TTA process. Each entry gives the name assigned to the technique followed by a short description; a sketch of a single TTA step is shown after the list.
- "avgBlur": Average blurring
- "bilaBlur": Bilateral blurring
- "blur": Blurring
- "chanHsv": Change to hsv colour space
- "chanLab": Change to lab colour space
- "crop": Crop
- "dropOut": Dropout
- "elastic": Elastic deformation
- "histo": Equalize histogram
- "vflip": Vertical flip
- "hflip": Horizontal flip
- "hvflip": Vertical and horizontal flip
- "gamma": Gamma correction
- "blurGau": Gaussian blurring
- "avgNoise": Add Gaussian noise
- "invert": Invert
- "medianblur": Median blurring
- "none": None
- "raiseBlue": Raise blue channel
- "raiseGreen": Raise green channel
- "raiseHue": Raise hue
- "raiseRed": Raise red
- "raiseSatu": Raise saturation
- "raiseValue": Raise value
- "resize": Resize
- "rotation10": Rotate 10º
- "rotation90": Rotate 90º
- "rotation180": Rotate 180º
- "rotation270": Rotate 270º
- "saltPeper": Add salt and pepper noise
- "sharpen": Sharpen
- "shiftChannel": Shift channel
- "shearing": Shearing
- "translation": Translation
From an annotated folder, LabelDetection generates all the code required to train a new model. The generated code is available in the form of a Jupyter notebook that can be run locally, provided that the user has a GPU, or using Google Colaboratory.
Currently, LabelDetection can generate the code to train the following models:
- YOLO v3 models trained with the Darknet library
- SSD512 models trained with the MXNet library.
- RetinaNet-Resnet50 models trained with Keras.
- Mask-RCNN-Resnet50 models trained with Keras.
- EfficientDet-B0 models trained with Keras.
- FSAF-Resnet50 models trained with Keras.
- FCOS-Resnet50 models trained with Keras.
All the models trained with the generated code can be later employed for detection in LabelDetection.
There are several options to upload a dataset to Google Colaboratory:
- Upload the zip file generated by LabelDetection to the main folder of Google Drive.
- Include the following code in two empty cells of the notebook:
from google.colab import drive
drive.mount('/content/drive')
!mv /content/drive/My\ Drive/dataset.zip dataset.zip
!unzip dataset.zip
- Upload the zip file generated by LabelDetection to Dropbox.
- Create and copy a Dropbox share link for the uploaded file.
- Include and execute the following instructions, where dropbox-link must be replaced with the Dropbox share link.
!wget dropbox-link -O datasets.zip
!unzip datasets.zip
Annotating a dataset of images might be a time-consuming task, but object detection models benefit from having many annotated images. Therefore, LabelDetection incorporates into the training code the necessary functionality to apply a semi-supervised learning technique known as data distillation. This technique employs both the annotated and the unlabelled images to create the detection model, and it can be applied with all the available models.
By default, the transformation techniques applied for data distillation are histogram normalization and vertical flips, but the user can employ any of the techniques described in techniques for TTA.
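The following sketch outlines the data distillation loop. It is not LabelDetection's generated training code: `train`, `predict`, and `apply_transform` are placeholder callables, and `vote` is the voting sketch shown in the TTA section. A model trained on the annotated images predicts on transformed copies of each unlabelled image, the predictions are combined by voting, and the resulting boxes are used as pseudo-labels when retraining.

```python
# Sketch of the data distillation idea; `train`, `predict` and `apply_transform`
# are placeholder callables, not LabelDetection's API. `vote` is the voting
# sketch shown earlier. The default transforms match the ones named above.

def data_distillation(labelled, unlabelled, train, predict, apply_transform,
                      transforms=("histo", "vflip")):
    """Pseudo-label the unlabelled images and retrain on the enlarged set."""
    model = train(labelled)               # initial model from annotated images
    pseudo_labelled = []
    for image in unlabelled:
        # Detect on each transformed copy and combine the results by voting.
        detections = [predict(model, apply_transform(name, image))
                      for name in transforms]
        boxes = vote(detections, strategy="consensus")
        if boxes:
            pseudo_labelled.append((image, boxes))
    # Retrain using the original annotations plus the pseudo-labels.
    return train(list(labelled) + pseudo_labelled)
```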