Comparison of State-of-the-Art Deep Learning APIs for Image Multi-Label Classification using Semantic Metrics
Functionality and metric calculation code for the paper "Comparison of State-of-the-Art Deep Learning APIs for Image Multi-Label Classification using Semantic Metrics" (also available on arXiv).
@article{kubany2020comparison,
title={Comparison of State-of-the-Art Deep Learning APIs for Image Multi-Label Classification using Semantic Metrics},
author={Kubany, Adam and Ishay, Shimon Ben and Ohayon, Ruben Sacha and Shmilovici, Armin and Rokach, Lior and Doitshman, Tomer},
journal={Expert Systems with Applications},
year={2020},
publisher={Elsevier},
DOI={https://doi.org/10.1016/j.eswa.2020.113656}
}
This repository includes the following functionality:
- ETL procedure for the ‘Visual Genome’ and ‘Open Images’ datasets
- Multi-label classification inference for the commercial and open-source APIs evaluated in the paper
- Evaluation metrics:
  - Example-based metrics
  - Semantic metrics
  - Label-based metrics
The ETL procedure for the 'Visual Genome' and 'Open Images' benchmark datasets.
Download each dataset's metadata to the appropriate folder (\datasets\[DATASET]\metadata): the objects.json file for the Visual Genome dataset, and the train-annotations-human-imagelabels-boxable.csv and train-images-boxable-with-rotation.csv files for the Open Images dataset.
$ python etl.py
The ETL results are saved in the \datasets\[DATASET]\data folder.
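As a rough illustration of what the Visual Genome part of the ETL involves, the sketch below collapses an objects.json-style structure into one label list per image. The field names (`image_id`, `objects`, `names`) follow the public Visual Genome objects file as far as we know; the function name `labels_per_image` and the lower-casing step are assumptions and not necessarily what etl.py does.

```python
import json


def labels_per_image(records):
    """Map each image id to the sorted, de-duplicated list of its object names."""
    result = {}
    for record in records:
        names = set()
        for obj in record.get("objects", []):
            for name in obj.get("names", []):
                cleaned = name.strip().lower()  # assumed normalisation step
                if cleaned:
                    names.add(cleaned)
        result[record["image_id"]] = sorted(names)
    return result


# Tiny in-memory stand-in for the parsed contents of objects.json (made-up data)
sample = json.loads("""
[
  {"image_id": 1, "objects": [{"names": ["Tree"]}, {"names": ["car", "tree "]}]},
  {"image_id": 2, "objects": []}
]
""")
image_labels = labels_per_image(sample)
print(image_labels)
```

For the real file, the same function could be applied to the result of `json.load` on objects.json in the metadata folder.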
Inference scripts for the commercial and open-source APIs evaluated in the paper.
The inference script for the following commercial APIs:
- Imagga
- IBM Watson
- Clarifai
- Microsoft Computer Vision
- Google Cloud Vision
- Wolfram Alpha
- Create an inference Python conda environment using the env_inference.yml file.
- Subscribe to each commercial API (following the explanation links in the docstring of each API's function) and set the appropriate API subscription keys in the AUTH variable in config.py.
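Before running the script, it can help to confirm that every subscription key has been filled in. The sketch below assumes AUTH is a dictionary keyed by API name; the actual layout in config.py may differ, and the key names and the `missing_keys` helper are hypothetical.

```python
# Hypothetical layout of the AUTH variable; check config.py for the real one.
AUTH = {
    "imagga": {"api_key": "", "api_secret": ""},
    "clarifai": {"api_key": "my-clarifai-key"},
}


def missing_keys(auth):
    """Return 'api.field' names whose value is empty or only whitespace."""
    missing = []
    for api, fields in auth.items():
        for field, value in fields.items():
            if not str(value).strip():
                missing.append(f"{api}.{field}")
    return sorted(missing)


print(missing_keys(AUTH))
```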
$ python labels_inference_commercial.py
Please note that the commercial APIs often change their interface...
The inference results are saved in the \datasets\[DATASET]\data folder.
The inference script for the following open-source APIs:
- InceptionResNet v2 (trained on ImageNet)
- Mobilenet v2 (trained on ImageNet)
- VGG19 (trained on ImageNet)
- Inception v3 (trained on ImageNet)
- ResNet50 (trained on ImageNet)
- ResNet50 (trained on COCO)
- YOLO V3 (trained on ImageNet)
- YOLO V3 (trained on COCO)
- Deepdetect (trained on ImageNet)
- Create an inference Python conda environment using the env_inference.yml file (the same environment as for the commercial APIs).
- The model files for each ImageNet-trained API (except the YOLO ImageNet API) are downloaded automatically the first time it is used for inference.
- For the COCO-trained APIs, download the ResNet50 or YOLO model files to the sources folder.
- The ImageNet-trained YOLO API requires Linux (the other APIs also run on Windows); follow the installation procedure.
$ python labels_inference_open_source.py
The inference results are saved in the \datasets\[DATASET]\data folder.
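The ImageNet-trained models output one score per class, so some rule is needed to turn those scores into a label set. The sketch below shows one common option, keeping the top-k classes whose score clears a threshold. The values of `k` and `threshold` and the helper name `scores_to_labels` are assumptions for illustration, not the settings used in the paper.

```python
def scores_to_labels(scores, k=5, threshold=0.1):
    """Keep at most k labels whose score is at least threshold, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [label for label, score in ranked[:k] if score >= threshold]


# Made-up class scores for a single image
example_scores = {"tabby": 0.62, "tiger_cat": 0.21, "sofa": 0.09, "lamp": 0.03}
predicted = scores_to_labels(example_scores, k=3, threshold=0.1)
print(predicted)
```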
Example-based, label-based, and the proposed semantic evaluation metrics.
- Example-based metrics: accuracy, recall, precision, and F1
- Semantic metrics: semantic accuracy, semantic recall, semantic precision, semantic F1, Word Mover's Distance (WMD), and BOW embedding similarity for fine-tuned BERT and fine-tuned RoBERTa
- Create a metrics calculation Python conda environment using the env_metrics_calc.yml file.
- Download the word2vec pre-trained bin file to the sources folder.
- The model files for BERT and RoBERTa are downloaded automatically on first use.
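The BOW embedding similarity among the semantic metrics can be read as comparing the averaged embedding of the predicted labels with that of the ground-truth labels. The following is a minimal sketch with made-up 3-dimensional vectors in place of a real embedding model; the average-then-cosine formulation and all function names are assumptions about how such a score is typically computed, not a copy of the repository's code.

```python
import math


def mean_vector(labels, embeddings):
    """Average the vectors of the labels that have an embedding."""
    vectors = [embeddings[label] for label in labels if label in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bow_similarity(true_labels, pred_labels, embeddings):
    """Cosine similarity between the mean embeddings of two label sets."""
    u = mean_vector(true_labels, embeddings)
    v = mean_vector(pred_labels, embeddings)
    if u is None or v is None:
        return 0.0  # assumed convention when no label has an embedding
    return cosine(u, v)


# Toy embeddings, invented for the example
toy = {"cat": [1.0, 0.0, 0.0], "kitten": [0.9, 0.1, 0.0], "car": [0.0, 0.0, 1.0]}
close = bow_similarity(["cat"], ["kitten"], toy)
far = bow_similarity(["cat"], ["car"], toy)
print(close > far)
```

The point of such a score is that a prediction of "kitten" for a ground truth of "cat" is rewarded, whereas exact-match metrics would count it as a miss.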
$ python example_based_metrics_semantics.py
The metrics results are saved in the \datasets\[DATASET]\results folder.
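For reference, the example-based metrics can be sketched with the standard set-based definitions, computed per image and then averaged over images. This is an illustrative version under that assumption; the function name is invented, and the repository's script may treat edge cases such as empty label sets differently.

```python
def example_based_metrics(true_sets, pred_sets):
    """Average per-example accuracy (Jaccard), precision, recall, and F1."""
    if not true_sets or len(true_sets) != len(pred_sets):
        raise ValueError("need two non-empty sequences of equal length")
    totals = {"accuracy": 0.0, "precision": 0.0, "recall": 0.0, "f1": 0.0}
    for truth, pred in zip(true_sets, pred_sets):
        truth, pred = set(truth), set(pred)
        overlap = len(truth & pred)
        union = len(truth | pred)
        size_sum = len(truth) + len(pred)
        # Empty denominators contribute 0.0 (an assumed convention)
        totals["accuracy"] += overlap / union if union else 0.0
        totals["precision"] += overlap / len(pred) if pred else 0.0
        totals["recall"] += overlap / len(truth) if truth else 0.0
        totals["f1"] += 2 * overlap / size_sum if size_sum else 0.0
    n = len(true_sets)
    return {name: total / n for name, total in totals.items()}


# Two made-up images with ground-truth and predicted label sets
truth = [{"cat", "sofa"}, {"dog"}]
preds = [{"cat", "lamp"}, {"dog", "ball"}]
metrics = example_based_metrics(truth, preds)
print(metrics)
```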
Label-based metrics: micro and macro averaging of precision, recall, and F1
- Install PHP 7 and MySQL; we recommend the WAMP package.
- Create the sem_open_images and sem_visual_genome MySQL databases and import the DB tables using the .sql file in the \datasets\[DATASET]\data folder.
$ php label_based_metrics.php "OPEN_IMAGE" (or "VISUAL_GENOME")
The metrics results are saved in the \datasets\[DATASET]\results folder.
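For readers unfamiliar with the two averaging schemes, the Python sketch below contrasts them using per-label counts of true positives, false positives, and false negatives. It is illustrative only: the repository computes these metrics in PHP against the MySQL tables, and the counts and function names here are invented.

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(counts):
    """Compute the metrics per label, then average them with equal label weight."""
    per_label = [prf(*c) for c in counts.values()]
    n = len(per_label)
    return tuple(sum(vals[i] for vals in per_label) / n for i in range(3))


def micro_average(counts):
    """Pool the counts over all labels, then compute the metrics once."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return prf(tp, fp, fn)


# (tp, fp, fn) per label: one frequent label and one rare label
counts = {"person": (90, 10, 10), "zebra": (1, 0, 9)}
macro = macro_average(counts)
micro = micro_average(counts)
print(macro, micro)
```

Macro averaging lets the rare label pull the scores down as much as the frequent one, while micro averaging is dominated by the frequent label.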