Improving Loss Function for Deep CNN-based AIA

This is the official PyTorch implementation of the paper "Improving loss function for deep convolutional neural network applied in automatic image annotation".

Abstract

Automatic image annotation (AIA) is a mechanism for describing the visual content of an image with a list of semantic labels. Typically, there is a massive imbalance between positive and negative tags in a picture: an image includes far fewer positive labels than negative ones. This imbalance can negatively affect the optimization process and diminish the emphasis on gradients from positive labels during training. Whereas traditional annotation models mainly focus on model structure design, we propose a novel asymmetric loss function for a deep convolutional neural network (CNN) that treats positives and negatives differently, reducing the loss contribution from negative labels and highlighting the contribution of positive ones. During the annotation process, we specify a threshold for each label separately based on the Matthews correlation coefficient (MCC). Extensive experiments on large-vocabulary datasets such as Corel 5k, IAPR TC-12, and ESP Game reveal that, despite ignoring the semantic relationships between labels, our suggested approach achieves remarkable results compared to state-of-the-art automatic image annotation models.
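To make the idea of treating positives and negatives differently concrete, here is a minimal sketch of an asymmetric binary cross-entropy, written in plain Python so the arithmetic stays visible. It is not the loss from the paper: the function name `asymmetric_bce` and the focusing exponents `gamma_pos` and `gamma_neg` are assumptions chosen for illustration, in the spirit of focal-style losses. The loss actually used by this project presumably lives in `loss_functions.py`.

```python
import math

def asymmetric_bce(probs, targets, gamma_pos=0.0, gamma_neg=4.0, eps=1e-8):
    """Sum of per-label losses with separate focusing for positives and negatives.

    probs   -- predicted probabilities in [0, 1], one per label
    targets -- ground-truth tags, 1 (present) or 0 (absent)
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive label: the factor (1 - p)^gamma_pos equals 1 when
            # gamma_pos = 0, so the positive term is left unscaled.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative label: p^gamma_neg shrinks the loss of easy
            # negatives (small p), which dominate in number.
            total += -(p ** gamma_neg) * math.log(max(1.0 - p, eps))
    return total

# One positive tag among four labels, mimicking the imbalance described above.
probs = [0.6, 0.1, 0.2, 0.1]
targets = [1, 0, 0, 0]

plain = asymmetric_bce(probs, targets, gamma_pos=0.0, gamma_neg=0.0)  # ordinary BCE
asym = asymmetric_bce(probs, targets)  # negatives down-weighted
print(f"BCE: {plain:.4f}  asymmetric: {asym:.4f}")
```

With both exponents set to zero the function reduces to ordinary binary cross-entropy, which makes it easy to compare how much of the total loss comes from negative labels in each setting.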

Datasets

Three well-known datasets are commonly used in AIA tasks. The table below provides details about them. They can also be downloaded from the given links. (After downloading a dataset, copy its 'images' folder over the corresponding 'images' folder in the 'datasets' folder.)

Dataset	Images	Training images	Test images	Vocabulary size	Labels per image	Images per label
Corel 5k	5,000	4,500	500	260	3.4	58.6
IAPR TC-12	19,627	17,665	1,962	291	5.7	347.7
ESP Game	20,770	18,689	2,081	268	4.7	362.7
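
The last two columns of the table are averages that can be derived from a binary image-by-label annotation matrix. The sketch below shows the arithmetic on a toy matrix; the function name `label_statistics` and the data are illustrative and not taken from the repository.

```python
def label_statistics(annotations):
    """Return (mean labels per image, mean images per label).

    annotations -- list of rows, one per image; each row is a list of 0/1
                   flags, one per vocabulary word.
    """
    num_images = len(annotations)
    vocab_size = len(annotations[0])
    total_tags = sum(sum(row) for row in annotations)
    return total_tags / num_images, total_tags / vocab_size

# Toy example: 4 images, a vocabulary of 3 labels, 6 tags in total.
toy = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
labels_per_image, images_per_label = label_statistics(toy)
print(labels_per_image, images_per_label)  # 1.5 2.0
```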

Convolutional model

TResNet-M

Train and Evaluation

To train the model in the Spyder IDE, run the following command in its IPython console:

run main.py --data {select training dataset} --loss-function {select loss function}

Please note that:

Replace {select training dataset} with Corel-5k, ESP-Game, or IAPR-TC-12.
Replace {select loss function} with proposedLoss.

To evaluate the model in the Spyder IDE, run the same command with the --evaluate flag added:

run main.py --data {select training dataset} --loss-function {select loss function} --evaluate
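
For readers curious how such flags are typically wired up, the following is a hypothetical sketch of an `argparse` front end that accepts the options shown above. The flag names and allowed values come from this README; the function name `build_parser`, the default loss, and the help texts are assumptions and may differ from the actual `main.py`.

```python
import argparse

def build_parser():
    """Hypothetical command-line interface mirroring the flags in this README."""
    parser = argparse.ArgumentParser(description="Train or evaluate an AIA model")
    parser.add_argument("--data", required=True,
                        choices=["Corel-5k", "ESP-Game", "IAPR-TC-12"],
                        help="dataset to train or evaluate on")
    parser.add_argument("--loss-function", dest="loss_function",
                        default="proposedLoss",  # assumed default
                        help="name of the loss function")
    parser.add_argument("--evaluate", action="store_true",
                        help="evaluate instead of training")
    return parser

# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--data", "Corel-5k", "--loss-function", "proposedLoss", "--evaluate"]
)
print(args.data, args.loss_function, args.evaluate)
```

Restricting `--data` with `choices` makes a misspelled dataset name fail immediately with a usage message instead of partway through data loading.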

Results

Proposed method:

Dataset	Precision	Recall	F1-score	N+
Corel 5k	0.466	0.554	0.506	189
IAPR TC-12	0.503	0.562	0.531	285
ESP Game	0.423	0.484	0.452	261
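
The columns above are consistent with the usual AIA evaluation protocol: precision and recall averaged over labels, F1 as the harmonic mean of those two averages, and N+ as the number of labels with non-zero recall. The sketch below computes these quantities on a toy example. It assumes that protocol and a particular convention for labels that are never predicted; the function name `annotation_metrics` is illustrative, and the repository's `evaluation_metrics.py` may differ in detail.

```python
def annotation_metrics(y_true, y_pred):
    """Per-label averaged precision/recall, their harmonic mean, and N+.

    y_true, y_pred -- lists of rows (one per image) of 0/1 flags per label.
    """
    num_labels = len(y_true[0])
    precisions, recalls = [], []
    for j in range(num_labels):
        tp = sum(t[j] * p[j] for t, p in zip(y_true, y_pred))
        predicted = sum(p[j] for p in y_pred)
        actual = sum(t[j] for t in y_true)
        # Convention assumed here: a label with no predictions (or no
        # ground truth) contributes 0 to the corresponding average.
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    precision = sum(precisions) / num_labels
    recall = sum(recalls) / num_labels
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    n_plus = sum(r > 0 for r in recalls)  # labels recalled at least once
    return precision, recall, f1, n_plus

# Toy example: 3 images, 3 labels; the third label is never predicted.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0]]
precision, recall, f1, n_plus = annotation_metrics(y_true, y_pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f} N+={n_plus}")
```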

Proposed method + MCC:

Dataset	Precision	Recall	F1-score	N+
Corel 5k	0.484	0.563	0.520	191
IAPR TC-12	0.562	0.515	0.537	277
ESP Game	0.508	0.421	0.461	255
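
The abstract states that a separate threshold is chosen for each label based on the MCC. One way to picture this is a grid search that, for a single label, keeps the threshold with the highest MCC. The sketch below makes that assumption explicit: the candidate grid, the function names `mcc` and `best_threshold`, and the toy scores are all illustrative and not taken from the paper or the repository.

```python
import math

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def best_threshold(scores, targets, candidates):
    """Pick the candidate threshold that maximizes MCC for a single label."""
    best_t, best_value = candidates[0], -1.0
    for t in candidates:
        tp = sum(s >= t and y == 1 for s, y in zip(scores, targets))
        fp = sum(s >= t and y == 0 for s, y in zip(scores, targets))
        fn = sum(s < t and y == 1 for s, y in zip(scores, targets))
        tn = sum(s < t and y == 0 for s, y in zip(scores, targets))
        value = mcc(tp, fp, tn, fn)
        if value > best_value:  # strict: ties keep the lowest threshold
            best_t, best_value = t, value
    return best_t, best_value

# Predicted scores of one label over six images, with its ground truth.
scores = [0.9, 0.45, 0.35, 0.2, 0.1, 0.05]
targets = [1, 1, 0, 0, 0, 0]
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

threshold, value = best_threshold(scores, targets, candidates)
print(threshold, value)  # 0.4 1.0
```

Repeating the search for every label yields one threshold per label, which would replace a single global cut-off when converting scores to tags.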

Citation

Please consider citing our paper in your publications if the project helps your research. The BibTeX reference is as follows:

@article{salar2023improving,
  title={Improving loss function for deep convolutional neural network applied in automatic image annotation},
  author={Salar, Ali and Ahmadi, Ali},
  journal={The Visual Computer},
  pages={1--13},
  year={2023},
  publisher={Springer}
}

Contact

I would be happy to answer any questions you may have - Ali Salar (parham1998resume@gmail.com)
