
Danbooru 2020 Zero-shot Anime Character Identification Dataset (ZACI-20)

The goal of this dataset is to enable human-level character identification models that do not require retraining on novel characters. The dataset is derived from the Danbooru2020 dataset [Anonymous+2021].

Features

  • Large-scale character face image dataset.
    • 1.45M face images of 39K characters (train dataset).
  • Designed for zero-shot setting.
    • Characters in the test dataset do not appear in the train dataset, allowing us to test model performance on novel characters.
  • Human annotated test dataset.
    • Image pairs with erroneous face detections or duplicate images are manually removed.
    • We can compare model performance to human performance.

Benchmarks

Random negative pairs

  • Negative image pairs with different character labels are randomly sampled in this test set.
    • Limitation:
      • Since negative pairs are sampled completely at random, most of them are easy negatives, i.e., pairs of images of clearly different characters.
      • Thus, model performance tends to be overestimated on this test set.
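The random sampling described above can be sketched as follows. This is a minimal illustration, not the code used to build the test set; the function name `sample_random_negative_pairs` and its arguments are hypothetical.

```python
import random
from collections import Counter


def sample_random_negative_pairs(labels, n_pairs, seed=0):
    """Sample distinct index pairs (i, j), i < j, whose character labels differ.

    `labels[i]` is the character label of image i (hypothetical input format).
    """
    n = len(labels)
    # Number of available negative pairs = all pairs minus same-label pairs.
    same = sum(c * (c - 1) // 2 for c in Counter(labels).values())
    available = n * (n - 1) // 2 - same
    if n_pairs > available:
        raise ValueError(f"only {available} negative pairs exist")

    rng = random.Random(seed)
    pairs = set()
    while len(pairs) < n_pairs:
        i, j = rng.sample(range(n), 2)
        if labels[i] != labels[j]:
            pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)
```

With many characters, two random images almost always show visibly different characters, which is why pairs drawn this way are mostly easy negatives.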

Figure 1. Examples of random negative pairs (each column).



Table 1. Performance of benchmark models (random negative pairs).

| model name | FPR (%) | FNR (%) | EER (%) | note |
| --- | --- | --- | --- | --- |
| Human | 1.59 | 13.9 | N/A | by kosuke1701 |
| ResNet-152 | 2.40 | 13.9 | 8.89 | w/ RandAug, Contrastive loss. 0206_resnet152 by kosuke1701 |
| SE-ResNet-152 | 2.43 | 13.9 | 8.15 | w/ RandAug, Contrastive loss. 0206_seresnet152 by kosuke1701 |
| ResNet-152 | 2.54 | 13.9 | 8.33 | w/ RandAug, Contrastive + Classification (Cross Entropy) loss. 0301_cls_resnet152 by kosuke1701 |
| ResNet-18 | 2.96 | 13.9 | 8.65 | w/ RandAug, Contrastive + Classification (Cross Entropy) loss. 0217_cls_resnet18 by kosuke1701 |
| ResNet-18 | 5.08 | 13.9 | 9.59 | w/ RandAug, Contrastive loss. 0206_resnet18 by kosuke1701 |
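
For readers unfamiliar with the metrics in Table 1, the sketch below shows one way to compute FPR, FNR and an approximate EER from pair scores. It is not the repository's `evaluate.py`; the function names and the convention that a higher score means "same character" are assumptions made for illustration.

```python
def fpr_fnr(scores, is_same, threshold):
    """Return (FPR, FNR) when pairs with score >= threshold are predicted 'same'."""
    fp = sum(1 for s, y in zip(scores, is_same) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, is_same) if s < threshold and y)
    n_neg = sum(1 for y in is_same if not y)
    n_pos = sum(1 for y in is_same if y)
    return fp / n_neg, fn / n_pos


def equal_error_rate(scores, is_same):
    """Approximate the EER by scanning every observed score as a threshold
    and taking the point where FPR and FNR are closest to each other."""
    best_gap, best_eer = None, None
    for t in sorted(set(scores)):
        fpr, fnr = fpr_fnr(scores, is_same, t)
        gap = abs(fpr - fnr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (fpr + fnr) / 2
    return best_eer
```

The FNR column is identical across rows, which suggests that FPR is reported at a shared FNR operating point; this reading is an inference from the table rather than something stated in this README.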

Adversarially sampled negative pairs

  • The negative image pairs that are most confusing to a trained model are kept as the test set.
    • 0206_resnet152 is used as the trained model.
    • Negative image pairs are sorted by their predicted scores, and the pairs with the largest scores are selected.
  • Current benchmarks show much lower performance than humans on these confusing negative pairs.
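
The selection step above amounts to a top-k selection over model scores. A minimal sketch, assuming each negative pair already has a similarity score from the trained model (the function name and input format are hypothetical):

```python
def select_hardest_negatives(pairs, scores, k):
    """Return the k negative pairs with the largest predicted scores.

    `pairs[i]` is a negative image pair and `scores[i]` is the model's
    similarity score for it (higher = more likely the same character).
    """
    ranked = sorted(zip(scores, pairs), key=lambda sp: sp[0], reverse=True)
    return [pair for _, pair in ranked[:k]]
```

Because the selected pairs are exactly those the scoring model gets most wrong, that model's own results on this set are not comparable with those of other models.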

Figure 2. Examples of adversarial negative pairs (each column). Those pairs highlight common errors by benchmark models.



Table 2. Performance of benchmark models (adversarial negative pairs).

| model name | FPR (%) | FNR (%) | EER (%) | note |
| --- | --- | --- | --- | --- |
| Human | 13.6 | 16.9 | N/A | by kosuke1701 |
| SE-ResNet-152 | 68.9 | 16.9 | 39.7 | w/ RandAug, Contrastive loss. 0206_seresnet152 by kosuke1701 |
| ViT L-16 | 70.9 | 16.9 | 39.9 | Based on arkel23/animesion. Trained and evaluated w/ modified codes kosuke1701/animesion. Pretrained model |
| ResNet-152 | 70.9 | 16.9 | 32.6 | w/ RandAug, Contrastive + Classification (Cross Entropy) loss. 0301_cls_resnet152 by kosuke1701 |
| ResNet-18 | 75.7 | 16.9 | 34.7 | w/ RandAug, Contrastive + Classification (Cross Entropy) loss. 0217_cls_resnet18 by kosuke1701 |
| ResNet-18 | 94.9 | 16.9 | 43.0 | w/ RandAug, Contrastive loss. 0206_resnet18 by kosuke1701 |

  • The performance of 0206_resnet152 is not shown here because it is not possible to fairly compare the performance of the adversarially attacked model with those of other non-attacked models.

Participation

  • Your participation is welcome!! Please create an issue if you want to add your model to this list.
  • Please do not use the test dataset to tune hyperparameters!!
  • You can use external resources. However, to ensure a fair comparison, please do not use any data labeled with characters that appear in the test dataset.
    • Note that / in original character labels of Danbooru 2020 is replaced by __.
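
When filtering external data against the test characters, the label convention above has to be applied first. A small sketch under the assumption that external samples come as `(path, danbooru_tag)` tuples; the helper names are hypothetical:

```python
def normalize_label(danbooru_tag):
    """Apply the ZACI-20 convention: '/' in a character tag becomes '__'."""
    return danbooru_tag.replace("/", "__")


def drop_test_characters(samples, test_labels):
    """Keep only samples whose (normalized) character is not a test character.

    `samples` is an iterable of (path, danbooru_tag) tuples and `test_labels`
    holds the character labels of the test dataset (assumed input formats).
    """
    test_labels = set(test_labels)
    return [
        (path, tag)
        for path, tag in samples
        if normalize_label(tag) not in test_labels
    ]
```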

Getting Started

Download preprocessed dataset

Thanks to gwern, the dataset is now available for download via rsync. Use the following commands to download and untar the dataset.

rsync --verbose rsync://78.46.86.149:873/biggan/20210206-kosukeakimoto-zaci2020-danbooru2020zeroshotfaces.tar ./
tar -xvf 20210206-kosukeakimoto-zaci2020-danbooru2020zeroshotfaces.tar

Preprocess images

Alternatively, you can create the ZACI-20 dataset from the original Danbooru 2020 dataset as follows.

  • Download the SFW 512 px subset of the Danbooru 2020 dataset.
  • Install dependencies.
    • pip install tqdm pillow
  • Crop images using the preprocessing script.
    • # Danbooru 2020 SFW 512 px directory
      export DANBOORU_DIR=/path/to/danbooru2020/512px
      # train
      python process_danbooru.py --danbooru-dir ${DANBOORU_DIR} --dataset-fn dataset/zaci20_train.json --save-dir zaci20_train
      # test
      python process_danbooru.py --danbooru-dir ${DANBOORU_DIR} --dataset-fn dataset/zaci20_test.json --save-dir zaci20_test
    • Images will be stored in different directories for each character.
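
Since each character gets its own directory, the character label of an image can be read from its parent directory name. The following sketch builds a label-to-files index for such a layout; the function name is hypothetical and the exact file naming inside each directory is not assumed.

```python
from pathlib import Path


def index_by_character(root):
    """Map each character directory name under `root` to its sorted image paths."""
    index = {}
    for char_dir in sorted(Path(root).iterdir()):
        if char_dir.is_dir():
            index[char_dir.name] = sorted(
                p for p in char_dir.iterdir() if p.is_file()
            )
    return index
```

For example, `index_by_character("zaci20_test")` would return one entry per test character once the preprocessing commands above have been run.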

Evaluate your model

  • Use the evaluation script.
    • python evaluate.py --test-pairs dataset/zaci20_test_pairs.csv --test-dataset-dir zaci20_test
    • If you want to evaluate my benchmarks, download and unzip the compressed model files. AnimeCV must be installed to run my benchmarks.

Todo

  • Create a more difficult test dataset by adversarially sampling negative image pairs.
  • Evaluate and add other existing methods to the list of benchmarks.

Notes

Preprocessing

  • Face annotations produced by a pre-trained EfficientDet model [Tan+2020] are used to crop images. The annotations are publicly available at AnimeCV.
  • All cropped face images are resized to 224 x 224 pixels.
  • / in original character labels of Danbooru 2020 is replaced by __.
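
The crop-and-resize step can be illustrated with Pillow as below. This is a sketch only, not `process_danbooru.py`: the `(left, top, right, bottom)` pixel box format is an assumption, since the annotation format is not described here, and the resampling filter is left at Pillow's default.

```python
from PIL import Image

FACE_SIZE = (224, 224)  # output size stated in the notes above


def crop_face(image, box):
    """Crop `box` from `image` and resize the crop to 224 x 224.

    `box` is assumed to be (left, top, right, bottom) in pixels; it is
    clamped to the image bounds before cropping.
    """
    left, top, right, bottom = box
    left, top = max(0, left), max(0, top)
    right, bottom = min(image.width, right), min(image.height, bottom)
    if right <= left or bottom <= top:
        raise ValueError("face box lies outside the image")
    return image.crop((left, top, right, bottom)).resize(FACE_SIZE)
```

Note that resizing a non-square crop directly to 224 x 224 changes its aspect ratio; whether the original preprocessing pads or stretches is not stated here.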

Data selection

  • The methodology of [Wang,2019] is used to select images for this dataset.
    • More specifically, I kept only the images with exactly one character tag and exactly one face annotation.
  • Character tags with only one image are removed.
  • I split the set of characters into train/test set.
    • Characters in the test set are randomly selected from the set of characters with exactly two face images.
    • In this way, the number of images in the train set is maximized while keeping the diversity of characters in the test set.
  • Since evaluating and annotating all negative pairs in the test set is a time-consuming task, only a subset of negative pairs is included in the test set.
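
The selection and split rules above can be summarized in a short sketch. It is an illustration of the described rules rather than the code actually used; the record format `(image_id, character_tags, n_faces)` and the function name are hypothetical.

```python
import random
from collections import defaultdict


def select_and_split(records, n_test_chars, seed=0):
    """Apply the selection rules and split characters into train/test.

    `records` is an iterable of (image_id, character_tags, n_faces).
    """
    by_char = defaultdict(list)
    for image_id, tags, n_faces in records:
        # Keep images with exactly one character tag and one detected face.
        if len(tags) == 1 and n_faces == 1:
            by_char[tags[0]].append(image_id)

    # Remove characters that are left with only one image.
    by_char = {c: ims for c, ims in by_char.items() if len(ims) >= 2}

    # Test characters are drawn from characters with exactly two images.
    candidates = sorted(c for c, ims in by_char.items() if len(ims) == 2)
    test_chars = set(random.Random(seed).sample(candidates, n_test_chars))

    train = {c: ims for c, ims in by_char.items() if c not in test_chars}
    test = {c: by_char[c] for c in test_chars}
    return train, test
```

Drawing test characters only from two-image characters is consistent with the statistics table, where the test set has twice as many images as characters.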

Benchmarks

  • Benchmark models (0206_resnet152, 0206_seresnet152, 0206_resnet18, 0217_cls_resnet18, 0301_cls_resnet152) are trained with this code.
    • python -u -m optuna_metric_learning.train --conf <CONFIG_FN> --model-def-fn examples/image_folder_example.py --max-epoch 60 --patience 3 --n-fold 100 --no-trial
      • Corresponding <CONFIG_FN>:
        • examples/tuned_configs/resnet18.json (for 0206_resnet18)
        • examples/tuned_configs/resnet152.json (for 0206_resnet152)
        • examples/tuned_configs/seresnet152.json (for 0206_seresnet152)
    • python -u -m optuna_metric_learning.train --conf <CONFIG_FN> --model-def-fn examples/image_folder_examples_classifier.py --no-trial --trainer TrainWithClassifier --max-epoch 300 --patience 3 --n-fold 10
      • Corresponding <CONFIG_FN>:
        • examples/tuned_configs/resnet18_cls.json (for 0217_cls_resnet18)
        • examples/tuned_configs/resnet152_cls.json (for 0301_cls_resnet152)
    • The results of the hyperparameter tuning I conducted on my private dataset are listed in this Google spreadsheet.

Statistics

| | size (MB) | #images | #characters |
| --- | --- | --- | --- |
| train | 16,644 | 1,451,527 | 39,038 |
| test | 26 | 2,006 | 1,003 |

Citation

If you find this dataset or my benchmark models useful, please consider citing this repository and the Danbooru 2020 dataset.

@misc{danbooru2020,
    author = {Anonymous and Danbooru community and Gwern Branwen},
    title = {Danbooru2020: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset},
    howpublished = {\url{https://www.gwern.net/Danbooru2020}},
    url = {https://www.gwern.net/Danbooru2020},
    type = {dataset},
    year = {2021},
    month = {January},
    timestamp = {2020-01-12},
    note = {Accessed: 2021-02-06} }
@misc{zaci20,
        author = {Kosuke Akimoto},
        title = {Danbooru 2020 Zero-shot Anime Character Identification Dataset (ZACI-20)},
        howpublished = {\url{https://github.com/kosuke1701/ZACI-20-dataset}},
        url = {https://github.com/kosuke1701/ZACI-20-dataset},
        type = {dataset,model},
        year = {2021},
        month = {February} }

References

[Wang,2019] Yan Wang. "Danbooru2018 Anime Character Recognition Dataset," 2019, github.com/grapeot/Danbooru2018AnimeCharacterRecognitionDataset (accessed: 2021-02-11).

[Tan+2020] Tan, Mingxing, Ruoming Pang, and Quoc V. Le. "EfficientDet: Scalable and efficient object detection." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020.

[Anonymous+2021] Anonymous, The Danbooru Community, and Gwern Branwen. "Danbooru2020: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset," 2021, www.gwern.net/Danbooru2020 (accessed: 2021-02-11).
