Manga109Dialog: A Large-scale Dialogue Dataset for Comics Speaker Detection

Official repository of Manga109Dialog (ICME 2024) | Paper | Dataset

Prerequisites

Manga109 dataset
- Download from http://www.manga109.org/en/download.html
Manga109Dialog annotation
- Download from https://github.com/manga109/public-annotations

Environment setup

Check INSTALL.md for installation instructions.

Data preprocessing

Convert the Manga109 annotations into a format suitable for scene graph generation (SGG) models. For more details, check the README.md for the preprocessing step.
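
As a rough illustration of the kind of conversion described above, the sketch below turns one annotated page into an SGG-style record of objects and relations. It is not the repository's preprocessing code: the inline XML is a simplified, hypothetical page modeled on Manga109-style annotations, and the `SPEAKER_OF` mapping, the `LABELS` table, the relation id `0`, and the output keys are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page in the style of a Manga109 annotation file.
PAGE_XML = """
<page index="3" width="1654" height="1170">
  <body id="b1" character="c1" xmin="100" ymin="200" xmax="400" ymax="900"/>
  <body id="b2" character="c2" xmin="900" ymin="220" xmax="1200" ymax="880"/>
  <text id="t1" xmin="420" ymin="150" xmax="520" ymax="400">Hello</text>
  <text id="t2" xmin="760" ymin="160" xmax="860" ymax="380">Hi</text>
</page>
"""

# Hypothetical speaker links: text id -> id of the speaking character's body box.
SPEAKER_OF = {"t1": "b1", "t2": "b2"}

# Assumed object classes for the SGG record.
LABELS = {"body": 0, "text": 1}


def box(elem):
    """Read an [xmin, ymin, xmax, ymax] box from an annotation element."""
    return [int(elem.get(key)) for key in ("xmin", "ymin", "xmax", "ymax")]


def page_to_sgg(page_xml, speaker_of):
    """Build one SGG-style record: a list of objects plus (subject, object, predicate) triples."""
    page = ET.fromstring(page_xml)
    objects, index_of = [], {}
    for elem in page:
        if elem.tag not in LABELS:
            continue  # skip element types this sketch does not model
        index_of[elem.get("id")] = len(objects)
        objects.append({"label": LABELS[elem.tag], "bbox": box(elem)})
    # One relation per text: [speaker index, text index, predicate id 0].
    relations = [
        [index_of[body_id], index_of[text_id], 0]
        for text_id, body_id in speaker_of.items()
        if text_id in index_of and body_id in index_of
    ]
    return {
        "width": int(page.get("width")),
        "height": int(page.get("height")),
        "objects": objects,
        "relations": relations,
    }


record = page_to_sgg(PAGE_XML, SPEAKER_OF)
print(record["relations"])
```

The record format expected by the actual SGG models may differ; the preprocessing documentation is the authority on field names and file layout.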

Speaker prediction

This is the core part of our model. For details on how to detect characters and texts in comics and predict the speaker based on visual information, check the README.md for the speaker prediction step.

Evaluation

In addition to conventional metrics for evaluating SGG models, we have introduced a new metric tailored for comics: Recall@(#text).

# PredCls / SGCls
python eval_and_vis/eval_original.py

# SGDet
python eval_and_vis/eval_original_sgdet.py

You can find details on conventional evaluation metrics in METRICS.md.
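
To make the idea behind Recall@(#text) concrete, here is a minimal sketch. It assumes the metric keeps the top-K scored speaker-to-text predictions on a page, with K set to the number of text regions on that page, and then measures the fraction of ground-truth pairs recovered; the paper has the exact definition. The function name, the tuple layout, and the toy ids are hypothetical and do not come from the evaluation scripts.

```python
def recall_at_num_text(pred_pairs, gt_pairs, num_texts):
    """Recall over the top-`num_texts` predictions for one page.

    pred_pairs: list of (score, text_id, speaker_id) tuples (assumed layout).
    gt_pairs:   set of (text_id, speaker_id) ground-truth pairs.
    num_texts:  number of text regions on the page, used as K.
    """
    if not gt_pairs:
        return 0.0
    # Keep only the K highest-scoring predictions.
    top_k = sorted(pred_pairs, key=lambda p: p[0], reverse=True)[:num_texts]
    hits = {(text_id, speaker_id) for _, text_id, speaker_id in top_k} & set(gt_pairs)
    return len(hits) / len(gt_pairs)


# Toy page with two texts and two candidate speakers.
preds = [
    (0.9, "t1", "c1"),
    (0.8, "t2", "c1"),
    (0.7, "t2", "c2"),
    (0.2, "t1", "c2"),
]
gt = {("t1", "c1"), ("t2", "c2")}

score = recall_at_num_text(preds, gt, num_texts=2)
print(score)
```

Tying K to the number of texts adapts the cutoff to each page, whereas a fixed K as in conventional Recall@K would be loose on sparse pages and strict on dense ones.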

Visualization

The visualization tools for predictions can be found in eval_and_vis/.

1. visualize_PredCls_and_SGCls.ipynb
2. visualize_SGDet.ipynb
3. visualize_custom_SGDet.ipynb

Citation

If you use the Manga109Dialog annotations, please cite our paper.

@inproceedings{li2024manga109dialog,
  title={Manga109Dialog: A Large-scale Dialogue Dataset for Comics Speaker Detection},
  author={Li, Yingxuan and Aizawa, Kiyoharu and Matsui, Yusuke},
  booktitle={2024 IEEE International Conference on Multimedia and Expo (ICME)},
  year={2024},
  organization={IEEE}
}
