Segmentation-free speech text recognition for comic books

Abstract

Speech text in comic books is written in a particular manner by the scriptwriter, which raises unusual challenges for text recognition. We first detail these challenges and present different approaches to solve them. We compare the performance of a pre-trained OCR and a segmentation-free approach on speech text from comic books written in Latin script. We demonstrate that a few good-quality pre-trained OCR output samples, combined with other unlabeled data in the same writing style, can feed a segmentation-free OCR and improve text recognition. This is made possible by a lexicality measure that automatically accepts or rejects the pre-trained OCR output as pseudo ground truth for subsequent segmentation-free OCR training and recognition.
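The accept/reject step mentioned in the abstract can be illustrated with a minimal sketch. It is not the implementation from the paper: it assumes that lexicality is the fraction of alphabetic tokens found in a word list, and the names `lexicality`, `select_pseudo_ground_truth`, `vocabulary` and the `threshold` value of 0.8 are hypothetical choices made for this example only.

```python
import re


def lexicality(text, vocabulary):
    """Return the fraction of alphabetic tokens of `text` found in `vocabulary`.

    Assumption: this simple ratio stands in for the lexicality measure;
    the exact definition used in the paper may differ.
    """
    # Lowercase the text and keep runs of letters only (digits split tokens).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in vocabulary)
    return known / len(tokens)


def select_pseudo_ground_truth(ocr_outputs, vocabulary, threshold=0.8):
    """Keep the (image_id, text) pairs whose lexicality reaches `threshold`.

    The accepted pairs would then serve as pseudo ground truth for training
    a segmentation-free OCR; rejected pairs are simply dropped.
    The threshold of 0.8 is a hypothetical default, not a value from the paper.
    """
    return [
        (image_id, text)
        for image_id, text in ocr_outputs
        if lexicality(text, vocabulary) >= threshold
    ]
```

In this sketch, `vocabulary` is expected to be a set of lowercase words, and `ocr_outputs` a list of pairs coming from the pre-trained OCR. A stricter threshold yields fewer but cleaner training samples, a looser one yields more samples with more recognition errors.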

Citation (BibTeX)

Published in the proceedings of the 2nd International Workshop on coMics ANalysis, Processing and Understanding.


@inproceedings{Rigaud2017Segmentation,
  author    = {C. Rigaud and J. C. Burie and J. M. Ogier},
  booktitle = {Proceedings of the 2nd International Workshop on coMics ANalysis, Processing and Understanding},
  title     = {Segmentation-Free Speech Text Recognition for Comic Books},
  series    = {MANPU '17},
  volume    = {03},
  pages     = {29--34},
  keywords  = {Image segmentation;Optical character recognition software;Speech;Speech recognition;Text recognition;Training;Writing;comic book image analysis;pseudo ground truth;segmentation-free OCR},
  doi       = {10.1109/ICDAR.2017.288},
  month     = {Nov},
  year      = {2017},
  address   = {Kyoto, Japan},
}