typeface-corpus

This repository initially focuses on compiling data relevant to OCR work in the natural history collections and digital humanities communities. Both communities need to extract high-quality text from documents and images that contain a variety of typefaces. The goal is to compile a corpus of typeface samples in standardized formats that helps these communities improve the quality of text generated by OCR engines such as Tesseract and OCRopus.

For details about file types and formatting, see the Submission Procedures document (submission_procedures.md).

Folders and files:

courier_12_ibm_selectric-1
letter_gothic12_ibm_selectric-1
README.md
submission_procedures.md

jbest/typeface-corpus

Folders and files

Latest commit

History

Repository files navigation

typeface-corpus

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages