Chronicling America OCR debatcher
This program takes paths to .tar.bz2 batches of OCR files from the
Chronicling America bulk data downloads. It converts each batch into a CSV
file, which you can load into a database or use however you like. Batches are
processed concurrently.
./chronam-ocr-debatcher [--processes=8] <path/to/a/batch.tar.bz2 ...>
You can download binaries from the releases page.
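Once a batch has been converted, loading the resulting CSV into a database might look like the sketch below. It is only an illustration, not part of this program: it assumes the CSV starts with a header row, which this README does not confirm, and it stores every column as text. The function name `load_csv`, the default table name `ocr_pages`, and the file names in the usage note are all hypothetical.

```python
import csv
import sqlite3


def load_csv(csv_path, db_path, table="ocr_pages"):
    """Load a CSV file into a SQLite table and return the table's row count.

    Assumption: the first row of the CSV is a header naming the columns.
    Every column is created as TEXT; adjust the types to suit your data.
    """
    # OCR text fields can be far larger than the csv module's default
    # field size limit (131072 characters), so raise it first.
    csv.field_size_limit(2**31 - 1)

    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)

        # Quote identifiers so unusual column names cannot break the SQL.
        columns = ", ".join(
            '"{}" TEXT'.format(name.replace('"', '""')) for name in header
        )
        placeholders = ", ".join("?" for _ in header)

        conn = sqlite3.connect(db_path)
        try:
            # "with conn" wraps the inserts in a single transaction.
            with conn:
                conn.execute(
                    f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})'
                )
                conn.executemany(
                    f'INSERT INTO "{table}" VALUES ({placeholders})', reader
                )
            count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
            return count
        finally:
            conn.close()
```

For example, `load_csv("batch.csv", "chronam.db")` would append the rows of a (hypothetically named) `batch.csv` to the `ocr_pages` table in `chronam.db`. Calling it once per converted batch collects all batches in one table, provided they share the same columns.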