How to use ocrevalUAtion

Once you have installed a version of ocrevaluation.jar, you can run it as follows:

java -cp ocrevaluation.jar eu.digitisation.Main \
    -gt {ground_truth_file} [{encoding}] \
    -ocr {ocr_file} [{encoding}] \
    -d {output_directory} [-r {equivalences_file}]

Where:

{ground_truth_file} = the full path to a ground truth file. Supported formats: plain text, PAGE XML.
{ocr_file} = the full path to an OCR result file. Supported formats: plain text, PAGE XML, FineReader 10 XML, hOCR HTML.
{output_directory} = the folder where the report (in HTML format) will be generated.
{encoding} = the character encoding of the file named immediately before it (optional).
{equivalences_file} = an optional text file describing equivalences between Unicode characters. Each line contains two sequences of hexadecimal code points, separated by a comma.
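
To make the equivalences format more concrete, here is a minimal sketch that writes a two-line file from the shell. The file name equivalences.csv and the two character pairs are only illustrations. The sketch assumes one code point per sequence and no whitespace around the comma; how several code points within one sequence are delimited is not described here, so check that before adding longer sequences.

```shell
# Hypothetical equivalences file, one "sequence,sequence" pair per line.
# Line 1: U+017F (long s) and U+0073 (s).
# Line 2: U+00A0 (no-break space) and U+0020 (space).
# Assumption: no whitespace around the comma.
cat > equivalences.csv <<'EOF'
017F,0073
00A0,0020
EOF
```

The resulting file can then be passed to the tool with -r equivalences.csv.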

Example:

java -cp ocrevaluation.jar eu.digitisation.Main \
    -gt groundtruth.xml -ocr ocr.txt utf8 \
    -d output -r equivalences.csv
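
If you need to evaluate many files, the same command can be wrapped in a small script. The following is only a sketch: the folder names gt, ocr and reports, the pairing of files by base name, and the JAVA and JAR variables are assumptions made for illustration and are not part of ocrevalUAtion. It uses only the options described above.

```shell
#!/bin/sh
# Sketch: evaluate each OCR text file in ocr/ against the ground truth
# file with the same base name in gt/ (hypothetical layout).
JAVA=${JAVA:-java}
JAR=${JAR:-ocrevaluation.jar}

# run_eval GROUND_TRUTH OCR_FILE OUTPUT_DIR
# The OCR file is declared as utf8, as in the example above.
run_eval() {
    "$JAVA" -cp "$JAR" eu.digitisation.Main \
        -gt "$1" -ocr "$2" utf8 \
        -d "$3"
}

for ocr_file in ocr/*.txt; do
    # Skip the unexpanded pattern when ocr/ holds no .txt files.
    [ -e "$ocr_file" ] || continue
    name=$(basename "$ocr_file" .txt)
    gt_file="gt/$name.xml"
    if [ ! -e "$gt_file" ]; then
        echo "skipping $name: no ground truth file" >&2
        continue
    fi
    # Create the report folder in case the tool does not create it itself.
    mkdir -p "reports/$name"
    run_eval "$gt_file" "$ocr_file" "reports/$name"
done
```

Each pair gets its own folder under reports/, so the HTML reports of different files are kept apart.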
