Evalatuion_tool_MOP_v1

This is a customized evaluation tool for the dataset presented in the paper "Semi-Automatic LaTeX-Based Labeling of Mathematical Objects in PDF Documents: MOP Data Set" (ACM DocEng 2019).

We thank the authors of "Performance evaluation of mathematical formula identification", who developed the original version of the evaluation tool for the Marmot dataset. Unlike the Marmot tool, this version does not differentiate between isolated (display) and embedded (in-line) mathematical objects, because our dataset does not make that distinction. The reported results are adjusted accordingly.

NOTE

We are currently developing a new tool that will also test the performance of LaTeX generation and subject prediction.

FOR PDF OFFSETS

Depending on which tool you use, you may have to subtract an offset from the coordinates.
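As a rough illustration of the offset correction, the sketch below subtracts a constant horizontal and vertical offset from a bounding box. The function name `apply_offset` and the offset values are hypothetical; the actual offset, if any, depends on the PDF tool you use and is not specified here.

```python
def apply_offset(bbox, dx=0.0, dy=0.0):
    """Subtract a constant (dx, dy) offset from an (x1, y1, x2, y2) box.

    dx and dy are placeholders: measure them for your own PDF tool.
    """
    x1, y1, x2, y2 = bbox
    return (x1 - dx, y1 - dy, x2 - dx, y2 - dy)


# Hypothetical example: the extraction tool reports coordinates shifted
# by 10 units horizontally and 5 units vertically.
shifted = apply_offset((100.0, 200.0, 150.0, 220.0), dx=10.0, dy=5.0)
print(shifted)
```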

HOW TO USE

Download the project with git clone https://github.com/unkown512/Evalatuion_tool_MOP_v1.git
From the root directory, run cd Evaluation_Tool
To ensure the project builds correctly, run python evaluate.py. This should output the information below:

cor 0 mis 0 fal 1 par 0 exp 1 pae 0 mer 0 spl 0
cor 0 mis 0 fal 0 par 0 exp 1 pae 0 mer 0 spl 0
cor 1 mis 0 fal 0 par 0 exp 0 pae 0 mer 0 spl 0
cor 0 mis 1 fal 0 par 0 exp 0 pae 0 mer 0 spl 0
cor 1 mis 2 fal 8 par 0 exp 1 pae 1 mer 0 spl 0
MO False rate 0.6
MO Mis rate 0.333333333333

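The tool reports the following eight counts, shown here with their abbreviations in the output, followed by the overall MO false rate and MO miss rate:

Correct (cor)
Miss (mis)
False (fal)
Partial (par)
Expanded (exp)
Partial and Expanded (pae)
Merged (mer)
Split (spl)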
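If you want to post-process the printed counts, the following sketch parses one `cor ... spl` group into a dictionary. It assumes each three-letter label in the sample output is followed by a single integer; `parse_counts` is a hypothetical helper and not part of evaluate.py.

```python
FIELDS = ("cor", "mis", "fal", "par", "exp", "pae", "mer", "spl")


def parse_counts(line):
    """Parse one 'cor N mis N ... spl N' group into a {label: count} dict."""
    tokens = line.split()
    # Tokens alternate between a label and its integer count.
    counts = dict(zip(tokens[0::2], (int(t) for t in tokens[1::2])))
    missing = [name for name in FIELDS if name not in counts]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return counts


counts = parse_counts("cor 1 mis 2 fal 8 par 0 exp 1 pae 1 mer 0 spl 0")
print(counts["cor"], counts["mis"], counts["fal"])
```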

How to evaluate your performance?

Create an XML file named <ID>_page_<#>.xml, where ID and # match the naming schema of the corresponding PDF_File and Ground_Truth files. See MOP for more information.
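The naming schema can be built and checked with a small helper such as the sketch below. It assumes that # is a plain decimal page number; the function names and the ID "sample" are hypothetical.

```python
import re

# <ID>_page_<#>.xml, assuming <#> is a decimal page number.
NAME_PATTERN = re.compile(r"^(?P<id>.+)_page_(?P<page>\d+)\.xml$")


def prediction_filename(doc_id, page):
    """Build a prediction file name from a document ID and page number."""
    return f"{doc_id}_page_{page}.xml"


def parse_prediction_filename(name):
    """Split a prediction file name back into (ID, page number)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not of the form <ID>_page_<#>.xml: {name!r}")
    return match.group("id"), int(match.group("page"))


name = prediction_filename("sample", 3)
print(name, parse_prediction_filename(name))
```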

For each extracted MO (mathematical object), insert a new row element, following the example shown in MOP_prediction_example.PNG.

For each row, you may optionally add further entries such as <char BBox>, for example the BBox, size, and value of each character. These are currently ignored during evaluation and do not affect the result.

Follow the examples in the data/prediction directory.

BBox values X1, Y1, X2, Y2 represent the full tight bounding box of the predicted MO in the PDF document:

X1 = Leftmost x-coordinate
Y1 = Lowest y-coordinate
X2 = Rightmost x-coordinate
Y2 = Highest y-coordinate
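If your extractor produces one box per character, the tight bounding box of the whole MO can be derived from them as in the sketch below. It assumes every character box is already an (x1, y1, x2, y2) tuple in the convention above; `tight_bbox` is a hypothetical helper, not part of the tool.

```python
def tight_bbox(char_boxes):
    """Return the smallest (x1, y1, x2, y2) box enclosing all character boxes."""
    boxes = list(char_boxes)
    if not boxes:
        raise ValueError("at least one character box is required")
    x1 = min(b[0] for b in boxes)  # leftmost x-coordinate
    y1 = min(b[1] for b in boxes)  # lowest y-coordinate
    x2 = max(b[2] for b in boxes)  # rightmost x-coordinate
    y2 = max(b[3] for b in boxes)  # highest y-coordinate
    return (x1, y1, x2, y2)


# Two hypothetical, overlapping character boxes.
print(tight_bbox([(10, 20, 15, 30), (14, 18, 22, 28)]))
```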

For each file in the data/pdf directory, ensure that a corresponding file exists in both the data/ground_truth and data/prediction directories. The three directories must contain an equal number of files, and each file must have an <ID> that is unique within its own directory and matches the corresponding files in the other directories.
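A quick consistency check before running the evaluation might look like the sketch below. It assumes that corresponding files share the same name apart from the extension, which may not hold for your copy of the data; `check_dataset` and `stems` are hypothetical helpers.

```python
from pathlib import Path


def stems(directory):
    """Return the set of file names in a directory, without extensions."""
    return {p.stem for p in Path(directory).iterdir() if p.is_file()}


def check_dataset(root="data"):
    """Compare data/ground_truth and data/prediction against data/pdf."""
    root = Path(root)
    reference = stems(root / "pdf")
    report = {}
    for name in ("ground_truth", "prediction"):
        found = stems(root / name)
        report[name] = {
            "missing": sorted(reference - found),     # in pdf, absent here
            "unexpected": sorted(found - reference),  # here, absent in pdf
        }
    return report
```

Calling `check_dataset("data")` from the Evaluation_Tool directory returns empty "missing" and "unexpected" lists for both directories when the three directories line up.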
