Printed mathematical expression recognition (MER) models are usually trained and tested on LaTeX-generated mathematical expressions (MEs), with the LaTeX source code as ground truth. Because the same ME can be produced by many different LaTeX source codes, the ground truth contains unwanted variations that bias test performance results and hinder efficient learning. In addition, generating all MEs with a single font severely limits how well the reported results generalize to realistic scenarios. We propose a data-centric approach to overcome these problems and present convincing experimental results. Our main contribution is an enhanced LaTeX normalization that maps any LaTeX ME to a canonical form. Based on this process, we developed im2latexv2, an improved version of the benchmark dataset im2latex-100k.
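To illustrate the idea behind mapping MEs to a canonical form, here is a toy sketch in Python. The specific rules below (collapsing whitespace, picking one spelling per command synonym, ordering subscripts before superscripts) are illustrative assumptions, not the normalization actually used in MathNet; see the paper for the full rule set.

```python
import re

# Illustrative synonym table: map interchangeable LaTeX commands
# to a single canonical spelling.
SYNONYMS = {
    r"\\le\b": r"\\leq",
    r"\\ne\b": r"\\neq",
    r"\\to\b": r"\\rightarrow",
}

def normalize_latex(src: str) -> str:
    """Toy canonicalization of a LaTeX math expression."""
    # Collapse runs of whitespace to a single space.
    src = re.sub(r"\s+", " ", src).strip()
    # Replace command synonyms with their canonical form.
    for pattern, repl in SYNONYMS.items():
        src = re.sub(pattern, repl, src)
    # Order scripts consistently: subscript before superscript,
    # e.g. x^2_i becomes x_i^2 (same rendered output).
    src = re.sub(r"\^(\{[^{}]*\}|.)_(\{[^{}]*\}|.)", r"_\2^\1", src)
    return src

print(normalize_latex("x^2_i"))      # x_i^2
print(normalize_latex(r"a \le b"))   # a \leq b
```

A real normalizer must parse LaTeX rather than use regexes, since braces and nested groups make the grammar non-regular; this sketch only shows why two different source strings can denote the identical rendered expression.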
You can download the im2latexv2 dataset from Zenodo (Part 1: DOI 10.5281/zenodo.11230382, Part 2: DOI 10.5281/zenodo.11296280).
You can download the realFormula dataset from Zenodo (DOI 10.5281/zenodo.11296815).
Get the model from Dropbox and save it in the trainedModels folder. You can run inference on an image with the inference.py script. The image should ideally have a resolution of 600 DPI and a font size of 12.
Adapt the config file and the dataset config file if required.
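To sanity-check that an input image matches the recommended 600 DPI / 12 pt setting, the point-to-pixel arithmetic can help: 1 pt is 1/72 inch, so a 12 pt font at 600 DPI spans roughly 100 px. The helper below is a hypothetical convenience function, not part of this repository.

```python
def points_to_pixels(points: float, dpi: int = 600) -> int:
    """Convert a length in typographic points (1 pt = 1/72 inch) to pixels."""
    return round(points * dpi / 72)

# A 12 pt font at the recommended 600 DPI is about 100 px tall,
# so glyphs in your input image should be in that ballpark.
print(points_to_pixels(12))          # 100
print(points_to_pixels(12, dpi=72))  # 12
```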
This work is licensed under a Creative Commons Attribution 4.0 International License.
Felix M. Schmitt-Koopmann, Elaine M. Huang, Hans-Peter Hutter, Thilo Stadelmann, Alireza Darvishy
@ARTICLE{9869643,
author={Schmitt-Koopmann, Felix M. and Huang, Elaine M. and Hutter, Hans-Peter and
Stadelmann, Thilo and Darvishy, Alireza},
journal={IEEE Access},
title={MathNet: A Data-Centric Approach for Printed Mathematical Expression Recognition},
year={2024},
doi={10.1109/ACCESS.2024.3404834}}