Hello,
First of all, thank you very much for your impressive work on DocPTBench!
I have successfully reproduced the evaluation metrics for the PaddleOCR-VL model by following the instructions in DocPTBench/docs/parsing.md. Everything works perfectly!
I am now interested in extending the evaluation to other models mentioned in your benchmark. Specifically, I would like to know how to reproduce the results for:
Expert Models: e.g., HunyuanOCR and MinerU (PDF-Extract-Kit)
General MLLMs: e.g., Qwen2.5-VL-72B and Gemini 2.5 Pro
Could you please provide the corresponding inference scripts for these models? If the code is not yet available, any guidance, configuration files, or reference documentation on how to integrate these models into your evaluation framework would be greatly appreciated.
Thank you again for your significant contribution to the community!