LayoutXLMProcessor.__call__ should support a language argument for Tesseract OCR
Motivation
LayoutXLM is a multilingual version of the successful LayoutLMv2 model. The main reason to use it over LayoutLMv2 is to handle different languages, yet the current API does not allow specifying the language to be used in apply_tesseract.
Your contribution
I could submit a PR, but I am not familiar enough with the Transformers library to suggest the best place to add the lang argument.
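To make the request concrete, here is a minimal sketch of what I currently have to do by hand: run Tesseract with an explicit language and build the words and boxes myself. The helper names (normalize_box, words_and_boxes, ocr_with_language) are my own and not part of Transformers, and the 0-1000 box scaling is what I understand the LayoutLMv2 family to expect. The "fra" language code is only an example and assumes that Tesseract language pack is installed.

```python
from typing import Dict, List, Tuple


def normalize_box(left: int, top: int, width: int, height: int,
                  image_width: int, image_height: int) -> List[int]:
    """Scale a pixel box (left, top, width, height) to [x0, y0, x1, y1] in 0-1000."""
    return [
        int(1000 * left / image_width),
        int(1000 * top / image_height),
        int(1000 * (left + width) / image_width),
        int(1000 * (top + height) / image_height),
    ]


def words_and_boxes(data: Dict[str, list], image_width: int,
                    image_height: int) -> Tuple[List[str], List[List[int]]]:
    """Turn pytesseract's dict output into parallel lists of words and boxes.

    Entries whose text is empty or whitespace are skipped.
    """
    words, boxes = [], []
    for text, left, top, width, height in zip(
        data["text"], data["left"], data["top"], data["width"], data["height"]
    ):
        if not text.strip():
            continue
        words.append(text)
        boxes.append(normalize_box(left, top, width, height,
                                   image_width, image_height))
    return words, boxes


def ocr_with_language(image, lang: str = "fra"):
    """Run Tesseract on a PIL image with an explicit language (hypothetical helper)."""
    # Imported lazily so the helpers above work without Tesseract installed.
    import pytesseract

    data = pytesseract.image_to_data(
        image, lang=lang, output_type=pytesseract.Output.DICT
    )
    image_width, image_height = image.size  # PIL returns (width, height)
    return words_and_boxes(data, image_width, image_height)
```

As far as I can tell, the resulting words and boxes can then be passed to a processor whose feature extractor was created with apply_ocr=False, but that bypasses the built-in OCR entirely. A lang argument forwarded to apply_tesseract would make this workaround unnecessary.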
Xargonus changed the title from "LayoutXLMProcessor applies the english tesseract model" to "LayoutXLMProcessor applies the english Tesseract model" on Nov 24, 2021