LayoutLMv2FeatureExtractor now supports non-English languages when applying Tesseract OCR. #14514

Merged

Commits on Nov 24, 2021

  1. Added the lang argument to apply_tesseract in feature_extraction_layoutlmv2.py, which is used in pytesseract.image_to_data.
    Xargonus committed Nov 24, 2021
    3ad64ca
  2. Added ocr_lang argument to LayoutLMv2FeatureExtractor.__init__, which is used when calling apply_tesseract
    Xargonus committed Nov 24, 2021
    95fe37d
  3. 1d3f3e3
  4. Specified in the documentation of the LayoutLMv2FeatureExtractor that the ocr_lang argument should be a language code.
    Xargonus committed Nov 24, 2021
    ccfaa07
  5. Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py

    Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
    Xargonus and NielsRogge committed Nov 24, 2021
    83d6608
  6. d550ea5

Commits on Nov 25, 2021

  1. Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py

    Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
    Xargonus and NielsRogge committed Nov 25, 2021
    2ae5d82
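
The commit messages above describe a two-step plumbing change: apply_tesseract gains a lang argument that is passed to pytesseract.image_to_data, and LayoutLMv2FeatureExtractor.__init__ gains an ocr_lang argument that is passed to apply_tesseract. The following is a minimal sketch of that flow, not the code from this pull request. The names apply_tesseract_sketch, FeatureExtractorSketch, ocr_fn and fake_ocr are hypothetical, and the OCR backend is injected so the example runs without Tesseract installed. With the real library, ocr_fn would presumably be pytesseract.image_to_data, which accepts lang and output_type keyword arguments.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical type alias for an OCR backend shaped like pytesseract.image_to_data.
OcrFn = Callable[..., Dict[str, List[Any]]]


def fake_ocr(image: Any, lang: Optional[str] = None, output_type: str = "dict") -> Dict[str, List[Any]]:
    """Stand-in OCR backend (assumption): returns canned words per language code."""
    canned = {None: ["Hello", "", "world"], "fra": ["Bonjour", "", "monde"]}
    return {"text": canned[lang]}


def apply_tesseract_sketch(image: Any, lang: Optional[str], ocr_fn: OcrFn) -> List[str]:
    """Run OCR with an optional language code and return the non-empty words."""
    # The language code is forwarded unchanged to the OCR backend.
    data = ocr_fn(image, lang=lang, output_type="dict")
    # Drop empty strings, which OCR output commonly contains.
    return [word for word in data["text"] if word.strip()]


class FeatureExtractorSketch:
    """Hypothetical stand-in showing how ocr_lang travels from __init__ to the OCR call."""

    def __init__(self, ocr_fn: OcrFn, apply_ocr: bool = True, ocr_lang: Optional[str] = None):
        self.ocr_fn = ocr_fn
        self.apply_ocr = apply_ocr
        self.ocr_lang = ocr_lang  # a language code such as "fra"; None leaves the backend default

    def __call__(self, image: Any) -> List[str]:
        if not self.apply_ocr:
            return []
        return apply_tesseract_sketch(image, self.ocr_lang, self.ocr_fn)


extractor = FeatureExtractorSketch(ocr_fn=fake_ocr, ocr_lang="fra")
print(extractor(image=None))  # ['Bonjour', 'monde']
```

Storing the language code on the instance and forwarding it at call time keeps the default behavior unchanged when ocr_lang is left as None, which matches the opt-in nature of the change described in the commits.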