A PyTorch implementation of DTrOCR: Decoder-only Transformer for Optical Character Recognition.
This repository builds on Masato Fujitake's DTrOCR research to train a model for Japanese text OCR.
The project is based on Arvind Rajan's implementation of the original research (https://github.com/arvindrajan92/DTrOCR). It uses the ViT implementation from Hugging Face and the Japanese GPT-2-medium language model by Rinna Co. (https://huggingface.co/rinna/japanese-gpt2-small).
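
To make the decoder-only idea concrete, here is a minimal sketch of how ViT-style image patches and text tokens can share one causally masked sequence. It is an illustration under stated assumptions, not this repository's code: the class name `TinyDecoderOnlyOCR`, the layer sizes, the patch size, and the image size are all hypothetical, and a stock `nn.TransformerEncoder` with a causal mask stands in for the pretrained GPT-2 blocks.

```python
import torch
import torch.nn as nn


class TinyDecoderOnlyOCR(nn.Module):
    """Toy decoder-only OCR model (hypothetical): patches and tokens form one sequence."""

    def __init__(self, vocab_size=100, hidden=32, patch=(4, 8), max_len=256):
        super().__init__()
        # ViT-style patch embedding: a strided convolution maps each patch to one vector.
        self.patch_embed = nn.Conv2d(3, hidden, kernel_size=patch, stride=patch)
        self.token_embed = nn.Embedding(vocab_size, hidden)
        self.pos_embed = nn.Embedding(max_len, hidden)
        # Stand-in for GPT-2 blocks: self-attention layers used with a causal mask.
        layer = nn.TransformerEncoderLayer(
            hidden, nhead=4, dim_feedforward=64, batch_first=True
        )
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(hidden, vocab_size)

    def forward(self, images, token_ids):
        # (B, 3, H, W) -> (B, hidden, H/ph, W/pw) -> (B, num_patches, hidden)
        patches = self.patch_embed(images).flatten(2).transpose(1, 2)
        tokens = self.token_embed(token_ids)  # (B, T, hidden)
        # Image patches come first, text tokens follow in the same sequence.
        x = torch.cat([patches, tokens], dim=1)
        seq_len = x.size(1)
        x = x + self.pos_embed(torch.arange(seq_len, device=x.device))
        # Boolean causal mask: True marks positions that may NOT be attended to.
        causal = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=x.device),
            diagonal=1,
        )
        x = self.blocks(x, mask=causal)
        # Only the text positions produce vocabulary logits.
        return self.lm_head(x[:, patches.size(1):])


if __name__ == "__main__":
    model = TinyDecoderOnlyOCR()
    images = torch.randn(2, 3, 32, 128)        # assumed image size
    token_ids = torch.randint(0, 100, (2, 5))  # assumed tokenized text
    print(model(images, token_ids).shape)
```

Because the patches precede the text in the sequence, every text position can attend to the whole image while the causal mask hides later text tokens, so no separate vision encoder is needed.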