trOCR run example #451

Closed
vozdemir opened this issue Sep 29, 2021 · 14 comments

Comments

@vozdemir

Many thanks for TrOCR! I couldn't run TrOCR on text. How do I run it? Can you give an example, please?

Model I am using: TrOCR.

@wolfshow
Contributor

wolfshow commented Sep 29, 2021

@vozdemir Can you provide more details about your question? You may just download the dataset provided and do the inference with the base/large models. Meanwhile, please refer to #448 for more details.

@NielsRogge

@wolfshow would be great if you can fix my notebook to run inference with TrOCR on one particular image: https://colab.research.google.com/drive/1BHkOBUGHr1xlVQ9pLVLZPA0VvAxF4Zo_?usp=sharing

@wolfshow
Contributor

wolfshow commented Oct 2, 2021

> @wolfshow would be great if you can fix my notebook to run inference with TrOCR on one particular image: https://colab.research.google.com/drive/1BHkOBUGHr1xlVQ9pLVLZPA0VvAxF4Zo_?usp=sharing

@Dod-o will help provide an inference example for that.

@Dod-o
Contributor

Dod-o commented Oct 2, 2021

@NielsRogge The inference example has been uploaded; please see the details in pic_inference.py

Dod-o closed this as completed Oct 2, 2021

@SeifeddineGharbi

@Dod-o When running pic_inference.py, I get the following error: urllib.error.HTTPError: HTTP Error 404: Not Found
It occurs when the script tries to download: Downloading: "https://github.com/pytorch/fairseq/archive/master.zip"

Any help, please? Thank you!

@NielsRogge

NielsRogge commented Oct 4, 2021

@SeifeddineGharbi I figured it out (I had to create a custom fork of Fairseq in order to make it work).

Expect TrOCR to be added to Transformers soon ;)

@SeifeddineGharbi

@NielsRogge Thank you so much for replying, but can you explain a bit further, please?

@wolfshow
Contributor

wolfshow commented Oct 4, 2021

> @SeifeddineGharbi I figured it out (I had to create a custom fork of Fairseq in order to make it work).
>
> Expect TrOCR to be added to Transformers soon ;)

We found that the fairseq model cannot be easily converted into the HF format, so we need more time to re-train the models with the HF library.

@nithinreddyy

> @wolfshow would be great if you can fix my notebook to run inference with TrOCR on one particular image: https://colab.research.google.com/drive/1BHkOBUGHr1xlVQ9pLVLZPA0VvAxF4Zo_?usp=sharing

Hugging Face has added the TrOCR model to their models. You can look into it.

https://huggingface.co/transformers/model_doc/trocr.html
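
For readers who only need text from one single-line image, the Transformers route linked above can be sketched roughly as follows. This is a minimal sketch, not the notebook from this thread. The function name `recognize_line` is hypothetical, and the default checkpoint `microsoft/trocr-base-handwritten` is an assumption (the thread does not name a checkpoint). `model_name` can also be a local directory written by `save_pretrained`, which is one way to load a model you fine-tuned yourself.

```python
def recognize_line(image_path, model_name="microsoft/trocr-base-handwritten"):
    """Return the text TrOCR reads from one single-line text image.

    model_name is an assumption: any TrOCR checkpoint id, or a local
    directory saved with save_pretrained, should work the same way.
    """
    # Imported lazily so this file loads even without the heavy dependencies.
    from PIL import Image
    from transformers import TrOCRProcessor, VisionEncoderDecoderModel

    processor = TrOCRProcessor.from_pretrained(model_name)
    model = VisionEncoderDecoderModel.from_pretrained(model_name)

    # TrOCR expects an RGB image of a single text line.
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # Autoregressively generate token ids, then decode them to a string.
    generated_ids = model.generate(pixel_values)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

Calling `recognize_line("line.png")` with a hypothetical image path downloads the checkpoint on first use, so it needs `transformers`, `torch`, and `Pillow` installed plus network access.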

@NielsRogge

@nithinreddyy haha I wrote that page 😅

@nithinreddyy

> @nithinreddyy haha I wrote that page 😅

Yaaa 😬😬. I've gone through the 3 notebooks, but you haven't written code for testing an image and extracting the text (for the model fine-tuned on the IAM dataset). In the 2nd notebook you trained the model on the IAM dataset, and in the 3rd notebook you just checked the test evaluation scores. But how do I take one image from the test data and extract the text?

@NielsRogge

NielsRogge commented Nov 4, 2021

My inference notebook does exactly what you want.

@nithinreddyy

> My inference notebook does exactly what you want.

But you are loading the model directly from Hugging Face. What if we have our own dataset and want to train the model on that data? In the 2nd notebook you trained a custom model on the IAM dataset, but you haven't written code showing how to extract text from one of the test images. I'm looking for that.

@NielsRogge

NielsRogge commented Nov 4, 2021

Sorry, I thought you were talking about recognizing text, but you mean extracting text.

The IAM dataset only contains single-line text images, hence one doesn't need to perform any text extraction anymore. However, if you want to apply TrOCR on an entire PDF document, then you first need a text extraction algorithm.

You can for example take a look at this one: https://github.com/qurator-spk/eynollah
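
The two-stage idea above (find the text lines first, then run TrOCR on each line) can be sketched as below. This is only an illustration under stated assumptions: `recognize_page` is a hypothetical helper, `line_boxes` is assumed to be a list of `(left, top, right, bottom)` pixel boxes that you have already derived from a layout tool (the output format of eynollah is not assumed here), and `recognize` is any callable that maps one line image to a string. A PDF page would first have to be rendered to an image.

```python
def recognize_page(page, line_boxes, recognize):
    """Crop each line box out of `page` and run `recognize` on the crop.

    page:       object with a PIL-style crop((left, top, right, bottom)) method
    line_boxes: iterable of (left, top, right, bottom) tuples (hypothetical
                format, produced by whatever layout analysis you use)
    recognize:  callable mapping one line image to a string
    """
    # Sort top-to-bottom, then left-to-right, as a rough reading order.
    ordered = sorted(line_boxes, key=lambda box: (box[1], box[0]))
    return [recognize(page.crop(box)) for box in ordered]
```

With a `PIL.Image` as `page` and a TrOCR-based function as `recognize`, the result is one string per detected line. The simple sort is only a rough reading order and will not handle multi-column layouts.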
