Happy to see fair evaluation metric being developed #2

Closed
rkcosmos opened this issue Jul 29, 2020 · 2 comments

Comments

@rkcosmos

Hello Clova team,

Greetings from EasyOCR. I'm looking forward to having EasyOCR evaluated by a fair metric. (I'm a bit scared, of course, but it's always better to know one's own weaknesses than to live in an illusion.) May I suggest adding support for datasets in other languages? I think UnrealText has potential. (I have not tested it myself, though.)

Thanks,
Rakpong

@dev-strender
Collaborator

Oh, hi @rkcosmos!

First of all, thank you for your interest in our evaluation.
And thanks also for your contribution to the OCR community by maintaining EasyOCR, an amazing open-source OCR inference project that builds on our work.

As we mentioned in the README,
we did not test on multilingual datasets since there are more things to consider, such as

  • reading order
  • calculation of text length in different languages
  • validity of pseudo character centers in other languages, and so on.

You recommended UnrealText as a test set to examine, and yes,
we are planning to target MLT19 (https://rrc.cvc.uab.es/?ch=15), which covers the same kinds of languages.

We hope to develop it as soon as possible, so stay tuned!
Or, any pull requests are welcome :)

@rkcosmos
Author

rkcosmos commented Aug 2, 2020

Hi, yes, we use a lot of your work. Before launching our project, we looked into quite a number of repositories and found that your team produces code that is easy to read, really works as advertised, and is released under a free license. We really appreciate your team's effort in open-sourcing its work.

It's good to know you are targeting MLT19. In the TODO in your README, you mention trying to calculate the length of text. Let's say in Hindi you have र + ् + क + ि = र्कि. Do you want length = 4 or length = 1? If you want length = 4, you can just use len(). If you want length = 1, you can render the characters and calculate the width of the image. I could even try to create a function for that. (I would like to find a way to contribute back to your team :) )
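
To make the two options concrete, here is a minimal sketch in Python. `len()` counts code points and gives 4 for the example. For the length = 1 reading, the sketch swaps the rendering idea for a simpler stand-in: it merges combining marks and virama-joined consonants into the preceding base character. The function name `approx_cluster_length` and the Devanagari-only virama constant are assumptions made for illustration; this is not a full grapheme segmentation and not necessarily what the evaluation code would do.

```python
import unicodedata

# Assumption: only the Devanagari virama (halant) is handled here.
VIRAMA = "\u094d"


def approx_cluster_length(text: str) -> int:
    """Roughly count visual units instead of code points.

    A character does not start a new unit when it is a combining mark
    (Unicode category Mn, Mc or Me) or a letter that directly follows
    a virama, since that letter is then part of a conjunct.
    """
    count = 0
    prev = ""
    for ch in text:
        category = unicodedata.category(ch)
        is_mark = category in ("Mn", "Mc", "Me")
        joins_conjunct = prev == VIRAMA and category.startswith("L")
        if not is_mark and not joins_conjunct:
            count += 1
        prev = ch
    return count


# RA + VIRAMA + KA + VOWEL SIGN I, the example from the comment above
sample = "\u0930\u094d\u0915\u093f"

codepoint_length = len(sample)                  # the length = 4 reading
cluster_length = approx_cluster_length(sample)  # the length = 1 reading
```

The rendering approach would need a font and an imaging library, so it is left out of this sketch. A heuristic like this one can also disagree with what a font actually draws for some scripts, so it is best treated as a starting point.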
