Parse Images to get Text #5

AnimeshSinha1309 · 2020-03-10T17:40:26Z

Develop a basic model that can extract a book's details from just a photograph of its front page. This is primarily useful for older / self-bound books, where the task should be relatively easy, yet scaling this project to parse the index and more will depend crucially on our ability to do this. The following steps are involved in the development right now.

Get the basic PyTesseract model working.
Collect and tag a small dataset of books, compute the areas of the bounding boxes, and plot them for title/author/waste.
Start combining the text boxes based on overlap, proximity and size to get the title/author.
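
The combining step could look roughly like the sketch below. It is only an illustration under assumptions, not the project's actual code: the `Box` class, the `should_merge` rule, and the default values (`max_gap=20`, the 0.5 overlap ratio, the 1.5 size ratio) are all hypothetical. The field names simply mirror the `left`, `top`, `width`, `height` and `text` keys that `pytesseract.image_to_data` reports for each word.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """One OCR word box (hypothetical helper; fields mirror image_to_data keys)."""
    left: int
    top: int
    width: int
    height: int
    text: str

    @property
    def right(self) -> int:
        return self.left + self.width

    @property
    def bottom(self) -> int:
        return self.top + self.height


def vertical_overlap(a: Box, b: Box) -> int:
    """Pixels shared by the two boxes' vertical extents (0 if none)."""
    return max(0, min(a.bottom, b.bottom) - max(a.top, b.top))


def should_merge(a: Box, b: Box, max_gap: int) -> bool:
    """Merge when boxes share a text line, sit close together, and are similar in size."""
    same_line = vertical_overlap(a, b) >= 0.5 * min(a.height, b.height)
    gap = max(a.left, b.left) - min(a.right, b.right)  # negative if they overlap
    similar_size = max(a.height, b.height) <= 1.5 * min(a.height, b.height)
    return same_line and gap <= max_gap and similar_size


def merge(a: Box, b: Box) -> Box:
    """Union of the two rectangles, with the texts joined in left-to-right order."""
    left, top = min(a.left, b.left), min(a.top, b.top)
    right, bottom = max(a.right, b.right), max(a.bottom, b.bottom)
    first, second = (a, b) if a.left <= b.left else (b, a)
    return Box(left, top, right - left, bottom - top, f"{first.text} {second.text}")


def merge_boxes(boxes: List[Box], max_gap: int = 20) -> List[Box]:
    """Greedily fold word boxes, taken left to right, into phrase boxes."""
    merged: List[Box] = []
    for box in sorted(boxes, key=lambda b: b.left):
        for i, group in enumerate(merged):
            if should_merge(group, box, max_gap):
                merged[i] = merge(group, box)
                break
        else:
            merged.append(box)
    return merged
```

With phrases merged this way, one simple heuristic (again an assumption) is to treat the tallest phrase as the title candidate, e.g. `max(merge_boxes(words), key=lambda b: b.height)`, and to look for the author among the remaining phrases.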

KanishAnand · 2020-07-05T17:25:07Z

Issues in OCR

Language of the book cover page.
Accuracy is still not great, especially for large text or cursive fonts.
Deciding the distance limit for merging bounding boxes.
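
One possible way to settle the distance limit is to tie it to the text height and pick the scaling factor from the tagged dataset. The sketch below is an assumption about how that could be done, not what was implemented: the `Sample` layout, the rule `gap <= factor * height`, and both function names are hypothetical.

```python
from typing import Iterable, List, Tuple

# (horizontal gap in px, text height in px, True if the pair was tagged as one phrase)
Sample = Tuple[float, float, bool]


def accuracy(samples: List[Sample], factor: float) -> float:
    """Fraction of tagged pairs that the rule `gap <= factor * height` gets right."""
    correct = sum((gap <= factor * height) == same for gap, height, same in samples)
    return correct / len(samples)


def choose_gap_factor(samples: List[Sample], candidates: Iterable[float]) -> float:
    """Return the candidate factor with the highest accuracy (first one wins ties)."""
    return max(candidates, key=lambda f: accuracy(samples, f))
```

Scaling the limit by text height means large title text and small author text do not need separate pixel thresholds. Whether that holds up on real covers would have to be checked against the tagged data.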

AnimeshSinha1309 · 2020-07-25T17:51:00Z

We have this working. Any integrations are deferred until we need them, I guess; certainly not in this sprint.

AnimeshSinha1309 added the type:feature New feature or request label Mar 10, 2020

AnimeshSinha1309 added this to the Early Preview Run milestone Mar 10, 2020

AnimeshSinha1309 assigned KanishAnand Mar 10, 2020

AnimeshSinha1309 added this to To do in Flutter App Jul 2, 2020

AnimeshSinha1309 added the segment:books Databasing and collecting information on books label Jul 2, 2020

AnimeshSinha1309 moved this from To do to In progress in Flutter App Jul 5, 2020

AnimeshSinha1309 closed this as completed Jul 25, 2020

Flutter App automation moved this from In progress to Done Jul 25, 2020
