[models] Add text recognition module #4

fg-mindee · 2021-01-11T09:59:27Z

Design a model subpart that is responsible for identifying text strings inside the regions of interest of an image.

Input

images: NumPy-style encoded (cropped) images, already read into memory, each expected to hold a single character sequence

Output

text: list of N strings, where N = number of cropped input images

The following components would be required:

Preprocessor (Detection module : Preprocessor #20, feat: Added function to crop images from bounding boxes #33)
RecognitionModel (CRNN Module (implemented with VGG16) #35)
RecognitionProcessor (feat: Added recognition postprocessor with CTC decoder #37)
RecognitionPredictor (feat: Added high-level predictors #39)
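
As a rough illustration of how the four components above could fit together, here is a minimal NumPy-only sketch. Every name in it (`resize_nearest`, `preprocess`, `ctc_greedy_decode`, `RecognitionPredictorSketch`), the 32x128 target size, and the convention that the CTC blank is the last class index are assumptions made for this example, not the doctr API. The model is treated as any callable that maps a batch of images to per-timestep class scores.

```python
from typing import Callable, List, Sequence, Tuple

import numpy as np


def resize_nearest(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Nearest-neighbour resize of an (H, W, C) image via index lookup."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows][:, cols]


def preprocess(crops: Sequence[np.ndarray],
               size: Tuple[int, int] = (32, 128)) -> np.ndarray:
    """Preprocessor: resize every crop to a common size, scale to [0, 1], batch."""
    batch = [resize_nearest(crop, size[0], size[1]) for crop in crops]
    return np.stack(batch).astype(np.float32) / 255.0


def ctc_greedy_decode(logits: np.ndarray, vocab: str) -> List[str]:
    """Postprocessor: best-path CTC decoding of (N, T, len(vocab) + 1) scores.

    Assumption: the blank class is the last index, i.e. len(vocab).
    """
    blank = len(vocab)
    words = []
    for path in logits.argmax(axis=-1):
        chars = []
        prev = blank
        for idx in path:
            # Collapse repeated labels, then drop blanks.
            if idx != prev and idx != blank:
                chars.append(vocab[idx])
            prev = idx
        words.append("".join(chars))
    return words


class RecognitionPredictorSketch:
    """Hypothetical predictor chaining preprocessing, model and decoding."""

    def __init__(self, model: Callable[[np.ndarray], np.ndarray], vocab: str,
                 size: Tuple[int, int] = (32, 128)) -> None:
        self.model = model
        self.vocab = vocab
        self.size = size

    def __call__(self, crops: Sequence[np.ndarray]) -> List[str]:
        if len(crops) == 0:
            return []
        batch = preprocess(crops, self.size)
        logits = self.model(batch)  # expected shape: (N, T, len(vocab) + 1)
        return ctc_greedy_decode(logits, self.vocab)
```

Calling the predictor on N crops returns N strings, which matches the input/output contract described above. Keeping the decoder separate from the model means the same model output could later be decoded differently (for example with a beam search) without touching the rest of the chain.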

fg-mindee · 2021-01-26T12:11:42Z

EDIT: as discussed, the recognition module expects (cropped) images that each contain a single character sequence. The higher-level object will use the localization information from DetectionPredictor to crop the images and pass them to the RecognitionPredictor.
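
To make that division of responsibilities concrete, here is a small sketch of what the higher-level object could do. The helper names (`crop_boxes`, `read_page`), the box format (absolute pixel `(xmin, ymin, xmax, ymax)` tuples), and the choice to skip boxes that fall outside the image are all assumptions for illustration; the actual detection output format may differ.

```python
from typing import Callable, List, Sequence, Tuple

import numpy as np

Box = Tuple[int, int, int, int]  # assumed format: (xmin, ymin, xmax, ymax) in pixels


def crop_boxes(image: np.ndarray,
               boxes: Sequence[Box]) -> List[Tuple[Box, np.ndarray]]:
    """Clip each box to the image and return (clipped_box, crop) pairs.

    Boxes with no area left after clipping are skipped.
    """
    height, width = image.shape[:2]
    pairs = []
    for xmin, ymin, xmax, ymax in boxes:
        xmin, xmax = max(0, xmin), min(width, xmax)
        ymin, ymax = max(0, ymin), min(height, ymax)
        if xmax > xmin and ymax > ymin:
            pairs.append(((xmin, ymin, xmax, ymax), image[ymin:ymax, xmin:xmax]))
    return pairs


def read_page(image: np.ndarray,
              boxes: Sequence[Box],
              recognizer: Callable[[List[np.ndarray]], List[str]]
              ) -> List[Tuple[Box, str]]:
    """Crop the detected regions and hand only the crops to the recognizer."""
    pairs = crop_boxes(image, boxes)
    if not pairs:
        return []
    texts = recognizer([crop for _, crop in pairs])
    return [(box, text) for (box, _), text in zip(pairs, texts)]
```

With this split, the recognizer never sees coordinates: it receives a list of crops and returns one string per crop, and the caller is responsible for pairing each string with its box.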

fg-mindee added the "type: enhancement", "help wanted", and "module: models" labels Jan 11, 2021

fg-mindee added this to the 0.1.0 milestone Jan 11, 2021

fg-mindee removed the "type: enhancement" and "help wanted" labels Jan 12, 2021

fg-mindee assigned fg-mindee and charlesmindee Jan 21, 2021

fg-mindee mentioned this issue Jan 26, 2021

feat: Added high-level predictors #39

Merged

fg-mindee closed this as completed in #39 Jan 27, 2021
