[models] Add VIPTR recognition model #1867

@felixdittrich92

🚀 The feature

Ref.: #1826

Paper: VIPTR
Implementation: https://github.com/cxfyxl/VIPTR

  • PyTorch implementation
  • TensorFlow implementation

Hi,

I would like to suggest adding another state-of-the-art text recognition architecture to docTR: [SVIPTR](https://paperswithcode.com/paper/viptr-a-vision-permutable-extractor-for-fast). It promises accurate results at low latency.

Notably, the SVIPTR-T (Tiny) variant delivers highly competitive accuracy on par with other lightweight models and achieves SOTA inference speeds. Meanwhile, the SVIPTR-L (Large) attains SOTA accuracy in single-encoder-type models, while maintaining a low parameter count and favorable inference speed.

Thanks for your consideration.

Inference latency should be comparable to crnn_mobilenet_v3_large, and accuracy is hopefully comparable to parseq.
The addition has been agreed on.
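
Once an implementation exists, the latency expectation above could be checked with a small timing harness. The following is only a minimal sketch: `median_latency_ms` and `compare_latency` are hypothetical helper names, not part of docTR, and the harness treats each model as a plain callable that accepts a batch of crops.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def median_latency_ms(
    predict: Callable[[Sequence], object],
    batch: Sequence,
    warmup: int = 3,
    runs: int = 20,
) -> float:
    """Return the median wall-clock time of predict(batch) in milliseconds."""
    # Warm-up calls are not timed (lazy initialisation, caches, etc.).
    for _ in range(warmup):
        predict(batch)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(batch)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def compare_latency(
    predictors: Dict[str, Callable[[Sequence], object]],
    batch: Sequence,
    warmup: int = 3,
    runs: int = 20,
) -> Dict[str, float]:
    """Time every named predictor on the same batch."""
    return {
        name: median_latency_ms(fn, batch, warmup=warmup, runs=runs)
        for name, fn in predictors.items()
    }
```

With docTR installed, the callables could presumably be built via `recognition_predictor("crnn_mobilenet_v3_large", pretrained=True)` and `recognition_predictor("parseq", pretrained=True)` from `doctr.models`, with `batch` being a list of word-crop arrays. The architecture name for VIPTR is not defined yet, so it is left out here.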
