[models] Add VIPTR recognition model

### 🚀 The feature

Ref.: #1826 

Paper: [VIPTR](https://paperswithcode.com/paper/viptr-a-vision-permutable-extractor-for-fast)
Implementation: https://github.com/cxfyxl/VIPTR

- [x] PyTorch implementation
- [ ] TensorFlow implementation

```
Hi, 

I would like to suggest introducing another state-of-the-art text recognition architecture to docTR:
[SVIPTR](https://paperswithcode.com/paper/viptr-a-vision-permutable-extractor-for-fast)
It promises accurate results at low latency.

Notably, the SVIPTR-T (Tiny) variant delivers highly competitive accuracy on par with other lightweight models and achieves SOTA inference speeds. Meanwhile, the SVIPTR-L (Large) attains SOTA accuracy in single-encoder-type models, while maintaining a low parameter count and favorable inference speed.

Thanks for your consideration.
```

Inference latency should be comparable to `crnn_mobilenet_v3_large`, and recognition accuracy is hopefully comparable to `parseq`.
Adding the model has been agreed.
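
To check the latency expectation above once an implementation exists, a small timing helper could be used. The following is only a minimal sketch using the Python standard library: `median_latency_ms` is a hypothetical helper name, not part of docTR, and the workload in the demo is a placeholder rather than a real model forward pass.

```python
import statistics
import time
from typing import Callable


def median_latency_ms(fn: Callable[[], object], n_warmup: int = 3, n_runs: int = 20) -> float:
    """Return the median wall-clock time of fn() in milliseconds.

    Hypothetical helper for illustration; not part of docTR.
    """
    # Warm-up calls are executed but not timed (caches, lazy initialisation).
    for _ in range(n_warmup):
        fn()

    timings_ms = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    # The median is less sensitive to occasional slow runs than the mean.
    return statistics.median(timings_ms)


if __name__ == "__main__":
    # Placeholder workload; in practice this would be a model forward pass.
    latency = median_latency_ms(lambda: sum(range(10_000)))
    print(f"median latency: {latency:.3f} ms")
```

In practice, the callable would wrap a forward pass of each recognition model (for example `crnn_mobilenet_v3_large`, `parseq`, and the new VIPTR model) on the same batch of word crops and the same device, so that the medians are directly comparable.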

