Closed
Labels
ext: docs, ext: tests, framework: pytorch, module: models, topic: documentation, topic: text recognition, type: new feature
Description
🚀 The feature
Ref.: #1826
Paper: VIPTR
Implementation: https://github.com/cxfyxl/VIPTR
- PyTorch implementation
- TensorFlow implementation
Hi,
I would like to suggest possibly introducing another state-of-the-art text recognition architecture to docTR.
[SVIPTR](https://paperswithcode.com/paper/viptr-a-vision-permutable-extractor-for-fast)
It promises accurate results at low latency.
Notably, the SVIPTR-T (Tiny) variant delivers highly competitive accuracy on par with other lightweight models and achieves SOTA inference speeds. Meanwhile, the SVIPTR-L (Large) attains SOTA accuracy in single-encoder-type models, while maintaining a low parameter count and favorable inference speed.
Thanks for your consideration.
Inference latency should be comparable to crnn_mobilenet_v3_large and the results are hopefully comparable to parseq.
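To check the latency expectation above, a small timing helper along these lines might be enough. This is only a sketch: `measure_latency` is a hypothetical helper (not part of docTR), the warmup and run counts are arbitrary, and the docTR usage in the trailing comments assumes the PyTorch backend with a 32x128 input crop.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(
    fn: Callable[[], object], warmup: int = 3, runs: int = 20
) -> Dict[str, float]:
    """Call fn repeatedly and return wall-clock latency statistics in milliseconds.

    Hypothetical helper for illustration; not part of docTR.
    """
    if runs < 1:
        raise ValueError("runs must be >= 1")

    # Warmup calls are executed but not timed (caches, lazy initialisation).
    for _ in range(warmup):
        fn()

    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.mean(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "min_ms": min(timings_ms),
        "max_ms": max(timings_ms),
    }


# Possible usage with docTR (assumes torch and python-doctr are installed;
# the input shape is an assumption, adjust it to the model's expected size):
#
#   import torch
#   from doctr.models import recognition
#
#   model = recognition.crnn_mobilenet_v3_large(pretrained=True).eval()
#   crop = torch.rand(1, 3, 32, 128)
#   with torch.inference_mode():
#       print(measure_latency(lambda: model(crop)))
```

Running the same helper on `parseq` and on the new architecture, with identical input and hardware, would give a like-for-like comparison; the median is usually the more stable figure to report.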
The addition has been agreed on.