
E2STR

The official implementation of E2STR: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer (CVPR 2024) [PDF]

Environment

  1. Install MMOCR 1.0.0.

  2. Install the remaining dependencies: pip install -r requirements.txt.
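For reference, one possible setup route via OpenMIM, following the MMOCR 1.x install guide (the exact version pins below are assumptions; check the official MMOCR instructions for your environment):

    pip install -U openmim
    mim install mmengine
    mim install 'mmcv>=2.0.0rc1'
    mim install 'mmdet>=3.0.0rc0'
    pip install mmocr==1.0.0
    pip install -r requirements.txt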

Data & Model

  1. Download the Union14M-L dataset from the Union14M repository.

  2. Download the MAE-pretrained ViT weights from MAERec.

  3. Download OPT-125M.

  4. Download all the test datasets (listed in Table 1 and Table 2) and update every data_root entry in configs/textrecog/base/datasets accordingly (see the sketch after this list). We may upload these datasets later.

  5. The 600k training images with character-wise annotations will be released later. For now, the repository also runs without them: you can perform in-context training with only the Transform Strategy by setting 'JSON FILE FOR CHARACTER-WISE POSITION INFORMATION' to None (also refer to Table 4).
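A minimal sketch of the edit step 4 asks for, assuming the usual MMOCR 1.x dataset-config layout (the variable name and fields other than data_root are illustrative, not this repository's verified contents):

    # configs/textrecog/base/datasets/<dataset>.py (illustrative)
    data_root = '/path/to/downloaded/benchmark'  # point this at your local copy

    my_textrecog_test = dict(        # hypothetical variable name
        type='OCRDataset',
        data_root=data_root,
        ann_file='textrecog_test.json',
        test_mode=True,
        pipeline=None)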

Train

  1. Stage 1: vanilla STR training

Modify 'MAE PRETRAIN WEIGHT PATH' / 'LM WEIGHT PATH' / 'CHECKPOINT SAVE PATH' / 'SAVE_NAME' in configs/textrecog/icl_ocr/stage1.py, then run:

sh run_stage1.sh
  2. Stage 2: in-context training

Modify 'STAGE-1 WEIGHT PATH' / 'LM WEIGHT PATH' / 'JSON FILE FOR CHARACTER-WISE POSITION INFORMATION' / 'CHECKPOINT SAVE PATH' / 'SAVE_NAME' in configs/textrecog/icl_ocr/stage2.py, then run:

sh run_stage2.sh
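A sketch of how the stage-2 placeholders map to the earlier downloads (the variable names below are illustrative; search the config file for the quoted strings from the steps above):

    # configs/textrecog/icl_ocr/stage2.py (illustrative mapping, not the actual variable names)
    stage1_ckpt = 'STAGE-1 WEIGHT PATH'      # checkpoint produced by run_stage1.sh
    lm_weights = 'LM WEIGHT PATH'            # the downloaded OPT-125M weights
    char_pos_json = 'JSON FILE FOR CHARACTER-WISE POSITION INFORMATION'  # or None (Transform Strategy only)
    ckpt_dir = 'CHECKPOINT SAVE PATH'        # where checkpoints are written
    save_name = 'SAVE_NAME'                  # run name for saved files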

Evaluate

  1. Construct the in-context pool (a JSON file) by randomly sampling data from any target training set (a sampling sketch follows this list). The JSON file should be structured as follows:

[
    {
        "img_path": "...",
        "gt_text": "..."
    }
]

  2. Modify 'JSON FILE FOR IN-CONTEXT POOL' in configs/textrecog/icl_ocr/stage2.py.

  3. Run the following command to evaluate the model:

bash tools/dist_test.sh ./configs/textrecog/icl_ocr/S-stage2.py 'STAGE2-CHECKPOINT-PATH' 8
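A minimal sampling sketch for step 1, assuming the training set ships MMOCR 1.x-style textrecog annotations (the annotation layout and file names here are assumptions, not part of this repository):

    import json
    import random

    def build_pool(ann_file, out_file, pool_size=100, seed=0):
        # MMOCR 1.x textrecog annotations: {'metainfo': ..., 'data_list': [...]}
        with open(ann_file) as f:
            data_list = json.load(f)['data_list']
        random.seed(seed)
        samples = random.sample(data_list, min(pool_size, len(data_list)))
        # Keep only the two fields the in-context pool expects.
        pool = [{'img_path': s['img_path'],
                 'gt_text': s['instances'][0]['text']}
                for s in samples]
        with open(out_file, 'w') as f:
            json.dump(pool, f, indent=2, ensure_ascii=False)

    build_pool('textrecog_train.json', 'icl_pool.json', pool_size=100)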

Citation

If you find our models / code / papers useful in your research, please consider giving us a star ⭐ and a citation 📝:

@inproceedings{zhao2023multi,
  title={Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer},
  author={Zhao, Zhen and Huang, Can and Wu, Binghong and Lin, Chunhui and Liu, Hao and Zhang, Zhizhong and Tan, Xin and Tang, Jingqun and Xie, Yuan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2024}
}
