yyyyyxie/MapTextPipeline

SOTA baseline in the ICDAR 2024 Competition on Historical Map Text Detection, Recognition, and Linking.

Introduction

The ICDAR Robust Reading Competitions (https://rrc.cvc.uab.es/) are internationally recognized as authoritative events in the text recognition field; the evaluation data and metrics used in top conference papers frequently come from ICDAR competitions. Several major editions are typically held each year, each divided into 3-4 challenges.

Introduction to the ICDAR24 Competition on Historical Map Text Detection, Recognition, and Linking

https://rrc.cvc.uab.es/?ch=28&com=introduction

Text on digitized historical maps carries valuable georeferenced political and cultural context, yet this wealth of information remains largely inaccessible because the maps exist only as unsearchable raster images. The competition addresses the unique challenges of detecting and recognizing textual information (e.g., place names) and linking words to form location phrases.

Usage

  • Installation

Python 3.8 + PyTorch 2.0.1 + CUDA 11.7 + Detectron2

conda create -n dnts python=3.8 -y
conda activate dnts
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
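# Install the bundled Detectron2 from the local source tree in editable mode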
cd detectron2
pip install -e .
pip install -r requirements.txt
cd ..
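# Build and install MapTextPipeline itself in development mode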
python setup.py build develop
  • Convert ground-truth annotations to COCO format (see the example after this list):

python tools/convert.py --input-json your_gt_path --output-json your_output_path --output_image_id_json your_output_image_id_path
  • Convert prediction results to the submission format:

python tools/convert_to_original.py --input-json your_pred_path --input_image_id_json your_input_image_id_path --output-json your_output_submission_format_path
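
The commands below are a quick sanity check of the installation plus an example invocation of the conversion script. The file paths (data/rumsey_train.json, data/rumsey_train_coco.json, data/rumsey_image_ids.json) are hypothetical placeholders; the flags are exactly those shown above.

# Verify that PyTorch, CUDA, and Detectron2 import correctly
python -c "import torch, detectron2; print(torch.__version__, torch.cuda.is_available(), detectron2.__version__)"

# Convert ground-truth annotations to COCO format (hypothetical file names)
python tools/convert.py --input-json data/rumsey_train.json --output-json data/rumsey_train_coco.json --output_image_id_json data/rumsey_image_ids.json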

Fine-tune

You can download our pre-trained model from OneDrive and fine-tune it on the Rumsey dataset. The fine-tuning command is as follows:

python tools/train.py --config-file configs/ViTAEv2_S/rumsey/final_rumsey.yaml --num-gpus 2
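
If tools/train.py follows Detectron2's standard argument parser, which is an assumption here (the project builds on AdelaiDet and DeepSolo, which do), config values can also be overridden from the command line, for example to point the initial weights at the downloaded checkpoint:

# Hypothetical override using Detectron2-style trailing config options;
# replace the checkpoint path with wherever you saved the pre-trained model.
python tools/train.py --config-file configs/ViTAEv2_S/rumsey/final_rumsey.yaml --num-gpus 2 MODEL.WEIGHTS pretrained/maptext_pretrain.pth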

You can also directly use our fine-tuned weights for inference:

python tools/train.py --config-file configs/ViTAEv2_S/rumsey/test.yaml --num-gpus 2 --eval-only

JSON results will be saved to output/vitaev2/test/rumsey_bs2_test_final/inference/text_results.json; you can then use tools/convert_to_original.py to convert this file to the submission format.
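
For example, to turn that inference output into a submission file (the image-id JSON and output path below are hypothetical; use the image-id file produced earlier by tools/convert.py):

# Convert the inference JSON named above into the competition submission format
python tools/convert_to_original.py --input-json output/vitaev2/test/rumsey_bs2_test_final/inference/text_results.json --input_image_id_json data/rumsey_image_ids.json --output-json output/rumsey_submission.json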

Citation

This project builds on methods from DNTextSpotter. If you find MapTextPipeline helpful, please consider giving this repo a star ⭐ and citing:

@article{xie2024dntextspotter,
  title={DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training},
  author={Xie, Yu and Qiao, Qian and Gao, Jun and Wu, Tianxiang and Huang, Shaoyao and Fan, Jiaqing and Cao, Ziqiang and Wang, Zili and Zhang, Yue and Zhang, Jielei and others},
  journal={arXiv preprint arXiv:2408.00355},
  year={2024}
}

Acknowledgement

This project is based on AdelaiDet and DeepSolo. For academic use, it is licensed under the 2-clause BSD License.
