
DBNet


Paper: Real-time Scene Text Detection with Differentiable Binarization

Label: Text Detection


Introduction

In recent years, segmentation-based methods have become popular in scene text detection, because segmentation results can describe the various shapes of scene text, such as curved text, more accurately. However, segmentation-based detection requires a binarization post-processing step that converts the probability map produced by the segmentation network into text bounding boxes/regions. DBNet proposes a module called differentiable binarization (DB) that performs the binarization process inside the segmentation network. A segmentation network optimized with the DB module can adaptively set the binarization threshold, which both simplifies post-processing and improves text detection performance.
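The DB module replaces the hard threshold with a steep sigmoid, B = 1 / (1 + exp(-k·(P - T))), where P is the probability map, T is the learned threshold map, and k is an amplification factor (the paper uses k = 50). A per-pixel sketch in plain Python (the function name is illustrative, not from this repository):

```python
import math

def db_binarize(prob, thresh, k=50.0):
    """Differentiable binarization of a single pixel:
    B = 1 / (1 + exp(-k * (P - T))).

    prob:   probability-map value P in [0, 1]
    thresh: threshold-map value T in [0, 1]
    k:      amplification factor (50 in the DB paper)
    """
    return 1.0 / (1.0 + math.exp(-k * (prob - thresh)))

# Pixels well above the learned threshold saturate toward 1 and pixels
# below it toward 0, yet gradients still flow through the operation.
```

Because the mapping is smooth, the threshold map T can be learned jointly with the probability map during training.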


Dataset

Datasets used: ICDAR2015

  • Size: 132 MB
    • Training set:
      • images: 88.5 MB (1,000 images)
      • labels: 157 KB
    • Evaluation set:
      • images: 43.3 MB (500 images)
      • labels: 244 KB
  • Data format: image, label
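ICDAR2015 ground-truth files list one text instance per line as eight clockwise corner coordinates followed by the transcription, with "###" marking illegible ("don't care") regions. A minimal parser sketch (the helper name is ours, not from this repository):

```python
def parse_icdar2015_label(line):
    # Each line: x1,y1,x2,y2,x3,y3,x4,y4,transcription
    # "###" marks an unreadable region, usually ignored during training.
    parts = line.strip().lstrip("\ufeff").split(",")
    coords = list(map(int, parts[:8]))
    text = ",".join(parts[8:])          # transcription may itself contain commas
    box = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
    return box, text, text == "###"
```

For example, the line `377,117,463,117,465,130,378,130,Genaxis Theatre` yields a four-point box and the transcription "Genaxis Theatre".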

Environment Requirements

  • Device(Ascend/GPU/CPU)
    • Use Ascend/GPU/CPU as hardware environment. Refer to MindSpore to install the runtime environment.
  • MindSpore >= 1.9

Clone the repository and install the dependencies:

git clone https://gitee.com/mindspore/models.git
cd models/official/cv/DBNet
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

BenchMark

Accuracy

| Model | Pretrained Model | Config | Train Set | Test Set | Device Num | Epoch | Test Size | Recall | Precision | Hmean | CheckPoint | Graph Train Log |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DBNet-R18 | R18 | cfg | ICDAR2015 Train | ICDAR2015 Test | 1 | 1200 | 736 | 78.63 | 84.21 | 81.32 | download | download |
| DBNet-R50 | R50 | cfg | ICDAR2015 Train | ICDAR2015 Test | 1 | 1200 | 736 | 81.05 | 88.07 | 84.41 | download | download |
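The Hmean column is the harmonic mean (F-measure) of recall and precision, which can be checked directly from the table values:

```python
def hmean(recall, precision):
    # Harmonic mean (F-measure) of recall and precision,
    # the standard definition behind the Hmean column.
    return 2 * recall * precision / (recall + precision)
```

For example, DBNet-R18's 78.63 recall and 84.21 precision give an Hmean of about 81.32, matching the table.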

Performance

| Device | Model | Dataset | Params (M) | PyNative train 1P bs=16 (ms/step) | PyNative train 8P bs=8 (ms/step) | PyNative infer (FPS) | Graph train 1P bs=16 (ms/step) | Graph train 8P bs=8 (ms/step) | Graph infer (FPS) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ascend | DBNet-R18 | ICDAR2015 | 11.78 | 370 | 530 | - | 224 | 195 | 40.62 |
| GPU | DBNet-R18 | ICDAR2015 | 11.78 | 710 | 880 | - | 560 | 435 | 30.97 |
| Ascend | DBNet-R50 | ICDAR2015 | 24.28 | 524 | 680 | - | 273 | 220 | 33.88 |
| GPU | DBNet-R50 | ICDAR2015 | 24.28 | 935 | 1054 | - | 730 | 547 | 23.95 |
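The ms/step columns can be converted into per-device throughput given the batch size; for instance, DBNet-R18 graph-mode training on one Ascend device at 224 ms/step with bs=16 is roughly 71 images/s:

```python
def images_per_second(batch_size, ms_per_step):
    # Converts a (batch size, ms/step) pair into per-device throughput.
    return batch_size * 1000.0 / ms_per_step
```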

This model is strongly affected by data processing, so performance numbers fluctuate considerably across machines; the data above are for reference only.

The above data were measured on:

Ascend: Ascend 910 32 GB, 8 devices; OS: EulerOS 2.8; memory: 756 GB; 96-core ARM CPU.

GPU: V100 PCIe 32 GB, 8 devices; OS: Ubuntu 18.04; memory: 502 GB; 72-core x86 CPU.

Quick Start

Run standalone:

bash run_standalone_train.sh [CONFIG_PATH] [DEVICE_ID] [LOG_NAME](optional)

Run distribution:

bash run_distribution_train.sh [DEVICE_NUM] [CONFIG_PATH] [LOG_NAME](optional)

Evaluation:

bash run_eval.sh [CONFIG_PATH] [CKPT_PATH] [DEVICE_ID] [LOG_NAME](optional)

If you need to modify the device or other configurations, please modify the corresponding items in the configuration file.

Training

Run standalone train

bash run_standalone_train.sh [CONFIG_PATH] [DEVICE_ID] [LOG_NAME](optional)
# CONFIG_PATH: The configuration file path, device target default is Ascend. If you need to modify it, please modify the device in the config file.
# DEVICE_ID: Device id used for training
# LOG_NAME: The name of the saved log and output folder. The default is standalone_train

The above command runs in the background. You can view the results in the [LOG_NAME].txt file.

After the training, you can find the checkpoint file in [LOG_NAME].

Run distribution train

bash run_distribution_train.sh [DEVICE_NUM] [CONFIG_PATH] [LOG_NAME](optional)
# DEVICE_NUM: Device number used for training.
# CONFIG_PATH: The configuration file path, device target default is Ascend. If you need to modify it, please modify the device in the config file.
# LOG_NAME: The name of the saved log and output folder. The default is distribution_train

Executing the above command will run in the background. You can view the results through the [LOG_NAME].txt file.

ModelArts

  1. Configure the ModelArts parameters in the config file:
  • set enable_modelarts=True
  • set the OBS dataset path data_url
  • set the OBS training output path train_url
  2. Refer to ModelArts to execute training.

Online Evaluation

bash run_eval.sh [CONFIG_PATH] [CKPT_PATH] [DEVICE_ID] [LOG_NAME](optional)
# CONFIG_PATH: The configuration file path, device target default is Ascend. If you need to modify it, please modify the device in the config file.
# DEVICE_ID: Device id used for evaluation
# LOG_NAME: The name of the saved log and output folder. The default is eval

Executing the above command will run in the background. You can view the results through the [LOG_NAME].txt file.

Offline Evaluation

Export Process

python export.py --config_path=[CONFIG_PATH] --ckpt_path=[CKPT_PATH]

You can find the exported MindIR file in the output_dir specified in the config file.

Ascend 310 Inference

Please refer to the MindSpore Inference with C++ Deployment Guide to set environment variables.

bash scripts/run_cpp_infer.sh [MINDIR_PATH] [CONFIG_PATH] [OUTPUT_DIR] [DEVICE_TARGET] [DEVICE_ID]
# MINDIR_PATH: The path of MindIR file
# CONFIG_PATH: The configuration file path
# OUTPUT_DIR: Data preprocessing and result saving path
# DEVICE_TARGET: One of [Ascend, GPU, CPU]; for Ascend 310 inference, choose Ascend
# DEVICE_ID: Device id

Disclaimers

This repository only provides scripts for downloading and preprocessing public datasets. We do not own these datasets, nor are we responsible for their quality or maintenance. Please make sure you have permission to use each dataset under its license. Models trained on these datasets may only be used for non-commercial research and teaching purposes.

To the dataset owners: if you do not want a dataset included in MindSpore models, or want it updated in any way, we will delete or update all public content as requested. Please contact us through Gitee. Thank you for your understanding and contribution to the community.

Acknowledgements

This version of DBNet draws on several excellent open-source projects, including:

https://github.com/MhLiao/DB.git

https://gitee.com/yanan0122/dbnet-and-dbnet_pp-by-mind-spore.git

FAQ

Please refer to the Models FAQ for answers to some common questions.

Q: How do I resolve insufficient memory or WARNINGs about too many threads?

A: Adjust num_workers, prefetch_size, and max_rowsize in the configuration file. In general, if CPU consumption is too high, reduce num_workers; if memory consumption is too high, reduce num_workers, prefetch_size, and max_rowsize.
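These three knobs typically live in the dataset section of the config file. A hypothetical excerpt (the key names come from the FAQ above; the actual file layout may differ):

```yaml
dataset:
  num_workers: 4     # lower this if CPU usage is too high
  prefetch_size: 2   # lower this if memory usage is too high
  max_rowsize: 16    # lower this if memory usage is too high
```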

Q: What should I do if the loss does not converge in a GPU environment?

A: Set mix_precision to False in the configuration file.

Q: Why does TotalText have a dataset interface but no configuration file?

A: Training on TotalText requires parameters pretrained on the SynthText dataset, and no SynthText pretrained parameter file is currently provided.
