Paper: Real-time Scene Text Detection with Differentiable Binarization
Label: Text Detection
In recent years, segmentation-based methods have become popular in scene text detection because segmentation results can more accurately describe text of arbitrary shapes, such as curved text. However, segmentation-based detection requires a binarization post-processing step that converts the probability map produced by the segmentation network into text bounding boxes or regions. DBNet proposes a module called differentiable binarization (DB), which performs binarization inside the segmentation network itself. A segmentation network optimized with the DB module can adaptively set the binarization threshold, which both simplifies post-processing and improves text detection performance.
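The approximate binarization at the heart of DB can be sketched in a few lines. This is a toy illustration of the formula from the paper, not this repository's implementation; the paper uses an amplification factor k of 50:

```python
import math

def hard_binarize(p: float, t: float) -> float:
    """Standard binarization: a step function, not differentiable at p == t."""
    return 1.0 if p >= t else 0.0

def differentiable_binarize(p: float, t: float, k: float = 50.0) -> float:
    """DB's approximate step function: B = 1 / (1 + exp(-k * (P - T))).

    p is a value of the probability map and t the corresponding value of
    the learned threshold map. The steep slope k makes the sigmoid behave
    like a step function while keeping gradients defined everywhere, so
    binarization can be trained jointly with the segmentation network.
    """
    return 1.0 / (1.0 + math.exp(-k * (p - t)))

# Well above / below the threshold, the soft version matches the hard one.
print(differentiable_binarize(0.9, 0.3))  # close to 1
print(differentiable_binarize(0.1, 0.3))  # close to 0
```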
Datasets used: ICDAR2015

- Size: 132M
    - Training set:
        - images: 88.5M (1000 images)
        - labels: 157KB
    - Evaluation set:
        - images: 43.3M (500 images)
        - labels: 244KB
- Data format: image, label
- Hardware (Ascend/GPU/CPU)
    - Use Ascend, GPU, or CPU as the hardware environment. Refer to the MindSpore installation guide to set up the runtime environment.
- MindSpore >= 1.9
```shell
git clone https://gitee.com/mindspore/models.git
cd models/official/cv/DBNet
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
```
Model | Pretrained Model | Config | Train Set | Test Set | Device Num | Epochs | Test Size | Recall | Precision | Hmean | Checkpoint | Graph Train Log |
---|---|---|---|---|---|---|---|---|---|---|---|---|
DBNet-R18 | R18 | cfg | ICDAR2015 Train | ICDAR2015 Test | 1 | 1200 | 736 | 78.63 | 84.21 | 81.32 | download | download |
DBNet-R50 | R50 | cfg | ICDAR2015 Train | ICDAR2015 Test | 1 | 1200 | 736 | 81.05 | 88.07 | 84.41 | download | download |
Device | Model | Dataset | Params (M) | PyNative train 1P bs=16 (ms/step) | PyNative train 8P bs=8 (ms/step) | PyNative infer (FPS) | Graph train 1P bs=16 (ms/step) | Graph train 8P bs=8 (ms/step) | Graph infer (FPS) |
---|---|---|---|---|---|---|---|---|---|
Ascend | DBNet-R18 | ICDAR2015 | 11.78 M | 370 | 530 | - | 224 | 195 | 40.62 |
GPU | DBNet-R18 | ICDAR2015 | 11.78 M | 710 | 880 | - | 560 | 435 | 30.97 |
Ascend | DBNet-R50 | ICDAR2015 | 24.28 M | 524 | 680 | - | 273 | 220 | 33.88 |
GPU | DBNet-R50 | ICDAR2015 | 24.28 M | 935 | 1054 | - | 730 | 547 | 23.95 |
This model is highly sensitive to data processing, so performance figures can fluctuate significantly across machines; the numbers above are for reference only. They were measured on:

- Ascend: 8 x Ascend 910 (32 GB); OS: EulerOS 2.8; memory: 756 GB; 96-core ARM CPU.
- GPU: 8 x V100 PCIe (32 GB); OS: Ubuntu 18.04; memory: 502 GB; 72-core x86 CPU.
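As a sanity check on the step times above, per-device throughput can be derived from the batch size and the time per step. This small helper is not part of the repository; the numbers in the comment come from the Graph-mode DBNet-R18 row:

```python
def throughput(batch_size: int, ms_per_step: float) -> float:
    """Per-device throughput in images/s: batch size divided by step time."""
    return batch_size / (ms_per_step / 1000.0)

# DBNet-R18 on Ascend, Graph mode, 1 device, batch size 16, 224 ms/step:
print(round(throughput(16, 224), 1))  # ~71.4 images/s
```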
Run standalone training:

```shell
bash run_standalone_train.sh [CONFIG_PATH] [DEVICE_ID] [LOG_NAME](optional)
```

Run distributed training:

```shell
bash run_distribution_train.sh [DEVICE_NUM] [CONFIG_PATH] [LOG_NAME](optional)
```

Run evaluation:

```shell
bash run_eval.sh [CONFIG_PATH] [CKPT_PATH] [DEVICE_ID] [LOG_NAME](optional)
```
If you need to modify the device or other configurations, please modify the corresponding items in the configuration file.
```shell
bash run_standalone_train.sh [CONFIG_PATH] [DEVICE_ID] [LOG_NAME](optional)
# CONFIG_PATH: path to the configuration file. The device target defaults to Ascend; to change it, modify the device field in the config file.
# DEVICE_ID: device id used for training
# LOG_NAME: name of the saved log and output folder (default: standalone_train)
```
The command above runs in the background; you can view the results in the [LOG_NAME].txt file. After training, the checkpoint files can be found in the [LOG_NAME] directory.
```shell
bash run_distribution_train.sh [DEVICE_NUM] [CONFIG_PATH] [LOG_NAME](optional)
# DEVICE_NUM: number of devices used for training
# CONFIG_PATH: path to the configuration file. The device target defaults to Ascend; to change it, modify the device field in the config file.
# LOG_NAME: name of the saved log and output folder (default: distribution_train)
```
The command above runs in the background; you can view the results in the [LOG_NAME].txt file.
- Configure the ModelArts parameters in the config file:
    - Set enable_modelarts=True
    - Set the OBS dataset path data_url
    - Set the OBS output path train_url
- Refer to ModelArts to run training.
```shell
bash run_eval.sh [CONFIG_PATH] [CKPT_PATH] [DEVICE_ID] [LOG_NAME](optional)
# CONFIG_PATH: path to the configuration file. The device target defaults to Ascend; to change it, modify the device field in the config file.
# CKPT_PATH: path to the checkpoint file to evaluate
# DEVICE_ID: device id used for evaluation
# LOG_NAME: name of the saved log and output folder (default: eval)
```
The command above runs in the background; you can view the results in the [LOG_NAME].txt file.
```shell
python export.py --config_path=[CONFIG_PATH] --ckpt_path=[CKPT_PATH]
```

The exported MINDIR file can be found in the output_dir specified in the config file.
Please refer to the MindSpore Inference with C++ Deployment Guide to set the environment variables.
```shell
bash scripts/run_cpp_infer.sh [MINDIR_PATH] [CONFIG_PATH] [OUTPUT_DIR] [DEVICE_TARGET] [DEVICE_ID]
# MINDIR_PATH: path to the MindIR file
# CONFIG_PATH: path to the configuration file
# OUTPUT_DIR: path for data preprocessing output and results
# DEVICE_TARGET: one of [Ascend, GPU, CPU]; for Ascend 310 inference, choose Ascend
# DEVICE_ID: device id
```
Models only provides scripts for downloading and preprocessing public datasets. We do not own these datasets and are not responsible for their quality or maintenance. Please make sure you are permitted to use these datasets under their licenses. Models trained on these datasets may be used only for non-commercial research and teaching purposes.

To the dataset owners: if you do not want your dataset included in MindSpore models, or want it updated in any way, we will delete or update all public content as requested. Please contact us through Gitee. Thank you for your understanding and contribution to the community.
This version of DBNet draws on several excellent open-source projects, including:
https://github.com/MhLiao/DB.git
https://gitee.com/yanan0122/dbnet-and-dbnet_pp-by-mind-spore.git
Please refer to the Models FAQ for answers to common questions.
Q: How do I resolve out-of-memory errors or "too many threads" warnings?
A: Adjust num_workers, prefetch_size, and max_rowsize in the configuration file. In general, reduce num_workers when CPU consumption is excessive; reduce num_workers, prefetch_size, and max_rowsize when memory consumption is excessive.
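For illustration only — the exact key names and nesting depend on the configuration files shipped with this repository — the dataset-pipeline settings mentioned above might look like this in a config file:

```yaml
# Hypothetical excerpt; match the key names to your actual config file.
dataset:
  num_workers: 4     # reduce first when CPU usage is too high
  prefetch_size: 2   # reduce when memory usage is too high
  max_rowsize: 16    # reduce together with prefetch_size to save memory
```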
Q: What should I do if the loss does not converge in a GPU environment?
A: Set mix_precision to False in the configuration file.
Q: Why does TotalText have a dataset interface but no configuration file?
A: TotalText requires parameters pretrained on the SynthText dataset, and no SynthText pretrained parameter file is currently provided.