Note: some code is inherited from MhLiao/DB.
conda create --name DBNet.pytorch -y
conda activate DBNet.pytorch
conda install ipython pip
# clone repo
git clone https://github.com/WenmuZhou/DBNet.pytorch.git
cd DBNet.pytorch/
# python dependencies (run inside the cloned repo)
pip install -r requirement.txt
# install PyTorch with cuda-10.1; choose the cudatoolkit version that matches your CUDA install
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
# build deformable_conv from torchvision >=0.5
git clone https://github.com/pytorch/vision.git
cd vision
python3 setup.py install
- pytorch 1.2+
- torchvision 0.5+
- gcc 4.9+
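The version requirements above can be checked from Python before training. The following is a minimal sketch, not part of the repository: the helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, and it assumes the installed distributions are registered under the names `torch` and `torchvision`. The gcc requirement is not checked.

```python
import re
from importlib import metadata

# Minimum versions taken from the requirements list above.
MINIMUMS = {"torch": "1.2", "torchvision": "0.5"}


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple.

    Local build tags after "+" (for example "+cu101") are ignored.
    """
    public = text.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings component by component."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that "1.2" and "1.2.0" compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Map each package name to True, False, or None (not installed)."""
    report = {}
    for name, minimum in minimums.items():
        try:
            report[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    labels = {True: "ok", False: "too old", None: "not installed"}
    for name, ok in check_environment().items():
        print(f"{name}: {labels[ok]}")
```

Comparing numeric tuples rather than raw strings matters here: as strings, "1.10" would sort before "1.2".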
TBD
Training data: prepare a text file in the following format, using '\t' as the separator
/path/to/img.jpg path/to/label.txt
...
Validation data: use a folder with the following layout
img/ stores the images
gt/ stores the ground-truth files
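The tab-separated training list described above can be generated and checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function names and the example paths are hypothetical, and it only handles the list-file format (one image path and one label path per line, separated by a single tab), not the contents of the label files.

```python
def format_train_list(pairs):
    """Return the list-file text: one "image<TAB>label" line per pair."""
    lines = []
    for image, label in pairs:
        if "\t" in str(image) or "\t" in str(label):
            raise ValueError("paths must not contain tab characters")
        lines.append(f"{image}\t{label}\n")
    return "".join(lines)


def parse_train_list(text):
    """Parse list-file text back into (image, label) tuples.

    Blank lines are skipped; any other line must have exactly two
    tab-separated fields.
    """
    pairs = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {number}: expected 2 tab-separated fields, got {len(fields)}"
            )
        pairs.append((fields[0], fields[1]))
    return pairs


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    example = [
        ("/path/to/img_1.jpg", "/path/to/label_1.txt"),
        ("/path/to/img_2.jpg", "/path/to/label_2.txt"),
    ]
    with open("train.txt", "w", encoding="utf-8") as handle:
        handle.write(format_train_list(example))
```

Running `parse_train_list` over an existing list file is a cheap way to catch lines that were separated with spaces instead of a tab.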
- Set dataset['train']['dataset']['data_path'] and dataset['validate']['dataset']['data_path'] in config/icdar2015_resnet18_fpn_DBhead_polyLR.yaml
- Single-GPU training
bash singlel_gpu_train.sh
- Multi-GPU training
bash multi_gpu_train.sh
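The two data_path entries from the configuration step can also be set from Python instead of editing the YAML by hand. The following is a minimal sketch, assuming the config file has already been loaded into a nested dict and that the keys are nested exactly as written in the configuration step; the real file may contain additional levels or expect a list of paths. The function name `set_data_paths` and the example values are hypothetical.

```python
import copy


def set_data_paths(config, train_path, validate_path):
    """Return a copy of `config` with both data_path entries replaced.

    The input dict is left untouched. A KeyError is raised if the
    expected nesting (dataset -> train/validate -> dataset) is missing.
    """
    updated = copy.deepcopy(config)
    updated["dataset"]["train"]["dataset"]["data_path"] = train_path
    updated["dataset"]["validate"]["dataset"]["data_path"] = validate_path
    return updated


if __name__ == "__main__":
    # Stand-in for the loaded YAML; only the keys used above are shown.
    config = {
        "dataset": {
            "train": {"dataset": {"data_path": None}},
            "validate": {"dataset": {"data_path": None}},
        }
    }
    new_config = set_data_paths(config, "/path/to/train.txt", "/path/to/val")
    print(new_config["dataset"]["train"]["dataset"]["data_path"])
```

With PyYAML installed, the dict can be read with `yaml.safe_load` and written back with `yaml.safe_dump`. Note that dumping discards any comments in the original file.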
eval.py evaluates the model on the test dataset.
- Set model_path in eval.sh
- Run the following script to evaluate
bash eval.sh
predict.py runs inference on a single image.
- Set model_path and img_path in predict.py
- Run the following script to predict
python3 predict.py
The project is still under development.
The models below were trained only on the ICDAR2015 dataset.
Method | Image size (short side) | Learning rate | Precision (%) | Recall (%) | F-measure (%) | FPS |
---|---|---|---|---|---|---|
Deform-ResNet-18 (paper) | 736 | 0.007 | 86.8 | 78.4 | 82.3 | 48 |
Resnet18-FPN-DBHead | 736 | 1e-3 | 87.03 | 75.06 | 80.6 | 43 |
Resnet50-FPN-DBHead | 736 | 1e-3 | 88.06 | 77.14 | 82.24 | 27 |
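The F-measure column for the two models trained here is consistent with the harmonic mean of precision and recall. A minimal sketch for recomputing it, assuming that standard definition (the evaluation script itself may compute it from raw match counts rather than from rounded percentages):

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall, in the same units as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # (method, precision %, recall %) copied from the table above.
    rows = [
        ("Resnet18-FPN-DBHead", 87.03, 75.06),
        ("Resnet50-FPN-DBHead", 88.06, 77.14),
    ]
    for method, precision, recall in rows:
        print(f"{method}: {f_measure(precision, recall):.2f}")
```

Because the table values are rounded, a recomputed score can differ from a reported one in the last digit.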
TBD
- multi-GPU training
- https://arxiv.org/pdf/1911.08947.pdf
- https://github.com/WenmuZhou/PANet.pytorch
- https://github.com/MhLiao/DB
If this repository helps you, please star it. Thanks.