Introduction

This is a PyTorch implementation of Video Text Tracking With a Spatio-Temporal Complementary Model.

Part of the code is inherited from DB and SiamMask.

ToDo List

Release code
Document for Installation
Document for training and testing

Installation

Requirements:

Python 3.6
PyTorch >= 1.2
GCC 5.5
CUDA 9.2

  conda create --name scm python=3.6
  conda activate scm

  # install PyTorch with cuda-9.2
  conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=9.2 -c pytorch

  # clone repo
  git clone https://github.com/lsabrinax/VideoTextSCM
  cd VideoTextSCM/

  # python dependencies (run inside the cloned repo)
  pip install -r requirements.txt

  # build deformable convolution operator
  cd assets/ops/dcn/
  python setup.py build_ext --inplace
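
After the build finishes, it can help to confirm that the active environment matches the requirements listed above. The following is a minimal sketch, not part of the repository: the helper names parse_version and check_environment are hypothetical, and it only compares the Python and PyTorch versions against the listed requirements and reports whether PyTorch can see a CUDA device.

```python
import re
import sys


def parse_version(text):
    """Keep the leading numeric fields, e.g. '1.5.0+cu92' -> (1, 5, 0)."""
    fields = []
    for part in text.split("+")[0].split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break
        fields.append(int(match.group()))
    return tuple(fields)


def check_environment(min_torch=(1, 2)):
    """Return a list of problems; an empty list means nothing looked off."""
    problems = []
    if sys.version_info[:2] != (3, 6):
        problems.append(
            "requirements list Python 3.6, found {}.{}".format(*sys.version_info[:2])
        )
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed in this environment")
        return problems
    if parse_version(torch.__version__) < min_torch:
        problems.append("requirements list PyTorch >= 1.2, found " + torch.__version__)
    if not torch.cuda.is_available():
        problems.append("PyTorch cannot see a CUDA device")
    return problems


if __name__ == "__main__":
    report = check_environment() or ["environment looks consistent with the requirements"]
    for line in report:
        print(line)
```

A newer Python version is reported as a problem only because the requirements name 3.6 exactly; whether the code runs on later versions is not stated here.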

Datasets

The root of the dataset directory can be VideoTextSCM/datasets/. Download the converted ground truth and data lists from Baidu Drive (download code: 0e8b) or Google Drive. The images of each dataset can be obtained from its official website.

Testing

Run the command below to produce the tracking results, then submit them to the official website to obtain the performance scores.

CUDA_VISIBLE_DEVICES=0 python demo_textboxPP.py --input-root path-to-test-dataset --output-root path-to-save-result \
--sub-res --dataset icdar --weight-path path-to-embedding-model \
--scm-config path-to-scm-config --scm-weight-path path-to-scm-model
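
When the same test has to be run with several weight files or output folders, the command above can be assembled from Python. This is a sketch under assumptions: build_demo_command and run_demo are hypothetical helpers, not part of the repository, and they use only the flags shown in the command above. It must be run from the repository root so that demo_textboxPP.py is found.

```python
import os
import shlex
import subprocess
import sys


def build_demo_command(input_root, output_root, weight_path,
                       scm_config, scm_weight_path, dataset="icdar"):
    """Assemble the argument list for demo_textboxPP.py (flags as in the README)."""
    return [
        sys.executable, "demo_textboxPP.py",
        "--input-root", input_root,
        "--output-root", output_root,
        "--sub-res",
        "--dataset", dataset,
        "--weight-path", weight_path,
        "--scm-config", scm_config,
        "--scm-weight-path", scm_weight_path,
    ]


def run_demo(command, gpu="0"):
    """Run the command with CUDA_VISIBLE_DEVICES set, raising if it fails."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    return subprocess.run(command, env=env, check=True)


if __name__ == "__main__":
    # Placeholder paths, copied from the README command; replace before running.
    command = build_demo_command(
        "path-to-test-dataset", "path-to-save-result",
        "path-to-embedding-model", "path-to-scm-config", "path-to-scm-model",
    )
    print(" ".join(shlex.quote(arg) for arg in command))
    # run_demo(command)  # uncomment once the paths point at real files
```

Passing the arguments as a list avoids shell quoting problems when a dataset path contains spaces.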

Training

SCM

# download the pre-trained model
cd VideoTextSCM/scm/experiments/siammask_sharp
wget http://www.robots.ox.ac.uk/~qwang/SiamMask_VOT.pth

# train the model (run from the repository root)
cd ../../..
CUDA_VISIBLE_DEVICES=0,1,2,3 python train_scm.py --save-dir path-to-save-scm-model --pretrained \
./scm/experiments/siammask_sharp/SiamMask_VOT.pth --config ./scm/experiments/siammask_sharp/config_icdar.json \
--batch 256 --epochs 20

Embedding

Download totaltext_resnet50 from Baidu Drive (download code: p6u3) or Google Drive.

cd db_model && mkdir weights # put totaltext_resnet50 in db_model/weights

# train embedding (run from the repository root)
cd ..
CUDA_VISIBLE_DEVICES=0 python train_embedding.py --exp_name model-name --batch_size 3 --num_workers 8 --lr 0.0005
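
Both training commands expect downloaded files to be in place before they start. A small pre-flight check can catch a missing download early. This is a sketch, not part of the repository: missing_files is a hypothetical helper, the paths are taken from the commands in this section, and the exact file name of the totaltext_resnet50 download is an assumption.

```python
import os

# Paths relative to the repository root, as used by the commands in this README.
# Assumption: the totaltext_resnet50 download keeps exactly this file name.
REQUIRED_FILES = [
    "scm/experiments/siammask_sharp/SiamMask_VOT.pth",
    "scm/experiments/siammask_sharp/config_icdar.json",
    "db_model/weights/totaltext_resnet50",
]


def missing_files(repo_root, relative_paths=REQUIRED_FILES):
    """Return the entries of relative_paths that do not exist under repo_root."""
    return [
        path for path in relative_paths
        if not os.path.exists(os.path.join(repo_root, path))
    ]


if __name__ == "__main__":
    # Run from the repository root before launching either training script.
    for path in missing_files("."):
        print("missing:", path)
```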

Citing the related works

Please cite the related works in your publications if they help your research:

@article{gao2021video,
  title={Video Text Tracking With a Spatio-Temporal Complementary Model},
  author={Gao, Yuzhe and Li, Xing and Zhang, Jiajian and Zhou, Yu and Jin, Dian and Wang, Jing and Zhu, Shenggao and Bai, Xiang},
  journal={IEEE Transactions on Image Processing},
  volume={30},
  pages={9321--9331},
  year={2021},
  publisher={IEEE}
}
