Overview

The original CRAFT text detector uses an input image size of 384x384. Although CRAFT performs well on scene text detection, that input size is too small for high-resolution tasks, especially documents.

In this CRAFT repository, you can change the input image size at training time to improve model performance.

How to use

Prepare your data

First of all, write your own Dataloader code.

In datasets/craft_dataset.py, you can find CustomDataset.

Make the __getitem__ method of your CustomDataset return image, char_boxes, words, and image_fn. The returned data should follow the format below.

image : np.ndarray

char_boxes : character-level bounding box coordinates.

[   [lx, ly], [rx, ly], [rx, ry], [lx, ry],
    [lx, ly], [rx, ly], [rx, ry], [lx, ry],
    ...]

words : list of words. The characters should appear in the same order as their bounding boxes.
image_fn : image path as a pathlib path
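The char_boxes and words formats above can be sanity-checked with a small helper before training. This is only a sketch, not code from the repository: `box_corners` and `check_sample` are hypothetical names, and it assumes that (lx, ly) is the top-left corner, that (rx, ry) is the bottom-right corner, that each character box is stored as its own group of four points, and that there is exactly one box per character in words.

```python
def box_corners(lx, ly, rx, ry):
    """Build the four corner points of one character box in the order
    shown above: (lx, ly), (rx, ly), (rx, ry), (lx, ry)."""
    return [[lx, ly], [rx, ly], [rx, ry], [lx, ry]]


def check_sample(char_boxes, words):
    """Check that there is one 4-point box per character in `words`.

    Assumption: boxes are grouped per character, in reading order.
    """
    n_chars = sum(len(word) for word in words)
    if len(char_boxes) != n_chars:
        raise ValueError(
            f"expected {n_chars} character boxes, got {len(char_boxes)}"
        )
    for box in char_boxes:
        if len(box) != 4 or any(len(point) != 2 for point in box):
            raise ValueError(f"malformed box: {box!r}")
    return True


# Two words, five characters in total, so five boxes are expected.
words = ["Hi", "you"]
char_boxes = [box_corners(10 * i, 0, 10 * i + 8, 12) for i in range(5)]
check_sample(char_boxes, words)
```

If your annotations store the points as one flat list instead, group them in fours before running a check like this.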

Then, change the settings in settings/default.yaml.

train_data_path: <your train data path>
val_data_path: <your validate data path>

These two settings are all you need to edit.

Now you are ready to train your model. However, training may be very slow because of the time spent generating the character and affinity heatmaps.

Heatmap generation is especially slow when training on high-resolution documents.

The same data processing is repeated every epoch, but it does not need to be. So let's preprocess the data once before training starts.

Run preprocess.py as shown below.

python preprocess.py --setting settings/default.yaml --num_workers 16 --batch_size 4
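The reason preprocessing helps can be illustrated with a generic compute-once cache. The sketch below is not how preprocess.py is implemented: `expensive_heatmap` is a hypothetical stand-in for the slow heatmap generation, and `load_or_compute` and the pickle-per-sample layout are assumptions made purely for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def expensive_heatmap(sample_id):
    """Hypothetical stand-in for slow character/affinity heatmap generation."""
    return [[(sample_id + r + c) % 7 for c in range(4)] for r in range(4)]


def load_or_compute(sample_id, cache_dir, compute=expensive_heatmap):
    """Return the cached result for a sample, computing and saving it on a miss."""
    cache_file = Path(cache_dir) / f"{sample_id}.pkl"
    if cache_file.exists():
        with cache_file.open("rb") as f:
            return pickle.load(f)
    result = compute(sample_id)
    with cache_file.open("wb") as f:
        pickle.dump(result, f)
    return result


calls = []  # records every time the slow function actually runs


def counted(sample_id):
    calls.append(sample_id)
    return expensive_heatmap(sample_id)


with tempfile.TemporaryDirectory() as cache_dir:
    for epoch in range(3):
        for sample_id in range(5):
            load_or_compute(sample_id, cache_dir, compute=counted)

# The slow function ran once per sample, not once per sample per epoch.
print(len(calls))  # 5
```

With three epochs over five samples, the slow step runs five times instead of fifteen; the saving grows with the number of epochs.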

Train

python run.py --setting settings/default.yaml --version 0 --num_workers 16 -bs 4 --preprocessed

To monitor training progress, use TensorBoard.

tensorboard --logdir tb_logs --bind_all


YongWookHa/craft-text-detector
