
Re-implementation of CRAFT Text Detection for Japanese Text Detection


lamhoangtung/CRAFT-Japanese

 
 


Re-Implementing CRAFT: Character Region Awareness for Text Detection

Focused on Japanese Text

Objective

Clone the repository

git clone https://github.com/autonise/CRAFT-Remade.git
cd CRAFT-Remade

Option 1: Conda Environment Installation

conda env create -f environment.yml
conda activate craft

Option 2: Pip Installation

pip3 install -r requirements.txt

Running on custom images

  • Put the images inside a folder.
  • Get a pre-trained model from the pre-trained models list below (currently only the strong-supervision model trained on SynthText is available).
  • Run the command
python3 main.py synthesize --model=./model/final_model.pkl --folder=./input
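
Before running the command, it can help to confirm that the folder really contains images. The following is a minimal sketch, not part of the repository: the function name `list_images` and the set of accepted extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of extensions; the repository may accept a different set.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(folder):
    """Return the sorted names of image files found directly inside `folder`."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

Calling `list_images("./input")` on the folder passed to `--folder` returns the files that would be picked up under this assumption; an empty list suggests the path or the file extensions need checking.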

Pre-trained models

Strong Supervision

Weak Supervision

  • Datapile - In Progress

How to train the model from scratch

Strong Supervision on Synthetic dataset

  • Download the model pre-trained on the synthetic dataset here.
  • Otherwise, if you want to train from scratch:
  • Download my generated Japanese SynthText dataset here.
  • Run the command
python3 main.py train_synth
  • To test your model on SynthText, run the command
python3 main.py test_synth --model /path/to/model

Weak Supervision

First, pre-process your dataset

  • The assumed structure of the dataset is
.
├── generated (This folder will contain the weak-supervision intermediate targets)
├── train
│   ├── img_1.jpg
│   ├── img_2.jpg
│   ├── img_3.jpg
│   ├── img_4.jpg
│   ├── img_5.jpg
│   ├── ...
│   └── train_gt.json (This can be generated using the pre_process function described below)
└── test
    ├── img_1.jpg
    ├── img_2.jpg
    ├── img_3.jpg
    ├── img_4.jpg
    ├── img_5.jpg
    ├── ...
    └── test_gt.json (This can be generated using the pre_process function described below)
  • First, convert the Datapile dataset to the OCR-only format using the datapile_to_onmt.py script.
  • To generate the JSON files for Datapile, change the corresponding values in config.py:

'datapile': {
    'train': {
        'target_json_name': 'train_gt.json',
        'base_path': './input/datapile/train/',
    },
    'test': {
        'target_json_name': 'test_gt.json',
        'base_path': './input/datapile/test/',
    },
},
  • Run the command:
python3 main.py pre_process --dataset datapile
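
To check that a split matches the layout described above, a small helper along these lines may be useful. It is a sketch under stated assumptions, not code from the repository: `describe_split` is a hypothetical name, and it assumes the ground-truth JSON sits directly inside `base_path`, as the directory tree suggests.

```python
import os


def describe_split(split_cfg):
    """Summarise one split ('train' or 'test') of the assumed dataset layout."""
    base_path = split_cfg["base_path"]
    # Assumption: the ground-truth JSON lives directly inside base_path.
    json_path = os.path.join(base_path, split_cfg["target_json_name"])
    if os.path.isdir(base_path):
        images = [
            name for name in sorted(os.listdir(base_path))
            if name.lower().endswith((".jpg", ".jpeg", ".png"))
        ]
    else:
        # A missing folder is reported as zero images rather than an error.
        images = []
    return {
        "json_path": json_path,
        "json_exists": os.path.isfile(json_path),
        "num_images": len(images),
    }
```

For example, `describe_split({'target_json_name': 'train_gt.json', 'base_path': './input/datapile/train/'})` reports how many images were found and whether `train_gt.json` has been generated yet.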

Second, train your model with weak supervision

  • Run the command
python3 main.py weak_supervision --model /path/to/strong/supervision/model --iterations <num_of_iterations(20)>
  • This trains the weak-supervision model for the number of iterations you specify.
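
The two weak-supervision steps can also be chained from a script. The sketch below only assembles the commands shown in this README; the helper names are hypothetical, and the default of 20 iterations is taken from the placeholder hint in the command above, on the assumption that it is the suggested value.

```python
import subprocess


def weak_supervision_commands(model_path, iterations=20, dataset="datapile"):
    """Return the two README commands as argument lists, in run order."""
    return [
        # Step 1: generate the ground-truth JSON files for the dataset.
        ["python3", "main.py", "pre_process", "--dataset", dataset],
        # Step 2: start weak supervision from the strong-supervision model.
        ["python3", "main.py", "weak_supervision",
         "--model", model_path, "--iterations", str(iterations)],
    ]


def run_all(commands):
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

`run_all(weak_supervision_commands("/path/to/strong/supervision/model"))` would then run pre-processing followed by training from the repository root.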
