## Annotate Comics/Manga
Download comictextdetector.pt and put it in the data directory.
Run the next block to generate the following annotations for data/examples/AisazuNihaIrarenai-003.jpg:
- AisazuNihaIrarenai-003.txt: YOLO-format bounding boxes of English and Japanese text blocks. Class 0 is English.
- mask-AisazuNihaIrarenai-003.png
- line-AisazuNihaIrarenai-003.txt: ICDAR-format bounding boxes of text lines.

In [1]:
from inference import model2annotations

model_path = r'data/comictextdetector.pt'
img_dir = r'data/examples'                              # can also be a list of directories
save_dir = r'data/examples/annotations'
model2annotations(model_path, img_dir, save_dir, save_json=False)

100%|██████████| 1/1 [00:04<00:00,  4.78s/it]
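
The generated label files can be read back with a few lines of Python. The sketch below is not part of this repository: `yolo_to_pixel_boxes` and `icdar_to_polygons` are hypothetical helpers, and they assume the conventional layouts of the two formats (YOLO: `class cx cy w h`, normalized to the image size; ICDAR: eight comma-separated corner coordinates per line, optionally followed by a transcription). If the files written by `model2annotations` deviate from these layouts, adjust the parsing accordingly.

```python
from typing import List, Tuple


def yolo_to_pixel_boxes(label_text: str, img_w: int, img_h: int
                        ) -> List[Tuple[int, float, float, float, float]]:
    """Convert YOLO label lines to (class_id, x1, y1, x2, y2) in pixels.

    Assumes each line is 'class cx cy w h' with normalized coordinates.
    """
    boxes = []
    for line in label_text.splitlines():
        parts = line.split()
        if len(parts) < 5:          # skip empty or malformed lines
            continue
        cls = int(parts[0])
        cx, cy, w, h = (float(v) for v in parts[1:5])
        x1, y1 = (cx - w / 2) * img_w, (cy - h / 2) * img_h
        x2, y2 = (cx + w / 2) * img_w, (cy + h / 2) * img_h
        boxes.append((cls, x1, y1, x2, y2))
    return boxes


def icdar_to_polygons(label_text: str) -> List[List[Tuple[int, int]]]:
    """Convert ICDAR-style lines to lists of four (x, y) corner points.

    Assumes each line starts with 'x1,y1,x2,y2,x3,y3,x4,y4'.
    """
    polygons = []
    for line in label_text.splitlines():
        fields = line.strip().split(',')
        if len(fields) < 8:         # skip empty or malformed lines
            continue
        coords = [int(float(v)) for v in fields[:8]]
        polygons.append(list(zip(coords[0::2], coords[1::2])))
    return polygons
```

To use it, read a label file (for example AisazuNihaIrarenai-003.txt in the save directory) into a string and pass it together with the image width and height.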


## Generate synthetic data
- The current rendering script does not handle characters that are missing from the fonts.
- Please use images that contain no text.
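
Because missing glyphs are not handled, it may help to filter the text corpus against a font before rendering. The following is a minimal sketch under stated assumptions, not part of this repository: `missing_chars`, `filter_renderable`, and `load_supported_codepoints` are hypothetical helpers, and the last one relies on the third-party fontTools package (`TTFont.getBestCmap`) to read the code points a font covers.

```python
from typing import Iterable, List, Set


def missing_chars(text: str, supported: Set[int]) -> Set[str]:
    """Return the non-whitespace characters of `text` the font cannot render."""
    return {ch for ch in text if not ch.isspace() and ord(ch) not in supported}


def filter_renderable(lines: Iterable[str], supported: Set[int]) -> List[str]:
    """Keep only the lines whose characters are all covered by the font."""
    return [ln for ln in lines if not missing_chars(ln, supported)]


def load_supported_codepoints(font_path: str) -> Set[int]:
    """Read the Unicode code points covered by a font file.

    Requires fontTools (pip install fonttools). Font collections (.ttc)
    would additionally need a fontNumber argument to TTFont.
    """
    from fontTools.ttLib import TTFont  # imported lazily: optional dependency

    font = TTFont(font_path)
    cmap = font.getBestCmap() or {}     # maps code point -> glyph name
    return set(cmap.keys())
```

A typical use would be to call `load_supported_codepoints` once per font and then pass the candidate text lines through `filter_renderable` before sampling them.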

In [1]:
from text_rendering import ComicTextSampler, render_comictext, ALIGN_LEFT, ALIGN_CENTER
import copy

ja_sampler_dict = {
                'num_txtblk': 20,
                'font': {
                        'font_dir': 'data/examples/fonts',   # font file directory
                        'font_statics': 'data/font_statics_en.csv',     # font list file; create your own list and ignore the last two columns
                        'num': 1200,     # only the first `num` fonts of the font list will be used

                        # params to mimic comic/manga text style
                        'size': {'value': [0.02, 0.03, 0.15],
                                'prob': [1, 0.4, 0.15]},
                        'stroke_width': {'value': [0, 0.1, 0.15],
                                        'prob': [1, 0.5, 0.2]},
                        'color': {'value': ['black', 'white', 'random'],
                                'prob': [1, 1, 0.4]},
                },
                'text': {
                        'lang': 'ja',   # render Japanese; use 'en' for English
                        'orientation': {'value': [1, 0],    # 1 is vertical text.
                                                'prob': [1, 0.3]},
                        'rotation': {'value': [0, 30, 60],
                                                'prob': [1, 0.3, 0.1]},
                        'num_lines': {'value': [0.15],
                                'prob': [1]}, 
                        'length': {'value': [0.3],
                                'prob': [1]},
                        'min_num_lines': 1,
                        'min_length': 3,
                        'alignment': {'value': [ALIGN_LEFT, ALIGN_CENTER],
                                'prob': [0.3, 1]}
                }
        }

jp_cts = ComicTextSampler((845, 1280), ja_sampler_dict, seed=0)
eng_dict = copy.deepcopy(ja_sampler_dict)
eng_dict['text']['lang'] = 'en'
eng_dict['text']['orientation'] = {'value': [1, 0],
                                'prob': [0, 1]}
eng_cts = ComicTextSampler((845, 1280), eng_dict, seed=0)

img_dir = r'data/examples'
save_dir = r'data/examples/annotations'
 
render_comictext([eng_cts, jp_cts], img_dir, save_dir=save_dir, save_prefix=None, render_num=10, label_dir=None, show=False)

100%|██████████| 10/10 [00:12<00:00,  1.23s/it]
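
Each style option in the sampler dictionary pairs a `value` list with a `prob` list, so a mismatch in their lengths is an easy mistake to make when editing the configuration. The sketch below is a hypothetical sanity check, not part of `text_rendering`: it only assumes that the two lists are meant to line up one-to-one and that probabilities are non-negative, and it makes no assumption about how `ComicTextSampler` interprets the numbers.

```python
from typing import Any, List


def find_mismatched_options(cfg: Any, path: str = "") -> List[str]:
    """Walk a nested config and report value/prob pairs that do not line up."""
    problems: List[str] = []
    if isinstance(cfg, dict):
        if "value" in cfg and "prob" in cfg:
            values, probs = cfg["value"], cfg["prob"]
            if len(values) != len(probs):
                problems.append(
                    f"{path}: {len(values)} values vs {len(probs)} probs")
            elif any(p < 0 for p in probs):
                problems.append(f"{path}: negative prob")
        else:
            # Not an option leaf: descend into every nested entry.
            for key, sub in cfg.items():
                sub_path = f"{path}/{key}" if path else str(key)
                problems.extend(find_mismatched_options(sub, sub_path))
    return problems
```

Calling `find_mismatched_options(ja_sampler_dict)` before constructing the samplers returns an empty list when every option is consistent, and otherwise a list of the offending keys.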


## Training
### Train Text Block Detector
Train YOLOv5s using the official YOLOv5 repository. Assuming the trained model is saved as 'yolov5sblk.pt', go to the root directory of the YOLOv5 repository and run the following code.

``` python
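import torch

# Run from the YOLOv5 root so the pickled model classes can be imported.
m = torch.load('yolov5sblk.pt')['model']
# Keep only the model config and the weights.
save_dict = {
    'cfg': m.yaml,
    'weights': m.state_dict()
}
torch.save(save_dict, 'yolov5sblk.ckpt')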
```
### Train Text Segmentation Head
1. Put yolov5sblk.ckpt into the data directory.
2. Refer to train_seg.py for further details.  

### Train DBHead
Please refer to train_db.py.


## Concatenate weights & export as ONNX

In [None]:
import torch
from utils.export import *

# Merge the three trained checkpoints into a single detector file.
concate_models('data/yolov5sblk.ckpt', 'data/unet_best.ckpt', 'data/db_best.ckpt', 'data/textdetector.pt')

batch_size, imgsz = 1, 1024
device = 'cpu'    # the export runs on CPU
im = torch.zeros(batch_size, 3, imgsz, imgsz).to(device)    # dummy input for the export
model_path = r'data/textdetector.pt'
model = TextDetBase(model_path, device=device).to(device)
export_onnx(model, im, model_path, 11)