## Training Detection Model

This notebook fine-tunes a model to detect speech bubbles, text, and characters in manga panels.
Manga pages combine text and illustrations in distinctive layouts and art styles,
so a text detection model trained on natural images does not perform well on them.

### Prepare datasets

In [None]:
from google.colab import drive
drive.mount('/content/drive')

!git clone https://github.com/mayocream/tschan
# acquire the dataset from http://www.manga109.org/ja/download.html
!unzip /content/drive/MyDrive/ML/Manga109.zip -d tschan/datasets/

### Preprocessing datasets

In [None]:
%cd tschan/datasets
!mv Manga109_released_2021_12_30 manga109 # keep name consistent with scripts/convert_manga109_to_coco.py

%pip install manga109api supervision
!python scripts/convert_manga109_to_coco.py \
    --manga109_root_dir manga109 \
    --dataset_version v2021.12.31 \
    --label_filename_prefix manga109_coco \
    --add_manga109_info

# Convert the COCO annotations to YOLO format
!python scripts/convert_coco_to_yolo.py
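
The COCO-to-YOLO step above changes how each bounding box is encoded. The sketch below illustrates the general conversion only; it is not the code in `scripts/convert_coco_to_yolo.py`, and the function names `coco_bbox_to_yolo` and `yolo_label_line` are hypothetical. It assumes the standard conventions: COCO stores boxes as `[x_min, y_min, width, height]` in pixels, while YOLO label files store `class x_center y_center width height` with coordinates normalized by the image size.

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] (pixels) to a
    YOLO box (x_center, y_center, width, height) normalized to [0, 1]."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_w
    y_center = (y_min + h / 2) / img_h
    return x_center, y_center, w / img_w, h / img_h


def yolo_label_line(class_id, bbox, img_w, img_h):
    """Format one line of a YOLO label file (hypothetical helper).

    class_id must already be a zero-based YOLO class index.
    """
    xc, yc, w, h = coco_bbox_to_yolo(bbox, img_w, img_h)
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"


# A 50x100 px box with its top-left corner at (100, 200) in a 1000x1000 px image
print(yolo_label_line(0, [100, 200, 50, 100], 1000, 1000))
# 0 0.125000 0.250000 0.050000 0.100000
```

YOLO class indices start at 0, whereas COCO category ids often start at 1, so a conversion typically needs to remap ids as well.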

### Training

In [None]:
%pip install ultralytics

from ultralytics import YOLO

# Load a model
model = YOLO('yolov8n.pt')  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data='datasets/manga109/yolo_format/manga109.yaml', epochs=100, verbose=True)

### Inference

In [None]:
from ultralytics import YOLO

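# Load the fine-tuned weights produced by the training run.
# Ultralytics saves a first detection run under runs/detect/train by default;
# adjust the path to match the save directory reported by your training run.
model = YOLO('runs/detect/train/weights/best.pt')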

model.predict('a.jpg', save=True, imgsz=640, conf=0.5)
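
`model.predict` returns a list with one `Results` object per input image. A minimal sketch of turning one such object into plain Python data might look like the following. The helper name `summarize_detections` is hypothetical; it relies on the `boxes.xyxy`, `boxes.conf`, `boxes.cls`, and `names` attributes of Ultralytics `Results`, and the class labels it reports depend on whatever `manga109.yaml` defines.

```python
def summarize_detections(result):
    """Return one dict per detected box from a single Ultralytics Results object.

    Hypothetical helper: reads only result.boxes (xyxy, conf, cls) and result.names.
    """
    boxes = result.boxes
    detections = []
    for xyxy, conf, cls_id in zip(
        boxes.xyxy.tolist(), boxes.conf.tolist(), boxes.cls.tolist()
    ):
        detections.append(
            {
                "label": result.names[int(cls_id)],  # class id -> class name
                "confidence": conf,
                "box_xyxy": xyxy,  # [x_min, y_min, x_max, y_max] in pixels
            }
        )
    return detections
```

For example, `summarize_detections(model.predict('a.jpg', conf=0.5)[0])` would list the detections for the single image passed in above.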