<a href="https://colab.research.google.com/github/HidekiAI/ML-manga109-OCR/blob/trunk/Untitled0.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


First two are essential, but not necessarily needed for both CoLab and local Jupyter-notebook. But without these, when you crash or restart, you cannot skip it... For CoLab, you must first make sure remote drive is mounted. To align BASH and Python scripts to work on multiple platform, for local, you'd need to either soft-link (or junction) and/or mount (i.e. `mount bind`).

Note that below is ONLY necessary for Google CoLab to access your Google Drive. If on Notepad/Jupyter, do the following instead (not exact, just the example):

-   Linux: make sure to `ln -sv ~/Google/MyDrive /content/drive` to softlink your Google G-Drive as `/content/drive`
-   Windows: From DOS Command Prompt (right clock to launch as Admin) `mklink.exe /D "C:/content/drive" "C:/Users/HidekiAI/Google/MyDrive/"` to create a dir-junction


In [None]:
#!/usr/bin/python
# No need to execute this if running locally, this is only for Google CoLab usage
from google.colab import drive
drive.mount('/content/drive')

## Constants

Where are my data, where do I save my trained data and progress


In [19]:
#!/usr/bin/python
import os

global train_path, val_path, test_path, data_yaml_path

train_path = '/content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/images/train'
val_path = '/content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/images/val'
test_path = '/content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/images/test'
data_yaml_path = '/content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/'
data_yaml_file_path = os.path.join(data_yaml_path, 'data.yaml')

# print CURRENT directory:
print(f"Current directory: {os.getcwd()}")

# if cwd starts with "C:\", then we are in Windows, so switch paths
if os.getcwd().startswith("C:\\"):
    train_path = '../../data/images/train'
    val_path = '../../data/images/val'
    test_path = '../../data/images/test'
    data_yaml_path = './data/'
    data_yaml_file_path = os.path.join(data_yaml_path, 'data.yaml')

# validate paths exist
if not os.path.exists(train_path):
    print(f"Train path {train_path} does not exist")
if not os.path.exists(val_path):
    print(f"Validation path {val_path} does not exist")
if not os.path.exists(test_path):
    print(f"Test path {test_path} does not exist")
if not os.path.exists(data_yaml_path):
    print(f"Data yaml path {data_yaml_path} does not exist")

Current directory: c:\Users\HidekiAI\projects\remote\github\mine\hidekiai\ML-manga109-OCR\training\text_detection


## Libs

-   Ultralitics YOLO


In [1]:
#!/bin/bash
%pip install ultralytics

Collecting ultralytics
  Downloading ultralytics-8.2.21-py3-none-any.whl.metadata (40 kB)
     ---------------------------------------- 0.0/40.7 kB ? eta -:--:--
     -------------------------------------- 40.7/40.7 kB 951.7 kB/s eta 0:00:00
Collecting thop>=0.1.1 (from ultralytics)
  Downloading thop-0.1.1.post2209072238-py3-none-any.whl.metadata (2.7 kB)
Downloading ultralytics-8.2.21-py3-none-any.whl (777 kB)
   ---------------------------------------- 0.0/777.9 kB ? eta -:--:--
   ---------------- ---------------------- 327.7/777.9 kB 10.2 MB/s eta 0:00:01
   --------------------------------------  768.0/777.9 kB 12.0 MB/s eta 0:00:01
   ---------------------------------------- 777.9/777.9 kB 8.2 MB/s eta 0:00:00
Downloading thop-0.1.1.post2209072238-py3-none-any.whl (15 kB)
Installing collected packages: thop, ultralytics
Successfully installed thop-0.1.1.post2209072238 ultralytics-8.2.21
Note: you may need to restart the kernel to use updated packages.


## data.yaml

YAML config for YOLO; note that because it's YAML file, it's not based on env-vars or globals


In [14]:
#!/usr/bin/python

data_yaml_content = f"""
train: {train_path}
val: {val_path}
test: {test_path}

nc: 1  # number of classes
names: ['text']  # class names
"""

with open(data_yaml_file_path, 'w') as f:
    f.write(data_yaml_content)

# verify file now exists:
if not os.path.exists(data_yaml_file_path):
    print(f"Data yaml file {data_yaml_file_path} does not exist")

# Dump yaml content to verify, by reading it back
with open(data_yaml_file_path, 'r') as f:
    print(f.read())


train: /content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/images/train
val: /content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/images/val
test: /content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/images/test

nc: 1  # number of classes
names: ['text']  # class names



## Training

```bash
yolo detect train data=data.yaml epochs=50 imgsz=640
```

Usage of Model size:

- `yolov8n.pt` (Nano): The smallest model, optimized for speed and efficiency on resource-constrained devices. It has the least number of parameters and computational complexity, making it fast but less accurate.
- `yolov8s.pt` (Small): A small model that offers a good balance between speed and accuracy. Suitable for scenarios where both performance and accuracy are important but resource usage needs to be moderate.
- `yolov8m.pt` (Medium): A medium-sized model that improves accuracy over the small model but at the cost of additional computational resources and slower inference times.
- `yolov8l.pt` (Large): A larger model with more parameters and higher computational requirements, offering higher accuracy but slower inference times.
- `yolov8x.pt` (Extra Large): The largest model with the highest number of parameters and computational requirements. It provides the best accuracy but is the slowest in terms of inference speed.


In [25]:
#!/usr/bin/python

from ultralytics import YOLO

# print current directory
print(f"Current directory: {os.getcwd()}")

# Load the YOLO model
model = YOLO('yolov8s.pt')  # see MD coment above for other versions

# dump data.yaml content
with open(data_yaml_file_path, 'r') as f:
    print(f.read())

# Train the model
model.train(data=data_yaml_file_path, epochs=50, imgsz=640)

Current directory: c:\Users\HidekiAI\projects\remote\github\mine\hidekiai\ML-manga109-OCR\training\text_detection

train: /content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/images/train
val: /content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/images/val
test: /content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/images/test

nc: 1  # number of classes
names: ['text']  # class names

Ultralytics YOLOv8.2.21  Python-3.11.7 torch-2.3.0 CPU (Intel Core(TM) i7-7820HQ 2.90GHz)
[34m[1mengine\trainer: [0mtask=detect, mode=train, model=yolov8s.pt, data=/content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/data.yaml, epochs=50, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=train9, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False,

RuntimeError: Dataset '/content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/data.yaml' error  
Dataset '/content/drive/MyDrive/projects/ML-manga-ocr-rust/data/text_detection/data.yaml' images not found , missing path 'G:\content\drive\MyDrive\projects\ML-manga-ocr-rust\data\text_detection\images\val'
Note dataset download directory is 'C:\Users\HidekiAI\projects\remote\github\mine\hidekiai\ML-manga109-OCR\training\text_detection\datasets'. You can update this in 'C:\Users\HidekiAI\AppData\Roaming\Ultralytics\settings.yaml'