In [1]:
import os
from ultralytics import YOLO
import matplotlib.pyplot as plt

## Available Tasks Supported by YOLOv8:

1. **Detection**:
   - Objective: Detect objects in an image or video frame and draw bounding boxes around them.
   - Use Case: Object detection in various applications such as surveillance, autonomous vehicles, and robotics.

2. **Segmentation**:
   - Objective: Segment an image into different regions based on its content and assign labels to each region.
   - Use Case: Image segmentation, medical imaging, and understanding object boundaries.

3. **Classification**:
   - Objective: Classify an image into different categories based on its content.
   - Use Case: Image classification in various applications such as object recognition, content filtering, and visual search.

4. **Pose/Keypoint Detection**:
   - Objective: Detect specific points (keypoints) in an image or video frame for tracking movement or pose estimation.
   - Use Case: Human pose estimation, motion tracking, and gesture recognition.

Note: YOLOv8 uses one architecture family for all of these tasks: a shared backbone combined with a task-specific head for detection, segmentation, classification, or pose. It does not rely on separate networks such as U-Net or EfficientNet.
### Models Supported by YOLOv8:
The following models are available in YOLOv8 and are suitable starting points for various tasks:

<div align=center>

| Model Type   | Pre-trained Weights                            | Task                |
|--------------|------------------------------------------------|---------------------|
| YOLOv8       | yolov8n.pt, yolov8s.pt, yolov8m.pt, yolov8l.pt, yolov8x.pt       | Detection           |
| YOLOv8-seg   | yolov8n-seg.pt, yolov8s-seg.pt, yolov8m-seg.pt, yolov8l-seg.pt, yolov8x-seg.pt   | Instance Segmentation |
| YOLOv8-pose  | yolov8n-pose.pt, yolov8s-pose.pt, yolov8m-pose.pt, yolov8l-pose.pt, yolov8x-pose.pt, yolov8x-pose-p6.pt | Pose/Keypoints      |
| YOLOv8-cls   | yolov8n-cls.pt, yolov8s-cls.pt, yolov8m-cls.pt, yolov8l-cls.pt, yolov8x-cls.pt     | Classification      |

</div>
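
The table follows a regular naming pattern: a size letter (`n`, `s`, `m`, `l`, `x`) plus an optional task suffix. As a minimal sketch, the helper below builds a checkpoint name from those two parts. `weights_name` is a hypothetical convenience function, not part of the Ultralytics API, and it does not cover special variants such as the `-p6` pose model.

```python
# Hypothetical helper (not part of Ultralytics): builds a checkpoint file name
# from the size letter and task, following the pattern in the table above.
SIZES = ("n", "s", "m", "l", "x")
TASK_SUFFIX = {"detect": "", "segment": "-seg", "pose": "-pose", "classify": "-cls"}


def weights_name(size: str, task: str = "detect") -> str:
    """Return e.g. 'yolov8n.pt' or 'yolov8m-seg.pt'."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}, expected one of {SIZES}")
    if task not in TASK_SUFFIX:
        raise ValueError(f"unknown task {task!r}, expected one of {tuple(TASK_SUFFIX)}")
    return f"yolov8{size}{TASK_SUFFIX[task]}.pt"


print(weights_name("n"))             # detection, nano
print(weights_name("m", "segment"))  # instance segmentation, medium
```

The returned string can be used as the `model` value in the settings cell below.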

## Model and project settings

In [2]:
model    = 'yolov8n.pt'                                    # Initial weights path. Can also be a .yaml defining a model, e.g. yolov8m.yaml
task     = 'detect'                                        # 'detect', 'segment', 'classify', 'pose'
project  = 'yolov8/'                                       # Project folder
name     = 'digit8n_edges'                                 # Model name
exist_ok = True                                            # Overwrite existing project folder
seed     = 0                                               # Seed for training
resume   = False                                           # Resume training from last.pt from project + name
verbose  = False                                           # Print detailed results
plots    = False                                           # Plot training results
project_dir = os.path.join(os.getcwd(), project, name)     # Project directory
device   = '0'                                             # CUDA device, i.e. '0' or '0,1,2,3' or 'cpu'

project_settings = {
    'name': name,
    'project': project,
    'exist_ok': exist_ok,
    'seed': seed,
    'verbose': verbose,
    'resume': resume,
    'plots': plots,
    'device': device,
}

## Available Datasets
These datasets can be used to train or pre-train a YOLOv8 model. They are also listed [here](https://docs.ultralytics.com/datasets/).
<details>

<summary>
Detection
</summary>

| Dataset       | Description                                                      |
|---------------|------------------------------------------------------------------|
| Argoverse     | 3D tracking and motion forecasting data from urban environments. |
| COCO          | Large-scale dataset for object detection, segmentation, and captioning. |
| COCO8         | Subset of COCO train and COCO val for quick tests.               |
| Global Wheat 2020 | Dataset of wheat head images for object detection and localization. |
| Objects365    | High-quality dataset for object detection with 365 categories.   |
| SKU-110K      | Dense object detection dataset in retail environments.           |
| VisDrone      | Dataset with object detection and multi-object tracking from drone-captured imagery. |
| VOC           | Pascal Visual Object Classes dataset for object detection and segmentation. |
| xView         | Dataset for object detection in overhead imagery.                |

</details>

<details>

<summary>
Instance Segmentation
</summary>

| Dataset       | Description                                                      |
|---------------|------------------------------------------------------------------|
| COCO          | Large-scale dataset for object detection, segmentation, and captioning. |
| COCO8-seg     | Subset of COCO with segmentation annotations.                    |


</details>

<details>

<summary>
Pose Estimation
</summary>

| Dataset       | Description                                                      |
|---------------|------------------------------------------------------------------|
| COCO          | Large-scale dataset with human pose annotations.                  |
| COCO8-pose    | Subset of COCO with human pose annotations.                       |


</details>

<details>

<summary>
Classification
</summary>

| Dataset       | Description                                                      |
|---------------|------------------------------------------------------------------|
| Caltech 101   | Dataset with images of 101 object categories for classification.  |
| Caltech 256   | Extended version of Caltech 101 with 256 categories.             |
| CIFAR-10      | Dataset of color images in 10 classes.                            |
| CIFAR-100     | Extended version of CIFAR-10 with 100 categories.                 |
| Fashion-MNIST | Dataset with grayscale images of fashion categories.              |
| ImageNet      | Large-scale dataset for object detection and classification.      |
| ImageNet-10   | Subset of ImageNet with 10 categories.                            |
| Imagewoof     | Challenging subset of ImageNet with 10 dog breed categories.      |
| Imagenette    | Smaller subset of ImageNet with 10 easily distinguishable classes.|
| MNIST         | Dataset of grayscale images of handwritten digits.                |


</details>

<details>

<summary>
Multi-Object Tracking
</summary>

| Dataset       | Description                                                      |
|---------------|------------------------------------------------------------------|
| Argoverse     | 3D tracking and motion forecasting data from urban environments. |
| VisDrone      | Dataset with object detection and multi-object tracking from drone-captured imagery. |


</details>



## Constructing a YOLO Custom Dataset
<details>

<summary>
Step-by-step
</summary>

1. Collect and Organize Your Data:
   - Gather images relevant to the object(s) you want to detect.
   - Create separate directories for images and annotation files.

2. Annotate the Images:
   - Use an annotation tool to mark the bounding boxes around objects of interest.
   - Save annotations in a YOLO-compatible format.
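   - Each image gets a text file with one row per object instance: ```<class_id> <x_center> <y_center> <width> <height>```. The four box values are normalized to the range 0 to 1 relative to the image width and height.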
   - This can be done using various tools such as [Roboflow](https://roboflow.com/) **online** or [YoloLabel](https://github.com/developer0hye/Yolo_Label) **offline**.

3. Split the Dataset:
   - Divide your dataset into training, validation, and testing sets.
   - Ensure each set has a representative distribution of classes and object instances.

4. Generate data.yaml file:
   - Contains information about the dataset and paths to the training, validation, and testing sets.
   - Class names are listed in the order of their IDs in ```names```.
   - Number of classes is specified in the ```nc``` field.
   - Example:
   ```yaml
      names:
      - '0'
      - '1'
      - '2'
      - '3'
      - '4'
      - '5'
      - '6'
      - '7'
      - '8'
      - '9'
      nc: 10
      path: /path/to/dataset/ # Path to dataset directory.
      test: test              # Relative to path above.
      train: train            
      val: val               
   ```

</details>
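
To make the annotation format from step 2 concrete, here is a minimal sketch that converts a pixel-space box given by its corners into one YOLO label row. The function name `to_yolo_row` and the example box are illustrative assumptions; annotation tools such as the ones linked above normally produce these rows for you.

```python
def to_yolo_row(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a corner-format pixel box into a normalized YOLO label row."""
    x_center = (x_min + x_max) / 2 / img_w   # box center, as a fraction of image width
    y_center = (y_min + y_max) / 2 / img_h   # box center, as a fraction of image height
    width = (x_max - x_min) / img_w          # box width, as a fraction of image width
    height = (y_max - y_min) / img_h         # box height, as a fraction of image height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Illustrative values: a digit '7' occupying pixels (100, 50) to (200, 150)
# in a 416 x 416 image.
row = to_yolo_row(7, 100, 50, 200, 150, 416, 416)
print(row)
```

One such row per object goes into a `.txt` file that shares its base name with the image.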

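Step 3 (splitting the dataset) can be sketched with the standard library alone. The function below is an assumption about one reasonable way to do it, a seeded random split by file name. It does not stratify by class, so check the class balance of each split afterwards. The file names and fractions are made up for illustration.

```python
import random


def split_dataset(items, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle items reproducibly and split them into train/val/test lists."""
    items = sorted(items)                  # fixed starting order, independent of input order
    random.Random(seed).shuffle(items)     # seeded shuffle for reproducibility
    n_val = int(len(items) * val_frac)
    n_test = int(len(items) * test_frac)
    return {
        "val": items[:n_val],
        "test": items[n_val:n_val + n_test],
        "train": items[n_val + n_test:],
    }


# Hypothetical image names, used only to demonstrate the split sizes.
images = [f"img_{i:04d}.jpg" for i in range(100)]
splits = split_dataset(images)
print({name: len(files) for name, files in splits.items()})
```

Each image's label file must be moved together with the image so that the pairs stay in the same split.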

## Data settings
```python
data_path  = '/home/thjo/Datasets/BolidenDigits/data.yaml' # Path to data.yaml
imgsz      = 416                                           # Image size.
batch_size = -1                                           # Batch size, '-1' uses the largest batch size that fits on the GPU(s)
```

In [3]:
data  = '/home/thjo/Datasets/BolidenDigits/data.yaml'      # Path to data.yaml
imgsz      = 416                                           # Image size.
batch      = 64                                            # Batch size, '-1' uses the largest batch size that fits on the GPU(s)
fraction   = 1.0                                           # Fraction of dataset to use for training (0-1) Useful for debugging to validate code and convergence

data_settings ={
    'data': data,
    'imgsz': imgsz,
    'batch': batch,
    'fraction': fraction
}

## Training settings
The following are a subset of the available training settings. For a complete list, see [here](https://docs.ultralytics.com/modes/train/#arguments).

<details>
<summary>lr0 - Initial Learning Rate</summary>
The lr0 parameter represents the initial learning rate of the model. It determines the step size at the beginning of training, influencing how quickly the model learns from the data. A higher learning rate can lead to faster convergence, but it may also cause instability or overshooting. Conversely, a lower learning rate may result in slower convergence but can lead to more accurate and stable training.

</details>

<details>
<summary>val - Validation during Training</summary>
The val parameter is a boolean flag indicating whether to validate the model on the validation set during training. Validation allows monitoring the model's performance on unseen data and helps in detecting overfitting or underfitting. By evaluating the model's performance on the validation set, you can make informed decisions regarding model selection and hyperparameter tuning.

</details>

<details>
<summary>epochs - Number of Epochs</summary>
The epochs parameter determines the total number of times the model will iterate over the entire training dataset. An epoch represents a complete pass through the entire training data, and each epoch updates the model's parameters based on the optimization algorithm used. Choosing an appropriate number of epochs is important to balance training time and model convergence. Too few epochs may result in an undertrained model, while too many epochs may lead to overfitting.

</details>

<details>
<summary>patience - Early Stopping</summary>
The patience parameter refers to the number of epochs to wait before stopping training if there is no improvement in the validation metric. It enables early stopping, a technique used to prevent overfitting and improve efficiency. If the model's performance on the validation set does not improve for a specified number of epochs (defined by patience), training is stopped early to avoid wasting computational resources on a non-improving model.

</details>
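
The patience logic can be sketched in a few lines. This is a simplified illustration and not the Ultralytics implementation: the class name `EarlyStopper`, the strict "greater than" comparison, and the fitness values are assumptions. In this sketch, a missing validation metric (for example when validation is switched off, as with `val = False` below) can never trigger a stop.

```python
class EarlyStopper:
    """Signal a stop once `patience` epochs pass without a new best fitness."""

    def __init__(self, patience=25):
        self.patience = patience
        self.best_fitness = float("-inf")
        self.best_epoch = 0

    def step(self, epoch, fitness):
        """Record this epoch's fitness; return True if training should stop."""
        if fitness is None:            # no validation metric available this epoch
            return False
        if fitness > self.best_fitness:
            self.best_fitness = fitness
            self.best_epoch = epoch
        return epoch - self.best_epoch >= self.patience


# Made-up fitness values: the best score is reached at epoch 2.
history = [0.50, 0.60, 0.59, 0.58, 0.57, 0.61]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, fitness in enumerate(history, start=1):
    if stopper.step(epoch, fitness):
        stopped_at = epoch
        break
print(stopped_at)
```

With a patience of 3, the loop stops before it ever sees the later improvement to 0.61, which is the trade-off a small patience value makes.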

<details>
<summary>save_period - Model Saving</summary>
The save_period parameter determines how frequently intermediate checkpoints are saved during training. Setting a positive value for save_period means that a checkpoint will be saved every n epochs, which gives you snapshots of the model at regular intervals. Setting save_period to -1 disables these periodic checkpoints. It does not by itself turn off saving of the final weights, which is controlled by the separate save setting (the training log in this notebook shows save=True, and best.pt is loaded after training). The model can also be saved manually at any desired point in the code.

</details>

<details>
<summary>device - Hardware Device</summary>
The device parameter specifies the hardware device to use for training the model. If you have a compatible GPU, you can specify the CUDA device ID (e.g., '0') to utilize GPU acceleration, which can significantly speed up the training process. Alternatively, you can set device to 'cpu' for CPU training. Choosing the appropriate device depends on the availability of hardware resources and the size of the dataset.

</details>

<details>
<summary>cache - Image Caching</summary>
The cache parameter determines whether to cache images for faster training. Caching preprocessed images in memory can improve the training speed by reducing disk I/O and preprocessing overhead. However, caching can consume a significant amount of memory, so it is advisable to consider the available memory resources and the size of the dataset before enabling this option.

</details>

<details>
<summary>workers - CPU Workers</summary>
The workers parameter specifies the number of CPU workers to use for data loading during training. Increasing the number of workers enables parallel data loading, which can speed up the training process, especially when there are bottlenecks in data loading and preprocessing. However, the optimal number of workers depends on the available CPU resources and the complexity of data loading operations.

</details>

<details>
<summary>optimizer - Optimization Algorithm</summary>
The optimizer parameter determines the algorithm used to update the model's parameters during training. Options include SGD (Stochastic Gradient Descent), Adam, Adamax, AdamW, NAdam, RAdam, RMSProp, or 'auto' to automatically select an optimizer based on the model architecture and problem. Each optimizer has its own set of hyperparameters that control the learning process. Choosing an appropriate optimizer and its hyperparameters is crucial for achieving good training performance.

</details>
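
The training log later in this notebook reports `AdamW(lr=0.000714, momentum=0.9)` even though `lr0 = 0.005` and `momentum = 0.937` are configured, which suggests that `'auto'` replaces those values. The sketch below reproduces the logged learning rate under an assumed rule: AdamW with a learning rate derived from the number of classes for short runs, SGD otherwise. The formula and the 10,000-iteration threshold are assumptions based on a reading of the Ultralytics source around this version, so verify them against the installed release. `auto_optimizer` is a hypothetical name.

```python
def auto_optimizer(nc: int, iterations: int):
    """Assumed 'auto' rule: return (optimizer name, learning rate, momentum)."""
    lr_fit = round(0.002 * 5 / (4 + nc), 6)   # assumed class-count-based learning rate
    if iterations > 10_000:                   # assumed threshold for long runs
        return ("SGD", 0.01, 0.9)
    return ("AdamW", lr_fit, 0.9)


# 10 digit classes; 100 epochs with 27 batches per epoch, as shown in the training log.
name, lr, momentum = auto_optimizer(nc=10, iterations=100 * 27)
print(name, lr, momentum)
```

If the configured `lr0` and `momentum` should be used as written, naming an explicit optimizer such as `'SGD'` or `'AdamW'` is the safer choice.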

<details>
<summary>cos_lr - Cosine Learning Rate Scheduler</summary>
The cos_lr parameter indicates whether to use a cosine learning rate scheduler. The cosine scheduler gradually reduces the learning rate over the course of training, following the shape of a half cosine wave instead of a straight line.
The learning rate therefore changes slowly at the start and end of training and fastest in the middle, which can help the model converge to a better optimum and may improve performance and generalization.

</details>
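
As a worked example of the schedule's shape, the sketch below computes the learning rate per epoch for the values used in this notebook. It assumes that `lrf` is a fraction of `lr0` (so the final rate is `lr0 * lrf`) and ignores the warmup epochs. The function name `cosine_lr` is an illustration, not an Ultralytics API.

```python
import math


def cosine_lr(epoch, epochs=100, lr0=0.005, lrf=0.0001):
    """Learning rate at `epoch` for a half-cosine decay from lr0 to lr0 * lrf."""
    # The factor goes from 1.0 at epoch 0 down to lrf at the final epoch.
    factor = ((1 - math.cos(epoch * math.pi / epochs)) / 2) * (lrf - 1.0) + 1.0
    return lr0 * factor


for epoch in (0, 25, 50, 75, 100):
    print(epoch, cosine_lr(epoch))
```

The rate at the midpoint is the average of the start and end values, and the steepest decline happens around that point.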


In [4]:

lr0          = 0.005                                      # Initial learning rate
val          = False                                      # Validate on validation set during training
epochs       = 100                                        # Number of epochs
patience     = 25                                         # Stop training after this many epochs without improvement (early stopping)
save_period  = -1                                         # Save model every n epochs, -1 to disable
device       = '0'                                        # Device (cuda device id)
cache        = False                                      # Cache images for faster training
workers      = 8                                          # Number of CPU workers
optimizer    = 'auto'                                     # Optimizer (SGD, Adam, Adamax, AdamW, NAdam, RAdam, RMSProp, auto)
cos_lr       = True                                       # Use cosine learning rate scheduler
lrf          = 0.0001                                     # Final learning rate as a fraction of lr0 (final LR = lr0 * lrf)
momentum     = 0.937                                      # SGD momentum/Adam beta1
weight_decay = 0.0005                                     # optimizer weight decay
warmup_epochs= 3                                          # Warmup epochs (fractions ok)
warmup_momentum = 0.8                                     # Warmup initial momentum

training_settings = {
    'lr0': lr0,
    'val': val,
    'epochs': epochs,
    'patience': patience,
    'save_period': save_period,
    'device': device,
    'cache': cache,
    'workers': workers,
    'optimizer': optimizer,
    'cos_lr': cos_lr,
    'lrf': lrf,
    'momentum': momentum,
    'weight_decay': weight_decay,
    'warmup_epochs': warmup_epochs,
    'warmup_momentum': warmup_momentum
}


## Augmentations
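YOLOv8 supports a wide range of augmentations. The table below lists the parameters used in this notebook with their default values; the cell that follows overrides `degrees` (10.0) and `fliplr` (0.0). For a complete list, see [here](https://docs.ultralytics.com/usage/cfg/#augmentation).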

<div align=center>


| Parameter    | Value  | Description                                 |
|--------------|--------|---------------------------------------------|
| hsv_h        | 0.015  | Image HSV-Hue augmentation (fraction)        |
| hsv_s        | 0.7    | Image HSV-Saturation augmentation (fraction) |
| hsv_v        | 0.4    | Image HSV-Value augmentation (fraction)      |
| degrees      | 0.0    | Image rotation (+/- deg)                     |
| translate    | 0.1    | Image translation (+/- fraction)             |
| scale        | 0.5    | Image scale (+/- gain)                       |
| shear        | 0.0    | Image shear (+/- deg)                        |
| perspective  | 0.0    | Image perspective (+/- fraction)             |
| flipud       | 0.0    | Image flip up-down (probability)             |
| fliplr       | 0.5    | Image flip left-right (probability)          |
| mosaic       | 1.0    | Image mosaic (probability)                   |
| mixup        | 0.0    | Image mixup (probability)                    |
| copy_paste   | 0.0    | Segment copy-paste (probability)             |


</div>
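
To show how the units in the table can be read, the sketch below draws one random set of geometric parameters per image: "+/-" entries become symmetric ranges and "probability" entries become coin flips. This is an interpretation for illustration only. The exact sampling in Ultralytics may differ, and `sample_geometry` and its output keys are hypothetical names.

```python
import random


def sample_geometry(rng, degrees=0.0, translate=0.1, scale=0.5, fliplr=0.5, flipud=0.0):
    """Draw one random set of geometric augmentation parameters."""
    return {
        "angle_deg": rng.uniform(-degrees, degrees),    # rotation in degrees
        "shift_x": rng.uniform(-translate, translate),  # shift as a fraction of image width
        "shift_y": rng.uniform(-translate, translate),  # shift as a fraction of image height
        "zoom": rng.uniform(1 - scale, 1 + scale),      # scale gain around 1.0
        "flip_lr": rng.random() < fliplr,               # True with probability fliplr
        "flip_ud": rng.random() < flipud,               # True with probability flipud
    }


rng = random.Random(0)
# Values from this notebook's augmentation cell: rotate up to 10 degrees, never flip.
params = sample_geometry(rng, degrees=10.0, fliplr=0.0)
print(params)
```

Disabling the left-right flip is a sensible choice for digits, because a mirrored digit is generally no longer a valid example of its class.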

### Standard Augmentations

In [5]:
augmentations = {
    'hsv_h': 0.015,
    'hsv_s': 0.7,
    'hsv_v': 0.4,
    'degrees': 10.0,
    'translate': 0.1,
    'scale': 0.5,
    'shear': 0.0,
    'perspective': 0.0,
    'flipud': 0.0,
    'fliplr': 0.0,
    'mosaic': 1.0,
    'mixup': 0.0,
    'copy_paste': 0.0
}


In [6]:
kwargs = {
    **project_settings,
    **data_settings,
    **training_settings,
    **augmentations
}
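
When dictionaries are merged with `**`, a key that appears more than once silently takes the value from the last dictionary. In this notebook `device` is defined in both `project_settings` and `training_settings`, so a small check like the sketch below can flag such overlaps before training. `merge_settings` is a hypothetical helper, and the dictionaries here are trimmed stand-ins for the real ones.

```python
from collections import Counter


def merge_settings(*groups):
    """Merge dicts left to right and report keys defined in more than one group."""
    counts = Counter(key for group in groups for key in group)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    merged = {}
    for group in groups:
        merged.update(group)   # later groups win, matching {**a, **b} semantics
    return merged, duplicates


# Trimmed stand-ins for the notebook's settings dictionaries.
project_demo = {"name": "digit8n_edges", "device": "0"}
training_demo = {"lr0": 0.005, "device": "cpu"}

merged, duplicates = merge_settings(project_demo, training_demo)
print(duplicates)
print(merged["device"])
```

Here both definitions of `device` in the notebook are `'0'`, so the overlap is harmless, but the two values can drift apart if only one of them is edited.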

In [7]:
model = YOLO(model=model, task=task)
model.train(**kwargs)

New https://pypi.org/project/ultralytics/8.0.116 available 😃 Update with 'pip install -U ultralytics'
Ultralytics YOLOv8.0.114 🚀 Python-3.9.16 torch-2.0.1+cu117 CUDA:0 (NVIDIA GeForce RTX 3060 Ti, 7973MiB)
yolo/engine/trainer: task=detect, mode=train, model=yolov8n.pt, data=/home/thjo/Datasets/BolidenDigits/data.yaml, epochs=100, patience=25, batch=64, imgsz=416, save=True, save_period=-1, cache=False, device=0, workers=8, project=yolov8/, name=digit8n_edges, exist_ok=True, pretrained=False, optimizer=auto, verbose=False, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=True, close_mosaic=0, resume=False, amp=True, fraction=1.0, profile=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=False, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=False, source=None, show=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, vid_stride=1, line_width=None, visual

AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
AMP: checks passed ✅
train: Scanning /home/thjo/Datasets/BolidenDigits/train.cache... 1703 images, 67 backgrounds, 0 corrupt: 100%|██████████| 1703/1703 [00:00<?, ?it/s]
val: Scanning /home/thjo/Datasets/BolidenDigits/val.cache... 200 images, 9 backgrounds, 0 corrupt: 100%|██████████| 200/200 [00:00<?, ?it/s]
optimizer: AdamW(lr=0.000714, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
Image sizes 416 train, 416 val
Using 8 dataloader workers
Logging results to yolov8/digit8n_edges
Starting training for 100 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
      1/100      3.65G      1.864      3.856      1.691        298        416: 100%|██████████| 27/27 [00:06<00:00,  3.92it/s]


In [None]:
model.load(os.path.join(project_dir, 'weights', 'best.pt'))  # load best model

Transferred 595/595 items from pretrained weights


<ultralytics.yolo.engine.model.YOLO at 0x7f9bd8f5cc10>

In [None]:
val_results = model.val()

Ultralytics YOLOv8.0.114 🚀 Python-3.9.16 torch-2.0.1+cu117 CUDA:0 (NVIDIA GeForce RTX 3060 Ti, 7973MiB)
Model summary (fused): 268 layers, 68133198 parameters, 0 gradients, 257.4 GFLOPs
val: Scanning /home/thjo/Datasets/BolidenDigits/val.cache... 77 images, 0 backgrounds, 0 corrupt: 100%|██████████| 77/77 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 10/10 [00:02<00:00,  4.71it/s]
                   all         77        269      0.952      0.898      0.954      0.783
Speed: 0.2ms preprocess, 22.4ms inference, 0.0ms loss, 0.7ms postprocess per image
Results saved to yolov8/digit8x_extended


In [None]:
model.export(format='onnx', dynamic=True, simplify=True, opset=12)

NameError: name 'model' is not defined

In [None]:
results = model.predict('/home/thjo/Datasets/BolidenDigits/test/', save=True)

Results saved to yolov8/digit8x_extended


In [None]:
model.predict('/home/thjo/Datasets/BolidenDigits/manualDigits/test/', show=True)

[ultralytics.yolo.engine.results.Results object with attributes:
 
 boxes: ultralytics.yolo.engine.results.Boxes object
 keypoints: None
 keys: ['boxes']
 masks: None
 names: {0: '0', 1: '1', 2: '2', 3: '3', 4: '4', 5: '5', 6: '6', 7: '7', 8: '8', 9: '9'}
 orig_img: array([[[122, 106, 124],
         [122, 106, 124],
         [123, 107, 125],
         ...,
         [125, 109, 127],
         [125, 109, 127],
         [125, 109, 127]],
 
        [[122, 106, 124],
         [122, 106, 124],
         [123, 107, 125],
         ...,
         [125, 109, 127],
         [125, 109, 127],
         [125, 109, 127]],
 
        [[122, 106, 124],
         [122, 106, 124],
         [123, 107, 125],
         ...,
         [125, 109, 127],
         [125, 109, 127],
         [125, 109, 127]],
 
        ...,
 
        [[113,  97, 115],
         [113,  97, 115],
         [113,  97, 115],
         ...,
         [114,  98, 116],
         [114,  98, 116],
         [114,  98, 116]],
 
        [[113,  97, 115],
 
