# Thesis: Training an Adapter for Cruise

This notebook documents the workflow for training a YOLO-based adapter model tailored for cruise applications. The process includes dataset preparation, configuration file creation, model training, and result management.

## Install Required Libraries

This step installs the libraries needed for training and evaluation, including the `ultralytics` package, which provides the YOLO implementation used in this workflow.

In [None]:
!pip install -q ultralytics
from ultralytics import YOLO

## Create YAML Configuration for Training

This section describes how to automatically generate a `data.yaml` configuration file required for YOLO training. The script reads class names from `classes.txt`, sets up dataset paths, and writes the configuration in YAML format.

In [None]:
# Python function to automatically create data.yaml config file
# 1. Reads "classes.txt" file to get list of class names
# 2. Creates data dictionary with correct paths to folders, number of classes, and names of classes
# 3. Writes data in YAML format to data.yaml

import yaml
import os

def create_data_yaml(path_to_classes_txt, path_to_data_yaml):

  # Read classes.txt to get class names
  if not os.path.exists(path_to_classes_txt):
    print(f'classes.txt file not found! Please create a classes.txt labelmap and move it to {path_to_classes_txt}')
    return
  with open(path_to_classes_txt, 'r') as f:
    # Keep non-empty lines only, stripped of surrounding whitespace
    classes = [line.strip() for line in f if line.strip()]
  number_of_classes = len(classes)

  # Create data dictionary
  data = {
      'path': '/kaggle/input/cocov8',
      'train': 'train/images',
      'val': 'validation/images',
      'nc': number_of_classes,
      'names': classes
  }

  # Write data to YAML file
  with open(path_to_data_yaml, 'w') as f:
    yaml.dump(data, f, sort_keys=False)
  print(f'Created config file at {path_to_data_yaml}')

  return

# Define path to classes.txt and run function
path_to_classes_txt = '/kaggle/input/cocov8/classes.txt'
path_to_data_yaml = 'data.yaml'

create_data_yaml(path_to_classes_txt, path_to_data_yaml)

print('\nFile contents:\n')

The next cell displays the contents of the generated `data.yaml` configuration file, which defines the dataset paths and class names for YOLO training.

In [None]:
!cat data.yaml
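
Before training, it can help to sanity-check the generated configuration. The sketch below is not part of the original workflow: the helper name `check_data_config` is hypothetical, and it assumes only the five keys written by `create_data_yaml` (`path`, `train`, `val`, `nc`, `names`). It takes the already-parsed dictionary and returns a list of problems instead of raising.

```python
import os

def check_data_config(data):
    """Return a list of human-readable problems found in a parsed data.yaml dict."""
    problems = []

    # All five keys written by create_data_yaml should be present
    for key in ('path', 'train', 'val', 'nc', 'names'):
        if key not in data:
            problems.append(f'missing key: {key}')

    # nc must match the number of class names
    names = data.get('names') or []
    if 'nc' in data and data['nc'] != len(names):
        problems.append(f"nc={data['nc']} but {len(names)} names listed")

    # Image folders are resolved relative to 'path'
    root = data.get('path', '')
    for split in ('train', 'val'):
        if split in data:
            folder = os.path.join(root, data[split])
            if not os.path.isdir(folder):
                problems.append(f'{split} folder not found: {folder}')

    return problems
```

In the notebook, the dictionary could be obtained with `yaml.safe_load` on the open `data.yaml` file. An empty result would mean the keys, class count, and folders are consistent.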

## Start YOLO Model Training

The model and training parameters are defined in the cell below. Run the next cell to begin training.

In [None]:
# CLI variant, kept for reference (uses yolo11s.pt and patience=10, unlike the Python call below):
# !yolo task=detect mode=train model=yolo11s.pt data=data.yaml epochs=120 imgsz=640 device=0,1 patience=10

# Load pretrained model (better starting point than from scratch)
model = YOLO("yolo11n.pt")  # or "yolov8s.pt" for standard YOLOv8

# Train the model with optimized parameters
results = model.train(
    data="data.yaml",
    epochs=120,
    imgsz=640,
    batch=16,  # Adjust based on your GPU memory
    device=[0,1],  # Use both GPUs
    patience=30,  # Early stopping if no improvement for 30 epochs
    optimizer='auto',  # Let Ultralytics choose the optimizer; 'auto' may also override lr0 and momentum below
    lr0=0.01,  # Initial learning rate
    lrf=0.01,  # Final learning rate as a fraction of lr0 (final LR = lr0 * lrf)
    momentum=0.937,
    weight_decay=0.0005,
    warmup_epochs=3.0,
    warmup_momentum=0.8,
    box=7.5,  # box loss gain
    cls=0.5,  # cls loss gain
    dfl=1.5,  # dfl loss gain
    hsv_h=0.015,  # image HSV-Hue augmentation
    hsv_s=0.7,  # image HSV-Saturation augmentation
    hsv_v=0.4,  # image HSV-Value augmentation
    degrees=0.0,  # image rotation
    translate=0.1,  # image translation
    scale=0.5,  # image scale
    shear=0.0,  # image shear
    perspective=0.0,  # image perspective
    flipud=0.0,  # image flip up-down
    fliplr=0.5,  # image flip left-right
    mosaic=1.0,  # image mosaic
    mixup=0.0,  # image mixup
    copy_paste=0.0  # segment copy-paste
)
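
To make the `lr0`/`lrf` pair concrete: `lrf` is a fraction of `lr0`, not an absolute rate. The sketch below assumes the linear decay that Ultralytics uses when cosine scheduling is not enabled, and it ignores warmup. The function name `linear_lr` is hypothetical, and if `optimizer='auto'` selects a different initial rate, the curve scales accordingly.

```python
def linear_lr(epoch, epochs=120, lr0=0.01, lrf=0.01):
    """Learning rate at a given epoch under linear decay from lr0 to lr0 * lrf."""
    # Multiplier falls linearly from 1.0 at epoch 0 to lrf at the final epoch
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor

# Print the rate at the start, midpoint, and end of training
for epoch in (0, 60, 120):
    print(f'epoch {epoch:3d}: lr = {linear_lr(epoch):.6f}')
```

With the values used above, the rate would end at `0.01 * 0.01`, two orders of magnitude below where it started.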

## Retraining (Fine-Tuning)

This section documents the retraining process for the adapter model. Adjustments to training parameters or data can be made here to further improve model performance based on previous results or new requirements.
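
One way to start a retraining run is from the best checkpoint of a previous run. The sketch below assumes the default Ultralytics output layout (`runs/detect/<run name>/weights/best.pt`). The helper `find_latest_best_weights` is hypothetical and simply picks the most recently modified checkpoint. The commented training call uses illustrative values, not settings from this notebook.

```python
from pathlib import Path

def find_latest_best_weights(runs_dir='runs/detect'):
    """Return the most recently modified best.pt under runs_dir, or None."""
    candidates = list(Path(runs_dir).glob('*/weights/best.pt'))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)

weights = find_latest_best_weights()
if weights is None:
    print('No previous checkpoint found; train from pretrained weights instead.')
else:
    print(f'Retrain from: {weights}')
    # model = YOLO(str(weights))
    # model.train(data='data.yaml', epochs=50, imgsz=640)  # illustrative values
```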

## Copy Training Results to Save Server

This section copies the `runs` directory containing training results to a remote save server over SSH, replacing any earlier copy stored there. This keeps experiment outputs backed up and accessible for further analysis or sharing.

In [None]:
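# Download the SSH private key (expected to be saved as gcp-key) and make it readable by the owner only
!pip install -q gdown
!gdown 'https://drive.google.com/uc?id=1nQ0_w3uG8McFgPxt-kVS1RPW1aKpb2YS'
!chmod 400 /kaggle/working/gcp-key
# Remove the previous copy on the server, then upload the new runs directory
# Note: StrictHostKeyChecking=no skips host key verification
!ssh -i /kaggle/working/gcp-key -o StrictHostKeyChecking=no trung@34.142.148.134 "rm -rf /home/trung/runs"
!scp -i /kaggle/working/gcp-key -o StrictHostKeyChecking=no -r runs trung@34.142.148.134:/home/trung/
!echo "Done!"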
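
A `runs` directory typically holds many small files, so packing it into a single archive before transfer can be convenient. This is a minimal sketch using only the standard library, not a step from the original workflow. The function name `archive_runs` and the archive name `runs_backup` are assumptions and are not used elsewhere in the notebook.

```python
import os
import shutil

def archive_runs(runs_dir='runs', archive_name='runs_backup'):
    """Pack runs_dir into <archive_name>.tar.gz and return the archive path."""
    if not os.path.isdir(runs_dir):
        print(f'{runs_dir} not found; nothing to archive.')
        return None
    runs_abs = os.path.abspath(runs_dir)
    # root_dir/base_dir keep the top-level folder name inside the archive
    return shutil.make_archive(
        archive_name,
        'gztar',
        root_dir=os.path.dirname(runs_abs),
        base_dir=os.path.basename(runs_abs),
    )

print(archive_runs())
```

The resulting `.tar.gz` file could then be transferred with a single `scp` call and unpacked on the server.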