Data Preprocessing || `Model Training` || Streamlit App

---
# Solar Up -- Solar Panel Object Detection and Segmentation

WBS Coding School || Data Science Bootcamp || Final Project

---
<br>

**Use Google's GPU** to run this script to substantially reduce training time.

(Next to "Connection" -> Click arrow -> "Change runtime type" -> choose a GPU, e.g. "T4 GPU")


### Table of Contents
- [1. Download Data](#download)
- [2. Model Training](#training)
- [3. Export Model Weights](#export)

#### Libraries



In [None]:
# ultralytics is the package you need for YOLOv8.
%pip install ultralytics

Collecting ultralytics
  Downloading ultralytics-8.1.28-py3-none-any.whl (721 kB)
Collecting thop>=0.1.1 (from ultralytics)
  Downloading thop-0.1.1.post2209072238-py3-none-any.whl (15 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda

In [None]:
# Update gdown for full functionality.
%pip install --upgrade gdown

Collecting gdown
  Downloading gdown-5.1.0-py3-none-any.whl (17 kB)
Installing collected packages: gdown
  Attempting uninstall: gdown
    Found existing installation: gdown 4.7.3
    Uninstalling gdown-4.7.3:
      Successfully uninstalled gdown-4.7.3
Successfully installed gdown-5.1.0


In [None]:
import os
import shutil
import torch

import ultralytics
from ultralytics import YOLO
from ultralytics import settings

from google.colab import files, drive

___
<a id="download"></a>
## 1.&nbsp; Download Data

We'll download the satellite images and a config yaml file from a Google Drive data dump.

The yaml tells the YOLOv8 model where to find the data and what the classes are. In our case, there's only one class, solar panels.
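
For orientation, a YOLOv8 dataset config usually has the shape sketched below. This is a minimal sketch and not the actual contents of `kasmi_solar_detection.yaml`: the helper `build_dataset_yaml`, the `train/images` and `val/images` sub-folders and the class name `solar_panel` are assumptions chosen for illustration.

```python
def build_dataset_yaml(root: str, class_names: list[str]) -> str:
    """Return the text of a minimal Ultralytics-style dataset config."""
    lines = [
        f"path: {root}  # dataset root directory",
        "train: train/images  # training images, relative to 'path' (assumed layout)",
        "val: val/images  # validation images, relative to 'path' (assumed layout)",
        "names:",
    ]
    # One entry per class: integer class index -> class name.
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Hypothetical single-class config; 'solar_panel' is an assumed class name.
print(build_dataset_yaml("kasmi_solar_detection", ["solar_panel"]))
```

With a single class, the `names` mapping has exactly one entry with index 0.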

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# Download kasmi_solar_detection.yaml config file.
!gdown 1niMiTCjQN1FGxanL-hK8XBU4vgKB5FfT

# Insert the respective file ID here. To get it,
# go to your Google Drive > right-click 'kasmi_solar_detection.yaml' > Share > Copy link.
# The link will look something like:
# https://drive.google.com/file/d/<your_file_ID_here>/

Downloading...
From: https://drive.google.com/uc?id=1niMiTCjQN1FGxanL-hK8XBU4vgKB5FfT
To: /content/kasmi_solar_detection.yaml
  0% 0.00/215 [00:00<?, ?B/s]100% 215/215 [00:00<00:00, 1.24MB/s]
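
If you prefer not to copy the file ID out of the share link by hand, a small helper can extract it. This is a sketch under the assumption that the link has the `/file/d/<id>/` form described in the comments above; `drive_file_id` is a hypothetical helper, and the `/view?usp=sharing` suffix in the example is only a typical ending, not something this notebook depends on.

```python
import re


def drive_file_id(share_link: str) -> str:
    """Extract the file ID from a '.../file/d/<id>/...' Google Drive share link."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", share_link)
    if match is None:
        raise ValueError(f"No file ID found in: {share_link}")
    return match.group(1)


# Example using the config file ID from the cell above.
link = "https://drive.google.com/file/d/1niMiTCjQN1FGxanL-hK8XBU4vgKB5FfT/view?usp=sharing"
print(drive_file_id(link))
```

The returned string is what `gdown` expects as its argument.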


Next, download the zipped satellite images from the same Google Drive data dump.

In [None]:
# Download zipped training data folder.
!gdown 1qMdHqmdVZynqZbgoB9dz1Ni6ZbFtaKCJ

Downloading...
From (original): https://drive.google.com/uc?id=1qMdHqmdVZynqZbgoB9dz1Ni6ZbFtaKCJ
From (redirected): https://drive.google.com/uc?id=1qMdHqmdVZynqZbgoB9dz1Ni6ZbFtaKCJ&confirm=t&uuid=14490323-57ca-48b0-bf5d-49128db4ffe2
To: /content/kasmi_solar_detection.zip
100% 1.57G/1.57G [00:11<00:00, 141MB/s]


In [None]:
# Display current path on Google Colab Machine.
%pwd
# Current path is '/content'.

# YOLOv8 expects the data folder to be inside '/content/datasets/'.
# '-p' avoids the "File exists" error when this cell is re-run.
!mkdir -p /content/datasets/


mkdir: cannot create directory ‘/content/datasets/’: File exists
Archive:  kasmi_solar_detection.zip
checkdir:  cannot create extraction directory: content/datasets/kasmi_solar_detection
           No such file or directory


In [None]:
# Unzip training data to that directory.
!unzip kasmi_solar_detection.zip -d /content/datasets/kasmi_solar_detection

'/content'

Check hardware and directory structure. Make sure a GPU is being used for model training.

In [None]:
ultralytics.checks()

Ultralytics YOLOv8.1.28 🚀 Python-3.10.12 torch-2.2.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
Setup complete ✅ (2 CPUs, 12.7 GB RAM, 33.5/78.2 GB disk)
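
`ultralytics.checks()` covers the hardware part. For the directory structure, a quick check along the following lines may help. It is a sketch that assumes the unzipped dataset follows a `train/` and `val/` layout with `images` and `labels` sub-folders; `missing_dataset_dirs` is a hypothetical helper, not part of Ultralytics.

```python
from pathlib import Path


def missing_dataset_dirs(root: str) -> list[str]:
    """Return the expected sub-folders that do not exist under 'root'."""
    # Assumed layout: <root>/{train,val}/{images,labels}.
    expected = [
        f"{split}/{kind}"
        for split in ("train", "val")
        for kind in ("images", "labels")
    ]
    return [rel for rel in expected if not (Path(root) / rel).is_dir()]


missing = missing_dataset_dirs("/content/datasets/kasmi_solar_detection")
print("Missing folders:", missing or "none")
```

An empty result means all four folders were found; anything else points to a failed download or unzip step.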


In [None]:
# Name of the dataset.
dataset_name = "kasmi_solar"

___
<a id="training"></a>
## 2.&nbsp; Model Training

Here we'll load a pretrained YOLOv8 model (small, medium or large) and train it on the solar panel satellite data. Subsequently, we'll check its performance on the validation set.

In [None]:
# Load a COCO-pretrained YOLOv8m model.
model_name = 'yolov8m'
model = YOLO(f'{model_name}.pt')

Downloading https://github.com/ultralytics/assets/releases/download/v8.1.0/yolov8m.pt to 'yolov8m.pt'...


100%|██████████| 49.7M/49.7M [00:00<00:00, 154MB/s]


In [None]:
# Train the model on our custom dataset.
config_file = f'{dataset_name}_detection.yaml'
epochs = 20
img_size = 400 # the training log reports 416, i.e. the size is rounded up to a multiple of 32
experiment = f'{model_name}_{epochs}'
device = 0 if torch.cuda.is_available() else "cpu" # device agnostic code

results = model.train(data=config_file, epochs=epochs, imgsz=img_size, name=experiment, device=device)

Ultralytics YOLOv8.1.28 🚀 Python-3.10.12 torch-2.2.1+cu121 CUDA:0 (Tesla T4, 15102MiB)
engine/trainer: task=detect, mode=train, model=yolov8m.pt, data=kasmi_solar_detection.yaml, epochs=20, time=None, patience=100, batch=16, imgsz=400, save=True, save_period=-1, cache=False, device=0, workers=8, project=None, name=yolov8m_20, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_c

train: Scanning /content/datasets/kasmi_solar_detection/train/labels.cache... 9312 images, 0 backgrounds, 0 corrupt: 100%|██████████| 9312/9312 [00:00<?, ?it/s]
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))
val: Scanning /content/datasets/kasmi_solar_detection/val/labels.cache... 1996 images, 0 backgrounds, 0 corrupt: 100%|██████████| 1996/1996 [00:00<?, ?it/s]
Plotting labels to runs/detect/yolov8m_20/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.002, momentum=0.9) with parameter groups 77 weight(decay=0.0), 84 weight(decay=0.0005), 83 bias(decay=0.0)
TensorBoard: model graph visualization added ✅
Image sizes 416 train, 416 val
Using 2 dataloader workers
Logging results to runs/detect/yolov8m_20
Starting training for 20 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95)

       1/20      4.06G     0.9542       0.77      1.071         29        416: 100%|██████████| 582/582 [02:59<00:00,  3.24it/s]
                   all       1996       2581      0.873      0.778      0.881      0.669
       2/20      3.93G      1.006     0.7882      1.112         37        416: 100%|██████████| 582/582 [02:45<00:00,  3.52it/s]
                   all       1996       2581       0.86      0.799      0.887      0.698
       3/20      3.93G     0.9702     0.7466        1.1         26        416: 100%|██████████| 582/582 [02:39<00:00,  3.64it/s]
                   all       1996       2581      0.879      0.838      0.916      0.714
       4/20      3.93G     0.9195     0.6871      1.074         37        416: 100%|██████████| 582/582 [02:40<00:00,  3.62it/s]
                   all       1996       2581      0.892      0.875      0.937      0.764
       5/20      3.94G     0.8903     0.6433      1.064         31        416: 100%|██████████| 582/582 [02:43<00:00,  3.57it/s]
                   all       1996       2581       0.93       0.88      0.954      0.786
       6/20      3.94G     0.8614     0.6222      1.051         26        416: 100%|██████████| 582/582 [02:41<00:00,  3.61it/s]
                   all       1996       2581      0.909      0.901      0.955      0.789
       7/20      3.93G     0.8283     0.5758      1.032         22        416: 100%|██████████| 582/582 [02:36<00:00,  3.73it/s]
                   all       1996       2581      0.924      0.919      0.966      0.804
       8/20      3.94G     0.8046     0.5492      1.019         35        416: 100%|██████████| 582/582 [02:40<00:00,  3.62it/s]
                   all       1996       2581      0.932      0.901      0.964      0.819
       9/20      3.94G     0.7863     0.5389      1.011         30        416: 100%|██████████| 582/582 [02:46<00:00,  3.49it/s]
                   all       1996       2581      0.919      0.919      0.966      0.819
      10/20      3.94G     0.7778     0.5275      1.011         36        416: 100%|██████████| 582/582 [02:39<00:00,  3.65it/s]
                   all       1996       2581      0.934      0.914      0.969      0.826

Closing dataloader mosaic
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01), CLAHE(p=0.01, clip_limit=(1, 4.0), tile_grid_size=(8, 8))

      11/20      3.94G     0.6571     0.4508      0.955         19        416: 100%|██████████| 582/582 [02:33<00:00,  3.79it/s]
                   all       1996       2581      0.935      0.908       0.97      0.825
      12/20      3.93G     0.6442     0.4328     0.9426         16        416: 100%|██████████| 582/582 [02:32<00:00,  3.83it/s]
                   all       1996       2581      0.927      0.929      0.972      0.843
      13/20      3.94G     0.6256     0.4132     0.9358         22        416: 100%|██████████| 582/582 [02:32<00:00,  3.82it/s]
                   all       1996       2581      0.954      0.905      0.977      0.838
      14/20      3.94G     0.6129     0.3983     0.9268         19        416: 100%|██████████| 582/582 [02:29<00:00,  3.88it/s]
                   all       1996       2581      0.931      0.933      0.978      0.846
      15/20      3.94G     0.6048     0.3833     0.9225         24        416: 100%|██████████| 582/582 [02:30<00:00,  3.86it/s]
                   all       1996       2581       0.94      0.929      0.977      0.849
      16/20      3.94G     0.5891     0.3658     0.9161         21        416: 100%|██████████| 582/582 [02:31<00:00,  3.84it/s]
                   all       1996       2581      0.947      0.924      0.979      0.855
      17/20      3.94G     0.5743     0.3508     0.9094         18        416: 100%|██████████| 582/582 [02:32<00:00,  3.81it/s]
                   all       1996       2581      0.946      0.936      0.981      0.857
      18/20      3.94G     0.5685     0.3406     0.9063         21        416: 100%|██████████| 582/582 [02:32<00:00,  3.82it/s]
                   all       1996       2581      0.951       0.93      0.981      0.859
      19/20      3.95G     0.5538     0.3244     0.8948         19        416: 100%|██████████| 582/582 [02:32<00:00,  3.82it/s]
                   all       1996       2581      0.943      0.939       0.98      0.861
      20/20      3.94G     0.5526     0.3172     0.8997         20        416:  58%|█████▊    | 337/582 [01:28<01:01,  3.97it/s]
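
The log reports an image size of 416 although `img_size = 400` was requested. A likely explanation is that the image size has to be a multiple of the model's largest stride, so it gets rounded up. The sketch below reproduces that arithmetic, assuming a stride of 32; `round_up_to_stride` is a hypothetical helper, not the library's own function.

```python
import math


def round_up_to_stride(img_size: int, stride: int = 32) -> int:
    """Round an image size up to the nearest multiple of the model stride."""
    return math.ceil(img_size / stride) * stride


print(round_up_to_stride(400))  # 416, matching "Image sizes 416 train, 416 val"
```

Passing a multiple of 32 directly (for example 416) makes the configured and the effective image size identical.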

In [None]:
# Validate the model.
# No arguments needed: the dataset and settings from training are remembered.
metrics = model.val()

# Print each metric explicitly; a notebook cell only displays its last expression.
print(metrics.box.map)    # mAP50-95(B)
print(metrics.box.map50)  # mAP50(B)
print(metrics.box.map75)  # mAP75(B)
print(metrics.box.maps)   # array of mAP50-95(B) values, one per class
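
To keep the headline numbers together, for example for a report, the attributes used above can be collected into a plain dict. This is a minimal sketch: `summarize_box_metrics` is a hypothetical helper, and the stand-in object with made-up values only exists so the example runs without a trained model.

```python
from types import SimpleNamespace


def summarize_box_metrics(metrics) -> dict:
    """Collect the headline box metrics into a plain dict, rounded for display."""
    box = metrics.box
    return {
        "mAP50-95": round(float(box.map), 3),
        "mAP50": round(float(box.map50), 3),
        "mAP75": round(float(box.map75), 3),
        "mAP50-95 per class": [round(float(value), 3) for value in box.maps],
    }


# Stand-in object with made-up numbers (not results from this notebook).
# In the notebook you would pass the 'metrics' object returned by model.val().
fake_metrics = SimpleNamespace(
    box=SimpleNamespace(map=0.5, map50=0.75, map75=0.6, maps=[0.5])
)
print(summarize_box_metrics(fake_metrics))
```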

___
<a id="export"></a>
## 3.&nbsp; Export Model Weights

In [None]:
# Load the custom trained model.
path_to_best = f"runs/detect/{experiment}/weights/best.pt" # .pt = PyTorch model
model = YOLO(path_to_best)

# Export the model. The default format is 'torchscript' (PyTorch).
model.export()
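
To find the exported file afterwards, it helps to derive its path from the weights path. The sketch below assumes the export keeps the file stem and only swaps the suffix to `.torchscript`; `exported_torchscript_path` is a hypothetical helper, so verify the result against the path printed by the export itself.

```python
from pathlib import Path


def exported_torchscript_path(weights_path: str) -> Path:
    """Return the assumed TorchScript path for a given '.pt' weights file."""
    # Assumption: the export keeps the file stem and only swaps the suffix.
    return Path(weights_path).with_suffix(".torchscript")


print(exported_torchscript_path("runs/detect/yolov8m_20/weights/best.pt"))
```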

In [None]:
# Specify the output path for the ZIP archive for the training output.
output_path = f"models/trained_model_{experiment}"
print(output_path)

shutil.make_archive(output_path, 'zip', f"/content/runs/detect/{experiment}")
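
`shutil.make_archive` returns the path of the file it created, including the `.zip` suffix it appends. Reusing that return value is one way to keep the archive name and the later download path in sync. The sketch below demonstrates this on a temporary folder standing in for a run directory; `archive_folder` and the demo file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def archive_folder(source_dir: str, archive_stem: str) -> str:
    """Zip 'source_dir' and return the path of the created '.zip' file."""
    # make_archive appends '.zip' to the stem and returns the final file name.
    return shutil.make_archive(archive_stem, "zip", source_dir)


# Self-contained demo on a temporary folder standing in for a run directory.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "run"
    run_dir.mkdir()
    (run_dir / "results.csv").write_text("epoch,loss\n")
    zip_path = archive_folder(str(run_dir), str(Path(tmp) / "trained_model_demo"))
    print(Path(zip_path).name)  # trained_model_demo.zip
```

In the notebook, the returned path could be passed straight to `files.download`.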

In [None]:
# Download trained model. The export writes 'best.torchscript' next to 'best.pt'.
files.download(f"runs/detect/{experiment}/weights/best.torchscript")

In [None]:
# Download the zipped training output created above.
files.download(f"models/trained_model_{experiment}.zip")

In [None]:
# Specify the output path for the ZIP archive for the evaluation data.
# Note: this zips the same run folder as the training-output archive above.
output_path = f"models/evaluation_data_{experiment}"
print(output_path)

shutil.make_archive(output_path, 'zip', f"/content/runs/detect/{experiment}")

In [None]:
# Download evaluation data.
files.download(f"models/evaluation_data_{experiment}.zip")