# Beet segmentation model training

Date: 16.11.2023  
Authors: Gustav Schimmer & Philipp Friedrich  

**This notebook trains a YOLOv6 model to detect sugar beet plants in images.**  
  
  
Major steps are:
- Downsampling and resizing of images
- Creating a custom dataset from labeled data
- Initializing the YOLOv6 algorithm
- Training the algorithm
- Validating the model
- Running inference with the YOLOv6 model on test data

## Import necessary libraries

In [2]:
import os
import cv2
from google.colab import drive

ModuleNotFoundError: No module named 'cv2'

## Data preparation: Downsampling & Resizing

Before a custom dataset is created, the images are resampled to a lower resolution to reduce the required computing power.

#### Define data paths

In [3]:
# Input data path
input_folder = r'..\beet-segmentation\data\20230514\field_1'

# Output data path
output_folder = r'..\beet-segmentation\data\20230514\field_1_test_img'

#### Write function to resample images

In [None]:
# Resize an image to the target width and height, then cut it into square tiles
def crop_and_resize_image(input_path, output_folder, square_size, target_width, target_height):
    image = cv2.imread(input_path)
    if image is not None:
        # Downscale the image to the target size
        image = cv2.resize(image, (target_width, target_height))
        # Crop into square_size x square_size tiles (partial tiles at the edges are skipped)
        for y in range(0, target_height - square_size + 1, square_size):
            for x in range(0, target_width - square_size + 1, square_size):
                square = image[y:y + square_size, x:x + square_size]
                # Save the tile with a running index
                output_path = os.path.join(output_folder, f"{os.path.splitext(os.path.basename(input_path))[0]}_{y // square_size * (target_width // square_size) + x // square_size}.jpg")
                cv2.imwrite(output_path, square)

#### Image resampling

In [None]:
# Define target image width and height
target_width, target_height = 1500, 2000
square_size = 1024

# Create the output folder if it does not exist
os.makedirs(output_folder, exist_ok=True)

# Iterate over all images in the input folder
for filename in os.listdir(input_folder):
    if filename.endswith(".jpg"):
        input_path = os.path.join(input_folder, filename)
        crop_and_resize_image(input_path, output_folder, square_size, target_width, target_height)

print("Image resampling done.")
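
The number of tiles written per image follows directly from the loop bounds in `crop_and_resize_image`. The sketch below reproduces only that index arithmetic, without reading any image data; `tile_grid` is a hypothetical helper name and not part of the notebook. It suggests that with the settings above (1500 by 2000 pixels, `square_size = 1024`) only the top-left tile of each image is kept, whereas a smaller tile size such as 256 (an illustrative value) would cover most of the image.

```python
def tile_grid(target_width, target_height, square_size):
    """Return (index, x, y) for every full tile, mirroring the loops above."""
    tiles_per_row = target_width // square_size
    tiles = []
    for y in range(0, target_height - square_size + 1, square_size):
        for x in range(0, target_width - square_size + 1, square_size):
            # Same running index as in the output file name
            index = y // square_size * tiles_per_row + x // square_size
            tiles.append((index, x, y))
    return tiles


print(len(tile_grid(1500, 2000, 1024)))  # 1
print(len(tile_grid(1500, 2000, 256)))   # 35
```

With 1024-pixel tiles, roughly a third of each resized image ends up in the output folder; whether that is intended depends on where the plants sit in the frame.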

#### Create training labels

Training the algorithm requires annotated training data. Creating these annotations is often costly and time-consuming. Some of the open source tools available online are:

https://roboflow.com/annotate?ref=blog.roboflow.com

https://blog.roboflow.com/cvat/

https://blog.roboflow.com/labelimg/
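
Annotation tools like these can typically export bounding boxes in the YOLO text format: one `.txt` file per image, one line per object, written as `class_id x_center y_center width height` with all coordinates normalized to the image size. As a minimal sketch, assuming a box given in pixel corner coordinates and a single class `0` for sugar beet (both are illustrative assumptions, as is the helper name `to_yolo_line`), the conversion could look like this:

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_width, img_height):
    """Convert a pixel-coordinate box to one normalized YOLO label line."""
    x_center = (x_min + x_max) / 2 / img_width
    y_center = (y_min + y_max) / 2 / img_height
    width = (x_max - x_min) / img_width
    height = (y_max - y_min) / img_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box on a 1024 x 1024 tile
print(to_yolo_line(0, 256, 128, 768, 640, 1024, 1024))
# 0 0.500000 0.375000 0.500000 0.500000
```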

Volunteered geographic information (VGI) platforms like OpenStreetMap (OSM) provide a promising source of massive, free labels together with rich and detailed semantic information for satellite image analysis. The use of multimodal data for mapping more sophisticated objects in OSM, with the help of its rich semantic information, has been demonstrated previously by tools such as ohsome2label. However, that tool has certain limitations with respect to image size (256 by 256 pixels).

At HeiGIT, we have developed a flexible multimodal dataset creation and annotation tool that combines VGI data and very-high-resolution (VHR) imagery for rapid data generation.

We will be using a simple dataset generated for wastewater treatment plants (WTPs) (at 256 by 256 pixels) using Bing imagery (VHR) and OSM data for labelling individual features of WTPs.

## Create Custom dataset
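
One common step when building a custom dataset is to split the image tiles into training and validation subsets before copying them, together with their label files, into the folder layout the training code expects. The sketch below shows only the split itself. The 80/20 ratio, the fixed seed, the file names, and the helper name `split_dataset` are assumptions for illustration; check the YOLOv6 repository's custom-data guide for the exact directory structure.

```python
import random


def split_dataset(filenames, val_fraction=0.2, seed=42):
    """Shuffle file names reproducibly and return (train_files, val_files)."""
    files = sorted(filenames)            # fixed starting order, independent of os.listdir
    random.Random(seed).shuffle(files)   # seeded shuffle so the split is repeatable
    n_val = int(len(files) * val_fraction)
    return files[n_val:], files[:n_val]


# Hypothetical tile names
tiles = [f"img_{i}.jpg" for i in range(10)]
train_files, val_files = split_dataset(tiles)
print(len(train_files), len(val_files))  # 8 2
```

Splitting by source image rather than by tile may be preferable if several tiles come from the same photo, since neighbouring tiles can look very similar.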

## Mount Drive for working in Google Colab

In [None]:
drive.mount('/content/drive', force_remount=True)
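
The `google.colab` package is only importable inside a Colab runtime, so the import and mount cells fail when the notebook is run locally. A minimal sketch of a guard, assuming the same notebook should run in both environments (`running_in_colab` is a hypothetical helper, not part of the notebook):

```python
def running_in_colab():
    """Return True only when the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401
    except ImportError:
        return False
    return True


if running_in_colab():
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
else:
    print("Not running in Colab; skipping Drive mount.")
```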

In [None]:
HOME = os.getcwd()
print(HOME)

## Initializing YOLOv6 algorithm

In [None]:
# Download MT-YOLOv6 repository and install requirements
!git clone https://github.com/meituan/YOLOv6
%cd YOLOv6

In [None]:
!pip install -r requirements.txt