# Introductory notebook for Omdena Skymaps

Hi everybody! Since most participants want to use Colab, I prepared this notebook. It contains the code from the scripts predict_png.py and train.py in the baseline GitHub repository, as these are the two important ones. Please work your way through it, try to understand the code, and play around with the data. We plan to share specific tasks soon, but first we need everybody to get going. This may mean a slow start for advanced participants, but please be patient: it's also the first Omdena challenge for us at Skymaps, and we are trying our best so that everyone has the chance to learn something and enjoy it!

We assume everyone has at least basic knowledge of Python, Jupyter notebooks / Google Colab, and machine learning. If you do not and feel completely lost, please take a short online Python crash course and learn how to execute Python code. I believe you can still learn something, but it may require a lot of determination on your side.

## Instructions for debugging


I saw many people trying to debug code or asking for help. If you are stuck, first try to find the answer yourself:

*   Search Slack to check whether a similar problem has already been solved
*   Search for the error message on Google

If you still need help, make sure your question contains the following:

*   A clear statement of the problem
*   What you have already done to solve the issue
*   A screenshot of your error message
*   Your execution environment (Colab, your own PC, ...)

This not only helps others solve your problem, but also keeps Slack from getting spammed and, most importantly, you learn the most by searching for the answer yourself!
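To make the "execution environment" part of a help request easy to fill in, here is a minimal sketch that prints a short report you can paste into Slack. The function name `describe_environment` is made up for this example, and checking for `google.colab` in `sys.modules` is only a heuristic, not an official Colab API.

```python
import platform
import sys


def describe_environment() -> str:
    """Return a short, paste-ready description of the execution environment."""
    # Heuristic (an assumption): Colab kernels have the google.colab package loaded.
    in_colab = "google.colab" in sys.modules
    lines = [
        f"Environment: {'Google Colab' if in_colab else 'local machine'}",
        f"Python: {platform.python_version()}",
        f"OS: {platform.system()} {platform.release()}",
    ]
    return "\n".join(lines)


print(describe_environment())
```

Paste the printed lines together with your error message when you ask for help.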

# CODE STARTS HERE

## Before you start working, switch your instance to GPU acceleration!

This can be done via Edit > Notebook settings or Runtime > Change runtime type, selecting GPU as the hardware accelerator.

First we need to set up the environment (output omitted by using %%capture). %%capture must be the first line of its cell; if you still have trouble running cells with %%capture, delete that line

In [None]:
%%capture
# %%capture hides the long install logs; as a cell magic it must be the first line of the cell

!pip install numpy scipy Pillow matplotlib scikit-image scikit-learn 
!pip install opencv-python torch geopandas geojson 
!pip install rasterio rio_tiler torchvision supermercado mercantile gdown
!pip install albumentations==0.5.2
!pip install pytorch-lightning

!pip install mlflow --quiet


Next we install the COCO Python API for evaluation

In [None]:
%%capture
%%shell

pip install cython
# Install pycocotools, the version by default in Colab
# has a bug fixed in https://github.com/cocodataset/cocoapi/pull/354
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

Next, let's create the project folder structure

**Use your own Google Drive to avoid reinitializing everything every time**

In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

In [None]:
!rm -rf sample_data

In [None]:
!mkdir SkyMaps_task4

In [None]:
%cd SkyMaps_task4/
!mkdir data
!mkdir weights

Download the dataset containing training and validation image tiles with annotations (output omitted by using %%capture). The download of the trained baseline Faster R-CNN model weights is commented out in the cell below

In [None]:
%%capture
# %%capture hides the long download logs; as a cell magic it must be the first line of the cell

!gdown https://drive.google.com/uc?id=1nGVhW9mgvonHBsaDn6Q0-yCK8CCCvKX1 -O ./data/complete_dataset_rgb.zip
# !gdown https://drive.google.com/uc?id=1HheG86uj1_RsgfwJ1X1WIsT9Iwh430jY -O ./weights/fasterrcnn_repa.pth
!unzip ./data/complete_dataset_rgb.zip -d ./data

!rm -rf ./data/complete_dataset_rgb.zip

**Clone the code base**

In [None]:
import mlflow
import os
import requests
import datetime
import time
from getpass import getpass

In [None]:
access_token = getpass('Enter your GitHub access token: ')

! git clone https://{access_token}@github.com/OmdenaAI/SkyMaps.git

In [None]:
%cd SkyMaps
! git pull
%ls
! git branch -a
! git branch t4-od-bc-001
! git checkout t4-od-bc-001
! git branch 

In [None]:
%pwd

## Configure MLflow 🧐

**Set Environment Variables**

In [None]:
#@title Enter the repository name for the project:

REPO_NAME= "ai/test" #@param {type:"string"}

In [None]:
#@title Enter the username of your DAGsHub account:

USER_NAME = "bcarrero" #@param {type:"string"}

**Initialize MLflow**

**Set Local Configurations**

Under the [Token tab](https://dagshub.com/user/settings/tokens) in your DagsHub user settings, copy the default token and paste it at the password prompt of the cell below.

Could make it work if 2FA is enabled on DagsHub. 

# When it says "copy the default token and used it here"... Where exactly is here? Is this needed?

In [None]:
os.environ['MLFLOW_TRACKING_USERNAME'] = USER_NAME
os.environ['MLFLOW_TRACKING_PASSWORD'] = getpass('Enter your DAGsHub access token or password: ')
# token = getpass('Enter your DAGsHub access only token: ')

mlflow.set_tracking_uri(f'https://dagshub.com/{USER_NAME}/{REPO_NAME}.mlflow')
# mlflow.set_tracking_uri(f'https://{token}@dagshub.com/{USER_NAME}/{REPO_NAME}.mlflow')

## Predict crop on a single PNG image

The following code detects crops on a single PNG image tile using a trained baseline Faster R-CNN model. First, import the libraries

In [None]:
import cv2
import numpy as np
import torch
import torchvision
from matplotlib import pyplot as plt
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

Now let's define some paths and constants

In [None]:
DIR_INPUT = '/content/SkyMaps_task4/data/complete_dataset'
DIR_TRAIN = f'{DIR_INPUT}/train'
DIR_VAL = f'{DIR_INPUT}/validation'
DIR_TEST = f'{DIR_INPUT}/test'
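# Absolute path: '../SkyMaps_task4/weights' does not resolve to the weights folder once the working directory is the cloned SkyMaps repo
my_model_filepath = '/content/SkyMaps_task4/weights/fasterrcnn_resnet152_fpn.pth'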

# Train your own model

### Now let's train your own model using the same procedure that was used for the baseline model! Let's import some more libraries

In [None]:
import albumentations as A
import pandas as pd
import IPython
import torch
import torchvision
import pytorch_lightning
from albumentations.pytorch.transforms import ToTensorV2
from torch.utils.data import DataLoader
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torch.utils.data import Dataset
from sklearn import preprocessing

In [None]:
import sys
sys.path.append('ml-detection/models')

from l_faster_rcnn import fasterrcnn_resnet152_fpn, fasterrcnn_resnext50_32x4d_fpn, evaluate
from l_Evaluation import COCOEvaluator

### Before we load the training annotations, let's declare a function that converts them to the correct format

In [None]:
def process_annotations(input_df, print_shape=False, ordinal_encoder=None):
    """ Function to process input annotations from Tensorflow object detection format before training
    """

    # Define pandas columns for bounding box position
    input_df = input_df.rename(columns={'filename': 'image_id', 'xmin': 'x', 'ymin': 'y'})
    input_df['w'] = input_df['xmax'] - input_df['x']
    input_df['h'] = input_df['ymax'] - input_df['y']
    input_df = input_df.drop(columns=['xmax', 'ymax'])

    # np.float was removed in NumPy 1.24; the builtin float is the same type (float64)
    input_df['x'] = input_df['x'].astype(float)
    input_df['y'] = input_df['y'].astype(float)
    input_df['w'] = input_df['w'].astype(float)
    input_df['h'] = input_df['h'].astype(float)

    # Encode classes as integers starting at 1 (label 0 is reserved for the background)
    if ordinal_encoder is None:
        ordinal_encoder = preprocessing.LabelEncoder()
        ordinal_encoder.fit(input_df['class'].unique())
    input_df['class'] = ordinal_encoder.transform(input_df['class']) + 1

    image_ids = input_df['image_id'].unique()

    if print_shape:
        print(f'Loaded dataframe of shape {input_df.shape}')

    return input_df, image_ids, ordinal_encoder
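To make the conversion above concrete, here is a small sketch in plain Python (no pandas or scikit-learn) of the same two steps: turning corner coordinates into `x, y, w, h` and mapping class names to integers starting at 1. The helper names `to_xywh` and `encode_classes` and the class names are invented for illustration. Sorting the names mimics the sorted order in which `LabelEncoder` stores its classes.

```python
def to_xywh(xmin, ymin, xmax, ymax):
    """Convert corner coordinates to (x, y, width, height) as floats."""
    return float(xmin), float(ymin), float(xmax - xmin), float(ymax - ymin)


def encode_classes(class_names):
    """Map class names to integers 1..N; 0 stays free for the background."""
    mapping = {name: i + 1 for i, name in enumerate(sorted(set(class_names)))}
    return [mapping[name] for name in class_names], mapping


box = to_xywh(10, 20, 50, 80)
labels, mapping = encode_classes(["weed", "beet", "weed"])  # hypothetical class names

print(box)      # (10.0, 20.0, 40.0, 60.0)
print(labels)   # [2, 1, 2]
print(mapping)  # {'beet': 1, 'weed': 2}
```

Reusing the same mapping for the validation and test annotations is what passing `ordinal_encoder` back into `process_annotations` achieves.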

Now we can define the augmentation functions from the Albumentations library

In [None]:
# Albumentations augmentations functions
def get_train_transform():
    return A.Compose([
        A.Flip(p=0.25),  # pass p by keyword: the first positional argument is always_apply, not p
        A.RandomRotate90(p=0.5),
        A.Transpose(p=0.25),
        A.RandomBrightnessContrast(p=0.2),
        A.RandomGamma(p=0.2),
        A.Blur(p=0.2),
        A.ColorJitter(p=0.2),
        A.Downscale(p=0.2),
        A.ChannelDropout(p=0.2),
        A.ChannelShuffle(p=0.2),
        ToTensorV2(p=1.0)
    ], bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})

In [None]:
# Albumentation validation transform - simple conversion to pytorch Tensor
def get_valid_transform():
    return A.Compose([
        ToTensorV2(p=1.0)
    ], bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})
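The `bbox_params` argument matters because geometric augmentations have to move the bounding boxes together with the pixels. As a rough illustration of the geometry only (this is not how Albumentations implements it internally), the sketch below flips a `pascal_voc` box `(xmin, ymin, xmax, ymax)` horizontally. The helper name `hflip_box` and the image width are assumptions for this example.

```python
def hflip_box(box, image_width):
    """Mirror a (xmin, ymin, xmax, ymax) box around the vertical image axis."""
    xmin, ymin, xmax, ymax = box
    # The old right edge becomes the new left edge and vice versa.
    return (image_width - xmax, ymin, image_width - xmin, ymax)


box = (10, 20, 50, 80)
flipped = hflip_box(box, image_width=256)

print(flipped)                               # (206, 20, 246, 80)
print(hflip_box(flipped, image_width=256))   # (10, 20, 50, 80)
```

Flipping twice returns the original box, which is a handy sanity check when you write your own augmentations.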

Now let's declare three helpers: a PyTorch dataset class for data loading during model training, a class that averages the loss during training, and a collate function for batching

In [None]:
class PlantDataset(Dataset):
    """ Pytorch dataset class customized for Skymaps data loading

    """

    def __init__(self, dataframe, image_dir, transforms=None):
        super().__init__()

        self.image_ids = dataframe['image_id'].unique()
        self.df = dataframe
        self.image_dir = image_dir
        self.transforms = transforms

    def __getitem__(self, index: int):
        image_id = self.image_ids[index]
        records = self.df[self.df['image_id'] == image_id]

        image = cv2.imread(f'{self.image_dir}/{image_id}', cv2.IMREAD_COLOR)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)
        image /= 255.0

        boxes = records[['x', 'y', 'w', 'h']].values
        boxes[:, 2] = boxes[:, 0] + boxes[:, 2]
        boxes[:, 3] = boxes[:, 1] + boxes[:, 3]

        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])
        area = torch.as_tensor(area, dtype=torch.float32)

        # Create a tensor of ones and assign the classes by multiplying it with the class labels
        labels = torch.ones((records.shape[0],), dtype=torch.int64) * torch.tensor(records['class'].values)

        # suppose all instances are not crowd
        iscrowd = torch.zeros((records.shape[0],), dtype=torch.int64)

        target = {}
        target['boxes'] = boxes
        target['labels'] = labels
        # target['masks'] = None
        target['image_id'] = torch.tensor([index])
        target['area'] = area
        target['iscrowd'] = iscrowd

        height, width, channels = image.shape
        target['width'] = torch.tensor([width])
        target['height'] = torch.tensor([height])

        if self.transforms:
            sample = {
                'image': image,
                'bboxes': target['boxes'],
                'labels': labels
            }
            sample = self.transforms(**sample)
            image = sample['image']

            # target['boxes'] = torch.stack(tuple(map(torch.tensor, zip(*sample['bboxes'])))).permute(1, 0)
            target['boxes'] = torch.tensor(sample['bboxes'])

        return image, target, image_id

    def __len__(self) -> int:
        return self.image_ids.shape[0]

In [None]:
class Averager:
    """ Class for averaging loss function during training process

    """

    def __init__(self):
        self.current_total = 0.0
        self.iterations = 0.0

    def send(self, value):
        self.current_total += value
        self.iterations += 1

    @property
    def value(self):
        if self.iterations == 0:
            return 0
        else:
            return 1.0 * self.current_total / self.iterations

    def reset(self):
        self.current_total = 0.0
        self.iterations = 0.0

In [None]:
def collate_fn(batch):
    """ Batch data loading helper function for pytorch
    """
    return tuple(zip(*batch))
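The `collate_fn` above simply regroups a list of per-sample tuples into one tuple per field, without trying to stack anything. That suits object detection, where each image can have a different number of boxes. A toy illustration, with strings and dicts standing in for the real tensors (all values are placeholders):

```python
def collate_fn(batch):
    """Regroup [(image, target, id), ...] into (images, targets, ids)."""
    return tuple(zip(*batch))


# Each sample mimics what PlantDataset.__getitem__ returns: (image, target, image_id)
batch = [
    ("image_0", {"labels": [1]}, "tile_0.jpg"),
    ("image_1", {"labels": [2, 2]}, "tile_1.jpg"),
]

images, targets, image_ids = collate_fn(batch)
print(images)     # ('image_0', 'image_1')
print(targets)    # ({'labels': [1]}, {'labels': [2, 2]})
print(image_ids)  # ('tile_0.jpg', 'tile_1.jpg')
```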

Now we can load the annotations and convert them to the correct input format

In [None]:
train_df = pd.read_csv(f'{DIR_TRAIN}/_annotations.csv')
valid_df = pd.read_csv(f'{DIR_VAL}/_annotations.csv')
test_df = pd.read_csv(f'{DIR_TEST}/_annotations.csv')

print('Loaded annotations before format conversion')
print(train_df.head(5).to_string())

train_df, train_img_ids, ordinal_encoder = process_annotations(train_df, print_shape=True)
valid_df, val_img_ids, _ = process_annotations(valid_df, print_shape=True, ordinal_encoder=ordinal_encoder)
test_df, test_img_ids, _ = process_annotations(test_df, print_shape=True, ordinal_encoder=ordinal_encoder)

print('\nAnnotations formatted')
print(train_df.head(5).to_string())

Initialize a model and provide some basic settings

In [None]:
# Create dictionaries for the detected crops (this is used only if plotting is needed)
class_dict = {k:v for k, v in enumerate(["background"] + list(ordinal_encoder.classes_))}

# Set number of classes
num_classes = len(class_dict) # background + classes

# Set prediction threshold - only instances with higher probability will be assigned as detected
prediction_threshold = 0.3
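To see what `class_dict` and `num_classes` look like, here is a small sketch with made-up class names. The list `encoder_classes` is a stand-in for `ordinal_encoder.classes_`, which holds the class names in sorted order.

```python
# Hypothetical class names; a stand-in for ordinal_encoder.classes_ (sorted order)
encoder_classes = sorted({"weed", "beet", "maize"})

# Index 0 is the background, the real classes start at 1
class_dict = {k: v for k, v in enumerate(["background"] + encoder_classes)}
num_classes = len(class_dict)

print(class_dict)   # {0: 'background', 1: 'beet', 2: 'maize', 3: 'weed'}
print(num_classes)  # 4
```

The indices line up with the `+ 1` offset applied to the encoded labels in `process_annotations`.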



Create an instance of a new model with correct settings

In [None]:
# load a model; pre-trained on COCO

my_model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True, box_score_thresh=prediction_threshold)
# my_model = fasterrcnn_resnet152_fpn(pretrained=False, box_score_thresh=prediction_threshold)
# my_model = fasterrcnn_resnext50_32x4d_fpn(pretrained=False, box_score_thresh=prediction_threshold)

# get number of input features for the classifier
in_features = my_model.roi_heads.box_predictor.cls_score.in_features

# replace the pre-trained head with a new one
my_model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

Now let's check the execution device (GPU / CPU) and set up the training hyperparameters

In [None]:
# Get device - cuda / CPU
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
print(f'Device {device} is used for training')
my_model.to(device)
params = [p for p in my_model.parameters() if p.requires_grad]
# Set training hyperparameters
num_epochs = 20
num_eval_step = 10

learning_rate = 0.0005
early_stop_limit = 20
optimizer = torch.optim.Adam(params, lr=learning_rate, weight_decay=0.0005)
# lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
lr_scheduler = None
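If you want to try the commented-out `StepLR` scheduler, it helps to know what it would do to the learning rate. The sketch below evaluates the closed-form step decay `initial_lr * gamma ** (epoch // step_size)` in plain Python, using the values from the commented-out line. The function name `step_lr` is made up, and PyTorch computes the schedule incrementally, so the last floating-point digits may differ.

```python
def step_lr(initial_lr, epoch, step_size=3, gamma=0.1):
    """Learning rate after `epoch` scheduler steps under step decay."""
    return initial_lr * gamma ** (epoch // step_size)


schedule = [step_lr(0.0005, epoch) for epoch in range(7)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.7f}")
```

With `step_size=3` and `gamma=0.1` the rate drops by a factor of ten every three epochs, so after six epochs it is already 100 times smaller than the starting value.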

Define training and validation datasets and dataloaders

In [None]:
# Create train, validation and test datasets
# PlantDataset arguments: (dataframe, image_dir, transforms=None)
train_dataset = PlantDataset(train_df, DIR_TRAIN, get_train_transform())  
valid_dataset = PlantDataset(valid_df, DIR_VAL, get_valid_transform())
test_dataset = PlantDataset(test_df, DIR_TEST, get_valid_transform())

batch_size = 4
# Create train dataloader with settings
train_data_loader = DataLoader(
    train_dataset,
    batch_size=batch_size,
    shuffle=True,
    num_workers=2,
    collate_fn=collate_fn
)

# Create test dataloader with settings
test_data_loader = DataLoader(
    test_dataset,
    batch_size=batch_size,
    shuffle=False,
    num_workers=2,
    collate_fn=collate_fn
)

# Create validation dataloader with settings
valid_data_loader = DataLoader(
    valid_dataset,
    batch_size=batch_size,
    shuffle=False,
    num_workers=2,
    collate_fn=collate_fn
)



## Set MLflow Auto-Logging

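[mlflow.pytorch.autolog](https://www.mlflow.org/docs/latest/python_api/mlflow.pytorch.html#mlflow.pytorch.autolog) works only with PyTorch Lightning. Since the training loop in this notebook is plain PyTorch, it has no effect here; parameters and metrics are logged explicitly instead.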

In [None]:
mlflow.pytorch.autolog()

## See the Experiment Results - LIVE! 📳

In this tab, you can see the results of the experiment while it's running!

**Notice**: To update the experiment status, simply go back to the "Experiment Tab" and reopen the top experiment in the table.

In [None]:
print(f"https://dagshub.com/{USER_NAME}/{REPO_NAME}/experiments/#/")
display(IPython.display.IFrame(f"https://dagshub.com/{USER_NAME}/{REPO_NAME}/experiments/#/",'100%',600))

## **Training**
And now let's start the **training loop**! Training takes some time even on a GPU, usually 10-30 minutes depending on the settings

In [None]:
# Create averager classes - only to keep track of loss function value during training
train_loss_hist = Averager()
val_loss_hist = Averager()

print('-------------- Start training -----------------')

# Training loop initializers
early_stop_counter = 0
best_valid_loss = 100.0

evaluator = None

# Train the model
with mlflow.start_run() as run:
  mlflow.log_params({"epochs": num_epochs, "learning_rate": learning_rate, 
                     "batch_size": batch_size})
  for epoch in range(num_epochs):
      start_time = time.time()

      train_loss_hist.reset()
      val_loss_hist.reset()

      # Loop through the training dataset
      for images, targets, _ in train_data_loader:  # image ids are not needed during training
          # Load input images
          images = list(image.to(device) for image in images)
          targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
          # Create loss dictionary
          loss_dict = my_model(images, targets)
          # Sum the individual loss terms for this batch
          losses = sum(loss for loss in loss_dict.values())
          loss_value = losses.item()
          train_loss_hist.send(loss_value)
          # Backpropagate the errors and update the weights
          optimizer.zero_grad()
          losses.backward()
          optimizer.step()

      # Calculate the model loss on validation data
      if (epoch + 1) % num_eval_step == 0:
        _, ground_truth_dict, predictions_dict = evaluate(my_model, valid_data_loader, device, val_loss_hist, mode='Comb')

        # Run COCO evaluation on the validation dataset
        if evaluator is None:
          evaluator = COCOEvaluator(ground_truth=ground_truth_dict, categories=class_dict, data_type='robo')
        
        ml_stats = {"loss": train_loss_hist.value, "val_loss": val_loss_hist.value}
        stats = evaluator.Evaluate(detections=predictions_dict)
        ml_stats.update(stats)
      else:
        evaluate(my_model, valid_data_loader, device, val_loss_hist)
        ml_stats = {"loss": train_loss_hist.value, "val_loss": val_loss_hist.value}

      # update the learning rate - could be used in the future (now set to none)
      if lr_scheduler is not None:
          lr_scheduler.step()

      print(f"Epoch #{epoch} train loss: {train_loss_hist.value} valid loss: {val_loss_hist.value} time: {time.time() - start_time}")

      # ML flow logging is done here
      mlflow.log_metrics(ml_stats, step=epoch)

      # Check if validation loss is less than the previous best val loss and, if so, save best model
      if best_valid_loss > val_loss_hist.value:
          best_valid_loss = val_loss_hist.value
          print('Best achieved validation loss, saving model')
          torch.save(my_model.state_dict(), my_model_filepath)
          early_stop_counter = 0
      else:
          early_stop_counter += 1
      # If the validation loss has not improved for more than the specified number of epochs, stop the training loop
      if early_stop_counter > early_stop_limit:
          print(f'Validation loss has not improved for more than {early_stop_limit} epochs, stopping training loop')
          break

print('-------------- Training finished  -----------------')
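The early-stopping rule in the loop above can be tried out in isolation. The sketch below replays the same counter logic on a plain list of validation losses. The function name `epochs_until_stop` and the loss values are invented, and it starts from `float("inf")` instead of the `100.0` used in the notebook.

```python
def epochs_until_stop(val_losses, early_stop_limit):
    """Return the epoch index at which training would stop, or None."""
    best = float("inf")
    counter = 0
    for epoch, loss in enumerate(val_losses):
        if best > loss:
            best = loss      # new best model, reset the counter
            counter = 0
        else:
            counter += 1     # no improvement this epoch
        if counter > early_stop_limit:
            return epoch
    return None


losses = [1.0, 0.8, 0.9, 0.85, 0.95, 0.7]  # made-up validation losses
print(epochs_until_stop(losses, early_stop_limit=2))       # 4
print(epochs_until_stop([1.0] * 20, early_stop_limit=20))  # None
```

Because the comparison is `>`, training stops only after `early_stop_limit + 1` epochs without improvement. With `num_epochs = 20` and `early_stop_limit = 20`, early stopping can therefore never trigger; lower the limit or raise the number of epochs if you want to see it in action.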

## Print MLflow data 

In [None]:
from mlflow.tracking import MlflowClient

In [None]:
def print_auto_logged_info(r):

    tags = {k: v for k, v in r.data.tags.items() if not k.startswith("mlflow.")}
    artifacts = [f.path for f in MlflowClient().list_artifacts(r.info.run_id, "model")]
    print("run_id: {}".format(r.info.run_id))
    print("artifacts: {}".format(artifacts))
    print("params: {}".format(r.data.params))
    print("metrics: {}".format(r.data.metrics))
    print("tags: {}".format(tags))

In [None]:
# fetch the auto logged parameters and metrics
print_auto_logged_info(mlflow.get_run(run_id=run.info.run_id))

## Start Evaluation

In [None]:
# evaluate on the test dataset
_, ground_truth_dict, predictions_dict = evaluate(my_model, test_data_loader, device=device, mode='Test')

# calling COCO Evaluation on Test Dataset
evaluator = COCOEvaluator(ground_truth=ground_truth_dict, categories=class_dict, data_type='robo')
evaluator.Evaluate(detections=predictions_dict)

## Visualization

Load a sample image

In [None]:
# Choose an image file for prediction and visualisation
# You can choose any image file in the validation folder (DIR_VAL)
image_file = 'repa_trn_23_4598964_2920955_png_jpg.rf.e4c6865c435d89fc09a7307f168a5929.jpg'
pred_file = os.path.join(DIR_VAL, image_file)

# Read the image using cv2
image = cv2.imread(pred_file, cv2.IMREAD_COLOR)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB).astype(np.float32)

Convert the image to a Torch tensor

In [None]:
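# Convert the image to torch tensor and send it to the processing device
# Note: np.transpose and np.expand_dims return views, so the in-place division below
# also rescales `image` to the [0, 1] range that plt.imshow expects for float images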
image_torch = np.expand_dims(np.transpose(image, (2, 0, 1)), axis=0)
image_torch /= 255.0
image_torch = torch.tensor(image_torch).to(device)

Now let's declare a function to pretty-print the results

In [None]:
def draw_text(img, text, font=cv2.FONT_HERSHEY_PLAIN, pos=(0, 0), font_scale=0.6, font_thickness=1,
              text_color=(0, 0, 0), text_color_bg=(0, 0, 0)):
    """ Function to pretty print label names above the bounding box
    """
    x, y = pos
    text_size, _ = cv2.getTextSize(text, font, font_scale, font_thickness)
    text_w, text_h = text_size
    cv2.rectangle(img, pos, (x + text_w, y + text_h), text_color_bg, -1)
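    # cv2 needs integer pixel coordinates, so cast in case font_scale is a float (e.g. the default 0.6)
    cv2.putText(img, text, (x, int(y + text_h + font_scale - 1)), font, font_scale, text_color, font_thickness)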

    return text_size

**By now you should have a trained model. Congratulations! We can visualise its predictions on a single validation image tile**

In [None]:
my_model = my_model.eval()

In [None]:
ground_truth = []

gt_df = valid_df.loc[valid_df['image_id'] == image_file]
counter, boxes, labels, scores, keep = 0, [], [], [], [] 
for index, row in gt_df.iterrows():
  boxes.append(np.array([row["x"], row["y"], row["x"] + row["w"], row["y"] + row["h"]]).astype(np.int32))
  labels.append(row["class"])
  scores.append(1)
  keep.append(counter)

  counter += 1

ground_truth.append([np.array(boxes).astype(np.int32), np.array(labels).astype(np.int32), 
                     np.array(scores).astype(np.int32), np.array(keep).astype(np.int32)])  

In [None]:
predictions = []

preds = my_model(image_torch)
for prediction in preds:
  boxes, labels, scores, keep = [], [], [], [] 
  # Get predictions from device and convert them to numpy
  boxes = prediction['boxes'].cpu().detach().numpy().astype(np.int32)
  labels = prediction['labels'].cpu().detach().numpy().astype(np.int32)
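  # Keep scores as floats: casting confidences in [0, 1] to int32 would truncate almost all of them to 0
  scores = prediction['scores'].cpu().detach().numpy().astype(np.float32)

  # Use NMS to get rid of overlapping predictions
  # (the third argument is the IoU threshold; prediction_threshold is reused for it here)
  keep = torchvision.ops.nms(prediction['boxes'], prediction['scores'], prediction_threshold).cpu().detach().numpy().astype(np.int32)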

  predictions.append([boxes, labels, scores, keep])
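The cell above relies on `torchvision.ops.nms`, whose third argument is an IoU threshold (the notebook reuses `prediction_threshold` = 0.3 for it). To show what non-maximum suppression does, here is a minimal pure-Python sketch of the greedy algorithm. It is an illustration only, not the torchvision implementation, and the boxes and scores are made up.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores, iou_threshold=0.3))  # [0, 2]
```

The second box overlaps the first with an IoU of 81/119 (about 0.68), which is above the threshold, so only the higher-scoring one survives. Note that this sketch, like `torchvision.ops.nms`, ignores class labels.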

In [None]:
colour_dict = {1:(220, 0, 0), 2:(0, 0, 220), 3:(0, 220, 0), 4:(0, 220, 220)}

print('Visualised results from your custom model')
figure_storage = [ground_truth, predictions]

# Initialize matplotlib plot
fig, axs = plt.subplots(nrows=len(preds), ncols=2, figsize=(20, 9))

# Show the image data in a subplot
for ax_i, ax in enumerate(axs):
  temp_image = image.copy()
  boxes, labels, scores, keep = figure_storage[ax_i % 2][ax_i//2]

  # Plot all predicted bounding boxes onto a matplotlib plot with corresponding class names
  for b, box in enumerate(boxes):
    if b not in keep:
      continue
    
    cv2.rectangle(temp_image,
                  (box[0], box[1]),
                  (box[2], box[3]),
                  colour_dict[labels[b]], 1)

    draw_text(temp_image,
              class_dict[labels[b]],
              font_scale=1,
              pos=(box[0], box[1] - 10),
              text_color_bg=colour_dict[labels[b]])
  ax.set_axis_off()
  ax.imshow(temp_image)

fig.tight_layout()
# Show the figure on the screen    
plt.show()

Wow, it's been a ride. Thank you for your attention if you got all the way down here :) In this introductory notebook you have learned:

*   How to use the baseline model to detect and classify crops in the drone images
*   How to visualise the predictions
*   How to train your own model

For now, you can start exploring the other provided datasets, play with this object detection training pipeline, get to know the problem, and possibly think of better solutions for object detection.

In the following days we plan to share details on all the problems that will be part of this Omdena challenge. Stay tuned for more info!

Take care, and I wish you a wonderful Sunday,

Martin
