<a href="https://colab.research.google.com/github/SamuelHericles/nuveo_challenge/blob/main/01-WheresWally/src/Train_model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Initial considerations

Utilizing the object detection approach for search Wally in pictures, the model architecture model was the Fast RCNN (see this link: https://arxiv.org/abs/1506.01497). The pyTorch framework has provided the fasterrcnn_resnet50_fpn pre-trained model but needed to be trained with our dataset, so this notebook has prepared an image dataset for the training model. Unfortunately, I don't get good results because it's a new world for me, and passed many hours of study and understanding image processing and objection detection in the literature.

I based on these links:
 - https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html
 - https://www.kaggle.com/pestipeti/pytorch-starter-fasterrcnn-train
 - https://www.kaggle.com/aryaprince/getting-started-with-object-detection-with-pytorch
 - http://www.galirows.com.br/meublog/opencv-python/opencv2-python27/capitulo2-deteccao/reconhecimento-objetos/
 - https://medium.com/ensina-ai/detec%C3%A7%C3%A3o-de-objetos-pr%C3%B3prios-para-n%C3%A3o-cientista-de-dados-b7fab2aa0e88
 - https://www.lapix.ufsc.br/ensino/visao/visao-computacionaldeep-learning/deteccao-de-objetos-em-imagens/



In [1]:
%shell

!git clone https://github.com/SamuelHericles/nuveo_challenge.git
!git clone https://SamuelHericelsBit@bitbucket.org/SamuelHericlesBit/model-nuveo-challenge.git

Cloning into 'nuveo_challenge'...
remote: Enumerating objects: 358, done.[K
remote: Counting objects: 100% (358/358), done.[K
remote: Compressing objects: 100% (232/232), done.[K
remote: Total 358 (delta 129), reused 337 (delta 118), pack-reused 0[K
Receiving objects: 100% (358/358), 62.36 MiB | 24.60 MiB/s, done.
Resolving deltas: 100% (129/129), done.
Cloning into 'model-nuveo-challenge'...
Unpacking objects: 100% (3/3), done.


# 0. Import needed

In [3]:
import os
import cv2
import json
import torch
import torchvision

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from torch.autograd import Variable
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# 1. Functions are made for iterate image sample into data loader




# 1.1 Prepare dataset

In [4]:
def collate_fn(batch):
  """  
    This function transforms a images bath in tuple iterator

    @param batch - images batches

    @return a batch iterator
  """
  return tuple(zip(*batch))
    
class Averager:
    def __init__(self):
        """
          This class works to calculate the mean of loss in training runtime .
        """
          
        self.current_total = 0.0
        self.iterations = 0.0

    def send(self, value):
        """
            Update counter number iteration and value losses.

            @param value -  info loss dict values
        """
        self.current_total += value
        self.iterations += 1

    @property
    def value(self):
        """
              Calculate current iteration and mean of loss dict values.
        """        

        if self.iterations == 0:
            return 0
        else:
            return 1.0 * self.current_total / self.iterations

    def reset(self):
        """
            Reset counter number iteration and value losses.
        """        
        self.current_total = 0.0
        self.iterations = 0.0

class WallynDataset():
    def __init__(self):
        """
            This class is to pre-processing image, put target in correct type
            in line of the fasterrcnn_resnet50_fpn suport.
        """            
        # load all image files, sorting them to ensure that they are aligned
        self.imgs = list(sorted(os.listdir('/content/nuveo_challenge/01-WheresWally/data/TrainingSet/images')))
        self.centroids = list(sorted(os.listdir('/content/nuveo_challenge/01-WheresWally/data/TrainingSet/json')))

    def __getitem__(self, idx):
        """
          Make the dataloader get a image and your target

          @param idx - index of image

          @return img - image treated
          @return target - target of image
          @return imgs[idx] - file name image

        """        
        # load images
        img_path = os.path.join('/content/nuveo_challenge/01-WheresWally/data/TrainingSet', "images", self.imgs[idx])
        img = cv2.imread(f'{img_path}', cv2.IMREAD_COLOR)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32)

        # Torch convert to float32 and divide for beetween 0 and 1.
        img_shape = img.shape
        img = torch.tensor(img, dtype=torch.float32)
        img = torch.reshape(img,(3, img_shape[0], img_shape[1]))
        img /= 255.0

        # Get points for to calculate the max and min image border points
        points_path = os.path.join('/content/nuveo_challenge/01-WheresWally/data/TrainingSet', "json", self.centroids[idx])        
        json_uploaded = open(f'{points_path}','r')

        json_file = json.loads(json_uploaded.read()) 
        points = json_file['shapes'][0]['points']

        xmax, ymax = max(points)
        xmin, ymin = min(points)

        xmin2 = min(xmax, xmin)
        xmax2 = max(xmax, xmin)

        ymin2 = min(ymax, ymin)
        ymax2 = max(ymax, ymin)

        boxes = [xmin2, ymin2, xmax2, ymax2]

        boxes = torch.tensor(boxes, dtype=torch.float32)
        boxes = torch.reshape(boxes, (1, 4))

        # there is only one class
        labels = torch.ones((1,), dtype=torch.int64)
        
        # # suppose all instances are not crowd
        iscrowd = torch.zeros((1,), dtype=torch.int64)
        
        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target['iscrowd'] = iscrowd

        return img, target, self.imgs[idx]

In [5]:
#check output
dataset = WallynDataset()
dataset[0]

(tensor([[[0.0471, 0.4510, 0.1412,  ..., 0.4000, 0.1569, 0.0706],
          [0.4275, 0.1961, 0.0510,  ..., 0.1843, 0.0392, 0.3373],
          [0.1843, 0.0706, 0.3686,  ..., 0.0039, 0.2000, 0.1882],
          ...,
          [0.3922, 0.1412, 0.0353,  ..., 0.2078, 0.0353, 0.3490],
          [0.2118, 0.0157, 0.3294,  ..., 0.0275, 0.2235, 0.1804],
          [0.0353, 0.4392, 0.1137,  ..., 0.4392, 0.1882, 0.0431]],
 
         [[0.4314, 0.1765, 0.0196,  ..., 0.1843, 0.0471, 0.3608],
          [0.2235, 0.0471, 0.3608,  ..., 0.0588, 0.2431, 0.2039],
          [0.0196, 0.4314, 0.1020,  ..., 0.4118, 0.1569, 0.0471],
          ...,
          [0.8314, 0.6745, 0.9804,  ..., 0.0118, 0.2275, 0.2039],
          [0.0196, 0.4157, 0.1020,  ..., 0.4235, 0.1765, 0.0941],
          [0.4706, 0.2235, 0.0353,  ..., 0.8588, 0.6510, 0.9647]],
 
         [[0.8275, 0.6706, 0.9765,  ..., 0.0196, 0.2275, 0.1961],
          [0.0196, 0.4157, 0.1098,  ..., 0.3922, 0.1451, 0.0275],
          [0.4157, 0.1647, 0.0118,  ...,

# 2. Training model

# 2.1 Upload model pre-trained and after to train with our dataset


In [None]:
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# get number of input features for the classifier
in_features = model.roi_heads.box_predictor.cls_score.in_features

# replace the pre-trained head with a new one
# 1 class (wheat) + background
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes=2)

# 2.2 Transform dataset in iterate torch DataLoader

In [None]:
# use our dataset and defined transformations
dataset = WallynDataset()

# split the dataset in train and test set
torch.manual_seed(1)
indices = list(i for i in range(105))
dataset = torch.utils.data.Subset(dataset, indices)

# define training and validation data loaders
data_loader = torch.utils.data.DataLoader(
    dataset, batch_size = 4, shuffle = False, num_workers=2,
    collate_fn = collate_fn)

# 2.4 Configures the optimizer

Base on this link(https://ichi.pro/pt/tutorial-de-deteccao-de-objetos-com-torchvision-143730852624127) learning rate arround 0.0001 and 0.0005 and weight_decay 0.0001 and 0.0005 are good for start training Faster RCNN model.





In [None]:
# Define device avaliable for train enviroment.
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)

# Define optimizer
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(params, lr=0.0003, weight_decay=0.0005)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)

# Define number of epochs
num_epochs = 3

# 2.5 Train model step


I try many time for get best loss mean (close 0.01), so there is.

In [None]:
loss_hist = Averager()
itr = 1

for epoch in range(num_epochs):
    loss_hist.reset()
    
    for images, target, _ in data_loader:
      
      #Get image and target and set your device for dataloader type
      images = list(image.to(device) for image in images)
      target = [{k: v.to(device) for k, v in t.items()} for t in target]

      # Forward step
      loss_dict = model(images, target)
      
      # Get loss dict values information
      losses = sum(loss for loss in loss_dict.values())
      loss_value = losses.item()

      loss_hist.send(loss_value)

      # Backward step
      optimizer.zero_grad()
      losses.backward()
      optimizer.step()

      if itr % 10 == 0:
          print(f"Iteration #{itr} loss: {loss_value}")

      itr += 1
    lr_scheduler.step()
    print(f"Epoch #{epoch} loss: {loss_hist.value}")   

  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)


Iteration #10 loss: 0.24474990367889404
Iteration #20 loss: 0.17614886164665222
Epoch #0 loss: 0.47539590851024344
Iteration #30 loss: 0.1435394585132599
Iteration #40 loss: 0.3151240050792694
Iteration #50 loss: 0.18175429105758667
Epoch #1 loss: 0.3041205334442633
Iteration #60 loss: 0.17453822493553162
Iteration #70 loss: 0.18782402575016022
Iteration #80 loss: 0.27125638723373413
Epoch #2 loss: 0.2884931398762597


# 6. Save the model

I utilize torh.save, not onnx, because I have some problem with version package.

In [None]:
torch.save(model.state_dict(), os.path.join('/content/nuveo_challenge/01-WheresWally/', "models", 'model.pt'))