<a href="https://colab.research.google.com/github/ryanro97/player-detector/blob/master/PlayerDetectorTrainer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Player Detector Trainer
---
Trainer for frame-by-frame player detection on a custom dataset. Implemented with PyTorch's [Faster R-CNN model with a ResNet-50-FPN backbone](https://pytorch.org/docs/stable/torchvision/models.html#faster-r-cnn), created without pretrained detection weights.

<br />

### Directory Hierarchy:
```
PlayerDetector
├── data
    ├── train
        ├── images
            ├── *.jpg
        ├── targets
            ├── classes.txt
            ├── *.txt
    ├── predict
        ├── video
            ├── *.mp4
├── PlayerDetectorTrainer.ipynb
├── PlayerDetectorPredictor.ipynb
```
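
Because the trainer pairs each image with the label file of the same name, it can help to check the layout above before training. The following is a minimal sketch, not part of the original notebook; the helper name `find_unpaired` is hypothetical, and it assumes images end in `.jpg` and labels in `.txt` as shown in the hierarchy.

```python
from pathlib import Path


def find_unpaired(root):
    """Return (image stems without a label, label stems without an image)."""
    train = Path(root) / "data" / "train"
    image_stems = {p.stem for p in (train / "images").glob("*.jpg")}
    target_stems = {
        p.stem
        for p in (train / "targets").glob("*.txt")
        if p.name != "classes.txt"  # class list, not a per-image label
    }
    # Set differences expose files that exist on only one side.
    return sorted(image_stems - target_stems), sorted(target_stems - image_stems)
```

Calling `find_unpaired('/content/drive/My Drive/PlayerDetector')` should return two empty lists when every image has a matching label file.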

<br />

### Target Labeling
Bounding boxes were hand-labeled in the YOLO format using Tzuta Lin's [LabelImg](https://github.com/tzutalin/labelImg) tool.
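
For reference, each line of a YOLO label file has the form `<class_id> <x_center> <y_center> <width> <height>`, with the four box values normalized to the image size. The sketch below shows one way to parse and sanity-check such a line; `parse_yolo_line` is a hypothetical helper for illustration, not a function from the notebook or from LabelImg.

```python
def parse_yolo_line(line):
    """Parse '<class_id> <x_center> <y_center> <width> <height>' into a tuple."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d" % len(fields))
    class_id = int(fields[0])
    x_center, y_center, width, height = (float(v) for v in fields[1:])
    # YOLO coordinates are fractions of the image width and height.
    for value in (x_center, y_center, width, height):
        if not 0.0 <= value <= 1.0:
            raise ValueError("coordinates must be normalized to [0, 1]")
    return class_id, x_center, y_center, width, height
```

For example, `parse_yolo_line("0 0.5 0.5 0.25 0.5")` describes a box of class 0 centered in the image, a quarter of its width and half of its height.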


### Install Dependencies for Google Colab

In [0]:
!pip3 install pillow torch torchvision

### Mount Google Drive

In [0]:
from google.colab import drive
drive.mount('/content/drive')

### Imports

In [0]:
import os
import random
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision.transforms import functional, ToTensor
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from PIL import Image

### Custom PyTorch Dataset
Converts YOLO-format targets into the input format expected by PyTorch's Faster R-CNN model. Only the first bounding box in each target file is read. Also performs augmentation: each image is horizontally flipped with 50% probability, and its bounding box is mirrored to match.

<br />

#### Errors
```
IndexError: Image and Target count mismatch
TypeError: Image and Target file name mismatch
IOError: Target read error
```

In [0]:
class PlayerTrainerDataset(Dataset):
    def __init__(self):
        root = os.getcwd()

        self.images_dir = os.path.join(root, 'data/train/images')
        self.targets_dir = os.path.join(root, 'data/train/targets')

        self.images = sorted(os.listdir(self.images_dir))
        # Skip classes.txt, which lists class names rather than boxes
        self.targets = [target for target in
                        sorted(os.listdir(self.targets_dir))
                        if target != 'classes.txt']

        if len(self.images) != len(self.targets):
            raise IndexError('Image and Target count mismatch')
    
    def __len__(self):
        return len(self.images)
    
    def __getitem__(self, idx):
        image_path = os.path.join(self.images_dir, self.images[idx])
        target_path = os.path.join(self.targets_dir, self.targets[idx])

        image_name = os.path.splitext(os.path.basename(image_path))[0]
        # Compare against the target file's name, not the image's
        target_name = os.path.splitext(os.path.basename(target_path))[0]

        if image_name != target_name:
            raise TypeError('Image and Target file name mismatch')
        
        image = Image.open(image_path).convert("RGB")
        
        target = None
        with open(target_path) as f:
            target = f.readline().strip().split()
        if not target:
            raise IOError('Target read error')
        
        w, h = image.size
        
        center_x = float(target[1]) * w
        center_y = float(target[2]) * h
        bbox_w = float(target[3]) * w
        bbox_h = float(target[4]) * h
        
        x0 = round(center_x - (bbox_w / 2))
        x1 = round(center_x + (bbox_w / 2))
        y0 = round(center_y - (bbox_h / 2))
        y1 = round(center_y + (bbox_h / 2))
        
        boxes = [x0, y0, x1, y1]
        labels = torch.as_tensor(1, dtype=torch.int64)

        if random.random() < 0.5:
            image = functional.hflip(image)
            boxes = [w - x1 - 1, y0, w - x0 - 1, y1]

        boxes = torch.as_tensor(boxes, dtype=torch.float32)
        image = ToTensor()(image)
        
        target = [{'boxes': boxes, 'labels': labels}]
        
        return image, target
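
The box arithmetic in `__getitem__` can be checked in isolation. The sketch below restates it as two plain functions; the names `yolo_to_corners` and `hflip_box` are hypothetical and do not appear in the notebook. The `- 1` offset in the flip mirrors the dataset's own convention of treating coordinates as pixel indices.

```python
def yolo_to_corners(x_center, y_center, width, height, image_w, image_h):
    """Convert a normalized YOLO box to pixel corners [x0, y0, x1, y1]."""
    cx, cy = x_center * image_w, y_center * image_h
    bw, bh = width * image_w, height * image_h
    return [round(cx - bw / 2), round(cy - bh / 2),
            round(cx + bw / 2), round(cy + bh / 2)]


def hflip_box(box, image_w):
    """Mirror a corner-format box across the vertical center line."""
    x0, y0, x1, y1 = box
    # The old right edge becomes the new left edge, and vice versa.
    return [image_w - x1 - 1, y0, image_w - x0 - 1, y1]


box = yolo_to_corners(0.25, 0.5, 0.1, 0.5, image_w=200, image_h=100)
flipped = hflip_box(box, image_w=200)
```

Here `box` is `[40, 25, 60, 75]` and `flipped` is `[139, 25, 159, 75]`. Flipping twice returns the original box, which is a quick way to confirm the transform is consistent.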

### Train Function
Trains the model and saves its weights to `weights.pt` in the working directory. Takes the working directory (the root directory in the directory hierarchy above), the number of classes to detect, the number of epochs, and the learning rate, momentum, and weight decay for the optimizer, which uses stochastic gradient descent.

<br />

#### Parameters
```
working_dir: String representation of the working directory
num_classes: Integer representation of the number of classes to detect
opt_lr: Float representation of the optimizer's learning rate
opt_mom: Float representation of the optimizer's momentum
opt_wd: Float representation of the optimizer's weight decay
num_epochs: Integer representation of the number of epochs to train for
```

In [0]:
def train_model(working_dir, num_classes, opt_lr, opt_mom, opt_wd, num_epochs):
    os.chdir(working_dir)
    
    model = fasterrcnn_resnet50_fpn(num_classes=num_classes)
    device = torch.device('cuda') if torch.cuda.is_available() \
             else torch.device('cpu')
    model.to(device)

    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.SGD(params, lr=opt_lr, momentum=opt_mom, \
                                weight_decay=opt_wd)

    dataset = PlayerTrainerDataset()
    data_loader = DataLoader(dataset, shuffle=True)

    model.train()
    for epoch in range(num_epochs):
        running_loss = 0.0
        for images, targets in data_loader:
            images = list(image.to(device) for image in images)
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            
            loss_dict = model(images, targets)
            losses = sum(loss for loss in loss_dict.values())
            
            optimizer.zero_grad()
            losses.backward()
            optimizer.step()
            
            running_loss += losses.item()

        # Report the mean loss over all images in this epoch
        print('epoch:%d loss:%.3f' %
              (epoch + 1, running_loss / len(data_loader)))

    torch.save(model.state_dict(), 'weights.pt')
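
To make the roles of `opt_lr`, `opt_mom`, and `opt_wd` concrete, here is a minimal sketch of one SGD update on a single scalar parameter. It is an illustration of the standard momentum and weight decay rule (as I understand `torch.optim.SGD` with zero dampening and no Nesterov momentum), not the library's actual implementation; `sgd_step` is a hypothetical name.

```python
def sgd_step(param, grad, velocity, lr, momentum, weight_decay):
    """Apply one SGD update with momentum and L2 weight decay to a scalar."""
    # Weight decay adds a pull toward zero, proportional to the parameter.
    grad = grad + weight_decay * param
    # Momentum accumulates a decaying sum of past gradients.
    velocity = momentum * velocity + grad
    # The learning rate scales the step taken along the accumulated direction.
    param = param - lr * velocity
    return param, velocity


# Two steps with a constant gradient: the second step is larger
# because the velocity carries over from the first.
p, v = sgd_step(1.0, 2.0, 0.0, lr=0.1, momentum=0.9, weight_decay=0.0)
p2, v2 = sgd_step(p, 2.0, v, lr=0.1, momentum=0.9, weight_decay=0.0)
```

With these values the parameter moves by about 0.2 on the first step and about 0.38 on the second, which shows how momentum speeds up movement along a consistent gradient direction.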

### Training the Model

In [0]:
train_model('/content/drive/My Drive/PlayerDetector', 2, 0.005, 0.9, 0.0005, 25)