# **Avengers Detector**
Using transfer learning on a custom dataset to detect the original 6 Avengers in a media source.
---
---

**Preparation:**


*   Downloading YOLOv3.
*   Downloading the dataset.
*   Attaching configuration file.
*   Downloading and updating required libraries.



In [1]:
# clean up previous session results.
! rm -rf dataset

In [None]:
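# downloading yolov3 and installing its requirements.
# (each '!' command runs in its own subshell, so a separate '! cd yolov3' has no effect;
# the paths below are given relative to the session home dir instead.)
! git clone https://github.com/ultralytics/yolov3.git
! pip install -r yolov3/requirements.txt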

In [None]:
# < !!! first, you must sign up to https://www.kaggle.com/, retrieve an api key, 
# and place it in the home dir of the session !!! more info on => https://www.kaggle.com/general/51898 >

# install kaggle.
! pip install kaggle
! mkdir -p ~/.kaggle
! cp kaggle.json ~/.kaggle/
! chmod 600 ~/.kaggle/kaggle.json
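
If the download in the next cell fails with an authentication error, the key file is the usual suspect. The sketch below is an optional sanity check, not part of the original pipeline: `check_kaggle_credentials` is a hypothetical helper name, and it only assumes that `kaggle.json` holds a `username` and a `key` field and should be readable by the owner alone.

```python
import json
import os
import stat


def check_kaggle_credentials(path):
    """Return a list of problems found with a kaggle.json file (empty list means OK)."""
    if not os.path.isfile(path):
        return [f"{path} does not exist"]
    with open(path) as f:
        try:
            creds = json.load(f)
        except json.JSONDecodeError:
            return [f"{path} is not valid JSON"]
    if not isinstance(creds, dict):
        return [f"{path} does not contain a JSON object"]

    problems = []
    for field in ("username", "key"):
        if not creds.get(field):
            problems.append(f"missing field: {field}")
    # the file should not be accessible to group/others (this is what chmod 600 ensures).
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        problems.append(f"permissions too open: {oct(mode)}")
    return problems
```

Calling `check_kaggle_credentials(os.path.expanduser("~/.kaggle/kaggle.json"))` after the cell above should return an empty list.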

In [None]:
# download the avengers dataset from kaggle.
! kaggle datasets download Avengers-Dataset
! unzip Avengers-Dataset.zip
! mkdir dataset
! mv images dataset
! mv labels dataset

In [5]:
# copy the custom configuration file to the yolov3 data folder.
! cp avengers.yaml yolov3/data
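
The contents of `avengers.yaml` are not shown in this notebook. As a rough sketch of what such a file might contain, the snippet below builds a dataset config in the layout the ultralytics repos read (`path`, `train`, `val`, `test`, `nc`, `names`). Everything specific here is an assumption: the class order, the `val`/`test` folder names and the `../dataset` root are placeholders that must be adapted to the actual dataset.

```python
# hypothetical class order; it must match the class ids used in the label files.
CLASS_NAMES = ["Iron Man", "Captain America", "Thor", "Hulk", "Black Widow", "Hawkeye"]


def build_dataset_yaml(root, names, train="images/train", val="images/val", test="images/test"):
    """Build the text of a YOLO dataset config (paths here are assumed, not taken from the dataset)."""
    lines = [
        f"path: {root}",    # dataset root dir
        f"train: {train}",  # training images, relative to 'path'
        f"val: {val}",      # validation images, relative to 'path'
        f"test: {test}",    # test images, used by '--task test'
        f"nc: {len(names)}",  # number of classes
        "names: [" + ", ".join(f"'{n}'" for n in names) + "]",
    ]
    return "\n".join(lines) + "\n"


config_text = build_dataset_yaml("../dataset", CLASS_NAMES)
print(config_text)
```

The matching label files are looked up by swapping `images` for `labels` in each image path, which is the same convention the augmentation code in this notebook relies on.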

In [None]:
# install required versions of albumentations and opencv.
# ('-y' answers the uninstall confirmation automatically)
! pip uninstall -y albumentations
! pip install albumentations
! pip uninstall -y opencv-python-headless
! pip install opencv-python-headless==4.1.2.30

**Augmentation:**

The code below takes every image from the 'train' folder and passes it through 3 randomized transformation pipelines. Each pass creates a new image and a matching annotation file, which are saved as additional training data.

In [7]:
import albumentations as A
import cv2
import glob

save_path = "dataset/images/train/"
generated = 1

transform1 = A.Compose([
    A.RandomCrop(width=100, height=100),
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
], bbox_params=A.BboxParams(format='yolo', min_visibility=0.5))

transform2 = A.Compose([
        A.HorizontalFlip(p=0.5),
        A.ShiftScaleRotate(p=0.5),
        A.RandomBrightnessContrast(p=0.3),
        A.RGBShift(r_shift_limit=30, g_shift_limit=30, b_shift_limit=30, p=0.3),
    ],
    bbox_params=A.BboxParams(format='yolo', min_visibility=0.5))

transform3 = A.Compose(
    [A.CenterCrop(height=100, width=100, p=1)],
    bbox_params=A.BboxParams(format='yolo', min_area=4500, min_visibility=0.5),
)

def save_transform(image, bbox, transformation):
  global generated
  lines = []
  pic_path = save_path + "gen" +  str(generated) + ".jpg"
  label_path = pic_path[:-3] + "txt"
  label_path = label_path.replace("images", "labels")
  transformed = transformation(image=image, bboxes=bbox)
  transformed_image = transformed['image']
  transformed_bboxes = transformed['bboxes']
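  # the image was converted to RGB after loading; cv2.imwrite expects BGR, so convert back.
  cv2.imwrite(pic_path, cv2.cvtColor(transformed_image, cv2.COLOR_RGB2BGR))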
  for bbox in transformed_bboxes:
    lines.append(f"{bbox[4]} {bbox[0]} {bbox[1]} {bbox[2]} {bbox[3]}\n")
  with open(label_path, 'w') as f:
    f.writelines(lines)
  generated += 1


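# take a snapshot of the file list first, so the generated 'gen*.jpg' files are not picked up by the loop.
for pic_path in glob.glob(save_path + '*.jpg'):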
    bbox = []
    image = cv2.imread(pic_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    label_path = pic_path[:-3] + "txt"
    label_path = label_path.replace("images", "labels", 1)
    with open(label_path, "r") as label_file:
      for line in label_file:
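        data = line.split()
        if not data:
          continue  # skip blank lines in the label file.
        # label format: class x_center y_center width height => albumentations wants the class last.
        bbox.append([float(data[1]), float(data[2]), float(data[3]), float(data[4]), data[0]])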
    save_transform(image.copy(), bbox, transform1)
    save_transform(image.copy(), bbox, transform2)
    save_transform(image.copy(), bbox, transform3)
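
The label files read and written above use the YOLO text format: one line per box, `class x_center y_center width height`, with all four coordinates normalized to the range 0..1. As a small illustration of what those numbers mean, the sketch below converts one label line to pixel corners. `yolo_to_pixel_box` is a hypothetical helper, not something the notebook or the libraries provide.

```python
def yolo_to_pixel_box(label_line, img_width, img_height):
    """Convert 'class x_center y_center width height' (normalized) to pixel corners."""
    class_id, xc, yc, w, h = label_line.split()
    xc, yc, w, h = float(xc), float(yc), float(w), float(h)
    x_min = (xc - w / 2) * img_width
    y_min = (yc - h / 2) * img_height
    x_max = (xc + w / 2) * img_width
    y_max = (yc + h / 2) * img_height
    return class_id, x_min, y_min, x_max, y_max


# a box centered in a 256x256 image, a quarter of its width and half of its height.
print(yolo_to_pixel_box("2 0.5 0.5 0.25 0.5", 256, 256))
# → ('2', 96.0, 64.0, 160.0, 192.0)
```

Because the coordinates are normalized, the same label stays valid when an image is resized, but a crop changes them, which is why the boxes are passed through the transformations together with the image.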

**Training:**

Beginning the training process with the following parameters:


*   *freeze* => how many modules to freeze and not re-train. (we picked 10, which covers all the backbone modules, so 53 layers in total are frozen. Picking 24 trains only the last layer, but gives worse results.)
*   *img* => size of the images we will train on.
*   *batch* => batch size of the images we will train on. (depends on GPU strength)
*   *epochs* => number of epochs to run. (around 30 is enough; the command below uses 100)
*   *data* => configuration file name. (mentioned above)
*   *weights* => making use of pre-trained weights for the frozen layers.
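
To make the *freeze* parameter more concrete, here is a rough sketch of the idea, assuming the training script selects the parameters to freeze by a `model.<index>.` name prefix and turns off gradient updates for them. The parameter names below are made up for illustration and the helper `split_frozen` is hypothetical; the real model has many more parameters.

```python
def split_frozen(param_names, n_freeze):
    """Split parameter names into (frozen, trainable) for the first n_freeze modules."""
    prefixes = tuple(f"model.{i}." for i in range(n_freeze))
    frozen = [name for name in param_names if name.startswith(prefixes)]
    trainable = [name for name in param_names if not name.startswith(prefixes)]
    return frozen, trainable


# hypothetical parameter names, one per module index.
example_names = [
    "model.0.conv.weight",
    "model.9.cv1.conv.weight",
    "model.10.conv.weight",
    "model.28.m.0.weight",
]

frozen, trainable = split_frozen(example_names, n_freeze=10)
print(frozen)     # modules 0..9 keep their pre-trained weights
print(trainable)  # everything from module 10 onward is re-trained
```

With `n_freeze=10` only the modules with index 0 to 9 are frozen, so `model.10.` and later still receive updates during training.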



In [None]:
# delete all previous training results and begin the training process.
# results will be saved in yolov3/runs/train/exp and will include the new weights.
# (when asked, press (3) - don't visualize my results)
! rm -rf yolov3/runs/train
! python yolov3/train.py --freeze 10 --img 256 --batch 16 --epochs 100 --data avengers.yaml --weights yolov3.pt

**Testing:**

Testing our newly trained machine on the test set with the following parameters:


*   *weights* => the trained weights to use for the detection. (we use the best ones obtained from our training process)
*   *data* => configuration file.
*   *img* => size of the images to test on.
*   *task* => which task to apply, 'test' is configured in the configuration file and has the path to the test set.



In [None]:
# remove any previous results.
# begin the testing process with our newly trained weights.
# results will be saved in yolov3/runs/val/exp.
! rm -f best.pt
! cp yolov3/runs/train/exp/weights/best.pt .
! rm -rf yolov3/runs/val
! python yolov3/val.py --weights best.pt --data avengers.yaml --img 256 --task test

**Result:**

Once the results on the test set look good, we test detection on a video of our choice.

A video named input.mp4 has to be uploaded to the home dir of the session before the cell can run.

To begin the detection we use the following parameters:


*   *weights* => the trained weights to use for the detection. (we use the best ones obtained from our training process)
*   *imgsz* => image size used for inference, adjusted to get the best results. (most likely the same size the model was trained on)
*   *conf* => minimal confidence to consider as a detection.
*   *source* => path to the media source.
*   *hide-conf* => do not show confidence probabilities in the output video.
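
The effect of *conf* can be pictured as a simple filter over the raw detections: anything at or below the threshold is dropped before the boxes are drawn. The snippet below is only an illustration with made-up detections and a hypothetical helper `filter_detections`; the real script applies the threshold inside its own post-processing.

```python
def filter_detections(detections, conf_threshold):
    """Keep only the (label, confidence) pairs whose confidence exceeds the threshold."""
    return [(label, conf) for label, conf in detections if conf > conf_threshold]


# made-up detections for a single video frame.
frame_detections = [("Thor", 0.91), ("Hulk", 0.40), ("Hawkeye", 0.60)]

kept = filter_detections(frame_detections, conf_threshold=0.55)
print(kept)
# → [('Thor', 0.91), ('Hawkeye', 0.6)]
```

A higher threshold gives fewer false detections at the cost of missing some characters; a lower one does the opposite.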



In [None]:
# < !!! a file named 'input.mp4' has to be uploaded to the home dir of the session (drag & drop) !!! >
# remove any previous results and begin detection.
# results will be in the 'res' folder.
! rm -rf res
! rm -rf yolov3/runs/detect
! python yolov3/detect.py --weights best.pt --imgsz 256 --conf 0.55 --source input.mp4 --hide-conf
! cp -r yolov3/runs/detect/exp ./res