## NOTEBOOK OBJECTIVE
The objective of this notebook is to apply geometric and color-based augmentations to the **1008 professional + amateur tennis match frames** extracted in Module1_Step2, whose tennis ball positions have been labeled in YOLO format using the open-source tool **labelImg**. We use the Python package **albumentations** for frame augmentation. Data augmentation is expected to benefit the deep-learning-based YOLOv4-tiny object detection model.

Wherever a **## UPDATE** comment appears, the code (usually a path) must be updated to match your environment.

Libraries involved:
1. opencv-python-headless==4.1.2.30
2. albumentations
3. os
4. shutil
5. random
6. zipfile
7. watermark

Steps involved:
1. STEP1: Mounting Drive 
2. STEP2: Installing & Importing Libraries and setting working directory
3. STEP3: Loading the Image Folder, cleaning the class and bounding box text file if needed (the original bounding box files often contain multiple class IDs, which need to be revised because we are only detecting a single class, the tennis ball)
4. STEP4. Zipping the original image folder, as this zipped file will be used in the YOLOv4-tiny Object Detection process
5. STEP5. Augmenting the original images using 3 albumentations based augmentations. Only augmented images that retain a label are kept.
6. STEP6. Zipping the Augmented image folder, as this zipped file will be used in the YOLOv4-tiny Object Detection process
7. STEP7. Dependencies

Inputs will include:
1. Set of labeled Image files

Outputs will include:
1. Zipped file of labeled original images
2. Zipped file of labeled original+augmented images

Source:
1. https://analyticsindiamag.com/hands-on-guide-to-albumentation/
2. https://albumentations.ai/docs/getting_started/bounding_boxes_augmentation/
3. https://github.com/tzutalin/labelImg
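
For readers new to the label format: each line of a YOLO label file holds `class_id x_center y_center width height`, with the four box values normalised to the image size. The sketch below converts one such line to pixel corners. The function name `yolo_to_pixels` and the 1280x720 frame size are illustrative assumptions, not values taken from this project.

```python
def yolo_to_pixels(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, x_c, y_c, w, h = line.split()
    x_c, y_c, w, h = float(x_c), float(y_c), float(w), float(h)
    # YOLO stores the box centre and size as fractions of the image size
    x_min = (x_c - w / 2) * img_w
    y_min = (y_c - h / 2) * img_h
    x_max = (x_c + w / 2) * img_w
    y_max = (y_c + h / 2) * img_h
    return int(class_id), x_min, y_min, x_max, y_max


# Hypothetical label line and frame size, for illustration only
print(yolo_to_pixels("0 0.5 0.5 0.1 0.2", 1280, 720))
```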

## STEP1: Mounting Drive

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## STEP2: Installing & Importing Libraries

In [2]:
# Installing Libraries

!pip uninstall -y opencv-python-headless
!pip install opencv-python-headless==4.1.2.30
!pip install --upgrade albumentations
!pip install watermark

Collecting opencv-python-headless==4.1.2.30
  Downloading opencv_python_headless-4.1.2.30-cp37-cp37m-manylinux1_x86_64.whl (21.8 MB)
Installing collected packages: opencv-python-headless
Successfully installed opencv-python-headless-4.1.2.30
Collecting albumentations
  Downloading albumentations-1.1.0-py3-none-any.whl (102 kB)
Collecting qudida>=0.0.4
  Downloading qudida-0.0.4-py3-none-any.whl (3.5 kB)
Installing collected packages: qudida, albumentations
  Attempting uninstall: albumentations
    Found existing installation: albumentations 0.1.12
    Uninstalling albumentations-0.1.12:
      Successfully uninstalled albumentations-0.1.12
Successfully installed albumentations-1.1.0 qudida-0.0.4
Collecting watermark
  Downloading watermark-2.3.0-py2.py3-none-any.whl (7.2 kB)
Collecting importlib-metadata<3.0
  Downloading importlib_metadata-2.1.3-py2.py3-none-any.whl

In [None]:
# Restarting post installing libraries -if needed
import os
os.kill(os.getpid(), 9)

In [3]:
# importing libraries
import albumentations as A
import cv2
import os
import shutil
from zipfile import ZipFile
import random

In [4]:
# Set Directory path

path="/content/drive/MyDrive/CAPSTONE/CAPSTONE_FINAL/Module2_Object_Detection_Yolov4_tiny"  ## UPDATE
os.chdir(path) # Change Path

## STEP3: Loading the Image Folder, cleaning the class and bounding box text file if needed

Manually copy the *obj_proam_1008* folder (the output of manually labelling selected video frames from Module1_Step2 with labelImg) from the BIG_Files_Folders folder into the images_labels folder (Google Drive folder link: https://drive.google.com/drive/folders/1-skl0_iiKYZDJPM6hhFxCqmW-cgqGh3q?usp=sharing) inside the Module2_Object_Detection_Yolov4_tiny folder.

In [17]:
# Copy initial image folder
source_fldr = r"images_labels/obj_proam_1008" ## UPDATE
copy_folder=r"images_labels/obj_proam_1008_copy"  ## UPDATE
# Delete any previous copy, then copy afresh
if os.path.exists(copy_folder):
  shutil.rmtree(copy_folder)
shutil.copytree(source_fldr, copy_folder)


# Open all files in the image folder
folder1=copy_folder
file_list_yolo=os.listdir(folder1)

print(len(file_list_yolo))

2017


In [18]:
# Print available images and text files
image_file_list=[]
bbox_file_list=[]

for fil in file_list_yolo:
  ext=os.path.splitext(fil)[1] # robust to extra dots in the file name
  if ext==".jpg":
    image_file_list.append(fil)

  elif ext==".txt":
    bbox_file_list.append(fil)

print('image_file_list')
print(len(image_file_list))
print(image_file_list[:5])

print('-'*150)
print('-'*150)
print('bbox_file_list')
print(len(bbox_file_list))
print(bbox_file_list[:5])


image_file_list
1008
['vid_am_2_frame_756.jpg', 'vid_am_2_frame_868.jpg', 'vid_am_2_frame_882.jpg', 'vid_am_2_frame_770.jpg', 'vid_am_2_frame_910.jpg']
------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------
bbox_file_list
1009
['vid_am_4_frame_1560.txt', 'vid_am_4_frame_1620.txt', 'vid_am_4_frame_1650.txt', 'vid_am_4_frame_1710.txt', 'vid_am_4_frame_1800.txt']
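
The counts above (1008 images, 1009 text files) are consistent with one label file per image plus a `classes.txt` file. A check along the following lines could confirm that no image is missing its label. This is a minimal sketch: `find_unpaired` and the sample file names are hypothetical, and it assumes images end in `.jpg` and labels in `.txt`.

```python
import os


def find_unpaired(file_names):
    """Return (images without a label file, label files without an image)."""
    images = {os.path.splitext(f)[0] for f in file_names if f.endswith(".jpg")}
    labels = {os.path.splitext(f)[0] for f in file_names if f.endswith(".txt")}
    labels.discard("classes")  # classes.txt is not tied to any single image
    return sorted(images - labels), sorted(labels - images)


# Hypothetical folder listing, for illustration only
names = ["a_frame_1.jpg", "a_frame_1.txt", "a_frame_2.jpg", "classes.txt"]
missing_labels, orphan_labels = find_unpaired(names)
print("images without labels:", missing_labels)
print("labels without images:", orphan_labels)
```

In the notebook, the same function could be called on `file_list_yolo`.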


In [19]:
## Modify classes file and bbox file contents

# Modifying classes text file so that it lists the single class 'ball'
for fil in bbox_file_list:
  fil_path=folder1+"/"+str(fil)
  if str(fil).split('.')[0]=="classes":
    with open(fil_path,'w') as f:
      f.write('ball')

    


# Modifying bbox file
for fil in bbox_file_list:
  fil_path=folder1+"/"+str(fil)
  if str(fil).split('.')[0]!="classes":

    with open(fil_path,'r') as f:
      contents = f.readlines()

    # Keep only the ball annotations and rewrite their class ID as 0
    contents_2=[]
    for c1 in contents:
      class_id=c1.split(' ')[0]
      # '15': the class list was not updated in labelimg, so the added ball class became class 15
      # '0' : the class list was updated in labelimg, so the ball is already class 0
      if class_id in ('15','0'):
        c2='0'+' '+' '.join(c1.split(' ')[1:])
        contents_2.append(c2)

    # Write the cleaned annotations back to the bbox file
    with open(fil_path,'w') as f:
      f.writelines(contents_2)
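
To make the effect of the cleaning cell concrete, here is a small stand-alone sketch of the same rule applied to in-memory lines: annotations with class ID 15 or 0 are kept and rewritten as class 0, and any other class is dropped. The helper name `remap_to_single_class` and the sample lines are made up for illustration.

```python
def remap_to_single_class(lines, ball_ids=("0", "15")):
    """Keep only ball annotations and rewrite their class ID as 0."""
    cleaned = []
    for line in lines:
        parts = line.split()
        if parts and parts[0] in ball_ids:
            cleaned.append(" ".join(["0"] + parts[1:]) + "\n")
    return cleaned


# Hypothetical label lines: a ball saved as class 15, a ball saved as class 0,
# and an unrelated class that should be discarded
sample = [
    "15 0.51 0.42 0.02 0.03\n",
    "0 0.10 0.20 0.02 0.03\n",
    "3 0.50 0.50 0.10 0.10\n",
]
print(remap_to_single_class(sample))
```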



## STEP4. Zipping the original image folder, as this zipped file will be used in the YOLOv4-tiny Object Detection process

In [20]:
# Zipping the folder of original images

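# Zip the folder
# Note: this archives obj_proam_1008 itself; the labels cleaned in STEP3 are in obj_proam_1008_copy
full_folder="images_labels/obj_proam_1008" ## UPDATE
zip_folder="ZIPPED_images_labels/obj_proam_1008_original_zip" ## UPDATE
shutil.make_archive(zip_folder, 'zip',full_folder)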

# Original folder count
original_len=len(os.listdir(full_folder))


# Count contents in the zipped folder
with ZipFile("ZIPPED_images_labels/obj_proam_1008_original_zip.zip", 'r') as zipObj: ## UPDATE
   # Get list of files names in zip
  listOfiles = zipObj.namelist()
  zip_len=len(listOfiles)

# Are the number of files pre and post zipping same
print ('Difference in file count pre and post zipping',original_len-zip_len)

Difference in file count pre and post zipping 0
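
Comparing counts catches a truncated archive, but comparing names is a slightly stronger check. The sketch below zips a throwaway folder and reports any names that differ between disk and archive. It is an illustration only: `zip_and_compare` and the demo file names are hypothetical, and it assumes a flat folder with no sub-directories, as is the case for the labeled image folder here.

```python
import os
import shutil
import tempfile
from zipfile import ZipFile


def zip_and_compare(folder, archive_base):
    """Zip `folder` and return (names missing from the zip, unexpected names in the zip)."""
    archive_path = shutil.make_archive(archive_base, "zip", folder)
    on_disk = set(os.listdir(folder))
    with ZipFile(archive_path) as zf:
        # Ignore directory entries, which end with "/"
        in_zip = {n for n in zf.namelist() if not n.endswith("/")}
    return on_disk - in_zip, in_zip - on_disk


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "frames")
    os.mkdir(src)
    for name in ("f1.jpg", "f1.txt", "classes.txt"):
        with open(os.path.join(src, name), "w") as fh:
            fh.write("demo")
    missing, extra = zip_and_compare(src, os.path.join(tmp, "frames_zip"))
    print("missing from zip:", missing, "| unexpected in zip:", extra)
```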


## STEP5. Augmenting the original images using 3 albumentations based augmentations. Only augmented images that retain a label are kept.


In [21]:
## Setting random seed to ensure reproducibility of augmentation
random.seed(42)

# Geometric: random 500x500 crop
transform1 = A.Compose([
    A.RandomCrop(width=500, height=500),
], bbox_params=A.BboxParams(format='yolo'))

# Geometric: horizontal flip, always applied
transform2 = A.Compose([
    A.HorizontalFlip(p=1),
], bbox_params=A.BboxParams(format='yolo'))

# Color: random brightness and contrast change, always applied
transform3 = A.Compose([
    A.RandomBrightnessContrast(p=1),
], bbox_params=A.BboxParams(format='yolo'))
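
Passing `bbox_params` is what lets the boxes follow the image through each transform. As an illustration of the arithmetic involved for the horizontal flip, a YOLO box only needs its `x_center` mirrored, while a brightness or contrast change leaves the box untouched. The helper `hflip_yolo` below is a hypothetical stand-alone sketch of that arithmetic and not the library's own code.

```python
def hflip_yolo(box):
    """Mirror a normalised YOLO box (x_center, y_center, width, height) left to right."""
    x_c, y_c, w, h = box
    # Only the horizontal centre changes; the size and vertical position are preserved
    return (1.0 - x_c, y_c, w, h)


# Hypothetical ball box in the left quarter of the frame
print(hflip_yolo((0.25, 0.40, 0.02, 0.03)))
```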





In [22]:
%%time
# Save a set of Augmentations and BBoxes for Input Images

# Create Folder and copy contents
source_fldr="images_labels/obj_proam_1008"  ## UPDATE
destination_fldr = r"images_labels/obj_proam_1008_aug" ## UPDATE
try:
  shutil.copytree(source_fldr, destination_fldr)
except:
  # delete previous folder if present
  shutil.rmtree(destination_fldr)
  shutil.copytree(source_fldr, destination_fldr)


folder1=destination_fldr

aug_lst=[transform1,transform2,transform3,
                 ]

aug_lst_name=['transform1','transform2','transform3',
                 ]

for img in image_file_list[:]: 
  bbox_file=img.split('.')[0]+'.txt'
  
  image = cv2.imread(folder1+"/"+img) 
  image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

 

  # Read Bbox
  bbox_file_path=folder1+"/"+bbox_file

  # Each label line is "class x_center y_center width height":
  # drop the class ID, convert the box values to float and append the label 'ball'
  bboxes=[]
  with open(bbox_file_path, 'r') as fil:
    for line in fil:
      line_list=line.strip().split()
      if line_list: # skip blank lines
        bboxes.append([float(a) for a in line_list[1:]]+['ball'])



  for idx,val in enumerate(aug_lst): # apply each augmentation in turn
    transformed=val(image=image,bboxes=bboxes)
    transformed_image = transformed['image']
    transformed_bboxes = transformed['bboxes']


    aug_nam=aug_lst_name[idx]

    # Save Transformed txt file
    save_txt_path=folder1+"/"+img.split('.')[0]+"_"+aug_nam+".txt"
  
    txt=''
    for s in transformed_bboxes:
      # Drop the trailing 'ball' label, keeping the four box values
      s=s[:-1]

      s=' '.join(map(str,s))
      s='0 '+s
      txt+=s+'\n'

    # Keep only the augmented images that still have an annotation, as an augmentation (e.g. a crop) can remove the part of the image containing the ball
    if len(transformed_bboxes)>0:
      # Save Transformed image and text
      save_image_path=folder1+"/"+img.split('.')[0]+"_"+aug_nam+".jpg"
      # cv2.imwrite expects BGR, so convert back from the RGB used for augmentation
      cv2.imwrite(save_image_path, cv2.cvtColor(transformed_image, cv2.COLOR_RGB2BGR))
      with open(save_txt_path, "w") as f:
        f.write(txt)

CPU times: user 1min 32s, sys: 7.08 s, total: 1min 39s
Wall time: 3min 37s
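
The loop above discards an augmented image when no bounding box survives, which mainly concerns the random crop. The sketch below shows, in simplified form, how a YOLO box can be re-expressed relative to a crop window and why it can disappear. The function `crop_yolo_box`, the 1280x720 frame size and the crop offsets are assumptions for illustration; the library's own filtering has further options (for example `min_area` and `min_visibility` in `A.BboxParams`).

```python
def crop_yolo_box(box, img_w, img_h, x0, y0, crop_w, crop_h):
    """Re-express a YOLO box relative to a crop window; return None if nothing remains."""
    x_c, y_c, w, h = box
    # Pixel corners of the box in the original image
    x_min, x_max = (x_c - w / 2) * img_w, (x_c + w / 2) * img_w
    y_min, y_max = (y_c - h / 2) * img_h, (y_c + h / 2) * img_h
    # Clip the box to the crop window
    cx_min, cx_max = max(x_min, x0), min(x_max, x0 + crop_w)
    cy_min, cy_max = max(y_min, y0), min(y_max, y0 + crop_h)
    if cx_max <= cx_min or cy_max <= cy_min:
        return None  # the ball lies entirely outside the crop
    # Normalise the clipped box by the crop size
    return (
        ((cx_min + cx_max) / 2 - x0) / crop_w,
        ((cy_min + cy_max) / 2 - y0) / crop_h,
        (cx_max - cx_min) / crop_w,
        (cy_max - cy_min) / crop_h,
    )


ball = (0.5, 0.5, 0.02, 0.04)  # hypothetical ball at the centre of a 1280x720 frame
print(crop_yolo_box(ball, 1280, 720, x0=400, y0=200, crop_w=500, crop_h=500))  # ball kept
print(crop_yolo_box(ball, 1280, 720, x0=0, y0=0, crop_w=500, crop_h=500))      # ball lost
```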


In [23]:
import os

folder_len=len(os.listdir(folder1))

print('folder_len',folder_len)



folder_len 7051
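
The file count can be broken down to see how many augmentations were actually kept. The short calculation below assumes that the source folder holds the same 2017 files counted in STEP3 and that every kept augmentation writes exactly one `.jpg` and one `.txt`.

```python
original_files = 2017    # files counted in STEP3
files_after_aug = 7051   # folder_len printed above
images = 1008
transforms = 3

# Every kept augmentation adds one image file and one label file
kept_augmentations = (files_after_aug - original_files) // 2
attempted_augmentations = images * transforms
dropped_augmentations = attempted_augmentations - kept_augmentations

print(kept_augmentations, attempted_augmentations, dropped_augmentations)
# 2517 3024 507
```

Under these assumptions, 2517 of the 3024 attempted augmentations were saved, and 507 were skipped because no bounding box remained (for example when the random crop excluded the ball).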


## STEP6. Zipping the Augmented image folder, as this zipped file will be used in the YOLOv4-tiny Object Detection process

In [24]:
# Zip the folder
full_folder="images_labels/obj_proam_1008_aug"  ## UPDATE
zip_folder="ZIPPED_images_labels/obj_proam_1008_aug_zip" ## UPDATE
shutil.make_archive(zip_folder, 'zip',full_folder)

'/content/drive/MyDrive/CAPSTONE/CAPSTONE_FINAL/Module2_Object_Detection_Yolov4_tiny/ZIPPED_images_labels/obj_proam_1008_aug_zip.zip'

In [25]:
# Count contents in the zipped folder
from zipfile import ZipFile

with ZipFile("ZIPPED_images_labels/obj_proam_1008_aug_zip.zip", 'r') as zipObj: # UPDATE
   # Get list of files names in zip
  listOfiles = zipObj.namelist()
  zip_len=len(listOfiles)
print(zip_len)

7051


In [26]:
# Are the number of files pre and post zipping same
print ('Difference in file count pre and post zipping',folder_len-zip_len)

Difference in file count pre and post zipping 0


In [27]:
print('End of Augmentation for 1008 proam Images')

End of Augmentation for 1008 proam Images


## STEP7. Dependencies

In [28]:
# Dependencies
%reload_ext watermark
%watermark
%watermark --iversions

Last updated: 2022-04-20T01:25:44.945649+00:00

Python implementation: CPython
Python version       : 3.7.13
IPython version      : 5.5.0

Compiler    : GCC 7.5.0
OS          : Linux
Release     : 5.4.144+
Machine     : x86_64
Processor   : x86_64
CPU cores   : 2
Architecture: 64bit

google        : 2.0.3
IPython       : 5.5.0
albumentations: 1.1.0
cv2           : 4.1.2

