## Basic scripting with Python

Create or find a small dataset of images, using an online data source such as Kaggle. Your dataset should contain at least 10 images.

Write a Python script which does the following:
- For each image, find the width, height, and number of channels
- For each image, split image into four equal-sized quadrants (i.e. top-left, top-right, bottom-left, bottom-right)
- Save each of the split images in JPG format
- Create and save a file containing the filename, width, height for all of the new images.


__Import libraries__

In [1]:
import os
import numpy as np 
import cv2 
import csv 
from pathlib import Path 

In [2]:
import sys
sys.path.append(os.path.join(".."))
from utils.imutils import jimshow #Import utility function jimshow

__Functions__

This function saves an image as a JPG file. It takes an image object, the destination path and the file name (without the destination path or file extension) as parameters.

In [3]:
def save_img(img, path, name):
    outfile = os.path.join(path, f"{name}.jpg")
    cv2.imwrite(outfile, img)
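
One thing to keep in mind is that `cv2.imwrite` reports failure through its boolean return value rather than by raising an error, for example when the destination folder is missing. The following is a minimal sketch, using only the standard library, of how the output path could be prepared before saving. The helper name `prepare_outfile` is hypothetical and not part of the notebook.

```python
import os
import tempfile


def prepare_outfile(path, name):
    """Create the destination folder if needed and return the full .jpg path.

    Hypothetical helper, not part of the original notebook.
    """
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return os.path.join(path, f"{name}.jpg")


# Usage with a throwaway directory instead of the real data folder
with tempfile.TemporaryDirectory() as tmp:
    destination = os.path.join(tmp, "cane", "splitted_images")
    outfile = prepare_outfile(destination, "file-0-left-upper-corner")
    folder_created = os.path.isdir(destination)
    print(os.path.basename(outfile), folder_created)
```

The returned path could then be passed to `cv2.imwrite`, and its return value checked to detect a failed save.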

This function splits an image object into four quadrants. The split points are converted to integers, since array slice indices must be integers; if the height or width is odd, the quadrants differ in size by one pixel. It always returns a list called "all_imgs", which contains the split images in the following order:
- 0: upper left corner
- 1: lower left corner
- 2: upper right corner
- 3: lower right corner

In [4]:
def split_img(img):
    #Find half of the width and height (integer division, since slice indices must be integers)
    split_x = img.shape[1] // 2
    split_y = img.shape[0] // 2
    
    #Split img 
    upper_left_corner = img[0:split_y, 0:split_x]
    bottom_left_corner = img[split_y:, 0:split_x]
    upper_right_corner = img[0:split_y, split_x:]
    bottom_right_corner = img[split_y:, split_x:]
    
    #Save results in an array
    all_imgs = [upper_left_corner, bottom_left_corner, upper_right_corner, bottom_right_corner]
    return all_imgs
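
To illustrate what the slicing does when the dimensions are odd, here is a small worked example on a dummy array instead of a real image. It is only a sketch: the function name `split_quadrants` and the 5 x 7 test array are made up for illustration, and the slicing mirrors the function above.

```python
import numpy as np


def split_quadrants(img):
    """Same slicing idea as split_img, using integer division."""
    split_y = img.shape[0] // 2  # half of the height, rounded down
    split_x = img.shape[1] // 2  # half of the width, rounded down
    return [
        img[:split_y, :split_x],   # upper left
        img[split_y:, :split_x],   # lower left
        img[:split_y, split_x:],   # upper right
        img[split_y:, split_x:],   # lower right
    ]


# A dummy "image" with odd height (5), odd width (7) and 3 channels
dummy = np.zeros((5, 7, 3), dtype=np.uint8)
shapes = [q.shape for q in split_quadrants(dummy)]
print(shapes)
# [(2, 3, 3), (3, 3, 3), (2, 4, 3), (3, 4, 3)]
```

The four pieces still cover every pixel exactly once; the lower and right quadrants simply receive the extra row and column.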

This function adds a row to a csv-file with 5 columns: "folder", "filename", "height", "width" and "channels". It requires that an open writer object called "writer" exists when it is called.

In [5]:
def add_csv_row(folder, file_name, height, width, channels):
    writer.writerow({"folder": folder,
                     "filename": file_name,
                     "height": height,
                     "width": width,
                     "channels": channels})

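Because the function above relies on a variable named "writer" existing outside of it, one possible variation is to pass the writer in as a parameter. The sketch below shows this with an in-memory buffer in place of a real file. The function name `write_image_row` and the example values are hypothetical.

```python
import csv
import io

FIELDNAMES = ["folder", "filename", "height", "width", "channels"]


def write_image_row(writer, folder, file_name, height, width, channels):
    """Write one row; the writer is passed in instead of read from a global."""
    writer.writerow({"folder": folder,
                     "filename": file_name,
                     "height": height,
                     "width": width,
                     "channels": channels})


# An in-memory buffer stands in for a real csv-file
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
writer.writeheader()
write_image_row(writer, "cane/splitted_images", "file-0-left-upper-corner", 150, 112, 3)
lines = buffer.getvalue().splitlines()
print(lines)
```

Passing the writer explicitly makes the function usable from other notebooks or scripts without depending on the order in which cells were run.
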
My data contains multiple subfolders, one per animal, and each of them has a subfolder for the split images:
- raw-img
    - cane
        - splitted_images
    - cavallo
        - splitted_images
    - elefante
        - splitted_images
    - etc
        - etc.
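
A folder layout like this can be walked with `pathlib`. The following is a minimal sketch that builds a tiny stand-in for "raw-img" in a temporary directory and lists the images per animal folder. The function name `list_images`, the set of file extensions and the dummy file names are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions


def list_images(main_folder):
    """Map each animal subfolder name to its sorted image filenames."""
    result = {}
    for folder in sorted(Path(main_folder).iterdir()):
        if not folder.is_dir():
            continue  # skip stray files next to the animal folders
        result[folder.name] = sorted(
            f.name for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return result


# Build a tiny stand-in for raw-img in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "raw-img"
    for animal in ("cane", "cavallo"):
        (root / animal / "splitted_images").mkdir(parents=True)
        (root / animal / "1.jpeg").touch()
    (root / "notes.txt").touch()  # a non-image file that should be ignored
    found = list_images(root)
    print(found)
```

Checking `is_dir()` and the file suffix keeps the "splitted_images" subfolders and any non-image files out of the list of images to process.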

I start by creating a new csv-file.
I then use a loop to navigate between the folders. With a nested loop I read each image, find its height, width and number of channels, and split it.

With another nested loop I save each split image in the subfolder called "splitted_images" and add a new row to the csv-file. Since I don't know how to use regexes in Python yet, I use an index (the variable "img_index") to name the new images. I also use a second index, "splitted_img_index", to determine whether the new file should be called "left-upper-corner", "left-lower-corner", "right-upper-corner" or "right-lower-corner".
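
The index-based naming described here can also be written as a lookup in a tuple of suffixes, which avoids comparing the index in a chain of conditions. This is only a sketch of that idea. The names `QUADRANT_NAMES` and `quadrant_filenames` are hypothetical, and the order of the suffixes assumes the order in which the split function returns the quadrants.

```python
# Suffixes in the same order that the split function returns the quadrants
QUADRANT_NAMES = ("left-upper-corner", "left-lower-corner",
                  "right-upper-corner", "right-lower-corner")


def quadrant_filenames(img_index):
    """Return the four output names for the image with the given index."""
    return [f"file-{img_index}-{suffix}" for suffix in QUADRANT_NAMES]


names = quadrant_filenames(0)
print(names)
```

The names could then be paired with the split images using `zip`, so that each quadrant is saved and logged in a single loop body.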

__Loop__

In [6]:
main_folder = os.path.join("..", "data", "raw-img") #Path to folders with images
csv_path = os.path.join("..", "data", "new_imgs.csv") #Path to the new csv-file

with open(csv_path, mode="w", newline="") as csv_file: #Create and open new csv-file; newline="" is recommended for the csv module and avoids blank rows on some platforms
    writer = csv.DictWriter(csv_file, fieldnames=["folder", "filename", "height", "width", "channels"]) #create writer object
    writer.writeheader() #add headers
    
    #for each subfolder in raw-img
    for folder_path in Path(main_folder).glob("*"):
        print(f"---------------------------------------------FOLDER {str(folder_path)}!-----------------------------------------")
        img_index = 0 #image index in the folder. Used for naming new files
        img_destination = os.path.join(folder_path, "splitted_images") #Destination for splitted images. Used for saving images
    
        #For each file in the subfolder. 
        for file in Path(folder_path).glob("*.*"):
            file = str(file) #Convert filepath to a string.
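            original_img = cv2.imread(file) #Read the image
            if original_img is None: #cv2.imread returns None for files it cannot read as an image
                continue #Skip this file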
            height = original_img.shape[0] #Calculate height
            width = original_img.shape[1] #Calculate width
            channels = original_img.shape[2] #Calculate number of channels
            
            print(f"file {file}, Height: {height}, Width: {width}, Channels: {channels}") #Print the image info
            
            splitted_img_index = 0 #Set index for splitted images to 0
            for splitted_img in split_img(original_img): #For each splitted image in the returned array from split_img
                height_splitted = splitted_img.shape[0] #calculate height
                width_splitted = splitted_img.shape[1] #calculate width
                channels_splitted = splitted_img.shape[2] #Calculate channels
                
                if splitted_img_index == 0: #If index for splitted image is 0
                    img_name =  f"file-{str(img_index)}-left-upper-corner" #Use this name
                    save_img(splitted_img, img_destination, img_name) #Save image
                    add_csv_row(img_destination, img_name, height_splitted, width_splitted, channels_splitted) #Add row to csv
                    
                    splitted_img_index = splitted_img_index + 1 #Add 1 to index for splitted images
                
                elif splitted_img_index == 1: #Else If index for splitted image is 1
                    img_name =  f"file-{str(img_index)}-left-lower-corner" #Use this name
                    save_img(splitted_img, img_destination, img_name) #Save image
                    add_csv_row(img_destination, img_name, height_splitted, width_splitted, channels_splitted) #Add row to csv
                    
                    splitted_img_index = splitted_img_index + 1 #Add 1 to index for splitted images
                            
                elif splitted_img_index == 2: #Else If index for splitted image is 2
                    img_name =  f"file-{str(img_index)}-right-upper-corner"  #Use this name
                    save_img(splitted_img, img_destination, img_name) #Save image
                    add_csv_row(img_destination, img_name, height_splitted, width_splitted, channels_splitted) #Add row to csv
                    
                    splitted_img_index = splitted_img_index + 1 #Add 1 to index for splitted images
                    
                else: #The last index will always be 3, so no condition is required
                    img_name =  f"file-{str(img_index)}-right-lower-corner" #Use this name
                    save_img(splitted_img, img_destination, img_name) #Save image
                    add_csv_row(img_destination, img_name, height_splitted, width_splitted, channels_splitted) #Add row to csv
                    #No need to add 1 to the splitted images index, since the loop will end.
                    
            img_index = img_index + 1 #Add one to image index 

---------------------------------------------FOLDER ../data/raw-img/cane!-----------------------------------------
file ../data/raw-img/cane/OIP-_5yuhCcjtUE3kLT33YEvHQHaJ4.jpeg, Height: 300, Width: 225, Channels: 3
file ../data/raw-img/cane/OIP-_3acmW_iSr12XgQTNz0IdQHaFj.jpeg, Height: 225, Width: 300, Channels: 3
file ../data/raw-img/cane/OIF-e2bexWrojgtQnAPPcUfOWQ.jpeg, Height: 225, Width: 300, Channels: 3
file ../data/raw-img/cane/OIP-_3S-iEDMQnko7ZHgq_FTcwHaEL.jpeg, Height: 169, Width: 300, Channels: 3
file ../data/raw-img/cane/OIP-_4M8lLVlk06o0YOtolSlvQHaHL.jpeg, Height: 291, Width: 300, Channels: 3
file ../data/raw-img/cane/OIP-_5Em--O1RA44HxiWK_ybawHaF4.jpeg, Height: 238, Width: 300, Channels: 3
file ../data/raw-img/cane/OIP-_5GCQGVN9m7ed1UX_dUtTQHaFv.jpeg, Height: 233, Width: 300, Channels: 3
file ../data/raw-img/cane/OIP-_2Itmpob3Q0nbJKrHvtnfAHaJ3.jpeg, Height: 300, Width: 226, Channels: 3
file ../data/raw-img/cane/OIP-_2iBsOsobKZsP76-9Cd-qAHaEM.jpeg, Height: 170, Width: 300, C

file ../data/raw-img/ragno/e034b90b20f11c22d2524518b7444f92e37fe5d404b0144390f8c47ba6ebb4_640.jpg, Height: 401, Width: 640, Channels: 3
file ../data/raw-img/ragno/e83cb00a2bf0053ed1584d05fb1d4e9fe777ead218ac104497f5c97ca5ecb3b9_640.jpg, Height: 426, Width: 640, Channels: 3
file ../data/raw-img/ragno/e83cb60f2dfd1c22d2524518b7444f92e37fe5d404b0144390f8c47ba7ebb0_640.jpg, Height: 423, Width: 640, Channels: 3
file ../data/raw-img/ragno/e83cb30c2bf6043ed1584d05fb1d4e9fe777ead218ac104497f5c97ca5ecb5b1_640.jpg, Height: 426, Width: 640, Channels: 3
file ../data/raw-img/ragno/e83cb0062ff5073ed1584d05fb1d4e9fe777ead218ac104497f5c97ca5ecb3b9_640.jpg, Height: 422, Width: 640, Channels: 3
file ../data/raw-img/ragno/e83cb2082bfd013ed1584d05fb1d4e9fe777ead218ac104497f5c97ca5ecb5b1_640.jpg, Height: 480, Width: 640, Channels: 3
---------------------------------------------FOLDER ../data/raw-img/gatto!-----------------------------------------
file ../data/raw-img/gatto/10.jpeg, Height: 188, Width: 300,