<a href="https://colab.research.google.com/github/ArthurCalvi/Classifieur-Bois/blob/master/Neural_net_Preprocessing_images.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Wood classification: preprocessing images**

Before an image is passed to the neural network for prediction, it must be prepared: the network takes an input of fixed dimension, a square image of 256x256 pixels. In this notebook, the images contained in the Google Drive *IMAGES_raw* folder are prepared and saved in the Google Drive *IMAGES_preprocessed* folder.

**Part 0 - Google Drive Access** 

In this part we mount your Google Drive, which holds the folder of raw images and the folder where the prepared images will be saved.


*NB-1: As stated in the README, if this is your first time running this notebook, please create a "Project_google_colab" folder in your Google Drive.*

*NB-2: Simply run the two lines of code below, then click on the generated link to authorize access to your Google Drive. Copy the key shown on that page and enter it below.*


In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


***First run***: The GitHub repository containing the code, models and image directories is cloned to your Google Drive at the following path: Google Drive -> Project_google_colab -> Classifieur-Bois.

In [None]:
%cd /content/drive/My Drive/Project_google_colab
! git clone https://github.com/ArthurCalvi/Classifieur-Bois

/content/drive/My Drive/Project_google_colab
Cloning into 'Classifieur-Bois'...
remote: Enumerating objects: 34, done.
remote: Counting objects: 100% (34/34), done.
remote: Compressing objects: 100% (32/32), done.
remote: Total 34 (delta 5), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (34/34), done.


***Subsequent runs***: Run this cell if you want to update your copy of the GitHub repository.

In [None]:
%cd /content/drive/My Drive/Project_google_colab/Classifieur-Bois
! git pull

[Errno 2] No such file or directory: 'gdrive/My Drive/Classifieur-Bois'
/content/drive/My Drive/Project_google_colab
fatal: not a git repository (or any parent up to mount point /content)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
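
The first-run and update cells above can be folded into a single step that picks the right git command by checking whether the repository has already been cloned. The following is a minimal sketch, not part of the original notebook: the helper name `sync_command` is hypothetical, and it assumes the clone lives in a sub-folder named after the last component of the repository URL.

```python
import os

def sync_command(project_dir, repo_url):
    """Return the git command (as an argument list) that clones or updates the repo."""
    # Assumption: the local folder is named after the last component of the URL.
    repo_name = repo_url.rstrip("/").split("/")[-1]
    if repo_name.endswith(".git"):
        repo_name = repo_name[:-4]
    repo_dir = os.path.join(project_dir, repo_name)

    if os.path.isdir(os.path.join(repo_dir, ".git")):
        # Already cloned: update the existing copy in place.
        return ["git", "-C", repo_dir, "pull"]
    # First run: clone into the project folder.
    return ["git", "clone", repo_url, repo_dir]
```

The returned list can be passed to `subprocess.run(..., check=True)`. Checking for the `.git` folder first avoids running `git pull` in a directory that is not a repository.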


**Part 1 - Imports**



In [None]:
#Import of the PIL image management library
from PIL import Image 

#Import of the os library to manipulate files
import os

**Part 2 - Parameters**

In this part, we define the parameters such as the image width and the path to retrieve the images and save them.

In [None]:
#image width in px
desired_size = 256

#Directory to retrieve raw images
access = '/content/drive/My Drive/Project_google_colab/Classifieur-Bois/IMAGES_raw'

#Directory to save preprocessed images
save = '/content/drive/My Drive/Project_google_colab/Classifieur-Bois/IMAGES_preprocessed'
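
Because both paths point into the mounted Drive, it can help to check them before any image is processed. Here is a small sketch under stated assumptions: `check_directories` is a hypothetical helper, not part of the original notebook, and creating the output folder automatically is a design choice of this sketch.

```python
import os

def check_directories(raw_dir, out_dir):
    """Verify raw_dir exists, create out_dir if missing, return the raw file count."""
    if not os.path.isdir(raw_dir):
        raise FileNotFoundError("Raw image folder not found: {}".format(raw_dir))
    # Create the output folder (and any missing parents) if it does not exist yet.
    os.makedirs(out_dir, exist_ok=True)
    # Count regular files only; sub-folders are ignored.
    return sum(
        os.path.isfile(os.path.join(raw_dir, name)) for name in os.listdir(raw_dir)
    )
```

With the parameters above, this would be called as `check_directories(access, save)`.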

**Part 3 - Preprocessing images**

In this part we define the function that will perform the following tasks:

1. Resize the image, keeping its aspect ratio, so that its smallest side is 256 pixels.
2. Crop the longest side so that it is also 256 pixels, in order to obtain a square image.
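
To make the two steps concrete, the arithmetic can be worked through without opening any image. The sketch below is illustrative only: `resize_and_crop_geometry` is a hypothetical helper, and it assumes the crop is centered along whichever side is longer.

```python
def resize_and_crop_geometry(width, height, target=256):
    """Return the resized dimensions and the crop box (left, upper, right, lower)."""
    # Step 1: scale so that the smallest side equals target.
    ratio = target / min(width, height)
    new_w, new_h = round(ratio * width), round(ratio * height)
    # Step 2: center a target x target window on the resized image (assumption).
    left = (new_w - target) // 2
    top = (new_h - target) // 2
    return (new_w, new_h), (left, top, left + target, top + target)

# Landscape 1024x768: resized to 341x256, then columns 42..298 are kept.
print(resize_and_crop_geometry(1024, 768))  # ((341, 256), (42, 0, 298, 256))
# Portrait 600x800: resized to 256x341, then rows 42..298 are kept.
print(resize_and_crop_geometry(600, 800))   # ((256, 341), (0, 42, 256, 298))
```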


In [None]:
def prep_image(image):
    """Take an image of any size and aspect ratio, opened with the PIL
    Image.open method, and return a square image of side 256 pixels.
    The image is first resized so that its smallest side is 256 pixels;
    the largest side is then cropped to obtain a square format."""

    #get the image dimensions
    width,height = image.size

    #get the smallest side
    small_side = min(width,height)

    #definition of the ratio between the old and the new image
    ratio = desired_size / small_side

    #scaling (Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter)
    image = image.resize((round(ratio*width), round(ratio*height)), Image.LANCZOS)

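    #get the dimensions of the resized image
    width,height = image.size

    #definition of the region to crop, centered on both axes so that
    #portrait images (height > width) are cropped around their center too
    left = (width - desired_size) // 2
    top = (height - desired_size) // 2
    crop_region = (left, top, left + desired_size, top + desired_size)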

    #cropping 
    image = image.crop(crop_region)

    return image

**Part 4 - Preprocessing images of a folder and saving**

In this part we preprocess all the images in the *Classifieur-Bois/IMAGES_raw* folder and save them in the *Classifieur-Bois/IMAGES_preprocessed* folder. The raw images are then deleted from the *Classifieur-Bois/IMAGES_raw* folder.
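
Since the raw files are deleted after processing, it may be worth previewing what will happen before running the cell. The following dry-run sketch only lists source files and the output names they would get. `planned_outputs` is a hypothetical helper, not part of the original notebook, and it assumes the `<name>_<size>.jpg` naming convention used when saving.

```python
import os

def planned_outputs(raw_dir, out_dir, size=256):
    """List (source, destination) pairs without reading, writing or deleting files."""
    plan = []
    for name in sorted(os.listdir(raw_dir)):
        src = os.path.join(raw_dir, name)
        # Sub-folders are skipped; only regular files are considered.
        if os.path.isfile(src):
            stem, _ext = os.path.splitext(name)
            dst = os.path.join(out_dir, "{}_{}.jpg".format(stem, size))
            plan.append((src, dst))
    return plan
```

For example, `for src, dst in planned_outputs(access, save): print(src, '->', dst)` shows each file that would be converted and then removed.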

In [None]:
def prep_and_save(path):
    """This function browses the images in the folder given by path, saves
    the preprocessed version of each one in the IMAGES_preprocessed folder,
    then deletes the raw file."""

    dirs = os.listdir(path)
    i=0

    for item in dirs:
        
        super_path = path+"/"+item
        
        if os.path.isfile(super_path):
            
            image = Image.open(super_path)
            
            filename_w_ext = os.path.basename(super_path)
            filename, extension = os.path.splitext(filename_w_ext)

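            #JPEG has no alpha channel: convert to RGB so that PNG or
            #palette images can be saved without raising an error
            image = prep_image(image).convert('RGB')

            image.save(os.path.join(save, "{}_{}.jpg".format(filename, desired_size)), 'JPEG')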
            os.remove(super_path)

            i+=1

    print('{} images have been prepared and saved to Google Drive: My Drive/Project_google_colab/Classifieur-Bois/IMAGES_preprocessed'.format(i))


prep_and_save(access)

1 images have been prepared and saved to Google Drive My Drive/Classification_Bois/Images_préparées
