<a href="https://colab.research.google.com/github/ArthurCalvi/Classifieur-Bois/blob/master/Filter.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **WOOD CLASSIFIER - FILTER**

In this notebook, I provide code that filters false positives out of a dataset with the help of a neural network. The network is used as a binary classifier: only the images it scores as defects are kept.

## **PART 0 : Google Drive Access**


In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


***First run:*** The GitHub repository containing the code, models and image directories is cloned to your Google Drive at the following path: Google Drive -> Project_google_colab -> Classifieur-Bois.

In [None]:
%cd /content/drive/My Drive/Project_google_colab
! git clone https://github.com/ArthurCalvi/Classifieur-Bois

/content/drive/My Drive/Project_google_colab
fatal: destination path 'Classifieur-Bois' already exists and is not an empty directory.


***Subsequent runs:*** To update your copy of the GitHub repository:

In [2]:
%cd /content/drive/My Drive/Project_google_colab/Classifieur-Bois
! git pull

/content/drive/My Drive/Project_google_colab/Classifieur-Bois
remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 1), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.
From https://github.com/ArthurCalvi/Classifieur-Bois
   95d35b0..95e17a7  master     -> origin/master
Updating 95d35b0..95e17a7
Fast-forward
 Creation_of_your_dataset.ipynb | 422 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 422 insertions(+)
 create mode 100644 Creation_of_your_dataset.ipynb
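
The two cells above differ only in whether the repository folder already exists: cloning into an existing folder fails, as the first output shows. A minimal sketch of how that choice could be automated is given below. The helper `sync_command` is hypothetical (it is not part of the repository), and the paths are copied from the cells above on the assumption that your Drive uses the same layout.

```python
import os

# Assumed locations, copied from the cells above; adjust to your own Drive layout.
PROJECT_DIR = "/content/drive/My Drive/Project_google_colab"
REPO_URL = "https://github.com/ArthurCalvi/Classifieur-Bois"
REPO_NAME = "Classifieur-Bois"


def sync_command(project_dir=PROJECT_DIR, repo_name=REPO_NAME, repo_url=REPO_URL):
    """Return the git command (as an argument list) suited to the current state.

    Hypothetical helper: clone when the repository is missing,
    pull when a clone is already present.
    """
    repo_dir = os.path.join(project_dir, repo_name)
    if os.path.isdir(os.path.join(repo_dir, ".git")):
        # A clone already exists: update it in place.
        return ["git", "-C", repo_dir, "pull"]
    # No clone yet: create one inside the project folder.
    return ["git", "clone", repo_url, repo_dir]


# Only print the command here; on Colab it could be passed to
# subprocess.run(cmd, check=True) once Drive is mounted.
cmd = sync_command()
print("Command to run:", " ".join(cmd))
```

Returning the argument list instead of running it keeps the sketch free of side effects, so it can be checked before touching the Drive folder.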


## **PART 1: Creation of the Dataset**

In this part, all the images in the *IMAGES_raw* folder are first preprocessed and saved to the *IMAGES_preprocessed* folder. The neural network is then used to filter out the false positives, and the images of defects are saved to the *IMAGES_filtered* folder.

*NB : WARNING! All the images stored in the IMAGES_preprocessed folder are filtered, including images left there by a previous run. Raw images are deleted from IMAGES_raw once they have been preprocessed.*
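
To make the preprocessing step concrete, here is a small sketch of the arithmetic behind a resize followed by a centered square crop. The function `center_crop_box` is a hypothetical helper written for this illustration. It only computes sizes and does not touch any image, and centering the crop on both axes is my assumption about the intended behavior.

```python
def center_crop_box(width, height, desired_size=256):
    """Return (resized_size, crop_box) for a resize-then-center-crop to a square.

    The smallest side is scaled to desired_size, then the excess on the
    largest side is split evenly between both ends.
    """
    ratio = desired_size / min(width, height)
    new_w, new_h = round(ratio * width), round(ratio * height)
    left = (new_w - desired_size) // 2
    top = (new_h - desired_size) // 2
    return (new_w, new_h), (left, top, left + desired_size, top + desired_size)


# Landscape image: the excess is removed on the left and right.
print(center_crop_box(1024, 768))  # ((341, 256), (42, 0, 298, 256))
# Portrait image: the excess is removed at the top and bottom.
print(center_crop_box(768, 1024))  # ((256, 341), (0, 42, 256, 298))
```

The crop box follows the `(left, upper, right, lower)` convention used by PIL's `Image.crop`.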

### **Parameters**

Please upload all the images to the IMAGES_raw folder on Google Drive.

In [3]:
#side length of the square preprocessed images, in px
desired_size = 256

#Number of augmented images used for each prediction
nbr_images = 10
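
The `nbr_images` parameter controls how many augmented copies of each image are scored before the scores are averaged. The sketch below illustrates that averaging on a toy example, using only the standard library. The functions `horizontal_flip`, `left_half_brightness` and `averaged_score` are hypothetical stand-ins for the real augmentation pipeline and for `model.predict`, not the code used in this notebook.

```python
import random


def horizontal_flip(image, rng):
    """Toy augmentation: mirror every row with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image


def left_half_brightness(image):
    """Toy stand-in for a model: mean of the left half, scaled to [0, 1]."""
    half = len(image[0]) // 2
    values = [px for row in image for px in row[:half]]
    return sum(values) / (255 * len(values))


def averaged_score(image, score_fn, augment_fn, nbr_images=10, seed=0):
    """Average score_fn over nbr_images randomly augmented copies of image."""
    rng = random.Random(seed)
    scores = [score_fn(augment_fn(image, rng)) for _ in range(nbr_images)]
    return sum(scores) / nbr_images


# A tiny 2x4 "image": bright on the left, dark on the right.
image = [[255, 255, 0, 0],
         [255, 255, 0, 0]]
score = averaged_score(image, left_half_brightness, horizontal_flip, nbr_images=10)
print("averaged score: {:.2f}".format(score))
```

A single view of this toy image scores either 0 or 1 depending on the flip, whereas the average settles between the two. A larger `nbr_images` makes the averaged score less sensitive to any single random augmentation, at the cost of more predictions per image.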

### **Script**

In [4]:
##IMPORTS
#Import of the PIL image management library
from PIL import Image 

#Import of os and sys libraries to manipulate files
import sys,os

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

#Imports of github personal files
from custom_functions_v1 import crop_generator, random_crop, colorize_v2

# Keras API
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.preprocessing.image import img_to_array

from tensorflow.keras.models import load_model

##FUNCTIONS
def prep_image(image):
    """Take an image of any size and aspect ratio, opened with PIL's
    Image.open, and return a square image of side desired_size pixels
    (256 px with the parameters above). The image is first rescaled so that
    its smallest side equals desired_size, then the largest side is cropped
    to obtain a square format."""

    #dimensions recovery
    width,height = image.size

    #smallest side recovery
    small_side = min(width,height)

    #definition of the ratio between the old and the new image
    ratio = desired_size / small_side

    #scaling
    #Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
    image = image.resize((round(ratio*width), round(ratio*height)), Image.LANCZOS)

    #dimensions recovery
    width,height = image.size

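    #definition of the region to crop, centered on both axes so that
    #portrait images (height > width) are handled as well as landscape ones
    left = round((width - desired_size) / 2)
    top = round((height - desired_size) / 2)
    crop_region = (left, top, left + desired_size, top + desired_size)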

    #cropping 
    image = image.crop(crop_region)

    return image

def prep_and_save(path):
    """Browse the images in the folder [path] (IMAGES_raw), save a
    preprocessed copy of each one in the IMAGES_preprocessed folder, then
    delete the raw file."""

    dirs = os.listdir(path)
    i=0

    for item in dirs:
        
        super_path = path+"/"+item
        
        if os.path.isfile(super_path):
            
            image = Image.open(super_path)
            
            filename_w_ext = os.path.basename(super_path)
            filename, extension = os.path.splitext(filename_w_ext)

            #JPEG cannot store an alpha channel or a palette, so force RGB first
            image = prep_image(image.convert('RGB'))

            image.save(save+"/"+filename+"_{}.jpg".format(desired_size), 'JPEG')
            os.remove(super_path)

            i+=1

    print('{} images have been prepared and have been saved to Google Drive : My Drive/Project_google_colab/Classifieur-Bois/IMAGES_preprocessed'.format(i))


def prediction(img,nbr_images):
    """ 
    Perform the prediction of the image [img] from [nbr_images] augmented
    images.
    
    INPUT :
        -img : array loaded with load_img 
        -nbr_images : integer representing the number of augmented images used
         for prediction 
        
        
    OUTPUT :
        -prediction : final prediction averaged over several images
    """
    
    # convert to numpy array
    data = img_to_array(img)
    # expand dimension to one sample
    samples = np.expand_dims(data, 0)
    
    # create data for the test
    datagen = ImageDataGenerator( fill_mode='reflect', 
                                  samplewise_center=True,
                                  samplewise_std_normalization=True,
                                  horizontal_flip=True, vertical_flip=True, 
                                  rotation_range=10, brightness_range= [0.6,1.4], 
                                  preprocessing_function = colorize_v2, 
                                  zoom_range = [1.0,1.3])
    
    batch = datagen.flow(samples, batch_size=1)
    
    #add random cropped
    prediction = []
    
    for i in range(nbr_images):
        
        #next(batch) works with every Keras version; batch.next() does not
        img = next(batch)
        img = random_crop(img[0].astype('float32'), (224,224))
        img = np.expand_dims(img, 0)
        prediction.append(model.predict(img))
    
        
    prediction = sum(prediction)/nbr_images
    prediction = np.array(prediction).tolist()[0][0]
    
    return prediction  



##SCRIPT

#Directory to retrieve raw images
access = '/content/drive/My Drive/Project_google_colab/Classifieur-Bois/IMAGES_raw'

#Directory to save preprocessed images
save = '/content/drive/My Drive/Project_google_colab/Classifieur-Bois/IMAGES_preprocessed'

#Preparation of the images
prep_and_save(access)

#Loading model
model = load_model('/content/drive/My Drive/Project_google_colab/Classifieur-Bois/MODEL_CNN1_bs32_ep100_augTrue_t1593511641.h5')

#Use of the neural net to classify and save images
dirs = os.listdir(save)

i=0
d=0
fp=0
for item in dirs:
    
    if os.path.isfile(save+'/'+item):

        img = load_img(save+'/'+item)
        score = prediction(img, nbr_images)
        #TODO: review the probability threshold
        if score>0.5:
            img.save('/content/drive/My Drive/Project_google_colab/Classifieur-Bois/IMAGES_filtered/'+item)
            i+=1
            d+=1
        else : 
            i+=1
            fp+=1

print("-----------------------------------------------------")
if i == 0:
    #avoid a division by zero when the folder contains no image
    print("No images found in IMAGES_preprocessed")
else:
    print("{0} false positives detected ({1:.2%} of the dataset)".format(fp, fp/i))
    print("{0} images kept ({1:.2%} of the dataset)".format(d, d/i))
print("-----------------------------------------------------")

0 images have been prepared and have been saved to Google Drive : My Drive/Project_google_colab/Classification-Bois/Images_preprocessed
-----------------------------------------------------
2 false positives detected (20.00% of the dataset)
8 images kept (80.00% of the dataset)
-----------------------------------------------------
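
The counts above come from comparing each averaged score with a threshold of 0.5. As a rough sketch, the same bookkeeping can be written as one small function that also copes with an empty folder. The function `summarize` and the scores below are hypothetical: the scores are made up so that the counts match the run above, and are not the values the model actually returned.

```python
def summarize(scores, threshold=0.5):
    """Count kept and rejected images and return their shares of the total.

    An image is kept when its score is strictly above the threshold.
    Shares are 0.0 for an empty list, which avoids a division by zero.
    """
    total = len(scores)
    kept = sum(1 for s in scores if s > threshold)
    rejected = total - kept
    kept_share = kept / total if total else 0.0
    rejected_share = rejected / total if total else 0.0
    return kept, rejected, kept_share, rejected_share


# Made-up scores, chosen only so that the counts match the run above.
example_scores = [0.91, 0.77, 0.12, 0.66, 0.98, 0.58, 0.31, 0.83, 0.72, 0.95]
kept, rejected, kept_share, rejected_share = summarize(example_scores)
print("{0} false positives detected ({1:.2%} of the dataset)".format(rejected, rejected_share))
print("{0} images kept ({1:.2%} of the dataset)".format(kept, kept_share))
# 2 false positives detected (20.00% of the dataset)
# 8 images kept (80.00% of the dataset)
```

Keeping the threshold as a parameter makes it easy to try a stricter or looser cut-off without editing the loop.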


### **Warning !**

The false positives of the dataset have been filtered out by the neural network, which has an overall accuracy of 94%. Some images may therefore still be misclassified.