<a href="https://colab.research.google.com/github/codeclassifiers/fastai_assigments/blob/master/Imageclassification/FoodImagesClassification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## As part of the assignments for the fast.ai deep learning course, this notebook shows how to build a basic image classifier with the fast.ai library. You can add your own categories and run the notebook either locally or on Google Colab.

In [0]:
# Mandatory notebook commands
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [0]:
#We use the fast.ai vision library for image classification
from fastai.vision import *
from fastai.metrics import error_rate

### Downloading and cleaning images for creating a dataset

In [0]:
#Folder that will hold the images downloaded for training, testing and validation
dir_path = "sample_data/dataset"

We need three separate folders called train, test and valid to hold the images for training, testing and validation. This is the layout expected by the *from_folder* method of the **ImageDataBunch** class in the fast.ai vision library. Details are in the ImageDataBunch documentation: [doc](https://docs.fast.ai/vision.data.html#ImageDataBunch.from_folder)

In [0]:
!mkdir -p $dir_path
!mkdir -p $dir_path/train
!mkdir -p $dir_path/test
!mkdir -p $dir_path/valid
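
The same layout can also be created from Python, which does not fail when a folder already exists. This is a minimal sketch: `make_dataset_dirs` is a hypothetical helper, not part of fast.ai, and the demo writes to a temporary folder instead of the real dataset path.

```python
import os
import tempfile

def make_dataset_dirs(root, splits=("train", "test", "valid")):
    """Create root/<split> for every split and return the created paths."""
    created = []
    for split in splits:
        path = os.path.join(root, split)
        os.makedirs(path, exist_ok=True)  # also creates missing parent folders
        created.append(path)
    return created

# Demo in a throwaway folder so nothing is written to the working directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_dirs = make_dataset_dirs(os.path.join(tmp, "dataset"))
    demo_ok = all(os.path.isdir(d) for d in demo_dirs)
print(len(demo_dirs), demo_ok)
```

In the notebook itself the call would be `make_dataset_dirs(dir_path)`.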

**google_images_download** is a Python package for downloading images from Google Images search. The arguments it accepts are documented here: [link](https://google-images-download.readthedocs.io/en/latest/index.html)

In [13]:
#Install a helper pip module to download images from google for creating dataset
!pip install google_images_download



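Below we create a list called `categories` that holds all the image categories our classifier is supposed to distinguish between. The `folder_names` list sets, for each of the three folders, how many images to download and which extra keyword to append to the search.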

In [0]:
from google_images_download import google_images_download   #importing the library
categories = ["mcdonalds burger","mcdonalds french fries", "mcdonalds float", "mcdonalds nuggets"]
folder_names = [
                {"folder_name":"train","image_limit":300,"extra_keyword":"jpg"},
                {"folder_name":"test","image_limit":60,"extra_keyword":"png"},
                {"folder_name":"valid","image_limit":40,"extra_keyword":"hd"}
]

We need to install chromedriver because the google-images-download package requires it when more than 100 images are downloaded per search.

In [0]:
!sudo apt-get update
!sudo apt-get install chromium-chromedriver

In [0]:
#Create instance of google_images_download package
response = google_images_download.googleimagesdownload()   #class instantiation

In [0]:
#start downloading images of various categories in respective folders
#Outer loop iterates through categories list and inner one through train,test and valid folders
for individual_category in categories:
  for folder_item in folder_names:
    extra_keyword = folder_item['extra_keyword']
    limit = folder_item['image_limit']
    arguments = {
                "keywords":f"{individual_category} {extra_keyword}",
                "limit": limit,
                "print_urls":True, 
                "output_directory":f"{dir_path}/{folder_item['folder_name']}/",
                "image_directory": individual_category,
                "size": "medium",
                "chromedriver": "/usr/bin/chromedriver"
              }
    #pass the dictionary of arguments built above to the download function
    paths = response.download(arguments)
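
The nested loop above runs one search per category and folder. As a rough sketch, the argument dictionaries can be built first and inspected before any download starts. `build_download_jobs` and the `demo_` names are hypothetical; the dictionary keys are the ones already used in this notebook.

```python
def build_download_jobs(categories, folder_names, dir_path,
                        chromedriver="/usr/bin/chromedriver"):
    """Return one argument dictionary per (category, folder) pair, in loop order."""
    jobs = []
    for category in categories:
        for folder in folder_names:
            jobs.append({
                "keywords": f"{category} {folder['extra_keyword']}",
                "limit": folder["image_limit"],
                "print_urls": True,
                "output_directory": f"{dir_path}/{folder['folder_name']}/",
                "image_directory": category,
                "size": "medium",
                "chromedriver": chromedriver,
            })
    return jobs

# Small demo with two categories and two folders.
demo_categories = ["mcdonalds burger", "mcdonalds french fries"]
demo_folders = [
    {"folder_name": "train", "image_limit": 300, "extra_keyword": "jpg"},
    {"folder_name": "valid", "image_limit": 40, "extra_keyword": "hd"},
]
jobs = build_download_jobs(demo_categories, demo_folders, "sample_data/dataset")
total_requested = sum(job["limit"] for job in jobs)
print(len(jobs), "searches,", total_requested, "images requested")
```

Each dictionary in `jobs` could then be passed to `response.download` exactly as in the loop above.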

Some of the downloaded images may be corrupt and need to be removed before we can pass them to our learning model. The helper function below identifies and deletes such images.

In [0]:
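# https://github.com/drigio/FastAI-Tutorials/blob/master/Lesson-1/del_unwanted.py
import os
from PIL import Image

def remove_corrupted_images(dirname):
  """Delete every file in dirname that PIL cannot open as an image."""
  cnt = 0
  for filename in os.listdir(dirname):
      filepath = os.path.join(dirname, filename)  # works with or without a trailing slash
      try:
          with Image.open(filepath):
              pass  # opening succeeded, so the file header is readable
      except OSError:
          print("FILE: ", filename, "is corrupt!")
          cnt += 1
          os.remove(filepath)
  # report once per folder, after all files have been checked
  print("Successfully completed operation! Corrupted files removed:", cnt)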

In [0]:
#remove corrupted images from folders
for individual_category in categories:
  for folder_item in folder_names:
    remove_corrupted_images(f"{dir_path}/{folder_item['folder_name']}/{individual_category}/")
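
After cleaning, it can help to check how many files are left in each class folder, since failed downloads and removed files make the counts uneven. A minimal sketch using only the standard library: `count_images` is a hypothetical helper, and the demo runs on a temporary folder rather than the real dataset.

```python
import os
import tempfile

def count_images(root):
    """Map 'split/category' to the number of files in that folder."""
    counts = {}
    for split in sorted(os.listdir(root)):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            continue
        for category in sorted(os.listdir(split_dir)):
            cat_dir = os.path.join(split_dir, category)
            if os.path.isdir(cat_dir):
                files = [f for f in os.listdir(cat_dir)
                         if os.path.isfile(os.path.join(cat_dir, f))]
                counts[f"{split}/{category}"] = len(files)
    return counts

# Demo: build a tiny fake dataset with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for split, n in (("train", 3), ("valid", 1)):
        folder = os.path.join(tmp, split, "mcdonalds burger")
        os.makedirs(folder)
        for i in range(n):
            open(os.path.join(folder, f"{i}.jpg"), "w").close()
    demo_counts = count_images(tmp)
print(demo_counts)
```

In the notebook, `count_images(dir_path)` would report the counts for the downloaded folders.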

### Create a dataset from downloaded images

In [0]:
#create a databunch from images
bs = 64
data = ImageDataBunch.from_folder(path='sample_data/dataset',ds_tfms=get_transforms(), size=224, bs=bs
                                  ).normalize(imagenet_stats)
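
`normalize(imagenet_stats)` rescales each colour channel with the mean and standard deviation of ImageNet, the dataset on which the pretrained network was trained. The arithmetic can be sketched in plain Python, assuming pixel values already scaled to the range 0 to 1 and the commonly quoted ImageNet statistics. `normalize_pixel` is a hypothetical name, not a fast.ai function.

```python
# Commonly quoted per-channel (R, G, B) ImageNet statistics; treated as an assumption here.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Subtract the channel mean and divide by the channel standard deviation."""
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))

mid_grey = normalize_pixel((0.5, 0.5, 0.5))
mean_pixel = normalize_pixel(IMAGENET_MEAN)  # a pixel equal to the mean maps to zero
print(mid_grey, mean_pixel)
```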

In [0]:
#show a sample of the images in the databunch
data.show_batch(rows=3, figsize=(7,6))

In [0]:
#get the classes created from databunch object
print(data.classes)
len(data.classes),data.c
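
As a rough illustration of where `data.classes` comes from: with a folder-based dataset, each sub-folder of train is one class. The sketch below assumes the class list is simply the sorted sub-folder names. `infer_classes` is a hypothetical helper, and the demo uses a temporary folder.

```python
import os
import tempfile

def infer_classes(train_dir):
    """Return the sorted sub-folder names of train_dir, one per class."""
    return sorted(
        name for name in os.listdir(train_dir)
        if os.path.isdir(os.path.join(train_dir, name))
    )

# Demo: create three empty class folders in arbitrary order.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("mcdonalds nuggets", "mcdonalds burger", "mcdonalds float"):
        os.makedirs(os.path.join(tmp, name))
    demo_classes = infer_classes(tmp)
print(demo_classes, len(demo_classes))
```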

### Train the model on dataset using transfer learning

In [0]:
#initialize cnn learner
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
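
The `error_rate` metric is the fraction of validation images whose predicted class is wrong, that is, one minus accuracy. A plain-Python sketch of that calculation, with `error_rate_sketch` as a hypothetical name and made-up probabilities for four images:

```python
def error_rate_sketch(predicted_probs, targets):
    """Fraction of samples whose highest-scoring class differs from the target."""
    wrong = 0
    for probs, target in zip(predicted_probs, targets):
        predicted = max(range(len(probs)), key=lambda i: probs[i])  # argmax
        if predicted != target:
            wrong += 1
    return wrong / len(targets)

# Made-up class probabilities for four images and their true class indices.
demo_probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.1, 0.2, 0.6],
    [0.4, 0.3, 0.2, 0.1],
]
demo_targets = [0, 1, 3, 2]
demo_error = error_rate_sketch(demo_probs, demo_targets)
print(demo_error)
```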

In [0]:
#inspect the layers of the model
learn.model

In [0]:
#fit model using fit_one_cycle method
learn.fit_one_cycle(4)
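
`fit_one_cycle` follows the one-cycle policy: the learning rate rises to a peak and then decays. Below is a rough sketch of such a schedule, assuming cosine interpolation, a peak of 3e-3, a starting rate 25 times lower and 30% of the steps spent warming up. These numbers are assumptions chosen to resemble fast.ai v1 defaults, and `one_cycle_lr` is a hypothetical function, not the library's scheduler.

```python
import math

def cosine_anneal(start, end, pct):
    """Move from start (pct=0) to end (pct=1) along half a cosine wave."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)

def one_cycle_lr(step, total_steps, max_lr=3e-3, div_factor=25.0, pct_start=0.3):
    """Learning rate at a given step: warm up to max_lr, then anneal close to zero."""
    warmup_steps = int(total_steps * pct_start)
    low_lr = max_lr / div_factor
    if step < warmup_steps:
        return cosine_anneal(low_lr, max_lr, step / warmup_steps)
    pct = (step - warmup_steps) / (total_steps - warmup_steps)
    return cosine_anneal(max_lr, low_lr / 1e4, pct)

schedule = [one_cycle_lr(s, 100) for s in range(100)]
peak_step = schedule.index(max(schedule))
print("starts at", schedule[0], "and peaks at step", peak_step)
```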

In [0]:
#save the current state of model
learn.save('stage-1')

In [0]:
#optional steps
interp = ClassificationInterpretation.from_learner(learn)

losses,idxs = interp.top_losses()

len(data.valid_ds)==len(losses)==len(idxs)
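
`top_losses` returns the validation losses together with the indices of the corresponding images, ordered from the largest loss to the smallest, which is why both have the same length as the validation set. A minimal sketch of that ordering with made-up loss values; `top_losses_sketch` is a hypothetical name.

```python
def top_losses_sketch(losses, k=None):
    """Return (losses sorted in descending order, original indices in that order)."""
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    if k is not None:
        order = order[:k]
    return [losses[i] for i in order], order

sorted_losses, sorted_idxs = top_losses_sketch([0.1, 2.3, 0.7, 1.5])
print(sorted_losses, sorted_idxs)
```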

In [0]:
#get top losses from the learning process
interp.plot_top_losses(9, figsize=(15,11))

In [0]:
#get top confused categories
interp.most_confused(min_val=2)
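
`most_confused` lists the pairs of actual and predicted categories that the model mixes up at least `min_val` times. The idea can be sketched with a counter over misclassified pairs. The labels below are made up and `most_confused_sketch` is a hypothetical name.

```python
from collections import Counter

def most_confused_sketch(actual, predicted, min_val=1):
    """Return (actual, predicted, count) for wrong predictions, most frequent first."""
    pairs = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    return [(a, p, n) for (a, p), n in pairs.most_common() if n >= min_val]

demo_actual = ["burger", "burger", "fries", "fries", "nuggets", "burger"]
demo_predicted = ["nuggets", "nuggets", "fries", "float", "nuggets", "fries"]
confused = most_confused_sketch(demo_actual, demo_predicted, min_val=2)
print(confused)
```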

In [0]:
#optional steps if model accuracy is not great. In this case they were not required
learn.unfreeze()

In [0]:
learn.fit_one_cycle(1)

In [0]:
learn.load('stage-1');

In [0]:
learn.lr_find()

In [0]:
learn.recorder.plot()

In [0]:
learn.unfreeze()
learn.fit_one_cycle(2, max_lr=slice(1e-03,1e-02))
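
Passing `slice(1e-03,1e-02)` as `max_lr` gives the earliest layers the smallest learning rate and the final layers the largest, with the groups in between spaced evenly on a multiplicative scale. A sketch of that spacing, assuming three layer groups; `spread_learning_rates` is a hypothetical helper, not the fast.ai implementation.

```python
def spread_learning_rates(start, stop, n_groups):
    """Return n_groups rates from start to stop with a constant ratio between neighbours."""
    if n_groups == 1:
        return [stop]
    ratio = (stop / start) ** (1 / (n_groups - 1))
    return [start * ratio ** i for i in range(n_groups)]

group_lrs = spread_learning_rates(1e-3, 1e-2, 3)
print(group_lrs)
```

With these inputs the middle group gets roughly 3.2e-3, about 3.2 times the first rate and 3.2 times smaller than the last.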

### Predict image category using the trained model

Make a prediction on one of the training images using the model.


In [0]:
img = learn.data.train_ds[0][0]
img

In [0]:
learn.predict(img)

Make a prediction on an externally downloaded image using the same model.

In [0]:
test_image_arguments = {
                "keywords":"mcdonalds french fries png",
                "limit": 1,
                "print_urls":True, 
                "output_directory":f"{dir_path}/final_test/",
                "image_directory": "frech_fries",
                "size": "medium",
                "chromedriver": "/usr/bin/chromedriver"
}
paths = response.download(test_image_arguments)

In [0]:
#source:- https://medium.com/@swapp19902/image-classifier-using-fastai-and-google-colab-87dfc4e90e63
filename = f'{dir_path}/final_test/frech_fries/1.585ac06b4f6ae202fedf293b.png'
img2 = open_image(filename)
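
The file name above is specific to one download and will differ on another run. One way to avoid hard-coding it, sketched with the standard library only, is to pick the first image file found in the download folder. `first_image` is a hypothetical helper, and the demo uses a temporary folder with empty placeholder files.

```python
import os
import tempfile

def first_image(folder, extensions=(".png", ".jpg", ".jpeg")):
    """Return the path of the first file (sorted by name) with an image extension, or None."""
    for name in sorted(os.listdir(folder)):
        if name.lower().endswith(extensions):
            return os.path.join(folder, name)
    return None

# Demo: one placeholder image and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "1.example.png"), "w").close()
    open(os.path.join(tmp, "notes.txt"), "w").close()
    demo_name = os.path.basename(first_image(tmp))
    empty_result = first_image(os.path.join(tmp))  # same folder, still finds the image
print(demo_name)
```

In the notebook, the returned path could be passed to `open_image` in place of the hard-coded `filename`.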

In [0]:
# See how the image looks
img2

In [0]:
prediction = learn.predict(img2)
print(prediction)
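
The printed prediction contains the predicted category, its index in `data.classes` and one probability per class. The relationship between these parts can be sketched in plain Python with made-up probabilities; `summarize_prediction` is a hypothetical helper.

```python
def summarize_prediction(classes, probs):
    """Return the class with the highest probability and that probability."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return classes[best], probs[best]

demo_class_names = ["mcdonalds burger", "mcdonalds float",
                    "mcdonalds french fries", "mcdonalds nuggets"]
demo_class_probs = [0.05, 0.02, 0.90, 0.03]  # made-up values that sum to 1
label, confidence = summarize_prediction(demo_class_names, demo_class_probs)
print(label, confidence)
```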

### Export the model in .pkl format to be reused later

In [0]:
learn.export()