# Image Classification using PyTorch and fast.ai

* for documentation and tutorials, see https://docs.fast.ai/index.html
* for the courses, see https://www.fast.ai/

In [0]:
%matplotlib inline
%reload_ext autoreload
%autoreload 2

In [0]:
import torch
import fastai
from fastai.vision import *
from fastai.utils.show_install import *

torch.cuda.set_device(0) # this controls which gpu should be used
show_install()

## connect to your google drive

In [0]:
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive/"

## define paths to data

In [0]:
PATH = Path(root_dir + 'beer_pizza')
DATA_PATH = PATH / 'training_data'
EXAMPLE_PATH = PATH / 'test_data'
DATA_PATH.ls()

## collect data

* to run this notebook you need to provide some image data (as mentioned, I cannot provide the beer and pizza images)
* you can either collect the images yourself using Google image search or some other source
* or you can use the fast.ai `download_google_images()` function like this:

In [0]:
# from fastai.widgets import *
# labels = ['beer', 'pizza']
# for label in labels: 
#     download_google_images(DATA_PATH, label, size='>400*300', n_images=100)

* this downloads images for the provided labels from Google image search to the provided path
* note that the function will most likely crash a few times; in my short experiments it still downloaded the images
* you have to check manually whether any of the downloaded images are invalid
* note that for more than 100 images you need to install chromedriver, have a look here: https://docs.fast.ai/widgets.image_cleaner.html#ImageDownloader

## define transformations

* the transformations are defined as a tuple of two lists
* the first list contains the transformations for the training data
* the second list contains the transformations for the validation data
* you can use `get_transforms()` to directly get such a tuple containing some default transformations
* note that not every transformation makes sense for every dataset (e.g. flipping images that contain text)


In [0]:
transforms = ([
    rotate(degrees=3.0, p=0.5),
#     brightness(change=(0.05, 0.05), p=0.5),
#     contrast(scale=(0.05, 0.05), p=0.5),
    flip_affine(p=0.5)
], [])
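
Each transformation above takes a probability `p` that controls how often it is applied to a training image. The following is a minimal, library-free sketch of that idea, not fast.ai's implementation; `random_hflip` and the tiny nested-list image are hypothetical stand-ins chosen for illustration.

```python
import random

def random_hflip(image, p=0.5, rng=random):
    """Flip a row-major image (a list of pixel rows) left-right with probability p.

    Hypothetical helper for illustration only; fast.ai applies its own
    transforms to tensors, not to nested lists.
    """
    if rng.random() < p:
        return [list(reversed(row)) for row in image]
    return [list(row) for row in image]

toy_image = [[1, 2, 3],
             [4, 5, 6]]

always_flipped = random_hflip(toy_image, p=1.0)  # p=1.0: the flip is always applied
never_flipped = random_hflip(toy_image, p=0.0)   # p=0.0: the image is returned unchanged
```

With `p=0.5`, roughly half of the training images seen in an epoch would be flipped, so the network sees a slightly different dataset every epoch.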

## create dataset
* the fast.ai library provides many different options to create a dataset, have a look at the documentation!
* note that `data.show_batch()` displays the images without the final normalization, that's why they look natural


In [0]:
image_size = 224 # the architecture is designed to consume images of size 224 x 224
batch_size = 12

data = (
  ImageList.from_folder(DATA_PATH)
    .split_by_rand_pct(seed=0, valid_pct=0.4)
    .label_from_folder() # this selects the folder names as labels
    .transform(transforms, size=image_size, resize_method=ResizeMethod.SQUISH, padding_mode='zeros')
    .databunch(bs=batch_size)
    .normalize(imagenet_stats)
)
print(data)
data.show_batch()
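
`.normalize(imagenet_stats)` standardizes each colour channel with the ImageNet mean and standard deviation, so that the inputs match what the pretrained network saw during its original training. A rough sketch of the arithmetic on a single RGB pixel follows; the helper names are hypothetical, and the statistics are the commonly used ImageNet values.

```python
# Per-channel ImageNet statistics (R, G, B) for pixel values scaled to [0, 1]
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Subtract the channel mean and divide by the channel std (hypothetical helper)."""
    return [(v - m) / s for v, m, s in zip(rgb, mean, std)]

def denormalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Invert normalize_pixel, e.g. to display an image with natural colours."""
    return [v * s + m for v, m, s in zip(rgb, mean, std)]

pixel = [0.485, 0.9, 0.1]
normalized_pixel = normalize_pixel(pixel)
restored_pixel = denormalize_pixel(normalized_pixel)
```

A channel value equal to the mean maps to 0, and undoing the normalization recovers the original pixel up to floating-point error.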

## create learner object

* the `cnn_learner` creates a network and loads a pretrained model (so transfer learning is the default)
* it can automatically remove the fully connected head and create a new one for the classes provided in the dataset
* it does all the heavy lifting: it runs the training loop, loads data, computes metrics, plots data, etc.
* you can add stuff, like metrics, by implementing callbacks
* we use a ResNet-34 
* and add accuracy as an additional metric
* note that the learner initially freezes the pretrained backbone, so only the new head is trained; you can unfreeze it using `learner.unfreeze()` to also train the backbone
* by default fast.ai rel

In [0]:
learner = cnn_learner(data, models.resnet34, metrics=accuracy, callback_fns=[ShowGraph])

## run learning rate finder

* use this to get an idea of how different learning rates behave
* this is guided more by intuition than by pure science
* try to pick a learning rate which is quite high and lies in a region where the loss falls steeply
* stay away from the minimum of the curve and from regions where the loss changes erratically
* experiment and you will get a feeling for that
* note that `learner.recorder.plot()` removes some of the samples at the front and back; this can be changed using its `skip_start` and `skip_end` parameters

In [0]:
learner.lr_find()
learner.recorder.plot()
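
The idea behind the learning rate finder can be sketched without any library: train for a few iterations while the learning rate grows exponentially, record the loss, and look for the region where the loss drops fastest. The helpers, bounds, and loss values below are made up for illustration and are not fast.ai's implementation.

```python
import math

def exponential_lr_schedule(start_lr, end_lr, num_it):
    """Learning rates growing exponentially from start_lr to end_lr over num_it steps."""
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_it - 1)) for i in range(num_it)]

def steepest_lr(lrs, losses):
    """Return the learning rate at the end of the interval where the loss
    falls fastest per unit of log10(lr), or None if the loss never falls."""
    best_i, best_slope = None, 0.0
    for i in range(1, len(lrs)):
        slope = (losses[i] - losses[i - 1]) / (math.log10(lrs[i]) - math.log10(lrs[i - 1]))
        if slope < best_slope:
            best_i, best_slope = i, slope
    return lrs[best_i] if best_i is not None else None

toy_lrs = exponential_lr_schedule(1e-5, 1.0, 6)   # roughly 1e-5, 1e-4, ..., 1
toy_losses = [2.0, 1.9, 1.2, 0.9, 1.5, 4.0]       # invented losses: fall, then diverge
suggested_lr = steepest_lr(toy_lrs, toy_losses)   # close to 1e-3 for these numbers
```

In this toy curve the loss is lowest near `1e-2`, but the steepest descent is around `1e-3`, which is the kind of value the guidance above points to.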

## run training in a "traditional" way

* pick one learning rate and let it run for some epochs
* you can optionally provide a learning rate scheduler to the learner object or the `.fit()` function

In [0]:
learner.fit(10, 1e-3)

## run training using the "one cycle" scheme

* uses a training scheme where the learning rate is ramped up and then back down during training
* it is based on this paper: https://arxiv.org/pdf/1803.09820.pdf
* it should reduce training time and can lead to "super convergence"
* the number of epochs provided here defines how long the ramp is
* experiment with the number of epochs

In [0]:
# uncomment the following to try the one cycle scheme
# learner = cnn_learner(data, models.resnet34, metrics=accuracy, callback_fns=[ShowGraph]) # we recreate the learner to start from the beginning

In [0]:
# lr = 1e-2
# learner.fit_one_cycle(4, max_lr=lr)
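
The shape of a one-cycle schedule can be sketched as follows: the learning rate warms up from a low value to `max_lr` and is then annealed to a much smaller value. This is a simplified illustration, not fast.ai's code; `one_cycle_lr` is a hypothetical function, and the `pct_start`, `div_factor`, and `final_div` values are assumptions chosen for the example.

```python
import math

def cosine_interp(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)

def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3, div_factor=25.0, final_div=1e4):
    """Learning rate at a given step of a simplified one-cycle schedule."""
    warmup_steps = int(total_steps * pct_start)
    if step < warmup_steps:
        # ramp up from max_lr / div_factor to max_lr
        return cosine_interp(max_lr / div_factor, max_lr, step / warmup_steps)
    # anneal from max_lr down to max_lr / final_div
    pct = (step - warmup_steps) / max(1, total_steps - warmup_steps - 1)
    return cosine_interp(max_lr, max_lr / final_div, pct)

toy_schedule = [one_cycle_lr(s, 100, max_lr=1e-2) for s in range(100)]
```

Because the ramp is stretched over the total number of steps, changing the number of epochs changes how slowly the learning rate rises and falls, which is why it is worth experimenting with.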

# have a look at the data

* predict
* visualize "difficult" data
* see heat maps for important parts of the images (see Grad-CAM)
* compute a confusion matrix

## get some visualizations of the network's performance

In [0]:
interp = ClassificationInterpretation.from_learner(learner)
losses, idxs = interp.top_losses()

In [0]:
interp.plot_top_losses(9, figsize=(15,11))
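
"Top losses" are the validation samples on which the network was most wrong, i.e. where it assigned the lowest probability to the true class. A minimal sketch of that ranking, assuming cross-entropy loss and invented probabilities; `rank_by_loss` is a hypothetical helper, not the fast.ai method.

```python
import math

def rank_by_loss(probs, targets, k=None):
    """Return (losses, indices) sorted by descending cross-entropy loss.

    probs[i] holds the predicted class probabilities for sample i,
    targets[i] is the index of its true class.
    """
    per_sample = [-math.log(p[t]) for p, t in zip(probs, targets)]
    order = sorted(range(len(per_sample)), key=lambda i: per_sample[i], reverse=True)
    if k is not None:
        order = order[:k]
    return [per_sample[i] for i in order], order

toy_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]  # invented; class 0 = beer, class 1 = pizza
toy_targets = [0, 0, 1]
sorted_losses, sorted_idxs = rank_by_loss(toy_probs, toy_targets)
```

Sample 1 comes first here: it is a beer image to which the (made-up) model gave only 20% probability of being beer.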

In [0]:
interp.plot_confusion_matrix(figsize=(6,6), dpi=100)
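
A confusion matrix counts, for every true class, how often each class was predicted. The sketch below computes one by hand from invented labels, using the common convention of rows for actual classes and columns for predicted classes; `build_confusion_matrix` is a hypothetical helper.

```python
def build_confusion_matrix(actual, predicted, classes):
    """Rows are the actual classes, columns the predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

toy_classes = ['beer', 'pizza']
toy_actual = ['beer', 'beer', 'pizza', 'pizza', 'pizza']      # invented labels
toy_predicted = ['beer', 'pizza', 'pizza', 'pizza', 'beer']   # invented predictions
toy_matrix = build_confusion_matrix(toy_actual, toy_predicted, toy_classes)

# Correct predictions lie on the diagonal, so accuracy is the diagonal sum over the total
toy_accuracy = sum(toy_matrix[i][i] for i in range(len(toy_classes))) / len(toy_actual)
```

Off-diagonal entries show which classes are confused with each other, which is more informative than accuracy alone once there are more than two classes.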

## let the network do predictions on single images

In [0]:
test_files = [
    'car.jpg',
    'wine.jpg',
    'beer_pizza1.jpg',
    'beer_pizza2.jpg',
    'cake.jpg'
]

image = open_image(EXAMPLE_PATH/test_files[4])
image.show()
# predict() runs the image through the same preprocessing as the validation set
# (resize to image_size and ImageNet normalization), so pass the raw image;
# normalizing it manually first would apply the normalization twice
learner.predict(image)