# Flowers classification - Oxford flowers 102 dataset 


The dataset consists of 102 flower categories, chosen from flowers commonly occurring in the United Kingdom. Each class consists of between 40 and 258 images. The details of the categories and the number of images for each class can be found on the category statistics page of the dataset website.

The images have large scale, pose and light variations. In addition, there are categories that have large variations within the category and several very similar categories. The dataset is visualized using isomap with shape and colour features.

In [0]:
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [0]:
# Set up fastai for Colab
!curl -s https://course.fast.ai/setup/colab | bash
!pip uninstall torch torchvision -y
!pip install torch==1.4.0 torchvision==0.5.0

Set up drive access for file storage 

In [0]:
#from google.colab import drive
#drive.mount('/content/gdrive', force_remount=True)
#root_dir = "/content/gdrive/My Drive/"
#base_dir = root_dir + 'fastai-v3/'

Import fastai libs

In [0]:
from fastai.vision import *
from fastai.metrics import error_rate

If you're using a computer with an unusually small GPU, you may get an out of memory error when running this notebook. If this happens, click Kernel->Restart, uncomment the 2nd line below to use a smaller *batch size* (you'll learn all about what this means during the course), and try again.

In [0]:
bs = 64
# bs = 16   # uncomment this line if you run out of memory even after clicking Kernel->Restart

## Looking at the data / Data preparation


Dataset home page: https://www.robots.ox.ac.uk/~vgg/data/flowers/102/

We use the `untar_data` function, which takes a URL as an argument and downloads and extracts the data.

In [0]:
path = untar_data(URLs.FLOWERS); path

In [0]:
path.ls()

In [0]:
#path_anno = path/'annotations'
path_img = path/'jpg'
path_train = path/'train.txt'
path_valid = path/'valid.txt'
path_test = path/'test.txt'

The first thing we do when we approach a problem is to take a look at the data. We _always_ need to understand very well what the problem is and what the data looks like before we can figure out how to solve it. Taking a look at the data means understanding how the data directories are structured, what the labels are and what some sample images look like.

Labels are stored in text files: **train.txt**, **valid.txt** and **test.txt**.

In [0]:
fnames = get_image_files(path_img)
fnames[:5]

In [0]:
#f = open (path_train, "r")
#print("Train data ---")
#print(f.readline());
#print(f.readline())

#print("Valid data ---")
#f = open (path_valid, "r")
#print(f.readline())
#print(f.readline())

#print("Test data ---")
#f = open (path_test, "r")
#print(f.readline())
#print(f.readline())

Create data frames for train and valid. Merge them into a single data frame (`train_df`), because `ImageDataBunch.from_df` sets aside its own random 20% validation set by default.
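
The merge-then-split idea can be sketched in plain Python. This is only an illustration, not what fastai does internally: it assumes each line of the label files has the form `<relative image path> <integer label>`, and the file names, the `parse` and `random_split` helpers and the seed are all hypothetical.

```python
import random

# Hypothetical sample lines in the assumed "<relative path> <label>" format.
train_lines = ["jpg/image_00001.jpg 5", "jpg/image_00002.jpg 12", "jpg/image_00003.jpg 5"]
valid_lines = ["jpg/image_00004.jpg 7", "jpg/image_00005.jpg 12"]

def parse(lines):
    """Turn '<path> <label>' lines into (path, label) tuples."""
    rows = []
    for line in lines:
        name, label = line.split()
        rows.append((name, int(label)))
    return rows

def random_split(rows, valid_pct=0.2, seed=42):
    """Shuffle a copy of rows and hold out valid_pct of them for validation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

# Merge the two predefined splits, then draw a fresh random validation set.
merged = parse(train_lines) + parse(valid_lines)
train_rows, valid_rows = random_split(merged)
print(len(merged), len(train_rows), len(valid_rows))
```

Fixing the seed keeps the held-out images the same from run to run, which matters when comparing models trained on separately built data objects.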

In [0]:
import pandas as pd

train_df = pd.read_fwf(path_train, header=None)
valid_df = pd.read_fwf(path_valid, header=None)
# Merge both splits; ImageDataBunch.from_df holds out its own random validation set
train_df = pd.concat([train_df, valid_df], ignore_index=True)
train_df.head()

In [0]:
data = ImageDataBunch.from_df(path, train_df, ds_tfms=get_transforms(), size=224, bs=bs
                                  ).normalize(imagenet_stats)

In [0]:
data.show_batch(rows=3, figsize=(7,6))

In [0]:
print(data.classes)
len(data.classes),data.c

## Training: resnet34

Now we will start training our model. We will use a [convolutional neural network](http://cs231n.github.io/convolutional-networks/) backbone and a fully connected head with a single hidden layer as a classifier. Don't know what these things mean? Not to worry, we will dive deeper in the coming lessons. For the moment you need to know that we are building a model which will take images as input and will output the predicted probability for each of the categories (in this case, it will have 102 outputs, one per flower category).
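
To make "predicted probability for each of the categories" concrete, here is a minimal sketch of softmax, a common way to turn a model's raw scores into probabilities. It is not fastai's code; the scores and the favoured class index are made up, and 102 is the number of flower categories in this dataset.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

n_classes = 102              # one output per flower category
logits = [0.0] * n_classes   # hypothetical raw scores
logits[41] = 5.0             # pretend the model strongly favours class index 41
probs = softmax(logits)

# The predicted class is the index with the highest probability.
predicted = max(range(n_classes), key=probs.__getitem__)
print(len(probs), predicted, round(sum(probs), 6))
```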

We will train for 8 epochs (8 passes through all our data).
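
The `fit_one_cycle` call used for training follows the one-cycle policy: the learning rate warms up to a peak and then anneals to a very small value over the whole run. Below is a rough sketch of such a schedule. The default values (`max_lr=3e-3`, `pct_start=0.3`, `div_factor=25`, cosine annealing in both phases) are assumptions based on my recollection of fastai v1, and `one_cycle_lrs` is a hypothetical helper, not a fastai function.

```python
import math

def annealing_cos(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)

def one_cycle_lrs(n_iter, max_lr=3e-3, pct_start=0.3, div_factor=25.0):
    """Learning rate per iteration: warm up to max_lr, then anneal towards zero."""
    n_up = int(n_iter * pct_start)          # iterations spent warming up
    n_down = n_iter - n_up                  # iterations spent annealing
    low_lr = max_lr / div_factor            # assumed starting learning rate
    final_lr = max_lr / (div_factor * 1e4)  # assumed final learning rate
    up = [annealing_cos(low_lr, max_lr, i / n_up) for i in range(n_up)]
    down = [annealing_cos(max_lr, final_lr, i / n_down) for i in range(n_down)]
    return up + down

lrs = one_cycle_lrs(n_iter=200)
peak = max(range(len(lrs)), key=lrs.__getitem__)
print(peak, lrs[0], lrs[peak], lrs[-1])
```

The number of epochs only sets the total number of iterations; the shape of the schedule is stretched over however many batches that gives.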

In [0]:
learn = cnn_learner(data, models.resnet34, metrics=error_rate)

In [0]:
learn.model

In [0]:
learn.fit_one_cycle(8)

In [0]:
learn.save('stage-1')

## Results

Let's see what results we have got. 

We will first see which were the categories that the model most confused with one another. We will try to see if what the model predicted was reasonable or not. In this case the mistakes look reasonable (none of the mistakes seems obviously naive). This is an indicator that our classifier is working correctly. 

Furthermore, when we plot the confusion matrix, we can see that the distribution is heavily skewed: the model makes the same mistakes over and over again but it rarely confuses other categories. This suggests that it just finds it difficult to distinguish some specific categories from each other; this is normal behaviour.
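
The idea behind listing the most confused categories can be sketched in a few lines: count every (actual, predicted) pair where the two disagree and sort by frequency. This is an illustrative re-implementation under made-up labels, not the code behind `interp.most_confused`.

```python
from collections import Counter

def most_confused(actual, predicted, min_val=1):
    """Count (actual, predicted) pairs that disagree, most frequent first."""
    pairs = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    return [(a, p, n) for (a, p), n in pairs.most_common() if n >= min_val]

# Hypothetical labels for illustration only.
actual    = ["rose", "rose", "rose", "tulip", "tulip", "daisy"]
predicted = ["rose", "camellia", "camellia", "tulip", "rose", "daisy"]

print(most_confused(actual, predicted, min_val=1))
# [('rose', 'camellia', 2), ('tulip', 'rose', 1)]
```

Raising `min_val` hides one-off mistakes, so only pairs that are confused repeatedly remain.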

In [0]:
interp = ClassificationInterpretation.from_learner(learn)

losses,idxs = interp.top_losses()

len(data.valid_ds)==len(losses)==len(idxs)

In [0]:
interp.plot_top_losses(9, figsize=(15,11))

In [0]:
#doc(interp.plot_top_losses)

In [0]:
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)

In [0]:
interp.most_confused(min_val=1)

## Unfreezing, fine-tuning, and learning rates

Since our model is working as we expect it to, we will *unfreeze* our model and train some more.

In [0]:
learn.unfreeze()

In [0]:
learn.fit_one_cycle(1)

In [0]:
learn.load('stage-1');

In [0]:
learn.lr_find()

In [0]:
learn.recorder.plot()

In [0]:
learn.unfreeze()
learn.fit_one_cycle(4, max_lr=slice(1e-6,1e-4))

That's a pretty accurate model!
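
The `slice(1e-6,1e-4)` passed as `max_lr` above gives the earliest layers the smallest learning rate and the head the largest, with the groups in between spaced evenly on a log scale. A minimal sketch of that spacing follows. It assumes the model is split into three layer groups, and `even_mults` here is a plain-Python stand-in written for illustration.

```python
def even_mults(start, stop, n):
    """Return n values from start to stop, evenly spaced on a log scale."""
    if n == 1:
        return [stop]
    step = (stop / start) ** (1 / (n - 1))  # constant ratio between neighbours
    return [start * step ** i for i in range(n)]

n_layer_groups = 3  # assumption: early layers, later layers, head
lrs = even_mults(1e-6, 1e-4, n_layer_groups)

for group, lr in enumerate(lrs):
    print(f"layer group {group}: max_lr = {lr:.1e}")
```

Under that assumption the pretrained early layers move about 100 times more slowly than the newly added head, which is the point of fine-tuning with a slice.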

## Training: resnet50 / Densenet121

Now we will train in the same way as before but with one caveat: instead of using resnet34 as our backbone we will try resnet50 and densenet121 (resnet34 is a 34-layer residual network while resnet50 has 50 layers. It will be explained later in the course and you can learn the details in the [resnet paper](https://arxiv.org/pdf/1512.03385.pdf)). The cells below show the densenet121 run; swap in `models.resnet50` to repeat it with resnet50.

Basically, resnet50 usually performs better because it is a deeper network with more parameters. Let's see if we can achieve a higher performance here. We keep the image size at 224 and halve the batch size (`bs//2`), since otherwise the larger network will require more GPU memory.

In [0]:
data = ImageDataBunch.from_df(path, train_df, ds_tfms=get_transforms(), size=224, bs=bs//2
                                  ).normalize(imagenet_stats)

In [0]:
learn = cnn_learner(data, models.densenet121, metrics=error_rate)

In [0]:
learn.lr_find()
learn.recorder.plot()

In [0]:
learn.fit_one_cycle(8)

In [0]:
learn.save('stage-1-50')

Let's see if full fine-tuning helps:

In [0]:
learn.unfreeze()
learn.fit_one_cycle(3, max_lr=slice(1e-4,1e-1))

If it doesn't, you can always go back to your previous model.

In [0]:
learn.load('stage-1-50');

In [0]:
interp = ClassificationInterpretation.from_learner(learn)

In [0]:
interp.most_confused(min_val=2)

Notes:
1. With ResNet50 - error rate ~14%. Full fine-tuning with all layers degrades performance. Why?
2. With DenseNet121 - error rate ~8% - very good. Full fine-tuning with all layers degrades performance. Why?
3. Check this - https://medium.com/@iamvince/flowers-classification-using-fastai-and-attaining-an-accuracy-above-95-28fd53b59940 
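
The error rates in these notes convert to accuracy as 1 minus the error rate, so an error rate of about 8% corresponds to an accuracy of about 92%. That makes it easier to compare against accuracy figures such as the one in the linked post. A minimal sketch with made-up labels, where `error_rate` is a plain-Python stand-in for the metric:

```python
def error_rate(actual, predicted):
    """Fraction of predictions that do not match the true label."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

# Hypothetical labels: 2 mistakes out of 25 predictions.
actual = list(range(25))
predicted = actual[:]
predicted[3], predicted[7] = 99, 99

err = error_rate(actual, predicted)
print(f"error rate {err:.0%}, accuracy {1 - err:.0%}")
```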