**Important: This notebook only works with fastai-0.7.x. Do not try to run it with any fastai-1.x version.**

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here are several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

import os
#print(os.listdir("../input"))

# Any results you write to the current directory are saved as output.

In [None]:
!pip install fastai==0.7.0
!pip install torchtext==0.2.3
!pip install torch==0.3.1

In [None]:
from fastai.transforms import *
from fastai.conv_learner import * 
from fastai.model import *
from fastai.dataset import *
from fastai.sgdr import *
from fastai.plots import *

In [None]:
from fastai.imports import *

In [None]:
torch.cuda.is_available()

In [None]:
torch.backends.cudnn.enabled

In [None]:
os.listdir("../input")

In [None]:
filenames = os.listdir('../input/train/train')[:5]
filenames

In [None]:
path = "../input/train/train"
img = plt.imread(f'{path}/{filenames[0]}')
plt.imshow(img)

In [None]:
img.shape

In [None]:
shutil.rmtree(f'{path}tmp', ignore_errors=True)

In [None]:
img[:4,:4]

In [None]:
PATH = "../input/"
TMP_PATH = "/tmp/tmp"
MODEL_PATH = "/tmp/model/"
sz=224

In [None]:
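# Image paths relative to PATH, in sorted order
fnames = np.array([f'train/train/{i}' for i in sorted(os.listdir(f'{PATH}train/train'))])
# Label 0 = cat, 1 = dog, inferred from the file name
label = np.array([0 if 'cat' in fname else 1 for fname in fnames]).astype(np.double)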

## **The following error can be resolved by the solution [here](https://forums.fast.ai/t/windows-10-installation-notes-windows-command-and-wsl-bash/6500/55)**

In [None]:
arch = resnet34
data = ImageClassifierData.from_names_and_array(path=PATH,
                                                fnames=fnames,
                                                y=label,
                                                classes=['cats', 'dogs'],  # index order matches label: 0 = cat, 1 = dog
                                                test_name=f'{PATH}test1/test1',
                                                tfms=tfms_from_model(arch, sz))
learn = ConvLearner.pretrained(arch, data, precompute=True, tmp_name=TMP_PATH, models_name=MODEL_PATH)
learn.fit(0.01 , 2)

In [None]:
data.val_y

In [None]:
data.classes

In [None]:
log_preds = learn.predict()
log_preds.shape

In [None]:
log_preds[:10]

In [None]:
preds = np.argmax(log_preds, axis=1)  # from log probabilities to 0 or 1
probs = np.exp(log_preds[:,1])        # pr(dog)
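
The conversion in the cell above can be checked on a small example. This is a minimal sketch with made-up log-probabilities (the names `toy_log_preds`, `toy_preds`, and `toy_probs` are hypothetical and not part of the notebook): `argmax` picks the most likely class per row, and `exp` undoes the log to recover the probability of class 1.

```python
import numpy as np

# Hypothetical log-probabilities for three images and two classes
# (column 0 = class 0, column 1 = class 1); this is not real model output.
toy_log_preds = np.log(np.array([[0.9, 0.1],
                                 [0.2, 0.8],
                                 [0.5, 0.5]]))

toy_preds = np.argmax(toy_log_preds, axis=1)  # index of the most likely class per row
toy_probs = np.exp(toy_log_preds[:, 1])       # exp undoes the log: probability of class 1

print(toy_preds)
print(toy_probs)
```

Note that on an exact tie, `np.argmax` returns the first index, so the third row is assigned class 0.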

In [None]:
def rand_by_mask(mask): return np.random.choice(np.where(mask)[0], 4, replace=False)
def rand_by_correct(is_correct): return rand_by_mask((preds == data.val_y)==is_correct)

In [None]:
def plots(ims, figsize=(12,6), rows=1, titles=None):
    f = plt.figure(figsize=figsize)
    for i in range(len(ims)):
        sp = f.add_subplot(rows, len(ims)//rows, i+1)
        sp.axis('Off')
        if titles is not None: sp.set_title(titles[i], fontsize=16)
        plt.imshow(ims[i])

In [None]:
def load_img_id(ds, idx): return np.array(PIL.Image.open(PATH+ds.fnames[idx]))

def plot_val_with_title(idxs, title):
    imgs = [load_img_id(data.val_ds,x) for x in idxs]
    title_probs = [probs[x] for x in idxs]
    print(title)
    return plots(imgs, rows=1, titles=title_probs, figsize=(16,8))

In [None]:
plot_val_with_title(rand_by_correct(True), "Correctly classified")

In [None]:
# A few incorrect labels at random
plot_val_with_title(rand_by_correct(False), "Incorrectly classified")

In [None]:
def most_by_mask(mask, mult):
    idxs = np.where(mask)[0]
    return idxs[np.argsort(mult * probs[idxs])[:4]]

def most_by_correct(y, is_correct): 
    mult = -1 if (y==1)==is_correct else 1
    return most_by_mask(((preds == data.val_y)==is_correct) & (data.val_y == y), mult)

In [None]:
plot_val_with_title(most_by_correct(0, True), "Most correct cats")

In [None]:
plot_val_with_title(most_by_correct(1, True), "Most correct dogs")

In [None]:
plot_val_with_title(most_by_correct(0, False), "Most incorrect cats")

In [None]:
plot_val_with_title(most_by_correct(1, False), "Most incorrect dogs")

In [None]:
most_uncertain = np.argsort(np.abs(probs - 0.5))[:4]
plot_val_with_title(most_uncertain, "Most uncertain predictions")

### **Choosing a learning rate**
The learning rate determines how quickly or how slowly the weights (or parameters) are updated. It is one of the most difficult parameters to set, because it significantly affects model performance.

The method `learn.lr_find()` helps you find a good learning rate. It uses the technique developed in the 2015 paper *Cyclical Learning Rates for Training Neural Networks*, where we simply keep increasing the learning rate from a very small value until the loss stops decreasing. We can plot the learning rate across batches to see what this looks like.
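
To make the idea concrete, here is a minimal NumPy sketch of a learning-rate range test on a toy least-squares problem. It is not fastai's implementation of `lr_find()`: the function name `lr_range_test`, the toy data, the use of full-batch gradients, and the "stop when the loss exceeds four times the best loss" rule are all assumptions chosen for illustration.

```python
import numpy as np

def lr_range_test(start_lr=1e-5, end_lr=10.0, num_iters=100, seed=0):
    """Toy learning-rate range test on least-squares regression (illustrative only)."""
    rng = np.random.RandomState(seed)
    X = rng.randn(256, 5)
    true_w = rng.randn(5)
    y = X @ true_w + 0.1 * rng.randn(256)   # small noise so the loss has a floor
    w = np.zeros(5)

    # Multiply the learning rate by a constant factor after every iteration,
    # so it grows exponentially from start_lr to end_lr.
    mult = (end_lr / start_lr) ** (1.0 / (num_iters - 1))
    lr, best = start_lr, np.inf
    lrs, losses = [], []
    for _ in range(num_iters):
        err = X @ w - y
        loss = float(np.mean(err ** 2))
        lrs.append(lr)
        losses.append(loss)
        if loss > 4 * best:                  # assumed stopping rule: loss has blown up
            break
        best = min(best, loss)
        grad = 2 * X.T @ err / len(X)        # gradient of the mean squared error
        w -= lr * grad                       # one gradient-descent step (full batch for simplicity)
        lr *= mult
    return np.array(lrs), np.array(losses)

lrs, losses = lr_range_test()
print("learning rate at the lowest recorded loss:", lrs[np.argmin(losses)])
```

Plotting `losses` against `lrs` on a logarithmic x-axis gives the same kind of curve as the loss-versus-learning-rate plot produced later in this notebook.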

We first create a new learner, since we want to know how to set the learning rate for a new (untrained) model.

In [None]:
learn = ConvLearner.pretrained(arch, data, precompute=True, tmp_name=TMP_PATH, models_name=MODEL_PATH)

In [None]:
lrf=learn.lr_find()

Our `learn` object has an attribute `sched` that holds the learning rate scheduler and offers some convenient plotting functions, including this one:

In [None]:
learn.sched.plot_lr()

Note that in the previous plot, an iteration is one iteration (or minibatch) of SGD. One epoch consists of (num_train_samples / batch_size) iterations of SGD.
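
As a quick check of the arithmetic, the number of SGD iterations in one epoch is the training-set size divided by the batch size, rounded up when the final partial batch is kept. The sizes below are hypothetical and only illustrate the calculation; they are not read from this notebook's `data` object.

```python
import math

# Hypothetical sizes, chosen only to illustrate the arithmetic.
num_train_samples = 20_000
batch_size = 64

# One iteration processes one minibatch; the last batch may be smaller, hence ceil.
iterations_per_epoch = math.ceil(num_train_samples / batch_size)
print(iterations_per_epoch)
```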

We can see the plot of loss versus learning rate to see where our loss stops decreasing:

In [None]:
learn.sched.plot()

## **Improving our model**
### **Data augmentation**
If you try training for more epochs, you'll notice that we start to overfit, which means that our model is learning to recognize the specific images in the training set, rather than generalizing such that we also get good results on the validation set. One way to fix this is to effectively create more data, through data augmentation. This refers to randomly changing the images in ways that shouldn't impact their interpretation, such as horizontal flipping, zooming, and rotating.

We can do this by passing `aug_tfms` (augmentation transforms) to `tfms_from_model`, with a list of functions to apply that randomly change the image however we wish. For photos that are largely taken from the side (e.g. most photos of dogs and cats, as opposed to photos taken from the top down, such as satellite imagery) we can use the pre-defined list of functions `transforms_side_on`. We can also specify random zooming of images up to a specified scale by adding the `max_zoom` parameter.
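
To illustrate what such transforms do, here is a minimal NumPy sketch of two of them: a random horizontal flip and a random zoom of up to `max_zoom`. It is not how fastai implements `transforms_side_on`; the function name `random_side_on_augment`, the centre crop, and the nearest-neighbour resize are assumptions made to keep the example self-contained.

```python
import numpy as np

def random_side_on_augment(img, rng, max_zoom=1.1):
    """Randomly flip and zoom an H x W x C image (illustrative only)."""
    out = img
    if rng.rand() < 0.5:
        out = out[:, ::-1]                       # horizontal flip: reverse the width axis

    # Zoom in by cropping the centre and resizing back to the original size.
    zoom = rng.uniform(1.0, max_zoom)
    h, w = out.shape[:2]
    ch, cw = int(round(h / zoom)), int(round(w / zoom))
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = out[top:top + ch, left:left + cw]

    # Nearest-neighbour resize: map each output row/column to a source row/column.
    rows = np.arange(h) * ch // h
    cols = np.arange(w) * cw // w
    return crop[rows][:, cols]

rng = np.random.RandomState(0)
toy_img = rng.randint(0, 256, size=(8, 8, 3)).astype(np.uint8)  # made-up image
toy_augs = [random_side_on_augment(toy_img, rng) for _ in range(6)]
print([a.shape for a in toy_augs])
```

Each call returns a slightly different version of the same image with the same shape, which is why augmentation acts like extra training data without changing what the image depicts.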

In [None]:
tfms = tfms_from_model(resnet34, sz, aug_tfms=transforms_side_on, max_zoom=1.1)

In [None]:
def get_augs():
    data = ImageClassifierData.from_names_and_array(
        path=PATH, 
        fnames=fnames, 
        y=label, 
        classes=['cats', 'dogs'],  # index order matches label: 0 = cat, 1 = dog
        test_name=f'{PATH}test1/test1', 
        tfms=tfms,
        num_workers=1,
        bs=2
    )
    x,_ = next(iter(data.aug_dl))
    return data.trn_ds.denorm(x)[1]

In [None]:
ims = np.stack([get_augs() for i in range(6)])

In [None]:
plots(ims , rows=2)