<a href="https://colab.research.google.com/github/AbertayProgrammers/differentiable-programming/blob/master/Intro_to_Differentiable_Programming.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Intro to Differentiable Programming

Usually, writing a computer program involves explicitly specifying all the steps involved, using loops, if..else blocks, and other standard programming language features. 

[Differentiable programming](https://en.wikipedia.org/wiki/Differentiable_programming) involves a different process, where a program is *trained* to produce the desired outputs using a set of examples. Each example contains some data (such as an image or a numeric record) to input into the program, and the output which the program should produce when given that data. This is the technique behind deep learning algorithms/artificial neural networks, which power most modern 'AI' applications. [See this article](https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/) for an explanation of machine learning, deep learning and artificial intelligence.

A good example of when this approach would be useful is image recognition. Trying to manually design a program to do this would be extremely complicated, but using an artificial neural network, and a dataset of images and labels of the objects they contain, we can train a program to accurately recognise images.

The details of this approach can be quite complicated, but there are many high-level libraries which allow us to create and train models like this in only a few lines of code. We'll see how to do this using the [fastai](https://docs.fast.ai) and [PyTorch](https://pytorch.org) libraries in Python, to get an idea of how differentiable programming works in practice.

## Setting up the environment

First, make sure this notebook is running and GPU acceleration is enabled. Click the 'Runtime' menu, select 'Change runtime type' and make sure GPU is selected as the hardware accelerator. Once this is set, click the Connect button in the upper righthand area of the page (if it does not already say 'Connected').

Once the notebook is running, run the code cell below by clicking inside it and pressing Ctrl+Enter, or clicking the play button on the lefthand side of the cell. This first code cell will install the needed libraries.

In [0]:
# hit Ctrl-Enter in here to run this code
! curl -s https://course.fast.ai/setup/colab | bash
! pip install kaggle

The training dataset we'll be using comes from [Kaggle](https://www.kaggle.com), an online platform for machine learning competitions. To make sure you can download the data from Kaggle, follow the steps below.

*   Go to https://www.kaggle.com and create an account
*   Go to https://www.kaggle.com/your_username/account (replacing your_username with your actual username) and click Create New API Token. This will download a file called kaggle.json.
* Open kaggle.json in Notepad or any text editor
* Copy the entire contents of the file (which should be very short) and paste them into the command below, replacing kaggle_json_text (but keeping the single quotes).
* Run the code cell



In [0]:
! echo 'kaggle_json_text' > kaggle.json
! mkdir -p ~/.kaggle/
! mv kaggle.json ~/.kaggle/
! chmod 600 ~/.kaggle/kaggle.json
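
If pasting JSON into a shell command feels fragile, the same file can be written from Python. This is only a minimal sketch: the function name `write_kaggle_token` and the placeholder credentials are made up for illustration, and it assumes the token file holds a JSON object with `username` and `key` fields, which is what the downloaded kaggle.json normally contains.

```python
import json
import os
from pathlib import Path

def write_kaggle_token(username, key, config_dir):
    """Write a kaggle.json file which only its owner can read or write."""
    config_dir = Path(config_dir)
    # create the folder if needed (same effect as mkdir -p)
    config_dir.mkdir(parents=True, exist_ok=True)
    token_path = config_dir / 'kaggle.json'
    token_path.write_text(json.dumps({'username': username, 'key': key}))
    # same effect as chmod 600
    os.chmod(token_path, 0o600)
    return token_path

# hypothetical usage with placeholder values (not real credentials):
# write_kaggle_token('your_username', 'your_api_key', Path.home() / '.kaggle')
```

The usage line is commented out so that running the sketch as written does not touch any real credentials.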

After this, run the next four code cells to configure the notebook, import all the required libraries, set the random seed and disable warnings.

In [0]:
# IPython magic commands (this notebook runs in a Jupyter/IPython interactive environment):
# reload edited modules automatically and draw plots inline
%reload_ext autoreload
%autoreload 2
%matplotlib inline

In [0]:
# fastai library
from fastai.vision import *
from fastai.metrics import error_rate

# PyTorch library (fastai is built on PyTorch)
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms.functional as TF

# Python image library
from PIL import Image
# a library for showing progress bars
from tqdm import tqdm_notebook as tqdm

In [0]:
random.seed(0)
np.random.seed(0)
torch.manual_seed(0);

In [0]:
# disable warnings
import warnings
warnings.filterwarnings('ignore')

## Preparing the Dataset


Run the next four cells to set up the needed directories and download the data from Kaggle. If you haven't seen this syntax for working with folder and file paths before, and are interested, take a look at Python 3's [pathlib](https://docs.python.org/3/library/pathlib.html).

The dataset we'll be using is of images of plant seedlings. This is a handy dataset for this example since it contains 12 similar-looking categories of object to identify (so training an image recognition program won't be overly easy) but it's small enough to train fairly quickly.

In [0]:
# set the dataset path
dataset_path = Config.data_path()/'seedlings'
# make sure this folder is created
dataset_path.mkdir(parents=True, exist_ok=True)
# output the path
dataset_path

In [0]:
! kaggle competitions download -c plant-seedlings-classification -f train.zip -p {dataset_path}

In [0]:
! unzip -q -n {dataset_path}/train.zip -d {dataset_path}

In [0]:
train_path = dataset_path/'train'
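
For anyone new to pathlib, here is a small standalone sketch of the path syntax used in the cells above. It works in a throwaway temporary folder (standing in for `Config.data_path()`), so it can be run anywhere without touching the dataset. Note that the `.ls()` method used elsewhere in this notebook appears to be something fastai adds to `Path`; in plain Python, `iterdir()` does a similar job.

```python
import tempfile
from pathlib import Path

# a throwaway base directory, standing in for Config.data_path()
base = Path(tempfile.mkdtemp())

# the / operator joins path components
dataset_dir = base / 'seedlings'
# parents=True creates any missing parent folders;
# exist_ok=True means running this a second time is harmless
dataset_dir.mkdir(parents=True, exist_ok=True)
dataset_dir.mkdir(parents=True, exist_ok=True)

# building a path does not create anything on disk
train_dir = dataset_dir / 'train'
print(train_dir.name)          # last component of the path
print(train_dir.parent.name)   # name of the containing folder
print(dataset_dir.is_dir(), train_dir.exists())
```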

Once the data has been downloaded, run the cell below to print out all the categories present in the data. There should be 12 categories of plant which we will attempt to teach an artificial neural network to recognise.

In [0]:
categories = sorted([f.name for f in train_path.ls()]); categories

We'll need to split the dataset into two, so we can train the model on one part and then test it on another, to check whether it performs well on data which it has not previously seen. Run the next 6 cells to do that, and show how many samples are in each section of the dataset.

In [0]:
# create a path for the validation set
valid_path = dataset_path/'valid'
valid_path.mkdir(exist_ok=True)
# create subfolders for each category
for category in categories:
  (valid_path/category).mkdir(exist_ok=True)

In [0]:
# function to print out how many samples of each category are present in a dataset folder
def show_num_samples(path):
  for category in categories:
    print('{0:<25} {1:>15}'.format(category, str(len((path/category).ls())) + " samples"))

In [0]:
show_num_samples(train_path)

In [0]:
# randomly move roughly 30% of the data to the validation set
# (randint(1, 10) returns 8, 9 or 10 about three times in ten)
random.seed(42)
for category in categories:
  for file in (train_path/category).ls():
    if random.randint(1, 10) >= 8:
      shutil.move(file, valid_path/category/file.name)

In [0]:
show_num_samples(train_path)

In [0]:
show_num_samples(valid_path)
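
The loop above decides file by file, so the share which ends up in the validation set is only approximate. If an exact share is wanted, one option is to shuffle the file names and slice the list. The sketch below is an illustration rather than part of the notebook's pipeline: `split_names` and the file names are made up, and it only splits a list of names without moving any files.

```python
import random

def split_names(names, valid_fraction=0.2, seed=42):
    """Return (train, valid) lists with an exact validation share."""
    # a local generator, so the global random state is left untouched
    rng = random.Random(seed)
    shuffled = list(names)
    rng.shuffle(shuffled)
    n_valid = round(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]

# hypothetical file names standing in for one category folder
names = ['img_{:03d}.png'.format(i) for i in range(50)]
train, valid = split_names(names)
print(len(train), len(valid))
```

With 50 names and a fraction of 0.2, this gives 40 training names and 10 validation names every time.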

Before moving on, if at any point you want to check the documentation for any of the features of fastai or any other library, type in a question mark before the name of a class or function and run the cell - this will show you some documentation. Two question marks will show you the source code.

In [0]:
?ImageDataBunch.from_folder
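
The question mark syntax only works in IPython/Jupyter. In a plain Python script, the standard `inspect` module gives roughly the same information. A minimal sketch, using `json.dumps` from the standard library as the example function so that it runs without fastai installed:

```python
import inspect
import json

# roughly what ?json.dumps shows: the signature and the docstring
print(inspect.signature(json.dumps))
print(inspect.getdoc(json.dumps).splitlines()[0])

# roughly what ??json.dumps shows: the source code
source = inspect.getsource(json.dumps)
print(source.splitlines()[0])
```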

Once all the image files are in the right place, we can load the datasets using the fastai library, and display a few of the images to visualise the dataset. We'll name the dataset clean_data, because later we'll investigate how to alter the data in order to fool the model, exposing some of the potential weaknesses of these artificial neural network models.

In [0]:
# create a fastai ImageDataBunch using the training set and validation sets we've created
# size is the image size (224x224) and bs is the batch size
clean_data = ImageDataBunch.from_folder(dataset_path, 
  train='train', valid='valid', ds_tfms=get_transforms(), size=224, bs=48)
# normalize each colour channel using a mean of 0.5 and a standard deviation of 0.5,
# which rescales pixel values from the range [0, 1] to the range [-1, 1]
clean_data.normalize(([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]));
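
To see what this normalization does to a single pixel value, here is a small sketch in plain Python. Normalizing with a mean and standard deviation of 0.5 subtracts 0.5 and then divides by 0.5. The inverse (`value * 0.5 + 0.5`) is the same arithmetic that appears later in this notebook when adversarial tensors are converted back into images. The function names are made up for this illustration.

```python
def normalize(value, mean=0.5, std=0.5):
    # the same arithmetic a normalize transform applies to each channel
    return (value - mean) / std

def denormalize(value, mean=0.5, std=0.5):
    # the inverse, used to turn a normalized tensor back into an image
    return value * std + mean

for pixel in (0.0, 0.25, 0.5, 1.0):
    z = normalize(pixel)
    print(pixel, '->', z, '->', denormalize(z))
```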

In [0]:
clean_data.show_batch(rows=3, figsize=(7,6))

## Training a Classifier

Now the data is loaded, run the cell below to create an image classification model which is based on the ResNet34 [convolutional neural network](https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/) (CNN) architecture, and pretrained on ImageNet, a database of over 14 million images. Since the model has been pretrained on an image recognition problem, we will be able to fine-tune it for this specific problem much quicker than if we started from scratch.

In [0]:
classifier_resnet34 = cnn_learner(clean_data, models.resnet34, metrics=error_rate)

Run the cell below to train the model. The one argument we pass to the fit_one_cycle function is the number of times the model looks over the whole dataset. Each loop over the whole dataset, known as an epoch, should take about 1 minute 30 seconds. You can choose the number of epochs: 6 should lead to an error_rate of around 10%, i.e. around 90% accuracy. Keep in mind that this dataset has 12 categories, so random guessing would give only around 8% accuracy. Also, if you run this cell more than once, training will continue from where it left off, since training updates the parameters of the model directly.

The 'loss' in the table is a measure of how far the model's outputs were from the desired outputs in the training set. Training involves calculating the partial derivative of the loss with respect to each of the model's parameters (internal variables which we can change to change the model's behaviour), so that we know whether to make each parameter bigger or smaller to make the loss lower next time. This is what makes this technique 'differentiable' programming.
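
To make the idea concrete, here is a toy illustration in plain Python, separate from the notebook's training cell. It is a sketch under simplifying assumptions: the 'model' has a single parameter `w`, the examples are made up, and the derivative is written out by hand. Libraries like PyTorch work these derivatives out automatically, for millions of parameters at once.

```python
# toy training set: inputs paired with desired outputs (here y = 3 * x)
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

def loss(w):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in examples) / len(examples)

def loss_gradient(w):
    """Derivative of the loss with respect to w, worked out by hand."""
    return sum(2 * (w * x - y) * x for x, y in examples) / len(examples)

w = 0.0               # initial parameter value
learning_rate = 0.05  # how big a step to take each time
for step in range(50):
    # a positive gradient means increasing w would raise the loss,
    # so we move w the other way (and vice versa)
    w -= learning_rate * loss_gradient(w)

print(round(w, 3), round(loss(w), 6))
```

After 50 steps `w` ends up very close to 3, the value which makes the loss zero for these examples.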

In [0]:
classifier_resnet34.fit_one_cycle(6)

Run the following cell to save the parameters of the trained model - we'll need to load this saved model again later.

In [0]:
classifier_resnet34.save('resnet34')

## Training a Classifier: Results

Run the following four cells to evaluate and plot the accuracy of the model we've trained. The confusion matrix is a plot of actual categories vs. predicted categories, and if the model has been sufficiently trained, the highest numbers should be along the diagonal, which corresponds to correct predictions.

In [0]:
interp_classifier_resnet34 = ClassificationInterpretation.from_learner(classifier_resnet34)

In [0]:
interp_classifier_resnet34.plot_confusion_matrix()

In [0]:
def accuracy_from_confusion_matrix(confusion_matrix):
  return confusion_matrix.diagonal().sum() / confusion_matrix.sum()

In [0]:
accuracy_from_confusion_matrix(interp_classifier_resnet34.confusion_matrix())
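
To see where these numbers come from, here is a small standalone sketch which builds a confusion matrix from lists of actual and predicted labels, then computes accuracy from its diagonal. The labels are made up, and the matrix is a plain nested list rather than the array that fastai returns.

```python
def confusion_matrix(actual, predicted, num_classes):
    """matrix[i][j] counts samples of true class i predicted as class j."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def accuracy(matrix):
    # correct predictions sit on the diagonal
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# made-up labels for a 3-class problem
actual    = [0, 0, 0, 1, 1, 2, 2, 2]
predicted = [0, 0, 1, 1, 1, 2, 0, 2]
m = confusion_matrix(actual, predicted, 3)
for row in m:
    print(row)
print(accuracy(m))
```

Six of the eight made-up predictions are correct, so the accuracy printed is 0.75.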

## Generating Adversarial Examples

These deep learning algorithms are incredibly powerful, and have [many uses](http://www.yaronhadad.com/deep-learning-most-amazing-applications/) beyond image recognition, but they are not without their weaknesses. It can be surprisingly easy to create inputs which look perfectly normal to a person, but which completely fool the model. These are known as adversarial examples, and they can have serious implications, especially when models are used in critical applications, like real-time computer vision in self-driving cars or drones. (These applications would use a camera input, but similar principles can be used to create [physical objects](https://bair.berkeley.edu/blog/2017/12/30/yolo-attack/) which fool the model.)

Run the following cell to define a function which converts normal images into adversarial ones. Basically, it alters the pixel data of the image to increase the error of the model, in the same way as the parameters of the model are updated to decrease the error during usual training. This is the most basic method, and samples created this way will be unlikely to fool other models, but the general principle is the same for more sophisticated methods.

In [0]:
def fast_gradient_sign(img, label, e, model):
  # keep track of the gradients (partial derivatives) of the image's pixel data
  img.requires_grad = True
  # get the model's output when it is passed the image
  out = model(img)
  # calculate the loss using the cross entropy loss function
  loss = F.cross_entropy(out, label)
  # calculate the partial derivatives of the loss with respect to the pixel data
  loss.backward()
  # get whether these derivatives are positive or negative
  s = img.grad.sign()
  # update each pixel by a tiny amount in the direction which would cause the loss
  # to increase. If the gradient/slope/partial derivative of the loss function with
  # respect to part of a pixel is positive, making that number bigger would increase
  # the loss, and vice versa. 
  img_adv = img + e * s
  return img_adv
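
The same idea can be tried without a GPU or a neural network. The sketch below applies the fast gradient sign step to a tiny made-up linear model with three inputs, where the gradient of the loss with respect to the input can be written out by hand. The weights, input and step size are all invented for illustration; it is not the notebook's model.

```python
import math

# a fixed, made-up linear model: p(class 1) = sigmoid(w . x + b)
weights = [2.0, -1.0, 0.5]
bias = 0.1

def predict(x):
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def loss(x, label):
    """Binary cross-entropy for a single example."""
    p = predict(x)
    return -math.log(p) if label == 1 else -math.log(1.0 - p)

def input_gradient(x, label):
    """For this model, d(loss)/d(x_i) = (p - label) * w_i."""
    p = predict(x)
    return [(p - label) * w for w in weights]

def sign(v):
    return (v > 0) - (v < 0)

def fast_gradient_sign_toy(x, label, e):
    grad = input_gradient(x, label)
    # nudge every input by e in the direction which increases the loss
    return [xi + e * sign(g) for xi, g in zip(x, grad)]

x = [0.6, 0.2, 0.4]
label = 1
x_adv = fast_gradient_sign_toy(x, label, e=0.1)
print(loss(x, label), loss(x_adv, label))
```

The second loss printed is larger than the first, even though no input value has changed by more than 0.1.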

Now run the cell below to create a copy of the validation set where each image has been converted to an adversarial image.

In [0]:
# create a path for the adversarial images. 'Untargeted' refers to the fact that
# this method simply tries to get the model to make any wrong prediction, not
# a specific wrong prediction
adv_untargeted_valid_path = dataset_path/'valid_adv_untargeted'
adv_untargeted_valid_path.mkdir(exist_ok=True)
# for each category
for i, category in enumerate(categories):
  # create a folder for that category
  (adv_untargeted_valid_path/category).mkdir(exist_ok=True)
  # for each file in the validation set (and generate a progress bar)
  for file in tqdm((valid_path/category).ls(), desc=category):
    # open the image
    img = Image.open(file).convert('RGB')
    # convert the image to a PyTorch tensor (basically a fancy multi-dimensional array)
    img_as_tensor = TF.normalize(TF.to_tensor(TF.resize(img, 224)), 
                                 (0.5, 0.5, 0.5), 
                                 (0.5, 0.5, 0.5))
    # use our fast gradient sign function to alter the image tensor
    img_adv_as_tensor = fast_gradient_sign(img_as_tensor.unsqueeze(0).cuda(), 
                                           torch.tensor([i]).cuda(),
                                           0.001,
                                           classifier_resnet34.model)
    # convert the image back to a python image
    img_adv = TF.to_pil_image(img_adv_as_tensor.squeeze().detach().cpu() * 0.5 + 0.5)
    # save to disk
    img_adv.save(adv_untargeted_valid_path/category/file.name)

We can now load this adversarial data using fastai, and show a few of the images to make sure they look normal.

In [0]:
data_adv_untargeted = ImageDataBunch.from_folder(dataset_path,
                                                 train='train',
                                                 valid = adv_untargeted_valid_path.name,
                                                 ds_tfms=get_transforms(),
                                                 size=224,
                                                 bs=48)
data_adv_untargeted.normalize(([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]));

In [0]:
data_adv_untargeted.show_batch(rows=3, figsize=(7,6), ds_type=DatasetType.Valid)

## Generating Adversarial Examples (Results)


Run the following cell to create a copy of the original model which uses the adversarial validation set, so we can check the model's accuracy on it. Rather than training the model again, we'll load the previously saved model data.

In [0]:
classifier_resnet34_adv_untargeted = cnn_learner(data_adv_untargeted,
                                                 models.resnet34,
                                                 metrics=error_rate,
                                                 pretrained=False)
classifier_resnet34_adv_untargeted.load('resnet34');

Run the next three cells to check the model's accuracy on the adversarial data. You should see a much more scattered confusion matrix and a very low accuracy.

In [0]:
interp_resnet34_adv_untargeted = ClassificationInterpretation.from_learner(classifier_resnet34_adv_untargeted)

In [0]:
interp_resnet34_adv_untargeted.plot_confusion_matrix()

In [0]:
accuracy_from_confusion_matrix(interp_resnet34_adv_untargeted.confusion_matrix())

## Generating Targeted Adversarial Examples

The adversarial examples generated above were not targeted to a specific class - the aim was simply for the model to fail to classify them correctly. We can also generate adversarial examples which are targeted, so the model will produce a specific wrong result.

Run the next two code cells to define functions to do just this. This time, we'll alter the images multiple times in sequence, which will improve their effectiveness (but make them more specific to this particular model).

In [0]:
# return altered image which attempts to make the model classify it as the given label
def fast_gradient_sign_targeted(img, label, e, model):
  img.requires_grad = True
  out = model(img)
  loss = F.cross_entropy(out, label)
  loss.backward()
  s = img.grad.sign()
  x_adv = img - e * s
  return x_adv

In [0]:
# alter the image a number of times in sequence
def fast_gradient_sign_targeted_iterative(img, label, e, it, model):
  for _ in range(it):
    x_adv = fast_gradient_sign_targeted(img, label, e, model)
    img = x_adv.detach()
  return x_adv
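
As a rough illustration of why repeating small steps works, here is the targeted, iterative version applied to a made-up three-class linear model in plain Python. Everything here (the weight matrix `W`, the starting input, the step size) is invented for the sketch. The only thing it shares with the cells above is the update rule: step against the sign of the gradient of the loss for the target label.

```python
import math

# made-up 3-class linear model: logits[k] = sum_i W[k][i] * x[i]
W = [[ 1.0, -0.5],
     [-0.5,  1.0],
     [ 0.2,  0.2]]

def softmax_probs(x):
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def input_gradient(x, label):
    """Gradient of cross-entropy(softmax(W x), label) with respect to x."""
    probs = softmax_probs(x)
    return [sum((probs[k] - (k == label)) * W[k][i] for k in range(len(W)))
            for i in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def targeted_iterative_toy(x, target, e, it):
    for _ in range(it):
        grad = input_gradient(x, target)
        # step against the gradient: this lowers the loss for the target label
        x = [xi - e * sign(g) for xi, g in zip(x, grad)]
    return x

x = [1.0, 0.0]   # the model favours class 0 for this input
target = 1       # the wrong class we want it to predict
x_adv = targeted_iterative_toy(x, target, e=0.02, it=40)
print(softmax_probs(x))
print(softmax_probs(x_adv))
```

In the first line of output class 0 has the highest probability; in the second, class 1 has become the most likely class.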

The next three cells will select a false target category for each true category and create a copy of the validation set composed of targeted adversarial examples.

In [0]:
# generate a target category for each real category
random.seed(12)
target_categories = []
for i in range(len(categories)):
  other_categories = list(range(12))
  other_categories.remove(i)
  target_categories.append(random.choice(other_categories))

In [0]:
target_categories

In [0]:
# create a copy of the validation set where each image is a targeted adversarial example
adv_targeted_valid_path = dataset_path/'valid_adv_targeted'
adv_targeted_valid_path.mkdir(exist_ok=True)
for i, category in enumerate(categories):
  (adv_targeted_valid_path/category).mkdir(exist_ok=True)
  print(category + ' -> ' + categories[target_categories[i]])
  for file in tqdm((valid_path/category).ls()):
    img = Image.open(file).convert('RGB')
    img_as_tensor = TF.normalize(TF.to_tensor(TF.resize(img, 224)), 
                                 (0.5, 0.5, 0.5), 
                                 (0.5, 0.5, 0.5))
    img_adv_as_tensor = fast_gradient_sign_targeted_iterative(img_as_tensor.unsqueeze(0).cuda(), 
                                           torch.tensor([target_categories[i]]).cuda(),
                                           0.0002,
                                           40,
                                           classifier_resnet34.model)
    img_adv = TF.to_pil_image(img_adv_as_tensor.squeeze().detach().cpu() * 0.5 + 0.5)
    img_adv.save(adv_targeted_valid_path/category/file.name)

Now run the next two cells to load this data and display a few of the images to confirm that they look like regular images.

In [0]:
data_adv_targeted = ImageDataBunch.from_folder(dataset_path,
                                               train='train',
                                               valid = adv_targeted_valid_path.name,
                                               ds_tfms=get_transforms(),
                                               size=224,
                                               bs=48)
data_adv_targeted.normalize(([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]));

In [0]:
data_adv_targeted.show_batch(rows=3, figsize=(7,6), ds_type=DatasetType.Valid)

## Generating Adversarial Examples (Targeted, White-box): Results


Finally, we can check the model's performance on these targeted adversarial examples. As before, we create a copy of the model and load the saved state, and generate a confusion matrix. You should find the results are concentrated in particular cells corresponding to the target categories.

In [0]:
classifier_resnet34_adv_targeted = cnn_learner(data_adv_targeted,
                                               models.resnet34,
                                               metrics=error_rate,
                                               pretrained=False)
classifier_resnet34_adv_targeted.load('resnet34');

In [0]:
interp_resnet34_adv_targeted = ClassificationInterpretation.from_learner(classifier_resnet34_adv_targeted)

In [0]:
interp_resnet34_adv_targeted.plot_confusion_matrix()

In [0]:
accuracy_from_confusion_matrix(interp_resnet34_adv_targeted.confusion_matrix())

In [0]:
target_categories

We can also check how well the model followed our target categories. This should be very high.

In [0]:
def targeted_accuracy(interp):
  hits = 0
  for i in range(12):
    hits += interp.confusion_matrix()[i,target_categories[i]]
  return hits / interp.confusion_matrix().sum()

In [0]:
targeted_accuracy(interp_resnet34_adv_targeted)

## Further resources

There is a lot more you can do with deep learning using fastai and other libraries. https://course.fast.ai is home to an excellent series of video courses which have accompanying interactive notebooks like this one. You can have a look at some of these by clicking File -> Open notebook... and going to the GitHub tab. Type in fastai/course-v3 in the search bar and look for notebooks in the nbs/dl1 directory. Also, there are several good PyTorch tutorials on the [PyTorch website](https://pytorch.org/tutorials/).