# Exercise 4 - Help! My Network Doesn't Work!

Once you start applying deep learning to your own work (or just for fun), you will run into problems. A lot of problems. These can cause headaches, lost time and stress.

![John](http://www.paulvangent.com/files/DL_Course/day2_images/John.jpg)

The goal of this notebook is to give you a set of debugging tools. When things don't work, where do you start looking? Hopefully it will help you side-step a lot of frustration and wasted hours.

Some sections will have an exercise where we've made a mystery mistake. It will be up to you to 'debug' our mess and tell us what we need to do better!

### Index
- 4.1 - Problems with your data
- 4.2 - Problems with fitting the network (learning)
- 4.3 - Do you need more data?
- 4.4 - Problems with prediction accuracy

------

As always, run the cell below first to download the datasets and import everything needed.

In [None]:
#download required datasets for this notebook (might take a bit, be patient!)
from urllib.request import urlretrieve
import os
from zipfile import ZipFile

def download(url, file):
    if os.path.isfile(file):
        os.remove(file)
    if not os.path.isfile(file):
        print("Download file... " + file + " ...")
        urlretrieve(url,file)
        print("File downloaded")
        
def unzip(file):
    with ZipFile(file) as f:
        f.extractall()
    print('unzipped file: %s\n' %file)
    
# Try the primary server first; fall back to the mirror if the download fails.
# Catching Exception (not a bare except) keeps Ctrl+C / kernel interrupts working.
try:
    download('http://www.paulvangent.com/files/DL_Course/misc_day2.zip', 'misc_day2.zip')
except Exception:
    download('https://onedrive.live.com/download?cid=39383A5AFCD95065&resid=39383A5AFCD95065%21754608&authkey=AOJO5mcV9eCZsJ8', 'misc_day2.zip')
unzip('misc_day2.zip')

try:
    download('http://www.paulvangent.com/files/DL_Course/catsndogs2.zip', 'catsndogs2.zip')
except Exception:
    download('https://onedrive.live.com/download?cid=39383A5AFCD95065&resid=39383A5AFCD95065%21754593&authkey=AGQ0ehV6iCgTUhQ', 'catsndogs2.zip')
unzip('catsndogs2.zip')

import numpy as np
import matplotlib.pyplot as plt
import random

import cv2
import utils_day2 as utils
from misc import convnet
from keras import callbacks

## 4.1 - Problems with your data

![baddata](http://www.paulvangent.com/files/DL_Course/day2_images/Iseebaddata.png)

The first thing to check when your model isn't performing as expected is what you're feeding it. Maybe you've made a mistake in scaling your labels, maybe your labels are ordered differently from your training samples, or maybe the data is not being loaded properly. As with any modelling task: **Garbage in, garbage out!**

Rather than checking every single thing every time something goes wrong, we recommend you look at the following things:

### 0. Is your input loaded?
When loading (image) data, some libraries may return an array filled with 'NaN' or zeros whenever they run into an issue. This array could still have the expected shape! Be sure to check and/or visualise the data after you load it into a training set.
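A check like this can be automated. The sketch below is a minimal example, not part of the course utilities: the helper name `check_loaded_array` and the toy array are made up for illustration. It flags NaN values, all-zero arrays and individual samples that are completely constant.

```python
import numpy as np

def check_loaded_array(X, name="X"):
    """Return a list of warnings about a freshly loaded data array (hypothetical helper)."""
    X = np.asarray(X, dtype=np.float64)
    problems = []
    if np.isnan(X).any():
        problems.append(f"{name} contains NaN values")
    if not np.any(X):
        problems.append(f"{name} is all zeros")
    # Flatten every sample and count the ones where all values are identical
    flat = X.reshape(len(X), -1)
    n_blank = int(np.sum(flat.max(axis=1) == flat.min(axis=1)))
    if n_blank:
        problems.append(f"{name} has {n_blank} constant (blank) samples")
    return problems

# Toy data standing in for a loaded image set: 4 "images" of 2x2 pixels
rng = np.random.default_rng(0)
X_demo = rng.random((4, 2, 2))
X_demo[1] = 0.0  # simulate one image that silently failed to load
print(X_demo.shape, check_loaded_array(X_demo, "X_demo"))
```

Note that the shape alone looks perfectly fine here, which is exactly why the content needs to be checked as well.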

### 1. Did you normalize / standardize inputs?
As you learned in Exercise 2, normalization of data and labels can have a substantial impact on training time and model efficacy, to the point of making the difference between the model learning or not learning anything.
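As a reminder, a minimal standardization sketch could look like the following. The function names are our own and the data is random toy data. The key design choice is that the mean and standard deviation are computed on the training set only and then re-used for validation and test data.

```python
import numpy as np

def fit_standardizer(X_train, eps=1e-8):
    """Compute per-feature mean and std on the training data only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0) + eps  # eps avoids division by zero for constant features
    return mean, std

def standardize(X, mean, std):
    """Apply previously fitted statistics to any data split."""
    return (X - mean) / std

# Toy data: 200 samples with 3 features on a scale far from zero mean / unit variance
rng = np.random.default_rng(0)
X_train_demo = rng.normal(50.0, 10.0, size=(200, 3))
mean, std = fit_standardizer(X_train_demo)
Z = standardize(X_train_demo, mean, std)
print(Z.mean(axis=0), Z.std(axis=0))
```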

### 2. Did you scale your input labels? Do they still make sense?
Visualise, visualise, visualise! It is easy to accidentally mix up data axes. For example, when dealing with images, OpenCV specifies points as (x, y), while NumPy indexes arrays as [row, column], i.e. [y, x]. Always visualise or plot your data after preprocessing steps and ensure that whatever property is important is preserved. In one case, swapped axes led to the following labeled data for faces:

**-image of swapped landmarks-**
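The axis-order pitfall can be shown with plain NumPy. In this small sketch (the helper `mark_landmark` and the image size are made up for illustration), a landmark given as (x, y) has to be written into the array as `[y, x]`:

```python
import numpy as np

def mark_landmark(img, x, y, value=255):
    """Set the pixel at image coordinates (x, y) on a copy of img."""
    out = img.copy()
    out[y, x] = value  # NumPy order: row (y) first, then column (x)
    return out

# 100 rows (height) by 200 columns (width)
img = np.zeros((100, 200), dtype=np.uint8)
marked = mark_landmark(img, x=150, y=20)

rows, cols = np.nonzero(marked)
print("row (y):", rows, "column (x):", cols)
# Indexing as img[x, y] would fail here (150 is out of bounds for 100 rows);
# on a square image it would silently put the landmark in the wrong place.
```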

### 3. Is your input connected to your output?
Even if you didn't scale the labels, did you load them properly? Make sure that whatever data you feed the network matches the output label you want it to map to. The network has no conceptual knowledge of anything that's happening, so it doesn't care if you feed it an image of a cat or a few minutes of Beethoven's 9th (assuming at least the array shapes match). Again, visualise before fitting, don't jump in blindly.

### 4. How noisy is your data?
Many datasets contain bad labels. Whether you collect and label the data yourself or someone else does it, mistakes will be made. Usually the network will fit just fine even with a high number of incorrect labels, assuming the errors are somewhat random. For example, the authors of [this paper](https://arxiv.org/pdf/1412.6596.pdf) still trained a model on MNIST with half the labels being wrong!

### 5. Did you shuffle your training data?
If most batches you feed to your network contain a single class, the network will not learn optimally. Make sure each batch contains examples of multiple classes. You can achieve this by shuffling the data before training (see 2.4).
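When shuffling, the samples and labels must be reordered with the same permutation, otherwise you break the link between them. A minimal sketch, where the helper name `shuffle_together` is our own:

```python
import numpy as np

def shuffle_together(X, Y, seed=None):
    """Shuffle samples and labels with one shared permutation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    return X[order], Y[order]

# Toy data in which every label is exactly twice its sample value
X_demo = np.arange(10).reshape(10, 1)
Y_demo = np.arange(10) * 2
X_shuf, Y_shuf = shuffle_together(X_demo, Y_demo, seed=0)
print(X_shuf[:, 0], Y_shuf)
```

After shuffling, each label should still be twice its sample value, which is an easy way to confirm the pairs stayed together.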

### 6. How many of each class are there in your set?
If your task is classification, check how many examples you have of each class. Does 80% of your dataset consist of a single class? This can be problematic. You can try augmenting the data for the under-represented class, change your loss function to adjust for the imbalance, or try one of [several other techniques for class imbalance](https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/).
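Counting classes takes only a few lines. The sketch below assumes one-hot encoded labels; the helper names and toy labels are made up for illustration. It also computes weights with the common "balanced" heuristic `n_samples / (n_classes * count)`. A dictionary like this can be passed to the `class_weight` argument of Keras' `fit` as one way to adjust the loss for the imbalance.

```python
import numpy as np

def class_counts(Y_onehot):
    """Number of samples per class for one-hot encoded labels."""
    labels = np.argmax(Y_onehot, axis=1)
    return np.bincount(labels, minlength=Y_onehot.shape[1])

def balanced_class_weights(counts):
    """Weights inversely proportional to class frequency (assumes no class has 0 samples)."""
    counts = np.asarray(counts, dtype=np.float64)
    weights = counts.sum() / (len(counts) * counts)
    return {i: float(w) for i, w in enumerate(weights)}

# Toy labels: 8 samples of class 0, 2 samples of class 1
Y_demo = np.eye(2)[[0] * 8 + [1] * 2]
counts = class_counts(Y_demo)
weights = balanced_class_weights(counts)
print(counts, weights)
```

It is worth running the count separately on the training and validation slices, since a split can be imbalanced even when the full set is not.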


----------

**Exercise:** The model below tries to classify whether the images are of a cat or a dog, but it doesn't work: it doesn't seem to learn from the training set and right off the bat gets 100% on our validation set. These are both red flags! 

We suspect it's because we made some mistakes in handling the data. Use the checklist above to find out what we did wrong. 

- Run the cell below

In [None]:
model = convnet.build_model(input_shape=(100,100,3), classes=2,
                            final_activation = 'softmax')
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

X, Y = utils.get_data_exercise4()

model.fit(X[:3800], Y[:3800], 
          epochs=5, 
          validation_data=(X[3800:], Y[3800:]))

**What do you see happening? What could be an issue?**

- Use the cell afterwards to explore the data 
- Then describe the problem in the cell after that

In [None]:
#Use this cell to explore the data
##Start of coding segment##



###End of coding segment###

**Describe your problem here** (double click this text to edit it, press shift+enter when you're done)


I've seen the following problems:

- ...


I would solve this by:

- ...

-----------

### Fixing our mess

**Exercise:** Fix the problems you discovered and re-train the model so that it works better than our version.

- You can implement whatever fixes you want in the cell below
- You should be able to reach >95% validation accuracy

Hint: we will tell you the problem is *with the data, not the model*

In [None]:
model = convnet.build_model(input_shape=(100,100,3), classes=2,
                            final_activation = 'softmax')
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

X, Y = utils.get_data_exercise4()


###Start of coding segment###


###End of coding segment###

model.fit(X[:3800], Y[:3800], 
          epochs=10, 
          validation_data=(X[3800:], Y[3800:]))

This exercise was based on an excerpt from the [cats vs dogs kaggle competition](https://www.kaggle.com/c/dogs-vs-cats/data)

## 4.2 - Learning Problems

![learning problems](http://www.paulvangent.com/files/DL_Course/day2_images/waitingtotrain.jpg)

So you've checked your data and found no issues, but the network is still not learning. What can you do now?

### 1. Can the network fit the data?
Check if your network has enough capacity to fit the function required. Try overfitting on a really small subset of the data. For example, take 2 images of each class, turn off any automated early stopping and let the network overfit. If training accuracy doesn't get very high, the network may not have enough capacity, or there may be a software defect that prevents fitting. Check how you load the data, how you build the network, and how you fit. If you cannot find anything wrong with your code, try adding layers to create a deeper network, widening existing layers, or using a different architecture.
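Selecting such a tiny subset could look like the sketch below. The helper `tiny_subset` is our own and assumes one-hot labels; the commented-out `fit` call only indicates how the subset might be used, and the epoch count is an arbitrary choice.

```python
import numpy as np

def tiny_subset(X, Y_onehot, per_class=2):
    """Pick the first `per_class` samples of every class."""
    labels = np.argmax(Y_onehot, axis=1)
    keep = np.concatenate(
        [np.flatnonzero(labels == c)[:per_class] for c in np.unique(labels)]
    )
    return X[keep], Y_onehot[keep]

# Toy data: 6 samples, 3 of each class
X_demo = np.arange(12).reshape(6, 2)
Y_demo = np.eye(2)[[0, 0, 0, 1, 1, 1]]
X_small, Y_small = tiny_subset(X_demo, Y_demo, per_class=2)
print(X_small.shape, Y_small.sum(axis=0))

# With a compiled Keras model, the overfitting test might then look like this
# (no validation data, no early stopping callback):
# model.fit(X_small, Y_small, epochs=200, batch_size=len(X_small))
```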

### 2. Is the activation function on the output layer appropriate?
This is a trap many have fallen into, especially when re-using network architectures. Say you've used a network for a classification task in the past, and now switch to predicting 4 coordinates for a face bounding box (like we did in Exercise 3). For the class-based approach you likely have a softmax function on the final layer, which scales the output vector so that every element lies between 0 and 1 and the elements sum to exactly 1. This is great for mutually exclusive classes, less so for predicting 4 coordinates (whose sum will almost never be exactly 1).
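To see why softmax cannot represent coordinates, here is a small NumPy sketch; the coordinate values are invented for illustration. Whatever goes in, the output is squashed into values that sum to 1, so the original pixel coordinates can never come out.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical bounding-box target: x1, y1, x2, y2 in pixels
target = np.array([12.0, 30.0, 85.0, 96.0])
squashed = softmax(target)

print("target sum: ", target.sum())
print("softmax sum:", squashed.sum())
print("largest softmax output:", squashed.max())
```

For regression targets like these, a linear (identity) activation on the final layer is the usual choice. Whether the `final_activation` argument of `convnet.build_model` accepts `'linear'` is an assumption on our part; in plain Keras, `activation='linear'` is available.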

### 3. Try less regularization
Do you have a lot of dropout layers in your network, a lot of batch normalization layers, or a lot of other regularizing tricks? Turn them off and run the network in a 'no bells and whistles' variant. Does it fit better now? You may have too much regularization in the network.

### 4. Try a different weight initialization
Do you set a weight initializer when building the network? By bad luck, the initialization might start your optimizer in a region where it gets stuck in a poor local minimum. See [this part of the Keras API for initializers](https://keras.io/initializers/). When in doubt, start with Xavier (glorot_uniform) or He initializers.
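To give an idea of what these initializers do, the NumPy sketch below re-implements the uniform variants from their standard definitions: Glorot draws from `[-limit, limit]` with `limit = sqrt(6 / (fan_in + fan_out))`, He from `limit = sqrt(6 / fan_in)`. It is an illustration only, not the Keras source. In Keras itself you would normally just pass `kernel_initializer='glorot_uniform'` or `kernel_initializer='he_uniform'` to a layer.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Xavier/Glorot uniform: range shrinks as the layer gets wider on either side."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def he_uniform(fan_in, fan_out, rng):
    """He uniform: range depends on the number of inputs only (often used with ReLU)."""
    limit = np.sqrt(6.0 / fan_in)
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

rng = np.random.default_rng(0)
W_glorot = glorot_uniform(100, 50, rng)
W_he = he_uniform(100, 50, rng)
print(np.abs(W_glorot).max(), np.abs(W_he).max())
```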

### 5. Tune the hyperparameters
- Check your learning rate, change it a little to see if it helps the network fit.
    - some optimizers have multiple parameters to set, but start with the learning rate.
- Try a different batch size, it may be too large or small.

If you have enough compute time available, you can try a grid search. This means systematically (and automatically) trying different hyperparameter combinations and noting down the effects on network performance. [See the link here for an example of how to implement one](https://blog.floydhub.com/guide-to-hyperparameters-search-for-deep-learning-models/).
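The core of a grid search is just a loop over all combinations. In the sketch below, `fake_train_and_score` is a placeholder that we made up so the example runs instantly. In practice you would replace it with a function that builds, trains and evaluates your model and returns, for example, the validation accuracy.

```python
from itertools import product

def grid_search(train_and_score, grid):
    """Try every combination in `grid`; return (score, params) pairs, best first."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((train_and_score(**params), params))
    results.sort(key=lambda r: r[0], reverse=True)
    return results

def fake_train_and_score(learning_rate, batch_size):
    # Placeholder scoring function (NOT a real training run):
    # pretends that learning_rate=1e-3 and batch_size=32 work best.
    return -abs(learning_rate - 1e-3) - abs(batch_size - 32) / 1000

grid = {"learning_rate": [1e-2, 1e-3, 1e-4], "batch_size": [16, 32, 64]}
results = grid_search(fake_train_and_score, grid)
best_score, best_params = results[0]
print(len(results), "combinations tried, best:", best_params)
```

Keep in mind that the number of combinations grows multiplicatively with every hyperparameter you add, and each one is a full training run.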

### 6. Try a different optimizer
The optimizer you're using may not be able to converge to a minimum. Try a different optimizer to see if it has an effect.

### 7. Give it more time
Sometimes you need to let the network run for multiple epochs before it will start learning. Giving it a few epochs more time to start fitting might help.

### 8. Is the problem solvable at all?
Try [one of the standard networks baked into Keras](https://keras.io/applications/). Do these networks reach acceptable performance levels? If not, consider that the problem may not be easily solvable in its current form.

----------

**No exercise!**

This section has no exercise. We hope that whenever you run into issues with your models, you will return here for troubleshooting.

## 4.3 - Do you need more data?

![more data](http://www.paulvangent.com/files/DL_Course/day2_images/moredata.jpg)

If performance on the training set is poor, the model is not using what is in the set properly. Throwing more data at it will not change this. Make sure there are no quality problems with the data as discussed under **4.1**.

However, if the model performs well on the training set but not on the test set, gathering more data is usually very effective. The model can learn, but doesn't have enough examples to learn to generalize. You can also try augmentation on the data you have to see whether that improves accuracy.
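The reasoning in the two paragraphs above can be written down as a rough rule of thumb. The function below is only a sketch: its name, the thresholds `good_enough` and `max_gap`, and the example numbers are arbitrary assumptions that you should adapt to your own task.

```python
def diagnose(train_acc, val_acc, good_enough=0.9, max_gap=0.05):
    """Very rough heuristic for the 'do I need more data?' question."""
    if train_acc < good_enough:
        # The model does not even fit the data it has seen
        return "fix data or training first"
    if train_acc - val_acc > max_gap:
        # Fits the training set but does not generalize
        return "more data or augmentation may help"
    return "no obvious generalization gap"

# Made-up accuracy values, for illustration only
print(diagnose(train_acc=0.60, val_acc=0.58))
print(diagnose(train_acc=0.97, val_acc=0.70))
print(diagnose(train_acc=0.95, val_acc=0.93))
```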

**Additional remark: consider the quality of your data set**
Look critically at what you have. Maybe you can clean the dataset by reducing noise in your samples or correcting incorrect labels. See what is discussed under **4.1**.

------

**Exercise:**

Consider the training log output below.

|epoch | acc | loss | val_acc | val_loss|
|------|-----|------|---------|---------|
|0|0.6668|0.6502|0.5|0.7121
|1|0.6692|0.6391|0.5|0.7405
|2|0.6692|0.6307|0.5|0.7287
|3|0.666|0.6277|0.5|0.7646
|4|0.6728|0.6171|0.56|0.7096
|5|0.682|0.5963|0.51|0.7655
|6|0.72|0.5425|0.555|0.7345
|7|0.7844|0.4577|0.605|0.7682
|8|0.8452|0.3566|0.605|0.9756
|9|0.906|0.2335|0.6|1.4574
|10|0.9344|0.1659|0.625|1.2655
|11|0.9604|0.1031|0.59|1.4402
|12|0.9692|0.0851|0.585|1.6788
|13|0.9812|0.0559|0.565|2.0274
|14|0.9784|0.0599|0.595|2.0819
|15|0.9956|0.0173|0.63|2.7099
|16|0.9892|0.0322|0.575|2.4551
|17|0.9924|0.0234|0.59|2.8146
|18|0.9892|0.0267|0.61|2.8822
|19|0.9836|0.0441|0.615|2.1279

**Would more data help? Why (not)?**

## 4.4 - Problems with predicting

![img](http://www.paulvangent.com/files/DL_Course/day2_images/mistakes.jpg)

What if your model worked well in training, performance on both the validation and test set was great as well, but when predicting on other data performance is disappointing?

The most common cause of this is that the data you're feeding during prediction differs significantly from what you trained the network on. If, for example, you trained your network to detect cats in urban environments, don't be surprised when it doesn't work well when detecting cats in forest environments! Deep learning networks are really great at learning exactly what you tell them.

This happens more often than you think as you cannot always predict exactly what 'real life data' will look like. [Take a look at this paper about generalization issues of deep learning in hospital settings for example](https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1002683).
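A first, crude check for this kind of mismatch is to compare simple statistics of the training data with those of the new data. The sketch below uses made-up helper names and synthetic data, and assumes images with the channel on the last axis. It reports how far the per-channel mean of the new data has moved, measured in training standard deviations. Similar statistics do not prove the data is comparable, but a large shift is a clear warning sign.

```python
import numpy as np

def channel_stats(X):
    """Mean and std per channel, assuming the channel is the last axis."""
    X = np.asarray(X, dtype=np.float64)
    axes = tuple(range(X.ndim - 1))
    return X.mean(axis=axes), X.std(axis=axes)

def mean_shift(X_train, X_new, eps=1e-8):
    """Shift of each channel mean, in units of the training std."""
    mean_train, std_train = channel_stats(X_train)
    mean_new, _ = channel_stats(X_new)
    return np.abs(mean_new - mean_train) / (std_train + eps)

# Synthetic stand-in data: 50 RGB "images" of 8x8 pixels
rng = np.random.default_rng(0)
X_train_demo = rng.normal(0.5, 0.1, size=(50, 8, 8, 3))
# "Real life" data in which the third channel is noticeably brighter
X_new_demo = X_train_demo + np.array([0.0, 0.0, 0.3])

shift = mean_shift(X_train_demo, X_new_demo)
print(shift)
```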

## Further reading

[This blog is a more in-depth version of this notebook](https://blog.slavv.com/37-reasons-why-your-neural-network-is-not-working-4020854bd607)