# Course 3. Structuring machine learning projects

## ML strategy 1

### Why ML strategy?

List of ideas that you can try for improving nn performance:

1. collect more data
2. create more diverse training set
3. train longer with GD
4. try Adam instead of GD
5. try bigger network
6. try smaller network
7. try dropout
8. add L2 regularization
9. try different network architecture

What can easily happen is that you spend 6 months collecting more data and eventually realize that it didn't help. This course is about figuring out the right direction and choosing your next steps wisely.

### Orthogonalization

Chain of assumptions in ML:
`fit train set well` $\rightarrow$ `fit dev set well` $\rightarrow$ `fit test set well` $\rightarrow$ `working well in real world` 

### Single Number Evaluation Metric

Using a single evaluation metric speeds up the iterative process of experimenting.

<img src="imgs/f1.png">

The two metrics we use most often are precision and recall. In the example in the image, looking only at precision and recall, it's hard to tell which classifier is better (A or B). Introducing the F1 score helps with that.

__Precision__ - Of the examples our classifier recognized as cats, what percentage actually are cats?

__Recall__ - Of all the images that really are cats, what percentage is classified correctly?

__F1 Score__ - We can think of it as an average of `P` and `R` (more precisely, their harmonic mean): a way to combine `P` and `R` into a single number evaluation metric.

\begin{align}
F1 = \frac{2}{\frac{1}{P} + \frac{1}{R}}
\end{align}
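As a quick check of the formula, here is a minimal Python sketch. The precision and recall values for classifiers A and B are hypothetical placeholders, not numbers read from the figure.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    # Algebraically the same as 2 / (1/P + 1/R), but safe when P or R is 0.
    return 2 * precision * recall / (precision + recall)


# Hypothetical (precision, recall) pairs for two classifiers.
classifiers = {"A": (0.95, 0.90), "B": (0.98, 0.85)}

scores = {name: f1_score(p, r) for name, (p, r) in classifiers.items()}
best = max(scores, key=scores.get)  # classifier with the highest F1
```

With one number per classifier, picking the better one becomes a simple `max` instead of a judgment call between two metrics.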


### Satisficing and Optimizing Metrics


### Train / dev / test sets

Guidelines:

1. Dev and test sets should always come from the same distribution
2. `old way split` 70:30 or 60:20:20 (fine for small datasets) vs `new way split` 98:1:1 (when you have a very large dataset, e.g. around a million examples)
3. Set your test set to be big enough to give high confidence in the overall performance of the system
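The guidelines above can be sketched as a single shuffle followed by a cut, so that dev and test are drawn from the same pool. The function name `split_dataset` and its parameters are illustrative assumptions, not something from the course.

```python
import random


def split_dataset(examples, dev_frac=0.01, test_frac=0.01, seed=0):
    """Shuffle once, then cut into train/dev/test (98:1:1 by default)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_dev = round(len(shuffled) * dev_frac)
    n_test = round(len(shuffled) * test_frac)
    dev = shuffled[:n_dev]
    test = shuffled[n_dev:n_dev + n_test]
    train = shuffled[n_dev + n_test:]
    return train, dev, test


# Integers stand in for real examples here.
train, dev, test = split_dataset(range(100_000))
```

Because dev and test are slices of the same shuffled list, they share one distribution by construction.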

### When to Change Dev/Test Sets

### Why Human Level Performance

__Bayes optimal error__ is the best possible theoretical error.

Understanding human-level performance for a certain task can be very helpful when setting a target for your model.

### Avoidable Bias



Avoidable bias is the difference between training error and human-level error (where human-level error serves as a proxy for Bayes error).
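A small sketch of how this number can guide what to work on next. Treating the gap between dev error and training error as "variance" is an assumption added here for comparison, and the function name `diagnose` and the error values are hypothetical.

```python
def diagnose(human_error: float, train_error: float, dev_error: float) -> dict:
    """Compare avoidable bias with variance to suggest where to focus."""
    avoidable_bias = train_error - human_error  # gap to human-level error
    variance = dev_error - train_error          # assumed definition: train-to-dev gap
    focus = "bias" if avoidable_bias > variance else "variance"
    return {"avoidable_bias": avoidable_bias, "variance": variance, "focus": focus}


# Same training and dev error, but a different human-level baseline
# changes which problem looks more worth fixing.
case_1 = diagnose(human_error=0.01, train_error=0.08, dev_error=0.10)
case_2 = diagnose(human_error=0.075, train_error=0.08, dev_error=0.10)
```

In the first case most of the gap sits between human-level and training error, so bias reduction looks more promising. In the second case the train-to-dev gap dominates.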

### Understanding Human-Level Performance

How to define human level performance?

<img src="imgs/humanlevelerror.png">

### Surpassing Human-Level Performance

### Improving Performance

<img src="imgs/improvingperformance.png">

## ML strategy 2

### Carrying Out Error Analysis

The advice is the following:

Manually go through the misclassified examples and try to spot categories (for cat image classification: dogs, great cats, blurry images, etc.). Then prioritize fixing the classifier for certain categories based on how many images fall into each of them.
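The counting step could look roughly like this. The tags are hypothetical hand-made annotations for ten misclassified dev images, and one image may carry several tags or none.

```python
from collections import Counter

# Hypothetical manual tags for misclassified dev-set images.
misclassified = [
    ("dog",),
    ("blurry",),
    ("great cat",),
    ("blurry", "great cat"),
    ("dog",),
    ("blurry",),
    ("blurry",),
    (),  # no obvious category
    ("great cat",),
    ("blurry",),
]


def category_shares(tagged_errors):
    """Return (tag, share of all errors) pairs, most frequent tag first."""
    counts = Counter(tag for tags in tagged_errors for tag in tags)
    total = len(tagged_errors)
    return [(tag, n / total) for tag, n in counts.most_common()]


shares = category_shares(misclassified)
```

Each share is roughly the most that fixing that category alone could remove from the current error, which is what makes it useful for prioritizing.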

### Cleaning Up Incorrectly Labelled Data

### Build First System Quickly, Then Iterate

1. Quickly set up dev / test set and metric (place the target)
2. Build your initial system
3. Use bias/variance analysis as well as error analysis to prioritize next steps

### Training and Testing on Different Distributions

<img src="imgs/distribution.png">
<img src="imgs/speechrecognitionex.png">


### Bias and Variance with mismatched data

This video nicely explains how to debug your model. Watch it over and over: https://youtu.be/2BH49JG_sTs

<img src="imgs/biasandvariance.png">

### Addressing Data Mismatch

`understand difference between training and dev / test set` $\rightarrow$ `make training data more similar`

For example, if you're working with audio data, maybe the dev set contains samples that are a bit noisier than those in the training set. In this case it could be helpful to generate more data by taking samples from the training set and artificially adding noise to them. This process is called __artificial data synthesis__.
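A minimal sketch of the audio example, assuming a clip is a list of float samples in `[-1, 1]` and that additive Gaussian noise is a reasonable stand-in for the noise in the dev set. The function name `add_noise`, the `noise_std` value and the 440 Hz test tone are all illustrative assumptions.

```python
import math
import random


def add_noise(samples, noise_std=0.05, seed=None):
    """Return a noisy copy of a clip given as floats in [-1, 1]."""
    rng = random.Random(seed)
    noisy = [s + rng.gauss(0.0, noise_std) for s in samples]
    # Clip so the synthesized clip stays in the valid amplitude range.
    return [max(-1.0, min(1.0, s)) for s in noisy]


# One second of a 440 Hz tone at 16 kHz stands in for a clean training clip.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_noise(clean, noise_std=0.05, seed=0)
```

In a real project the noise would ideally be recorded or modeled after what the dev set actually contains; synthesizing from only one narrow kind of noise risks the model overfitting to it.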

### Transfer learning

The idea of transfer learning is that we can take the knowledge a model learned on one task and apply it to a separate task. For example, you can take a neural net that learned to classify cats and reuse it for the task of classifying radiology images (the hidden layers of a nn can learn various features, like edge detection, which can be extremely valuable for the other task).

<img src="imgs/transferlearning.png">

Transfer learning can also be very helpful when you don't have enough data.

This can be applied to other types of data, not just images. However, the tasks have to be related ... When does it make sense to apply transfer learning?

* Task A and task B have the same input
* You have a lot more data for task A than for task B
* Low level features from task A could be helpful for task B
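To make the mechanics concrete, here is a toy, pure-Python sketch: a fixed hidden layer stands in for layers pretrained on task A, and only a fresh logistic output layer is trained on a small task B dataset. Everything here (the weights in `hidden`, the four `task_b` examples, the function names) is a hypothetical illustration; a real project would do this with a deep learning framework and an actual pretrained network.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def features(x, hidden_weights):
    """Frozen hidden layer, ReLU(W x): stands in for layers pretrained on task A."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in hidden_weights]


def train_new_head(data, hidden_weights, epochs=200, lr=0.1):
    """Train only a new logistic output layer; hidden_weights are never updated."""
    head_w, head_b = [0.0] * len(hidden_weights), 0.0
    for _ in range(epochs):
        for x, y in data:
            h = features(x, hidden_weights)
            p = sigmoid(sum(w * hi for w, hi in zip(head_w, h)) + head_b)
            grad = p - y  # cross-entropy gradient w.r.t. the pre-sigmoid output
            head_w = [w - lr * grad * hi for w, hi in zip(head_w, h)]
            head_b -= lr * grad
    return head_w, head_b


# Hypothetical "pretrained" weights and a tiny task B dataset (label 1 when x[0] > 0).
hidden = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
task_b = [([1.0, 0.5], 1), ([0.8, -0.3], 1), ([-1.0, 0.2], 0), ([-0.7, -0.6], 0)]
frozen_copy = [row[:] for row in hidden]

head_w, head_b = train_new_head(task_b, hidden)


def predict(x):
    h = features(x, hidden)
    return sigmoid(sum(w * hi for w, hi in zip(head_w, h)) + head_b)
```

Training only the new output layer is the usual choice when task B has little data; with more data, the earlier layers could be fine-tuned as well.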

Papers about transfer learning: https://en.wikipedia.org/wiki/Transfer_learning#CITEREFThrunPratt2012 (wow ... idea was born in the early 90s)

### Multi-task learning

In transfer learning we first train the model on one task and then transfer that knowledge to another task. In multitask learning we solve several tasks in parallel. Here is what this means ...

For example, if you have a self-driving car system and you want to detect pedestrians, traffic lights, stop signs and cars in the images, you can create a neural net whose output for each example has dimension $4 \times 1$ as opposed to $1 \times 1$.

<img src="imgs/multitask.png">
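One common way to train such a $4 \times 1$ output is to treat each entry as its own yes/no question and sum the four logistic losses. The notes do not spell out the loss, so this choice is an assumption, and the label order and prediction values below are hypothetical.

```python
import math


def multitask_loss(y_true, y_pred, eps=1e-12):
    """Sum of per-label binary cross-entropies for one example.

    Each label is independent, so one image can have several positives
    (unlike softmax, where exactly one class is picked).
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total


labels = ["pedestrian", "car", "stop sign", "traffic light"]
y_true = [0, 1, 1, 0]          # this image contains a car and a stop sign
y_pred = [0.1, 0.8, 0.7, 0.2]  # hypothetical sigmoid outputs of the network
loss = multitask_loss(y_true, y_pred)
```

Because all four outputs share the earlier layers of the network, the gradient of this summed loss updates the shared features using signal from every task at once.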

Multitask learning makes sense when:

* training on a set of tasks that could benefit from sharing low level features
* usually the amount of data that you have for each task is similar, and by doing multitask learning you benefit from having more data than when doing these tasks separately
* you can train a big enough nn to do well on all tasks

In practice, transfer learning is used much more often on average ... but in computer vision, multitask learning is used more often.

### What is end-to-end deep learning?

### Whether to Use End-To-End Deep Learning
