# Good practices of NN/DL project design
## What to do and, perhaps more importantly, what not to do

# Is my project right for Neural Networks?

* The thought process should not be: “I have some data, why don’t we try neural networks”
* But it should be: “Given the problem, does it make sense to use neural networks?”

    * Do I really need non-linear modelling?
    * What literature is out there for similar problems?
    * How much data will I be able to gather or put my hands on?
    * Are there datasets out there that I can re-use before I collect my data?



## Do I really need non-linear modelling?

* Sometimes linear methods perform just as well, if not better
* They carry less risk of catastrophic overfitting
* They are faster to code, optimize, run, and debug
* Use linear modelling as a baseline before you move to non-linear methods
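
A minimal sketch of what such a baseline could look like, assuming scikit-learn is available. The dataset here is synthetic (`make_classification`) and only stands in for real data; the sizes and the choice of logistic regression are illustrative assumptions, not a recommendation from the slides.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy problem: 200 samples, 50 features, 2 classes.
X, y = make_classification(
    n_samples=200,
    n_features=50,
    n_informative=10,
    n_clusters_per_class=1,
    random_state=0,
)

# Scaling lives inside the pipeline, so each cross-validation fold
# is scaled using statistics from its own training part only.
baseline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validated accuracy of the linear baseline.
baseline_scores = cross_val_score(baseline, X, y, cv=5)
print(f"Linear baseline accuracy: {baseline_scores.mean():.3f} "
      f"+/- {baseline_scores.std():.3f}")
```

Any non-linear model tried afterwards should be evaluated on the same folds and has to beat this number to justify its extra complexity.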

## Real-life example

Drop-in question: "I tried deep learning on my data and it didn't perform better than this other simpler method"

* Classifying gene expression samples
* O(1000) features
* O(1000) samples
* 2 classes
* NN looked like this:

```python
from keras.layers import Dense
from keras.models import Sequential

model = Sequential()
model.add(Dense(1000, input_dim=5000))
model.add(Dense(500))
model.add(Dense(2, activation="softmax"))

model.summary()
```

```
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_12 (Dense)             (None, 1000)              5001000   
_________________________________________________________________
dense_13 (Dense)             (None, 500)               500500    
_________________________________________________________________
dense_14 (Dense)             (None, 2)                 1002      
=================================================================
Total params: 5,502,502
Trainable params: 5,502,502
Non-trainable params: 0
_________________________________________________________________
```


## Parameters (weights) vs. samples

* If the number of parameters is many times higher than the number of samples, a NN will almost certainly overfit instead of generalizing
* Ideally, we are looking for the inverse: far more samples than parameters
* Some rules of thumb out there:
    * 10x as many labelled samples as there are weights
    * A few thousand samples per class
    * Just try it and downscale/regularize until you're not overfitting anymore (or until you have a linear model)
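
The parameter count reported by `model.summary()` for the gene expression network can be reproduced by hand, which makes the mismatch with the sample count explicit. The sketch below is plain Python; the helper name `dense_param_count` is made up for illustration, and `n_samples = 1000` is the order of magnitude quoted in the example, not an exact figure.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a stack of fully connected layers.

    layer_sizes lists the input width followed by each layer's output width.
    """
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output unit
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Input width and layer widths of the network from the real-life example.
layer_sizes = [5000, 1000, 500, 2]
n_params = dense_param_count(layer_sizes)

# Assumption: roughly 1000 samples, as quoted for the example.
n_samples = 1000

print(f"parameters: {n_params:,}")                             # parameters: 5,502,502
print(f"parameters per sample: {n_params / n_samples:,.0f}")   # parameters per sample: 5,503
```

With thousands of parameters per sample, the network is on the wrong side of the "10x as many samples as weights" rule of thumb by several orders of magnitude.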

## And even if I have enough data for a NN...

... is Deep Learning the right choice?

* The tasks where Deep Learning shines are those that require feature extraction:
    * Imaging -> edge/object detection
    * Audio/text -> sound/word/sentence detection
    * Protein structure prediction -> mutation patterns/local structure/global structure

* Deep Learning makes feature extraction automatic and seems to work best when there is a hierarchy to these features
* Is your data made that way? 
    * Does it have an order (spatial/temporal)? 
    * Are smaller patterns going to form higher-order patterns?
* All these different types of layers need to be there for a reason


## And even when both these conditions have been met

... you need a few more things:

* Domain knowledge is not enough
* Sometimes people with NN/DL knowledge and no domain knowledge end up being the right ones for the job (see AlphaFold)
* You also need lots of patience and time: these things rarely work out of the box

## A few more things to keep in mind

* You need extensive knowledge of your data:
    * Split the data in a rigorous way to avoid introducing biases
    * Check for _information leakage_, which leads to overly optimistic results
    * Make sure that there are no errors in your data

And therein lies the main issue:
* Some think that DL is about having a model magically fix your data
* Instead, DL is _mostly_ about knowing your data
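
One common way a split introduces bias is when related samples (same patient, same protein family, same document) end up on both sides of it. The sketch below shows one possible group-aware split in plain Python, followed by a check that no group is shared. The function names, the `group_ids` list and the 20% test fraction are all hypothetical choices for illustration.

```python
import random


def group_split(group_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no group appears in both train and test."""
    groups = sorted(set(group_ids))
    random.Random(seed).shuffle(groups)
    # Hold out whole groups, at least one.
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train_idx = [i for i, g in enumerate(group_ids) if g not in test_groups]
    test_idx = [i for i, g in enumerate(group_ids) if g in test_groups]
    return train_idx, test_idx


def shared_groups(group_ids, train_idx, test_idx):
    """Return the groups present in both splits (should be empty)."""
    train_groups = {group_ids[i] for i in train_idx}
    test_groups = {group_ids[i] for i in test_idx}
    return train_groups & test_groups


# Hypothetical data: 10 samples taken from 5 patients, 2 samples each.
group_ids = ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4", "p5", "p5"]

train_idx, test_idx = group_split(group_ids)
leaked = shared_groups(group_ids, train_idx, test_idx)

print("train indices:", train_idx)
print("test indices:", test_idx)
print("groups in both splits:", leaked)
```

The same `shared_groups` check can be run on any existing split, including one produced by a plain random shuffle, to see whether related samples leaked across it. scikit-learn offers `GroupShuffleSplit` and `GroupKFold` for the same purpose.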

## Neural Nets are very good at detecting patterns and they will use this against you

## Target leakage

## Lab 1: looking for target leakage in a text dataset (~1 h.)

## Know your train/validation/test sets

## Lab 2: how to rigorously split a protein dataset (~1 h.)

## Having the right data for NN/DL, but not enough of it: now what?

## Lab 3: transfer learning in imaging data (~1 h.)

## Tips and tricks on training your Neural Networks