# Keras

This is a set of notes on `keras`, based on Prof. Andrew Ng's Deeplearning.ai course and Francois Chollet's book *Deep Learning with Python*.

**Inputs** in `keras` are also represented as layers: `keras.layers.Input()`.

The **first** layer needs an `input_shape=` parameter. The shape given here should be the input data shape **without** the batch dimension.
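As a minimal sketch of the point about the batch dimension (the layer sizes and variable names are made up for illustration, and a reasonably recent Keras is assumed), the model below declares 8-feature inputs without mentioning the batch size. Keras reports the batch dimension as `None` and accepts any batch size at call time.

```python
import numpy as np
import keras
from keras import layers

# Each sample is a vector of 8 features; the batch dimension is left out.
model = keras.Sequential([
    layers.Input(shape=(8,)),
    layers.Dense(4, activation="relu"),
    layers.Dense(1),
])

# Keras adds the batch dimension itself and reports it as None.
output_shape = tuple(model.output_shape)  # (None, 1)

# Any batch size works at call time; here a batch of 3 samples.
batch = np.zeros((3, 8), dtype="float32")
predictions = model.predict(batch, verbose=0)  # shape (3, 1)
```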

**Activations** can also be standalone layers: `keras.layers.Activation()`. Some layers, such as `Dense()`, take an optional `activation=` parameter that builds the activation into the layer.
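The two styles should be interchangeable. The sketch below, with hypothetical toy inputs and weights fixed to ones so the result is easy to check by hand, compares a `Dense` layer with a built-in ReLU against a linear `Dense` layer followed by a standalone `Activation` layer.

```python
import numpy as np
import keras
from keras import layers

x = np.array([[1.0, -3.0], [2.0, 1.0]], dtype="float32")

# Option 1: activation built into the Dense layer.
fused = keras.Sequential([
    layers.Input(shape=(2,)),
    layers.Dense(1, activation="relu", kernel_initializer="ones", use_bias=False),
])

# Option 2: linear Dense layer followed by a standalone Activation layer.
split = keras.Sequential([
    layers.Input(shape=(2,)),
    layers.Dense(1, kernel_initializer="ones", use_bias=False),
    layers.Activation("relu"),
])

# With all-ones weights each output is relu(sum of the inputs):
# relu(1 - 3) = 0 and relu(2 + 1) = 3.
out_fused = fused.predict(x, verbose=0)
out_split = split.predict(x, verbose=0)
```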

Two ways to handle labels in multi-class classification:
1. One-hot labels: use the `categorical_crossentropy` loss.
2. Integer labels: use the `sparse_categorical_crossentropy` loss.
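The two options differ only in label format. As a rough check, assuming a toy 3-class problem with random inputs (all names and sizes here are illustrative), the same untrained model should report the same loss under either pairing:

```python
import numpy as np
import keras
from keras import layers

num_classes = 3
x = np.random.RandomState(0).rand(6, 4).astype("float32")

# The same labels in both formats.
int_labels = np.array([0, 2, 1, 1, 0, 2])
onehot_labels = np.eye(num_classes, dtype="float32")[int_labels]

model = keras.Sequential([
    layers.Input(shape=(4,)),
    layers.Dense(num_classes, activation="softmax"),
])

# One-hot labels pair with categorical_crossentropy.
model.compile(optimizer="sgd", loss="categorical_crossentropy")
loss_onehot = float(model.evaluate(x, onehot_labels, verbose=0))

# Integer labels pair with sparse_categorical_crossentropy.
model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy")
loss_sparse = float(model.evaluate(x, int_labels, verbose=0))
```

Integer labels avoid materializing a large one-hot matrix when there are many classes.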

**Avoid** shrinking layer dimensions below the input dimension too quickly; an early bottleneck loses information that later layers in the chain cannot recover.

When calling `model.compile()`, you can specify a loss function with the `loss=` parameter, as well as a list of metrics to monitor with the `metrics=` parameter.

Validation: use `model.evaluate()` to compute the loss and the compiled metrics on held-out data.
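A minimal sketch of compiling with a loss and a metric, then evaluating on held-out data. The random validation set and the tiny model are placeholders, not a real workflow.

```python
import numpy as np
import keras
from keras import layers

rng = np.random.RandomState(0)
x_val = rng.rand(32, 5).astype("float32")
# Hypothetical binary target: 1 when the features sum to more than 2.5.
y_val = (x_val.sum(axis=1, keepdims=True) > 2.5).astype("float32")

model = keras.Sequential([
    layers.Input(shape=(5,)),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="rmsprop", loss="binary_crossentropy", metrics=["accuracy"])

# evaluate() returns the loss first, then the metrics in the order given to compile().
loss, accuracy = model.evaluate(x_val, y_val, verbose=0)
```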

## RNN 

`keras` RNN layers take input in the shape of `(batch_size, timesteps, input_features)`.

Recurrent layers all have **dropout**-related parameters:
* `dropout=`: float, dropout rate for the layer inputs
* `recurrent_dropout=`: float, dropout rate for the recurrent state

Yarin Gal (2015, as part of his PhD thesis work): recurrent layer dropout should use the **same** dropout mask at every timestep, rather than a mask that varies randomly from timestep to timestep.
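To make the "same mask at every timestep" idea concrete, here is a small NumPy sketch. It is an illustration only, not how Keras implements `dropout=` or `recurrent_dropout=` internally; the array sizes and names are assumptions.

```python
import numpy as np

rng = np.random.RandomState(0)
batch_size, timesteps, features = 2, 4, 3
rate = 0.5

x = np.ones((batch_size, timesteps, features), dtype="float32")

# One mask per sample, shared by all timesteps: the timestep axis has
# size 1, so broadcasting reuses the same mask at every step.
shared_mask = (rng.rand(batch_size, 1, features) >= rate).astype("float32")
x_shared = x * shared_mask / (1.0 - rate)

# For contrast: an independent mask at every timestep, which is the
# pattern the note above advises against for recurrent layers.
per_step_mask = (rng.rand(batch_size, timesteps, features) >= rate).astype("float32")
x_per_step = x * per_step_mask / (1.0 - rate)

# With the shared mask, the dropped features are identical across timesteps.
same_every_step = all(
    np.array_equal(x_shared[:, t], x_shared[:, 0]) for t in range(timesteps)
)
```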

`keras.layers.LSTM()` has a boolean parameter `return_sequences=`: when `True`, the layer returns the full sequence of outputs (one per timestep); when `False` (the default), it returns only the last output.
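The effect is easiest to see in the output shapes. The sketch below uses made-up sizes (2 samples, 5 timesteps, 3 input features, 4 LSTM units) and also shows the `(batch_size, timesteps, input_features)` input layout:

```python
import numpy as np
import keras
from keras import layers

batch_size, timesteps, input_features = 2, 5, 3
x = np.zeros((batch_size, timesteps, input_features), dtype="float32")

# Default (return_sequences=False): only the last output is returned.
last_only = keras.Sequential([
    layers.Input(shape=(timesteps, input_features)),
    layers.LSTM(4),
])

# return_sequences=True: one output per timestep.
full_sequence = keras.Sequential([
    layers.Input(shape=(timesteps, input_features)),
    layers.LSTM(4, return_sequences=True),
])

last_shape = last_only.predict(x, verbose=0).shape     # (2, 4)
seq_shape = full_sequence.predict(x, verbose=0).shape  # (2, 5, 4)
```

When stacking recurrent layers, every layer except the last needs `return_sequences=True` so that the next layer still receives a 3D input.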

Parameter `implementation=` (either 1 or 2) controls how the computations are done. Mode 1 structures them as a larger number of smaller dot products and additions, while mode 2 batches them into fewer, larger operations. See the code [here](https://github.com/keras-team/keras/blob/d9f26a92f4fdc1f1e170a4203a7c13abf3af86e8/keras/layers/recurrent.py#L1821).

Use `keras.layers.Bidirectional()` to wrap a recurrent layer into a bidirectional RNN.
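A short sketch of the wrapper, with illustrative sizes and dropout rates. With the default merge mode the forward and backward outputs are concatenated, so a 4-unit LSTM yields 8 output features.

```python
import numpy as np
import keras
from keras import layers

model = keras.Sequential([
    layers.Input(shape=(10, 3)),  # 10 timesteps, 3 features per step
    layers.Bidirectional(
        layers.LSTM(4, dropout=0.2, recurrent_dropout=0.2)
    ),
])

# Forward and backward outputs are concatenated: 4 + 4 = 8 features.
output_shape = tuple(model.output_shape)  # (None, 8)

# Dropout is only active during training, so predict() is deterministic.
x = np.zeros((2, 10, 3), dtype="float32")
features = model.predict(x, verbose=0)  # shape (2, 8)
```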

### Load Model Weights

Once you build a model, you can use `model.load_weights()` to load previously saved weights.
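A minimal round-trip sketch, assuming the weights were written with `model.save_weights()` and that the new model has the same architecture. The file name and the `build_model` helper are hypothetical; the `.weights.h5` suffix is used because newer Keras versions require it.

```python
import os
import tempfile

import numpy as np
import keras
from keras import layers


def build_model():
    # Hypothetical helper: the architecture must match the saved weights.
    return keras.Sequential([
        layers.Input(shape=(4,)),
        layers.Dense(3, activation="relu"),
        layers.Dense(1),
    ])


x = np.ones((2, 4), dtype="float32")
source_model = build_model()
fresh_model = build_model()  # starts with different random weights

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.weights.h5")
    source_model.save_weights(path)
    fresh_model.load_weights(path)

# After loading, both models should produce the same predictions.
same_predictions = np.allclose(
    source_model.predict(x, verbose=0),
    fresh_model.predict(x, verbose=0),
)
```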

### Layer Weights

To **freeze** layer weights, set `trainable=False` when instantiating the layer. 
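As a quick check that freezing does what is expected, the sketch below (toy layer sizes, illustrative names) counts trainable and non-trainable weights after freezing one of two `Dense` layers:

```python
import keras
from keras import layers

frozen = layers.Dense(4, trainable=False)  # kernel and bias will not be updated
head = layers.Dense(1)                     # stays trainable

model = keras.Sequential([
    layers.Input(shape=(3,)),
    frozen,
    head,
])

# Only the head's kernel and bias remain trainable.
n_trainable = len(model.trainable_weights)    # 2
n_frozen = len(model.non_trainable_weights)   # 2
```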

Use `set_weights()` to set layer weights to pre-trained values. Example below, thanks to Andrew Ng's Deeplearning.ai Coursera course:


```python
from keras.layers import Embedding

# vocab_len, emb_dim and emb_matrix (shape (vocab_len, emb_dim)) are assumed
# to be defined earlier, e.g. loaded from pre-trained word vectors.
embedding_layer = Embedding(input_dim=vocab_len, output_dim=emb_dim, trainable=False)
# or set embedding_layer.trainable = False

# Build the embedding layer; this is required before setting its weights.
# Do not modify the "None".
embedding_layer.build((None,))

# Set the weights of the embedding layer to the embedding matrix.
# The layer is now pre-trained.
embedding_layer.set_weights([emb_matrix])
```

## Training

### Regularization

Use the regularizers in `keras.regularizers`. Pass an instance to a layer with the `kernel_regularizer=` parameter.
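For instance, a sketch with an L2 penalty on one layer's kernel (the factor `0.001` and the layer sizes are arbitrary choices for illustration). The penalty shows up as an extra term in `model.losses`, which Keras adds to the training loss.

```python
import keras
from keras import layers, regularizers

model = keras.Sequential([
    layers.Input(shape=(4,)),
    layers.Dense(
        8,
        activation="relu",
        kernel_regularizer=regularizers.l2(0.001),  # L2 weight penalty
    ),
    layers.Dense(1),  # no regularizer on this layer
])

# One regularized kernel, so one penalty term is tracked by the model.
n_penalties = len(model.losses)  # 1
```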

### Metrics

* **Balanced classification**: ROC AUC
* **Imbalanced classification**: precision and recall, F1 score
* **Ranking/Multi-label classification**: mean average precision
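To see why accuracy alone can mislead on imbalanced data, here is a plain-Python sketch of precision, recall and F1 on a made-up 10-sample example with only two positives. The function name and the data are illustrative.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# 8 negatives and 2 positives: an imbalanced toy problem.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)  # 0.8
precision, recall, f1 = precision_recall_f1(y_true, y_pred)          # 0.5, 0.5, 0.5
```

Predicting all negatives would also score 0.8 accuracy here, while its recall would be 0, which is why precision and recall are the more informative pair.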

**TODO**: data generators, `model.fit_generator()`

### Multiple Inputs


### Multiple Outputs / Loss functions

### Callbacks

### Tensorboard



## Preprocessing

Always **remove** redundancy in your data. 

**One Hot Encoding**: `keras.utils.to_categorical()` (older versions also expose it as `keras.utils.np_utils.to_categorical()`)
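A small usage sketch with made-up integer labels, assuming the function is importable from `keras.utils`:

```python
import numpy as np
from keras.utils import to_categorical

labels = np.array([0, 2, 1])

# Each integer label becomes a row with a single 1 at that index.
one_hot = to_categorical(labels, num_classes=3)
# [[1, 0, 0],
#  [0, 0, 1],
#  [0, 1, 0]]
```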

**Sequence Padding**: `keras.preprocessing.sequence.pad_sequences()`

## Modeling Tips

Be aware of **nonstationary** problems. Because such problems change over time, the right move is to either:
* keep retraining on recent data, or
* gather data at a timescale where the problem is stationary.

