<a href="https://colab.research.google.com/github/Black3rror/AI/blob/master/Keras_cheat_sheet.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Goal

This notebook collects everything useful I know about Keras, so you can copy and paste the parts you need.

# TODO
- Add preprocessing cheats

---
# Importing stuff

In [None]:
import numpy as np    # TensorFlow works closely with NumPy, so we will probably need it
import tensorflow as tf
from tensorflow import keras

# import the following as needed
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D, Activation
from tensorflow.keras.layers import Layer    # base class for the custom layers below

import matplotlib.pyplot as plt   # if you want to show images with pyplot

from tensorflow.keras.callbacks import TensorBoard
import datetime   # to organize TensorBoard files

In [None]:
from tensorflow.keras import backend as K    # backend helpers, mostly used for custom layers and losses

from scipy.io import loadmat    # to load .mat files

from tensorflow.keras.utils import to_categorical    # to convert integer labels to one-hot vectors

---
# Initialization

## Check if we run on GPU

To enable GPUs for the notebook:

- Navigate to Edit→Notebook Settings
- Select GPU from the Hardware Accelerator drop-down

In [None]:
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

Or

In [None]:
assert len(tf.config.list_physical_devices('GPU')) > 0

## Load the dataset

### Downloading from Keras datasets

In [None]:
(trainX, trainy), (testX, testy) = keras.datasets.cifar10.load_data()   # import well known datasets

### Downloading from internet

In [None]:
import urllib.request

trainURL = 'http://ufldl.stanford.edu/housenumbers/train_32x32.mat'
urllib.request.urlretrieve(trainURL, 'train_32x32.mat')   # saves the file in the working directory

## Preprocessing

---
# Build the model

## Dense layers

To learn more: [Keras Dense Layer](https://keras.io/api/layers/core_layers/dense/)

```python
tf.keras.layers.Dense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)
```
Glorot uniform is also called Xavier uniform
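
As a rough illustration of what that initializer does, here is a minimal NumPy sketch (not the Keras implementation). It assumes the usual Glorot/Xavier rule of sampling from a uniform distribution on `[-limit, limit]` with `limit = sqrt(6 / (fan_in + fan_out))`. The function name `glorot_uniform` and the layer sizes are only for illustration.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, seed=0):
    """Sample a (fan_in, fan_out) weight matrix from U(-limit, limit)."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))   # Glorot/Xavier uniform bound
    rng = np.random.default_rng(seed)
    weights = rng.uniform(-limit, limit, size=(fan_in, fan_out))
    return weights, limit

# e.g. a Dense(64) layer fed with 100 input features
weights, limit = glorot_uniform(fan_in=100, fan_out=64)
print(weights.shape, float(limit))
```

The bound shrinks as the layer gets wider, which keeps the scale of activations roughly stable from layer to layer.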

In [None]:
model = Sequential()
model.add(Dense(64, input_shape=(trainX.shape[1],)))    # assumes 2-D input, e.g. trainX of shape (1000, 100)
model.add(Activation('sigmoid'))
model.add(Dense(32, activation='relu'))
model.add(Dense(10))
model.add(Activation('softmax'))

Single-expression version

In [None]:
model = keras.Sequential([
  keras.layers.Dense(64, activation='relu', input_shape=(trainX.shape[1],)),
  keras.layers.Dense(64, activation='relu'),
  keras.layers.Dense(1)
])

## Conv layers

To learn more: [Keras Conv2D Layer](https://keras.io/api/layers/convolution_layers/convolution2d/)

```python
tf.keras.layers.Conv2D(
    filters,
    kernel_size,
    strides=(1, 1),
    padding="valid",
    data_format=None,
    dilation_rate=(1, 1),
    groups=1,
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)
```

In [None]:
model = Sequential()    # start a fresh model instead of appending to the dense one above
model.add(Conv2D(256, (3, 3), input_shape=trainX.shape[1:]))   # trainX has a shape like (1000, 32, 32, 3)
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

model.add(Conv2D(256, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

model.add(Flatten())  # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Activation('relu'))

model.add(Dense(10))
model.add(Activation('softmax'))
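
To see how many features reach `Flatten()` in a stack like the one above, the spatial size can be tracked by hand. This is a small pure-Python sketch assuming 32 x 32 inputs (as in CIFAR-10), the default `padding="valid"`, and the standard output-size rule `(size - window) // stride + 1`. The helper name `valid_output_size` is made up for this example.

```python
def valid_output_size(size, window, stride=1):
    """Output length along one spatial axis with 'valid' (no) padding."""
    return (size - window) // stride + 1

size = 32                                     # input images are 32 x 32
size = valid_output_size(size, 3)             # Conv2D(256, (3, 3))         -> 30
size = valid_output_size(size, 2, stride=2)   # MaxPooling2D((2, 2), (2, 2)) -> 15
size = valid_output_size(size, 3)             # Conv2D(256, (3, 3))         -> 13
size = valid_output_size(size, 2, stride=2)   # MaxPooling2D((2, 2), (2, 2)) -> 6

flat_features = size * size * 256             # length of the Flatten() output
print(size, flat_features)                    # 6 9216
```

`model.summary()` reports the same shapes, so this is mostly useful for planning an architecture before building it.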

## Custom layers

To learn more: [Keras Custom Layer](https://www.tensorflow.org/tutorials/customization/custom_layers#implementing_custom_layers), or for more detail: [Keras Subclassing Layers And Models](https://keras.io/guides/making_new_layers_and_models_via_subclassing/)\
To learn more about the `Layer` class in Keras: [Keras Layer Class](https://keras.io/api/layers/base_layer/)

In [None]:
class RBFLayer(Layer):
    def __init__(self, units, gamma, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.gamma = float(gamma)

    def build(self, input_shape):   # input_shape = (batch_size (None), input_dim)
        # one center per unit, stored column-wise: shape (input_dim, units)
        self.mu = self.add_weight(name='mu',
                                  shape=(int(input_shape[1]), self.units),
                                  initializer='uniform',
                                  trainable=True)

    def call(self, inputs):
        diff = tf.expand_dims(inputs, axis=-1) - self.mu    # (batch, input_dim, units)
        l2 = tf.reduce_sum(tf.square(diff), axis=1)         # squared distance to each center: (batch, units)
        return tf.exp(-self.gamma * l2)

    def compute_output_shape(self, input_shape):    # optional
        return (input_shape[0], self.units)

Or

In [None]:
class Dense_custom(Layer):
  def __init__(self, units, activation=None, **kwargs):
    super().__init__(**kwargs)
    self.units = units
    # resolve a name such as 'relu' to a function once; None gives the linear (identity) activation
    self.activation = keras.activations.get(activation)

  def build(self, input_shape):
    self.w = self.add_weight(name="weights", shape=(input_shape[-1], self.units),
                             initializer="glorot_uniform", trainable=True)
    self.b = self.add_weight(name="biases", shape=(self.units, ),
                             initializer="zeros", trainable=True)

  def call(self, inputs):
    z = tf.matmul(inputs, self.w) + self.b
    return self.activation(z)

## Custom model

To learn more: [Keras Model Class](https://keras.io/api/models/model/) or this one which has a good example: [Layers And Models Via Subclassing](https://www.tensorflow.org/guide/keras/custom_layers_and_models#the_model_class)

In [None]:
class Model_custom(tf.keras.Model):

  def __init__(self):
    super(Model_custom, self).__init__()
    self.dense1 = tf.keras.layers.Dense(4, activation='relu')
    self.dense2 = tf.keras.layers.Dense(5, activation='softmax')

  def call(self, inputs):
    x = self.dense1(inputs)
    return self.dense2(x)

## See the summary

In [None]:
model.summary()

---
# Compile and fit

## Compile

To learn about Keras compile, fit, evaluate, predict, ... methods: [Keras Model Training APIs](https://keras.io/api/models/model_training_apis/)

```python
Model.compile(
    optimizer="rmsprop",
    loss=None,
    metrics=None,
    loss_weights=None,
    weighted_metrics=None,
    run_eagerly=None,
    **kwargs
)
```

To learn more about Keras metrics: [Keras Metrics](https://keras.io/api/metrics/)

In [None]:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

Or

In [None]:
opt = keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss='categorical_crossentropy', optimizer=opt)

### Loss functions

To learn more about Keras loss functions: [Keras Loss Functions](https://keras.io/api/losses/)

#### Probabilistic losses

**Binary Cross Entropy** :\
Use this cross-entropy loss when there are only two label classes (assumed to be 0 and 1), that is, when the output is like: Yes or No, 0 or 1, Left or Right.\
Note that `binary cross entropy` is a special case of `categorical cross entropy`.

**Categorical Cross Entropy** :\
Use this cross-entropy loss when there are two or more label classes. Labels are expected in a one-hot representation, so use it when the target looks like: `[0, 1, 0, 0, 0, 0, 0]`
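
The "special case" remark can be checked numerically. The sketch below is plain NumPy, not the Keras implementation: it computes per-sample losses from the textbook formulas, without the clipping and averaging that Keras applies, and the sample values are made up.

```python
import numpy as np

def binary_crossentropy(y_true, p):
    """Per-sample binary cross-entropy; p is the predicted probability of class 1."""
    return -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def categorical_crossentropy(y_onehot, probs):
    """Per-sample categorical cross-entropy for one-hot labels."""
    return -np.sum(y_onehot * np.log(probs), axis=-1)

y_true = np.array([1.0, 0.0, 1.0])
p = np.array([0.9, 0.2, 0.6])

# Rewrite the same problem with two classes: column 0 = class 0, column 1 = class 1
y_onehot = np.stack([1 - y_true, y_true], axis=-1)
probs = np.stack([1 - p, p], axis=-1)

bce = binary_crossentropy(y_true, p)
cce = categorical_crossentropy(y_onehot, probs)
print(np.allclose(bce, cce))   # True
```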

**Sparse Categorical Cross Entropy** :\
Same as `categorical cross entropy`, but here the true labels are integer class indices rather than one-hot vectors. The predictions still contain one probability per class.

In [None]:
y_true = [1, 2]
y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]
loss = tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred)
loss.numpy()
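
For intuition, the same numbers can be reproduced by hand. This is a NumPy sketch of the idea, not the Keras code: the sparse loss just picks the predicted probability of the true class by index. The `eps` clipping value is an assumption, needed here only so that `log(0)` does not appear in the one-hot version.

```python
import numpy as np

y_true = np.array([1, 2])
y_pred = np.array([[0.05, 0.95, 0.0], [0.1, 0.8, 0.1]])

# Sparse version: index the probability of the true class directly
sparse_loss = -np.log(y_pred[np.arange(len(y_true)), y_true])

# Categorical version: convert the labels to one-hot first
y_onehot = np.eye(y_pred.shape[1])[y_true]
eps = 1e-7   # assumed small constant to avoid log(0)
categorical_loss = -np.sum(y_onehot * np.log(np.clip(y_pred, eps, 1.0)), axis=-1)

print(sparse_loss)   # approximately [0.0513 2.3026]
print(np.allclose(sparse_loss, categorical_loss))   # True
```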

#### Regression losses

**Mean Squared Error (MSE)** :\
Also called the L2 loss.

**Mean Absolute Error (MAE)** :\
Also called the L1 loss.
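
A quick NumPy sketch of both losses, averaged over the last axis to give one value per sample. The sample values are made up for illustration.

```python
import numpy as np

y_true = np.array([[0.0, 1.0], [2.0, 4.0]])
y_pred = np.array([[1.0, 1.0], [0.0, 1.0]])

mse = np.mean((y_true - y_pred) ** 2, axis=-1)    # mean of squared errors per sample
mae = np.mean(np.abs(y_true - y_pred), axis=-1)   # mean of absolute errors per sample

print(mse)   # [0.5 6.5]
print(mae)   # [0.5 2.5]
```

The second sample has larger errors, and MSE grows much faster than MAE there, which is why MSE is more sensitive to outliers.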

### Metrics

To learn more: [Keras Metrics](https://keras.io/api/metrics/)\
Note that you may use any loss function as a metric.

**Accuracy** :\
Calculates how often predictions equal labels. Labels may be floats and there may be more than one per sample, but a prediction counts as correct only if it is exactly equal to its label.

**Binary Accuracy** :\
Calculates how often predictions match binary labels. Predictions are first converted to 0 or 1 by comparing them with `threshold`.

```python
tf.keras.metrics.BinaryAccuracy(
    name="binary_accuracy", dtype=None, threshold=0.5
)
```

**Categorical Accuracy** :\
Calculates how often predictions match one-hot labels. The class with the highest predicted probability is treated as 1 and all the others as 0.
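
The three metrics differ only in how a prediction is turned into a decision before comparing. Here is a NumPy sketch of that idea, not the Keras classes themselves. The sample values are made up, and the strict `>` comparison against the threshold is my understanding of the default behaviour.

```python
import numpy as np

# Accuracy: a prediction counts only if it equals the label exactly
labels = np.array([1, 2, 3, 4])
preds = np.array([0, 2, 3, 4])
accuracy = np.mean(labels == preds)

# Binary accuracy: threshold the predicted probabilities first (threshold=0.5)
bin_labels = np.array([1, 1, 0, 0])
bin_probs = np.array([0.98, 1.0, 0.0, 0.6])
binary_accuracy = np.mean((bin_probs > 0.5).astype(int) == bin_labels)

# Categorical accuracy: compare the argmax of predictions and one-hot labels
onehot_labels = np.array([[0, 0, 1], [0, 1, 0]])
cat_probs = np.array([[0.1, 0.9, 0.8], [0.05, 0.95, 0.0]])
categorical_accuracy = np.mean(
    np.argmax(cat_probs, axis=-1) == np.argmax(onehot_labels, axis=-1))

print(accuracy, binary_accuracy, categorical_accuracy)   # 0.75 0.75 0.5
```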

## Fit

```python
Model.fit(
    x=None,
    y=None,
    batch_size=None,
    epochs=1,
    verbose=1,
    callbacks=None,
    validation_split=0.0,
    validation_data=None,
    shuffle=True,
    class_weight=None,
    sample_weight=None,
    initial_epoch=0,
    steps_per_epoch=None,
    validation_steps=None,
    validation_batch_size=None,
    validation_freq=1,
    max_queue_size=10,
    workers=1,
    use_multiprocessing=False,
)
```

`batch_size` defaults to 32
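
With that default, the number of weight updates per epoch follows from the dataset size. A tiny sketch, assuming the 50,000 training images of CIFAR-10 and that the last, smaller batch is kept:

```python
import math

num_samples = 50000   # size of the CIFAR-10 training set
batch_size = 32       # the default used by fit()

steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, last_batch_size)   # 1563 16
```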

In [None]:
model.fit(trainX, trainy, validation_data=(testX, testy), epochs=100)

### TensorBoard

To learn more about TensorBoard: [Keras TensorBoard](https://www.tensorflow.org/tensorboard)

To see TensorBoard arguments: [Keras TensorBoard Callback](https://keras.io/api/callbacks/tensorboard/)

```python
tf.keras.callbacks.TensorBoard(
    log_dir="logs",
    histogram_freq=0,
    write_graph=True,
    write_images=False,
    update_freq="epoch",
    profile_batch=2,
    embeddings_freq=0,
    embeddings_metadata=None,
    **kwargs
)
```

In [None]:
# Run this right after the imports.

# Load the TensorBoard notebook extension so TensorBoard can be displayed inside the notebook.
# A comment cannot share the line with the %load_ext magic, so keep it on its own line.
%load_ext tensorboard
!rm -rf ./logs/   # clear any logs from previous runs

In [None]:
log_dir = "logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)

model.fit(x=x_train, y=y_train, epochs=5, validation_data=(x_test, y_test), 
          callbacks=[tensorboard_callback])

To see the TensorBoard results

In [None]:
# Display TensorBoard inside the notebook
%tensorboard --logdir logs/fit

In [None]:
# Outside a notebook, run this in a terminal instead:
#   tensorboard --logdir logs/fit

### ModelCheckpoint

```python
tf.keras.callbacks.ModelCheckpoint(
    filepath,
    monitor="val_loss",
    verbose=0,
    save_best_only=False,
    save_weights_only=False,
    mode="auto",
    save_freq="epoch",
    options=None,
    **kwargs
)
```

`save_freq`: 'epoch' or integer. 'epoch' means save after each epoch, n (integer) means save after n batches\
Note that if the saving isn't aligned to epochs, the monitored metric may potentially be less reliable (it could reflect as little as 1 batch, since the metrics get reset every epoch).

In [None]:
checkpoint_filepath = '/tmp/checkpoint'

model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath, save_weights_only=True, save_best_only=True)

model.fit(x=x_train, y=y_train, epochs=5, validation_data=(x_test, y_test), 
          callbacks=[model_checkpoint_callback])

### EarlyStopping

```python
tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    min_delta=0,
    patience=0,
    verbose=0,
    mode="auto",
    baseline=None,
    restore_best_weights=False,
)
```

Choose `patience` according to how noisy the validation loss is, which depends on the batch size and learning rate: the more the loss zig-zags, the larger `patience` should be.

In [None]:
earlystopping_callback = tf.keras.callbacks.EarlyStopping(patience=3)
model.fit(x=x_train, y=y_train, epochs=5, validation_data=(x_test, y_test), 
          callbacks=[earlystopping_callback])
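
To get a feel for how `patience` interacts with a zig-zagging loss, here is a simplified pure-Python sketch of the stopping rule. It is not the Keras implementation (it ignores `mode`, `baseline` and `restore_best_weights`), and the function name and loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=0, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0                            # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:     # improved by more than min_delta
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# A noisy but overall decreasing validation loss
val_losses = [1.0, 0.8, 0.85, 0.82, 0.7, 0.72, 0.71, 0.73, 0.6]

print(early_stopping_epoch(val_losses, patience=1))   # 3
print(early_stopping_epoch(val_losses, patience=3))   # 8
print(early_stopping_epoch(val_losses, patience=4))   # None
```

With a small `patience` the run stops at the first bump, even though the loss keeps improving later.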

---
# Other individual stuff

## Using plt

In [None]:
plt.imshow(img)
plt.show()    # to show several images separately, call this after each one

## Get weights and biases

In [None]:
weights = model.get_weights()
print("weights: ", weights[0])
print("biases: ", weights[1])

## Save and load model (or weights)

In [None]:
model.save('path/to/location')    # save all info necessary to specify a model
model.save_weights('./weights_model_name')

In [None]:
model = keras.models.load_model('path/to/location')
model.load_weights('./weights_model_name')

## Download a file

In [None]:
dataset_path = keras.utils.get_file("/content/train_32x32.mat", 
                                    "http://ufldl.stanford.edu/housenumbers/train_32x32.mat")
print(dataset_path)

## Zip a folder from colab (for download)

In [None]:
# create a zip file, then download it
!zip -r /content/file.zip /content/Folder_To_Zip

## Unrar a rar file

In [None]:
!unrar x /content/logs.rar /content/

## Delete a folder from colab

In [None]:
!rm -rf /content/Folder_To_Delete

## EpochDots to reduce the verbosity

In [None]:
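# Requires the tensorflow_docs package: !pip install git+https://github.com/tensorflow/docs
import tensorflow_docs as tfdocs
import tensorflow_docs.modeling
# EpochDots reduces logging: it prints a `.` for each epoch and a full set of metrics every 100 epochs.
history = model.fit(X, y, epochs=1000, validation_split=0.2,
                    verbose=0, callbacks=[tfdocs.modeling.EpochDots()])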

## Gradient computation with GradientTape

In [None]:
# y = x^2 where x = 3
x = tf.Variable(3.0)

# Initiate the gradient tape
with tf.GradientTape() as tape:
  # Define the function
  y = x * x

dy_dx = tape.gradient(y, x)   # derivative of y with respect to x

assert dy_dx.numpy() == 6.0

Or we can minimize a function like below.\
For more information: [Using GradientTape In Keras](https://colab.research.google.com/github/aamini/introtodeeplearning/blob/master/lab1/solutions/Part1_TensorFlow_Solution.ipynb#scrollTo=dQwDhKn8kbO2)

In [None]:
x_1 = tf.Variable([tf.random.normal([1])])
learning_rate = 1e-2 # learning rate for SGD

# We will run SGD for a number of iterations.
for i in range(500):
  with tf.GradientTape() as tape:
    # "forward pass": record the current loss on the tape
    x_2 = 2 * x_1 + 1
    loss = (x_2 - 4)**2

  grad = tape.gradient(loss, x_1) # compute the derivative of the loss with respect to x_1
  new_x = x_1 - learning_rate*grad # sgd update
  x_1.assign(new_x) # update the value of x
  
print(x_1)
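
As a sanity check on where that loop should end up, the same update can be written without TensorFlow using the derivative worked out by hand. The loss `(2x + 1 - 4)^2` is minimized at `x = 1.5`, and its derivative is `4 * (2x + 1 - 4)`. In this sketch the starting point `0.0` is an arbitrary assumption replacing the random initial value.

```python
x = 0.0                 # assumed starting point (the loop above starts from a random value)
learning_rate = 1e-2

for _ in range(500):
    grad = 4 * (2 * x + 1 - 4)   # d/dx (2x + 1 - 4)^2, by the chain rule
    x -= learning_rate * grad    # same SGD update as above

print(x)   # very close to 1.5
```

If the `GradientTape` version prints a value near 1.5 as well, the tape is computing the gradient correctly.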

## Custom loss functions

In [None]:
def my_loss_fn(y_true, y_pred):
    squared_difference = tf.square(y_true - y_pred)
    return tf.reduce_mean(squared_difference, axis=-1)

model.compile(optimizer='adam', loss=my_loss_fn)

## Add loss for regularization

A regularization term can be added to the loss explicitly inside the `GradientTape` block, or through the `add_loss()` method of the `Layer` or `Model` class.\
To learn more about add_loss: [Keras add_loss Function](https://keras.io/api/losses/#the-addloss-api)

In [None]:
# call this inside the call() method of a Layer or Model subclass
self.add_loss(self.rate * tf.reduce_sum(tf.square(inputs)))   # activity regularization: penalizes the squared L2 norm of the inputs
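
For context, here is a minimal sketch of a complete layer that uses this line. The class name `ActivityRegularizationLayer` and the `rate` value are illustrative choices, and the example assumes eager execution (the TensorFlow 2 default). The layer passes its inputs through unchanged and only records the penalty.

```python
import tensorflow as tf

class ActivityRegularizationLayer(tf.keras.layers.Layer):
    """Pass-through layer that adds a penalty on the size of its inputs."""

    def __init__(self, rate=1e-2, **kwargs):
        super().__init__(**kwargs)
        self.rate = rate

    def call(self, inputs):
        # record the extra loss term; the inputs themselves are returned unchanged
        self.add_loss(self.rate * tf.reduce_sum(tf.square(inputs)))
        return inputs

layer = ActivityRegularizationLayer(rate=0.5)
_ = layer(tf.ones((2, 3)))          # six ones -> sum of squares is 6
penalty = float(layer.losses[0])    # 0.5 * 6
print(penalty)   # 3.0
```

When such a layer is part of a model trained with `fit()`, the recorded terms are added to the compiled loss. In a custom `GradientTape` loop they have to be added by hand, for example with `sum(model.losses)`.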