<a href="https://colab.research.google.com/github/GiX7000/deep-learning-with-python/blob/main/DeepLearning_with_Python_Part4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Deep Learning examples from the book 'Deep Learning with Python', Part 4, Francois Chollet

## Example 1. Functional API

When we need several independent inputs, multiple outputs, or a more complex internal topology such as inception modules and residual connections (from page 242), the functional API lets us treat layers as functions that take tensors as inputs and return tensors as outputs. In addition, we can reuse a layer instance several times (sharing the same weights/knowledge without creating a new layer) and also use whole models as layers (@page247).
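To illustrate layer reuse, here is a minimal sketch of a two-branch model in which a single `Dense` instance is called on both inputs. The input shapes, layer sizes and the names `shared_dense` and `shared_model` are made up for this illustration and do not come from the book.

```python
import keras
from keras import layers

# two inputs that should be processed by the same weights (shapes are arbitrary)
left_input = keras.Input(shape=(4,), name="left")
right_input = keras.Input(shape=(4,), name="right")

# one layer instance, called twice: both branches share its weights
shared_dense = layers.Dense(8, activation="relu", name="shared_dense")
left_features = shared_dense(left_input)
right_features = shared_dense(right_input)

# merge the two branches and predict a single value
merged = layers.concatenate([left_features, right_features], axis=-1)
prediction = layers.Dense(1, activation="sigmoid", name="similarity")(merged)

shared_model = keras.Model([left_input, right_input], prediction)

# the shared layer contributes its 4*8 + 8 = 40 parameters only once,
# the final layer adds 16*1 + 1 = 17
print(shared_model.count_params())
```

Because the shared layer is counted once, the parameter total stays the same no matter how many inputs it is applied to.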

The simplest functional API model with 1 input and 1 output

In [None]:
# define a functional API model
from keras.models import Model
from keras import layers
from keras import Input

input_tensor = Input(shape=(64,))
x = layers.Dense(32, activation='relu')(input_tensor)
x = layers.Dense(32, activation='relu')(x)
output_tensor = layers.Dense(10, activation='softmax')(x)

model = Model(input_tensor, output_tensor)

model.summary()

In [None]:
# compile, train and evaluate the functional API model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# create some very simple data
import numpy as np
x_train = np.random.random((1000, 64))
y_train = np.random.random((1000, 10))

# train the model
model.fit(x_train, y_train, epochs=10, batch_size=128)

# evaluate the model
score = model.evaluate(x_train, y_train)
print('\nThe score of the model is:', score)

2-input functional API model(question-answering model) with 1 output

In [None]:
# define a functional API model with 2 inputs: a reference text and a question
from keras.models import Model
from keras import layers
from keras import Input

text_vocabulary_size = 10000
question_vocabulary_size = 10000
answer_vocabulary_size = 500

# define 'text' part of NN

# input of text part of NN
text_input = Input(shape=(None,), dtype='int32', name='text')
# 1st layer of text part of NN
embedded_text = layers.Embedding(text_vocabulary_size, 64)(text_input)   # Embedding(input_dim, output_dim): vocabulary size first, then vector size
# 2nd layer of text part of NN
encoded_text = layers.LSTM(32)(embedded_text)

# define 'question' part of NN

# input of question part of NN
question_input = Input(shape=(None,), dtype='int32', name='question')
# 1st layer of question part of NN
embedded_question = layers.Embedding(question_vocabulary_size, 32)(question_input)   # vocabulary size first, then vector size
# 2nd layer of question part of NN
encoded_question = layers.LSTM(16)(embedded_question)

# concatenate the above 2 parts 
concatenated = layers.concatenate([encoded_text, encoded_question], axis=-1)

# put the above into a final dense layer
answer = layers.Dense(answer_vocabulary_size, activation='softmax')(concatenated)

model = Model([text_input, question_input], answer)

model.summary()

To train a model like this, we can feed the model either a list of numpy arrays as inputs or a dictionary that maps input names to numpy arrays

In [None]:
# compile and train the functional API model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['acc'])

# create some random data
import numpy as np
num_samples = 1000
max_length = 100

text = np.random.randint(1, text_vocabulary_size, size=(num_samples, max_length))   # generates dummy numpy data
question = np.random.randint(1, question_vocabulary_size, size=(num_samples, max_length))
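# answers are one-hot encoded, not integers: pick a random class per sample and one-hot encode it
# (np.random.randint(0, 1, ...) would only ever produce zeros)
answers = np.eye(answer_vocabulary_size)[np.random.randint(0, answer_vocabulary_size, size=num_samples)]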

# train the model
model.fit([text, question], answers, epochs=10, batch_size=128) # fitting using a list of inputs

# fitting using a dictionary of inputs(only if inputs are named)
#model.fit( {'text':text, 'question':question}, answers, epochs=10, batch_size=128 )

Multi-output functional API model with 1 input

This model takes as input a series of social media posts from a single person and tries to predict attributes of that person, such as age, gender and income level.

In [15]:
from keras.models import Model
from keras import layers
from keras import Input

vocabulary_size = 50000
num_income_groups = 10

# model's architecture
posts_inputs = Input(shape=(None,), dtype='int32', name='posts')
embedded_posts = layers.Embedding(vocabulary_size, 256)(posts_inputs)   # Embedding(input_dim, output_dim)
x = layers.Conv1D(128, 5, activation='relu')(embedded_posts)
x = layers.MaxPooling1D(5)(x)
x = layers.Conv1D(256, 5, activation='relu')(x)
x = layers.Conv1D(256, 5, activation='relu')(x)
x = layers.MaxPooling1D(5)(x)
x = layers.Conv1D(256, 5, activation='relu')(x)
x = layers.Conv1D(256, 5, activation='relu')(x)
x = layers.GlobalMaxPooling1D()(x)   # collapses the time axis so that each head outputs one prediction per sample
x = layers.Dense(128, activation='relu')(x)

# define the model's 3 outputs
age_prediction = layers.Dense(1, name='age')(x)
income_prediction = layers.Dense(num_income_groups, activation='softmax', name='income')(x)
gender_prediction = layers.Dense(1, activation='sigmoid', name='gender')(x)

model = Model(posts_inputs, [age_prediction, income_prediction, gender_prediction])

model.summary()

When we have multiple outputs, training requires specifying a different loss function for each head of the network. Because gradient descent minimizes a single scalar, these losses must be combined into one value; the simplest way to combine different losses is to sum them all.

In [16]:
# compile the model
#model.compile(optimizer='rmsprop', loss=['mse', 'categorical_crossentropy', 'binary_crossentropy'])

# if you have given names to output layers
#model.compile(optimizer='rmsprop', loss={'age':'mse', 'income':'categorical_crossentropy', 'gender':'binary_crossentropy'})

# we assign different levels of importance to the losses
model.compile(optimizer='rmsprop', loss=['mse', 'categorical_crossentropy', 'binary_crossentropy'],
              loss_weights=[0.25, 1., 10.])
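As a rough numeric illustration of why loss weights can matter, the sketch below combines three per-head loss values into one scalar, first as a plain sum and then with the weights used above. The loss values are invented for the example, and `weighted_total_loss` is a hypothetical helper, not a Keras function.

```python
import numpy as np

def weighted_total_loss(losses, weights):
    """Combine per-output loss values into one scalar: sum_i weight_i * loss_i."""
    return float(np.dot(losses, weights))

# hypothetical per-head loss values (age MSE, income cross-entropy, gender cross-entropy)
example_losses = [4.0, 1.5, 0.1]

# plain sum: the age MSE dominates the total
plain_sum = weighted_total_loss(example_losses, [1.0, 1.0, 1.0])

# weighted sum with loss_weights=[0.25, 1., 10.]: the three terms are on a similar scale
weighted_sum = weighted_total_loss(example_losses, [0.25, 1.0, 10.0])

print(plain_sum, weighted_sum)
```

With these made-up numbers, the unweighted total is mostly the age loss, so the model would be optimized preferentially for that task; the weights bring the three contributions closer together.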

In [None]:
# train the model
# posts, age_targets, income_targets and gender_targets are assumed to be numpy arrays (not the symbolic tensors defined above)
model.fit(posts, [age_targets, income_targets, gender_targets], epochs=10, batch_size=64)

# and if you have given names to output layers
#model.fit(posts, {'age':age_targets, 'income':income_targets, 'gender':gender_targets}, epochs=10, batch_size=64)
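The `fit` call above expects numpy arrays that this notebook does not define. A minimal sketch of dummy data with matching shapes could look like this; `num_samples`, `max_length` and the age range are arbitrary assumptions, and the arrays would need to be created before calling `fit`.

```python
import numpy as np

num_samples = 256        # hypothetical dataset size
max_length = 500         # hypothetical tokens per sample, long enough for the conv/pooling stack
vocabulary_size = 50000
num_income_groups = 10

# integer token ids, one row per person
posts = np.random.randint(1, vocabulary_size, size=(num_samples, max_length))

# age: regression target, one float per sample
age_targets = np.random.uniform(18, 80, size=(num_samples, 1))

# income: one-hot encoded class among num_income_groups
income_classes = np.random.randint(0, num_income_groups, size=num_samples)
income_targets = np.eye(num_income_groups)[income_classes]

# gender: binary target for the sigmoid head
gender_targets = np.random.randint(0, 2, size=(num_samples, 1))

print(posts.shape, age_targets.shape, income_targets.shape, gender_targets.shape)
```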

Multi-input, Multi-output functional API model

In [3]:
from keras.models import Model
from keras import layers
from keras import Input
import keras

vocabulary_size = 10000
num_tags = 100
num_departments = 4

# model's architecture
title = keras.Input(shape=(vocabulary_size,), name="title")
text_body = keras.Input(shape=(vocabulary_size,), name="text_body")
tags = keras.Input(shape=(num_tags,), name="tags")

features = layers.Concatenate()([title, text_body, tags])
features = layers.Dense(64, activation="relu")(features)

priority = layers.Dense(1, activation="sigmoid", name="priority")(features)
department = layers.Dense(
    num_departments, activation="softmax", name="department")(features)

model = keras.Model(inputs=[title, text_body, tags], outputs=[priority, department])

In [None]:
import numpy as np
num_samples = 1280

# the data
title_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))
text_body_data = np.random.randint(0, 2, size=(num_samples, vocabulary_size))
tags_data = np.random.randint(0, 2, size=(num_samples, num_tags))

priority_data = np.random.random(size=(num_samples, 1))
department_data = np.random.randint(0, 2, size=(num_samples, num_departments))

# compile, train, evaluate and predict
model.compile(optimizer="rmsprop",
              loss=["mean_squared_error", "categorical_crossentropy"],
              metrics=[["mean_absolute_error"], ["accuracy"]])
model.fit([title_data, text_body_data, tags_data],
          [priority_data, department_data],
          epochs=1)
model.evaluate([title_data, text_body_data, tags_data],
               [priority_data, department_data])
priority_preds, department_preds = model.predict([title_data, text_body_data, tags_data])

## Example 2. Using callbacks and tensorboard

Callbacks: objects passed to `fit` that the model calls at various points during training. They have access to the state of the model and its performance, so they can, for example, detect that the validation loss is no longer improving and stop training instead of wasting time. They can interrupt training, save a model, load a different weight set or otherwise alter the state of the model. Some ways you can use callbacks: model checkpointing, early stopping, dynamically adjusting the value of certain parameters during training, logging training and validation metrics during training, or visualizing the representations learned by the model as they are updated. Some callbacks that the keras module includes:
*   keras.callbacks.ModelCheckpoint
*   keras.callbacks.EarlyStopping
*   keras.callbacks.LearningRateScheduler
*   keras.callbacks.ReduceLROnPlateau
*   keras.callbacks.CSVLogger
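To make the idea of early stopping concrete, here is a simplified pure-Python sketch of the patience logic: stop once the monitored value has failed to improve for `patience` consecutive epochs. This is an illustration only, not the Keras implementation, and the exact off-by-one behaviour may differ between Keras versions; `early_stopping_epoch` and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=1):
    """Return the index of the epoch at which training would stop, or None if it never stops."""
    best = float("inf")
    wait = 0  # number of consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# hypothetical validation losses, one per epoch
history = [0.70, 0.55, 0.50, 0.52, 0.49, 0.51]

print(early_stopping_epoch(history, patience=1))  # 3
print(early_stopping_epoch(history, patience=2))  # None
```

With `patience=1` training stops at the first epoch that fails to improve, while `patience=2` tolerates the temporary bump and lets the model reach the better loss of 0.49.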



In [None]:
# how to use ModelCheckpoint and EarlyStopping
import keras

callbacks_list = [
    # interrupts training once validation accuracy has stopped improving (patience=1)
    keras.callbacks.EarlyStopping(monitor='val_acc', patience=1),
    # saves the model after an epoch if and only if validation loss has improved, which allows us to keep the best model
    keras.callbacks.ModelCheckpoint(filepath='mymodel.h5', monitor='val_loss', save_best_only=True),
]

model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])

# x, y, x_val and y_val are assumed to be existing numpy arrays
# validation data is required because both callbacks monitor validation metrics
model.fit(x, y, epochs=10, batch_size=32, callbacks=callbacks_list, validation_data=(x_val, y_val))

You can also write your own callback(@page251)
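A minimal sketch of a custom callback, assuming only the `keras.callbacks.Callback` base class and its `on_train_begin` / `on_epoch_end` hooks. The `LossHistory` class, the tiny model and the random data are invented for this illustration and are not the book's example.

```python
import numpy as np
import keras
from keras import layers

class LossHistory(keras.callbacks.Callback):
    """Hypothetical custom callback: records the training loss at the end of every epoch."""

    def on_train_begin(self, logs=None):
        self.losses = []

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.losses.append(logs.get("loss"))

# tiny throwaway model and random data, only to exercise the callback
tiny_model = keras.Sequential([keras.Input(shape=(4,)), layers.Dense(1)])
tiny_model.compile(optimizer="rmsprop", loss="mse")

x_demo = np.random.random((32, 4))
y_demo = np.random.random((32, 1))

loss_history = LossHistory()
tiny_model.fit(x_demo, y_demo, epochs=3, batch_size=8, verbose=0, callbacks=[loss_history])

print(loss_history.losses)  # one loss value per epoch
```

The same pattern extends to the other hooks of the base class (for example, the batch-level ones) when finer-grained monitoring is needed.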

TensorBoard: a useful browser-based tool that helps us visually monitor everything that goes on inside the model during training. This way, we develop a clearer picture of what the model does and does not do. Specifically, you can: 1) visually monitor metrics during training, 2) visualize your model's architecture, 3) visualize histograms of activations and gradients, 4) explore embeddings in 3D.

Let's see an example of how to use tensorboard in a text classification model implementation

In [None]:
import keras
from keras import layers
from keras.models import Sequential
from keras.datasets import imdb
from keras.utils import pad_sequences

max_features = 2000
maxlen = 500

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)   # loads data as lists of integers

# pad or truncate all reviews to exactly maxlen words
x_train = pad_sequences(x_train, maxlen=maxlen)
x_test = pad_sequences(x_test, maxlen=maxlen)

# model's architecture
model = Sequential()
model.add(layers.Embedding(max_features, 128, input_length=maxlen, name='embed'))
model.add(layers.Conv1D(32, 7, activation='relu'))  
model.add(layers.MaxPooling1D(5))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.GlobalMaxPooling1D())
model.add(layers.Dense(1, activation='sigmoid'))   # binary_crossentropy expects probabilities by default

model.summary()

model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])

In [None]:
# to start using tensorboard, we need to create a directory where we will store the log files it generates
!mkdir my_log_dir

In [None]:
# start training with a tensorboard callback
# records activation histograms and embedding data every 1 epoch; log_dir points to the directory created above
callbacks = [keras.callbacks.TensorBoard(log_dir='my_log_dir', histogram_freq=1, embeddings_freq=1)]

history = model.fit(x_train, y_train, epochs=20, batch_size=128, validation_split=0.2, callbacks=callbacks)

In [None]:
%load_ext tensorboard
%tensorboard --logdir my_log_dir

In [6]:
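# leftover cell: attempt to download a dataset archive into celeba_gan
# note: wget on this Google Drive "open" link saves an HTML page rather than data.zip, so the unzip step fails
!mkdir -p celeba_gan
!wget https://drive.google.com/open?id=0B7EVK8r0v71pZjFTYXZWM3FlRnM
!unzip -qq celeba_gan/data.zip -d celeba_gan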


## Some general tips to advance your models and Bye!

When building high-performing deep convnets, try using the BatchNormalization layer or the SeparableConv2D layer (@page260).
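One reason separable convolutions are attractive is their smaller parameter count. The arithmetic below is a sketch that assumes a 3x3 kernel, 64 input channels, 128 output channels, one bias per output channel and a depth multiplier of 1; the helper functions are hypothetical and only do the counting.

```python
def conv2d_params(kernel, in_channels, out_channels):
    """Weights of a regular 2D convolution plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels

def separable_conv2d_params(kernel, in_channels, out_channels):
    """Depthwise kernel (one per input channel) + 1x1 pointwise convolution + bias."""
    depthwise = kernel * kernel * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise + out_channels

print(conv2d_params(3, 64, 128))            # 73856
print(separable_conv2d_params(3, 64, 128))  # 8896
```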

Hyperparameter optimization:
*   choose a set of hyperparameters(automatically)
*   build the corresponding model
*   fit it to your training data, and measure the final performance on validation data
*   choose the next set of hyperparameters to try(automatically)
*   repeat
*   when there is no further improvement, choose the best model and measure its performance on your test data



There are several techniques for automatically choosing hyperparameters, such as Bayesian optimization, genetic algorithms and simple random search. Try Hyperopt (https://github.com/hyperopt/hyperopt).
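The loop described above can be sketched with simple random search. In this sketch, `validation_score` is a toy stand-in for "build the model, fit it and evaluate it on validation data", and the hyperparameter names and ranges are assumptions chosen for illustration; it does not use Hyperopt.

```python
import random

def validation_score(learning_rate, num_units):
    """Hypothetical stand-in for building, training and validating a model."""
    # toy function that peaks at learning_rate=0.01 and num_units=64
    return -abs(learning_rate - 0.01) * 100 - abs(num_units - 64) / 64

def random_search(num_trials, seed=0):
    """Sample hyperparameters at random and keep the best-scoring set."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(num_trials):
        # choose the next set of hyperparameters to try (automatically)
        params = {
            "learning_rate": 10 ** rng.uniform(-4, -1),
            "num_units": rng.choice([16, 32, 64, 128, 256]),
        }
        # build, fit and measure performance on validation data
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

best_params, best_score = random_search(num_trials=50)
print(best_params, best_score)
```

Only after the search is finished would the selected configuration be evaluated once on the test data, as in the last step of the list.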

Another very powerful technique to improve the performance of your model is model ensembling, which consists of pooling together the predictions of a set of models. The simplest form of ensembling gives the same importance/weight to every model's predictions, i.e. it averages them.
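A minimal sketch of averaging-based ensembling, assuming each model's predictions are already available as a numpy array of the same shape. The prediction values and the `ensemble_average` helper are invented for the example.

```python
import numpy as np

def ensemble_average(predictions, weights=None):
    """Average predictions from several models; equal weights by default."""
    stacked = np.stack(predictions, axis=0)
    return np.average(stacked, axis=0, weights=weights)

# hypothetical predictions of three models on the same three samples
preds_a = np.array([0.9, 0.2, 0.6])
preds_b = np.array([0.7, 0.4, 0.5])
preds_c = np.array([0.8, 0.3, 0.1])

# equal importance for every model
final_preds = ensemble_average([preds_a, preds_b, preds_c])
print(final_preds)  # approximately [0.8, 0.3, 0.4]

# optional: give more weight to models that did better on validation data
weighted_preds = ensemble_average([preds_a, preds_b, preds_c], weights=[0.5, 0.25, 0.25])
print(weighted_preds)
```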

The last chapter of the book covers some advanced concepts like text generation with LSTM, DeepDream, my favorite, neural style transfer, and generating images with variational autoencoders and GANs. I won't cover any of them in this repo for now.

You can also find the official implementations of all the examples I presented in all parts of this repo (which are also included in the book) here: https://github.com/fchollet/deep-learning-with-python-notebooks

That's it from me, this was a quick introduction to deep learning with Python. The next step is to experiment with many different datasets and models for several tasks. I wish you GOOD LUCK !!