**Deep learning packages**


1.   TensorFlow - Google
2.   PyTorch - Facebook AI Research
3.   Keras - François Chollet (now at Google)
4.   Chainer - Preferred Networks (Japan)
5.   Caffe - Berkeley Vision and Learning Center
6.   CNTK - Microsoft







**Implementing a neural network in Keras**


1.   Prepare the inputs (image, video, audio, text)
2.   Define the ANN model (MLP, CNN, RNN)
3.   Choose an optimizer (SGD, RMSprop, Adam)
4.   Choose a loss function (MSE, MAE, cross-entropy)
5.   Train and evaluate the model







**Procedure to implement an ANN in Keras**

In [None]:
from tensorflow import keras

In [1]:
import numpy as np

In [None]:
# Define the ANN model
# X_train (n_samples, 100) and one-hot y_train (n_samples, 10) are assumed to be defined beforehand
model = keras.models.Sequential()
model.add(keras.layers.Dense(units=64, input_dim=100)) # input_dim: size of the input
model.add(keras.layers.Activation(activation='relu'))
model.add(keras.layers.Dense(units=10))
model.add(keras.layers.Activation(activation='softmax'))

# Optimizer and loss function
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, batch_size=32, epochs=10)
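
The softmax activation and the categorical cross-entropy loss used in this model can be checked by hand. The sketch below is a plain-Python illustration, not the Keras implementation; the logits and the one-hot target are made-up values.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

logits = [2.0, 1.0, 0.1]   # hypothetical raw outputs of the last Dense layer
probs = softmax(logits)
target = [1.0, 0.0, 0.0]   # one-hot label: the true class is class 0
loss = categorical_crossentropy(target, probs)
print(probs, loss)
```

With a one-hot target the loss reduces to minus the log of the probability assigned to the true class, so it shrinks toward 0 as that probability approaches 1.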

**Convolutional Neural Network - Sequential model**



*   The input is a 4D tensor: [N_train, height, width, channels]
*   N_train = number of training samples, height and width = image dimensions in pixels, channels = 3 for an RGB image and 1 for a grayscale image



In [None]:
model = keras.models.Sequential()
# input: 100x100 images with 3 channels -> (100, 100, 3) tensors
# this applies 32 convolution filters of size 3x3 each

model.add(keras.layers.Conv2D(filters=32, kernel_size=(3,3), activation='relu', input_shape=(100,100,3)))
model.add(keras.layers.Conv2D(filters=32, kernel_size=(3,3), activation='relu'))
model.add(keras.layers.MaxPool2D(pool_size=(2,2)))
model.add(keras.layers.Dropout(0.25)) # randomly zero 25% of the activations during training to limit overfitting

model.add(keras.layers.Conv2D(filters=64, kernel_size=(3,3), activation='relu'))
model.add(keras.layers.Conv2D(filters=64, kernel_size=(3,3), activation='relu'))
model.add(keras.layers.MaxPool2D(pool_size=(2,2)))
model.add(keras.layers.Dropout(0.25))
model.add(keras.layers.Flatten()) # flatten the feature maps into a vector before the classifier
model.add(keras.layers.Dense(units=10, activation='softmax'))

# older Keras versions also accepted decay=1e-6 here; recent versions use a learning-rate schedule instead
sgd = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9, nesterov=True)
model.compile(optimizer=sgd, loss='categorical_crossentropy')

# X_train, y_train, X_test and y_test are assumed to be defined beforehand
model.fit(X_train, y_train, batch_size=32, epochs=10)
score = model.evaluate(X_test, y_test, batch_size=32)
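
To see how the spatial size shrinks through this stack, the arithmetic can be done by hand. The sketch below assumes the Keras defaults (no padding and stride 1 for `Conv2D`, stride equal to the pool size for `MaxPool2D`); the helper names are hypothetical.

```python
def window_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

def conv2d_params(kernel, in_channels, filters):
    """Number of weights in a Conv2D layer: one kernel plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

size = 100                             # 100x100 input image
size = window_out(size, 3)             # first Conv2D 3x3
size = window_out(size, 3)             # second Conv2D 3x3
size = window_out(size, 2, stride=2)   # MaxPool2D 2x2
size = window_out(size, 3)             # third Conv2D 3x3
size = window_out(size, 3)             # fourth Conv2D 3x3
size = window_out(size, 2, stride=2)   # MaxPool2D 2x2

flat_length = size * size * 64         # length of the vector a Flatten layer would produce
first_conv_params = conv2d_params(3, 3, 32)
print(size, flat_length, first_conv_params)
```

Under these assumptions the feature maps end up 22x22 with 64 channels, so a Flatten layer placed before the final Dense layer would produce a vector of length 30976.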

**Simple MLP network - Functional model**

In [None]:
from keras.models import Model
from keras.layers import Input, Dense

# this returns a tensor
inputs = Input(shape=(784,))

x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

model = Model(inputs= inputs, outputs =predictions)
model.compile(optimizer = 'rmsprop', loss='categorical_crossentropy', metrics = ['accuracy'])
model.fit(X_train, y_train)

**Recurrent Neural Network - RNN**


1.   RNNs are used on sequential data: text, audio, genomes, etc.
2.   Recurrent networks come in three common types
  *   Vanilla RNN
  *   LSTM
  *   GRU
3.   The output at time "t" depends on the current input and on the hidden state carried over from previous time steps
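
The dependence on previous values can be illustrated with a one-unit vanilla RNN. This is a minimal sketch with made-up scalar weights, not what Keras runs internally: the hidden state is updated as h_t = tanh(w_x * x_t + w_h * h_{t-1} + b).

```python
import math

def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Scalar vanilla RNN: return the hidden state after each time step."""
    h = h0
    states = []
    for x_t in inputs:
        h = math.tanh(w_x * x_t + w_h * h + b)  # mixes the current input with the previous state
        states.append(h)
    return states

# Same weights, two sequences that differ only at the first time step
states_pulse = rnn_forward([1.0, 0.0, 0.0], w_x=0.5, w_h=0.8, b=0.0)
states_zero = rnn_forward([0.0, 0.0, 0.0], w_x=0.5, w_h=0.8, b=0.0)
print(states_pulse)
print(states_zero)
```

The inputs at the last two steps are identical in both sequences, yet the hidden states differ, because the first input is still remembered through the recurrent connection.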







In [None]:
from keras.layers import Dense, LSTM, Embedding, Dropout
from keras.models import Sequential

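# vocabulaire_size, dim_of_output and timestamp are placeholders that must be defined beforehand
model = Sequential()
model.add(Embedding(input_dim=vocabulaire_size, output_dim=dim_of_output, input_length=timestamp)) # input_length: maximum sequence length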
model.add(LSTM(128))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics = ['accuracy'])

model.fit(X_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(X_test, y_test, batch_size=16)

**Convolution layers**

• 1D Conv

Applications: Audio signal processing, natural language processing

• 2D Conv

Applications: Computer vision - images

• 3D Conv

Applications: Computer vision - videos (convolution along the temporal dimension)

• Max pooling, average pooling
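
As a small illustration of what these layers compute, here is a plain-Python sketch of a 1D convolution (without padding, stride 1) and of max and average pooling on a toy signal. The signal and the kernel are made-up values; real layers learn their kernels and operate on batches of multi-channel tensors.

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal and take a weighted sum at each position."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def pool1d(signal, size, reduce_fn):
    """Non-overlapping pooling: apply reduce_fn to each window of `size` values."""
    return [reduce_fn(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]

signal = [1, 3, 2, 5, 4, 6]
differences = conv1d(signal, [-1, 1])                     # kernel that responds to local changes
max_pooled = pool1d(signal, 2, max)                       # keep the largest value per window
avg_pooled = pool1d(signal, 2, lambda w: sum(w) / len(w)) # keep the mean per window
print(differences, max_pooled, avg_pooled)
```

The 2D and 3D variants apply the same idea along two (height, width) or three (time, height, width) axes.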



**General layers**

*   Dense
*   Dropout
*   Embedding
*   Flatten
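
Two of these layers are simple enough to sketch in plain Python. The code below is an illustration under stated assumptions, not the Keras source: `dropout` uses the usual "inverted dropout" scheme (kept values are scaled by 1/(1 - rate) during training), and `flatten` turns a nested list into a flat vector.

```python
import random

def dropout(values, rate, rng):
    """Zero each value with probability `rate` and rescale the survivors."""
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]

def flatten(nested):
    """Flatten a nested list (for example height x width x channels) into one vector."""
    if not isinstance(nested, list):
        return [nested]
    return [x for item in nested for x in flatten(item)]

rng = random.Random(0)                      # fixed seed so the run is repeatable
activations = [1.0] * 8
dropped = dropout(activations, rate=0.25, rng=rng)

feature_maps = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]   # toy 2x2x2 tensor
flat = flatten(feature_maps)
print(dropped, flat)
```

Dropout is only active during training; at inference time the layer passes its input through unchanged.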





**Optimization**

SGD – Stochastic gradient descent

SGD with momentum

Adam

AdaGrad

RMSprop

AdaDelta
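
To make the first two entries concrete, the sketch below applies plain SGD and SGD with momentum to a toy one-dimensional problem, minimizing f(w) = (w - 3)^2. The function, learning rate and step count are illustrative assumptions; the update rule (velocity = momentum * velocity - lr * gradient, then w = w + velocity) is the classical momentum formulation.

```python
def grad(w):
    """Gradient of f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def sgd(grad_fn, w, lr=0.1, momentum=0.0, steps=200):
    """Gradient descent with optional momentum; momentum=0 gives plain SGD."""
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity - lr * grad_fn(w)  # accumulate past gradients
        w = w + velocity
    return w

w_plain = sgd(grad, w=0.0)
w_momentum = sgd(grad, w=0.0, momentum=0.9)
print(w_plain, w_momentum)
```

Both runs approach the minimum at w = 3. The adaptive methods in the list (Adam, AdaGrad, RMSprop, AdaDelta) additionally rescale the step for each parameter from running statistics of the gradients.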

**Loading and saving Keras models**

In [None]:
from keras.models import load_model

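model.save('my_model.h5') # save the whole model (architecture, weights, optimizer state)
del model # delete the model from memory

model = load_model('my_model.h5') # reload the model

model.save_weights("my_model_save_weights.h5") # save only the weights
model.load_weights("my_model_save_weights.h5", by_name=True) # load the weights into layers with matching names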

**Extracting features from pre-trained models**

In [None]:
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np
from keras.applications.vgg16 import decode_predictions

In [None]:
img = image.load_img('elephant.jpg', target_size=(224,224))
x = image.img_to_array(img)
x.shape

array([[[0.74509805, 0.74509805, 0.74509805],
        [0.69803923, 0.69803923, 0.69803923],
        [0.73333335, 0.73333335, 0.73333335],
        ...,
        [0.5529412 , 0.5529412 , 0.5529412 ],
        [0.5058824 , 0.5058824 , 0.5058824 ],
        [0.5137255 , 0.5137255 , 0.5137255 ]],

       [[0.7764706 , 0.7764706 , 0.7764706 ],
        [0.6862745 , 0.6862745 , 0.6862745 ],
        [0.6901961 , 0.6901961 , 0.6901961 ],
        ...,
        [0.54901963, 0.54901963, 0.54901963],
        [0.5058824 , 0.5058824 , 0.5058824 ],
        [0.5019608 , 0.5019608 , 0.5019608 ]],

       [[0.78039217, 0.78039217, 0.78039217],
        [0.7529412 , 0.7529412 , 0.7529412 ],
        [0.6509804 , 0.6509804 , 0.6509804 ],
        ...,
        [0.5411765 , 0.5411765 , 0.5411765 ],
        [0.52156866, 0.52156866, 0.52156866],
        [0.50980395, 0.50980395, 0.50980395]],

       ...,

       [[0.30980393, 0.30980393, 0.30980393],
        [0.11764706, 0.11764706, 0.11764706],
        [0.09803922, 0

In [None]:
x = np.expand_dims(x, axis = 0)
x.shape

(1, 224, 224, 3)

In [None]:
x = preprocess_input(x)
x

array([[[[ 86.061    ,  73.221    ,  66.32     ],
         [ 74.061    ,  61.221    ,  54.32     ],
         [ 83.061    ,  70.221    ,  63.32     ],
         ...,
         [ 37.060997 ,  24.221    ,  17.32     ],
         [ 25.060997 ,  12.221001 ,   5.3199997],
         [ 27.060997 ,  14.221001 ,   7.3199997]],

        [[ 94.061    ,  81.221    ,  74.32     ],
         [ 71.061    ,  58.221    ,  51.32     ],
         [ 72.061    ,  59.221    ,  52.32     ],
         ...,
         [ 36.060997 ,  23.221    ,  16.32     ],
         [ 25.060997 ,  12.221001 ,   5.3199997],
         [ 24.060997 ,  11.221001 ,   4.3199997]],

        [[ 95.061    ,  82.221    ,  75.32     ],
         [ 88.061    ,  75.221    ,  68.32     ],
         [ 62.060997 ,  49.221    ,  42.32     ],
         ...,
         [ 34.060997 ,  21.221    ,  14.32     ],
         [ 29.060997 ,  16.221    ,   9.32     ],
         [ 26.060997 ,  13.221001 ,   6.3199997]],

        ...,

        [[-24.939003 , -37.779    , -4
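
The values printed by `preprocess_input` can be reproduced by hand. For VGG16, the preprocessing reorders the channels from RGB to BGR and subtracts the ImageNet mean of each channel, without rescaling. The sketch below applies this to a single pixel; the helper name is hypothetical, and the grey pixel value 190 is inferred from the first row of the output (86.061 + 103.939 = 190).

```python
IMAGENET_MEAN_BGR = [103.939, 116.779, 123.68]  # per-channel means, in BGR order

def vgg_preprocess_pixel(rgb):
    """Convert one RGB pixel (values 0-255) to zero-centred BGR."""
    bgr = rgb[::-1]                                   # RGB -> BGR
    return [c - m for c, m in zip(bgr, IMAGENET_MEAN_BGR)]

pixel = vgg_preprocess_pixel([190.0, 190.0, 190.0])  # a grey pixel
print(pixel)
```

Up to floating-point rounding this gives 86.061, 73.221 and 66.32, the first row of the array printed by `preprocess_input`.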

In [None]:
model = VGG16(include_top=True, weights='imagenet')

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels.h5


In [None]:
features = model.predict(x) # with include_top=True these are the 1000 ImageNet class probabilities, not intermediate features

In [None]:
decode_predictions(features)

[[('n03804744', 'nail', 0.86517066),
  ('n04208210', 'shovel', 0.053788487),
  ('n03481172', 'hammer', 0.011838953),
  ('n04456115', 'torch', 0.010744348),
  ('n03976657', 'pole', 0.009161635)]]

**Popular Deep learning architectures**


*   Popular convolutional networks
  * AlexNet
  * VGG
  * ResNet
  * DenseNet

**Generative models**
  * Autoencoders
  * Generative adversarial networks



In [None]:
from keras.applications.densenet import DenseNet121
from keras.applications.densenet import preprocess_input
from keras.preprocessing import image
import numpy as np
from keras.applications.densenet import decode_predictions

In [None]:
img = image.load_img('elephant.jpg', target_size=(224,224))
x = image.img_to_array(img)

In [None]:
x = np.expand_dims(x, axis=0)
x.shape

(1, 224, 224, 3)

In [None]:
x = preprocess_input(x)

In [None]:
model = DenseNet121(include_top=True,weights='imagenet')

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/densenet/densenet121_weights_tf_dim_ordering_tf_kernels.h5


In [None]:
pred = model.predict(x)

In [None]:
decode_predictions(pred)

[[('n03743016', 'megalith', 0.58345884),
  ('n03804744', 'nail', 0.082686745),
  ('n03388043', 'fountain', 0.07826182),
  ('n03498962', 'hatchet', 0.03506918),
  ('n04507155', 'umbrella', 0.023603663)]]

**Autoencoders**

*   Unsupervised representation learning
*   Dimensionality reduction 
*   Denoising





**Embedding example with Keras**

In [None]:
import tensorflow as tf

In [None]:
corpus = ['Je ne parviens pas à me connecter.', 'J\'ai un problème d\'authentification!', 'Le site ne fonctionne pas.']

In [None]:
tokenizer = tf.keras.preprocessing.text.Tokenizer()

In [None]:
tokenizer.fit_on_texts(corpus)

In [None]:
sequence = tokenizer.texts_to_sequences(corpus)

In [None]:
sequence

[[3, 1, 4, 2, 5, 6, 7], [8, 9, 10, 11], [12, 13, 1, 14, 2]]

In [None]:
X = tf.keras.preprocessing.sequence.pad_sequences(sequence, padding='post')

In [None]:
X
# Note: for a padded matrix like this one, the number of timesteps is the length of each row (here 7)

array([[ 3,  1,  4,  2,  5,  6,  7],
       [ 8,  9, 10, 11,  0,  0,  0],
       [12, 13,  1, 14,  2,  0,  0]], dtype=int32)
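
The indices and the padding shown in these outputs can be reproduced with a few lines of plain Python. This is a simplified sketch of what `Tokenizer` and `pad_sequences` do with their default settings (lowercasing, stripping the default filter characters, ranking words by frequency, index 0 reserved for padding); the function names are hypothetical.

```python
from collections import Counter

FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'  # default filter characters of the Keras Tokenizer

def tokenize(text):
    """Lowercase, replace filtered characters by spaces, split on whitespace."""
    table = str.maketrans({c: " " for c in FILTERS})
    return text.lower().translate(table).split()

def fit_word_index(texts):
    """Most frequent word gets index 1; ties keep their order of first appearance."""
    counts = Counter(w for t in texts for w in tokenize(t))
    ranked = sorted(counts, key=lambda w: -counts[w])
    return {w: i + 1 for i, w in enumerate(ranked)}

def pad_post(sequences, value=0):
    """Right-pad every sequence to the length of the longest one."""
    width = max(len(s) for s in sequences)
    return [s + [value] * (width - len(s)) for s in sequences]

corpus = ["Je ne parviens pas à me connecter.",
          "J'ai un problème d'authentification!",
          "Le site ne fonctionne pas."]
word_index = fit_word_index(corpus)
sequences = [[word_index[w] for w in tokenize(t)] for t in corpus]
padded = pad_post(sequences)
print(sequences)
print(padded)
```

The words "ne" and "pas" appear twice, so they receive indices 1 and 2. The apostrophe is not in the filter list, which is why "j'ai" and "d'authentification" stay single tokens.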

In [None]:
"""
Consider a vector of length 7 given as input to an embedding with output_dim = 100.
The output of the embedding is a matrix with 7 rows and 100 columns.
Each row of the output corresponds to one element of the input vector (the elements of the input vector are word indices).
"""

In [None]:
model = tf.keras.layers.Embedding(input_dim=100, output_dim=4, input_length=7)
# The embedding maps each word index to its own vector: index 3 to one vector, index 1 to another, ...
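
Conceptually, an embedding layer is a lookup table with `input_dim` rows and `output_dim` columns. The sketch below imitates it in plain Python with a randomly initialized table; the small uniform range is an assumption chosen for illustration, and in a real model the table is learned during training.

```python
import random

def make_embedding(input_dim, output_dim, seed=0):
    """Random lookup table: one vector of length output_dim per word index."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(output_dim)]
            for _ in range(input_dim)]

def embed(table, batch):
    """Replace every index in every sequence by its row of the table."""
    return [[table[idx] for idx in seq] for seq in batch]

table = make_embedding(input_dim=100, output_dim=4)
batch = [[3, 1, 4, 2, 5, 6, 7],
         [8, 9, 10, 11, 0, 0, 0],
         [12, 13, 1, 14, 2, 0, 0]]
out = embed(table, batch)
shape = (len(out), len(out[0]), len(out[0][0]))
print(shape)
```

A batch of 3 padded sequences of length 7 therefore becomes a (3, 7, 4) tensor. Every padding position (index 0) receives the same vector, and the same word index always maps to the same vector wherever it appears.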

In [None]:
model(X) # call the layer directly; Layer.apply is deprecated



<tf.Tensor: shape=(3, 7, 4), dtype=float32, numpy=
array([[[-0.04294347, -0.03398128, -0.0313089 , -0.03673046],
        [-0.00664299, -0.03758215,  0.02055342,  0.04978806],
        [-0.04625075, -0.04191046, -0.00271885, -0.04791454],
        [-0.02095013, -0.03207104,  0.02627982, -0.0284798 ],
        [ 0.03582582,  0.00028412,  0.00941451,  0.01918032],
        [-0.02664346, -0.01952302, -0.00522021,  0.04646684],
        [-0.02310633,  0.01790834, -0.00562872,  0.04556337]],

       [[ 0.03047632, -0.03357361, -0.03934053, -0.00377906],
        [ 0.00534134,  0.03101858, -0.00937765, -0.02037183],
        [ 0.0153167 ,  0.00015892,  0.01693371,  0.0445505 ],
        [-0.00190324,  0.0191715 , -0.02550181, -0.01562681],
        [ 0.04827042, -0.02102194,  0.03195012,  0.01977127],
        [ 0.04827042, -0.02102194,  0.03195012,  0.01977127],
        [ 0.04827042, -0.02102194,  0.03195012,  0.01977127]],

       [[ 0.00136634,  0.01827686, -0.04504254,  0.02620559],
        [-0.028

In [None]:
import numpy as np

In [None]:
y = np.random.randint(0, 2, 3) # 3 random binary labels (0 or 1), as expected by binary_crossentropy

In [None]:
inputs = tf.keras.layers.Input(shape=(None,)) # or tf.keras.layers.Input(shape=(timestamp,)); in this example timestamp = 7
x = tf.keras.layers.Embedding(input_dim=100, output_dim=4, input_length=7)(inputs)
x = tf.keras.layers.LSTM(units=10,activation='relu')(x)
output = tf.keras.layers.Dense(units=1, activation='sigmoid')(x)
model = tf.keras.models.Model(inputs, output)

# With an Embedding layer first, the Input shape is (None,): the Embedding takes a 2D matrix of word indices (batch, timesteps) and produces the 3D tensor that the LSTM expects
# If the LSTM comes directly after Input, the shape must be (None, nb_features), i.e. (timestamp, number_of_features), for example (timestamp, 1)

In [None]:
model.summary() # (None, None, 4) -> (batch_size, timesteps, embedding output_dim)

Model: "model_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_11 (InputLayer)        [(None, None)]            0         
_________________________________________________________________
embedding_12 (Embedding)     (None, None, 4)           400       
_________________________________________________________________
lstm_11 (LSTM)               (None, 10)                600       
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 11        
Total params: 1,011
Trainable params: 1,011
Non-trainable params: 0
_________________________________________________________________
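
The parameter counts in this summary can be verified by hand. The sketch below uses the standard formulas: an embedding stores one vector per index, an LSTM has four gates that each have input weights, recurrent weights and a bias, and a dense layer has one weight per input plus a bias for each unit. The helper names are hypothetical.

```python
def embedding_params(input_dim, output_dim):
    """One vector of length output_dim for each of the input_dim indices."""
    return input_dim * output_dim

def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """One weight per input plus one bias, for each unit."""
    return input_dim * units + units

counts = [embedding_params(100, 4), lstm_params(4, 10), dense_params(10, 1)]
total = sum(counts)
print(counts, total)
```

This gives 400, 600 and 11 parameters, for a total of 1,011, matching the summary.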


In [None]:
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

In [None]:
model.fit(X,y)



<tensorflow.python.keras.callbacks.History at 0x7f2ec44b0a90>

In [None]:
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Embedding(input_dim=100, output_dim=4, input_length=7))
model.add(tf.keras.layers.LSTM(units=10,activation='relu'))
model.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

In [None]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_15 (Embedding)     (None, 7, 4)              400       
_________________________________________________________________
lstm_14 (LSTM)               (None, 10)                600       
_________________________________________________________________
dense_6 (Dense)              (None, 1)                 11        
Total params: 1,011
Trainable params: 1,011
Non-trainable params: 0
_________________________________________________________________


In [None]:
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

In [None]:
model.fit(X, y)



<tensorflow.python.keras.callbacks.History at 0x7f2ec6cb0bd0>

**Word2Vec**

In [None]:
from gensim.models import Word2Vec

In [None]:
corpus = [['I', 'am', 'a', 'data', 'scientist'],
          ['This','is', 'a', 'malaware']]

In [None]:
model = Word2Vec(corpus, min_count=1)

In [None]:
print(model) # vocab = number of distinct words
             # size = dimension of each word vector

Word2Vec(vocab=8, size=100, alpha=0.025)


In [None]:
model.wv.vocab # gensim 3.x API; gensim 4 exposes the vocabulary as model.wv.key_to_index

{'I': <gensim.models.keyedvectors.Vocab at 0x7fe3e71bd290>,
 'This': <gensim.models.keyedvectors.Vocab at 0x7fe3e709b550>,
 'a': <gensim.models.keyedvectors.Vocab at 0x7fe3e71bd350>,
 'am': <gensim.models.keyedvectors.Vocab at 0x7fe3e71bd050>,
 'data': <gensim.models.keyedvectors.Vocab at 0x7fe3e71bd610>,
 'is': <gensim.models.keyedvectors.Vocab at 0x7fe3e709b210>,
 'malaware': <gensim.models.keyedvectors.Vocab at 0x7fe3e709bad0>,
 'scientist': <gensim.models.keyedvectors.Vocab at 0x7fe3e709b290>}

In [None]:
model.predict_output_word(['I', 'am', 'a', 'data']) # probability of each word being the center word for this context

[('a', 0.125),
 ('I', 0.125),
 ('am', 0.125),
 ('data', 0.125),
 ('scientist', 0.125),
 ('This', 0.125),
 ('is', 0.125),
 ('malaware', 0.125)]

In [None]:
model.wv['data'] # vector representation of 'data' (100 values)

array([ 2.9008936e-03,  1.4743499e-03,  5.1800738e-04, -3.8437792e-03,
       -1.7304494e-03,  3.1739241e-03, -3.7059486e-03,  2.6845699e-03,
        2.4646639e-03, -3.3242337e-03, -8.2613807e-04, -3.3727719e-03,
       -2.4857877e-03, -4.5956350e-03,  4.8846272e-03,  4.8293262e-03,
       -4.4398955e-03, -1.2457266e-03,  2.9497666e-03, -1.4369456e-03,
        2.7195660e-03, -2.9888127e-03, -3.9194138e-03,  3.8750728e-03,
        3.3346724e-03,  2.6045246e-03,  5.6973536e-04, -3.5752759e-03,
       -1.9478794e-03, -1.0066483e-03,  6.1319716e-04,  4.1683810e-03,
       -2.9652659e-03, -3.7466181e-03,  1.3305707e-03, -4.2706896e-03,
       -4.2390435e-05,  1.4002725e-03,  7.8253960e-04,  4.6119695e-03,
        1.1835456e-03,  4.9502426e-03, -1.5673185e-03, -2.3787894e-04,
       -2.4688540e-03, -4.1923248e-03,  4.5231888e-03, -3.6141539e-03,
        4.6088337e-03,  8.1892277e-04,  3.1642914e-03, -4.2730425e-03,
        1.5486550e-03, -1.6930433e-03,  1.6077273e-03, -4.2019146e-03,
      

In [None]:
model.wv.similar_by_word('data') # words closest to 'data'

[('a', 0.12006048858165741),
 ('malaware', 0.09992823004722595),
 ('am', 0.07582263648509979),
 ('This', 0.03445989266037941),
 ('is', 0.03211004659533501),
 ('I', 0.029867811128497124),
 ('scientist', -0.03452153131365776)]

In [None]:
model.wv.similarity('data', 'scientist') # cosine similarity between 'data' and 'scientist'

-0.03452153
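
The similarity score reported here is a cosine similarity: the dot product of the two word vectors divided by the product of their norms. A minimal plain-Python sketch, with made-up 2D vectors standing in for the 100-dimensional word vectors:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])
print(same_direction, orthogonal, opposite)
```

Values near 1 mean the vectors point in the same direction, values near 0 mean they are unrelated, and negative values mean they point in opposite directions. With a corpus of only two sentences the vectors are essentially random, which is why all the similarities above are close to 0.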