# <center><font color='blue'>SENTIMENT ANALYSIS: COVID</font></center>

## Table of contents
- [1 - Objectives](#1)
- [2 - Required libraries](#2)
- [3 - Loading and inspecting the data](#3)
- [4 - Data preprocessing](#4)
    - [4.1. - Missing data](#4.1)
    - [4.2. - Categorical data](#4.2)
    - [4.3. - Class balance](#4.3)
    - [4.4. - NLP-specific preprocessing](#4.4)
- [5 - Models](#5)
- [6 - Hyperparameter tuning](#6)
- [7 - Conclusions](#7)
- [8 - References](#8)

<a name="1"></a>
## 1. Objectives

Practice on a natural language processing problem.
<br>
Given a set of tweets, the goal is to classify the sentiment of each one as positive or negative.

<a name="2"></a>
## 2. Required libraries

In [1]:
# suppress TensorFlow info and warning messages
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' 

In [2]:
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow.keras import layers,callbacks,models,Sequential,losses
import seaborn as sns
from tensorflow import keras
from tensorflow.keras import backend as K
import random
import pandas as pd
from sklearn.metrics import classification_report, confusion_matrix
import csv
from datetime import datetime
from sklearn.model_selection import train_test_split
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import word_tokenize
import re

<a name="3"></a>
## 3. Loading and inspecting the data

We have two datasets, one for training and one for testing:

In [79]:
train_data_pandas = pd.read_csv('data/Corona_NLP_train.csv',encoding='latin-1')
test_data_pandas = pd.read_csv('data/Corona_NLP_test.csv',encoding='latin-1')

In [80]:
train_data_pandas.head()

| | UserName | ScreenName | Location | TweetAt | OriginalTweet | Sentiment |
|---|---|---|---|---|---|---|
| 0 | 3799 | 48751 | London | 16-03-2020 | @MeNyrbie @Phil_Gahan @Chrisitv https://t.co/i... | Neutral |
| 1 | 3800 | 48752 | UK | 16-03-2020 | advice Talk to your neighbours family to excha... | Positive |
| 2 | 3801 | 48753 | Vagabonds | 16-03-2020 | Coronavirus Australia: Woolworths to give elde... | Positive |
| 3 | 3802 | 48754 | | 16-03-2020 | My food stock is not the only one which is emp... | Positive |
| 4 | 3803 | 48755 | | 16-03-2020 | Me, ready to go at supermarket during the #COV... | Extremely Negative |


<a name="4"></a>
## 4. Data preprocessing

<a name="4.1"></a>
### 4.1. Missing data

In [81]:
print(f'Missing values (train):\n {train_data_pandas.isnull().sum()} \n')
print(f'Missing values (test):\n {test_data_pandas.isnull().sum()}')

Missing values (train):
 UserName            0
ScreenName          0
Location         8590
TweetAt             0
OriginalTweet       0
Sentiment           0
dtype: int64 

Missing values (test):
 UserName           0
ScreenName         0
Location         834
TweetAt            0
OriginalTweet      0
Sentiment          0
dtype: int64


There are no missing values in the columns we care about (OriginalTweet and Sentiment); only Location has gaps.

<a name="4.2"></a>
### 4.2. Categorical data

We will work with the OriginalTweet and Sentiment columns. First, let's look at the distinct values of Sentiment:

In [82]:
train_data_pandas['Sentiment'].unique()

array(['Neutral', 'Positive', 'Extremely Negative', 'Negative',
       'Extremely Positive'], dtype=object)

We will convert this column to numeric values. We do not need to distinguish between, say, positive and extremely positive, so we collapse the five labels into two classes: negative (0) and positive (1). Neutral tweets are treated as positive.

In [83]:
label_map = {'Extremely Negative':0,'Negative':0,'Neutral':1,'Positive':1,'Extremely Positive':1}
train_data_pandas['Sentiment'] = train_data_pandas['Sentiment'].map(label_map)
test_data_pandas['Sentiment'] = test_data_pandas['Sentiment'].map(label_map)

Let's check:

In [84]:
train_data_pandas['Sentiment'].unique()

array([1, 0])

<a name="4.3"></a>
### 4.3. Class balance

Let's see whether the classes are balanced.

In [85]:
train_data_pandas['Sentiment'].value_counts()

1    25759
0    15398
Name: Sentiment, dtype: int64

In [86]:
25759/(25759+15398)

0.625871662171684

The classes are moderately imbalanced: about 62.6% of the training tweets belong to the positive class.
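One common way to compensate for this kind of imbalance is to weight the loss per class. The notebook does not do this; the sketch below only shows how such weights could be derived from the counts above, using the usual "balanced" heuristic total / (n_classes * count). The helper name `balanced_class_weights` is hypothetical.

```python
# Class counts taken from the value_counts() output above
counts = {1: 25759, 0: 15398}

def balanced_class_weights(counts):
    """Weight each class by total / (n_classes * count), so rarer classes weigh more."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}

class_weight = balanced_class_weights(counts)

# Share of the majority class: the accuracy of always predicting "positive"
majority_share = max(counts.values()) / sum(counts.values())

print(class_weight)
print(f"majority-class baseline accuracy: {majority_share:.3f}")
```

A dictionary like this could be passed to Keras through the `class_weight` argument of `fit`. The majority share is also a useful reference point: any model should beat it on accuracy.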

<a name="4.4"> </a>
### 4.4. NLP-specific preprocessing

We will preprocess the text in OriginalTweet as follows:


- Remove stop words
- Remove some special characters, such as "@"
- Apply lemmatization


<b>Note:</b> Punctuation also has to be removed and the text lowercased and tokenized; TextVectorization will take care of that later.

We download the stop words and print them to take a look:

In [87]:
nltk.download('stopwords')
# View stopwords
print(stopwords.words('english'))

['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', "you're", "you've", "you'll", "you'd", 'your', 'yours', 'yourself', 'yourselves', 'he', 'him', 'his', 'himself', 'she', "she's", 'her', 'hers', 'herself', 'it', "it's", 'its', 'itself', 'they', 'them', 'their', 'theirs', 'themselves', 'what', 'which', 'who', 'whom', 'this', 'that', "that'll", 'these', 'those', 'am', 'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 'having', 'do', 'does', 'did', 'doing', 'a', 'an', 'the', 'and', 'but', 'if', 'or', 'because', 'as', 'until', 'while', 'of', 'at', 'by', 'for', 'with', 'about', 'against', 'between', 'into', 'through', 'during', 'before', 'after', 'above', 'below', 'to', 'from', 'up', 'down', 'in', 'out', 'on', 'off', 'over', 'under', 'again', 'further', 'then', 'once', 'here', 'there', 'when', 'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more', 'most', 'other', 'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same', 'so', 'than', '

[nltk_data] Downloading package stopwords to /home/marcos/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


We will also need "punkt" and "wordnet":

In [88]:
nltk.download('punkt')
nltk.download('wordnet')

[nltk_data] Downloading package punkt to /home/marcos/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to /home/marcos/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

Now we remove the stop words and apply lemmatization:

In [90]:
# Initialize the lemmatizer
lemmatizer = WordNetLemmatizer()

# stop words
stop_words = set(stopwords.words('english'))

# Remove stop words from a text and lemmatize the remaining (lowercased) words
def preprocess_text(text):
    words = word_tokenize(text)
    filtered_words = [lemmatizer.lemmatize(word.lower()) for word in words if word.lower() not in stop_words]
    return ' '.join(filtered_words)

# Apply the function to the 'OriginalTweet' column in both train and test
train_data_pandas['OriginalTweet'] = train_data_pandas['OriginalTweet'].apply(preprocess_text)
test_data_pandas['OriginalTweet'] = test_data_pandas['OriginalTweet'].apply(preprocess_text)

We also remove special characters such as @ and # (we keep others, such as "!", since they can carry meaning).

In [91]:
# Strip the "@" and "#" symbols (mentions, hashtags, e-mail addresses)
def preprocess_text2(text):
    return re.sub(r'[@#]', '', text)

# apply it
train_data_pandas['OriginalTweet'] = train_data_pandas['OriginalTweet'].apply(preprocess_text2)
test_data_pandas['OriginalTweet'] = test_data_pandas['OriginalTweet'].apply(preprocess_text2)

Let's see how the data looks now:

In [92]:
train_data_pandas['OriginalTweet']

0         menyrbie  phil_gahan  chrisitv http : //t.co/...
1        advice talk neighbour family exchange phone nu...
2        coronavirus australia : woolworth give elderly...
3        food stock one empty ... please , n't panic , ...
4        , ready go supermarket  covid19 outbreak . 'm ...
                               ...                        
41152    airline pilot offering stock supermarket shelf...
41153    response complaint provided citing covid-19 re...
41154    know itâs getting tough  kameronwilds rationi...
41155    wrong smell hand sanitizer starting turn ?  co...
41156     tartiicat well new/used rift going $ 700.00 a...
Name: OriginalTweet, Length: 41157, dtype: object

#### Train/validation split

In [93]:
# Split the data into training and validation sets
train_data, val_data = train_test_split(train_data_pandas, test_size=0.2, random_state=42)


#### Creating the datasets for TensorFlow

In [94]:
# Load the pandas DataFrame columns into tf.data.Dataset objects,
# keeping only the columns we need (text and label)
train_dataset = tf.data.Dataset.from_tensor_slices((train_data['OriginalTweet'].values, 
                                              train_data['Sentiment'].values))

validation_dataset = tf.data.Dataset.from_tensor_slices((val_data['OriginalTweet'].values, 
                                              val_data['Sentiment'].values))



In [95]:
# look at one training example
for example, label in train_dataset.take(1):
  print('text: ', example.numpy())
  print('label: ', label.numpy())
    

text:  b'unemployment claim made online virginia week : monday : 426 tuesday : 2,150 number going get bigger . http : //t.co/fueg2rl2dl'
label:  0


In [96]:
# Same for the test set

# Load the pandas DataFrame columns into a tf.data.Dataset object
test_dataset = tf.data.Dataset.from_tensor_slices((test_data_pandas['OriginalTweet'].values, 
                                                   test_data_pandas['Sentiment'].values))

We define the shuffle buffer size and the batch size:

In [97]:
BUFFER_SIZE = 10000
BATCH_SIZE = 64

In [98]:
train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
validation_dataset = validation_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
test_dataset = test_dataset.batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)

Let's look at a few examples and their labels:

In [99]:
for example, label in train_dataset.take(1):
  print(f'texts:  {example.numpy()[:3]}\n')
  print(f'labels: {label.numpy()[:3]}')

texts:  [b'seem forgotten socially distance grocery store , let ( fake ) cough serve reminder .  backup  socialdistancing'
 b'result http : //t.co/m2i4rdpdg5 second survey consumer behavior . see difference 11 day make ... http : //t.co/qpdb6dxaty'
 b'time  coronavirus liquidating everything amazon low price . guitar hanger http : //t.co/78r5y5mhmb morale patch uniform http : //t.co/uu7bakobdu boonie hat http : //t.co/964xoyvjll pond net http : //t.co/3bilq0ssq6']

labels: [0 1 0]


Next we create and adapt a <a href="https://www.tensorflow.org/api_docs/python/tf/keras/layers/TextVectorization" target='_blank'>TextVectorization</a> layer, which strips punctuation, lowercases the text and tokenizes it:

In [100]:
VOCAB_SIZE = 10000
max_length = 45 # max length our sequences will be (e.g. how many words from a Tweet does a model see?)


encoder = tf.keras.layers.TextVectorization(max_tokens=VOCAB_SIZE,
                                    output_mode="int",
                                    output_sequence_length=max_length)

# Fit the text vectorizer instance to the training data using the adapt() method
encoder.adapt(train_dataset.map(lambda text, label: text))


The first 20 tokens of the vocabulary are shown below:

In [101]:
vocab = np.array(encoder.get_vocabulary())
vocab[:20]

array(['', '[UNK]', 'http', 'coronavirus', 'covid19', 'price', 'store',
       'supermarket', 'food', 'grocery', 'people', 'amp', 'consumer',
       '19', 'covid', 'shopping', 's', 'online', 'need', 'time'],
      dtype='<U27')

Now that the vocabulary is set, the layer can encode text into indices. Because we set output_sequence_length, every sequence is padded with 0s (or truncated) to a fixed length of max_length = 45 tokens.

Let's look at an example:

In [102]:
encoded_example = encoder(example)[:3].numpy()
encoded_example

array([[1293, 3320, 3087,  375,    9,    6,  172,  835,  738, 1219, 1479,
           1,   99,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0],
       [ 354,    2,    1,  673,  693,   12,  209,   64, 1416, 1276,   37,
          54,    2,    1,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0],
       [  19,    3,    1,  210,  201,  134,    5,    1,    1,    2,    1,
        9555, 5638, 5172,    2,    1,    1, 3978,    2,    1,    1, 3298,
           2,    1,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0,    0,    0,    0,    0,    0,    0,    0,    0,    0,    0,
           0]])

As expected, each sequence is padded with 0s so that it always has length 45.
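The fixed-length behaviour can be mimicked in plain Python. This is only an illustration of the idea, not how TextVectorization is implemented; `pad_or_truncate` is a hypothetical helper and the 60-token sequence is made up.

```python
def pad_or_truncate(token_ids, length, pad_id=0):
    """Return exactly `length` ids: cut long sequences, right-pad short ones."""
    clipped = list(token_ids)[:length]
    return clipped + [pad_id] * (length - len(clipped))

short = [1293, 3320, 3087, 375]    # first ids of the encoded example above
long = list(range(1, 61))          # hypothetical 60-token sequence

padded = pad_or_truncate(short, 45)
cut = pad_or_truncate(long, 45)

print(len(padded), padded[:6])
print(len(cut), cut[-1])
```

Both results have length 45: the short sequence is filled with 0s on the right, while the long one loses everything after its 45th token.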

With this configuration the process is not fully reversible (the mapping is not one-to-one): punctuation is dropped and every out-of-vocabulary word maps to the same [UNK] token.

In [103]:
for n in range(3):
  print("Original: ", example[n].numpy())
  print("Round-trip: ", " ".join(vocab[encoded_example[n]]))
  print()

Original:  b'seem forgotten socially distance grocery store , let ( fake ) cough serve reminder .  backup  socialdistancing'
Round-trip:  seem forgotten socially distance grocery store let fake cough serve reminder [UNK] socialdistancing                                

Original:  b'result http : //t.co/m2i4rdpdg5 second survey consumer behavior . see difference 11 day make ... http : //t.co/qpdb6dxaty'
Round-trip:  result http [UNK] second survey consumer behavior see difference 11 day make http [UNK]                               

Original:  b'time  coronavirus liquidating everything amazon low price . guitar hanger http : //t.co/78r5y5mhmb morale patch uniform http : //t.co/uu7bakobdu boonie hat http : //t.co/964xoyvjll pond net http : //t.co/3bilq0ssq6'
Round-trip:  time coronavirus [UNK] everything amazon low price [UNK] [UNK] http [UNK] morale patch uniform http [UNK] [UNK] hat http [UNK] [UNK] net http [UNK]                     



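Note the large number of unknown tokens ([UNK]); in these examples they come mostly from URL fragments and rare words that fall outside the 10,000-token vocabulary.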
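Since many of these [UNK] tokens are pieces of shortened links, one option (not applied in this notebook) would be to drop URLs from the raw tweet before calling preprocess_text, because word_tokenize splits a link into pieces such as `http`, `:` and `//t.co/...`. The sketch below assumes links carry no sentiment information; `strip_urls` and the sample tweet are hypothetical.

```python
import re

# Matches http:// or https:// followed by any run of non-space characters
URL_PATTERN = re.compile(r'https?://\S+')

def strip_urls(text):
    """Remove http(s) links and collapse the whitespace left behind."""
    without_urls = URL_PATTERN.sub(' ', text)
    return ' '.join(without_urls.split())

# Made-up raw tweet in the style of the dataset
raw = "Guitar hanger https://t.co/78r5y5mhmb morale patch https://t.co/uu7bakobdu"
cleaned = strip_urls(raw)
print(cleaned)
```

Applied to the raw OriginalTweet column, a step like this would free vocabulary slots and sequence positions that are currently spent on link fragments.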

Finally, we create an embedding layer.

In [104]:

embedding = tf.keras.layers.Embedding(input_dim=VOCAB_SIZE, # set the input shape; size of our vocabulary
                                 output_dim=128, # set the size of the embedding vector
                                 embeddings_initializer="uniform", # default, initialize embedding vectors randomly
                                 input_length=max_length # how long is each input
                             )

embedding

<keras.layers.core.embedding.Embedding at 0x7f0abc6a1630>

<a name="5"> </a>
## 5. Models

We will try several models.

...





In [151]:
# store the results so we can compare them later

results = []


In [105]:
INPUT_SHAPE=(1,)

<a name="5.1"> </a>
### Model 1

In [106]:

def build_model_1(input_shape):
    inputs = layers.Input(shape=input_shape, dtype=tf.string) # inputs are 1-dimensional strings
    x = encoder(inputs) # turn the input text into numbers 
    x = embedding(x)
    x = tf.keras.layers.GlobalAveragePooling1D()(x)
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)
    model = tf.keras.Model(inputs, outputs, name="model_1_dense") # construct the model
    return model


model_1 = build_model_1(INPUT_SHAPE)



In [107]:
model_1.summary()

Model: "model_1_dense"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_2 (InputLayer)        [(None, 1)]               0         
                                                                 
 text_vectorization_1 (TextV  (None, 45)               0         
 ectorization)                                                   
                                                                 
 embedding_1 (Embedding)     (None, 45, 128)           1280000   
                                                                 
 global_average_pooling1d_1   (None, 128)              0         
 (GlobalAveragePooling1D)                                        
                                                                 
 dense_1 (Dense)             (None, 1)                 129       
                                                                 
Total params: 1,280,129
Trainable params: 1,280,129
N

In [146]:
# Compile model
model_1.compile(loss="binary_crossentropy",
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['accuracy', 'Precision','Recall'])

# Fit the model
history_1 = model_1.fit(train_dataset,
                        #train_labels, 
                        epochs=5,
                        validation_data=validation_dataset)
                        #callbacks=[create_tensorboard_callback(dir_name=SAVE_DIR,
                        #                                       experiment_name="model_1_dense")])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [147]:
# evaluate
score1 = model_1.evaluate(test_dataset)



In [150]:
f"Accuracy was {round(score1[1],2)*100}%, precision {round(score1[2],2)*100}% and recall {round(score1[3],2)*100}%"

'Accuracy was 83.0%, precision 83.0% and recall 89.0%'

We store the results so we can compare the models later:


In [161]:

model_1_results = {
    'name': 'Model 1',
    'accuracy':score1[1],
    'precision':score1[2],
    'recall':score1[3],
    'f1-score': (2*(score1[2]*score1[3]))/(score1[2]+score1[3])
}


results.append(model_1_results)


[{'name': 'Model 1',
  'accuracy': 0.829383909702301,
  'precision': 0.8256762623786926,
  'recall': 0.8882216811180115,
  'f1-score': 0.85580773419097}]
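The F1 score stored above is the harmonic mean of precision and recall. The same expression is repeated for every model, so a small helper could make it less error-prone and guard against the case where both metrics are zero, which would otherwise raise a ZeroDivisionError. The helper name below is hypothetical; the inputs are the Model 1 values printed above.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall reported for Model 1 above
f1 = f1_from_precision_recall(0.8256762623786926, 0.8882216811180115)
print(round(f1, 4))
```

Because the harmonic mean is pulled towards the smaller of the two values, F1 rewards models that keep precision and recall both reasonably high.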

#### Predictions

In [162]:
"""
At inference time the input must first go through the same preprocessing
(here, preprocess_text and preprocess_text2). It does not need to go through
the text vectorization and embedding layers separately, since they are part of the model.
"""


# Input text to run a prediction on
input_text = "this is very, very positive"


# preprocessing
input_text = preprocess_text(input_text)
input_text = preprocess_text2(input_text)


# prediction
pred = model_1.predict(np.array([input_text]))

print(f"pred: {pred}")


# Convert the output into a binary prediction (0 or 1)
binary_prediction = 1 if pred[0, 0] > 0.5 else 0

# Print the prediction
print("Prediction:", binary_prediction)


pred: [[0.9958777]]
Prediction: 1


<a name="5.2"> </a>
### Model 2: LSTM


Typical architecture of an RNN:



<img src="images/arq.png" width=80%>






In [182]:
def build_model_2(input_shape):
    inputs = layers.Input(shape=input_shape, dtype="string")
    x = encoder(inputs) # text vectorizer
    x = embedding(x)
    print(f"After embedding: {x.shape}")
    # x = layers.LSTM(64, activation="tanh", return_sequences=True)(x) # use return_sequences=True if you want to stack recurrent layers 
    # print(f"After LSTM cell with return_sequences=True: {x.shape}")
    x = layers.LSTM(64, activation="tanh")(x)
    print(f"After LSTM cell: {x.shape}")
    x = layers.Dense(64, activation="relu")(x) # optional dense layer to have on top of LSTM layer
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model = tf.keras.Model(inputs, outputs, name="model_2_LSTM")
    return model
    

model_2 = build_model_2(INPUT_SHAPE)
    


After embedding: (None, 45, 128)
After LSTM cell: (None, 64)


In [183]:
model_2.summary()

Model: "model_2_LSTM"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_5 (InputLayer)        [(None, 1)]               0         
                                                                 
 text_vectorization_1 (TextV  (None, 45)               0         
 ectorization)                                                   
                                                                 
 embedding_1 (Embedding)     (None, 45, 128)           1280000   
                                                                 
 lstm_1 (LSTM)               (None, 64)                49408     
                                                                 
 dense_3 (Dense)             (None, 64)                4160      
                                                                 
 dense_4 (Dense)             (None, 1)                 65        
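The parameter counts in this summary can be reproduced by hand, which is a handy sanity check on the layer shapes. The helper functions below are hypothetical and use the standard formulas for embedding, LSTM and dense layers, with the sizes from this notebook (vocabulary 10000, embedding dimension 128, 64 LSTM units).

```python
def embedding_params(vocab_size, embed_dim):
    # one embed_dim-sized vector per vocabulary entry
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # 4 gates (input, forget, cell, output), each with
    # input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # weight matrix plus one bias per unit
    return input_dim * units + units

print("embedding:", embedding_params(10000, 128))
print("lstm:     ", lstm_params(128, 64))
print("dense:    ", dense_params(64, 64))
print("output:   ", dense_params(64, 1))
```

These match the 1,280,000, 49,408, 4,160 and 65 parameters listed in the summary, and show that the embedding table accounts for the vast majority of the weights.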
                                                      

In [184]:
# Compile model
model_2.compile(loss="binary_crossentropy",
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['accuracy', 'Precision','Recall'])

# Fit the model
history_2 = model_2.fit(train_dataset,
                        epochs=5,
                        validation_data=validation_dataset)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [185]:
# evaluate
score2 = model_2.evaluate(test_dataset)



In [188]:
model_2_results = {
    'name': 'Model 2',
    'accuracy':score2[1],
    'precision':score2[2],
    'recall':score2[3],
    'f1-score': (2*(score2[2]*score2[3]))/(score2[2]+score2[3])
}


results.append(model_2_results)


<a name="5.3"> </a>
### Model 3: GRU

In [191]:
def build_model_3(input_shape):
    # Build an RNN using the GRU cell
    inputs = layers.Input(shape=input_shape, dtype="string")
    x = encoder(inputs)
    x = embedding(x)
    # x = layers.GRU(64, activation="tanh", return_sequences=True)(x) # return_sequences=True is required for stacking recurrent cells
    # print(x.shape)
    x = layers.GRU(64, activation="tanh")(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model= tf.keras.Model(inputs, outputs, name="model_3_GRU")
    return model


model_3 = build_model_3(INPUT_SHAPE)


In [192]:
# Compile model
model_3.compile(loss="binary_crossentropy",
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['accuracy', 'Precision','Recall'])

# Fit the model
history_3 = model_3.fit(train_dataset,
                        epochs=5,
                        validation_data=validation_dataset)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [193]:
# evaluate
score3 = model_3.evaluate(test_dataset)



In [194]:
model_3_results = {
    'name': 'Model 3',
    'accuracy':score3[1],
    'precision':score3[2],
    'recall':score3[3],
    'f1-score': (2*(score3[2]*score3[3]))/(score3[2]+score3[3])
}


results.append(model_3_results)

<a name="5.4"> </a>
### Model 4: Bidirectional RNN

<img src="images/model4.png">

In [207]:
def build_model_4(input_shape):
    inputs = layers.Input(shape=input_shape, dtype="string")
    x = encoder(inputs)
    x = embedding(x)
    # x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x) # return_sequences=True required for stacking RNN layers
    x = layers.Bidirectional(layers.LSTM(64))(x)
    x = layers.Dense(64, activation='relu')(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model = tf.keras.Model(inputs, outputs, name="model_4_bidirectional")
    return model

model_4 = build_model_4(INPUT_SHAPE)
    
    

In [208]:
# Compile model
model_4.compile(loss="binary_crossentropy",
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['accuracy', 'Precision','Recall'])

# Fit the model
history_4 = model_4.fit(train_dataset,
                        epochs=5,
                        validation_data=validation_dataset)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [209]:
# evaluate
score4 = model_4.evaluate(test_dataset)



In [202]:
model_4_results = {
    'name': 'Model 4',
    'accuracy':score4[1],
    'precision':score4[2],
    'recall':score4[3],
    'f1-score': (2*(score4[2]*score4[3]))/(score4[2]+score4[3])
}


results.append(model_4_results)

<a name="5.5"> </a>
### Model 5: Stacked layers

<img src='images/model5.png'>

In [214]:

# the text vectorization layer and the dense layer are the ones shown in the figure above

def build_model_5(input_shape,name):
    inputs = layers.Input(shape=input_shape, dtype='string')
    x = encoder(inputs)
    x = embedding(x)
    x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
    x = layers.Bidirectional(layers.LSTM(32))(x)
    x = layers.Dense(64, activation='relu')(x)
    # dropout?
    outputs = layers.Dense(1, activation='sigmoid')(x)
    model = tf.keras.Model(inputs, outputs, name=name)
    return model
    

model_5 = build_model_5(INPUT_SHAPE, 'model_5')
    


In [215]:
# Compile model
model_5.compile(loss="binary_crossentropy",
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['accuracy', 'Precision','Recall'])

# Fit the model
history_5 = model_5.fit(train_dataset,
                        epochs=5,
                        validation_data=validation_dataset)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [216]:
# evaluate
score5 = model_5.evaluate(test_dataset)



In [217]:
model_5_results = {
    'name': 'Model 5',
    'accuracy':score5[1],
    'precision':score5[2],
    'recall':score5[3],
    'f1-score': (2*(score5[2]*score5[3]))/(score5[2]+score5[3])
}


results.append(model_5_results)

<a name="5.6"> </a>
### Model 6: Conv1D

We've seen before how convolutional neural networks can be used for images but they can also be used for text.

Previously we've used the layer Conv2D (which is great for images with (height, width)).

But if we want to use convolutional layers for sequences (e.g. text) we need to use Conv1D: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv1D

For more of a deep dive into what goes on behind the scenes in a CNN for text (or sequences) see the paper: https://arxiv.org/abs/1809.08037
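To make the idea concrete, here is a minimal pure-Python sketch of what a single Conv1D filter computes over a sequence of token embeddings, assuming no padding and a stride of 1. The function name `conv1d_valid` and all numbers are made up for illustration; a real Conv1D layer learns many such filters and applies an activation.

```python
def conv1d_valid(sequence, kernel, bias=0.0):
    """Slide `kernel` over `sequence` (lists of equal-width vectors), no padding, stride 1."""
    k = len(kernel)
    outputs = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        total = bias
        # dot product between the window of embeddings and the filter weights
        for vec, weights in zip(window, kernel):
            total += sum(v * w for v, w in zip(vec, weights))
        outputs.append(total)
    return outputs

# Toy input: 6 tokens with 2-dimensional embeddings (made-up numbers)
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 1.0], [1.0, 2.0]]
# One filter spanning 3 consecutive tokens
kernel = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

feature_map = conv1d_valid(tokens, kernel)
print(len(feature_map), feature_map)
```

The filter looks at 3 neighbouring tokens at a time, so a sequence of length 6 yields 6 - 3 + 1 = 4 outputs. In text models each filter therefore acts as a detector for short n-gram-like patterns.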