## BERT 

BERT (Bidirectional Encoder Representations from Transformers) is a language representation model designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with just one additional output layer to build models for a wide range of tasks, such as question answering, language inference, and text classification.

For more information, see the following links:

- https://github.com/google-research/bert
- https://huggingface.co/bert-base-multilingual-cased


The code below is adapted from Koksal, A. (2020), available at:

- https://github.com/akoksal/BERT-Sentiment-Analysis-Turkish


### 0. Install packages

#### 0.1 Install PyTorch

PyTorch is a Python-based library for numerical computation with tensors. It can also run on a GPU to speed up the calculations.

PyTorch offers a simple interface for building neural networks while working directly with tensors, without needing a higher-level library such as Keras for Theano or TensorFlow.

PyTorch supports execution on graphics cards (GPUs). Internally it uses CUDA, an API developed by NVIDIA that connects the CPU to the GPU.

In [2]:
conda install pytorch torchvision -c pytorch

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: C:\Users\jilli\anaconda3

  added / updated specs:
    - pytorch
    - torchvision


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    pytorch-1.7.0              |py3.8_cuda110_cudnn8_0      1003.4 MB  pytorch
    ------------------------------------------------------------
                                           Total:      1003.4 MB

The following NEW packages will be INSTALLED:

  torchvision        pytorch/win-64::torchvision-0.8.1-py38_cu110

The following packages will be SUPERSEDED by a higher-priority channel:

  pytorch                                           PyTorch --> pytorch




To speed up processing, a GPU can be used through CUDA (Compute Unified Device Architecture), a parallel computing platform that includes a compiler and a set of development tools created by NVIDIA. However, because this computer has no GPU, all the work is done on the CPU.

For more information on "Working with GPU packages", see:

- https://docs.anaconda.com/anaconda/user-guide/tasks/gpu-packages/

The following cells confirm that the computer only has a CPU available.

In [4]:
import torch
# check whether a GPU or the CPU is being used
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
print()

#Additional Info when using cuda
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
    print('Cached:   ', round(torch.cuda.memory_reserved(0)/1024**3,1), 'GB')

Using device: cpu



In [7]:
import torch
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(torch._C._cuda_getCompiledVersion())
print(device)

11000
cpu


In [4]:
# count how many GPUs the computer has for parallel processing
torch.cuda.device_count()

0

#### 0.2 Install TensorFlow

TensorFlow is a general-purpose machine learning library, most popular for deep learning applications.

In [9]:
pip install tensorflow

Collecting tensorflow
Note: you may need to restart the kernel to use updated packages.
  Using cached tensorflow-2.3.1-cp38-cp38-win_amd64.whl (342.5 MB)
Processing c:\users\jilli\appdata\local\pip\cache\wheels\a0\16\9c\5473df82468f958445479c59e784896fa24f4a5fc024b0f501\termcolor-1.1.0-py3-none-any.whl
Collecting keras-preprocessing<1.2,>=1.1.1
  Using cached Keras_Preprocessing-1.1.2-py2.py3-none-any.whl (42 kB)
Collecting protobuf>=3.9.2
  Using cached protobuf-3.13.0-py2.py3-none-any.whl (438 kB)
Collecting grpcio>=1.8.6
  Using cached grpcio-1.33.2-cp38-cp38-win_amd64.whl (2.7 MB)
Collecting astunparse==1.6.3
  Using cached astunparse-1.6.3-py2.py3-none-any.whl (12 kB)
Collecting tensorboard<3,>=2.3.0
  Using cached tensorboard-2.3.0-py3-none-any.whl (6.8 MB)
Collecting opt-einsum>=2.3.2
  Using cached opt_einsum-3.3.0-py3-none-any.whl (65 kB)
Collecting tensorflow-estimator<2.4.0,>=2.3.0
  Using cached tensorflow_estimator-2.3.0-py2.py3-none-any.whl (459 kB)
Collecting google-past

In [14]:
pip install --upgrade tensorflow

Requirement already up-to-date: tensorflow in c:\users\jilli\anaconda3\lib\site-packages (2.3.1)
Note: you may need to restart the kernel to use updated packages.


#### 0.3 Install Keras

In [10]:
pip install Keras

Note: you may need to restart the kernel to use updated packages.


#### 0.4 Install transformers

Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for natural language understanding, with more than 32 pretrained models in over 100 languages and deep interoperability between TensorFlow 2.0 and PyTorch.

- https://huggingface.co/transformers/


In [15]:
pip install transformers

Collecting transformers
  Downloading transformers-3.4.0-py3-none-any.whl (1.3 MB)
Collecting sentencepiece!=0.1.92
  Downloading sentencepiece-0.1.94-cp38-cp38-win_amd64.whl (1.2 MB)
Collecting sacremoses
  Downloading sacremoses-0.0.43.tar.gz (883 kB)
Collecting tokenizers==0.9.2
  Downloading tokenizers-0.9.2-cp38-cp38-win_amd64.whl (1.9 MB)
Building wheels for collected packages: sacremoses
  Building wheel for sacremoses (setup.py): started
  Building wheel for sacremoses (setup.py): finished with status 'done'
  Created wheel for sacremoses: filename=sacremoses-0.0.43-py3-none-any.whl size=893262 sha256=14807fba660192285eef4740e267d25445c93ca52eadd9801ac58e070ba07163
  Stored in directory: c:\users\jilli\appdata\local\pip\cache\wheels\7b\78\f4\27d43a65043e1b75dbddaa421b573eddc67e712be4b1c80677
Successfully built sacremoses
Installing collected packages: sentencepiece, sacremoses, tokenizers, transformers
Successfully installed sacremoses-0.0.43 sentencepiece-0.1.94 tokenizers-0.9

### 1. Load libraries

In [34]:
import json
import random
import warnings
from datetime import datetime
import pandas as pd
from pandas import DataFrame
import re
import emoji
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
import torch
from nltk.stem import SnowballStemmer
from nltk import TweetTokenizer
from sklearn.model_selection import train_test_split 

from keras.callbacks import EarlyStopping
from keras.layers import Dense, Dropout
from keras.models import Sequential
from keras.optimizers import Adagrad
from sklearn.decomposition import PCA
from sklearn.metrics import classification_report, f1_score
from sklearn.utils import class_weight
from tqdm import tqdm
from transformers import AutoModel, AutoTokenizer

### 2. Dataset

#### 2.1 Import data

In [22]:
train= pd.read_csv(r'C:\Users\jilli\Desktop\TFM\2\train.csv')
train[1:2]

Unnamed: 0,Tweet_Id,Tweet_User_Id,Tweet_User,Text,Retweets,Favorites,Replies,Datetime,hashtags,Pais,Tweet_Source,lang,clean_text,clean_text2,Ind_emoji,clean_text3,Ind_Posi,Ind_Nega,Target,text_limpio
1,1243708742161633280,1113483741765222401,MJosealf,Pero que falta de criterio!!!! Y esto está pas...,0,0,0,2020-03-28 01:17:25+00:00,,Chi,Twitter for iPhone,es,{'🤬': 'face with symbols on mouth'},Pero que falta de criterio!!!! Y esto está pas...,1,['face with symbols on mouth'],0,1,N,pero que falta de criterio y esto está pasando...


In [23]:
train  = DataFrame(train,columns=["Tweet_Id", "Text","Datetime", "Pais","Target"])
train

Unnamed: 0,Tweet_Id,Text,Datetime,Pais,Target
0,1249382391120048129,"En esta cuarentena, mi psicólogo me ha dicho q...",2020-04-12 17:02:29+00:00,Per,N
1,1243708742161633280,Pero que falta de criterio!!!! Y esto está pas...,2020-03-28 01:17:25+00:00,Chi,N
2,1239277368986009600,"Gracias Lore, acá todavía no es tan severa la ...",2020-03-15 19:48:44+00:00,Arg,P
3,1235980525694783488,🇵🇪 Casos confirmados: #Dengue: 8221 (18 ☠️) Se...,2020-03-06 17:28:15+00:00,Per,N
4,1242148777357705216,Yo si es que digo que ustedes aunque no conozc...,2020-03-23 17:58:41+00:00,Col,P
...,...,...,...,...,...
22269,1285048010687291392,La pandemia me ha quitado tanto momentos lindo...,2020-07-20 03:04:55+00:00,Col,P
22270,1249688695256403969,"GRÁBATE MIENTRAS VUELAS, SIEMPRE Y CUANDO NO E...",2020-04-13 13:19:37+00:00,Chi,P
22271,1251896052279582721,Lo bueno de la cuarentena es que ya no hay que...,2020-04-19 15:30:52+00:00,Arg,P
22272,1276917451557568515,Día 104 de cuarentena: #BuenSabado y muchas gr...,2020-06-27 16:36:59+00:00,Arg,P


#### 2.2 Data preprocessing

In [24]:
# define the cleaning functions

def clean(text):
    text = text.lower()
    text = re.sub(r'https?://\S+', 'link', text)  # normalize URLs
    text = re.sub(r'www\.\S+', 'link', text)  # normalize www addresses
    text = re.sub(emoji.get_emoji_regexp(), r"", text)  # strip emoji (get_emoji_regexp requires emoji < 2.0)
    text = re.sub(r'@\S+', 'usuario', text)  # normalize user mentions
    text = re.sub(r'#\S+', 'hashtag', text)  # normalize hashtags
    text = re.sub(r'– at.+$', '', text)  # remove the trailing location suffix
    text = re.sub(r'\b(?=\w*[j])[aeiouj]{4,}\b', 'risas', text, flags=re.IGNORECASE)  # normalize laughter (jajaja, jejeje, ...)
    text = re.sub(r'\b(juas+|lol+)\b', 'risas', text, flags=re.IGNORECASE)  # normalize laughter (juas, lol)
    text = re.sub(r'\b(ja|jaa)\b', 'risas', text, flags=re.IGNORECASE)  # normalize laughter (ja, jaa)
    text = re.sub(r'[^\w\s]', '', text)  # remove punctuation
    text = re.sub(r'(.)\1{2,}', r'\1\1', text, flags=re.IGNORECASE)  # collapse runs of 3+ repeated characters to 2
    text = re.sub(r'\d+', '', text)  # remove digits
    text = text.replace("_", "")  # remove underscores
    text = re.sub(r' +', ' ', text)  # collapse repeated spaces
    text = text.strip()
    return text

# slang: (pattern, replacement) pairs
jerga = [('d','de'), ('[qk]','que'), ('xo','pero'),('fav','favorito'), ('xa', 'para'), ('[xp]q','porque'),('es[qk]', 'es que'),
              ('fvr','favor'),('(xfa|xf|pf|plis|pls|porfa)', 'por favor'), ('dnd','donde'), ('tb', 'también'),('ud','usted'),
             ('uds','ustedes'),('(ctm|alv|hdp)','insulto'), ('sr','señor'),('(fds|finde)','fin de semana'),('app','aplicación'),
         ('(La concha de tu madre|conchasumadres|reconchadesumadre|conchatumadre|la concha de la madre|la concha de su madre|La concha bien de su madre|la concha de tu hermana)','insulto'),
              ('(tq|tk)', 'te quiero'), ('(tqm|tkm)', 'te quiero mucho'),('bb','bebé'), ('x','por'), (r'\+','mas')]
def normalize_jergas(message):
    for s,t in jerga:
        message = re.sub(r'\b{0}\b'.format(s), t, message, flags=re.IGNORECASE)
    return message   

# vowels: replace accented vowels with their plain form
vocales= [('á','a'), ('é','e'), ('í','i'), ('ó','o'), ('ú','u'), ('ü','u')]
def norma_vocales(message):
    for s,t in vocales:
        message = re.sub(r'{0}'.format(s), t, message, flags=re.IGNORECASE)
    return message

# Stemming
_stemmer = SnowballStemmer('spanish')
_tokenizer = TweetTokenizer().tokenize
    
def stem(message):
    message = ' '.join(_stemmer.stem(w) for w in _tokenizer(message))
    return message
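As a quick check of what these rules do, the sketch below reapplies three of them to made-up strings using only the standard `re` module. The helper names (`normalize_laughter`, `collapse_repeats`, `normalize_slang`), the reduced slang list, and the sample strings are illustrative assumptions, not part of the notebook.

```python
import re

def normalize_laughter(text):
    # Same three laughter patterns used in clean()
    text = re.sub(r'\b(?=\w*[j])[aeiouj]{4,}\b', 'risas', text, flags=re.IGNORECASE)
    text = re.sub(r'\b(juas+|lol+)\b', 'risas', text, flags=re.IGNORECASE)
    text = re.sub(r'\b(ja|jaa)\b', 'risas', text, flags=re.IGNORECASE)
    return text

def collapse_repeats(text):
    # Runs of three or more identical characters are shortened to two
    return re.sub(r'(.)\1{2,}', r'\1\1', text)

# Reduced slang list, for illustration only
slang = [('d', 'de'), ('[qk]', 'que'), ('xo', 'pero'), ('[xp]q', 'porque'), ('x', 'por')]

def normalize_slang(message):
    # \b anchors make each pattern match whole tokens only
    for pattern, replacement in slang:
        message = re.sub(r'\b{0}\b'.format(pattern), replacement, message, flags=re.IGNORECASE)
    return message

laugh = normalize_laughter('jajaja que bueno lol')
repeats = collapse_repeats('holaaaa!!!!')
slang_out = normalize_slang('xq no vienes, xo q pasa')

print(laugh)      # risas que bueno risas
print(repeats)    # holaa!!
print(slang_out)  # porque no vienes, pero que pasa
```

Because every slang pattern is wrapped in `\b...\b`, the single-letter rules (`d`, `x`, `q`) only fire on isolated tokens and leave letters inside longer words untouched.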

In [25]:
# apply the preprocessing steps
train["text_limpio"]=train["Text"].to_numpy() # copy the Text column
train["text_limpio"]=train["text_limpio"].apply(lambda tweet: normalize_jergas(tweet)) # normalize slang
train["text_limpio"]=train["text_limpio"].apply(lambda tweet: norma_vocales(tweet)) # strip accents
train["text_limpio"]=train["text_limpio"].apply(lambda tweet: clean(tweet)) # cleaning
train["text_limpio"]=train["text_limpio"].apply(lambda tweet: stem(tweet)) # stemming

train.head()

Unnamed: 0,Tweet_Id,Text,Datetime,Pais,Target,text_limpio
0,1249382391120048129,"En esta cuarentena, mi psicólogo me ha dicho q...",2020-04-12 17:02:29+00:00,Per,N,en esta cuarenten mi psicolog me ha dich que l...
1,1243708742161633280,Pero que falta de criterio!!!! Y esto está pas...,2020-03-28 01:17:25+00:00,Chi,N,per que falt de criteri y esto esta pas en muc...
2,1239277368986009600,"Gracias Lore, acá todavía no es tan severa la ...",2020-03-15 19:48:44+00:00,Arg,P,graci lor aca todavi no es tan sever la cuaren...
3,1235980525694783488,🇵🇪 Casos confirmados: #Dengue: 8221 (18 ☠️) Se...,2020-03-06 17:28:15+00:00,Per,N,cas confirm hashtag selv hashtag lim
4,1242148777357705216,Yo si es que digo que ustedes aunque no conozc...,2020-03-23 17:58:41+00:00,Col,P,yo si es que dig que usted aunqu no conozc a l...


In [27]:
# extract the fields to be used
df_BERT = DataFrame(train, columns= ['Tweet_Id','text_limpio','Target'])
df_BERT

Unnamed: 0,Tweet_Id,text_limpio,Target
0,1249382391120048129,en esta cuarenten mi psicolog me ha dich que l...,N
1,1243708742161633280,per que falt de criteri y esto esta pas en muc...,N
2,1239277368986009600,graci lor aca todavi no es tan sever la cuaren...,P
3,1235980525694783488,cas confirm hashtag selv hashtag lim,N
4,1242148777357705216,yo si es que dig que usted aunqu no conozc a l...,P
...,...,...,...
22269,1285048010687291392,la pandemi me ha quit tant moment lind per has...,P
22270,1249688695256403969,grabat mientr vuel siempr y cuand no estes sol...,P
22271,1251896052279582721,lo buen de la cuarenten es que ya no hay que c...,P
22272,1276917451557568515,dia de cuarenten hashtag y much graci a usuari...,P


In [29]:
# rename the fields
df_BERT = df_BERT.rename(columns={'Tweet_Id':'id','text_limpio':'sentence','Target':'value'})

In [30]:
df_BERT.reset_index(drop=True,inplace=True)
df_BERT[0:1]

Unnamed: 0,id,sentence,value
0,1249382391120048129,en esta cuarenten mi psicolog me ha dich que l...,N


In [31]:
# Convert the data to JSON format
df_BERT.to_json(r'C:\Users\jilli\Desktop\TFM\2\BERT_sample\df_BERT.json',orient="records")

In [32]:
# Define a function for extra filters, in case further cleaning is needed
def filter(text):
    final_text = ''
    for word in text.split():
        if word.startswith('@'):
            continue
        else:
            final_text += word+' '
    return final_text
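Since `clean` already rewrites `@` mentions as `usuario` and strips the remaining punctuation, this filter should only have an effect on raw text. The sketch below shows that behaviour with a hypothetical helper, `drop_mentions`, which does the same job without the trailing space and without reusing the name of Python's built-in `filter`. The sample strings are made up.

```python
def drop_mentions(text):
    """Hypothetical variant of the filter above: drop tokens that start with '@'."""
    return ' '.join(word for word in text.split() if not word.startswith('@'))

raw = drop_mentions('@usuario1 hola @usuario2 mundo')
already_clean = drop_mentions('usuario hola usuario mundo')

print(raw)            # hola mundo
print(already_clean)  # usuario hola usuario mundo
```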

#### 2.3 Split the samples

In [35]:
# training and validation samples
train, validation = train_test_split(df_BERT, test_size = 0.20)
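The split above draws a different partition on every run because no seed is set. As a hedged sketch, passing `random_state` makes it repeatable and `stratify` keeps the class proportions equal in both parts. The seed value 42, the use of `stratify`, and the toy lists below are assumptions for illustration, not settings from the notebook.

```python
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for the rows of df_BERT
sentences = ['s{}'.format(i) for i in range(10)]
labels = ['N', 'P'] * 5

# Same arguments twice: a fixed random_state should give the same partition
split_a = train_test_split(sentences, labels, test_size=0.20,
                           random_state=42, stratify=labels)
split_b = train_test_split(sentences, labels, test_size=0.20,
                           random_state=42, stratify=labels)

train_x, val_x, train_y, val_y = split_a
print(len(train_x), len(val_x))  # 8 2
print(split_a == split_b)        # True
```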

In [36]:
train.to_json(r'C:/Users/jilli/Desktop/TFM/2/BERT_sample/df_BERT_train.json',orient="records")
validation.to_json(r'C:/Users/jilli/Desktop/TFM/2/BERT_sample/df_BERT_validation.json',orient="records")

In [37]:
# test sample
base_predict= pd.read_csv(r'C:\Users\jilli\Desktop\TFM\2\test.csv')

base_predict["text_limpio"]=base_predict["Text"].to_numpy() # copy the Text column
base_predict["text_limpio"]=base_predict["text_limpio"].apply(lambda tweet: normalize_jergas(tweet)) # normalize slang
base_predict["text_limpio"]=base_predict["text_limpio"].apply(lambda tweet: norma_vocales(tweet)) # strip accents
base_predict["text_limpio"]=base_predict["text_limpio"].apply(lambda tweet: clean(tweet)) 
base_predict["text_limpio"]=base_predict["text_limpio"].apply(lambda tweet: stem(tweet))

base_predict = DataFrame(base_predict, columns= ['Tweet_Id','text_limpio','Target'])
base_predict = base_predict.rename(columns={'Tweet_Id':'id','text_limpio':'sentence','Target':'value'})
base_predict


Unnamed: 0,id,sentence,value
0,1241218096221827072,señoress en tiemp de pandemi nad mejor que pel...,P
1,1254624530674393088,per en cuarenten,N
2,1260601329375940610,mi joystickn alcanz a lleg graci minsal por ha...,N
3,1268498533906292738,habl de bellez en la cuarenten bogotan hashtag,P
4,1252826196217262081,hashtag hashtag hashtag vam aguant,P
...,...,...,...
9541,1273759524126588930,dias de cuarenten y vam record tras record,N
9542,1262147333250322432,nos anim al cort de pel muchisim graci por el ...,P
9543,1252681613776949248,no doy mas mabel quier sal hashtag hashtag has...,N
9544,1251206243307393026,vam com avion hashtag,N


In [38]:
base_predict.to_json(r'C:/Users/jilli/Desktop/TFM/2/BERT_sample/df_BERT_test.json',orient="records")

In [232]:
#train.to_json(r'C:/Users/jilli/Desktop/TFM/1_TFM_BERT/train.json',orient="records")
#validation.to_json(r'C:/Users/jilli/Desktop/TFM/1_TFM_BERT/validation.json',orient="records")
#test.to_json(r'C:/Users/jilli/Desktop/TFM/1_TFM_BERT/test.json',orient="records")

### 3. Model

Set the dataset paths

In [39]:
train_path = 'C:/Users/jilli/Desktop/TFM/2/BERT_sample/df_BERT_train.json'
val_path = 'C:/Users/jilli/Desktop/TFM/2/BERT_sample/df_BERT_validation.json'
test_path = 'C:/Users/jilli/Desktop/TFM/2/BERT_sample/df_BERT_test.json'
device = 'cpu'

Load the transformers model

In [235]:
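from transformers import AutoModel, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
bert = AutoModel.from_pretrained("bert-base-multilingual-cased").to(device)

def feature_extraction(text):
    # Token ids for the sentence, including the special [CLS] and [SEP] tokens
    ids = tokenizer.encode(filter(text))
    with torch.no_grad():
        # Index 0 of the model output is the last hidden state: (batch, tokens, hidden)
        last_hidden = bert(torch.tensor([ids]).to(device))[0]
        # Keep the vector of the first token ([CLS]) as the sentence feature
        return list(last_hidden[0][0].cpu().numpy())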

In [236]:
with open(train_path, 'r') as f:
    train = json.load(f)
with open(val_path, 'r') as f:
    val = json.load(f)
with open(test_path, 'r') as f:
    test = json.load(f)

In [237]:
mapping = {'N':0,'P':1}

def data_prep(dataset):
    X = []
    y = []
    for element in tqdm(dataset):
        X.append(feature_extraction(element['sentence']))
        y_val = np.zeros(2)
        y_val[mapping[element['value']]] = 1
        y.append(y_val)
    return np.array(X), np.array(y)

X_train, y_train = data_prep(train)
X_val, y_val = data_prep(val)
X_test, y_test = data_prep(test)

100%|██████████| 17819/17819 [38:37<00:00,  7.69it/s]
100%|██████████| 4455/4455 [11:07<00:00,  6.68it/s]
100%|██████████| 9546/9546 [23:35<00:00,  6.74it/s]
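The label handling in `data_prep` can be checked in isolation. The sketch below one-hot encodes the `value` field of a few made-up records and then recovers the class index, which is what `np.argmax` does later at evaluation time. The records and the helper `one_hot` are hypothetical, and plain lists are used here instead of NumPy arrays.

```python
mapping = {'N': 0, 'P': 1}

def one_hot(label, num_classes=2):
    # Vector of zeros with a 1 at the position assigned to the label
    vec = [0.0] * num_classes
    vec[mapping[label]] = 1.0
    return vec

# Hypothetical records in the same shape as the JSON files (id, sentence, value)
records = [
    {'id': 1, 'sentence': 'much graci', 'value': 'P'},
    {'id': 2, 'sentence': 'per que falt', 'value': 'N'},
]

y = [one_hot(r['value']) for r in records]
decoded = [vec.index(max(vec)) for vec in y]  # inverse step, like argmax

print(y)        # [[0.0, 1.0], [1.0, 0.0]]
print(decoded)  # [1, 0]
```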


Training a Model with Keras

In [248]:
class_counts = [0, 0]
for el in y_train:
    class_counts[np.argmax(el)]+=1
class_weights = {idx:sum(class_counts)/el for idx, el in enumerate(class_counts)}

es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=50)

model = Sequential()
model.add(Dense(925, activation='tanh', input_shape=(768,)))
model.add(Dropout(0.5))
model.add(Dense(128, activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(32, activation='tanh'))
model.add(Dropout(0.5))
model.add(Dense(2, activation='softmax'))

model.summary()

model.compile(loss='binary_crossentropy',
              optimizer=Adagrad(),
              metrics=['accuracy'])

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_4 (Dense)              (None, 925)               711325    
_________________________________________________________________
dropout_3 (Dropout)          (None, 925)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 128)               118528    
_________________________________________________________________
dropout_4 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_6 (Dense)              (None, 32)                4128      
_________________________________________________________________
dropout_5 (Dropout)          (None, 32)                0         
_________________________________________________________________
dense_7 (Dense)              (None, 2)                
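The `class_weights` dictionary built in this cell weights each class by the inverse of its frequency, so the rarer class contributes more to the loss. A minimal sketch with made-up counts (the values 30 and 10 are assumptions, not the notebook's real class counts):

```python
# Hypothetical class counts: 30 negatives, 10 positives
class_counts = [30, 10]
total = sum(class_counts)

# Same formula as in the cell above: total samples divided by the class count
class_weights = {idx: total / count for idx, count in enumerate(class_counts)}

print(class_weights[1])  # 4.0
```

With nearly balanced classes both weights come out close to 2, so the weighting has little effect in that case.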

In [250]:
history = model.fit(np.array(X_train), np.array(y_train),
                    batch_size=64,
                    epochs=500,
                    verbose=1,
                    validation_data=(X_val, y_val),
                    class_weight=class_weights,
                    callbacks = [es])

Epoch 1/500
Epoch 2/500
...
Epoch 295/500
Epoch 296/500
...
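The `EarlyStopping` callback configured earlier, with `monitor='val_loss'` and `patience=50`, ends training once the validation loss has gone 50 epochs without improving. The sketch below is a simplified illustration of that patience rule, not the Keras implementation. The function name and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # Improvement: remember it and reset the counter
            best = loss
            wait = 0
        else:
            # No improvement: count the epoch and stop once patience is used up
            wait += 1
            if wait >= patience:
                return epoch
    return None

losses = [0.70, 0.65, 0.66, 0.64, 0.65, 0.66, 0.67]
stopped_at = early_stopping_epoch(losses, patience=3)
never_stopped = early_stopping_epoch(losses, patience=10)

print(stopped_at)     # 7
print(never_stopped)  # None
```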

In [269]:
y_true, y_pred = np.argmax(y_test, 1), np.argmax(model.predict(X_test), 1)
print(classification_report(y_true, y_pred, digits=3))

              precision    recall  f1-score   support

           0      0.652     0.702     0.676      4755
           1      0.680     0.629     0.653      4791

    accuracy                          0.665      9546
   macro avg      0.666     0.665     0.665      9546
weighted avg      0.666     0.665     0.665      9546



In [253]:
from sklearn.metrics import roc_auc_score
roc_auc = roc_auc_score(y_true, y_pred)
print('ROC curve - AUC of the model:')
print(roc_auc)

ROC curve - AUC of the model:
0.6653375446996768
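Note that `y_pred` here holds hard class labels (the output of `argmax`), not scores. With binary labels the ROC curve has a single operating point, so the AUC reduces to the average of the two recalls, which matches the report above: (0.702 + 0.629) / 2 ≈ 0.665. To obtain a threshold-independent AUC, one option would be to pass the predicted probability of class 1, for example `model.predict(X_test)[:, 1]`, as the score. The sketch below illustrates the difference with a small pure-Python AUC. The function `auc` and the toy data are assumptions for illustration.

```python
def auc(y_true, scores):
    """Chance that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: true classes and predicted probabilities of class 1
y_true = [0, 0, 0, 1, 1, 1]
probs = [0.1, 0.4, 0.6, 0.45, 0.7, 0.9]
labels = [1 if p >= 0.5 else 0 for p in probs]  # hard labels, as argmax would give

auc_probs = auc(y_true, probs)
auc_labels = auc(y_true, labels)

# Balanced accuracy of the hard labels: mean of recall on each class
tpr = sum(1 for y, l in zip(y_true, labels) if y == 1 and l == 1) / 3
tnr = sum(1 for y, l in zip(y_true, labels) if y == 0 and l == 0) / 3
balanced_accuracy = (tpr + tnr) / 2

print(round(auc_probs, 3))          # 0.889
print(round(auc_labels, 3))         # 0.667
print(round(balanced_accuracy, 3))  # 0.667
```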
