# How to Teach Machines to Read and Comprehend

Course: Artificial Intelligence CC421A

Team members:

        Lesly Dashiel Sanchez Ramos
        Gladys Alesandra Yagi Vásquez

## ParlAI

ParlAI is an open-source software framework for dialogue research, implemented in Python and available at http://parl.ai. Its goal is to provide a unified framework for sharing, training, and testing dialogue models; integration with Amazon Mechanical Turk for data collection, human evaluation, and reinforcement learning; and a repository of machine learning models (https://github.com/facebookresearch/ParlAI) for comparing against other models and improving existing architectures. It also offers more than 20 tasks, including popular datasets such as SQuAD, bAbI tasks, MCTest, WikiQA, QACNN, QADailyMail, CBT, bAbI Dialog, Ubuntu, OpenSubtitles, and VQA.

Beyond the wide range of available datasets, it provides many helpers for building our own agents.

ParlAI integrates several models, including neural models such as memory networks, Seq2seq, and attentive LSTMs, which we will implement in this project.


## Installing ParlAI

In [1]:
!pip3 install -q parlai
!pip3 install -q subword_nmt 

## Importing the dataset

To evaluate the model we chose the Children's Book Test (CBT) dataset because it was the smallest one,
which lets the model load faster when training and evaluating.

### Importing the dataset

In [2]:
from parlai.core.build_data import DownloadableFile
import parlai.core.build_data as build_data
import os

# Resources that ParlAI must download in order to use the dataset.
RESOURCES = [
    DownloadableFile(
        'http://parl.ai/downloads/cbt/cbt.tar.gz',
        'cbt.tar.gz',
        '932df0cadc1337b2a12b4c696b1041c1d1c6d4b6bd319874c6288f02e4a61e92',
    )
]

# Define build() to download and generate the data needed for the CBT dataset.
def build(opt):
    dpath = os.path.join(opt['datapath'], 'CBT')
    version = None

    if not build_data.built(dpath, version_string=version):
        print('[building data: ' + dpath + ']')
        if build_data.built(dpath):
            # Remove outdated files from the previous version.
            build_data.remove_dir(dpath)
        build_data.make_dir(dpath)

        # Download the data.
        for downloadable_file in RESOURCES:
            downloadable_file.download_file(dpath)

        # Mark the data as built.
        build_data.mark_done(dpath, version_string=version)
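
The third argument of each `DownloadableFile` entry is a 64-character hexadecimal string, which is consistent with a SHA-256 digest that the downloaded archive is expected to match. As a rough, standard-library-only sketch of that kind of integrity check (the helper names `sha256_of_file` and `matches_checksum` are hypothetical, and this is not ParlAI's own code), a file can be hashed in chunks like this:

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with the expected value (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.lower()


# Self-check on a small temporary file rather than on the real CBT archive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'hello')
    tmp_path = tmp.name

file_digest = sha256_of_file(tmp_path)
checksum_ok = matches_checksum(tmp_path, file_digest.upper())
os.remove(tmp_path)
print(len(file_digest), checksum_ok)
```

Reading in chunks keeps memory use flat even for large archives, which is why the sketch avoids loading the whole file at once.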

### Showing an example

In [3]:
# The display_data script shows the contents of a given task.
# Here we show one example from the training data.
from parlai.scripts.display_data import DisplayData
# num_examples sets how many examples to show; task selects the dataset task, cbt in this case.
DisplayData.main(task='cbt', num_examples=1) 

11:40:17 | Opt:
11:40:17 |     allow_missing_init_opts: False
11:40:17 |     batchsize: 1
11:40:17 |     datapath: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data
11:40:17 |     datatype: train:ordered
11:40:17 |     dict_class: None
11:40:17 |     display_add_fields: 
11:40:17 |     download_path: None
11:40:17 |     dynamic_batching: None
11:40:17 |     hide_labels: False
11:40:17 |     ignore_agent_reply: True
11:40:17 |     image_cropsize: 224
11:40:17 |     image_mode: raw
11:40:17 |     image_size: 256
11:40:17 |     init_model: None
11:40:17 |     init_opt: None
11:40:17 |     is_debug: False
11:40:17 |     loglevel: info
11:40:17 |     max_display_len: 1000
11:40:17 |     model: None
11:40:17 |     model_file: None
11:40:17 |     multitask_weights: [1]
11:40:17 |     mutators: None
11:40:17 |     num_examples: 1
11:40:17 |     override: "{'task': 'cbt', 'num_examples': 1}"
11:40:17 |     parlai_home: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages
11:40:17 |     

### Implementing the agents

We implement the agents so that they understand the structure of the tasks and can represent them.
Here we create four agents, which we call teachers, each handling a different specific task.

In [4]:
# Import FbDeprecatedDialogTeacher and MultiTaskTeacher so that our agents inherit their methods.
from parlai.core.teachers import FbDeprecatedDialogTeacher, MultiTaskTeacher
# Import register_teacher, which lets us refer to our teacher
# by the name "my_teacher"
from parlai.core.teachers import register_teacher, DialogTeacher

import copy
import os

# Build the path to the correct data file of the dataset.
def _path(task, opt):
    # Generate the data if it does not exist yet.
    build(opt)
    suffix = ''
    dt = opt['datatype'].split(':')[0]
    if dt == 'train':
        suffix = 'train'
    elif dt == 'test':
        suffix = 'test_2500ex'
    elif dt == 'valid':
        suffix = 'valid_2000ex'

    return os.path.join(
        opt['datapath'], 'CBT', 'CBTest', 'data', task + '_' + suffix + '.txt'
    )

# Create the teachers so that they understand the tasks and can represent them.
class NETeacher(FbDeprecatedDialogTeacher):
    def __init__(self, opt, shared=None):
        opt['datafile'] = _path('cbtest_NE', opt)
        opt['cloze'] = True
        super().__init__(opt, shared)


class CNTeacher(FbDeprecatedDialogTeacher):
    def __init__(self, opt, shared=None):
        opt['datafile'] = _path('cbtest_CN', opt)
        opt['cloze'] = True
        super().__init__(opt, shared)


class VTeacher(FbDeprecatedDialogTeacher):
    def __init__(self, opt, shared=None):
        opt['datafile'] = _path('cbtest_V', opt)
        opt['cloze'] = True
        super().__init__(opt, shared)


class PTeacher(FbDeprecatedDialogTeacher):
    def __init__(self, opt, shared=None):
        opt['datafile'] = _path('cbtest_P', opt)
        opt['cloze'] = True
        super().__init__(opt, shared)


# By default, this last teacher trains on all tasks at once.
@register_teacher("my_teacher")
class DefaultTeacher(MultiTaskTeacher):
    def __init__(self, opt, shared=None):
        opt = copy.deepcopy(opt)
        opt['task'] = 'cbt:NE,cbt:CN,cbt:V,cbt:P'
        super().__init__(opt, shared)
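
To make the file selection in `_path` concrete, here is a standalone sketch of the same mapping from a `datatype` string to a CBT file name. It mirrors the logic of `_path` but does not import ParlAI or download anything; the function name `cbt_filename` and the default `datapath` value are hypothetical.

```python
import posixpath

# Split name -> file suffix, following the branches in _path.
SUFFIXES = {'train': 'train', 'test': 'test_2500ex', 'valid': 'valid_2000ex'}


def cbt_filename(task, datatype, datapath='data'):
    """Build the path of a CBT data file for a given task and datatype."""
    # 'train:ordered' -> 'train'; anything after the first ':' is a modifier.
    split = datatype.split(':')[0]
    suffix = SUFFIXES.get(split, '')
    return posixpath.join(
        datapath, 'CBT', 'CBTest', 'data', task + '_' + suffix + '.txt'
    )


print(cbt_filename('cbtest_NE', 'train:ordered'))
print(cbt_filename('cbtest_V', 'valid'))
```

The sketch uses `posixpath` only so that the printed paths look the same on every platform; the notebook itself uses `os.path.join`.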

In [11]:
DisplayData.main(task="my_teacher")

11:59:02 | Opt:
11:59:02 |     allow_missing_init_opts: False
11:59:02 |     batchsize: 1
11:59:02 |     datapath: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data
11:59:02 |     datatype: train:ordered
11:59:02 |     dict_class: None
11:59:02 |     display_add_fields: 
11:59:02 |     download_path: None
11:59:02 |     dynamic_batching: None
11:59:02 |     hide_labels: False
11:59:02 |     ignore_agent_reply: True
11:59:02 |     image_cropsize: 224
11:59:02 |     image_mode: raw
11:59:02 |     image_size: 256
11:59:02 |     init_model: None
11:59:02 |     init_opt: None
11:59:02 |     is_debug: False
11:59:02 |     loglevel: info
11:59:02 |     max_display_len: 1000
11:59:02 |     model: None
11:59:02 |     model_file: None
11:59:02 |     multitask_weights: [1]
11:59:02 |     mutators: None
11:59:02 |     num_examples: 10
11:59:02 |     override: "{'task': 'my_teacher'}"
11:59:02 |     parlai_home: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages
11:59:02 |     starttime: 

## Building our model

We build a seq2seq model consisting of an encoder and a decoder, each containing one LSTM layer.
The TorchGeneratorAgent class handles the features common to decoders, such as forced decoding and beam search.
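
Forced decoding (often called teacher forcing) means that, during training, the decoder receives the gold label shifted one position to the right instead of its own predictions. The following is a minimal pure-Python sketch of that shift under assumed token ids (`START = 1` and `END = 2` are placeholders, not ParlAI's actual dictionary indices, and `teacher_forcing_pairs` is a hypothetical helper):

```python
START = 1  # hypothetical start-of-sequence id
END = 2    # hypothetical end-of-sequence id


def teacher_forcing_pairs(label_ids):
    """Return (decoder_inputs, targets) for one label sequence.

    The targets are the label followed by END; the inputs are the same
    sequence shifted right, beginning with START.
    """
    targets = list(label_ids) + [END]
    decoder_inputs = [START] + targets[:-1]
    return decoder_inputs, targets


inputs, targets = teacher_forcing_pairs([17, 42])
print(inputs)   # [1, 17, 42]
print(targets)  # [17, 42, 2]
```

At position `i` the decoder sees `inputs[: i + 1]` and is trained to predict `targets[i]`, so every step can be computed in a single pass without waiting for the model's own output.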

In [6]:
# Import register_agent, which lets us refer to our agent
# by the name "modeloSeq2seq"
from parlai.core.agents import register_agent, Agent
# Import torch.nn to build the neural network
import torch.nn as nn
# Import torch.nn.functional, which provides operations that have no parameters of their own
import torch.nn.functional as F
import parlai.core.torch_generator_agent as tga


# Define the encoder
class Encoder(nn.Module):
    
    # Consists of an embedding layer and a 1-layer LSTM with the
    # specified hidden size.
    
    # Initialization.
    def __init__(self, embeddings, hidden_size):
        
        # Call super().__init__() so that nn.Module can register submodules and parameters.
        super().__init__()

        self.embeddings = embeddings
        # Define the parameters of the encoder's LSTM layer.
        self.lstm = nn.LSTM(
            input_size=hidden_size,
            hidden_size=hidden_size,
            num_layers=1,
            batch_first=True,
        )

    def forward(self, input_tokens):
                   
        # Run the forward pass of the encoder.
        
        # input_tokens are the given context tokens
        embedded = self.embeddings(input_tokens)
        # Return the hidden and cell states of the LSTM
        _output, hidden = self.lstm(embedded)
        return hidden

# Define the decoder
class Decoder(nn.Module):
    
    # Consists of an embedding layer and a 1-layer LSTM with the
    # specified hidden size.
   
    # The decoder supports incremental decoding by taking in the
    # current incremental state on each forward pass.
  
    # Initialization.
    def __init__(self, embeddings, hidden_size):
        
        # Call super().__init__() so that nn.Module can register submodules and parameters.
        super().__init__()
        self.embeddings = embeddings
        # Define the parameters of the decoder's LSTM layer.
        self.lstm = nn.LSTM(
            input_size=hidden_size,
            hidden_size=hidden_size,
            num_layers=1,
            batch_first=True,
        )

    def forward(self, input, encoder_state, incr_state=None):
        
        # Run the forward pass of the decoder.
        
        # The input is the tokens generated so far by the decoder
        embedded = self.embeddings(input)
        if incr_state is None:
            # Seed the LSTM with the encoder's hidden state.
            state = encoder_state
        else:
            # Reuse the existing decoder state
            state = incr_state

        # Get the new output and the decoder's incremental state
        output, incr_state = self.lstm(embedded, state)

        return output, incr_state

# Implements the TorchGeneratorModel methods for reordering the encoder states and the
# decoder's incremental states. It also instantiates the encoder and decoder and defines the final output layer.
class ExampleModel(tga.TorchGeneratorModel):
   
    # Initialization.
    def __init__(self, dictionary, hidden_size=1024):
        super().__init__(
            padding_idx=dictionary[dictionary.null_token],
            start_idx=dictionary[dictionary.start_token],
            end_idx=dictionary[dictionary.end_token],
            unknown_idx=dictionary[dictionary.unk_token],
        )
        self.embeddings = nn.Embedding(len(dictionary), hidden_size)
        self.encoder = Encoder(self.embeddings, hidden_size)
        self.decoder = Decoder(self.embeddings, hidden_size)

    def output(self, decoder_output):
        
        # Final transformation from decoder output to logits, reusing the embedding weights.
        
        return F.linear(decoder_output, self.embeddings.weight)

    def reorder_encoder_states(self, encoder_states, indices):
        
        # Reorder the encoder states to select only the given batch indices.
        # The selection is indexed along the batch dimension.
        h, c = encoder_states
        return h[:, indices, :], c[:, indices, :]

    def reorder_decoder_incremental_state(self, incr_state, indices):
        # Implemented so the cached decoder state can be reordered, which reduces the cost of generation.
        h, c = incr_state
        return h[:, indices, :], c[:, indices, :]

# Create the Seq2seq agent, which inherits from TorchGeneratorAgent
@register_agent("modeloSeq2seq")
class Seq2seqAgent(tga.TorchGeneratorAgent):
    
    @classmethod
    def add_cmdline_args(cls, argparser, partial_opt=None):

        # Add all of TorchGeneratorAgent's arguments
        super().add_cmdline_args(argparser, partial_opt=partial_opt)

        # Add custom arguments for this model only.
        group = argparser.add_argument_group('Example TGA Agent')
        group.add_argument(
            '-hid', '--hidden-size', type=int, default=1024, help='Hidden size.'
        )
        return argparser

    # Build the model.
    def build_model(self):
        model = ExampleModel(self.dict, self.opt['hidden_size'])
        self._copy_embeddings(model.embeddings.weight, self.opt['embedding_type'])
        return model
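
As a sanity check on the size of this architecture, the parameter count can be worked out by hand. The sketch below assumes PyTorch's standard single-layer LSTM layout (four gates, each with input weights, recurrent weights, and two bias vectors) and takes the dictionary size of 51,210 entries from the training logs in this notebook; the function names are illustrative.

```python
def lstm_params(input_size, hidden_size):
    """Parameters of a 1-layer, unidirectional LSTM with both bias vectors."""
    per_gate = input_size * hidden_size + hidden_size * hidden_size + 2 * hidden_size
    return 4 * per_gate


def seq2seq_params(vocab_size, hidden_size):
    """Shared embedding matrix plus one encoder LSTM and one decoder LSTM.

    The output layer reuses the embedding weights, so it adds nothing.
    """
    embedding = vocab_size * hidden_size
    return embedding + 2 * lstm_params(hidden_size, hidden_size)


total = seq2seq_params(vocab_size=51210, hidden_size=1024)
print(f"{total:,}")  # 69,232,640
```

This agrees with the `Total parameters: 69,232,640` line reported in the logs. Roughly three quarters of the parameters sit in the embedding matrix, so the vocabulary size dominates the model size far more than the two LSTM layers do.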

## Training the model


We train the model with TrainModel on the CBT dataset, vary the hyperparameters to look for the
best metrics, and show how the trained model handles a few examples.

In [14]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Use a small learning rate with the Adam optimizer
    lr=1e-5, 
    optimizer='adam',
    warmup_updates=100,

    # Train for at most 10 minutes
    max_train_time=600, 
    validation_every_n_epochs=0.25,    
    # Batch size
    batchsize=8, 
    # Skip generation to speed up validation
    skip_generation=True,
      
)
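
The `warmup_updates=100` setting asks for the learning rate to be ramped up over the first 100 updates rather than applied at full strength immediately. The sketch below shows a simple linear ramp as an illustration; it assumes the ramp starts from zero, which may differ in detail from ParlAI's actual scheduler, and `warmup_lr` is a hypothetical helper.

```python
def warmup_lr(step, base_lr=1e-5, warmup_updates=100):
    """Linearly scale the learning rate up to base_lr over warmup_updates steps."""
    if warmup_updates <= 0:
        return base_lr
    progress = min(1.0, step / warmup_updates)
    return base_lr * progress


for step in (0, 50, 100, 500):
    print(step, warmup_lr(step))
```

A short warmup like this is commonly used with Adam so that the first, noisy gradient estimates do not produce large parameter updates.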

23:46:29 | building dictionary first...
23:46:29 | Overriding opt["batchsize"] to 8 (previously: 2)
23:46:29 | Overriding opt["skip_generation"] to True (previously: False)
23:46:29 | Using CUDA
23:46:29 | loading dictionary from modeloSeq2seq/model.dict
23:46:30 | num words = 51210
23:46:30 | Total parameters: 69,232,640 (69,232,640 trainable)
23:46:30 | Loading existing model params from modeloSeq2seq/model
23:46:31 | Opt:
23:46:31 |     adafactor_eps: '[1e-30, 0.001]'
23:46:31 |     adam_eps: 1e-08
23:46:31 |     add_p1_after_newln: False
23:46:31 |     aggregate_micro: False
23:46:31 |     allow_missing_init_opts: False
23:46:31 |     batchsize: 8
23:46:31 |     beam_block_full_context: True
23:46:31 |     beam_block_list_filename: None
23:46:31 |     beam_block_ngram: -1
23:46:31 |     beam_context_block_ngram: -1
23:46:31 |     beam_delay: 30
23:46:31 |     beam_length_penalty: 0.65
23:46:31 |     beam_min_length: 1
23:46:31 |     beam_size: 1
23:46:31 |     bet

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(5.954),
  'cbt:NE/ppl': PPLMetric(385.2),
  'cbt:NE/token_acc': AverageMetric(0.4775),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(5.393),
  'cbt:CN/ppl': PPLMetric(219.9),
  'cbt:CN/token_acc': AverageMetric(0.498),
  'cbt:CN/token_em': AverageMetric(0),
  'cbt:V/exs': SumMetric(2000),
  'cbt:V/clen': AverageMetric(509.2),
  'cbt:V/ctrunc': AverageMetric(0),
  'cbt:V/ctrunclen': Averag
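
The `ppl` entries are perplexities, which for a token-level cross-entropy loss are conventionally the exponential of the loss. A quick check against the `loss` and `ppl` values printed above (small differences come from the loss being rounded in the output; `perplexity` is an illustrative helper, not a ParlAI function):

```python
import math


def perplexity(loss):
    """Perplexity for an average per-token cross-entropy loss (in nats)."""
    return math.exp(loss)


# (loss, reported ppl) pairs copied from the validation output above.
reported = {'cbt:NE': (5.954, 385.2), 'cbt:CN': (5.393, 219.9)}
for name, (loss, ppl) in reported.items():
    print(name, round(perplexity(loss), 1), ppl)
```

Because perplexity grows exponentially with the loss, a modest change in loss between runs shows up as a much larger change in `ppl`.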

In [15]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(
    task='cbt',
    model_file='modeloSeq2seq/model',
    num_examples=2,
)

23:59:45 | Using CUDA
23:59:45 | loading dictionary from modeloSeq2seq/model.dict
23:59:45 | num words = 51210
23:59:46 | Total parameters: 69,232,640 (69,232,640 trainable)
23:59:46 | Loading existing model params from modeloSeq2seq/model
23:59:47 | creating task(s): cbt
23:59:47 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
23:59:47 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
23:59:48 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
23:59:48 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
23:59:48 | Opt:
23:59:48 |     adafactor_eps: '[1e-30, 0.001]'
23:59:48 |     adam_eps: 1e-08
23:59:48 |     add_p1_after_newln: False
23:59:48 |     aggregate_micro: False
23:59:48 |     a

In [52]:
DisplayModel.main(model_file='modeloSeq2seq/model', task='my_teacher')

17:59:01 | Overriding opt["task"] to my_teacher (previously: cbt)
17:59:01 | Using CUDA
17:59:01 | loading dictionary from modeloSeq2seq/model.dict
17:59:01 | num words = 51210
17:59:01 | Total parameters: 69,232,640 (69,232,640 trainable)
17:59:01 | Loading existing model params from modeloSeq2seq/model
17:59:01 | creating task(s): my_teacher
17:59:01 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
17:59:02 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
17:59:02 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
17:59:02 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
17:59:02 | Opt:
17:59:02 |     adafactor_eps: '[1e-30, 0.001]'
17:59:02 |     adam_eps: 1e-08
17:59:02 |   

In [16]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(
    task='cbt',
    model_file='modeloSeq2seq/model',
    num_examples=2,
    skip_generation=False,
)

00:00:03 | Overriding opt["skip_generation"] to False (previously: True)
00:00:03 | Using CUDA
00:00:03 | loading dictionary from modeloSeq2seq/model.dict
00:00:03 | num words = 51210
00:00:04 | Total parameters: 69,232,640 (69,232,640 trainable)
00:00:04 | Loading existing model params from modeloSeq2seq/model
00:00:05 | creating task(s): cbt
00:00:05 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
00:00:05 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
00:00:05 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
00:00:05 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
00:00:06 | Opt:
00:00:06 |     adafactor_eps: '[1e-30, 0.001]'
00:00:06 |     adam_eps: 1e-08
00:00:06 |   

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Use a small learning rate with the Adam optimizer
    lr=1e-5, 
    optimizer='adam',
    warmup_updates=100,
    # Train for at most 10 minutes
    max_train_time=600, 
    # Batch size
    batchsize=8, 
    # Skip generation to speed up validation
    skip_generation=True,
      
)

00:04:53 | building dictionary first...
00:04:53 | Overriding opt["validation_every_n_secs"] to 10.0 (previously: -1)
00:04:53 | Using CUDA
00:04:53 | loading dictionary from modeloSeq2seq/model.dict
00:04:53 | num words = 51210
00:04:57 | Total parameters: 69,232,640 (69,232,640 trainable)
00:04:57 | Loading existing model params from modeloSeq2seq/model
00:04:58 | Opt:
00:04:58 |     adafactor_eps: '[1e-30, 0.001]'
00:04:58 |     adam_eps: 1e-08
00:04:58 |     add_p1_after_newln: False
00:04:58 |     aggregate_micro: False
00:04:58 |     allow_missing_init_opts: False
00:04:58 |     batchsize: 8
00:04:58 |     beam_block_full_context: True
00:04:58 |     beam_block_list_filename: None
00:04:58 |     beam_block_ngram: -1
00:04:58 |     beam_context_block_ngram: -1
00:04:58 |     beam_delay: 30
00:04:58 |     beam_length_penalty: 0.65
00:04:58 |     beam_min_length: 1
00:04:58 |     beam_size: 1
00:04:58 |     betas: '[0.9, 0.999]'
00:04:58 |     bpe_add_prefix_space: None
00:

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(5.951),
  'cbt:NE/ppl': PPLMetric(384.1),
  'cbt:NE/token_acc': AverageMetric(0.4775),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(5.391),
  'cbt:CN/ppl': PPLMetric(219.5),
  'cbt:CN/token_acc': AverageMetric(0.498),
  'cbt:CN/token_em': AverageMetric(0),
  'cbt:V/exs': SumMetric(2000),
  'cbt:V/clen': AverageMetric(509.2),
  'cbt:V/ctrunc': AverageMetric(0),
  'cbt:V/ctrunclen': Averag

In [8]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(
    task='cbt',
    model_file='modeloSeq2seq/model',
    num_examples=2,
    skip_generation=False,
)

00:12:21 | Overriding opt["skip_generation"] to False (previously: True)
00:12:21 | Using CUDA
00:12:21 | loading dictionary from modeloSeq2seq/model.dict
00:12:21 | num words = 51210
00:12:21 | Total parameters: 69,232,640 (69,232,640 trainable)
00:12:21 | Loading existing model params from modeloSeq2seq/model
00:12:23 | creating task(s): cbt
00:12:23 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
00:12:23 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
00:12:23 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
00:12:23 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
00:12:23 | Opt:
00:12:23 |     adafactor_eps: '[1e-30, 0.001]'
00:12:23 |     adam_eps: 1e-08
00:12:23 |   

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Train for at most 10 minutes
    max_train_time=600, 
    # Batch size
    batchsize=16, 
      
)

01:05:11 | building dictionary first...
01:05:11 | No model with opt yet at: modeloSeq2seq/model(.opt)
01:05:11 | Using CUDA
01:05:11 | loading dictionary from modeloSeq2seq/model.dict
01:05:11 | num words = 51210
01:05:16 | Total parameters: 69,232,640 (69,232,640 trainable)
01:05:16 | Opt:
01:05:16 |     adafactor_eps: '(1e-30, 0.001)'
01:05:16 |     adam_eps: 1e-08
01:05:16 |     add_p1_after_newln: False
01:05:16 |     aggregate_micro: False
01:05:16 |     allow_missing_init_opts: False
01:05:16 |     batchsize: 16
01:05:16 |     beam_block_full_context: True
01:05:16 |     beam_block_list_filename: None
01:05:16 |     beam_block_ngram: -1
01:05:16 |     beam_context_block_ngram: -1
01:05:16 |     beam_delay: 30
01:05:16 |     beam_length_penalty: 0.65
01:05:16 |     beam_min_length: 1
01:05:16 |     beam_size: 1
01:05:16 |     betas: '(0.9, 0.999)'
01:05:16 |     bpe_add_prefix_space: None
01:05:16 |     bpe_debug: False
01:05:16 |     bpe_dropout: None
01:05:16 |     bpe_merge: N

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/accuracy': ExactMatchMetric(0),
  'cbt:NE/f1': F1Metric(0.00025),
  'cbt:NE/bleu-4': BleuMetric(6.767e-14),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(9.687),
  'cbt:NE/ppl': PPLMetric(1.611e+04),
  'cbt:NE/token_acc': AverageMetric(0.4333),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/accuracy': ExactMatchMetric(0),
  'cbt:CN/f1': F1Metric(0),
  'cbt:CN/bleu-4': BleuMetric(0),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(9.12),
  'cbt:CN/ppl': PPLMetric

In [8]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(model_file='modeloSeq2seq/model', task='my_teacher')

01:43:04 | Overriding opt["task"] to my_teacher (previously: cbt)
01:43:04 | Using CUDA
01:43:04 | loading dictionary from modeloSeq2seq/model.dict
01:43:04 | num words = 51210
01:43:05 | Total parameters: 69,232,640 (69,232,640 trainable)
01:43:05 | Loading existing model params from modeloSeq2seq/model
01:43:05 | creating task(s): my_teacher
01:43:05 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
01:43:06 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
01:43:06 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
01:43:06 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
01:43:06 | Opt:
01:43:06 |     adafactor_eps: '[1e-30, 0.001]'
01:43:06 |     adam_eps: 1e-08
01:43:06 |   

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Learning rate for the Adam optimizer (1e-3, higher than the 1e-5 used in the first runs)
    lr=1e-3, 
    optimizer='adam',
    warmup_updates=50,
    # Train for at most 10 minutes
    max_train_time=600, 
    # Batch size
    batchsize=8,  
)

07:51:14 | building dictionary first...
07:51:14 | Overriding opt["warmup_updates"] to 50 (previously: 100)
07:51:14 | Using CUDA
07:51:14 | loading dictionary from modeloSeq2seq/model.dict
07:51:14 | num words = 51210
07:51:17 | Total parameters: 69,232,640 (69,232,640 trainable)
07:51:17 | Loading existing model params from modeloSeq2seq/model
07:51:19 | Opt:
07:51:19 |     adafactor_eps: '[1e-30, 0.001]'
07:51:19 |     adam_eps: 1e-08
07:51:19 |     add_p1_after_newln: False
07:51:19 |     aggregate_micro: False
07:51:19 |     allow_missing_init_opts: False
07:51:19 |     batchsize: 8
07:51:19 |     beam_block_full_context: True
07:51:19 |     beam_block_list_filename: None
07:51:19 |     beam_block_ngram: -1
07:51:19 |     beam_context_block_ngram: -1
07:51:19 |     beam_delay: 30
07:51:19 |     beam_length_penalty: 0.65
07:51:19 |     beam_min_length: 1
07:51:19 |     beam_size: 1
07:51:19 |     betas: '[0.9, 0.999]'
07:51:19 |     bpe_add_prefix_space: None
07:51:19 |   

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/accuracy': ExactMatchMetric(0),
  'cbt:NE/f1': F1Metric(0),
  'cbt:NE/bleu-4': BleuMetric(0),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(7.136),
  'cbt:NE/ppl': PPLMetric(1257),
  'cbt:NE/token_acc': AverageMetric(0.4839),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/accuracy': ExactMatchMetric(0),
  'cbt:CN/f1': F1Metric(0),
  'cbt:CN/bleu-4': BleuMetric(0),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(5.885),
  'cbt:CN/ppl': PPLMetric(359.7),
  'cbt:CN

In [8]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(model_file='modeloSeq2seq/model', task='my_teacher')

08:04:05 | Overriding opt["task"] to my_teacher (previously: cbt)
08:04:05 | Using CUDA
08:04:05 | loading dictionary from modeloSeq2seq/model.dict
08:04:05 | num words = 51210
08:04:06 | Total parameters: 69,232,640 (69,232,640 trainable)
08:04:06 | Loading existing model params from modeloSeq2seq/model
08:04:07 | creating task(s): my_teacher
08:04:07 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
08:04:07 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
08:04:07 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
08:04:08 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
08:04:08 | Opt:
08:04:08 |     adafactor_eps: '[1e-30, 0.001]'
08:04:08 |     adam_eps: 1e-08
08:04:08 |   

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Learning rate for the Adam optimizer (1e-3, higher than the 1e-5 used in the first runs)
    lr=1e-3, 
    optimizer='adam',
    warmup_updates=50,
    # Train for at most 10 minutes
    max_train_time=600, 
    # Batch size
    batchsize=16, 
       
)

08:31:20 | building dictionary first...
08:31:20 | Overriding opt["batchsize"] to 16 (previously: 8)
08:31:20 | Using CUDA
08:31:20 | loading dictionary from modeloSeq2seq/model.dict
08:31:20 | num words = 51210
08:31:23 | Total parameters: 69,232,640 (69,232,640 trainable)
08:31:23 | Loading existing model params from modeloSeq2seq/model
08:31:25 | Opt:
08:31:25 |     adafactor_eps: '[1e-30, 0.001]'
08:31:25 |     adam_eps: 1e-08
08:31:25 |     add_p1_after_newln: False
08:31:25 |     aggregate_micro: False
08:31:25 |     allow_missing_init_opts: False
08:31:25 |     batchsize: 16
08:31:25 |     beam_block_full_context: True
08:31:25 |     beam_block_list_filename: None
08:31:25 |     beam_block_ngram: -1
08:31:25 |     beam_context_block_ngram: -1
08:31:25 |     beam_delay: 30
08:31:25 |     beam_length_penalty: 0.65
08:31:25 |     beam_min_length: 1
08:31:25 |     beam_size: 1
08:31:25 |     betas: '[0.9, 0.999]'
08:31:25 |     bpe_add_prefix_space: None
08:31:25 |     bpe_

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/accuracy': ExactMatchMetric(0),
  'cbt:NE/f1': F1Metric(0),
  'cbt:NE/bleu-4': BleuMetric(0),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(7.217),
  'cbt:NE/ppl': PPLMetric(1363),
  'cbt:NE/token_acc': AverageMetric(0.4841),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/accuracy': ExactMatchMetric(0),
  'cbt:CN/f1': F1Metric(0),
  'cbt:CN/bleu-4': BleuMetric(0),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(5.938),
  'cbt:CN/ppl': PPLMetric(379.2),
  'cbt:CN
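
The `ppl` values in this report appear to be the exponential of the corresponding `loss` values, which is the usual definition of perplexity for a per-token cross-entropy loss measured in nats. A minimal sketch that checks this against the numbers printed above; the helper name `perplexity` is ours and is not part of ParlAI:

```python
import math


def perplexity(loss: float) -> float:
    """Perplexity as the exponential of the mean per-token loss (natural log)."""
    return math.exp(loss)


# (loss, ppl) pairs copied from the validation report above
reported = {"cbt:NE": (7.217, 1363), "cbt:CN": (5.938, 379.2)}

for name, (loss, ppl) in reported.items():
    print(f"{name}: exp({loss}) = {perplexity(loss):.1f}, reported ppl = {ppl}")
```

Small differences are expected because the report prints rounded values.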

In [8]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(model_file='modeloSeq2seq/model', task='my_teacher')

08:43:41 | Overriding opt["task"] to my_teacher (previously: cbt)
08:43:41 | Using CUDA
08:43:41 | loading dictionary from modeloSeq2seq/model.dict
08:43:42 | num words = 51210
08:43:42 | Total parameters: 69,232,640 (69,232,640 trainable)
08:43:42 | Loading existing model params from modeloSeq2seq/model
08:43:43 | creating task(s): my_teacher
08:43:43 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
08:43:44 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
08:43:44 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
08:43:44 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
08:43:44 | Opt:
08:43:44 |     adafactor_eps: '[1e-30, 0.001]'
08:43:44 |     adam_eps: 1e-08
08:43:44 |   

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Use a small learning rate for the Adam optimizer
    lr=1e-3,
    optimizer='adam',
    warmup_updates=50,
    # Train for at most 20 min
    max_train_time=1200,
    # Batch size
    batchsize=16,
)

08:47:44 | building dictionary first...
08:47:44 | Overriding opt["max_train_time"] to 1200.0 (previously: 600.0)
08:47:44 | Using CUDA
08:47:44 | loading dictionary from modeloSeq2seq/model.dict
08:47:44 | num words = 51210
08:47:47 | Total parameters: 69,232,640 (69,232,640 trainable)
08:47:47 | Loading existing model params from modeloSeq2seq/model
08:47:48 | Opt:
08:47:48 |     adafactor_eps: '[1e-30, 0.001]'
08:47:48 |     adam_eps: 1e-08
08:47:48 |     add_p1_after_newln: False
08:47:48 |     aggregate_micro: False
08:47:48 |     allow_missing_init_opts: False
08:47:48 |     batchsize: 16
08:47:48 |     beam_block_full_context: True
08:47:48 |     beam_block_list_filename: None
08:47:48 |     beam_block_ngram: -1
08:47:48 |     beam_context_block_ngram: -1
08:47:48 |     beam_delay: 30
08:47:48 |     beam_length_penalty: 0.65
08:47:48 |     beam_min_length: 1
08:47:48 |     beam_size: 1
08:47:48 |     betas: '[0.9, 0.999]'
08:47:48 |     bpe_add_prefix_space: None
08:47:

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/accuracy': ExactMatchMetric(0),
  'cbt:NE/f1': F1Metric(0),
  'cbt:NE/bleu-4': BleuMetric(0),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(7.274),
  'cbt:NE/ppl': PPLMetric(1443),
  'cbt:NE/token_acc': AverageMetric(0.4841),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/accuracy': ExactMatchMetric(0),
  'cbt:CN/f1': F1Metric(0),
  'cbt:CN/bleu-4': BleuMetric(0),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(5.96),
  'cbt:CN/ppl': PPLMetric(387.7),
  'cbt:CN/

In [8]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(model_file='modeloSeq2seq/model', task='my_teacher')

08:57:03 | Overriding opt["task"] to my_teacher (previously: cbt)
08:57:03 | Using CUDA
08:57:03 | loading dictionary from modeloSeq2seq/model.dict
08:57:03 | num words = 51210
08:57:04 | Total parameters: 69,232,640 (69,232,640 trainable)
08:57:04 | Loading existing model params from modeloSeq2seq/model
08:57:05 | creating task(s): my_teacher
08:57:05 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
08:57:05 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
08:57:05 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
08:57:05 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
08:57:06 | Opt:
08:57:06 |     adafactor_eps: '[1e-30, 0.001]'
08:57:06 |     adam_eps: 1e-08
08:57:06 |   

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Use a small learning rate for the Adam optimizer
    lr=1e-3,
    optimizer='adam',
    warmup_updates=50,
    # Train for at most 30 min
    max_train_time=1800,
    # Batch size
    batchsize=16,
)

09:01:16 | building dictionary first...
09:01:16 | Overriding opt["max_train_time"] to 1800.0 (previously: 1200.0)
09:01:16 | Using CUDA
09:01:16 | loading dictionary from modeloSeq2seq/model.dict
09:01:16 | num words = 51210
09:01:19 | Total parameters: 69,232,640 (69,232,640 trainable)
09:01:19 | Loading existing model params from modeloSeq2seq/model
09:01:20 | Opt:
09:01:20 |     adafactor_eps: '[1e-30, 0.001]'
09:01:20 |     adam_eps: 1e-08
09:01:20 |     add_p1_after_newln: False
09:01:20 |     aggregate_micro: False
09:01:20 |     allow_missing_init_opts: False
09:01:20 |     batchsize: 16
09:01:20 |     beam_block_full_context: True
09:01:20 |     beam_block_list_filename: None
09:01:20 |     beam_block_ngram: -1
09:01:20 |     beam_context_block_ngram: -1
09:01:20 |     beam_delay: 30
09:01:20 |     beam_length_penalty: 0.65
09:01:20 |     beam_min_length: 1
09:01:20 |     beam_size: 1
09:01:20 |     betas: '[0.9, 0.999]'
09:01:20 |     bpe_add_prefix_space: None
09:01

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/accuracy': ExactMatchMetric(0),
  'cbt:NE/f1': F1Metric(0),
  'cbt:NE/bleu-4': BleuMetric(0),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(7.331),
  'cbt:NE/ppl': PPLMetric(1527),
  'cbt:NE/token_acc': AverageMetric(0.488),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/accuracy': ExactMatchMetric(0),
  'cbt:CN/f1': F1Metric(0),
  'cbt:CN/bleu-4': BleuMetric(0),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(6.005),
  'cbt:CN/ppl': PPLMetric(405.6),
  'cbt:CN/

In [8]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(model_file='modeloSeq2seq/model', task='my_teacher')

09:15:11 | Overriding opt["task"] to my_teacher (previously: cbt)
09:15:11 | Using CUDA
09:15:12 | loading dictionary from modeloSeq2seq/model.dict
09:15:12 | num words = 51210
09:15:12 | Total parameters: 69,232,640 (69,232,640 trainable)
09:15:12 | Loading existing model params from modeloSeq2seq/model
09:15:14 | creating task(s): my_teacher
09:15:14 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
09:15:14 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
09:15:14 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
09:15:14 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
09:15:14 | Opt:
09:15:14 |     adafactor_eps: '[1e-30, 0.001]'
09:15:14 |     adam_eps: 1e-08
09:15:14 |   

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Use a small learning rate for the Adam optimizer
    lr=1e-3,
    optimizer='adam',
    warmup_updates=50,
    # Train for at most 40 min
    max_train_time=2400,
    # Batch size
    batchsize=16,
)

09:17:56 | building dictionary first...
09:17:56 | Overriding opt["max_train_time"] to 2400.0 (previously: 1800.0)
09:17:56 | Using CUDA
09:17:56 | loading dictionary from modeloSeq2seq/model.dict
09:17:56 | num words = 51210
09:17:59 | Total parameters: 69,232,640 (69,232,640 trainable)
09:17:59 | Loading existing model params from modeloSeq2seq/model
09:18:00 | Opt:
09:18:00 |     adafactor_eps: '[1e-30, 0.001]'
09:18:00 |     adam_eps: 1e-08
09:18:00 |     add_p1_after_newln: False
09:18:00 |     aggregate_micro: False
09:18:00 |     allow_missing_init_opts: False
09:18:00 |     batchsize: 16
09:18:00 |     beam_block_full_context: True
09:18:00 |     beam_block_list_filename: None
09:18:00 |     beam_block_ngram: -1
09:18:00 |     beam_context_block_ngram: -1
09:18:00 |     beam_delay: 30
09:18:00 |     beam_length_penalty: 0.65
09:18:00 |     beam_min_length: 1
09:18:00 |     beam_size: 1
09:18:00 |     betas: '[0.9, 0.999]'
09:18:00 |     bpe_add_prefix_space: None
09:18

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/accuracy': ExactMatchMetric(0),
  'cbt:NE/f1': F1Metric(0),
  'cbt:NE/bleu-4': BleuMetric(0),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(7.458),
  'cbt:NE/ppl': PPLMetric(1735),
  'cbt:NE/token_acc': AverageMetric(0.4885),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/accuracy': ExactMatchMetric(0),
  'cbt:CN/f1': F1Metric(0),
  'cbt:CN/bleu-4': BleuMetric(0),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(6.057),
  'cbt:CN/ppl': PPLMetric(427.1),
  'cbt:CN
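
Across the four reports so far (`max_train_time` of 600, 1200, 1800 and 2400 seconds, each run reloading the saved `modeloSeq2seq/model`), the validation loss goes up rather than down. A small sketch that tabulates the change between consecutive runs, using only the `cbt:NE/loss` and `cbt:CN/loss` values printed above; the helper `deltas` is ours. A steadily rising validation loss may point to overfitting or to a learning rate that is too high, but these logs alone do not settle which:

```python
# (max_train_time in seconds, cbt:NE/loss, cbt:CN/loss) copied from the four reports above
runs = [
    (600, 7.217, 5.938),
    (1200, 7.274, 5.96),
    (1800, 7.331, 6.005),
    (2400, 7.458, 6.057),
]


def deltas(values):
    """Differences between consecutive values, rounded to 3 decimals."""
    return [round(b - a, 3) for a, b in zip(values, values[1:])]


ne_deltas = deltas([ne for _, ne, _ in runs])
cn_deltas = deltas([cn for _, _, cn in runs])

print("NE loss change per run:", ne_deltas)
print("CN loss change per run:", cn_deltas)
print("loss rose in every run:", all(d > 0 for d in ne_deltas + cn_deltas))
```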

In [8]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(model_file='modeloSeq2seq/model', task='my_teacher')

09:28:04 | Overriding opt["task"] to my_teacher (previously: cbt)
09:28:04 | Using CUDA
09:28:04 | loading dictionary from modeloSeq2seq/model.dict
09:28:04 | num words = 51210
09:28:05 | Total parameters: 69,232,640 (69,232,640 trainable)
09:28:05 | Loading existing model params from modeloSeq2seq/model
09:28:06 | creating task(s): my_teacher
09:28:06 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
09:28:06 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
09:28:06 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
09:28:06 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
09:28:06 | Opt:
09:28:06 |     adafactor_eps: '[1e-30, 0.001]'
09:28:06 |     adam_eps: 1e-08
09:28:06 |   

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Use a small learning rate for the Adam optimizer
    lr=1e-3,
    optimizer='adam',
    warmup_updates=100,
    # Train for at most 40 min
    max_train_time=2400,
    # Batch size
    batchsize=16,
)

09:35:35 | building dictionary first...
09:35:35 | Overriding opt["warmup_updates"] to 100 (previously: 50)
09:35:35 | Using CUDA
09:35:35 | loading dictionary from modeloSeq2seq/model.dict
09:35:35 | num words = 51210
09:35:38 | Total parameters: 69,232,640 (69,232,640 trainable)
09:35:38 | Loading existing model params from modeloSeq2seq/model
09:35:40 | Opt:
09:35:40 |     adafactor_eps: '[1e-30, 0.001]'
09:35:40 |     adam_eps: 1e-08
09:35:40 |     add_p1_after_newln: False
09:35:40 |     aggregate_micro: False
09:35:40 |     allow_missing_init_opts: False
09:35:40 |     batchsize: 16
09:35:40 |     beam_block_full_context: True
09:35:40 |     beam_block_list_filename: None
09:35:40 |     beam_block_ngram: -1
09:35:40 |     beam_context_block_ngram: -1
09:35:40 |     beam_delay: 30
09:35:40 |     beam_length_penalty: 0.65
09:35:40 |     beam_min_length: 1
09:35:40 |     beam_size: 1
09:35:40 |     betas: '[0.9, 0.999]'
09:35:40 |     bpe_add_prefix_space: None
09:35:40 |  

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 5 s
    validation_every_n_secs=5,
    # Use a small learning rate for the Adam optimizer
    lr=1e-3,
    optimizer='adam',
    warmup_updates=50,
    # Train for at most 40 min
    max_train_time=2400,
    # Batch size
    batchsize=16,
)

09:40:17 | building dictionary first...
09:40:17 | Overriding opt["validation_every_n_secs"] to 5.0 (previously: 10.0)
09:40:17 | Overriding opt["warmup_updates"] to 50 (previously: 100)
09:40:17 | Using CUDA
09:40:17 | loading dictionary from modeloSeq2seq/model.dict
09:40:17 | num words = 51210
09:40:20 | Total parameters: 69,232,640 (69,232,640 trainable)
09:40:20 | Loading existing model params from modeloSeq2seq/model
09:40:21 | Opt:
09:40:21 |     adafactor_eps: '[1e-30, 0.001]'
09:40:21 |     adam_eps: 1e-08
09:40:21 |     add_p1_after_newln: False
09:40:21 |     aggregate_micro: False
09:40:21 |     allow_missing_init_opts: False
09:40:21 |     batchsize: 16
09:40:21 |     beam_block_full_context: True
09:40:21 |     beam_block_list_filename: None
09:40:21 |     beam_block_ngram: -1
09:40:21 |     beam_context_block_ngram: -1
09:40:21 |     beam_delay: 30
09:40:21 |     beam_length_penalty: 0.65
09:40:21 |     beam_min_length: 1
09:40:21 |     beam_size: 1
09:

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/accuracy': ExactMatchMetric(0),
  'cbt:NE/f1': F1Metric(0),
  'cbt:NE/bleu-4': BleuMetric(0),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(7.456),
  'cbt:NE/ppl': PPLMetric(1731),
  'cbt:NE/token_acc': AverageMetric(0.4885),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/accuracy': ExactMatchMetric(0),
  'cbt:CN/f1': F1Metric(0),
  'cbt:CN/bleu-4': BleuMetric(0),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(6.055),
  'cbt:CN/ppl': PPLMetric(426.3),
  'cbt:CN

In [8]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(model_file='modeloSeq2seq/model', task='my_teacher')

09:49:19 | Overriding opt["task"] to my_teacher (previously: cbt)
09:49:19 | Using CUDA
09:49:19 | loading dictionary from modeloSeq2seq/model.dict
09:49:19 | num words = 51210
09:49:20 | Total parameters: 69,232,640 (69,232,640 trainable)
09:49:20 | Loading existing model params from modeloSeq2seq/model
09:49:21 | creating task(s): my_teacher
09:49:21 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
09:49:21 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
09:49:21 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
09:49:22 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
09:49:22 | Opt:
09:49:22 |     adafactor_eps: '[1e-30, 0.001]'
09:49:22 |     adam_eps: 1e-08
09:49:22 |   

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Use a small learning rate for the SGD optimizer
    lr=1e-3,
    optimizer='sgd',
    warmup_updates=50,
    # Train for at most 40 min
    max_train_time=2400,
    # Batch size
    batchsize=16,
)

10:05:13 | building dictionary first...
10:05:13 | Overriding opt["validation_every_n_secs"] to 10.0 (previously: 5.0)
10:05:13 | Overriding opt["optimizer"] to sgd (previously: adam)
10:05:13 | Using CUDA
10:05:13 | loading dictionary from modeloSeq2seq/model.dict
10:05:13 | num words = 51210
10:05:16 | Total parameters: 69,232,640 (69,232,640 trainable)
10:05:16 | Loading existing model params from modeloSeq2seq/model
10:05:17 | Not loading optim state since optim class changed.
10:05:17 | Optimizer was reset. Also resetting LR scheduler.
10:05:17 | Opt:
10:05:17 |     adafactor_eps: '[1e-30, 0.001]'
10:05:17 |     adam_eps: 1e-08
10:05:17 |     add_p1_after_newln: False
10:05:17 |     aggregate_micro: False
10:05:17 |     allow_missing_init_opts: False
10:05:17 |     batchsize: 16
10:05:17 |     beam_block_full_context: True
10:05:17 |     beam_block_list_filename: None
10:05:17 |     beam_block_ngram: -1
10:05:17 |     beam_context_block_ngram: -

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/accuracy': ExactMatchMetric(0),
  'cbt:NE/f1': F1Metric(0),
  'cbt:NE/bleu-4': BleuMetric(0),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(7.456),
  'cbt:NE/ppl': PPLMetric(1731),
  'cbt:NE/token_acc': AverageMetric(0.4885),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/accuracy': ExactMatchMetric(0),
  'cbt:CN/f1': F1Metric(0),
  'cbt:CN/bleu-4': BleuMetric(0),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(6.055),
  'cbt:CN/ppl': PPLMetric(426.3),
  'cbt:CN

In [8]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(model_file='modeloSeq2seq/model', task='my_teacher')

10:14:05 | Overriding opt["task"] to my_teacher (previously: cbt)
10:14:05 | Using CUDA
10:14:05 | loading dictionary from modeloSeq2seq/model.dict
10:14:06 | num words = 51210
10:14:06 | Total parameters: 69,232,640 (69,232,640 trainable)
10:14:06 | Loading existing model params from modeloSeq2seq/model
10:14:07 | creating task(s): my_teacher
10:14:07 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
10:14:07 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
10:14:07 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
10:14:07 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
10:14:07 | Opt:
10:14:07 |     adafactor_eps: '[1e-30, 0.001]'
10:14:07 |     adam_eps: 1e-08
10:14:07 |   

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Use a small learning rate for the Adam optimizer
    lr=1e-3,
    optimizer='adam',
    warmup_updates=50,
    # Train for at most 80 min
    max_train_time=4800,
    # Batch size
    batchsize=8,
)

10:21:04 | building dictionary first...
10:21:04 | Overriding opt["optimizer"] to adam (previously: sgd)
10:21:04 | Overriding opt["max_train_time"] to 4800.0 (previously: 2400.0)
10:21:04 | Overriding opt["batchsize"] to 8 (previously: 16)
10:21:04 | Using CUDA
10:21:04 | loading dictionary from modeloSeq2seq/model.dict
10:21:04 | num words = 51210
10:21:07 | Total parameters: 69,232,640 (69,232,640 trainable)
10:21:07 | Loading existing model params from modeloSeq2seq/model
10:21:07 | Not loading optim state since optim class changed.
10:21:07 | Optimizer was reset. Also resetting LR scheduler.
10:21:07 | Opt:
10:21:07 |     adafactor_eps: '[1e-30, 0.001]'
10:21:07 |     adam_eps: 1e-08
10:21:07 |     add_p1_after_newln: False
10:21:07 |     aggregate_micro: False
10:21:07 |     allow_missing_init_opts: False
10:21:07 |     batchsize: 8
10:21:07 |     beam_block_full_context: True
10:21:07 |     beam_block_list_filename: None
10:21:07 |   

RuntimeError: CUDA out of memory. Tried to allocate 56.00 MiB (GPU 0; 4.00 GiB total capacity; 2.39 GiB already allocated; 0 bytes free; 2.51 GiB reserved in total by PyTorch)
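
The out-of-memory message itself carries the figures needed to see where the 4 GiB went. A rough sketch that extracts them with regular expressions; the helper `gib` is ours, and the reading of the numbers is an assumption. PyTorch had reserved 2.51 GiB, yet the GPU reports 0 bytes free, so the remainder was presumably held outside this process's caching allocator (for example by the CUDA context, the display, or another process). The same cell, rerun below, completed.

```python
import re

# Error message copied from the failed run above
message = (
    "CUDA out of memory. Tried to allocate 56.00 MiB (GPU 0; 4.00 GiB total capacity; "
    "2.39 GiB already allocated; 0 bytes free; 2.51 GiB reserved in total by PyTorch)"
)


def gib(pattern: str, text: str) -> float:
    """Return the GiB figure captured by the first group of `pattern`."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return float(match.group(1))


total = gib(r"([\d.]+) GiB total capacity", message)
allocated = gib(r"([\d.]+) GiB already allocated", message)
reserved = gib(r"([\d.]+) GiB reserved", message)

print(f"reserved by PyTorch but not allocated: {reserved - allocated:.2f} GiB")
print(f"not reserved by PyTorch: {total - reserved:.2f} GiB")
```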

In [None]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(model_file='modeloSeq2seq/model', task='my_teacher')

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Use a small learning rate for the Adam optimizer
    lr=1e-3,
    optimizer='adam',
    warmup_updates=50,
    # Train for at most 80 min
    max_train_time=4800,
    # Batch size
    batchsize=8,
)

11:03:20 | building dictionary first...
11:03:20 | Using CUDA
11:03:20 | loading dictionary from modeloSeq2seq/model.dict
11:03:20 | num words = 51210
11:03:26 | Total parameters: 69,232,640 (69,232,640 trainable)
11:03:26 | Loading existing model params from modeloSeq2seq/model
11:03:29 | Opt:
11:03:29 |     adafactor_eps: '[1e-30, 0.001]'
11:03:29 |     adam_eps: 1e-08
11:03:29 |     add_p1_after_newln: False
11:03:29 |     aggregate_micro: False
11:03:29 |     allow_missing_init_opts: False
11:03:29 |     batchsize: 8
11:03:29 |     beam_block_full_context: True
11:03:29 |     beam_block_list_filename: None
11:03:29 |     beam_block_ngram: -1
11:03:29 |     beam_context_block_ngram: -1
11:03:29 |     beam_delay: 30
11:03:29 |     beam_length_penalty: 0.65
11:03:29 |     beam_min_length: 1
11:03:29 |     beam_size: 1
11:03:29 |     betas: '[0.9, 0.999]'
11:03:29 |     bpe_add_prefix_space: None
11:03:29 |     bpe_debug: False
11:03:29 |     bpe_dropout: None
11:03:29 |     bpe_merge:

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/accuracy': ExactMatchMetric(0),
  'cbt:NE/f1': F1Metric(0),
  'cbt:NE/bleu-4': BleuMetric(0),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(8.059),
  'cbt:NE/ppl': PPLMetric(3161),
  'cbt:NE/token_acc': AverageMetric(0.4668),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/accuracy': ExactMatchMetric(0),
  'cbt:CN/f1': F1Metric(0.0003333),
  'cbt:CN/bleu-4': BleuMetric(1.839e-13),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(5.867),
  'cbt:CN/ppl': PPLMetric(3

In [8]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(model_file='modeloSeq2seq/model', task='my_teacher')

11:14:30 | Overriding opt["task"] to my_teacher (previously: cbt)
11:14:30 | Using CUDA
11:14:30 | loading dictionary from modeloSeq2seq/model.dict
11:14:31 | num words = 51210
11:14:31 | Total parameters: 69,232,640 (69,232,640 trainable)
11:14:31 | Loading existing model params from modeloSeq2seq/model
11:14:32 | creating task(s): my_teacher
11:14:32 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
11:14:32 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
11:14:32 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
11:14:33 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
11:14:33 | Opt:
11:14:33 |     adafactor_eps: '[1e-30, 0.001]'
11:14:33 |     adam_eps: 1e-08
11:14:33 |   

In [9]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(
    task='cbt',
    model_file='modeloSeq2seq/model',
    num_examples=40,
)

11:16:42 | Using CUDA
11:16:42 | loading dictionary from modeloSeq2seq/model.dict
11:16:42 | num words = 51210
11:16:43 | Total parameters: 69,232,640 (69,232,640 trainable)
11:16:43 | Loading existing model params from modeloSeq2seq/model
11:16:44 | creating task(s): cbt
11:16:44 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
11:16:44 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
11:16:44 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
11:16:44 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
11:16:44 | Opt:
11:16:44 |     adafactor_eps: '[1e-30, 0.001]'
11:16:44 |     adam_eps: 1e-08
11:16:44 |     add_p1_after_newln: False
11:16:44 |     aggregate_micro: False
11:16:44 |     a

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Raise the Adam learning rate from 1e-3 to 5e-3
    lr=5e-3,
    optimizer='adam',
    warmup_updates=50,
    # Train for at most 80 min
    max_train_time=4800,
    # Batch size
    batchsize=8,
)

11:22:52 | building dictionary first...
11:22:52 | Overriding opt["learningrate"] to 0.005 (previously: 0.001)
11:22:52 | Using CUDA
11:22:52 | loading dictionary from modeloSeq2seq/model.dict
11:22:52 | num words = 51210
11:22:55 | Total parameters: 69,232,640 (69,232,640 trainable)
11:22:55 | Loading existing model params from modeloSeq2seq/model
11:22:56 | Opt:
11:22:56 |     adafactor_eps: '[1e-30, 0.001]'
11:22:56 |     adam_eps: 1e-08
11:22:56 |     add_p1_after_newln: False
11:22:56 |     aggregate_micro: False
11:22:56 |     allow_missing_init_opts: False
11:22:56 |     batchsize: 8
11:22:56 |     beam_block_full_context: True
11:22:56 |     beam_block_list_filename: None
11:22:56 |     beam_block_ngram: -1
11:22:56 |     beam_context_block_ngram: -1
11:22:56 |     beam_delay: 30
11:22:56 |     beam_length_penalty: 0.65
11:22:56 |     beam_min_length: 1
11:22:56 |     beam_size: 1
11:22:56 |     betas: '[0.9, 0.999]'
11:22:56 |     bpe_add_prefix_space: None
11:22:56 |

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/accuracy': ExactMatchMetric(0),
  'cbt:NE/f1': F1Metric(0),
  'cbt:NE/bleu-4': BleuMetric(0),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(8.261),
  'cbt:NE/ppl': PPLMetric(3870),
  'cbt:NE/token_acc': AverageMetric(0.3989),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/accuracy': ExactMatchMetric(0),
  'cbt:CN/f1': F1Metric(0.0003333),
  'cbt:CN/bleu-4': BleuMetric(1.839e-13),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(5.837),
  'cbt:CN/ppl': PPLMetric(3

In [9]:
from parlai.scripts.display_model import DisplayModel
DisplayModel.main(
    task='my_teacher',
    model_file='modeloSeq2seq/model',
    num_examples=40,
    skip_generation=False,
)

11:37:05 | Overriding opt["task"] to my_teacher (previously: cbt)
11:37:05 | Using CUDA
11:37:05 | loading dictionary from modeloSeq2seq/model.dict
11:37:05 | num words = 51210
11:37:06 | Total parameters: 69,232,640 (69,232,640 trainable)
11:37:06 | Loading existing model params from modeloSeq2seq/model
11:37:07 | creating task(s): my_teacher
11:37:07 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
11:37:07 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
11:37:07 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
11:37:08 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
11:37:08 | Opt:
11:37:08 |     adafactor_eps: '[1e-30, 0.001]'
11:37:08 |     adam_eps: 1e-08
11:37:08 |   

In [7]:
from parlai.scripts.train_model import TrainModel
from parlai.core.agents import create_agent

TrainModel.main(
    model='modeloSeq2seq',
    model_file='modeloSeq2seq/model',
    task='cbt',
    # Validate every 10 s
    validation_every_n_secs=10,
    # Use a small learning rate for the Adam optimizer
    lr=1e-3,
    optimizer='adam',
    warmup_updates=50,
    # Train for at most 80 min
    max_train_time=4800,
    # Batch size
    batchsize=8,
    # Skip generation to make validation faster
    skip_generation=True,
)

11:41:04 | building dictionary first...
11:41:04 | Overriding opt["learningrate"] to 0.001 (previously: 0.005)
11:41:04 | Overriding opt["skip_generation"] to True (previously: False)
11:41:04 | Using CUDA
11:41:04 | loading dictionary from modeloSeq2seq/model.dict
11:41:04 | num words = 51210
11:41:07 | Total parameters: 69,232,640 (69,232,640 trainable)
11:41:07 | Loading existing model params from modeloSeq2seq/model
11:41:09 | Opt:
11:41:09 |     adafactor_eps: '[1e-30, 0.001]'
11:41:09 |     adam_eps: 1e-08
11:41:09 |     add_p1_after_newln: False
11:41:09 |     aggregate_micro: False
11:41:09 |     allow_missing_init_opts: False
11:41:09 |     batchsize: 8
11:41:09 |     beam_block_full_context: True
11:41:09 |     beam_block_list_filename: None
11:41:09 |     beam_block_ngram: -1
11:41:09 |     beam_context_block_ngram: -1
11:41:09 |     beam_delay: 30
11:41:09 |     beam_length_penalty: 0.65
11:41:09 |     beam_min_length: 1
11:41:09 |     beam_size: 1
11:41:0

({'cbt:NE/exs': SumMetric(2000),
  'exs': SumMetric(8000),
  'cbt:NE/clen': AverageMetric(489),
  'cbt:NE/ctrunc': AverageMetric(0),
  'cbt:NE/ctrunclen': AverageMetric(0),
  'cbt:NE/llen': AverageMetric(2.047),
  'cbt:NE/ltrunc': AverageMetric(0),
  'cbt:NE/ltrunclen': AverageMetric(0),
  'cbt:NE/loss': AverageMetric(8.484),
  'cbt:NE/ppl': PPLMetric(4839),
  'cbt:NE/token_acc': AverageMetric(0.3915),
  'cbt:NE/token_em': AverageMetric(0),
  'cbt:CN/exs': SumMetric(2000),
  'cbt:CN/clen': AverageMetric(522.1),
  'cbt:CN/ctrunc': AverageMetric(0),
  'cbt:CN/ctrunclen': AverageMetric(0),
  'cbt:CN/llen': AverageMetric(2.004),
  'cbt:CN/ltrunc': AverageMetric(0),
  'cbt:CN/ltrunclen': AverageMetric(0),
  'cbt:CN/loss': AverageMetric(5.817),
  'cbt:CN/ppl': PPLMetric(336),
  'cbt:CN/token_acc': AverageMetric(0.491),
  'cbt:CN/token_em': AverageMetric(0),
  'cbt:V/exs': SumMetric(2000),
  'cbt:V/clen': AverageMetric(509.2),
  'cbt:V/ctrunc': AverageMetric(0),
  'cbt:V/ctrunclen': AverageMe
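
In the report above, each `ppl` value matches the exponential of the corresponding `loss` value, which is the usual definition of perplexity for a mean per-token cross-entropy measured in nats. The short check below is only a sketch: the numbers are copied by hand from the rounded output, and the helper `perplexity` is a hypothetical name introduced here, not part of ParlAI.

```python
import math

# Rounded validation metrics copied by hand from the report above.
reported = {
    "cbt:NE": {"loss": 8.484, "ppl": 4839},
    "cbt:CN": {"loss": 5.817, "ppl": 336},
}


def perplexity(loss: float) -> float:
    """Hypothetical helper: perplexity as exp of the mean per-token loss (nats)."""
    return math.exp(loss)


for task, metrics in reported.items():
    estimate = perplexity(metrics["loss"])
    # Small gaps are expected because the printed loss is rounded to 4 digits.
    print(f"{task}: exp({metrics['loss']}) = {estimate:.1f} "
          f"(reported ppl: {metrics['ppl']})")
```

Because perplexity grows exponentially with the loss, the gap between the NE loss (8.484) and the CN loss (5.817) turns into a perplexity more than ten times larger for NE.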

In [12]:
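from parlai.scripts.display_model import DisplayModel

# Show the retrained model's predictions on 40 examples from the CBT task.
DisplayModel.main(
    task='cbt',
    model_file='modeloSeq2seq/model',
    num_examples=40,
    # Generation was skipped during training; turn it back on to see the outputs
    skip_generation=False,
)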

12:06:08 | Overriding opt["skip_generation"] to False (previously: True)
12:06:08 | Using CUDA
12:06:08 | loading dictionary from modeloSeq2seq/model.dict
12:06:08 | num words = 51210
12:06:08 | Total parameters: 69,232,640 (69,232,640 trainable)
12:06:08 | Loading existing model params from modeloSeq2seq/model
12:06:10 | creating task(s): cbt
12:06:10 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_NE_valid_2000ex.txt
12:06:10 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_CN_valid_2000ex.txt
12:06:10 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_V_valid_2000ex.txt
12:06:10 | loading fbdialog data: C:\Users\hp\anaconda3\envs\pytorch\lib\site-packages\data\CBT\CBTest\data\cbtest_P_valid_2000ex.txt
12:06:10 | Opt:
12:06:10 |     adafactor_eps: '[1e-30, 0.001]'
12:06:10 |     adam_eps: 1e-08
12:06:10 |   