# Training Emely

This notebook trains Emely with different configurations.
The `blender_opts` dictionary holds the standard options; each run overrides only the settings it needs.

### Configuration

#### Base Config
We call the base configuration "Blender base config": the Blender 90M model fine-tuned on the internal and external tasks.

### Required config for a run

- task
- multitask_weights
- model_file

### Optional config
- mutators
- lr
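
The split between required and optional settings can be checked before a run starts. The following is a minimal sketch, not the notebook's own code: `REQUIRED_KEYS` and `build_opts` are hypothetical names, and `base` is a small stand-in for the `blender_opts` dictionary.

```python
from copy import deepcopy

# Hypothetical helper names; the keys mirror the "Required config" list above.
REQUIRED_KEYS = ('task', 'multitask_weights', 'model_file')


def build_opts(base_opts, run_opts):
    """Return a copy of base_opts updated with run_opts, checking required keys."""
    missing = [key for key in REQUIRED_KEYS if key not in run_opts]
    if missing:
        raise ValueError(f'missing required run options: {missing}')
    opts = deepcopy(base_opts)  # leave the shared defaults untouched
    opts.update(run_opts)       # optional keys such as lr override the defaults
    return opts


# Stand-in for blender_opts, reduced to two entries for illustration.
base = {'lr': 1e-06, 'bs': 16}
opts = build_opts(base, {'task': 'internal,external',
                         'multitask_weights': '6,3',
                         'model_file': '/tmp/example/model',  # placeholder path
                         'lr': 5e-06})
print(opts['lr'], base['lr'])
```

Copying the defaults first keeps one run's overrides from leaking into the next run.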


### Different mutators for different tasks?
`--task internal:mutators=word_shuffle,internal:mutators=last_turn`
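
A task string of this shape can be assembled from pairs rather than typed by hand. This is a sketch under the assumption that the `task:mutators=name` form above is accepted as written; `task_with_mutators` is a hypothetical helper, and pairing `external` with `last_turn` is only an illustration.

```python
def task_with_mutators(pairs):
    """Join (task, mutator) pairs into one comma-separated task string.

    A mutator of None leaves the task name unchanged.
    """
    parts = []
    for task, mutator in pairs:
        if mutator is None:
            parts.append(task)
        else:
            parts.append(f'{task}:mutators={mutator}')
    return ','.join(parts)


task_string = task_with_mutators([('internal', 'word_shuffle'),
                                  ('external', 'last_turn')])
print(task_string)
```

Because the commas separate tasks, the matching `multitask_weights` string needs one weight per entry in the list.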


### Evaluation
All models are evaluated on the internal and external tasks (the `evaltask` option in `blender_opts`).
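
The validation metric is perplexity (`ppl`), the exponential of the mean per-token loss. The sketch below recomputes it from the losses in the first `valid:` report of the base run in this notebook. Treating the `all` row as the unweighted mean over the two tasks is an assumption that happens to match the logged numbers, not something the log states.

```python
import math

# Validation losses copied from the first valid report of the base run.
losses = {'external': 3.042, 'internal': 2.492}

# Perplexity per task: exp(loss).
ppl = {task: math.exp(loss) for task, loss in losses.items()}

# Assumption: the "all" row is the plain mean over tasks, not weighted by example count.
all_loss = sum(losses.values()) / len(losses)
all_ppl = sum(ppl.values()) / len(ppl)

for task, value in ppl.items():
    print(f'{task}: ppl={value:.2f}')
print(f'all: loss={all_loss:.3f} ppl={all_ppl:.2f}')
```

The log reports 20.96 and 12.09 for the two tasks and 2.767 and 16.52 for `all`, so the recomputed values agree up to the rounding of the logged losses.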

In [1]:
from parlai.scripts.train_model import TrainModel
from pathlib import Path
from copy import deepcopy
import shutil

In [2]:
# Standard options shared by all runs; short keys are ParlAI's abbreviated flag names.
blender_opts = {
    'init_model': 'zoo:blender/blender_90M/model',
    'dict_file': 'zoo:blender/blender_90M/model.dict',
    'model': 'transformer/generator',
    'embedding_size': 512,
    'n_layers': 8,
    'ffn_size': 2048,
    'dropout': 0.1,
    'n_heads': 16,
    'learn_positional_embeddings': True,
    'n_positions': 512,
    'variant': 'xlm',
    'activation': 'gelu',
    'fp16': True,
    'text_truncate': 512,
    'label_truncate': 128,
    'dict_tokenizer': 'bpe',
    'optimizer': 'adamax',
    'lr_scheduler': 'reduceonplateau',
    'betas': '0.9,0.999',
    'update_freq': 1,
    'attention_dropout': 0.0,
    'relu_dropout': 0.0,
    'dict_lower': True,
    'lr': 1e-06,
    'gradient_clip': 0.1,
    'veps': 0.25,   # validation_every_n_epochs
    'skip_generation': False,
    'vp': 15,       # validation_patience
    'stim': 60,     # save_every_n_secs
    'vme': 20000,   # validation_max_exs
    'bs': 16,       # batchsize
    'vmt': 'ppl',   # validation_metric
    'vmm': 'min',   # validation_metric_mode
    'save_after_valid': True,
    'wblog': True,  # wandb_log
    'wandb_project': 'parlaiemely',
    'tensorboard_log': True,
    'metrics': 'ppl,bleu-4,rouge-L',
    'evaltask': 'internal,external',
    'inference': 'beam',
    'beam_size': 10,
    'beam_min_length': 10,
    'beam_block_ngram': 3,
}

In [3]:
def run_training(tasks, weights, mutators=None):
    """Train one model from blender_opts with the given tasks, weights and optional mutators."""
    # Name used for the model directory (and intended as the run name on wandb)
    name = f'blender-{tasks}-{weights}'
    if mutators is not None:
        name += f'-{mutators}'

    #%env WANDB_NAME=$name
    mf = Path.cwd().parents[1].joinpath(f'models/model-runs/{name}/model')

    # Options specific to this run
    run_opts = {'task': tasks,
                'multitask_weights': weights,
                'model_file': mf.as_posix()}
    if mutators is not None:
        run_opts['mutators'] = mutators

    # Copy the standard opts and update them
    opts = deepcopy(blender_opts)
    opts.update(run_opts)

    TrainModel.main(**opts)
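
Since the notebook compares several configurations, the runs can be listed before any training starts. The sketch below only plans runs and prints their names; `planned_runs` is a hypothetical helper, the name format is copied from `run_training`, and the weight and mutator values are examples rather than a recommended grid.

```python
from itertools import product


def planned_runs(tasks, weight_options, mutator_options):
    """Yield (name, kwargs) for every weights/mutators combination."""
    for weights, mutators in product(weight_options, mutator_options):
        # Same naming scheme as run_training
        name = f'blender-{tasks}-{weights}'
        if mutators is not None:
            name += f'-{mutators}'
        yield name, {'tasks': tasks, 'weights': weights, 'mutators': mutators}


runs = list(planned_runs('internal,external',
                         ['6,3', '5,1'],
                         [None, 'word_shuffle']))
for name, kwargs in runs:
    print(name)
    # run_training(**kwargs)  # uncomment in the notebook to actually train
```

Printing the names first makes it easy to spot two configurations that would write to the same model directory.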

## 0. Test: two runs one after the other

In [None]:
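tasks = 'internal,external'
weights = '5,1'
mutators = 'word_shuffle'

# Remove any previous output of this run; the name must match the one built in run_training
name = f'blender-{tasks}-{weights}-{mutators}'
mf = Path.cwd().parents[1].joinpath(f'models/model-runs/{name}/model')

if mf.parent.exists():
    shutil.rmtree(mf.parent)

run_training(tasks=tasks, weights=weights, mutators=mutators)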

## 1. Blender base config

### Datasets with sampling weights:
- internal - 6
- external - 3
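
Assuming the weights are relative sampling weights, they can be normalised into the share of training examples expected from each task. `sampling_probabilities` is a hypothetical helper written for this illustration.

```python
def sampling_probabilities(tasks, weights):
    """Map comma-separated tasks and weights to normalised sampling probabilities."""
    names = tasks.split(',')
    values = [float(w) for w in weights.split(',')]
    if len(names) != len(values):
        raise ValueError('need exactly one weight per task')
    total = sum(values)
    return {name: value / total for name, value in zip(names, values)}


probs = sampling_probabilities('internal,external', '6,3')
for name, p in probs.items():
    print(f'{name}: {p:.3f}')
```

With weights 6 and 3, about two thirds of the examples should come from `internal`. The first training report of this run shows 159 of 256 examples from `internal`, which is roughly in line with that share.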

### Mutators
None

In [4]:
tasks = 'internal,external'
weights = '6,3'

run_training(tasks=tasks, weights=weights)

18:25:42 | building dictionary first...
18:25:42 | No model with opt yet at: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3/model(.opt)
18:25:42 | your model is being loaded with opts that do not exist in the model you are initializing the weights with: allow_missing_init_opts: False,download_path: None,loglevel: info,dynamic_batching: None,verbose: False,is_debug: False,datapath: /home/alex/ParlaiEmely/ParlAI/data,eval_dynamic_batching: None,num_workers: 0,max_train_steps: -1,log_every_n_steps: 50,validation_every_n_steps: -1,load_from_checkpoint: True,tensorboard_logdir: None,wandb_log: True,wandb_name: None,wandb_project: parlaiemely,wandb_entity: None,mutators: None,n_encoder_layers: -1,n_decoder_layers: -1,model_parallel: False,beam_block_full_context: True,beam_delay: 30,beam_block_list_filename: None,temperature: 1.0,interactive_mode: False,history_reversed: False,history_add_global_end_token: None,special_tok_lst: None,bpe_vocab: None,bpe_merge: Non

18:25:46 |     tensorboard_logdir: None
18:25:46 |     text_truncate: 512
18:25:46 |     topk: 10
18:25:46 |     topp: 0.9
18:25:46 |     truncate: -1
18:25:46 |     update_freq: 1
18:25:46 |     use_reply: label
18:25:46 |     validation_cutoff: 1.0
18:25:46 |     validation_every_n_epochs: 0.25
18:25:46 |     validation_every_n_secs: -1
18:25:46 |     validation_every_n_steps: -1
18:25:46 |     validation_max_exs: 20000
18:25:46 |     validation_metric: ppl
18:25:46 |     validation_metric_mode: min
18:25:46 |     validation_patience: 15
18:25:46 |     validation_share_agent: False
18:25:46 |     variant: xlm
18:25:46 |     verbose: False
18:25:46 |     wandb_entity: None
18:25:46 |     wandb_log: True
18:25:46 |     wandb_name: None
18:25:46 |     wandb_project: parlaiemely
18:25:46 |     warmup_rate: 0.0001
18:25:46 |     warmup_updates: -1
18:25:46 |     weight_decay: None
18:25:46 | Current ParlAI commit: e3c1edbef397de2c084a521fb0bb81489a432c74
18:25:46 | creating task(s): inter

wandb: W&B API key is configured (use `wandb login --relogin` to force relogin)


18:26:09 | training...
18:26:11 | time:25s total_exs:256 total_steps:16 epochs:0.26
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      55.46     1 884.4  7331       0          0 132.6  256             32768  12.56    .1370 15.49 2.844 1e-06 245.9   
   external 56.23                         0          0         97                                   15.99 3.041               
   internal 54.69                         0          0        159                                   14.99 2.647               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb  tps   ups  
   all       2038       0          0 17.51      .4114         0                   16 1130 9370 8.348  
   external             0          0 20.92      .3830         0                                       
   internal             0          0 14.11      .4398         0

18:26:11 | creating task(s): inter

18:27:16 | running eval: valid
18:27:31 | eval completed in 15.48s
18:27:31 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007787 53.69 635.2 512.6       0          0 11.39  171 .1783   .06922 14.68 2.767 1e-06   190 153.3   
   external         0 .009779 62.36                   0          0         44 .1576          15.89 3.042                     
   internal         0 .005794 45.02                   0          0        127 .1990          13.46 2.492                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.52    .1777      .4361         0                   64 825.2 665.9  
   external       0          0 20.96    .1570      .4049         0                                   
   internal       0          0 12.09    .1984      .4673         0
18:27:31 | saving model check

18:28:37 | running eval: valid
18:28:51 | eval completed in 14.89s
18:28:51 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01258 53.69 635.2 528.4       0          0 11.74  171 .1957   .06921 14.68 2.724 1e-06   190   158   
   external         0  .01791 62.36                   0          0         44 .1775          15.89 3.003                     
   internal         0 .007256 45.02                   0          0        127 .2139          13.46 2.444                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.83    .1923      .4397         0                  128 825.2 686.4  
   external       0          0 20.15    .1774      .4034         0                                   
   internal       0          0 11.52    .2072      .4760         0
18:28:52 | saving model check



18:29:53 | running eval: valid
18:30:07 | eval completed in 13.98s
18:30:07 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .009824 53.69 635.2 562.2       0          0 12.49  171 .2000   .06906 14.68 2.697 1e-06   190 168.1   
   external         0 .009793 62.36                   0          0         44 .1722          15.89 2.979                     
   internal         0 .009855 45.02                   0          0        127 .2278          13.46 2.414                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.43    .1988      .4457         0                  192 825.2 730.3  
   external       0          0 19.67    .1728      .4077         0                                   
   internal       0          0 11.18    .2248      .4836         0
18:30:07 | saving model check

18:31:06 | running eval: valid
18:31:20 | eval completed in 13.51s
18:31:20 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01164 53.69 635.2 583.5       0          0 12.96  171 .1960   .06912 14.68 2.677 1e-06   190 174.5   
   external         0 .009789 62.36                   0          0         44 .1634          15.89 2.961                     
   internal         0   .0135 45.02                   0          0        127 .2286          13.46 2.393                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 15.13    .1975      .4500         0                  256 825.2  758  
   external       0          0 19.31    .1676      .4134         0                                  
   internal       0          0 10.94    .2275      .4865         0
18:31:20 | saving model checkpoi

18:32:19 | running eval: valid
18:32:32 | eval completed in 13.24s
18:32:32 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .009105 53.69 635.2 598.3       0          0 13.29  171 .2062   .06923 14.68 2.661 1e-06   190 178.9   
   external         0 .004705 62.36                   0          0         44 .1690          15.89 2.946                     
   internal         0  .01351 45.02                   0          0        127 .2434          13.46 2.376                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  14.9    .2045      .4543         0                  320 825.2 777.2  
   external       0          0 19.04    .1710      .4192         0                                   
   internal       0          0 10.76    .2380      .4895         0
18:32:32 | saving model check

18:33:30 | running eval: valid
18:33:44 | eval completed in 13.05s
18:33:44 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .009104 53.69 635.2 602.4       0          0 13.38  171 .1993   .06939 14.68 2.649 1e-06   190 180.2   
   external         0 .004705 62.36                   0          0         44 .1598          15.89 2.935                     
   internal         0   .0135 45.02                   0          0        127 .2389          13.46 2.363                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.72    .1947      .4547         0                  384 825.2 782.6  
   external       0          0 18.82    .1569      .4206         0                                   
   internal       0          0 10.62    .2324      .4889         0
18:33:44 | saving model check

18:34:43 | running eval: valid
18:34:56 | eval completed in 13.16s
18:34:56 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .009932 53.69 635.2 602.5       0          0 13.38  171 .2054   .06916 14.68 2.638 1e-06   190 180.2   
   external         0 .004709 62.36                   0          0         44 .1684          15.89 2.924                     
   internal         0  .01516 45.02                   0          0        127 .2425          13.46 2.352                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.57    .2038      .4558         0                  448 825.2 782.7  
   external       0          0 18.62    .1656      .4220         0                                   
   internal       0          0 10.51    .2421      .4895         0
18:34:56 | saving model check

18:35:54 | running eval: valid
18:36:07 | eval completed in 12.87s
18:36:07 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .008659 53.69 635.2 615.2       0          0 13.67  171 .2050   .06917 14.68 2.629 1e-06   190   184   
   external         0 .004709 62.36                   0          0         44 .1716          15.89 2.914                     
   internal         0  .01261 45.02                   0          0        127 .2383          13.46 2.343                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.42    .2026      .4597         0                  512 825.2 799.2  
   external       0          0 18.43    .1685      .4235         0                                   
   internal       0          0 10.41    .2368      .4959         0
18:36:07 | saving model check

18:37:03 | running eval: valid
18:37:16 | eval completed in 12.65s
18:37:16 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .009612 53.69 635.2 620.6       0          0 13.78  171 .2094   .06933 14.68  2.62 1e-06   190 185.6   
   external         0 .004709 62.36                   0          0         44 .1739          15.89 2.904                     
   internal         0  .01451 45.02                   0          0        127 .2450          13.46 2.336                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.29    .2043      .4600         0                  576 825.2 806.2  
   external       0          0 18.25    .1670      .4235         0                                   
   internal       0          0 10.34    .2416      .4965         0
18:37:16 | saving model check

18:38:13 | running eval: valid
18:38:26 | eval completed in 13.31s
18:38:26 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .009738 53.69 635.2   573       0          0 12.73  171 .2155   .06905 14.68 2.613 1e-06   190 171.4   
   external         0 .004959 62.36                   0          0         44 .1807          15.89 2.898                     
   internal         0  .01452 45.02                   0          0        127 .2503          13.46 2.329                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  14.2    .2102      .4611         0                  640 825.2 744.3  
   external       0          0 18.14    .1743      .4263         0                                   
   internal       0          0 10.27    .2462      .4959         0
18:38:26 | saving model check

18:39:22 | running eval: valid
18:39:35 | eval completed in 12.25s
18:39:35 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01373 53.69 635.2 641.3       0          0 14.25  171 .2195   .06902 14.68 2.606 1e-06   190 191.8   
   external         0  .01308 62.36                   0          0         44 .1899          15.89 2.891                     
   internal         0  .01438 45.02                   0          0        127 .2491          13.46 2.321                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.11    .2163      .4620         0                  704 825.2 833.1  
   external       0          0 18.02    .1850      .4263         0                                   
   internal       0          0 10.19    .2477      .4977         0
18:39:35 | saving model check

18:40:30 | running eval: valid
18:40:43 | eval completed in 12.33s
18:40:43 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01558 53.69 635.2 637.2       0          0 14.16  171 .2222   .06906 14.68   2.6 1e-06   190 190.6   
   external         0  .01308 62.36                   0          0         44 .1857          15.89 2.886                     
   internal         0  .01808 45.02                   0          0        127 .2587          13.46 2.315                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.02    .2265      .4621         0                  768 825.2 827.8  
   external       0          0 17.91    .1887      .4278         0                                   
   internal       0          0 10.12    .2643      .4965         0
18:40:43 | saving model check

18:41:39 | running eval: valid
18:41:51 | eval completed in 12.54s
18:41:51 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01741 53.69 635.2 621.2       0          0  13.8  171 .2209   .06911 14.68 2.595 1e-06   190 185.8   
   external         0  .02025 62.36                   0          0         44 .1909          15.89  2.88                     
   internal         0  .01457 45.02                   0          0        127 .2508          13.46  2.31                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 13.94    .2235      .4621         0                  832 825.2  807  
   external       0          0 17.81    .1888      .4278         0                                  
   internal       0          0 10.07    .2583      .4965         0
18:41:51 | saving model checkpoi

18:42:48 | running eval: valid
18:43:01 | eval completed in 12.67s
18:43:01 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01548 53.69 635.2 610.3       0          0 13.56  171 .2207   .06905 14.68  2.59 1e-06   190 182.6   
   external         0  .01639 62.36                   0          0         44 .1849          15.89 2.875                     
   internal         0  .01457 45.02                   0          0        127 .2564          13.46 2.305                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.87    .2228      .4634         0                  896 825.2 792.9  
   external       0          0 17.72    .1817      .4292         0                                   
   internal       0          0 10.02    .2638      .4977         0
18:43:01 | saving model check

18:43:58 | running eval: valid
18:44:10 | eval completed in 12.09s
18:44:10 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01465 53.69 635.2 642.8       0          0 14.28  171 .2193   .06938 14.68 2.584 1e-06   190 192.3   
   external         0  .01639 62.36                   0          0         44 .1867          15.89 2.868                     
   internal         0   .0129 45.02                   0          0        127 .2520          13.46 2.301                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.79    .2201      .4660         0                  960 825.2 835.1  
   external       0          0 17.61    .1828      .4320         0                                   
   internal       0          0  9.98    .2575      .5000         0
18:44:10 | saving model check

18:45:05 | running eval: valid
18:45:17 | eval completed in 12.01s
18:45:17 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01744 53.69 635.2 647.7       0          0 14.39  171 .2236   .06925 14.68 2.581 1e-06   190 193.7   
   external         0  .02217 62.36                   0          0         44 .1928          15.89 2.866                     
   internal         0  .01271 45.02                   0          0        127 .2544          13.46 2.296                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.75    .2235      .4663         0                 1024 825.2 841.4  
   external       0          0 17.57    .1885      .4320         0                                   
   internal       0          0 9.934    .2585      .5006         0
18:45:17 | saving model check

18:46:11 | running eval: valid
18:46:23 | eval completed in 12.00s
18:46:23 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01744 53.69 635.2 648.9       0          0 14.42  171 .2248   .06913 14.68 2.576 1e-06   190 194.1   
   external         0  .02217 62.36                   0          0         44 .1953          15.89 2.861                     
   internal         0  .01271 45.02                   0          0        127 .2543          13.46 2.292                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 13.68    .2241      .4654         0                 1088 825.2  843  
   external       0          0 17.48    .1903      .4320         0                                  
   internal       0          0 9.891    .2579      .4988         0
18:46:23 | saving model checkpoi

18:47:18 | running eval: valid
18:47:30 | eval completed in 11.83s
18:47:30 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01589 53.69 635.2 657.6       0          0 14.61  171 .2254   .06905 14.68 2.572 1e-06   190 196.7   
   external         0  .02217 62.36                   0          0         44 .1981          15.89 2.856                     
   internal         0 .009613 45.02                   0          0        127 .2527          13.46 2.288                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.62    .2254      .4653         0                 1152 825.2 854.3  
   external       0          0 17.38    .1954      .4335         0                                   
   internal       0          0 9.853    .2553      .4971         0
18:47:30 | saving model check

18:48:25 | running eval: valid
18:48:37 | eval completed in 11.88s
18:48:37 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01686 53.69 635.2 658.5       0          0 14.63  171 .2230   .06921 14.68 2.568 1e-06   190 196.9   
   external         0  .02217 62.36                   0          0         44 .1964          15.89 2.851                     
   internal         0  .01154 45.02                   0          0        127 .2496          13.46 2.284                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.57    .2210      .4657         0                 1216 825.2 855.4  
   external       0          0 17.31    .1914      .4349         0                                   
   internal       0          0 9.819    .2507      .4965         0
18:48:37 | saving model check

18:49:31 | running eval: valid
18:49:43 | eval completed in 12.10s
18:49:43 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01841 53.69 635.2 653.7       0          0 14.52  171 .2258   .06913 14.68 2.564 1e-06   190 195.5   
   external         0  .02217 62.36                   0          0         44 .1926          15.89 2.847                     
   internal         0  .01464 45.02                   0          0        127 .2590          13.46 2.281                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.51    .2242      .4640         0                 1280 825.2 849.3  
   external       0          0 17.23    .1872      .4320         0                                   
   internal       0          0 9.786    .2613      .4959         0
18:49:43 | saving model check

18:50:39 | running eval: valid
18:50:51 | eval completed in 12.28s
18:50:52 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01501 53.69 635.2 634.2       0          0 14.09  171 .2247   .06912 14.68 2.561 1e-06   190 189.7   
   external         0  .02217 62.36                   0          0         44 .1929          15.89 2.843                     
   internal         0 .007854 45.02                   0          0        127 .2564          13.46 2.278                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.46    .2244      .4644         0                 1344 825.2 823.9  
   external       0          0 17.17    .1872      .4306         0                                   
   internal       0          0 9.761    .2615      .4982         0
18:50:52 | saving model check



18:51:47 | running eval: valid
18:51:59 | eval completed in 12.22s
18:51:59 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01619 53.69 635.2 636.7       0          0 14.14  171 .2244   .06929 14.68 2.558 1e-06   190 190.4   
   external         0  .02217 62.36                   0          0         44 .2004          15.89 2.841                     
   internal         0   .0102 45.02                   0          0        127 .2483          13.46 2.276                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.43    .2246      .4656         0                 1408 825.2 827.2  
   external       0          0 17.13    .1958      .4335         0                                   
   internal       0          0 9.737    .2535      .4977         0
18:51:59 | saving model check

18:52:52 | running eval: valid
18:53:04 | eval completed in 11.87s
18:53:04 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01707 53.69 635.2 655.8       0          0 14.57  171 .2310   .06913 14.68 2.556 1e-06   190 196.2   
   external         0  .02217 62.36                   0          0         44 .2029          15.89 2.838                     
   internal         0  .01197 45.02                   0          0        127 .2590          13.46 2.273                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0  13.4    .2287      .4676         0                 1472 825.2  852  
   external       0          0 17.09    .1947      .4363         0                                  
   internal       0          0 9.706    .2628      .4988         0
18:53:04 | saving model checkpoi

18:53:59 | running eval: valid
18:54:11 | eval completed in 11.86s
18:54:11 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01862 53.69 635.2 656.8       0          0 14.59  171 .2288   .06938 14.68 2.553 1e-06   190 196.5   
   external         0  .02217 62.36                   0          0         44 .1999          15.89 2.837                     
   internal         0  .01506 45.02                   0          0        127 .2578          13.46  2.27                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.37    .2291      .4660         0                 1536 825.2 853.3  
   external       0          0 17.06    .1970      .4349         0                                   
   internal       0          0 9.681    .2612      .4971         0
18:54:11 | saving model check

18:55:06 | running eval: valid
18:55:18 | eval completed in 11.80s
18:55:18 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01862 53.69 635.2 660.5       0          0 14.67  171 .2270   .06907 14.68  2.55 1e-06   190 197.6   
   external         0  .02217 62.36                   0          0         44 .2041          15.89 2.832                     
   internal         0  .01506 45.02                   0          0        127 .2500          13.46 2.267                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.32    .2272      .4659         0                 1600 825.2 858.1  
   external       0          0 16.99    .1964      .4335         0                                   
   internal       0          0 9.653    .2580      .4982         0
18:55:18 | saving model check

18:56:11 | running eval: valid
18:56:23 | eval completed in 11.97s
18:56:23 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01949 53.69 635.2 648.5       0          0 14.41  171 .2289    .0692 14.68 2.546 1e-06   190   194   
   external         0  .02217 62.36                   0          0         44 .2044          15.89 2.829                     
   internal         0  .01681 45.02                   0          0        127 .2533          13.46 2.264                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.27    .2282      .4656         0                 1664 825.2 842.5  
   external       0          0 16.92    .1964      .4335         0                                   
   internal       0          0 9.624    .2600      .4977         0
18:56:23 | saving model check

18:57:18 | running eval: valid
18:57:29 | eval completed in 11.77s
18:57:29 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01949 53.69 635.2   657       0          0  14.6  171 .2298   .06911 14.68 2.543 1e-06   190 196.5   
   external         0  .02218 62.36                   0          0         44 .2048          15.89 2.824                     
   internal         0  .01681 45.02                   0          0        127 .2547          13.46 2.262                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.22    .2302      .4670         0                 1728 825.2 853.5  
   external       0          0 16.85    .1975      .4335         0                                   
   internal       0          0 9.598    .2628      .5006         0
18:57:29 | saving model check

18:58:24 | running eval: valid
18:58:36 | eval completed in 11.84s
18:58:36 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02044 53.69 635.2 653.8       0          0 14.52  171 .2288   .06913 14.68  2.54 1e-06   190 195.5   
   external         0  .02217 62.36                   0          0         44 .2018          15.89 2.821                     
   internal         0  .01871 45.02                   0          0        127 .2558          13.46  2.26                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.19    .2317      .4673         0                 1792 825.2 849.3  
   external       0          0  16.8    .1970      .4335         0                                   
   internal       0          0 9.581    .2663      .5012         0
18:58:36 | saving model check

18:59:31 | running eval: valid
18:59:42 | eval completed in 11.73s
18:59:42 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01638 53.69 635.2 663.1       0          0 14.73  171 .2281   .06919 14.68 2.537 1e-06   190 198.3   
   external         0  .01405 62.36                   0          0         44 .1976          15.89 2.818                     
   internal         0  .01871 45.02                   0          0        127 .2587          13.46 2.257                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.15    .2288      .4657         0                 1856 825.2 861.4  
   external       0          0 16.74    .1893      .4320         0                                   
   internal       0          0 9.554    .2683      .4994         0
18:59:43 | saving model check

19:00:37 | running eval: valid
19:00:49 | eval completed in 11.77s
19:00:49 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02074 53.69 635.2 656.9       0          0 14.59  171 .2319   .06945 14.68 2.535 1e-06   190 196.5   
   external         0  .02217 62.36                   0          0         44 .2042          15.89 2.816                     
   internal         0   .0193 45.02                   0          0        127 .2596          13.46 2.255                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.12    .2341      .4643         0                 1920 825.2 853.3  
   external       0          0  16.7    .1975      .4292         0                                   
   internal       0          0 9.534    .2706      .4994         0
19:00:49 | saving model check

19:01:43 | running eval: valid
19:01:55 | eval completed in 11.89s
19:01:55 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01675 53.69 635.2 651.3       0          0 14.47  171 .2256    .0689 14.68 2.534 1e-06   190 194.8   
   external         0  .01405 62.36                   0          0         44 .1925          15.89 2.815                     
   internal         0  .01945 45.02                   0          0        127 .2587          13.46 2.253                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  13.1    .2271      .4647         0                 1984 825.2 846.2  
   external       0          0 16.69    .1842      .4306         0                                   
   internal       0          0 9.512    .2700      .4988         0
19:01:55 | saving model check

19:02:50 | running eval: valid
19:03:02 | eval completed in 11.83s
19:03:02 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01734 53.69 635.2 655.8       0          0 14.57  171 .2262   .06906 14.68 2.531 1e-06   190 196.1   
   external         0  .01405 62.36                   0          0         44 .1920          15.89 2.812                     
   internal         0  .02062 45.02                   0          0        127 .2603          13.46  2.25                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.07    .2298      .4642         0                 2048 825.2 851.9  
   external       0          0 16.65    .1888      .4278         0                                   
   internal       0          0 9.487    .2708      .5006         0
19:03:02 | saving model check

19:03:56 | running eval: valid
19:04:08 | eval completed in 11.90s
19:04:08 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .0168 53.69 635.2 650.9       0          0 14.46  171 .2273   .06905 14.68 2.528 1e-06   190 194.7   
   external         0  .01405 62.36                   0          0         44 .1970          15.89 2.809                     
   internal         0  .01956 45.02                   0          0        127 .2577          13.46 2.248                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.03    .2286      .4649         0                 2112 825.2 845.5  
   external       0          0 16.59    .1871      .4292         0                                   
   internal       0          0  9.47    .2700      .5006         0
19:04:08 | saving model check

19:05:02 | running eval: valid
19:05:14 | eval completed in 11.94s
19:05:14 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01681 53.69 635.2 649.7       0          0 14.43  171 .2273   .06906 14.68 2.526 1e-06   190 194.3   
   external         0  .01405 62.36                   0          0         44 .1943          15.89 2.806                     
   internal         0  .01956 45.02                   0          0        127 .2603          13.46 2.246                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 12.99    .2296      .4649         0                 2176 825.2  844  
   external       0          0 16.54    .1870      .4292         0                                  
   internal       0          0 9.448    .2722      .5006         0
19:05:14 | saving model checkpoi

19:06:09 | running eval: valid
19:06:21 | eval completed in 11.86s
19:06:21 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01317 53.69 635.2 652.1       0          0 14.49  171 .2245   .06919 14.68 2.524 1e-06   190   195   
   external         0 .006878 62.36                   0          0         44 .1882          15.89 2.803                     
   internal         0  .01945 45.02                   0          0        127 .2607          13.46 2.245                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.97    .2265      .4661         0                 2240 825.2 847.1  
   external       0          0 16.49    .1803      .4292         0                                   
   internal       0          0 9.438    .2726      .5029         0
19:06:21 | saving model check

19:07:16 | running eval: valid
19:07:28 | eval completed in 11.85s
19:07:28 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01317 53.69 635.2 655.8       0          0 14.57  171 .2241   .06904 14.68 2.522 1e-06   190 196.2   
   external         0 .006878 62.36                   0          0         44 .1885          15.89 2.801                     
   internal         0  .01945 45.02                   0          0        127 .2596          13.46 2.242                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.94    .2278      .4656         0                 2304 825.2 851.9  
   external       0          0 16.46    .1820      .4278         0                                   
   internal       0          0 9.417    .2737      .5035         0
19:07:28 | saving model check

19:08:22 | running eval: valid
19:08:34 | eval completed in 11.89s
19:08:34 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01317 53.69 635.2 654.1       0          0 14.53  171 .2243   .06906 14.68 2.519 1e-06   190 195.6   
   external         0 .006878 62.36                   0          0         44 .1882          15.89 2.797                     
   internal         0  .01945 45.02                   0          0        127 .2604          13.46  2.24                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  12.9    .2291      .4663         0                 2368 825.2 849.8  
   external       0          0  16.4    .1840      .4292         0                                   
   internal       0          0 9.395    .2742      .5035         0
19:08:34 | saving model check

19:09:29 | running eval: valid
19:09:41 | eval completed in 11.93s
19:09:41 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01675 53.69 635.2 649.9       0          0 14.44  171 .2270   .06916 14.68 2.516 1e-06   190 194.4   
   external         0  .01405 62.36                   0          0         44 .1958          15.89 2.794                     
   internal         0  .01945 45.02                   0          0        127 .2582          13.46 2.239                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.86    .2298      .4656         0                 2432 825.2 844.4  
   external       0          0 16.35    .1874      .4278         0                                   
   internal       0          0 9.382    .2722      .5035         0
19:09:41 | saving model check

19:10:34 | running eval: valid
19:10:46 | eval completed in 11.94s
19:10:46 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01675 53.69 635.2 654.1       0          0 14.53  171 .2276   .06945 14.68 2.514 1e-06   190 195.7   
   external         0  .01405 62.36                   0          0         44 .1983          15.89 2.792                     
   internal         0  .01945 45.02                   0          0        127 .2568          13.46 2.236                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.83    .2289      .4662         0                 2496 825.2 849.8  
   external       0          0 16.31    .1867      .4278         0                                   
   internal       0          0 9.358    .2711      .5047         0
19:10:46 | saving model check

19:11:41 | running eval: valid
19:11:53 | eval completed in 11.92s
19:11:53 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01316 53.69 635.2   652       0          0 14.48  171 .2236   .06904 14.68 2.513 1e-06   190   195   
   external         0 .006878 62.36                   0          0         44 .1891          15.89 2.791                     
   internal         0  .01945 45.02                   0          0        127 .2581          13.46 2.235                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.82    .2280      .4672         0                 2560 825.2 847.1  
   external       0          0 16.29    .1865      .4292         0                                   
   internal       0          0 9.343    .2696      .5053         0
19:11:53 | saving model check

19:12:48 | running eval: valid
19:13:00 | eval completed in 11.92s
19:13:00 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01316 53.69 635.2 652.4       0          0 14.49  171 .2185   .06938 14.68 2.511 1e-06   190 195.1   
   external         0 .006876 62.36                   0          0         44 .1809          15.89 2.789                     
   internal         0  .01945 45.02                   0          0        127 .2560          13.46 2.233                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  12.8    .2219      .4675         0                 2624 825.2 847.6  
   external       0          0 16.26    .1751      .4292         0                                   
   internal       0          0 9.331    .2688      .5058         0
19:13:00 | saving model check

19:13:53 | running eval: valid
19:14:05 | eval completed in 11.88s
19:14:05 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01316 53.69 635.2 653.5       0          0 14.52  171 .2238   .06906 14.68  2.51 1e-06   190 195.5   
   external         0 .006876 62.36                   0          0         44 .1891          15.89 2.787                     
   internal         0  .01945 45.02                   0          0        127 .2585          13.46 2.232                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.78    .2276      .4692         0                 2688 825.2 848.9  
   external       0          0 16.23    .1854      .4320         0                                   
   internal       0          0 9.319    .2697      .5064         0
19:14:05 | saving model check

19:15:00 | running eval: valid
19:15:12 | eval completed in 11.88s
19:15:12 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01316 53.69 635.2   651       0          0 14.46  171 .2237   .06924 14.68 2.508 1e-06   190 194.7   
   external         0 .006876 62.36                   0          0         44 .1928          15.89 2.786                     
   internal         0  .01945 45.02                   0          0        127 .2547          13.46  2.23                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.76    .2287      .4704         0                 2752 825.2 845.8  
   external       0          0 16.21    .1898      .4320         0                                   
   internal       0          0 9.304    .2676      .5088         0
19:15:12 | saving model check

19:16:06 | running eval: valid
19:16:18 | eval completed in 11.98s
19:16:18 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01316 53.69 635.2 641.7       0          0 14.25  171 .2278   .06911 14.68 2.506 1e-06   190 191.9   
   external         0 .006876 62.36                   0          0         44 .1983          15.89 2.782                     
   internal         0  .01945 45.02                   0          0        127 .2573          13.46  2.23                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.72    .2314      .4727         0                 2816 825.2 833.6  
   external       0          0 16.15    .1925      .4349         0                                   
   internal       0          0 9.299    .2702      .5105         0
19:16:18 | saving model check

19:17:11 | running eval: valid
19:17:22 | eval completed in 11.90s
19:17:22 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01322 53.69 635.2 646.3       0          0 14.36  171 .2206   .06922 14.68 2.505 1e-06   190 193.3   
   external         0 .006885 62.36                   0          0         44 .1867          15.89 2.781                     
   internal         0  .01956 45.02                   0          0        127 .2546          13.46 2.228                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.71    .2266      .4731         0                 2880 825.2 839.6  
   external       0          0 16.14    .1854      .4363         0                                   
   internal       0          0 9.284    .2678      .5099         0
19:17:23 | saving model check

19:18:17 | running eval: valid
19:18:29 | eval completed in 11.87s
19:18:29 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01134 53.69 635.2 650.7       0          0 14.45  171 .2183   .06912 14.68 2.503 1e-06   190 194.6   
   external         0 .006885 62.36                   0          0         44 .1820          15.89  2.78                     
   internal         0   .0158 45.02                   0          0        127 .2546          13.46 2.226                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.69    .2239      .4737         0                 2944 825.2 845.3  
   external       0          0 16.12    .1806      .4363         0                                   
   internal       0          0 9.264    .2673      .5111         0
19:18:29 | saving model check

19:19:22 | running eval: valid
19:19:34 | eval completed in 11.83s
19:19:34 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01134 53.69 635.2 651.9       0          0 14.48  171 .2214   .06905 14.68 2.502 1e-06   190   195   
   external         0 .006885 62.36                   0          0         44 .1884          15.89 2.778                     
   internal         0   .0158 45.02                   0          0        127 .2544          13.46 2.225                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.67    .2272      .4743         0                 3008 825.2 846.9  
   external       0          0 16.09    .1863      .4363         0                                   
   internal       0          0 9.256    .2682      .5123         0
19:19:34 | saving model check

19:20:28 | running eval: valid
19:20:39 | eval completed in 11.82s
19:20:39 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01581 53.69 635.2 653.8       0          0 14.52  171 .2261   .06927 14.68   2.5 1e-06   190 195.6   
   external         0  .01406 62.36                   0          0         44 .1945          15.89 2.776                     
   internal         0  .01757 45.02                   0          0        127 .2578          13.46 2.224                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.65    .2317      .4717         0                 3072 825.2 849.4  
   external       0          0 16.05    .1913      .4335         0                                   
   internal       0          0 9.244    .2721      .5099         0
19:20:40 | saving model check

19:21:34 | running eval: valid
19:21:46 | eval completed in 11.81s
19:21:46 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01636 53.69 635.2 653.8       0          0 14.52  171 .2273   .06889 14.68 2.498 1e-06   190 195.6   
   external         0  .01406 62.36                   0          0         44 .1924          15.89 2.775                     
   internal         0  .01866 45.02                   0          0        127 .2622          13.46 2.222                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.63    .2322      .4714         0                 3136 825.2 849.3  
   external       0          0 16.04    .1894      .4335         0                                   
   internal       0          0 9.222    .2750      .5094         0
19:21:46 | saving model check

19:22:41 | running eval: valid
19:22:53 | eval completed in 11.86s
19:22:53 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .0127 53.69 635.2 650.8       0          0 14.46  171 .2247   .06902 14.68 2.497 1e-06   190 194.6   
   external         0 .006885 62.36                   0          0         44 .1860          15.89 2.774                     
   internal         0  .01853 45.02                   0          0        127 .2633          13.46 2.221                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.62    .2306      .4730         0                 3200 825.2 845.4  
   external       0          0 16.02    .1844      .4349         0                                   
   internal       0          0 9.215    .2768      .5111         0
19:22:53 | saving model check

19:23:45 | running eval: valid
19:23:57 | eval completed in 11.77s
19:23:57 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01116 53.69 635.2 657.1       0          0  14.6  171 .2226   .06929 14.68 2.497 1e-06   190 196.6   
   external         0 .006885 62.36                   0          0         44 .1865          15.89 2.773                     
   internal         0  .01543 45.02                   0          0        127 .2587          13.46  2.22                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.61    .2289      .4730         0                 3264 825.2 853.7  
   external       0          0 16.01    .1844      .4349         0                                   
   internal       0          0  9.21    .2734      .5111         0
19:23:57 | saving model check

19:24:50 | running eval: valid
19:25:02 | eval completed in 11.99s
19:25:02 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01116 53.69 635.2 644.1       0          0 14.31  171 .2210   .06901 14.68 2.496 1e-06   190 192.7   
   external         0 .006885 62.36                   0          0         44 .1841          15.89 2.773                     
   internal         0  .01543 45.02                   0          0        127 .2579          13.46 2.219                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  12.6    .2277      .4737         0                 3328 825.2 836.7  
   external       0          0    16    .1844      .4363         0                                   
   internal       0          0 9.203    .2710      .5111         0
19:25:02 | saving model check

19:25:57 | running eval: valid
19:26:09 | eval completed in 11.87s
19:26:09 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01474 53.69 635.2   648       0          0 14.39  171 .2223    .0693 14.68 2.494 1e-06   190 193.8   
   external         0  .01406 62.36                   0          0         44 .1852          15.89  2.77                     
   internal         0  .01542 45.02                   0          0        127 .2595          13.46 2.218                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.57    .2276      .4713         0                 3392 825.2 841.8  
   external       0          0 15.95    .1832      .4320         0                                   
   internal       0          0 9.188    .2719      .5105         0
19:26:09 | saving model check

19:27:04 | running eval: valid
19:27:16 | eval completed in 11.90s
19:27:16 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01215 53.69 635.2   650       0          0 14.44  171 .2216   .06952 14.68 2.493 1e-06   190 194.4   
   external         0 .006885 62.36                   0          0         44 .1791          15.89 2.768                     
   internal         0  .01741 45.02                   0          0        127 .2641          13.46 2.217                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.55    .2270      .4729         0                 3456 825.2 844.4  
   external       0          0 15.93    .1782      .4335         0                                   
   internal       0          0 9.178    .2758      .5123         0
19:27:16 | saving model check

19:28:08 | running eval: valid
19:28:20 | eval completed in 11.96s
19:28:20 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01369 53.69 635.2 646.7       0          0 14.37  171 .2252   .06913 14.68 2.492 1e-06   190 193.4   
   external         0 .006885 62.36                   0          0         44 .1858          15.89 2.768                     
   internal         0   .0205 45.02                   0          0        127 .2646          13.46 2.216                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.55    .2305      .4745         0                 3520 825.2 840.1  
   external       0          0 15.93    .1860      .4349         0                                   
   internal       0          0 9.171    .2751      .5140         0
19:28:20 | saving model check

19:29:11 | running eval: valid
19:29:23 | eval completed in 11.83s
19:29:23 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01027 53.69 635.2 654.2       0          0 14.53  171 .2236   .06919 14.68 2.492 1e-06   190 195.7   
   external         0 .006885 62.36                   0          0         44 .1858          15.89 2.767                     
   internal         0  .01365 45.02                   0          0        127 .2613          13.46 2.216                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.54    .2289      .4742         0                 3584 825.2 849.8  
   external       0          0 15.92    .1860      .4349         0                                   
   internal       0          0 9.169    .2718      .5135         0
19:29:23 | saving model check

19:30:16 | running eval: valid
19:30:28 | eval completed in 11.83s
19:30:28 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01027 53.69 635.2 652.4       0          0 14.49  171 .2240   .06904 14.68 2.491 1e-06   190 195.1   
   external         0 .006885 62.36                   0          0         44 .1858          15.89 2.767                     
   internal         0  .01365 45.02                   0          0        127 .2621          13.46 2.215                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.54    .2307      .4746         0                 3648 825.2 847.5  
   external       0          0 15.91    .1860      .4363         0                                   
   internal       0          0 9.164    .2755      .5129         0
19:30:28 | saving model check

19:31:21 | running eval: valid
19:31:33 | eval completed in 11.86s
19:31:33 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01182 53.69 635.2 653.6       0          0 14.52  171 .2248   .06923 14.68 2.491 1e-06   190 195.5   
   external         0 .006885 62.36                   0          0         44 .1858          15.89 2.768                     
   internal         0  .01675 45.02                   0          0        127 .2637          13.46 2.214                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.54    .2311      .4739         0                 3712 825.2 849.1  
   external       0          0 15.92    .1860      .4349         0                                   
   internal       0          0 9.155    .2762      .5129         0
19:31:33 | saving model check

19:32:22 | running eval: valid
19:32:34 | eval completed in 11.92s
19:32:34 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01687 53.69 635.2 651.6       0          0 14.47  171 .2287   .06918 14.68 2.491 5e-07   190 194.9   
   external         0  .01501 62.36                   0          0         44 .1922          15.89 2.767                     
   internal         0  .01874 45.02                   0          0        127 .2653          13.46 2.214                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.53    .2351      .4733         0                 3776 825.2 846.5  
   external       0          0 15.92    .1943      .4349         0                                   
   internal       0          0 9.148    .2760      .5117         0
19:32:34 | saving model check

19:33:24 | running eval: valid
19:33:36 | eval completed in 11.94s
19:33:36 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  ltps  \
   all              0  .01532 53.69 635.2 650.8       0          0 14.46  171 .2275   .06913 14.68  2.49 2.5e-07   190 194.6   
   external         0  .01501 62.36                   0          0         44 .1922          15.89 2.767                       
   internal         0  .01564 45.02                   0          0        127 .2628          13.46 2.213                       
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.53    .2352      .4740         0                 3840 825.2 845.4  
   external       0          0 15.92    .1943      .4363         0                                   
   internal       0          0 9.144    .2762      .5117         0
19:33:36 | saving mod

19:34:29 | running eval: valid
19:34:41 | eval completed in 11.94s
19:34:41 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  ltps  \
   all              0  .01126 53.69 635.2   652       0          0 14.48  171 .2239   .06917 14.68  2.49 2.5e-07   190   195   
   external         0 .006885 62.36                   0          0         44 .1856          15.89 2.767                       
   internal         0  .01564 45.02                   0          0        127 .2621          13.46 2.213                       
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 12.52    .2305      .4740         0                 3904 825.2  847  
   external       0          0 15.91    .1860      .4363         0                                  
   internal       0          0 9.143    .2751      .5117         0
19:34:41 | saving model 

19:35:33 | running eval: valid
19:35:45 | eval completed in 11.92s
19:35:45 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  ltps  \
   all              0  .01126 53.69 635.2   653       0          0 14.51  171 .2244   .06912 14.68  2.49 1.25e-07   190 195.3   
   external         0 .006885 62.36                   0          0         44 .1855          15.89 2.767                        
   internal         0  .01564 45.02                   0          0        127 .2633          13.46 2.213                        
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.52    .2305      .4746         0                 3968 825.2 848.3  
   external       0          0 15.91    .1860      .4363         0                                   
   internal       0          0 9.142    .2751      .5129         0
19:35:45 | saving

19:36:37 | running eval: valid
19:36:49 | eval completed in 11.96s
19:36:49 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  ltps  \
   all              0  .01126 53.69 635.2 650.5       0          0 14.45  171 .2250   .06917 14.68  2.49 6.25e-08   190 194.6   
   external         0 .006885 62.36                   0          0         44 .1874          15.89 2.766                        
   internal         0  .01564 45.02                   0          0        127 .2626          13.46 2.213                        
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.52    .2312      .4746         0                 4032 825.2 845.1  
   external       0          0  15.9    .1860      .4363         0                                   
   internal       0          0 9.142    .2765      .5129         0
19:36:49 | saving

19:37:41 | running eval: valid
19:37:53 | eval completed in 11.94s
19:37:53 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  ltps  \
   all              0  .01532 53.69 635.2   651       0          0 14.46  171 .2280   .06904 14.68  2.49 6.25e-08   190 194.7   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                        
   internal         0  .01564 45.02                   0          0        127 .2640          13.46 2.213                        
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.52    .2350      .4746         0                 4096 825.2 845.7  
   external       0          0  15.9    .1943      .4363         0                                   
   internal       0          0 9.141    .2757      .5129         0
19:37:53 | saving

19:38:43 | running eval: valid
19:38:55 | eval completed in 12.06s
19:38:55 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2   643       0          0 14.28  171 .2274   .06906 14.68  2.49 3.125e-08   190   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2628          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      192.3       0          0 12.52    .2350      .4753         0                 4160 825.2 835.3  
   external             0          0  15.9    .1943      .4378         0                                   
   internal             0          0 9.142    .2757      .5129         0
19:38:55 | sa

19:39:44 | running eval: valid
19:39:56 | eval completed in 11.94s
19:39:56 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 650.5       0          0 14.45  171 .2294   .06905 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1939          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2649          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      194.6       0          0 12.52    .2355      .4743         0                 4224 825.2 845.1  
   external             0          0  15.9    .1943      .4363         0                                   
   internal             0          0  9.14    .2768      .5123         0
19:39:56 | sa

19:40:46 | running eval: valid
19:40:58 | eval completed in 11.98s
19:40:58 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 647.7       0          0 14.39  171 .2279   .06901 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1922          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2636          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      193.7       0          0 12.52    .2350      .4743         0                 4288 825.2 841.5  
   external             0          0  15.9    .1943      .4363         0                                   
   internal             0          0 9.141    .2757      .5123         0
19:40:58 | sa

19:41:48 | running eval: valid
19:42:00 | eval completed in 11.93s
19:42:00 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 651.5       0          0 14.47  171 .2282   .06945 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2643          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      194.9       0          0 12.52    .2355      .4743         0                 4352 825.2 846.4  
   external             0          0  15.9    .1943      .4363         0                                   
   internal             0          0  9.14    .2768      .5123         0
19:42:00 | sa

19:42:51 | running eval: valid
19:43:03 | eval completed in 11.98s
19:43:03 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01828 53.69 635.2 647.5       0          0 14.38  171 .2334   .06952 14.68  2.49 1.562e-08   190   
   external         0  .02091 62.36                   0          0         44 .2044          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2624          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      193.7       0          0 12.52    .2391      .4747         0                 4416 825.2 841.2  
   external             0          0  15.9    .2033      .4378         0                                   
   internal             0          0 9.141    .2749      .5117         0
19:43:03 | sa

19:43:52 | running eval: valid
19:44:04 | eval completed in 11.91s
19:44:04 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 653.3       0          0 14.51  171 .2291   .06889 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1939          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2643          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      195.4       0          0 12.52    .2356      .4739         0                 4480 825.2 848.7  
   external             0          0  15.9    .1943      .4349         0                                   
   internal             0          0 9.141    .2770      .5129         0
19:44:04 | sa

19:44:54 | running eval: valid
19:45:06 | eval completed in 11.89s
19:45:06 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 653.6       0          0 14.52  171 .2287   .06891 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1939          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2635          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      195.5       0          0 12.52    .2354      .4746         0                 4544 825.2 849.1  
   external             0          0  15.9    .1943      .4363         0                                   
   internal             0          0 9.141    .2765      .5129         0
19:45:06 | sa

19:45:58 | running eval: valid
19:46:10 | eval completed in 11.95s
19:46:10 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 651.8       0          0 14.48  171 .2278   .06919 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2636          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      194.9       0          0 12.52    .2350      .4750         0                 4608 825.2 846.8  
   external             0          0 15.89    .1943      .4378         0                                   
   internal             0          0  9.14    .2757      .5123         0
19:46:10 | sa

19:46:59 | running eval: valid
19:47:11 | eval completed in 11.93s
19:47:11 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 650.2       0          0 14.44  171 .2288   .06923 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1939          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2637          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      194.5       0          0 12.52    .2350      .4746         0                 4672 825.2 844.7  
   external             0          0  15.9    .1943      .4363         0                                   
   internal             0          0 9.141    .2757      .5129         0
19:47:11 | sa

19:48:02 | running eval: valid
19:48:14 | eval completed in 11.99s
19:48:14 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 649.5       0          0 14.43  171 .2288   .06905 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2655          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      194.3       0          0 12.52    .2355      .4740         0                 4736 825.2 843.8  
   external             0          0  15.9    .1943      .4363         0                                   
   internal             0          0 9.141    .2768      .5117         0
19:48:14 | sa

19:49:04 | running eval: valid
19:49:16 | eval completed in 11.94s
19:49:16 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 651.8       0          0 14.48  171 .2286   .06928 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2652          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all        195       0          0 12.52    .2359      .4746         0                 4800 825.2 846.7  
   external             0          0  15.9    .1943      .4363         0                                   
   internal             0          0 9.141    .2776      .5129         0
19:49:16 | sa

19:50:05 | running eval: valid
19:50:17 | eval completed in 11.97s
19:50:17 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 650.7       0          0 14.46  171 .2283   .06922 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1939          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2626          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      194.6       0          0 12.52    .2354      .4756         0                 4864 825.2 845.4  
   external             0          0  15.9    .1943      .4378         0                                   
   internal             0          0 9.139    .2765      .5135         0
19:50:17 | sa

19:51:08 | running eval: valid
19:51:20 | eval completed in 11.95s
19:51:20 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2   651       0          0 14.46  171 .2285   .06901 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2648          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      194.7       0          0 12.52    .2359      .4749         0                 4928 825.2 845.7  
   external             0          0  15.9    .1943      .4363         0                                   
   internal             0          0 9.139    .2776      .5135         0
19:51:20 | sa

19:52:09 | running eval: valid
19:52:21 | eval completed in 11.99s
19:52:21 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 648.2       0          0  14.4  171 .2276   .06905 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2631          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all      193.9       0          0 12.52    .2352      .4753         0                 4992 825.2  842  
   external             0          0  15.9    .1943      .4378         0                                  
   internal             0          0  9.14    .2760      .5129         0
19:52:21 | savin

19:53:11 | running eval: valid
19:53:23 | eval completed in 12.00s
19:53:23 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01126 53.69 635.2 645.9       0          0 14.35  171 .2258   .06905 14.68 2.489 1.562e-08   190   
   external         0 .006885 62.36                   0          0         44 .1874          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2642          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      193.2       0          0 12.52    .2306      .4737         0                 5056 825.2 839.1  
   external             0          0 15.89    .1860      .4363         0                                   
   internal             0          0 9.139    .2753      .5111         0
19:53:23 | sa

19:54:12 | running eval: valid
19:54:24 | eval completed in 11.94s
19:54:24 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01421 53.69 635.2 650.3       0          0 14.45  171 .2296   .06918 14.68 2.489 1.562e-08   190   
   external         0  .01279 62.36                   0          0         44 .1960          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2632          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      194.5       0          0 12.51    .2351      .4743         0                 5120 825.2 844.8  
   external             0          0 15.89    .1951      .4363         0                                   
   internal             0          0 9.139    .2751      .5123         0
19:54:24 | sa

19:55:15 | running eval: valid
19:55:27 | eval completed in 11.99s
19:55:27 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 648.3       0          0  14.4  171 .2278   .06902 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2636          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      193.9       0          0 12.52    .2350      .4743         0                 5184 825.2 842.2  
   external             0          0 15.89    .1943      .4363         0                                   
   internal             0          0 9.139    .2757      .5123         0
19:55:27 | sa



19:56:02 | running eval: valid
19:56:14 | eval completed in 12.07s
19:56:14 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 645.1       0          0 14.33  171 .2283   .06903 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2645          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all        193       0          0 12.52    .2356      .4756         0                 5232 825.2 838.1  
   external             0          0 15.89    .1943      .4378         0                                   
   internal             0          0 9.139    .2770      .5135         0
19:56:14 | sa

19:57:03 | running eval: valid
19:57:15 | eval completed in 11.88s
19:57:15 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2 655.2       0          0 14.56  171 .2287   .06922 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2653          13.46 2.212                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all        196       0          0 12.52    .2355      .4746         0                 5296 825.2 851.2  
   external             0          0 15.89    .1943      .4363         0                                   
   internal             0          0 9.138    .2768      .5129         0
19:57:15 | sa

19:58:05 | running eval: valid
19:58:17 | eval completed in 11.93s
19:58:17 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01532 53.69 635.2   653       0          0 14.51  171 .2286   .06918 14.68 2.489 1.562e-08   190   
   external         0  .01501 62.36                   0          0         44 .1921          15.89 2.766                   
   internal         0  .01564 45.02                   0          0        127 .2652          13.46 2.213                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      195.3       0          0 12.52    .2355      .4743         0                 5360 825.2 848.3  
   external             0          0 15.89    .1943      .4363         0                                   
   internal             0          0 9.139    .2768      .5123         0
19:58:17 | sa

metric,value
internal/exs/train,144.0
exs/train,256.0
internal/clen/train,58.60417
internal/ctrunc/train,0.0
internal/ctrunclen/train,0.0
internal/llen/train,13.95139
internal/ltrunc/train,0.0
internal/ltrunclen/train,0.0
internal/loss/train,2.09052
internal/ppl/train,8.08912


metric,history
internal/exs/train,▃▁▃▅▂█▅▃▃▂▃▂▅▅▃▅▄▃▃▂▂▆▅▁▃▃▃▄▇▃▃▂▆▂▅▇▂▅▆▇
exs/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/clen/train,▄▅▄▂▅▃▆▆▅▅▅█▄▂▅▄▃▂▃▅▄▃▁▇▃▇▂▆▂▃▆█▃▅▅▅▇▂▂▂
internal/ctrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ctrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/llen/train,▅▂▆▂█▅▄▅▇▃▄▂▃▄▃▅▃▄▂▄▅▄▃▅▇▄▂▃▅▅▁▅▄▄▄▄▂▂▃▅
internal/ltrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ltrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/loss/train,█▇▆▅▅▆▄▅▄▄▄▃▄▃▄▃▂▃▃▁▁▂▃▄▃▃▂▁▂▂▁▂▁▂▂▂▂▂▂▃
internal/ppl/train,█▆▆▄▄▆▃▄▄▃▃▃▃▃▃▃▂▂▃▁▁▂▂▃▂▃▂▁▂▂▁▂▁▂▂▂▁▂▁▂


## 2. Blender base config + blended_skill_talk (bst)

### Datasets with sampling weights:
- internal: 6
- external: 3
- blended_skill_talk: 1
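
As a rough illustration of what the weights above imply, the sketch below normalizes them into expected sampling shares. It assumes the weights are relative sampling proportions, which is an assumption about how the trainer uses `multitask_weights`; `sampling_shares` is a hypothetical helper, not part of ParlAI, and the per-task `exs` counts in the logs may only approximate these shares.

```python
def sampling_shares(tasks: str, weights: str) -> dict:
    """Map each task to its expected share of samples.

    Assumption: the weights are relative, so each share is weight / sum(weights).
    """
    names = tasks.split(',')
    values = [float(w) for w in weights.split(',')]
    if len(names) != len(values):
        raise ValueError('need exactly one weight per task')
    total = sum(values)
    return {name: value / total for name, value in zip(names, values)}


# Same task and weight strings as this run's configuration.
shares = sampling_shares('internal,external,blended_skill_talk', '6,3,1')
for task, share in shares.items():
    print(f'{task}: {share:.0%}')
```

Under that assumption, the internal task should supply the majority of training examples, with blended_skill_talk acting as a small regularizing share.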

### Mutators
None

In [5]:
# Comma-separated task names and their sampling weights, in the same order.
tasks = 'internal,external,blended_skill_talk'
weights = '6,3,1'

run_training(tasks=tasks, weights=weights)

19:59:12 | building dictionary first...
19:59:12 | No model with opt yet at: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model(.opt)
19:59:12 | your model is being loaded with opts that do not exist in the model you are initializing the weights with: allow_missing_init_opts: False,download_path: None,loglevel: info,dynamic_batching: None,verbose: False,is_debug: False,datapath: /home/alex/ParlaiEmely/ParlAI/data,eval_dynamic_batching: None,num_workers: 0,max_train_steps: -1,log_every_n_steps: 50,validation_every_n_steps: -1,load_from_checkpoint: True,tensorboard_logdir: None,wandb_log: True,wandb_name: None,wandb_project: parlaiemely,wandb_entity: None,mutators: None,n_encoder_layers: -1,n_decoder_layers: -1,model_parallel: False,beam_block_full_context: True,beam_delay: 30,beam_block_list_filename: None,temperature: 1.0,interactive_mode: False,history_reversed: False,history_add_global_end_token: None,special_tok_lst: None,bpe_vocab

19:59:13 |     text_truncate: 512
19:59:13 |     topk: 10
19:59:13 |     topp: 0.9
19:59:13 |     truncate: -1
19:59:13 |     update_freq: 1
19:59:13 |     use_reply: label
19:59:13 |     validation_cutoff: 1.0
19:59:13 |     validation_every_n_epochs: 0.25
19:59:13 |     validation_every_n_secs: -1
19:59:13 |     validation_every_n_steps: -1
19:59:13 |     validation_max_exs: 20000
19:59:13 |     validation_metric: ppl
19:59:13 |     validation_metric_mode: min
19:59:13 |     validation_patience: 15
19:59:13 |     validation_share_agent: False
19:59:13 |     variant: xlm
19:59:13 |     verbose: False
19:59:13 |     wandb_entity: None
19:59:13 |     wandb_log: True
19:59:13 |     wandb_name: None
19:59:13 |     wandb_project: parlaiemely
19:59:13 |     warmup_rate: 0.0001
19:59:13 |     warmup_updates: -1
19:59:13 |     weight_decay: None
19:59:13 | Current ParlAI commit: adb7ed134aec01437ef8d4e6bf450827fd0fbdc2
19:59:14 | creating task(s): internal,external,blended_skill_talk
19:59:14

19:59:31 | training...
19:59:36 | time:22s total_exs:800 total_steps:50 epochs:0.03
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                76.17     1  1205 11381       0          0 151.1  800             32768  12.01    .3383 15.81 2.639   
   blended_skill_talk 78.69                         0          0        153                                   18.32 2.265   
   external           77.32                         0          0        271                                   15.11 3.002   
   internal           72.49                         0          0        376                                      14 2.651   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 243.2  2297       0          0 14.64      .4411   .002179                   50 1448 13678 9.498  
   blended_skill_talk                  

20:00:14 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:00:14 | Saving dictionary to /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint.dict
20:00:20 | time:66s total_exs:6400 total_steps:400 epochs:0.23
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                81.76     1  1311  8837       0          0 107.8  800             24576    inf    .3406 15.36 2.606   
   blended_skill_talk 78.21                         0          0        208                                    17.2 2.484   
   external           84.54                         0          0        211                                   14.37  2.81   
   internal           82.53                         0          0        381                                   14.51 2.525   
                     



20:00:24 | time:70s total_exs:7008 total_steps:438 epochs:0.25
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                78.92     1  1274 11527       0          0 144.7  608             16384  11.44    .3338  15.6 2.535   
   blended_skill_talk 78.87                         0          0        141                                    18.9 2.269   
   external            75.7                         0          0        166                                   14.18 2.814   
   internal           82.19                         0          0        301                                   13.71 2.522   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 240.7  2176       0          0 12.93      .4539         0                  438 1515 13704 9.047  
   blended_skill_talk                         0          0  

20:01:15 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:01:17 | time:123s total_exs:11808 total_steps:738 epochs:0.42
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                83.37     1  1323  8621       0          0 104.2  800             16384  10.93    .3220 15.12 2.535   
   blended_skill_talk  83.8                         0          0        223                                   17.75 2.369   
   external           87.54                         0          0        232                                   13.72 2.783   
   internal           78.76                         0          0        345                                    13.9 2.453   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 238.7  1555

20:02:08 | time:174s total_exs:16416 total_steps:1026 epochs:0.59
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                80.45     1  1283 10320       0          0 128.7  800             16384  10.64    .3383 15.65   2.5   
   blended_skill_talk 84.66                         0          0        208                                   18.62 2.396   
   external           77.36                         0          0        213                                   14.31  2.74   
   internal           79.33                         0          0        379                                   14.01 2.363   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 244.6  1967       0          0 12.36      .4633   .001603                 1026 1528 12287 8.044  
   blended_skill_talk                         0          



20:02:39 | time:205s total_exs:20416 total_steps:1276 epochs:0.73
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                84.73     1  1334 12012       0          0 144.1  800              8192  10.58    .3353 15.67 2.487   
   blended_skill_talk 94.76                         0          0        204                                    18.4 2.417   
   external           80.75                         0          0        221                                   14.86 2.706   
   internal           78.67                         0          0        375                                   13.76  2.34   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 243.9  2197       0          0 12.18      .4553  .0008889                 1276 1577 14209 9.009  
   blended_skill_talk                         0          

20:03:29 | time:255s total_exs:25024 total_steps:1564 epochs:0.89
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                85.57     1  1354 12858       0          0   152  800              8192  10.58    .3328 15.84  2.48   
   blended_skill_talk 92.46                         0          0        210                                   18.19 2.314   
   external           83.79                         0          0        240                                   15.54 2.765   
   internal           80.48                         0          0        350                                    13.8 2.361   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 247.6  2351       0          0  12.2      .4632   .001905                 1564 1601 15209 9.499  
   blended_skill_talk                         0          

20:04:19 | time:305s total_exs:29632 total_steps:1852 epochs:1.06
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                85.46     1  1339 12209       0          0 145.8  800              8192  10.61    .3487 15.28 2.428   
   blended_skill_talk  93.4                         0          0        214                                   17.64  2.35   
   external            85.7                         0          0        202                                   14.05 2.616   
   internal           77.27                         0          0        384                                   14.14 2.317   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 240.8  2195       0          0 11.44      .4718    .00165                 1852 1580 14404 9.116  
   blended_skill_talk                         0          

20:05:00 | time:346s total_exs:35040 total_steps:2190 epochs:1.25
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                90.39     1  1453 13619       0          0 149.9  608              8192  10.41    .3449 15.53 2.426   
   blended_skill_talk  81.7                         0          0        191                                   17.97 2.451   
   external           91.22                         0          0        192                                   14.03 2.504   
   internal           98.25                         0          0        225                                   14.58 2.323   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 247.6  2320       0          0 11.35      .4776         0                 2190 1701 15940 9.374  
   blended_skill_talk                         0          

20:05:51 | time:397s total_exs:39840 total_steps:2490 epochs:1.42
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                87.14     1  1393 12748       0          0 146.5  800              8192  10.23    .3338 15.82 2.376   
   blended_skill_talk 97.36                         0          0        220                                    17.8 2.292   
   external           78.35                         0          0        204                                   15.81 2.595   
   internal            85.7                         0          0        376                                   13.85 2.241   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06   247  2261       0          0  10.9      .4773         0                 2490 1639 15009 9.156  
   blended_skill_talk                         0          

20:06:42 | time:448s total_exs:44448 total_steps:2778 epochs:1.59
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                83.24     1  1302 12206       0          0   150  800              8192  10.53    .3449  16.2 2.384   
   blended_skill_talk 96.48                         0          0        195                                   18.95 2.449   
   external           77.15                         0          0        240                                   15.55 2.509   
   internal           76.08                         0          0        365                                   14.09 2.194   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 251.4  2357       0          0 10.95      .4713         0                 2778 1553 14563 9.376  
   blended_skill_talk                         0          

20:07:17 | running eval: valid
20:07:29 | eval completed in 12.00s
20:07:29 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01634 53.69 635.2   649       0          0 14.42  171 .2167    .2003 14.68 2.532 1e-06   190 194.1   
   external         0   .0150 62.36                   0          0         44 .1792          15.89  2.81                     
   internal         0  .01768 45.02                   0          0        127 .2542          13.46 2.254                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.07    .2235      .4670         0                 3066 825.2 843.1  
   external       0          0 16.62    .1829      .4363         0                                   
   internal       0          0 9.529    .2641      .4977         0
20:07:29 | saving model check



20:08:08 | time:534s total_exs:53856 total_steps:3366 epochs:1.92
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 84.6     1  1346 12905       0          0 153.5  800              8192  11.73    .3697 15.36 2.396   
   blended_skill_talk 86.62                         0          0        202                                   17.67 2.495   
   external           86.59                         0          0        264                                   14.44 2.441   
   internal           80.59                         0          0        334                                   13.99 2.253   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06   241  2312       0          0 11.04      .4759         0                 3366 1587 15217 9.593  
   blended_skill_talk                         0          

20:08:58 | time:584s total_exs:58464 total_steps:3654 epochs:2.09
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.84     1  1376 12703       0          0 147.7  800              8192  10.13    .3834 15.88 2.309   
   blended_skill_talk 94.33                         0          0        241                                   18.79  2.27   
   external           85.57                         0          0        204                                   14.53 2.451   
   internal           80.61                         0          0        355                                    14.3 2.205   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 251.4  2321       0          0 10.12      .4875   .000939                 3654 1628 15024 9.232  
   blended_skill_talk                         0          

20:09:33 | running eval: valid
20:09:45 | eval completed in 11.81s
20:09:45 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01616 53.69 635.2 654.7       0          0 14.54  171 .2174    .2004 14.68 2.516 1e-06   190 195.8   
   external         0   .0150 62.36                   0          0         44 .1805          15.89 2.794                     
   internal         0  .01733 45.02                   0          0        127 .2542          13.46 2.239                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.86    .2298      .4695         0                 3942 825.2 850.5  
   external       0          0 16.34    .1882      .4378         0                                   
   internal       0          0 9.381    .2714      .5012         0
20:09:45 | saving model check

20:10:29 | time:675s total_exs:68672 total_steps:4292 epochs:2.45
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                84.18     1  1368 12234       0          0 143.1  800              8192  10.41    .3544 16.35 2.298   
   blended_skill_talk 90.28                         0          0        219                                   19.42  2.43   
   external           72.32                         0          0        205                                    15.4 2.323   
   internal           89.93                         0          0        376                                   14.23 2.141   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 255.2  2282       0          0 10.02      .4888         0                 4292 1623 14516 8.943  
   blended_skill_talk                         0          

20:11:19 | time:725s total_exs:73280 total_steps:4580 epochs:2.62
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                89.89     1  1410 14144       0          0 160.4  800              8192   10.6    .3641 15.39 2.232   
   blended_skill_talk 96.21                         0          0        252                                   18.65 2.332   
   external           95.19                         0          0        200                                   13.78 2.268   
   internal           78.26                         0          0        348                                   13.75 2.096   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 244.8  2455       0          0 9.364      .4983         0                 4580 1655 16599 10.03  
   blended_skill_talk                         0          

20:12:00 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:12:02 | new best ppl: 12.69 (previous best was 12.78)
20:12:02 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model
20:12:03 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:12:10 | time:776s total_exs:77888 total_steps:4868 epochs:2.78
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.26     1  1346 12148       0          0 144.4  800              8192   10.4    .3727 16.06 2.303   
   blended_skill_talk 102.8                         0          0        201                                      19  2.39   
   external           78.46                         0          0   

20:12:53 | time:819s total_exs:83488 total_steps:5218 epochs:2.98
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                89.55     1  1409 10778       0          0 122.4  800              8192  10.45    .3386 16.07 2.305   
   blended_skill_talk 105.2                         0          0        231                                   18.89 2.382   
   external           84.03                         0          0        213                                   15.24 2.414   
   internal           79.37                         0          0        356                                   14.06 2.119   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 252.3  1930       0          0 10.11      .4908  .0009363                 5218 1662 12708 7.649  
   blended_skill_talk                         0          

20:13:44 | time:869s total_exs:88096 total_steps:5506 epochs:3.15
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                85.43     1  1336 10047 .001709      .1009 120.3  800             16384  10.45    .4488 15.89 2.275   
   blended_skill_talk 97.28                   .005128      .3026        195                                      18 2.455   
   external           80.94                         0          0        224                                   14.96 2.294   
   internal           78.07                         0          0        381                                   14.71 2.076   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 249.3  1875       0          0 9.844      .4952         0                 5506 1585 11922 7.523  
   blended_skill_talk                         0          

20:14:32 | time:918s total_exs:92704 total_steps:5794 epochs:3.31
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 77.6     1  1214 11773       0          0 155.2  800             16384  10.24    .3245 15.57 2.195   
   blended_skill_talk 84.26                         0          0        163                                   17.12 2.341   
   external            76.3                         0          0        230                                   15.24 2.255   
   internal           72.24                         0          0        407                                   14.36  1.99   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 242.8  2355       0          0 9.081      .5017   .002045                 5794 1457 14129 9.702  
   blended_skill_talk                         0          



20:14:56 | time:941s total_exs:95904 total_steps:5994 epochs:3.43
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                76.42     1  1219 11251       0          0 147.7  800              8192  10.39    .3419 15.17 2.195   
   blended_skill_talk 76.73                         0          0        155                                   15.96 2.264   
   external           77.33                         0          0        260                                   15.47 2.247   
   internal           75.19                         0          0        385                                   14.09 2.074   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 238.4  2201       0          0 9.012      .5128         0                 5994 1457 13452 9.232  
   blended_skill_talk                         0          

20:15:46 | time:991s total_exs:100512 total_steps:6282 epochs:3.59
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                79.96     1  1270 13136       0          0 165.5  800              8192  10.39    .3406 16.16 2.243   
   blended_skill_talk 77.82                         0          0        189                                   18.24 2.341   
   external           85.45                         0          0        222                                   15.34 2.257   
   internal           76.61                         0          0        389                                   14.89 2.132   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06   253  2617       0          0 9.458      .5030  .0008569                 6282 1523 15754 10.35  
   blended_skill_talk                         0         

20:16:19 | running eval: valid
20:16:31 | eval completed in 12.02s
20:16:31 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01366 53.69 635.2 638.5       0          0 14.18  171 .2260    .2008 14.68 2.488 1e-06   190   191   
   external         0  .01279 62.36                   0          0         44 .1954          15.89 2.767                     
   internal         0  .01454 45.02                   0          0        127 .2567          13.46  2.21                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.51    .2314      .4767         0                 6570 825.2 829.4  
   external       0          0 15.91    .1958      .4435         0                                   
   internal       0          0 9.115    .2671      .5099         0
20:16:31 | saving model check

20:17:19 | time:1084s total_exs:110720 total_steps:6920 epochs:3.96
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                82.12     1  1291 10648       0          0 131.9  800              8192  10.68    .3748 15.76 2.181   
   blended_skill_talk 90.72                         0          0        201                                   17.61 2.316   
   external           79.91                         0          0        232                                   15.19 2.194   
   internal           75.74                         0          0        367                                    14.5 2.033   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 247.6  2042       0          0 8.915      .5035   .001658                 6920 1539 12690 8.246  
   blended_skill_talk                         0        

20:18:09 | time:1135s total_exs:115328 total_steps:7208 epochs:4.12
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                83.77     1  1325 10387       0          0 125.5  800              8192  10.18    .3319 16.11  2.19   
   blended_skill_talk 88.66                         0          0        196                                   18.44 2.382   
   external           83.07                         0          0        226                                   15.61 2.196   
   internal           79.58                         0          0        378                                   14.29 1.992   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 250.9  1967       0          0 9.046      .5078  .0008818                 7208 1576 12354 7.842  
   blended_skill_talk                         0        

20:18:49 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:18:51 | new best ppl: 12.44 (previous best was 12.46)
20:18:51 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model
20:18:52 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:18:59 | time:1185s total_exs:119936 total_steps:7496 epochs:4.29
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                94.07     1  1476 12827       0          0   139  800              8192   10.3    .3890 15.79 2.153   
   blended_skill_talk 104.6                         0          0        251                                   18.08 2.363   
   external           97.78                         0          0 

20:19:40 | time:1226s total_exs:125536 total_steps:7846 epochs:4.49
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                84.46     1  1340 10450       0          0 124.7  800              8192  10.73    .3874 15.58 2.144   
   blended_skill_talk 87.73                         0          0        213                                   18.75 2.426   
   external           84.47                         0          0        207                                   13.49 2.047   
   internal           81.18                         0          0        380                                    14.5 1.958   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 245.9  1917       0          0 8.716      .5137   .001754                 7846 1586 12368 7.797  
   blended_skill_talk                         0        

20:20:31 | time:1277s total_exs:130144 total_steps:8134 epochs:4.65
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                90.85     1  1457 10906 .000939      .0554 119.8  800             16384  10.24    .4488 15.81  2.13   
   blended_skill_talk 89.74                         0          0        241                                   17.83 2.392   
   external           90.28                         0          0        204                                   14.24 1.999   
   internal           92.54                   .002817      .1662        355                                   15.34 1.998   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06   253  1894       0          0 8.565      .5119   .000939                 8134 1710 12801 7.488  
   blended_skill_talk                         0        



20:21:21 | time:1327s total_exs:134752 total_steps:8422 epochs:4.82
                       clen  clip  ctpb  ctps   ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.56     1  1388 10216 .0008749     .05162 117.8  800              8192  10.14    .4488 15.92 2.055   
   blended_skill_talk 89.36                          0          0        212                                   19.93 2.356   
   external           82.71                          0          0        207                                   13.85  1.94   
   internal            87.6                    .002625      .1549        381                                   13.97 1.868   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 248.3  1828       0          0 7.993      .5305    .00175                 8422 1636 12044 7.363  
   blended_skill_talk                         0   

20:22:01 | time:1367s total_exs:140160 total_steps:8760 epochs:5.01
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 86.1     1  1397 12117       0          0 138.8  608              8192  10.54    .3544 15.53 2.148   
   blended_skill_talk 81.17                         0          0        158                                   18.02 2.326   
   external           85.47                         0          0        158                                   14.55 2.201   
   internal           91.65                         0          0        292                                   14.02 1.918   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 243.1  2109       0          0 8.692      .5081   .003251                 8760 1640 14226 8.675  
   blended_skill_talk                         0        

20:22:52 | time:1417s total_exs:144960 total_steps:9060 epochs:5.18
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                87.21     1  1366 12666       0          0 148.4  800              8192  10.11    .3426 16.31 2.092   
   blended_skill_talk 100.6                         0          0        226                                   18.94 2.404   
   external            84.8                         0          0        210                                   15.72 1.983   
   internal           76.19                         0          0        364                                   14.27 1.889   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 255.5  2370       0          0 8.313      .5226  .0009158                 9060 1621 15036 9.276  
   blended_skill_talk                         0        

20:23:42 | time:1468s total_exs:149568 total_steps:9348 epochs:5.34
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                88.62     1  1400 10540       0          0 120.4  800              8192  10.86    .3782 15.95 2.112   
   blended_skill_talk 98.04                         0          0        221                                   18.58 2.404   
   external            85.5                         0          0        210                                   14.77 1.996   
   internal           82.33                         0          0        369                                   14.49 1.936   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 251.1  1890       0          0 8.455      .5220   .002491                 9348 1651 12430 7.529  
   blended_skill_talk                         0        

20:24:17 | running eval: valid
20:24:29 | eval completed in 12.03s
20:24:29 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01866 53.69 635.2 639.7       0          0 14.21  171 .2350    .2007 14.68  2.48 1e-06   190 191.3   
   external         0  .02253 62.36                   0          0         44 .1996          15.89 2.764                     
   internal         0  .01478 45.02                   0          0        127 .2703          13.46 2.197                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.43    .2392      .4808         0                 9636 825.2 831.1  
   external       0          0 15.86    .1969      .4464         0                                   
   internal       0          0 8.998    .2814      .5152         0
20:24:29 | saving model check

20:25:12 | time:1558s total_exs:159776 total_steps:9986 epochs:5.71
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                75.24     1  1207 11575       0          0 153.4  800              8192  10.72    .3353 15.77 2.008   
   blended_skill_talk 80.24                         0          0        160                                   17.51 2.253   
   external            67.3                         0          0        231                                   15.82 1.971   
   internal           78.17                         0          0        409                                   13.99   1.8   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 243.6  2335       0          0  7.58      .5317   .000815                 9986 1451 13911 9.591  
   blended_skill_talk                         0        

20:26:01 | time:1607s total_exs:164384 total_steps:10274 epochs:5.87
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                83.66     1  1307 12191       0          0 149.2  800              8192  10.61    .3475 15.43 2.052   
   blended_skill_talk 97.77                         0          0        203                                   17.58 2.332   
   external           78.13                         0          0        216                                   14.12 1.893   
   internal           75.09                         0          0        381                                   14.59 1.931   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 243.5  2272       0          0 7.946      .5296         0                10274 1550 14463 9.332  
   blended_skill_talk                         0       

20:26:42 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:26:44 | did not beat best ppl: 12.3979 impatience: 3
20:26:44 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:26:50 | time:1656s total_exs:168992 total_steps:10562 epochs:6.04
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                78.15     1  1255 12084       0          0 154.1  800             16384  10.47    .3352 15.65 2.103   
   blended_skill_talk 75.52                         0          0        191                                   17.45 2.409   
   external           79.99                         0          0        234                                    15.1 1.958   
   internal           78.93                         0          0        375  

20:27:32 | time:1697s total_exs:174592 total_steps:10912 epochs:6.24
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                84.23     1  1344 10446       0          0 124.3  800             16384  10.55    .3473 15.57 2.036   
   blended_skill_talk 86.93                         0          0        208                                   17.65 2.375   
   external           82.44                         0          0        232                                   15.34 1.895   
   internal           83.33                         0          0        360                                   13.71 1.838   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 243.4  1891       0          0 7.897      .5275   .001603                10912 1588 12338 7.773  
   blended_skill_talk                         0       



20:28:09 | time:1735s total_exs:177600 total_steps:11100 epochs:6.35
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                81.69     1  1300 10417       0          0 128.2  800              8192  10.46    .3319 15.23 2.013   
   blended_skill_talk  89.6                         0          0        196                                   17.14 2.373   
   external            74.4                         0          0        225                                   14.42 1.844   
   internal           81.06                         0          0        379                                   14.14 1.821   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                5e-07 239.3  1917       0          0 7.743      .5396   .001759                11100 1540 12334 8.011  
   blended_skill_talk                         0       

20:28:43 | running eval: valid
20:28:55 | eval completed in 12.17s
20:28:55 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02127 53.69 635.2 633.6       0          0 14.07  171 .2343    .2006 14.68 2.482 5e-07   190 189.5   
   external         0  .02524 62.36                   0          0         44 .1987          15.89 2.767                     
   internal         0  .01731 45.02                   0          0        127 .2699          13.46 2.197                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.45    .2403      .4804         0                11388 825.2 823.1  
   external       0          0  15.9    .1996      .4449         0                                   
   internal       0          0 8.996    .2810      .5158         0
20:28:55 | saving model check

20:29:39 | time:1825s total_exs:187808 total_steps:11738 epochs:6.71
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                79.21     1  1263 10079       0          0 127.6  800              8192  10.63    .3406 15.55 2.005   
   blended_skill_talk 80.15                         0          0        177                                   17.71 2.353   
   external           79.32                         0          0        248                                   14.42 1.872   
   internal           78.17                         0          0        375                                   14.54  1.79   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                5e-07 243.2  1940       0          0  7.67      .5341  .0008889                11738 1507 12019 7.979  
   blended_skill_talk                         0       

20:30:28 | time:1874s total_exs:192416 total_steps:12026 epochs:6.88
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                85.36     1  1355 10717       0          0 126.5  800              8192  10.41    .3575 15.92 1.996   
   blended_skill_talk 96.28                         0          0        215                                   19.04 2.316   
   external           77.84                         0          0        213                                   14.85 1.919   
   internal           81.96                         0          0        372                                   13.89 1.754   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                5e-07 248.4  1964       0          0 7.576      .5435   .004026                12026 1604 12682 7.908  
   blended_skill_talk                         0       

20:31:09 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:31:10 | did not beat best ppl: 12.3979 impatience: 7
20:31:10 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:31:18 | time:1924s total_exs:197024 total_steps:12314 epochs:7.04
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                88.37     1  1399 10791       0          0 123.4  800              8192  10.53    .3264 15.28 1.991   
   blended_skill_talk 92.72                         0          0        249                                   18.29 2.404   
   external           92.13                         0          0        224                                   13.89 1.796   
   internal           80.26                         0          0        327  

20:31:59 | time:1965s total_exs:202624 total_steps:12664 epochs:7.24
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                85.68     1  1358 12315       0          0 145.1  800              8192  10.78    .3352 15.49 1.955   
   blended_skill_talk 89.88                         0          0        225                                   17.15 2.312   
   external           86.48                         0          0        219                                   14.76 1.755   
   internal           80.68                         0          0        356                                   14.55 1.799   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                5e-07 245.4  2226       0          0 7.308      .5498   .002809                12664 1603 14541 9.073  
   blended_skill_talk                         0       

20:32:48 | time:2014s total_exs:207232 total_steps:12952 epochs:7.41
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                78.41     1  1249 12164       0          0 155.9  800              8192  10.34    .3426 15.68 2.069   
   blended_skill_talk 76.89                         0          0        185                                   16.29 2.481   
   external           81.94                         0          0        220                                   16.05 1.926   
   internal           76.41                         0          0        395                                   14.71 1.801   
                           lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                2.5e-07 247.1  2407       0          0  8.29      .5286         0                12952 1496 14571 9.744  
   blended_skill_talk                           0 

20:33:37 | time:2063s total_exs:211840 total_steps:13240 epochs:7.57
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                79.19     1  1262  9949       0          0 126.1  800             16384  10.36    .3514 15.98 2.005   
   blended_skill_talk 76.08                         0          0        177                                   17.74 2.308   
   external           84.59                         0          0        223                                   15.83 1.906   
   internal           76.91                         0          0        400                                   14.38 1.799   
                           lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                2.5e-07 248.4  1959       0          0 7.611      .5357         0                13240 1510 11907 7.885  
   blended_skill_talk                           0 



20:33:49 | time:2075s total_exs:213440 total_steps:13340 epochs:7.63
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                82.74     1  1323 10187       0          0 123.2  800              8192  10.41    .3613 15.73 2.013   
   blended_skill_talk 83.82                         0          0        194                                   18.79  2.37   
   external           81.89                         0          0        205                                    14.4 1.864   
   internal           82.52                         0          0        401                                      14 1.805   
                           lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                2.5e-07 244.2  1881       0          0 7.745      .5385   .003289                13340 1567 12068 7.702  
   blended_skill_talk                           0 

20:34:30 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:34:32 | did not beat best ppl: 12.3979 impatience: 10
20:34:32 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,blended_skill_talk-6,3,1/model.checkpoint
20:34:39 | time:2125s total_exs:218048 total_steps:13628 epochs:7.79
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 92.2     1  1481 11605 .001052     .02734 125.4  800              8192  10.49    .4414 15.03 1.966   
   blended_skill_talk 91.76                         0          0        264                                   16.95 2.289   
   external           89.45                         0          0        219                                   14.42 1.866   
   internal           95.39                   .003155     .08202        317 

20:35:20 | time:2166s total_exs:223648 total_steps:13978 epochs:7.99
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                91.83     1  1467 13023       0          0   142  800              8192   10.4    .3640 15.87 1.977   
   blended_skill_talk 99.98                         0          0        254                                   19.65 2.396   
   external           87.44                         0          0        227                                    13.9 1.749   
   internal           88.08                         0          0        319                                   14.06 1.785   
                           lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps  ups  
   all                2.5e-07 252.6  2243       0          0 7.563      .5459   .006072                13978 1719 15266 8.88  
   blended_skill_talk                           0   

20:36:10 | time:2215s total_exs:228256 total_steps:14266 epochs:8.16
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                78.49     1  1262 11977       0          0 151.8  800              8192  11.77    .3338 15.18 1.977   
   blended_skill_talk  79.1                         0          0        199                                   16.92 2.362   
   external           73.44                         0          0        261                                   14.82 1.832   
   internal           82.94                         0          0        340                                    13.8 1.736   
                           lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                2.5e-07 238.6  2264       0          0 7.512      .5450  .0009804                14266 1501 14240 9.491  
   blended_skill_talk                           0 

20:36:59 | time:2265s total_exs:232864 total_steps:14554 epochs:8.32
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 87.8     1  1404 12894       0          0 146.9  800              8192  10.41    .3627 15.52 1.943   
   blended_skill_talk 83.38                         0          0        254                                   17.26 2.293   
   external            90.8                         0          0        184                                   15.02 1.782   
   internal           89.22                         0          0        362                                   14.28 1.753   
                            lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1.25e-07 246.3  2263       0          0 7.206      .5495   .002762                14554 1650 15157 9.188  
   blended_skill_talk                           

20:37:38 | time:2304s total_exs:238272 total_steps:14892 epochs:8.51
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                78.99     1  1269 12141       0          0 153.1  608              8192  10.54    .3219 15.01 1.996   
   blended_skill_talk 79.91                         0          0        159                                   15.55 2.339   
   external           76.28                         0          0        169                                   15.22 1.796   
   internal           80.79                         0          0        280                                   14.26 1.854   
                            lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps  ups  
   all                1.25e-07 237.9  2276       0          0 7.595      .5399   .007516                14892 1507 14417 9.57  
   blended_skill_talk                            0

20:38:29 | time:2355s total_exs:243072 total_steps:15192 epochs:8.69
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 86.2     1  1382 12170       0          0 140.9  800              8192  10.65    .3478 15.92 1.957   
   blended_skill_talk 86.63                         0          0        223                                   18.93 2.433   
   external           84.81                         0          0        216                                   15.02 1.773   
   internal           87.17                         0          0        361                                   13.81 1.667   
                            lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1.25e-07   249  2193       0          0 7.524      .5472   .001543                15192 1631 14363 8.808  
   blended_skill_talk                           

20:39:18 | time:2404s total_exs:247680 total_steps:15480 epochs:8.85
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                83.17     1  1307 12574       0          0 153.9  800             16384  10.69    .3727 16.13 2.038   
   blended_skill_talk 91.64                         0          0        184                                   19.41 2.485   
   external           80.38                         0          0        261                                   14.33 1.803   
   internal           77.48                         0          0        355                                   14.64 1.825   
                            lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1.25e-07 250.2  2407       0          0 8.091      .5336   .005972                15480 1557 14981 9.623  
   blended_skill_talk                           

20:39:52 | running eval: valid
20:40:04 | eval completed in 12.33s
20:40:04 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  ltps  \
   all              0  .02351 53.69 635.2 624.8       0          0 13.88  171 .2377    .2007 14.68 2.487 1.25e-07   190 186.9   
   external         0  .02524 62.36                   0          0         44 .2036          15.89 2.774                        
   internal         0  .02178 45.02                   0          0        127 .2717          13.46   2.2                        
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.53    .2437      .4788         0                15768 825.2 811.7  
   external       0          0 16.03    .2032      .4435         0                                   
   internal       0          0 9.026    .2841      .5140         0
20:40:04 | saving

internal/exs/train,273.0
exs/train,608.0
internal/clen/train,81.20147
internal/ctrunc/train,0.0
internal/ctrunclen/train,0.0
internal/llen/train,14.37363
internal/ltrunc/train,0.0
internal/ltrunclen/train,0.0
internal/loss/train,1.80362
internal/ppl/train,6.07159


internal/exs/train,▆▇▆▆▁▆▆▅▆▆▆▆▆▂▅▆▅▇▇▆▅▇▅▆▆▆▃▆▇█▇▇▆█▄▂▅▆▆▂
exs/train,████▁████████▁████████████▁████████▁███▁
internal/clen/train,▄▃▃▅▇▃▃▂▆▂▄▇▂█▅▄▂▂▂▂▄▂▃▃▃▃▂▃▄▁▁▃▂▂▇▂▃▂▄▃
internal/ctrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁
internal/ctrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁
internal/llen/train,▅▅█▅▇▅▆▇▆▃▃▅▅▇▅▃▄▃▅▅▅▅▁▆▃▅▂▂▅▃▅▇▇▅▂▄▁▄▃▅
internal/ltrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ltrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/loss/train,███▇▇▇▇▇▆▅▆▅▆▅▅▄▄▄▄▄▃▃▃▃▂▃▃▃▂▂▃▂▂▂▂▁▂▂▁▂
internal/ppl/train,███▇▆▆▆▇▅▄▅▄▅▄▄▄▄▃▃▃▃▃▂▃▂▂▂▂▂▂▂▁▂▂▁▁▁▂▁▂
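
The logged perplexity appears to be exp(loss) per task, and the `all` row looks like the unweighted mean of the per-task values rather than exp of the mean loss. A minimal sketch to check this, using numbers copied from the last validation report and the run summary above (the variable names are illustrative, not part of ParlAI):

```python
import math

# Per-task validation losses copied from the last validation report above.
valid_loss = {"external": 2.774, "internal": 2.2}

# Perplexity is the exponential of the mean per-token loss.
valid_ppl = {task: math.exp(loss) for task, loss in valid_loss.items()}

# The "all" row is compared against the unweighted mean over tasks.
macro_loss = sum(valid_loss.values()) / len(valid_loss)
macro_ppl = sum(valid_ppl.values()) / len(valid_ppl)

# Same check for the training summary (internal/loss/train).
train_internal_ppl = math.exp(1.80362)

print("per-task ppl:", valid_ppl)
print("macro loss:", macro_loss, "macro ppl:", macro_ppl)
print("train internal ppl:", train_internal_ppl)
```

The results agree with the logged values (16.03, 9.026, 12.53 and 6.07159) up to the rounding of the logged losses. If the `all` ppl really is a mean over tasks, the validation metric used for early stopping is pulled up by the task with the highest perplexity, here `external`.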


## 3. Blender base + otter

### Datasets with sampling weights:
- internal: 6
- external: 3
- otter: 1
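
Assuming the weights are relative sampling weights that get normalized to probabilities, 6,3,1 corresponds to roughly 60% / 30% / 10% of the training examples. This matches the per-task `exs` counts in the first training report of this run (230 internal, 120 external, 34 otter out of 384). A small sketch of that arithmetic, with illustrative variable names:

```python
tasks = "internal,external,otter".split(",")
weights = [float(w) for w in "6,3,1".split(",")]

# Normalize the relative weights into sampling probabilities.
total = sum(weights)
expected = {task: w / total for task, w in zip(tasks, weights)}

# Per-task example counts from the first training report of this run.
observed_exs = {"internal": 230, "external": 120, "otter": 34}
n = sum(observed_exs.values())
observed = {task: observed_exs[task] / n for task in tasks}

for task in tasks:
    print(f"{task:9s} expected {expected[task]:.3f}  observed {observed[task]:.3f}")
```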

### Mutators
None

In [6]:
tasks = 'internal,external,otter'
weights = '6,3,1'

run_training(tasks=tasks, weights=weights)

20:41:00 | building dictionary first...
20:41:00 | No model with opt yet at: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model(.opt)
20:41:00 | your model is being loaded with opts that do not exist in the model you are initializing the weights with: allow_missing_init_opts: False,download_path: None,loglevel: info,dynamic_batching: None,verbose: False,is_debug: False,datapath: /home/alex/ParlaiEmely/ParlAI/data,eval_dynamic_batching: None,num_workers: 0,max_train_steps: -1,log_every_n_steps: 50,validation_every_n_steps: -1,load_from_checkpoint: True,tensorboard_logdir: None,wandb_log: True,wandb_name: None,wandb_project: parlaiemely,wandb_entity: None,mutators: None,n_encoder_layers: -1,n_decoder_layers: -1,model_parallel: False,beam_block_full_context: True,beam_delay: 30,beam_block_list_filename: None,temperature: 1.0,interactive_mode: False,history_reversed: False,history_add_global_end_token: None,special_tok_lst: None,bpe_vocab: None,bpe_me

20:41:01 |     topp: 0.9
20:41:01 |     truncate: -1
20:41:01 |     update_freq: 1
20:41:01 |     use_reply: label
20:41:01 |     validation_cutoff: 1.0
20:41:01 |     validation_every_n_epochs: 0.25
20:41:01 |     validation_every_n_secs: -1
20:41:01 |     validation_every_n_steps: -1
20:41:01 |     validation_max_exs: 20000
20:41:01 |     validation_metric: ppl
20:41:01 |     validation_metric_mode: min
20:41:01 |     validation_patience: 15
20:41:01 |     validation_share_agent: False
20:41:01 |     variant: xlm
20:41:01 |     verbose: False
20:41:01 |     wandb_entity: None
20:41:01 |     wandb_log: True
20:41:01 |     wandb_name: None
20:41:01 |     wandb_project: parlaiemely
20:41:01 |     warmup_rate: 0.0001
20:41:01 |     warmup_updates: -1
20:41:01 |     weight_decay: None
20:41:01 | Current ParlAI commit: adb7ed134aec01437ef8d4e6bf450827fd0fbdc2
20:41:01 | creating task(s): internal,external,otter
20:41:01 | Loading ParlAI text data: /home/alex/ParlaiEmely/ParlAI/data/interna

20:41:19 | training...
20:41:22 | time:20s total_exs:384 total_steps:24 epochs:0.25
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      67.27     1  1003  8595 .001449      .1261 137.1  384             19797    inf    .4639 19.63  3.08 1e-06 258.8   
   external 63.37                         0          0        120                                   16.95 2.974               
   internal 60.45                   .004348      .3783        230                                   14.13 2.668               
   otter       78                         0          0         34                                   27.82 3.597               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all       2218  .01961      .1667 23.49      .3821         0                   24 1262 10813 8.617  
   external             0          0 19.57      .3800         0          



20:41:22 | creating task(s): internal
20:41:22 | Loading ParlAI text data: /home/alex/ParlaiEmely/ParlAI/data/internal/valid.txt
20:41:22 | creating task(s): external
20:41:22 | Loading ParlAI text data: /home/alex/ParlaiEmely/ParlAI/data/external/valid.txt
20:41:22 | running eval: valid
20:41:39 | eval completed in 16.90s
20:41:39 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0    .00489 53.69 635.2 478.2       0          0 10.62  171 .1656    .2003 14.68 2.808 1e-06   190   143   
   external         0   .009779 62.36                   0          0         44 .1546          15.89 3.079                     
   internal         0 1.784e-06 45.02                   0          0        127 .1766          13.46 2.538                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 17.19   



20:42:27 | running eval: valid
20:42:42 | eval completed in 15.24s
20:42:42 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01258 53.69 635.2 523.4       0          0 11.63  171 .1901    .1934 14.68 2.749 1e-06   190 156.5   
   external         0  .01791 62.36                   0          0         44 .1771          15.89 3.026                     
   internal         0 .007253 45.02                   0          0        127 .2030          13.46 2.472                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.23    .1879      .4342         0                   96 825.2 679.9  
   external       0          0 20.62    .1801      .3977         0                                   
   internal       0          0 11.85    .1957      .4708         0
20:42:42 | saving model check

20:43:44 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
20:43:45 | new best ppl: 15.65 (previous best was 15.82)
20:43:45 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
20:43:49 | time:167s total_exs:3072 total_steps:192 epochs:2.02
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      59.78     1 966.2  9497 .001377     .07851 157.3  384              8192  11.45    .4414 20.63 2.982 1e-06 258.1   
   external 56.69                         0          0        109                                   14.26 3.046               
   internal  62.3                   .004132      .2355        242                                   14.68 2.585               
   otter    60.36                         0          0         33                                   32.94 3

20:44:48 | running eval: valid
20:45:02 | eval completed in 13.49s
20:45:02 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01006 53.69 635.2 585.7       0          0 13.01  171 .1983    .2005 14.68 2.681 1e-06   190 175.2   
   external         0 .009789 62.36                   0          0         44 .1762          15.89 2.964                     
   internal         0  .01032 45.02                   0          0        127 .2204          13.46 2.399                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.19    .1985      .4506         0                  264 825.2 760.9  
   external       0          0 19.37    .1765      .4163         0                                   
   internal       0          0 11.01    .2205      .4848         0
20:45:02 | saving model check

20:46:01 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
20:46:02 | new best ppl: 14.96 (previous best was 15.04)
20:46:02 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
20:46:06 | time:304s total_exs:5760 total_steps:360 epochs:3.78
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      63.76     1 996.8 10250       0          0 164.5  384              8192  10.84    .3659 21.14  2.93 1e-06 266.3   
   external 66.67                         0          0        139                                   16.45 2.955               
   internal 59.02                         0          0        215                                   14.63 2.438               
   otter     65.6                         0          0         30                                   32.33 3

20:47:03 | running eval: valid
20:47:16 | eval completed in 13.01s
20:47:16 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01317 53.69 635.2 608.2       0          0 13.51  171 .1999    .2005 14.68 2.649 1e-06   190 181.9   
   external         0  .01283 62.36                   0          0         44 .1624          15.89 2.932                     
   internal         0   .0135 45.02                   0          0        127 .2374          13.46 2.367                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.72    .2011      .4556         0                  432 825.2 790.1  
   external       0          0 18.77    .1669      .4206         0                                   
   internal       0          0 10.66    .2353      .4906         0
20:47:16 | saving model check

20:48:14 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
20:48:15 | new best ppl: 14.55 (previous best was 14.6)
20:48:15 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
20:48:19 | time:437s total_exs:8448 total_steps:528 epochs:5.55
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      58.67     1 979.7  9449       0          0 154.3  384              8192  10.68    .3617 24.18 2.843 1e-06 269.2   
   external 63.83                         0          0        125                                   14.49 2.716               
   internal 61.05                         0          0        231                                    15.3 2.513               
   otter    51.14                         0          0         28                                   42.75 3.

20:49:15 | running eval: valid
20:49:28 | eval completed in 12.72s
20:49:28 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01223 53.69 635.2 614.7       0          0 13.65  171 .2056    .2004 14.68 2.626 1e-06   190 183.8   
   external         0  .01283 62.36                   0          0         44 .1735          15.89 2.909                     
   internal         0  .01163 45.02                   0          0        127 .2377          13.46 2.343                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.38    .2045      .4592         0                  600 825.2 798.5  
   external       0          0 18.35    .1705      .4249         0                                   
   internal       0          0 10.42    .2384      .4936         0
20:49:28 | saving model check

20:50:22 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
20:50:24 | new best ppl: 14.27 (previous best was 14.3)
20:50:24 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
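
To follow how the best validation perplexity moves without reading every table, the `new best ppl` lines can be pulled out of the log. This is a minimal sketch: `extract_best_ppl` is a hypothetical helper, not part of ParlAI, and the sample lines are in the format of the two messages above.

```python
import re

# Matches the "new best ppl" message wherever it appears in a line.
BEST_PPL_RE = re.compile(
    r"new best ppl: (?P<new>[\d.]+) \(previous best was (?P<prev>[\d.]+)\)"
)

# Sample lines in the format of this log.
log_lines = [
    "20:48:15 | new best ppl: 14.55 (previous best was 14.6)",
    "20:48:19 | time:437s total_exs:8448 total_steps:528 epochs:5.55",
    "20:50:24 | new best ppl: 14.27 (previous best was 14.3)",
]


def extract_best_ppl(lines):
    """Return (new_best, previous_best) pairs in log order."""
    pairs = []
    for line in lines:
        match = BEST_PPL_RE.search(line)
        if match:
            pairs.append((float(match["new"]), float(match["prev"])))
    return pairs


history = extract_best_ppl(log_lines)
for new_best, previous_best in history:
    print(f"{previous_best} -> {new_best}")
```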
20:50:25 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
20:50:29 | time:568s total_exs:11136 total_steps:696 epochs:7.32
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all       65.9     1 998.9  9370       0          0 150.1  384              8192  11.04    .3982 21.75 2.914 1e-06 254.8   
   external 63.04                         0          0        129                                   14.71 2.847               
   internal 60.24                         0          0        221                                   13
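
The progress lines also allow a sanity check of the batch size and the approximate number of examples per epoch. The sketch below parses the two progress lines printed so far; `parse_progress` is a hypothetical helper, and the expectation of 16 examples per step assumes the `bs` value from `blender_opts` applies to this run.

```python
import re

# Progress lines copied from this run's log.
progress_lines = [
    "20:48:19 | time:437s total_exs:8448 total_steps:528 epochs:5.55",
    "20:50:29 | time:568s total_exs:11136 total_steps:696 epochs:7.32",
]

PROGRESS_RE = re.compile(
    r"time:(?P<time>\d+)s total_exs:(?P<exs>\d+) "
    r"total_steps:(?P<steps>\d+) epochs:(?P<epochs>[\d.]+)"
)


def parse_progress(line):
    """Parse one progress line into a dict of numbers."""
    match = PROGRESS_RE.search(line)
    if match is None:
        raise ValueError(f"not a progress line: {line!r}")
    return {
        "time_s": int(match["time"]),
        "total_exs": int(match["exs"]),
        "total_steps": int(match["steps"]),
        "epochs": float(match["epochs"]),
    }


records = [parse_progress(line) for line in progress_lines]
for record in records:
    exs_per_step = record["total_exs"] / record["total_steps"]
    # Rough estimate only: the epoch counter is rounded to two decimals.
    exs_per_epoch = record["total_exs"] / record["epochs"]
    print(f"step {record['total_steps']}: {exs_per_step:.0f} exs/step, "
          f"about {exs_per_epoch:.0f} exs/epoch")
```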

20:51:24 | running eval: valid
20:51:37 | eval completed in 12.54s
20:51:37 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01448 53.69 635.2 621.5       0          0 13.81  171 .2187    .2004 14.68  2.61 1e-06   190 185.9   
   external         0  .01283 62.36                   0          0         44 .1834          15.89 2.893                     
   internal         0  .01613 45.02                   0          0        127 .2540          13.46 2.327                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.15    .2234      .4617         0                  768 825.2 807.4  
   external       0          0 18.05    .1857      .4292         0                                   
   internal       0          0 10.25    .2611      .4942         0
20:51:37 | saving model check

20:52:32 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
20:52:34 | new best ppl: 14.07 (previous best was 14.1)
20:52:34 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
20:52:37 | time:696s total_exs:13824 total_steps:864 epochs:9.08
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      63.56     1  1026 10508       0          0 163.8  384              8192  10.61    .3226 18.41 2.801 1e-06 254.1   
   external 68.66                         0          0        148                                   16.63 2.771               
   internal 61.41                         0          0        204                                   13.97 2.299               
   otter    60.62                         0          0         32                                   24.62 3

20:53:32 | running eval: valid
20:53:44 | eval completed in 12.27s
20:53:44 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01808 53.69 635.2 640.4       0          0 14.23  171 .2227    .2006 14.68 2.597 1e-06   190 191.5   
   external         0  .02025 62.36                   0          0         44 .1891          15.89  2.88                     
   internal         0  .01592 45.02                   0          0        127 .2564          13.46 2.315                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 13.97    .2260      .4641         0                  936 825.2  832  
   external       0          0 17.81    .1903      .4306         0                                  
   internal       0          0 10.12    .2618      .4977         0
20:53:44 | saving model checkpoi

20:54:39 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
20:54:41 | new best ppl: 13.9 (previous best was 13.92)
20:54:41 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
20:54:44 | time:823s total_exs:16512 total_steps:1032 epochs:10.85
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      63.04     1 975.5  9932       0          0 162.9  384              8192  10.71    .4072 20.75  2.84 1e-06   261   
   external  69.6                         0          0        117                                   13.91 2.715               
   internal 56.16                         0          0        229                                   14.69 2.317               
   otter    63.37                         0          0         38                                   33.66

20:55:39 | running eval: valid
20:55:52 | eval completed in 12.28s
20:55:52 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01714 53.69 635.2 633.6       0          0 14.08  171 .2099    .2003 14.68 2.586 1e-06   190 189.5   
   external         0   .0164 62.36                   0          0         44 .1668          15.89 2.868                     
   internal         0  .01789 45.02                   0          0        127 .2529          13.46 2.305                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.81    .2181      .4667         0                 1104 825.2 823.2  
   external       0          0  17.6    .1765      .4335         0                                   
   internal       0          0 10.02    .2597      .5000         0
20:55:52 | saving model check

20:56:45 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
20:56:47 | new best ppl: 13.76 (previous best was 13.77)
20:56:47 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
20:56:52 | time:950s total_exs:19200 total_steps:1200 epochs:12.61
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      69.57     1  1146  7725 .005181      .8670 107.9  384              8192  10.52    .4953 24.18 2.837 1e-06 304.2   
   external 63.47                         0          0        123                                   14.62 2.739               
   internal 82.51                    .01554      2.601        193                                   13.97 2.242               
   otter    62.74                         0          0         68                                   43.9

20:57:46 | running eval: valid
20:57:59 | eval completed in 12.34s
20:57:59 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01992 53.69 635.2 630.9       0          0 14.02  171 .2179    .2003 14.68 2.577 1e-06   190 188.7   
   external         0  .02218 62.36                   0          0         44 .1783          15.89 2.857                     
   internal         0  .01767 45.02                   0          0        127 .2575          13.46 2.296                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.68    .2232      .4654         0                 1272 825.2 819.6  
   external       0          0 17.42    .1828      .4320         0                                   
   internal       0          0 9.939    .2636      .4988         0
20:57:59 | saving model check

20:58:52 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
20:58:54 | new best ppl: 13.61 (previous best was 13.62)
20:58:54 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
20:58:55 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
20:58:59 | time:1078s total_exs:21888 total_steps:1368 epochs:14.38
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      62.93     1 984.2  9963       0          0   162  384              8192  10.62    .4788 18.25 2.737 1e-06 244.8   
   external 69.97                         0          0        112                                   15.03  2.61               
   internal 57.56                         0          0        238                                 

20:59:53 | running eval: valid
21:00:05 | eval completed in 12.31s
21:00:05 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .0174 53.69 635.2 630.5       0          0 14.01  171 .2214    .2004 14.68 2.568 1e-06   190 188.6   
   external         0  .02218 62.36                   0          0         44 .1859          15.89 2.848                     
   internal         0  .01262 45.02                   0          0        127 .2570          13.46 2.288                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.56    .2228      .4666         0                 1440 825.2 819.2  
   external       0          0 17.26    .1866      .4349         0                                   
   internal       0          0 9.854    .2591      .4982         0
21:00:05 | saving model check

21:01:00 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:01:02 | new best ppl: 13.51 (previous best was 13.52)
21:01:02 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:01:06 | time:1204s total_exs:24576 total_steps:1536 epochs:16.15
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      67.74     1 983.8  9044  .01409      3.414 147.1  384              8192  10.27    .4710 20.15  2.79 1e-06 259.7   
   external 64.29                   .007299      .1460        137                                   16.26 2.796               
   internal 59.23                   .004673      .8224        214                                   14.27 2.321               
   otter     79.7                     .0303      9.273         33                                   29.

21:02:01 | running eval: valid
21:02:13 | eval completed in 12.07s
21:02:13 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01656 53.69 635.2 646.5       0          0 14.36  171 .2209    .2004 14.68  2.56 1e-06   190 193.4   
   external         0  .02217 62.36                   0          0         44 .1898          15.89 2.839                     
   internal         0  .01095 45.02                   0          0        127 .2519          13.46  2.28                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.44    .2218      .4673         0                 1608 825.2 839.9  
   external       0          0 17.11    .1852      .4335         0                                   
   internal       0          0 9.773    .2585      .5012         0
21:02:13 | saving model check

21:03:06 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:03:07 | new best ppl: 13.39 (previous best was 13.41)
21:03:07 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:03:12 | time:1331s total_exs:27264 total_steps:1704 epochs:17.91
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      61.84     1 971.9  6945       0          0 114.3  384              8192   10.4    .3227 18.37 2.725 1e-06 244.2   
   external 59.67                         0          0        121                                   14.59 2.711               
   internal 60.65                         0          0        229                                   13.94 2.296               
   otter    65.21                         0          0         34                                   26.

21:04:05 | running eval: valid
21:04:17 | eval completed in 12.07s
21:04:17 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02169 53.69 635.2 651.3       0          0 14.47  171 .2236    .2007 14.68 2.552 1e-06   190 194.8   
   external         0  .02217 62.36                   0          0         44 .1850          15.89 2.831                     
   internal         0  .02121 45.02                   0          0        127 .2622          13.46 2.273                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.33    .2249      .4706         0                 1776 825.2 846.2  
   external       0          0 16.95    .1852      .4406         0                                   
   internal       0          0 9.705    .2646      .5006         0
21:04:17 | saving model check

21:05:12 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:05:13 | new best ppl: 13.3 (previous best was 13.31)
21:05:13 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:05:17 | time:1455s total_exs:29952 total_steps:1872 epochs:19.68
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      64.11     1 956.4 10294       0          0 172.2  384              8192  10.53    .3735 17.61 2.611 1e-06 234.4   
   external 57.07                         0          0        131                                   13.17 2.419               
   internal 58.71                         0          0        218                                   13.72 2.216               
   otter    76.54                         0          0         35                                   25.9

21:06:10 | running eval: valid
21:06:22 | eval completed in 11.91s
21:06:22 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02092 53.69 635.2 651.9       0          0 14.48  171 .2241    .2006 14.68 2.546 1e-06   190   195   
   external         0  .02217 62.36                   0          0         44 .1910          15.89 2.824                     
   internal         0  .01967 45.02                   0          0        127 .2572          13.46 2.267                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.25    .2224      .4679         0                 1944 825.2 846.9  
   external       0          0 16.85    .1824      .4363         0                                   
   internal       0          0 9.653    .2625      .4994         0
21:06:22 | saving model check

21:07:14 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:07:16 | new best ppl: 13.22 (previous best was 13.23)
21:07:16 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:07:17 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:07:21 | time:1580s total_exs:32640 total_steps:2040 epochs:21.45
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      61.02     1 980.1  9309       0          0   152  384              8192  10.78    .5032 18.23 2.622 1e-06 238.6   
   external 64.25                         0          0        158                                   14.06 2.445               
   internal 59.09                         0          0        198                                 

21:08:13 | running eval: valid
21:08:25 | eval completed in 11.89s
21:08:25 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01956 53.69 635.2 652.3       0          0 14.49  171 .2252    .2004 14.68  2.54 1e-06   190 195.1   
   external         0  .02217 62.36                   0          0         44 .1961          15.89 2.819                     
   internal         0  .01695 45.02                   0          0        127 .2543          13.46 2.261                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.18    .2248      .4696         0                 2112 825.2 847.4  
   external       0          0 16.76    .1883      .4392         0                                   
   internal       0          0 9.592    .2612      .5000         0
21:08:25 | saving model check

21:09:18 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:09:20 | new best ppl: 13.16 (previous best was 13.17)
21:09:20 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:09:24 | time:1703s total_exs:35328 total_steps:2208 epochs:23.21
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      61.61     1  1008  7096       0          0 112.6  384             16384  10.51    .4223 25.71  2.68 1e-06 275.8   
   external 66.01                         0          0        148                                   14.16  2.49               
   internal 61.89                         0          0        199                                   14.41 2.299               
   otter    56.92                         0          0         37                                   48.

21:10:18 | running eval: valid
21:10:30 | eval completed in 11.91s
21:10:30 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02044 53.69 635.2 649.2       0          0 14.42  171 .2233    .2003 14.68 2.535 1e-06   190 194.2   
   external         0  .02217 62.36                   0          0         44 .1863          15.89 2.815                     
   internal         0  .01871 45.02                   0          0        127 .2603          13.46 2.255                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.11    .2245      .4683         0                 2280 825.2 843.4  
   external       0          0  16.7    .1833      .4349         0                                   
   internal       0          0 9.534    .2656      .5018         0
21:10:30 | saving model check

21:11:25 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:11:26 | new best ppl: 13.08 (previous best was 13.09)
21:11:26 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:11:30 | time:1829s total_exs:38016 total_steps:2376 epochs:24.98
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      61.59     1 984.5 10224       0          0 166.1  384             16384  10.22    .3837 19.89 2.737 1e-06 258.6   
   external 57.98                         0          0        120                                   15.28 2.665               
   internal 63.02                         0          0        218                                   14.06 2.251               
   otter    63.76                         0          0         46                                   30.

21:12:24 | running eval: valid
21:12:36 | eval completed in 12.06s
21:12:36 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02044 53.69 635.2 640.6       0          0 14.23  171 .2178    .2004 14.68  2.53 1e-06   190 191.6   
   external         0  .02217 62.36                   0          0         44 .1836          15.89  2.81                     
   internal         0  .01871 45.02                   0          0        127 .2520          13.46  2.25                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.05    .2186      .4691         0                 2448 825.2 832.2  
   external       0          0  16.6    .1794      .4335         0                                   
   internal       0          0 9.492    .2578      .5047         0
21:12:36 | saving model check

21:13:30 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:13:32 | new best ppl: 13.02 (previous best was 13.03)
21:13:32 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:13:35 | time:1954s total_exs:40704 total_steps:2544 epochs:26.74
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      67.64     1 931.2  9371  .01709      2.615   161  384             16384  10.25    .5349 19.82 2.642 1e-06 248.6   
   external 59.43                         0          0        112                                   13.91 2.492               
   internal 53.63                         0          0        233                                   13.69 2.215               
   otter    89.85                    .05128      7.846         39                                   31.



21:14:30 | running eval: valid
21:14:42 | eval completed in 12.06s
21:14:42 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02235 53.69 635.2   639       0          0 14.19  171 .2249    .2004 14.68 2.525 1e-06   190 191.1   
   external         0  .02217 62.36                   0          0         44 .1940          15.89 2.805                     
   internal         0  .02252 45.02                   0          0        127 .2559          13.46 2.244                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.98    .2281      .4697         0                 2616 825.2 830.1  
   external       0          0 16.53    .1911      .4335         0                                   
   internal       0          0 9.434    .2651      .5058         0
21:14:42 | saving model check

21:15:35 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:15:36 | new best ppl: 12.96 (previous best was 12.97)
21:15:36 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:15:37 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:15:42 | time:2081s total_exs:43392 total_steps:2712 epochs:28.51
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      69.17     1  1075  7237 .003866      .2796 107.7  384              8192  10.99    .4710  17.9 2.667 1e-06 231.7   
   external 68.59                   .005917      .5148        169                                   13.07 2.556               
   internal 65.19                   .005682      .3239        176                                 

21:16:34 | running eval: valid
21:16:46 | eval completed in 11.96s
21:16:46 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02145 53.69 635.2 646.5       0          0 14.36  171 .2294    .2004 14.68  2.52 1e-06   190 193.4   
   external         0  .02217 62.36                   0          0         44 .1957          15.89   2.8                     
   internal         0  .02072 45.02                   0          0        127 .2630          13.46  2.24                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.92    .2317      .4694         0                 2784 825.2 839.9  
   external       0          0 16.44    .1918      .4335         0                                   
   internal       0          0 9.397    .2716      .5053         0
21:16:46 | saving model check

21:17:39 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:17:41 | new best ppl: 12.9 (previous best was 12.91)
21:17:41 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:17:45 | time:2203s total_exs:46080 total_steps:2880 epochs:30.28
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      61.44     1 982.7  9966       0          0 162.3  384              8192  10.32    .4327 22.26 2.605 1e-06 256.5   
   external 59.58                         0          0        133                                   15.32 2.439               
   internal  62.4                         0          0        222                                   13.88 2.159               
   otter    62.34                         0          0         29                                   37.5

21:18:39 | running eval: valid
21:18:50 | eval completed in 11.84s
21:18:50 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02145 53.69 635.2 653.2       0          0 14.51  171 .2243    .2004 14.68 2.516 1e-06   190 195.4   
   external         0  .02217 62.36                   0          0         44 .1891          15.89 2.795                     
   internal         0  .02072 45.02                   0          0        127 .2595          13.46 2.237                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.87    .2269      .4707         0                 2952 825.2 848.5  
   external       0          0 16.37    .1859      .4349         0                                   
   internal       0          0 9.365    .2680      .5064         0
21:18:50 | saving model check

21:19:44 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:19:46 | new best ppl: 12.85 (previous best was 12.86)
21:19:46 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:19:49 | time:2328s total_exs:48768 total_steps:3048 epochs:32.04
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      62.51     1   934 10383 .002625      .6509 177.9  384              8192  10.19    .5349 22.55 2.676 1e-06 261.8   
   external 66.14                   .007874      1.953        127                                   16.82 2.612               
   internal 54.31                         0          0        233                                   14.55 2.193               
   otter    67.08                         0          0         24                                   36.

21:20:43 | running eval: valid
21:20:55 | eval completed in 11.82s
21:20:55 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02144 53.69 635.2 654.4       0          0 14.54  171 .2227    .2002 14.68 2.512 1e-06   190 195.7   
   external         0  .02217 62.36                   0          0         44 .1842          15.89 2.791                     
   internal         0  .02072 45.02                   0          0        127 .2611          13.46 2.233                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.81    .2272      .4701         0                 3120 825.2 850.1  
   external       0          0  16.3    .1850      .4349         0                                   
   internal       0          0 9.328    .2694      .5053         0
21:20:55 | saving model check

21:21:47 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:21:49 | new best ppl: 12.79 (previous best was 12.8)
21:21:49 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:21:53 | time:2452s total_exs:51456 total_steps:3216 epochs:33.81
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      60.08     1 942.2  6996       0          0 118.8  384              8192  9.913    .3133 21.92 2.638 1e-06 256.1   
   external 58.55                         0          0        149                                    15.5 2.472               
   internal 58.56                         0          0        207                                   13.73 2.073               
   otter    63.14                         0          0         28                                   36.5

21:22:46 | running eval: valid
21:22:58 | eval completed in 11.81s
21:22:58 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02056 53.69 635.2 652.1       0          0 14.49  171 .2211    .2007 14.68 2.508 1e-06   190 195.1   
   external         0  .02217 62.36                   0          0         44 .1852          15.89 2.786                     
   internal         0  .01896 45.02                   0          0        127 .2570          13.46  2.23                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.76    .2262      .4721         0                 3288 825.2 847.2  
   external       0          0 16.22    .1833      .4349         0                                   
   internal       0          0 9.298    .2691      .5094         0
21:22:58 | saving model check

21:23:51 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:23:52 | new best ppl: 12.74 (previous best was 12.75)
21:23:52 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:23:53 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:23:57 | time:2576s total_exs:54144 total_steps:3384 epochs:35.57
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      61.26     1  1038  9838 .001565      .1362 151.6  384              8192  9.853    .4607 24.64 2.728 1e-06 274.1   
   external  63.4                         0          0        136                                   15.35 2.505               
   internal 68.36                   .004695      .4085        213                                 

21:24:49 | running eval: valid
21:25:01 | eval completed in 11.84s
21:25:01 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02086 53.69 635.2 651.7       0          0 14.48  171 .2315    .2003 14.68 2.504 1e-06   190 194.9   
   external         0  .02218 62.36                   0          0         44 .1976          15.89 2.781                     
   internal         0  .01955 45.02                   0          0        127 .2654          13.46 2.228                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  12.7    .2345      .4733         0                 3456 825.2 846.6  
   external       0          0 16.13    .1946      .4378         0                                   
   internal       0          0  9.28    .2743      .5088         0
21:25:01 | saving model check

21:25:55 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:25:57 | new best ppl: 12.68 (previous best was 12.69)
21:25:57 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:26:01 | time:2699s total_exs:56832 total_steps:3552 epochs:37.34
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      74.59     1  1106  9899 .009431      .5406 143.3  384              8192  10.14    .5349 20.18 2.546 1e-06 255.4   
   external 67.75                         0          0        119                                   14.87 2.405               
   internal 66.49                   .004484      .2646        223                                   13.87  2.02               
   otter    89.52                    .02381      1.357         42                                   31.

21:26:53 | running eval: valid
21:27:05 | eval completed in 11.92s
21:27:05 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02086 53.69 635.2 648.7       0          0 14.41  171 .2292    .2004 14.68 2.501 1e-06   190   194   
   external         0  .02218 62.36                   0          0         44 .1951          15.89 2.778                     
   internal         0  .01955 45.02                   0          0        127 .2633          13.46 2.224                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.67    .2325      .4731         0                 3624 825.2 842.7  
   external       0          0 16.09    .1927      .4363         0                                   
   internal       0          0 9.245    .2722      .5099         0
21:27:05 | saving model check

21:27:59 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:28:00 | new best ppl: 12.65 (previous best was 12.65)
21:28:00 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:28:05 | time:2823s total_exs:59520 total_steps:3720 epochs:39.11
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      63.04     1 991.5  7170       0          0 115.7  384              8192  9.804    .3914 21.27 2.541 1e-06 265.5   
   external 61.31                         0          0        141                                   15.53 2.353               
   internal 61.95                         0          0        218                                   15.56  2.17               
   otter    65.84                         0          0         25                                   32.

21:28:58 | running eval: valid
21:29:10 | eval completed in 11.91s
21:29:10 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02294 53.69 635.2 647.2       0          0 14.38  171 .2312    .2004 14.68 2.498 1e-06   190 193.6   
   external         0  .02808 62.36                   0          0         44 .2008          15.89 2.776                     
   internal         0   .0178 45.02                   0          0        127 .2617          13.46 2.221                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.63    .2316      .4734         0                 3792 825.2 840.8  
   external       0          0 16.05    .1939      .4363         0                                   
   internal       0          0 9.215    .2692      .5105         0
21:29:10 | saving model check

21:30:03 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:30:05 | new best ppl: 12.61 (previous best was 12.62)
21:30:05 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:30:09 | time:2947s total_exs:62208 total_steps:3888 epochs:40.87
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      60.43     1  1031  9444       0          0 146.6  384              8192   9.95    .5029 22.29  2.57 1e-06 269.8   
   external  60.6                         0          0        129                                   15.54  2.37               
   internal 69.29                         0          0        213                                   13.71 2.047               
   otter     51.4                         0          0         42                                   37.

21:31:02 | running eval: valid
21:31:14 | eval completed in 11.90s
21:31:14 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02393 53.69 635.2 651.7       0          0 14.48  171 .2363    .2004 14.68 2.496 1e-06   190 194.9   
   external         0  .02808 62.36                   0          0         44 .2105          15.89 2.773                     
   internal         0  .01979 45.02                   0          0        127 .2621          13.46 2.218                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  12.6    .2365      .4726         0                 3960 825.2 846.6  
   external       0          0 16.01    .1993      .4335         0                                   
   internal       0          0 9.194    .2738      .5117         0
21:31:14 | saving model check

21:32:05 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:32:06 | new best ppl: 12.58 (previous best was 12.59)
21:32:06 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:32:08 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:32:12 | time:3071s total_exs:64896 total_steps:4056 epochs:42.64
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      61.74     1 999.8  9063 .002208      .1214   145  384              8192  11.05    .5349 23.23 2.579 1e-06 248.3   
   external 71.93                   .006623      .3642        151                                   13.34 2.292               
   internal 56.59                         0          0        206                                 

21:33:02 | running eval: valid
21:33:14 | eval completed in 11.91s
21:33:14 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02294 53.69 635.2   649       0          0 14.42  171 .2316    .2005 14.68 2.493 1e-06   190 194.1   
   external         0  .02808 62.36                   0          0         44 .2022          15.89 2.771                     
   internal         0   .0178 45.02                   0          0        127 .2610          13.46 2.215                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.57    .2336      .4740         0                 4128 825.2 843.2  
   external       0          0 15.97    .1985      .4363         0                                   
   internal       0          0 9.163    .2686      .5117         0
21:33:14 | saving model check

21:34:08 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:34:09 | new best ppl: 12.55 (previous best was 12.56)
21:34:09 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:34:13 | time:3192s total_exs:67584 total_steps:4224 epochs:44.40
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      66.35     1  1016  9756 .002976      .4554 153.7  384              8192  10.24    .5349 20.87 2.508 1e-06 264.1   
   external 63.39                         0          0        116                                   14.19 2.232               
   internal 63.12                   .008929      1.366        224                                   14.54 2.142               
   otter    72.52                         0          0         44                                   33.

21:35:07 | running eval: valid
21:35:19 | eval completed in 11.89s
21:35:19 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02393 53.69 635.2 648.3       0          0  14.4  171 .2359    .2004 14.68  2.49 1e-06   190 193.9   
   external         0  .02808 62.36                   0          0         44 .2092          15.89 2.766                     
   internal         0  .01979 45.02                   0          0        127 .2627          13.46 2.213                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.52    .2386      .4746         0                 4296 825.2 842.2  
   external       0          0  15.9    .2016      .4363         0                                   
   internal       0          0 9.147    .2757      .5129         0
21:35:19 | saving model check

21:36:14 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:36:15 | new best ppl: 12.52 (previous best was 12.52)
21:36:15 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:36:20 | time:3318s total_exs:70272 total_steps:4392 epochs:46.17
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      73.13     1  1144  9208  .01261      1.826 128.8  384              8192  9.771    .5271 22.77 2.558 1e-06 286.3   
   external 83.68                    .03306      4.405        121                                    15.2  2.38               
   internal  69.4                   .004785      1.072        209                                   14.25 2.191               
   otter    66.31                         0          0         54                                   38.

21:37:13 | running eval: valid
21:37:25 | eval completed in 11.82s
21:37:25 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02294 53.69 635.2 652.4       0          0 14.49  171 .2352    .2003 14.68 2.489 1e-06   190 195.1   
   external         0  .02808 62.36                   0          0         44 .2069          15.89 2.765                     
   internal         0   .0178 45.02                   0          0        127 .2636          13.46 2.212                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.51    .2374      .4750         0                 4464 825.2 847.5  
   external       0          0 15.89    .2018      .4378         0                                   
   internal       0          0 9.132    .2730      .5123         0
21:37:25 | saving model check

21:38:20 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:38:21 | new best ppl: 12.5 (previous best was 12.5)
21:38:21 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:38:25 | time:3444s total_exs:72960 total_steps:4560 epochs:47.94
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      64.47     1 979.8 10074       0          0 164.5  384              8192  10.33    .3223 19.18 2.485 1e-06 254.2   
   external 60.93                         0          0        123                                   16.12 2.257               
   internal 59.73                         0          0        228                                   14.25 2.059               
   otter    72.76                         0          0         33                                   27.15

21:39:15 | running eval: valid
21:39:27 | eval completed in 11.81s
21:39:27 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02382 53.69 635.2 653.6       0          0 14.52  171 .2395    .2004 14.68 2.488 5e-07   190 195.5   
   external         0  .02808 62.36                   0          0         44 .2095          15.89 2.766                     
   internal         0  .01955 45.02                   0          0        127 .2694          13.46  2.21                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  12.5    .2418      .4756         0                 4632 825.2 849.1  
   external       0          0 15.89    .2046      .4378         0                                   
   internal       0          0 9.113    .2791      .5135         0
21:39:27 | saving model check

21:40:16 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:40:20 | time:3559s total_exs:75648 total_steps:4728 epochs:49.70
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss      lr  ltpb  \
   all      65.86     1  1026  9456 .001462     .08333 147.4  384             16384  9.533    .4954  21.6 2.465 2.5e-07 277.6   
   external 66.18                         0          0        102                                   14.28 2.135                 
   internal 62.34                   .004386      .2500        228                                   14.43 1.963                 
   otter    69.06                         0          0         54                                   36.07 3.298                 
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all       2558  .01852      .1975 14.21      .48

21:41:11 | running eval: valid
21:41:23 | eval completed in 11.86s
21:41:23 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  ltps  \
   all              0  .02327 53.69 635.2 650.7       0          0 14.45  171 .2401    .2004 14.68 2.487 2.5e-07   190 194.6   
   external         0  .02897 62.36                   0          0         44 .2134          15.89 2.765                       
   internal         0  .01756 45.02                   0          0        127 .2668          13.46 2.209                       
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  12.5    .2429      .4753         0                 4800 825.2 845.3  
   external       0          0 15.88    .2081      .4378         0                                   
   internal       0          0 9.111    .2776      .5129         0
21:41:23 | saving mod

21:42:15 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:42:16 | new best ppl: 12.5 (previous best was 12.5)
21:42:16 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:42:20 | time:3679s total_exs:78336 total_steps:4896 epochs:51.47
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss      lr  ltpb  \
   all      72.69     1  1076  9226  .01884      2.425 137.2  384             16384  9.576    .5349 27.16 2.525 2.5e-07 299.2   
   external  84.6                    .02521      .9076        119                                   14.74 2.198                 
   internal  59.4                   .009091      1.391        220                                   14.75 2.035                 
   otter    74.07                    .02222      4.978         45                                

21:43:14 | running eval: valid
21:43:26 | eval completed in 11.81s
21:43:26 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  ltps  \
   all              0  .02339 53.69 635.2 653.6       0          0 14.52  171 .2395    .2005 14.68 2.487 1.25e-07   190 195.5   
   external         0  .02897 62.36                   0          0         44 .2120          15.89 2.765                        
   internal         0   .0178 45.02                   0          0        127 .2669          13.46 2.209                        
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  12.5    .2424      .4753         0                 4968 825.2 849.1  
   external       0          0 15.88    .2071      .4378         0                                   
   internal       0          0 9.108    .2776      .5129         0
21:43:26 | saving

21:44:16 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:44:18 | new best ppl: 12.49 (previous best was 12.49)
21:44:18 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:44:22 | time:3800s total_exs:81024 total_steps:5064 epochs:53.24
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss       lr  \
   all      65.07     1  1001  9313       0          0 148.9  384             16384  10.08    .4626 29.38 2.691 6.25e-08   
   external 64.31                         0          0        124                                   15.48 2.342            
   internal 60.41                         0          0        227                                   14.82 2.188            
   otter    70.48                         0          0         33                                   57.82 3.544    



21:45:15 | running eval: valid
21:45:27 | eval completed in 11.79s
21:45:27 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  ltps  \
   all              0  .02339 53.69 635.2 654.8       0          0 14.55  171 .2399    .2004 14.68 2.487 6.25e-08   190 195.9   
   external         0  .02897 62.36                   0          0         44 .2120          15.89 2.765                        
   internal         0   .0178 45.02                   0          0        127 .2677          13.46 2.209                        
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  12.5    .2427      .4747         0                 5136 825.2 850.7  
   external       0          0 15.88    .2071      .4378         0                                   
   internal       0          0 9.108    .2783      .5117         0
21:45:27 | saving

21:46:17 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:46:18 | did not beat best ppl: 12.4931 impatience: 4
21:46:21 | time:3919s total_exs:83712 total_steps:5232 epochs:55.00
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      61.04     1 979.4  9767 .002415      .2101 159.5  384              8192  10.29    .4344 19.94 2.538 3.125e-08   
   external 57.52                   .007246      .6304        138                                   14.62 2.247             
   internal 64.04                         0          0        206                                   14.29  2.05             
   otter    61.55                         0          0         40                                    30.9 3.317             
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
 

21:47:11 | running eval: valid
21:47:23 | eval completed in 11.83s
21:47:23 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .02339 53.69 635.2 651.6       0          0 14.47  171 .2399    .2006 14.68 2.487 1.562e-08   190   
   external         0  .02897 62.36                   0          0         44 .2120          15.89 2.765                   
   internal         0   .0178 45.02                   0          0        127 .2677          13.46 2.209                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      194.9       0          0 12.49    .2427      .4753         0                 5304 825.2 846.5  
   external             0          0 15.88    .2071      .4378         0                                   
   internal             0          0 9.107    .2783      .5129         0
21:47:23 | sa

21:48:14 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:48:15 | new best ppl: 12.49 (previous best was 12.49)
21:48:15 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:48:16 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:48:21 | time:4040s total_exs:86400 total_steps:5400 epochs:56.77
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all       56.9     1 956.8  7194       0          0 120.3  384              8192   10.1    .4625 21.34 2.541 1.562e-08   
   external  63.3                         0          0        140                                   15.31 2.329             
   internal 58.65                         0          0        223                                   14.3

21:49:08 | running eval: valid
21:49:20 | eval completed in 11.91s
21:49:20 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .02339 53.69 635.2   648       0          0  14.4  171 .2402    .2005 14.68 2.487 1.562e-08   190   
   external         0  .02897 62.36                   0          0         44 .2120          15.89 2.765                   
   internal         0  .01781 45.02                   0          0        127 .2683          13.46 2.209                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      193.8       0          0 12.49    .2427      .4753         0                 5472 825.2 841.9  
   external             0          0 15.88    .2071      .4378         0                                   
   internal             0          0 9.106    .2783      .5129         0
21:49:20 | sa

21:50:12 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:50:13 | did not beat best ppl: 12.4908 impatience: 1
21:50:16 | time:4154s total_exs:89088 total_steps:5568 epochs:58.53
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all       68.1     1  1043  9679       0          0 148.4  384              8192  9.818    .4564 23.68 2.583 1.562e-08   
   external 67.72                         0          0        131                                   14.59 2.381             
   internal 60.64                         0          0        199                                   14.08 2.077             
   otter    75.93                         0          0         54                                   42.37  3.29             
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps  ups  
  

21:51:06 | running eval: valid
21:51:17 | eval completed in 11.83s
21:51:17 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .02339 53.69 635.2 653.4       0          0 14.52  171 .2402    .2003 14.68 2.487 1.562e-08   190   
   external         0  .02897 62.36                   0          0         44 .2120          15.89 2.765                   
   internal         0  .01781 45.02                   0          0        127 .2683          13.46 2.209                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      195.4       0          0 12.49    .2427      .4753         0                 5640 825.2 848.9  
   external             0          0 15.87    .2071      .4378         0                                   
   internal             0          0 9.107    .2783      .5129         0
21:51:17 | sa

21:52:09 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:52:10 | did not beat best ppl: 12.4905 impatience: 3
21:52:12 | time:4271s total_exs:91776 total_steps:5736 epochs:60.30
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      62.49     1  1004 10671 .004273      1.113 170.1  384              8192  10.06    .4640 21.68  2.47 1.562e-08   
   external 63.41                   .008333       2.55        120                                   14.04 2.241             
   internal 65.26                   .004484      .7892        223                                   14.52 2.099             
   otter     58.8                         0          0         41                                   36.49 3.069             
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
 

21:53:02 | running eval: valid
21:53:14 | eval completed in 11.79s
21:53:14 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .02339 53.69 635.2 653.7       0          0 14.52  171 .2402    .2002 14.68 2.487 1.562e-08   190   
   external         0  .02897 62.36                   0          0         44 .2120          15.89 2.765                   
   internal         0  .01781 45.02                   0          0        127 .2683          13.46 2.209                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      195.5       0          0 12.49    .2427      .4750         0                 5808 825.2 849.2  
   external             0          0 15.88    .2071      .4378         0                                   
   internal             0          0 9.107    .2783      .5123         0
21:53:14 | sa

21:54:03 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:54:04 | did not beat best ppl: 12.4905 impatience: 10
21:54:08 | time:4386s total_exs:94464 total_steps:5904 epochs:62.07
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      65.21     1 993.8  6892       0          0   111  384              8192  10.38    .3617 20.58 2.458 1.562e-08   
   external 61.34                         0          0        149                                   15.41 2.276             
   internal 60.43                         0          0        197                                   13.86 2.003             
   otter    73.87                         0          0         38                                   32.47 3.095             
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb  tps   ups  
 

21:54:57 | running eval: valid
21:55:09 | eval completed in 11.83s
21:55:09 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .02339 53.69 635.2 653.9       0          0 14.52  171 .2399    .2005 14.68 2.487 1.562e-08   190   
   external         0  .02897 62.36                   0          0         44 .2120          15.89 2.765                   
   internal         0   .0178 45.02                   0          0        127 .2677          13.46 2.209                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      195.6       0          0 12.49    .2427      .4744         0                 5976 825.2 849.4  
   external             0          0 15.88    .2071      .4378         0                                   
   internal             0          0 9.105    .2783      .5111         0
21:55:09 | sa

21:55:59 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:56:00 | did not beat best ppl: 12.4894 impatience: 2
21:56:00 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:56:04 | time:4503s total_exs:97152 total_steps:6072 epochs:63.83
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      62.54     1  1025  9650       0          0 150.6  384              8192  10.01    .4273 18.98 2.663 1.562e-08   
   external 66.23                         0          0        123                                   16.02 2.359             
   internal    64                         0          0        224                                   14.76 2.132             
   otter    57.38                         0          0         37                                   26.1

21:56:53 | running eval: valid
21:57:05 | eval completed in 11.82s
21:57:05 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .02339 53.69 635.2 652.5       0          0 14.49  171 .2399    .2004 14.68 2.487 1.562e-08   190   
   external         0  .02897 62.36                   0          0         44 .2120          15.89 2.765                   
   internal         0   .0178 45.02                   0          0        127 .2677          13.46 2.209                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      195.2       0          0 12.49    .2427      .4750         0                 6144 825.2 847.6  
   external             0          0 15.88    .2071      .4378         0                                   
   internal             0          0 9.106    .2783      .5123         0
21:57:05 | sa

21:57:55 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.checkpoint
21:57:56 | did not beat best ppl: 12.4894 impatience: 9
21:57:59 | time:4617s total_exs:99840 total_steps:6240 epochs:65.60
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      62.98     1  1008  9700       0          0   154  384              8192  10.23    .3946 21.56 2.532 1.562e-08   
   external 59.84                         0          0        130                                    15.3 2.283             
   internal 64.59                         0          0        214                                      14 2.041             
   otter    64.53                         0          0         40                                   35.38 3.273             
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
 

21:58:49 | running eval: valid
21:59:00 | eval completed in 11.80s
21:59:00 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .02339 53.69 635.2 654.8       0          0 14.54  171 .2391    .2007 14.68 2.487 1.562e-08   190   
   external         0  .02897 62.36                   0          0         44 .2099          15.89 2.765                   
   internal         0  .01781 45.02                   0          0        127 .2683          13.46 2.209                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      195.8       0          0 12.49    .2419      .4750         0                 6312 825.2 850.7  
   external             0          0 15.88    .2055      .4378         0                                   
   internal             0          0 9.105    .2783      .5123         0
21:59:00 | sa

21:59:35 | loading dictionary from /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model.dict
21:59:35 | num words = 54944
21:59:36 | Total parameters: 87,508,992 (87,508,992 trainable)
21:59:36 | Loading existing model params from /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter-6,3,1/model
21:59:37 | creating task(s): internal
21:59:37 | Loading ParlAI text data: /home/alex/ParlaiEmely/ParlAI/data/internal/valid.txt
21:59:37 | creating task(s): external
21:59:37 | Loading ParlAI text data: /home/alex/ParlaiEmely/ParlAI/data/external/valid.txt
21:59:37 | running eval: valid
21:59:49 | eval completed in 11.80s
21:59:49 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .02339 53.69 635.2 654.9       0          0 14.55  171 .2402    .1933 14.68 2.487 1.562e-08   190   
   external         0  .02897 62.36                   0  

internal/exs/train,195.0
exs/train,384.0
internal/clen/train,71.93333
internal/ctrunc/train,0.01026
internal/ctrunclen/train,1.5641
internal/llen/train,13.57949
internal/ltrunc/train,0.0
internal/ltrunclen/train,0.0
internal/loss/train,2.01888
internal/ppl/train,7.52989


internal/exs/train,▆█▅█▃▄▄▆▇▅▄▃▅▄▅▇▄▄▅▆▄▁▆▃▅▃▆▄▂▅▆▅▅▅▅▆▄▄▅▅
exs/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/clen/train,▅▄▄▂▄▃▄▃▃▃▅▅▃▂▃▂▄▄▄▄▅▆▅▃▅▄▄▂█▂▃▃█▄▇▅▂▃▅▁
internal/ctrunc/train,▄▄▁▁▅▁▁▁▁▅▅▁▁▁▁▁▅▁▁▁▅▅▄▁▄▅█▁▅▁█▁▄▁█▁▁▁▁▁
internal/ctrunclen/train,▂▂▁▁▃▁▁▁▁▅▇▁▁▁▁▁▃▁▁▁▂▃▂▁▂▂█▁▃▁█▁▂▁▇▁▁▁▁▁
internal/llen/train,▂▇▄▇▆▄▁▃▁▅▂▅▂▆▇▅▃▅█▃▅▇▃▄▄▅▆▁▇█▇█▆▃▄▁▃▆▇▃
internal/ltrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ltrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/loss/train,██▆▆▇▅▄▄▄▅▃▅▄▄▄▄▄▄▃▃▃▄▁▂▂▂▃▁▂▂▂▂▂▁▁▂▃▂▂▂
internal/ppl/train,██▅▅▆▄▄▃▄▄▃▅▃▄▃▄▃▄▃▂▃▃▁▂▂▂▂▁▁▂▁▁▂▁▁▂▂▂▂▂
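
The validation tables above report `loss` and `ppl` per task plus an `all` row. As a sanity check on how those columns relate, the sketch below recomputes the numbers from the logged per-task losses. It assumes that per-task perplexity is `exp(loss)` and that the `all` row is the unweighted mean over the two validation tasks. Both assumptions are inferred from the logged values, not confirmed by the source.

```python
import math

# Per-task validation loss copied from the validation tables above.
valid_loss = {"external": 2.765, "internal": 2.209}

# Assumption: per-task perplexity is exp(loss).
valid_ppl = {task: math.exp(loss) for task, loss in valid_loss.items()}

# Assumption: the `all` row is the unweighted mean over the tasks.
macro_loss = sum(valid_loss.values()) / len(valid_loss)
macro_ppl = sum(valid_ppl.values()) / len(valid_ppl)

print({task: round(p, 2) for task, p in valid_ppl.items()})
print("all loss:", round(macro_loss, 3), "all ppl:", round(macro_ppl, 2))
```

Under these assumptions the `all` perplexity is the mean of the per-task perplexities, not `exp` of the `all` loss (which would be roughly 12.0, not the logged 12.49). This matters when comparing the `ppl` used for early stopping against the reported loss.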


## 4. Blender base + otter and bst

### Datasets with sampling weights:
- internal: 6
- external: 3
- otter: 1
- bst (blended_skill_talk): 1
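
To make the sampling weights above easier to read, here is a minimal sketch that converts them into sampling fractions. It assumes a task is drawn with probability `weight / sum(weights)`; the dictionary and variable names are illustrative and not part of the notebook.

```python
from fractions import Fraction

# Sampling weights as listed for this run (bst = blended_skill_talk).
weights = {"internal": 6, "external": 3, "otter": 1, "blended_skill_talk": 1}

total = sum(weights.values())

# Assumption: a task is drawn with probability weight / sum(weights).
probs = {task: Fraction(w, total) for task, w in weights.items()}

for task, p in probs.items():
    print(f"{task:<20} {p}  ({float(p):.1%})")
```

The per-task `exs` columns in the training logs count individual examples, so they need not match these fractions exactly, for instance if a task is sampled per episode and its episodes contain several turns.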

### Mutators
None

In [7]:
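tasks = 'internal,external,otter,blended_skill_talk'
# Only the first two weights are given; otter and blended_skill_talk are
# expected to fall back to a weight of 1, as listed above.
weights = '6,3'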

run_training(tasks=tasks,weights=weights)

22:00:29 | building dictionary first...
22:00:29 | No model with opt yet at: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter,blended_skill_talk-6,3/model(.opt)
22:00:29 | your model is being loaded with opts that do not exist in the model you are initializing the weights with: allow_missing_init_opts: False,download_path: None,loglevel: info,dynamic_batching: None,verbose: False,is_debug: False,datapath: /home/alex/ParlaiEmely/ParlAI/data,eval_dynamic_batching: None,num_workers: 0,max_train_steps: -1,log_every_n_steps: 50,validation_every_n_steps: -1,load_from_checkpoint: True,tensorboard_logdir: None,wandb_log: True,wandb_name: None,wandb_project: parlaiemely,wandb_entity: None,mutators: None,n_encoder_layers: -1,n_decoder_layers: -1,model_parallel: False,beam_block_full_context: True,beam_delay: 30,beam_block_list_filename: None,temperature: 1.0,interactive_mode: False,history_reversed: False,history_add_global_end_token: None,special_tok_lst: None,bpe_v

22:00:31 |     text_truncate: 512
22:00:31 |     topk: 10
22:00:31 |     topp: 0.9
22:00:31 |     truncate: -1
22:00:31 |     update_freq: 1
22:00:31 |     use_reply: label
22:00:31 |     validation_cutoff: 1.0
22:00:31 |     validation_every_n_epochs: 0.25
22:00:31 |     validation_every_n_secs: -1
22:00:31 |     validation_every_n_steps: -1
22:00:31 |     validation_max_exs: 20000
22:00:31 |     validation_metric: ppl
22:00:31 |     validation_metric_mode: min
22:00:31 |     validation_patience: 15
22:00:31 |     validation_share_agent: False
22:00:31 |     variant: xlm
22:00:31 |     verbose: False
22:00:31 |     wandb_entity: None
22:00:31 |     wandb_log: True
22:00:31 |     wandb_name: None
22:00:31 |     wandb_project: parlaiemely
22:00:31 |     warmup_rate: 0.0001
22:00:31 |     warmup_updates: -1
22:00:31 |     weight_decay: None
22:00:31 | Current ParlAI commit: adb7ed134aec01437ef8d4e6bf450827fd0fbdc2
22:00:31 | creating task(s): internal,external,otter,blended_skill_talk
22

22:00:48 | training...
22:00:54 | time:23s total_exs:800 total_steps:50 epochs:0.03
                       clen  clip  ctpb  ctps   ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gpu_mem  llen  loss    lr  \
   all                81.62 .9800  1345 11450 .0007813     .06797 136.2  800             18186    .4414 21.25 2.846 1e-06   
   blended_skill_talk 87.96                          0          0        161                            17.68 2.274         
   external           85.63                          0          0        248                            14.23 3.007         
   internal           84.75                    .003125      .2719        320                            14.33 2.665         
   otter              68.15                          0          0         71                            38.75 3.437         
                       ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                  272  2315  .01056     



22:01:01 | time:30s total_exs:1600 total_steps:100 epochs:0.06
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.22     1  1362 10523 .002463      .3768 123.7  800              8192   11.1    .5350 20.78 2.814   
   blended_skill_talk 96.46                         0          0        175                                   17.21 2.266   
   external           88.15                   .009852      1.507        203                                   14.56 2.937   
   internal           79.41                         0          0        357                                    14.1 2.609   
   otter              80.83                         0          0         65                                   37.25 3.443   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps  ups  
   all                1e-06   268  2071 .003846     .09231 18.

22:01:31 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter,blended_skill_talk-6,3/model.checkpoint
22:01:31 | Saving dictionary to /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter,blended_skill_talk-6,3/model.checkpoint.dict
22:01:38 | time:67s total_exs:6400 total_steps:400 epochs:0.22
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                87.93     1  1411  9029       0          0 102.4  800              8192   10.7    .3704 18.92 2.693   
   blended_skill_talk 93.23                         0          0        218                                   17.45 2.311   
   external           87.57                         0          0        242                                   14.45 2.837   
   internal           85.35                         0          0        292                                   14.39 2.449   
   otter     

22:02:25 | time:114s total_exs:10336 total_steps:646 epochs:0.36
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                94.72     1  1464 10719 .003968      .2183 117.1  800              8192  10.24    .5350 21.97 2.754   
   blended_skill_talk 89.93                         0          0        231                                   18.15 2.371   
   external           96.07                         0          0        176                                   14.61 2.715   
   internal           87.71                         0          0        330                                   14.24 2.417   
   otter              105.2                    .01587      .8730         63                                   40.87 3.512   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 278.6  2040   .0119      .4206 

22:02:57 | running eval: valid
22:03:10 | eval completed in 13.17s
22:03:10 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .008162 53.69 635.2 596.7       0          0 13.26  171 .2013    .2005 14.68 2.629 1e-06   190 178.5   
   external         0 .004705 62.36                   0          0         44 .1603          15.89 2.914                     
   internal         0  .01162 45.02                   0          0        127 .2423          13.46 2.344                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.42    .2070      .4549         0                  892 825.2 775.2  
   external       0          0 18.43    .1700      .4192         0                                   
   internal       0          0 10.42    .2440      .4906         0
22:03:10 | saving model check

22:03:52 | time:200s total_exs:19072 total_steps:1192 epochs:0.67
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.03     1  1434 12604       0          0 140.6  800              8192  10.25    .4624 19.61  2.73   
   blended_skill_talk 90.69                         0          0        232                                   18.11 2.301   
   external           89.86                         0          0        190                                   15.74 2.762   
   internal           90.38                         0          0        343                                   13.84 2.403   
   otter              73.17                         0          0         35                                   30.74 3.453   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 260.3  2288       0          0

22:04:35 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter,blended_skill_talk-6,3/model.checkpoint
22:04:40 | time:249s total_exs:23008 total_steps:1438 epochs:0.81
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                88.35     1  1387  8619  .00463      .2731 99.43  800              8192  10.64    .5350 21.37 2.695   
   blended_skill_talk 85.55                         0          0        197                                   17.51 2.274   
   external           88.77                         0          0        213                                   14.05 2.742   
   internal           85.01                         0          0        336                                   14.23 2.354   
   otter              94.06                    .01852      1.093         54                                    39.7 3.412   
                         lr  ltpb  

22:05:19 | time:288s total_exs:27808 total_steps:1738 epochs:0.97
                       clen  clip  ctpb  ctps   ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                95.37     1  1538 10942 .0007576     .04167 113.8  800              8192  10.85    .5350 20.63 2.713   
   blended_skill_talk 96.19                          0          0        227                                   18.83 2.375   
   external           91.48                          0          0        186                                   15.47 2.702   
   internal           99.06                     .00303      .1667        330                                   15.05 2.437   
   otter              94.74                          0          0         57                                   33.18 3.339   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 278.9  1985 .008772     

22:06:07 | time:336s total_exs:31744 total_steps:1984 epochs:1.11
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 93.1     1  1463 12278 .004167      .2292 134.2  800              8192   9.99    .5350 26.09   2.7   
   blended_skill_talk 95.11                         0          0        192                                   18.11 2.304   
   external           91.43                         0          0        190                                   15.15 2.688   
   internal           88.72                         0          0        358                                   13.68  2.29   
   otter              97.15                    .01667      .9167         60                                   57.43 3.519   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 288.9  2424   .0500      1.067

22:06:38 | running eval: valid
22:06:50 | eval completed in 12.39s
22:06:50 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01776 53.69 635.2   627       0          0 13.93  171 .2201    .2004 14.68 2.564 1e-06   190 187.5   
   external         0  .02217 62.36                   0          0         44 .1890          15.89 2.846                     
   internal         0  .01334 45.02                   0          0        127 .2512          13.46 2.283                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.51    .2242      .4637         0                 2230 825.2 814.5  
   external       0          0 17.22    .1892      .4292         0                                   
   internal       0          0 9.807    .2591      .4982         0
22:06:50 | saving model check

22:07:31 | time:420s total_exs:40480 total_steps:2530 epochs:1.42
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.24     1  1365 10931       0          0 128.1  800             16384  10.37    .4562 17.79  2.57   
   blended_skill_talk 86.82                         0          0        200                                   17.28 2.256   
   external           80.75                         0          0        203                                   14.27  2.44   
   internal           86.28                         0          0        346                                   13.56  2.29   
   otter               91.1                         0          0         51                                   26.06 3.294   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 247.5  1982       0          0

22:08:18 | time:467s total_exs:44416 total_steps:2776 epochs:1.56
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                91.97     1  1406 12141 .007687      .8497 138.1  800             16384  10.35    .5350  21.9 2.659   
   blended_skill_talk 101.3                         0          0        208                                   16.65 2.337   
   external           82.55                         0          0        213                                   15.42 2.585   
   internal            82.9                   .009009      1.508        333                                   14.76 2.332   
   otter              101.1                    .02174      1.891         46                                   40.78 3.382   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 267.8  2312  .02174      .8152

22:08:55 | time:504s total_exs:49216 total_steps:3076 epochs:1.72
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                89.44     1  1411 11918       0          0 135.2  800             16384  10.39    .4608 21.76 2.711   
   blended_skill_talk    92                         0          0        185                                   18.31 2.444   
   external            87.2                         0          0        230                                   14.32 2.525   
   internal            86.2                         0          0        341                                   14.19 2.274   
   otter              92.36                         0          0         44                                   40.23 3.602   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 263.2  2223  .02273      .7386

22:09:43 | time:552s total_exs:53152 total_steps:3322 epochs:1.86
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.73     1  1393 10533 .004001      .3724   121  800             16384  9.861    .5350 22.64 2.567   
   blended_skill_talk 91.65                    .01005      .5879        199                                   18.03 2.262   
   external           83.38                         0          0        205                                   15.18 2.509   
   internal           88.48                   .005952      .9018        336                                   13.95 2.221   
   otter              83.42                         0          0         60                                   43.38 3.274   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 277.2  2096  .01667      .5375



22:10:01 | time:570s total_exs:55552 total_steps:3472 epochs:1.95
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.99     1  1357 12567       0          0 148.2  800              8192  9.958    .4562 20.61 2.618   
   blended_skill_talk 93.85                         0          0        175                                   18.33  2.34   
   external           79.25                         0          0        221                                   16.83 2.655   
   internal           82.73                         0          0        353                                   13.98 2.243   
   otter              92.14                         0          0         51                                   33.31 3.231   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 270.7  2507 .004902      .1324

22:10:48 | time:617s total_exs:59488 total_steps:3718 epochs:2.08
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                87.43     1  1392 10376       0          0 119.3  800              8192  10.04    .5031 22.48 2.629   
   blended_skill_talk 97.06                         0          0        179                                   18.64 2.307   
   external           87.49                         0          0        235                                   14.82 2.573   
   internal            81.8                         0          0        330                                   13.72 2.217   
   otter              83.38                         0          0         56                                   42.75 3.421   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 273.2  2036  .01786      .3616

22:11:24 | time:653s total_exs:64224 total_steps:4014 epochs:2.25
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.76     1  1384 10917       0          0 126.2  736              8192  10.31    .4624 18.56 2.568   
   blended_skill_talk 87.26                         0          0        180                                   16.38 2.347   
   external           78.68                         0          0        194                                   15.47 2.523   
   internal            90.2                         0          0        311                                    13.8 2.137   
   otter               90.9                         0          0         51                                   28.59 3.264   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps  ups  
   all                1e-06 253.8  2002 .004902      .1225 

22:12:11 | time:700s total_exs:68224 total_steps:4264 epochs:2.39
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 91.2     1  1473 13228       0          0 143.7  800              8192  9.755    .4667 22.75  2.58   
   blended_skill_talk 94.78                         0          0        239                                   17.35 2.333   
   external           88.74                         0          0        188                                   14.87 2.401   
   internal           92.63                         0          0        311                                   14.52 2.164   
   otter              88.66                         0          0         62                                   44.24 3.425   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 281.7  2530  .02419      .4677

22:12:47 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter,blended_skill_talk-6,3/model.checkpoint
22:12:48 | new best ppl: 12.88 (previous best was 12.95)
22:12:48 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter,blended_skill_talk-6,3/model
22:12:49 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external,otter,blended_skill_talk-6,3/model.checkpoint
22:12:57 | time:746s total_exs:72160 total_steps:4510 epochs:2.53
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                90.17     1  1430 10392 .003788     .07197 116.3  800              8192  9.764    .5350 22.85  2.61   
   blended_skill_talk 89.21                         0          0        213                                   18.87 2.358   
   external           93.64                         0  

22:13:35 | time:783s total_exs:76960 total_steps:4810 epochs:2.70
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                89.95     1  1415 10066 .004757      .7794 113.8  800              8192  9.965    .4711 18.67 2.574   
   blended_skill_talk 86.22                   .004762      .8381        210                                    17.9 2.338   
   external           98.25                     .0113       2.22        177                                   15.05 2.484   
   internal           86.25                   .002967     .05935        337                                   13.99 2.155   
   otter              89.08                         0          0         76                                   27.76 3.317   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 264.2  1880 .003289      .1184

22:14:20 | time:829s total_exs:80896 total_steps:5056 epochs:2.83
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                80.22     1  1294 11840       0          0 146.4  800              5980    inf    .4608 21.51 2.497   
   blended_skill_talk 87.51                         0          0        154                                   18.12 2.258   
   external           75.52                         0          0        220                                   14.54 2.346   
   internal           82.48                         0          0        354                                   14.49 2.105   
   otter              75.38                         0          0         72                                   38.92  3.28   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 275.4  2519  .02431      .5139



22:14:27 | time:836s total_exs:81696 total_steps:5106 epochs:2.86
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                83.75     1  1440 10513 .005946      .6407 116.8  800              4096  10.13    .5350 21.98 2.556   
   blended_skill_talk 87.15                    .01036      1.321        193                                   18.92 2.312   
   external           80.52                     .0107      .6310        187                                   14.28 2.363   
   internal           101.4                   .002725      .6104        367                                   13.89 2.115   
   otter              65.91                         0          0         53                                   40.85 3.434   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 271.4  1982  .01415     .05189

22:14:57 | running eval: valid
22:15:09 | eval completed in 11.94s
22:15:09 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02044 53.69 635.2 648.9       0          0 14.41  171 .2338    .2006 14.68 2.505 1e-06   190 194.1   
   external         0  .02308 62.36                   0          0         44 .2008          15.89 2.786                     
   internal         0   .0178 45.02                   0          0        127 .2669          13.46 2.225                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 12.73    .2368      .4685         0                 5352 825.2  843  
   external       0          0 16.21    .1961      .4335         0                                  
   internal       0          0 9.255    .2775      .5035         0
22:15:09 | saving model checkpoi

22:15:49 | time:918s total_exs:90432 total_steps:5652 epochs:3.17
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                89.59     1  1368 13436       0          0 157.2  800              4096  9.983    .4608 19.23 2.527   
   blended_skill_talk 89.99                         0          0        173                                   18.67 2.407   
   external           86.68                         0          0        209                                   14.97 2.298   
   internal           80.21                         0          0        363                                    14.8 2.076   
   otter              101.5                         0          0         55                                   28.49 3.325   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 264.5  2599  .01364      .3227

22:16:35 | time:964s total_exs:94368 total_steps:5898 epochs:3.31
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                97.72     1  1537 13544 .003289      .3443   141  800              4096  11.43    .5270 19.69 2.541   
   blended_skill_talk 114.4                    .01316      1.377        228                                   19.15 2.315   
   external           92.81                         0          0        220                                   14.44 2.427   
   internal            84.5                         0          0        289                                   13.66  2.16   
   otter              99.16                         0          0         63                                   31.51 3.264   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 269.3  2373 .007937     .03175

22:17:13 | time:1002s total_exs:99168 total_steps:6198 epochs:3.47
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.15     1  1376 12149       0          0 141.2  800              4096   10.3    .4387 19.54 2.508   
   blended_skill_talk 90.82                         0          0        197                                   17.86 2.336   
   external           83.35                         0          0        200                                    14.3 2.218   
   internal            84.9                         0          0        342                                   13.83 2.109   
   otter              85.51                         0          0         61                                   32.16 3.368   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 259.6  2292 .008197      .364

22:18:00 | time:1049s total_exs:103104 total_steps:6444 epochs:3.61
                       clen  clip  ctpb  ctps   ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.33     1  1402 10532 .0008389     .04614 120.2  800              4096  10.05    .5350 20.98 2.497   
   blended_skill_talk 82.91                          0          0        203                                   18.36 2.299   
   external           86.44                          0          0        251                                   14.56 2.242   
   internal           92.69                    .003356      .1846        298                                   14.07 2.045   
   otter              83.27                          0          0         48                                   36.94 3.402   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 264.5  1987  .02083   

22:18:30 | running eval: valid
22:18:42 | eval completed in 11.90s
22:18:42 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02323 53.69 635.2 651.8       0          0 14.48  171 .2339    .2006 14.68 2.496 1e-06   190   195   
   external         0  .02808 62.36                   0          0         44 .2091          15.89 2.777                     
   internal         0  .01837 45.02                   0          0        127 .2587          13.46 2.216                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.62    .2389      .4733         0                 6690 825.2 846.8  
   external       0          0 16.07    .2061      .4378         0                                   
   internal       0          0 9.169    .2717      .5088         0
22:18:42 | saving model check

22:19:24 | time:1133s total_exs:111840 total_steps:6990 epochs:3.92
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 88.1     1  1400 12058       0          0 137.8  800              4096  9.953    .4608 20.54 2.513   
   blended_skill_talk 97.95                         0          0        194                                   18.69  2.43   
   external           88.07                         0          0        215                                   15.73 2.192   
   internal           81.55                         0          0        332                                   14.75 2.096   
   otter              84.81                         0          0         59                                      33 3.335   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 275.8  2375 .008475      .27

22:20:11 | time:1180s total_exs:115776 total_steps:7236 epochs:4.06
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.59     1  1383 10179 .002249     .04761 117.8  800              8192  10.13    .5350 23.36 2.437   
   blended_skill_talk 93.82                   .006211      .1180        161                                   19.14 2.331   
   external           86.45                         0          0        214                                   15.27 2.147   
   internal           83.99                   .002786     .07242        359                                   15.05 2.017   
   otter              82.09                         0          0         66                                   43.98 3.253   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06   287  2113  .03409      1.1

22:20:49 | time:1218s total_exs:120576 total_steps:7536 epochs:4.22
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                96.48     1  1493 10639 .003211      .1841   114  800              8192  10.04    .5350 20.84 2.395   
   blended_skill_talk 98.44                    .00463      .2639        216                                      18 2.227   
   external           92.14                   .005155      .3041        194                                   14.25 2.076   
   internal            88.6                   .003058      .1682        327                                   14.83 2.042   
   otter              106.7                         0          0         63                                    36.3 3.233   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 271.9  1937  .01984      .76

22:21:36 | time:1265s total_exs:124512 total_steps:7782 epochs:4.36
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                88.15     1  1406 12040 .001337     .07888   137  800              8192  10.32    .5350 21.92 2.463   
   blended_skill_talk 94.63                         0          0        197                                   18.05  2.41   
   external           83.06                   .005348      .3155        187                                   15.07 2.222   
   internal           86.76                         0          0        343                                   14.81 2.071   
   otter              88.14                         0          0         73                                   39.74  3.15   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 284.2  2433  .02055      .49

22:22:07 | running eval: valid
22:22:19 | eval completed in 12.08s
22:22:19 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01815 53.69 635.2 641.1       0          0 14.24  171 .2248    .2005 14.68 2.487 1e-06   190 191.7   
   external         0  .02172 62.36                   0          0         44 .1953          15.89 2.768                     
   internal         0  .01457 45.02                   0          0        127 .2543          13.46 2.206                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  12.5    .2299      .4756         0                 8028 825.2 832.8  
   external       0          0 15.93    .1930      .4378         0                                   
   internal       0          0 9.077    .2668      .5135         0
22:22:19 | saving model check

22:23:01 | time:1350s total_exs:133248 total_steps:8328 epochs:4.67
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                90.81     1  1452 10642 .002404      .3065 117.3  800              8192  10.04    .5270 20.38 2.392   
   blended_skill_talk 93.74                         0          0        199                                   18.23 2.302   
   external           88.22                   .009615      1.226        208                                   15.24 2.165   
   internal           91.45                         0          0        328                                   14.34 1.959   
   otter              89.83                         0          0         65                                   33.72 3.141   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 273.7  2006 .003846     .023

22:23:47 | time:1396s total_exs:137184 total_steps:8574 epochs:4.81
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                81.22     1  1299 11713 .001263     .07449 144.3  800              8192  10.03    .5350 20.57 2.408   
   blended_skill_talk 82.69                         0          0        175                                   17.67 2.319   
   external           83.22                   .005051      .2980        198                                   14.58 2.168   
   internal           79.84                         0          0        378                                   14.63 1.981   
   otter              79.12                         0          0         49                                   35.39 3.164   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 263.6  2377   .0102      .32

22:24:24 | time:1433s total_exs:141984 total_steps:8874 epochs:4.97
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                85.35     1  1330 12326  .00431      .2457 148.3  800              8192   10.4    .4414  17.3 2.382   
   blended_skill_talk 81.15                         0          0        190                                   16.76 2.355   
   external           89.64                         0          0        201                                   14.43  2.04   
   internal           79.21                         0          0        351                                   13.67 1.905   
   otter               91.4                    .01724      .9828         58                                   24.33 3.227   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 245.8  2279       0         

22:25:11 | time:1479s total_exs:145920 total_steps:9120 epochs:5.11
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                87.54     1  1377 10408       0          0   121  800             16384   10.3    .4199 18.09  2.39   
   blended_skill_talk 90.34                         0          0        198                                   18.03 2.378   
   external           87.68                         0          0        182                                   15.36 2.087   
   internal           82.15                         0          0        358                                    14.5 1.965   
   otter              90.02                         0          0         62                                   24.48  3.13   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 261.5  1977       0         

22:25:42 | running eval: valid
22:25:54 | eval completed in 12.15s
22:25:54 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02133 53.69 635.2 640.4       0          0 14.23  171 .2371    .2005 14.68 2.481 1e-06   190 191.6   
   external         0  .02897 62.36                   0          0         44 .2105          15.89 2.763                     
   internal         0  .01369 45.02                   0          0        127 .2637          13.46   2.2                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 12.43    .2427      .4770         0                 9366 825.2  832  
   external       0          0 15.85    .2110      .4435         0                                  
   internal       0          0 9.022    .2744      .5105         0
22:25:54 | saving model checkpoi

22:26:35 | time:1563s total_exs:154656 total_steps:9666 epochs:5.42
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 93.6     1  1463 12647 .001524      .1944 138.3  800             16384  10.05    .5270 19.88 2.326   
   blended_skill_talk 95.06                         0          0        229                                   18.11 2.357   
   external           92.76                         0          0        180                                   14.54 1.934   
   internal           87.44                   .006098      .7774        328                                   14.59  1.95   
   otter              99.13                         0          0         63                                   32.29 3.061   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 270.3  2337  .01587      .27

22:27:23 | time:1612s total_exs:158592 total_steps:9912 epochs:5.56
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                91.76     1  1396  9685 .006636      .4575   111  800             16384  9.781    .5350 23.53 2.391   
   blended_skill_talk 89.89                   .006024      .1145        166                                   19.17 2.386   
   external           83.44                         0          0        186                                   14.25 2.037   
   internal           84.81                   .008174      1.469        367                                   13.82 1.888   
   otter              108.9                    .01235      .2469         81                                   46.86 3.251   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 288.5  2001  .02469      .84

22:28:02 | time:1651s total_exs:163392 total_steps:10212 epochs:5.73
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                81.76     1  1256  9765 .004545      .3955 124.4  800             16384  10.08    .4345 19.64 2.341   
   blended_skill_talk 75.56                         0          0        140                                   16.74  2.31   
   external           78.77                         0          0        238                                   15.27 2.057   
   internal           77.08                         0          0        367                                   14.21 1.969   
   otter              95.62                    .01818      1.582         55                                   32.33 3.029   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 258.9  2013 .004545      .1



22:28:36 | time:1685s total_exs:165728 total_steps:10358 epochs:5.81
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                89.28     1  1388 12421 .006596      .5507 143.1  800              8192  10.22    .5349 20.64 2.362   
   blended_skill_talk 92.09                         0          0        199                                   16.86 2.326   
   external           85.24                   .009434      1.203        212                                   14.81  2.05   
   internal           83.88                         0          0        330                                    14.8 1.893   
   otter               95.9                    .01695          1         59                                   36.08 3.178   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 267.5  2393  .02542      .5

22:29:14 | time:1723s total_exs:170528 total_steps:10658 epochs:5.98
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.61     1  1387 12749 .002024      .2581   147  800              8192  10.76    .5270 18.85 2.355   
   blended_skill_talk 97.62                         0          0        191                                   17.04 2.414   
   external           91.17                   .008097      1.032        247                                   14.99 2.026   
   internal           78.51                         0          0        315                                   13.74 1.805   
   otter              79.15                         0          0         47                                   29.64 3.177   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 253.4  2329  .01064     .04

22:30:02 | time:1770s total_exs:174464 total_steps:10904 epochs:6.11
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                94.51     1  1446 11239 .006694      1.275 124.4  800              8192  10.25    .4878 20.98 2.411   
   blended_skill_talk 89.05                         0          0        215                                   18.58 2.512   
   external           91.83                         0          0        206                                   15.98  2.06   
   internal           89.54                   .002967      .9080        337                                    14.8 1.887   
   otter              107.6                    .02381       4.19         42                                   34.57 3.186   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 274.2  2132   .0119     .08

22:30:32 | running eval: valid
22:30:44 | eval completed in 12.00s
22:30:44 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01677 53.69 635.2 642.7       0          0 14.28  171 .2390    .2004 14.68  2.48 1e-06   190 192.2   
   external         0  .01557 62.36                   0          0         44 .2053          15.89 2.765                     
   internal         0  .01798 45.02                   0          0        127 .2726          13.46 2.195                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 12.43    .2439      .4765         0                11150 825.2  835  
   external       0          0 15.88    .2057      .4406         0                                  
   internal       0          0 8.981    .2821      .5123         0
22:30:44 | saving model checkpoi

22:31:24 | time:1853s total_exs:183200 total_steps:11450 epochs:6.42
                       clen  clip  ctpb  ctps   ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                91.16     1  1420 12838 .0007396     .01405 144.7  800              8192  10.11    .5349 21.67 2.374   
   blended_skill_talk 93.39                          0          0        217                                   18.45 2.343   
   external           90.89                          0          0        196                                    15.7 2.059   
   internal           83.36                    .002959     .05621        338                                   13.89 1.837   
   otter              97.02                          0          0         49                                   38.65 3.254   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06   270  2441  .02041  

22:32:08 | time:1897s total_exs:187136 total_steps:11696 epochs:6.56
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                87.83     1  1348 12128 .002195      .1251 143.9  800              8192  10.48    .4562 18.63 2.317   
   blended_skill_talk    86                   .005848      .3333        171                                   18.95 2.384   
   external           78.82                         0          0        214                                   13.99  1.92   
   internal           82.98                   .002933      .1672        341                                   14.11 1.881   
   otter              103.5                         0          0         74                                   27.47 3.083   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 261.1  2348 .003378     .08

22:32:45 | time:1934s total_exs:191936 total_steps:11996 epochs:6.73
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                85.41     1  1326 12102       0          0 146.1  800              8192  10.17    .4430 21.07 2.312   
   blended_skill_talk 91.26                         0          0        172                                   18.72  2.34   
   external            84.4                         0          0        208                                   15.65 1.889   
   internal           77.07                         0          0        364                                   14.34  1.86   
   otter              88.93                         0          0         56                                   35.57 3.161   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps  ups  
   all                1e-06 272.6  2488 .008929      .25

22:33:31 | time:1980s total_exs:195872 total_steps:12242 epochs:6.86
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                84.58     1  1316  9949       0          0   121  800              8192  10.36    .4104 19.63  2.33   
   blended_skill_talk 86.79                         0          0        154                                   18.94 2.375   
   external           85.81                         0          0        237                                      15 1.896   
   internal           76.59                         0          0        349                                   14.71 1.969   
   otter              89.12                         0          0         60                                   29.87 3.079   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1e-06 266.8  2018 .008333      .2

22:34:02 | running eval: valid
22:34:14 | eval completed in 12.04s
22:34:14 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01937 53.69 635.2 642.8       0          0 14.28  171 .2383    .2006 14.68 2.479 1e-06   190 192.3   
   external         0  .02365 62.36                   0          0         44 .2117          15.89 2.764                     
   internal         0   .0151 45.02                   0          0        127 .2650          13.46 2.193                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.42    .2431      .4765         0                12488 825.2 835.1  
   external       0          0 15.87    .2115      .4406         0                                   
   internal       0          0 8.966    .2746      .5123         0
22:34:14 | saving model check

22:34:52 | time:2061s total_exs:204608 total_steps:12788 epochs:7.17
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                77.43     1  1271 13049       0          0 164.3  800             16384  10.12    .4087 19.07 2.258   
   blended_skill_talk 79.93                         0          0        170                                   16.72 2.314   
   external           76.58                         0          0        218                                   14.97 1.879   
   internal           81.98                         0          0        364                                   14.49 1.797   
   otter              71.21                         0          0         48                                    30.1 3.041   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                5e-07   256  2628  .01042      .1

22:35:38 | time:2106s total_exs:208544 total_steps:13034 epochs:7.31
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                90.01     1  1352 10619 .004464      .2545 125.7  800             16384  10.27    .4878 21.72 2.356   
   blended_skill_talk 78.04                         0          0        195                                   16.63  2.34   
   external           92.13                         0          0        222                                   14.75 1.868   
   internal           78.75                         0          0        327                                   14.26 1.831   
   otter              111.1                    .01786      1.018         56                                   41.23 3.384   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                5e-07 267.7  2103  .04018      .4

22:36:15 | time:2144s total_exs:213344 total_steps:13334 epochs:7.48
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                87.29     1  1349  9865       0          0   117  800             16384  10.13    .3744 19.63 2.301   
   blended_skill_talk  82.2                         0          0        197                                   16.92  2.43   
   external           83.66                         0          0        227                                   16.07 1.868   
   internal           83.57                         0          0        324                                   14.23 1.839   
   otter              99.75                         0          0         52                                   31.29 3.068   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                5e-07 263.4  1926 .009615      .2

22:37:00 | time:2189s total_exs:217280 total_steps:13580 epochs:7.61
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                84.79     1  1339 12868       0          0 153.8  800             16384  10.24    .5091 21.87 2.292   
   blended_skill_talk 95.88                         0          0        163                                   15.69 2.453   
   external           81.47                         0          0        243                                    14.3 1.754   
   internal           79.52                         0          0        329                                   13.68 1.751   
   otter              82.29                         0          0         65                                   43.78 3.212   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                5e-07 265.1  2548  .01923      .4



22:37:12 | time:2201s total_exs:218880 total_steps:13680 epochs:7.67
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                80.67     1  1306 11436 .001397      .1781 140.1  800              8192  9.957    .5270 21.01 2.295   
   blended_skill_talk 82.73                         0          0        158                                   16.93  2.38   
   external           75.76                         0          0        227                                   14.94  1.78   
   internal           86.15                   .005587      .7123        358                                   14.09 1.727   
   otter              78.04                         0          0         57                                   38.09 3.292   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                5e-07   265  2320 .008772      .1

22:37:59 | time:2247s total_exs:222816 total_steps:13926 epochs:7.81
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                89.46     1  1386 10476       0          0 120.9  800              8192   10.2    .4016 19.83 2.293   
   blended_skill_talk  86.5                         0          0        209                                   17.39 2.421   
   external           87.02                         0          0        199                                   14.33 1.727   
   internal           84.14                         0          0        334                                    14.2 1.786   
   otter              100.2                         0          0         58                                   33.41 3.238   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                5e-07 262.6  1985 .008621      .1

22:38:35 | time:2284s total_exs:227616 total_steps:14226 epochs:7.98
                       clen  clip  ctpb  ctps   ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 83.7     1  1318 10009 .0007246     .06304 121.5  800              8192  10.25    .4414 18.27 2.303   
   blended_skill_talk 86.99                          0          0        177                                    17.1 2.366   
   external            83.2                          0          0        221                                   14.96 1.839   
   internal           79.19                    .002899      .2522        345                                   13.99  1.86   
   otter               85.4                          0          0         57                                   27.02 3.148   
                         lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                5e-07   254  1929       0  

22:39:21 | time:2329s total_exs:231552 total_steps:14472 epochs:8.11
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 91.1     1  1434 12884 .001289     .02448 143.8  800              8192  10.03    .5350 22.48 2.271   
   blended_skill_talk 106.2                   .005155     .09794        194                                   18.11 2.297   
   external           86.79                         0          0        212                                   15.86 1.849   
   internal           81.86                         0          0        336                                    13.9 1.798   
   otter              89.55                         0          0         58                                   42.03 3.139   
                           lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                2.5e-07 277.2  2491  .01293    

22:39:52 | running eval: valid
22:40:04 | eval completed in 12.15s
22:40:04 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  ltps  \
   all              0  .01838 53.69 635.2 636.5       0          0 14.14  171 .2306    .2006 14.68 2.481 2.5e-07   190 190.4   
   external         0  .02365 62.36                   0          0         44 .1944          15.89 2.768                       
   internal         0  .01311 45.02                   0          0        127 .2668          13.46 2.193                       
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.44    .2362      .4759         0                14718 825.2 826.9  
   external       0          0 15.93    .1950      .4378         0                                   
   internal       0          0 8.962    .2773      .5140         0
22:40:04 | saving mod

22:40:43 | time:2412s total_exs:240288 total_steps:15018 epochs:8.42
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                   89     1  1371 10556       0          0 123.2  800              8192  10.11    .4608 19.04 2.267   
   blended_skill_talk 79.86                         0          0        203                                    18.8 2.355   
   external           80.57                         0          0        203                                   15.13 1.827   
   internal           89.02                         0          0        341                                   14.29 1.836   
   otter              106.5                         0          0         53                                   27.92 3.048   
                           lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                2.5e-07 264.4  2035  .01415    

22:41:27 | time:2456s total_exs:244224 total_steps:15264 epochs:8.56
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                82.65     1  1287 11670       0          0   145  800              8192  10.09    .4223 20.71 2.277   
   blended_skill_talk  85.3                         0          0        161                                   15.68 2.372   
   external           75.75                         0          0        200                                   15.32 1.779   
   internal           79.05                         0          0        371                                   14.21 1.809   
   otter               90.5                         0          0         68                                   37.62 3.149   
                           lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                2.5e-07 266.1  2412  .01103    

22:42:03 | time:2492s total_exs:249024 total_steps:15564 epochs:8.73
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                 87.1     1  1369 12593 .001282      .1115 147.2  800              8192  10.25    .4878 20.11 2.276   
   blended_skill_talk 88.31                   .005128      .4462        195                                   17.47 2.455   
   external           80.68                         0          0        223                                   14.16  1.74   
   internal           86.22                         0          0        326                                   14.82  1.83   
   otter              93.18                         0          0         56                                      34 3.077   
                           lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps  ups  
   all                2.5e-07 264.8  2436 .008929     

22:42:49 | time:2538s total_exs:252960 total_steps:15810 epochs:8.86
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                92.59     1  1418 10923 .001366      .1189 123.3  800             16384  10.15    .4639 20.07 2.299   
   blended_skill_talk 104.4                         0          0        215                                   17.82 2.393   
   external           88.16                   .005464      .4754        183                                   14.58 1.808   
   internal           77.59                         0          0        349                                   14.36 1.782   
   otter              100.2                         0          0         53                                   33.49 3.215   
                           lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                2.5e-07 264.6  2038  .01887    

22:43:19 | running eval: valid
22:43:32 | eval completed in 12.28s
22:43:32 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  ltps  \
   all              0  .01893 53.69 635.2 626.3       0          0 13.91  171 .2283    .2006 14.68 2.482 2.5e-07   190 187.3   
   external         0  .02364 62.36                   0          0         44 .1915          15.89 2.771                       
   internal         0  .01422 45.02                   0          0        127 .2651          13.46 2.193                       
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.47    .2327      .4755         0                16056 825.2 813.7  
   external       0          0 15.97    .1922      .4363         0                                   
   internal       0          0 8.966    .2732      .5146         0
22:43:32 | saving mod

22:44:11 | time:2620s total_exs:261696 total_steps:16356 epochs:9.17
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                86.49     1  1375 12326  .00205      .1784 143.4  800             15073    inf    .4608 22.76 2.279   
   blended_skill_talk 93.74                   .005319      .4628        188                                   18.17 2.396   
   external           82.21                         0          0        220                                   15.23  1.82   
   internal           84.73                   .002882      .2507        347                                   14.19 1.737   
   otter              85.27                         0          0         45                                   43.44 3.162   
                            lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1.25e-07 271.1  2429  .01667  



22:44:18 | time:2627s total_exs:262496 total_steps:16406 epochs:9.20
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                  102     1  1449 10682 .001244     .03234 117.9  800              8192  9.966    .4878 19.39 2.346   
   blended_skill_talk 99.19                   .004975      .1294        201                                   20.33 2.525   
   external           87.48                         0          0        226                                   14.65 1.828   
   internal           81.05                         0          0        330                                   14.18 1.732   
   otter              140.3                         0          0         43                                    28.4   3.3   
                            lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1.25e-07 265.9  1960       0  

22:45:03 | time:2672s total_exs:266432 total_steps:16652 epochs:9.34
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                94.24     1  1497 13060  .00431      .6024 139.6  800              8192  10.37    .4711 19.37 2.272   
   blended_skill_talk 107.7                    .01724      2.409        232                                   18.54 2.413   
   external           90.21                         0          0        196                                   14.92 1.877   
   internal           87.11                         0          0        300                                   14.27  1.75   
   otter              91.96                         0          0         72                                   29.75 3.049   
                            lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1.25e-07 272.1  2374 .006944  

22:45:41 | time:2710s total_exs:271168 total_steps:16948 epochs:9.50
                       clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                87.82     1  1348 11898 .006232      .5052 141.2  736              8192  10.15    .4624 18.85  2.25   
   blended_skill_talk 94.95                         0          0        154                                   16.94 2.431   
   external           81.82                   .004926      .2808        203                                   14.35 1.832   
   internal           79.62                         0          0        329                                   14.17 1.706   
   otter              94.88                     .0200       1.74         50                                   29.96  3.03   
                            lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1.25e-07 253.9  2241       0  



22:46:21 | time:2749s total_exs:274368 total_steps:17148 epochs:9.61
                       clen  clip  ctpb  ctps   ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss  \
   all                97.27     1  1482 11101 .0008361     .01589 119.9  800              4096  10.01    .5349 23.97  2.33   
   blended_skill_talk 98.83                          0          0        263                                    17.9 2.372   
   external           88.78                          0          0        196                                   16.07  1.84   
   internal           86.57                    .003344     .06355        299                                   14.68 1.841   
   otter              114.9                          0          0         42                                   47.24 3.268   
                            lr  ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all                1.25e-07 281.6  2110  .0

22:46:51 | running eval: valid
22:47:03 | eval completed in 12.11s
22:47:03 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  ltps  \
   all              0  .02029 53.69 635.2   638       0          0 14.17  171 .2315    .2003 14.68 2.482 1.25e-07   190 190.8   
   external         0  .02364 62.36                   0          0         44 .1943          15.89 2.771                        
   internal         0  .01695 45.02                   0          0        127 .2687          13.46 2.194                        
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 12.47    .2363      .4748         0                17394 825.2 828.8  
   external       0          0 15.97    .1950      .4349         0                                   
   internal       0          0 8.969    .2776      .5146         0
22:47:04 | saving

internal/exs/train,301.0
exs/train,736.0
internal/clen/train,84.81063
internal/ctrunc/train,0.0
internal/ctrunclen/train,0.0
internal/llen/train,14.06645
internal/ltrunc/train,0.0
internal/ltrunclen/train,0.0
internal/loss/train,1.7448
internal/ppl/train,5.72477


internal/exs/train,▃▅▆▆▁▄▄▅▃█▆▇▂▃▃▃▅▄▆▄▃▅▃▆█▅▄▄▄▆▇▇▃▅▆▆▄▄▂▆
exs/train,████▁███▁███▁███▁███▁███████████████████
internal/clen/train,▄▂▄▃▇▄▃▆▃▂▄▅▃▄▆▄▆▅▃▆▄▄▆▆▂▁▂▄▂▁▁▄▃▃▂▅█▅▅▄
internal/ctrunc/train,▁▁▅▃▄▁█▁▆▁▁▁▁▁▁▁▃▃▁▁▁▁▁▁▁▁▁▃▃▁▃▅▃▁▃▃▁▁▁▁
internal/ctrunclen/train,▁▁▆▄▁▁█▁▄▁▁▁▁▁▁▁▄▂▁▁▁▁▁▁▁▁▁▂▂▁▁▄▁▁▁▂▁▁▁▁
internal/llen/train,▆▄▆▅▅▂▇▆▁▃▃▅▃▆▅▃▅▇▆▅▅▄▆▃▇▅▄▄▄▅▇▄▇▃▃▄█▄▅█
internal/ltrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ltrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/loss/train,█▇▇▆▆▆▆▆▅▄▄▄▄▄▄▄▄▄▃▃▃▃▃▂▃▂▂▂▂▂▂▁▂▂▂▂▂▁▁▂
internal/ppl/train,█▇▆▆▆▅▅▅▄▄▃▄▃▃▃▃▃▃▃▂▂▂▃▂▂▂▂▂▁▂▂▁▁▁▁▁▁▁▁▂
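The `ppl` and `loss` columns in the reports above are related: for each task, perplexity is the exponential of the loss. The `all` row, on the other hand, looks like the plain average of the per-task rows rather than `exp` of the averaged loss. The sketch below checks both against the numbers logged in the validation report at 22:40:04 (`total_train_updates` 14718). The averaging interpretation is an inference from those numbers, not something the log states.

```python
import math

# Values copied from the validation report at 22:40:04 (total_train_updates 14718).
loss = {"external": 2.768, "internal": 2.193}
ppl = {"external": 15.93, "internal": 8.962}
all_loss, all_ppl = 2.481, 12.44

# Per task: perplexity is exp(loss).
for task in loss:
    recomputed = math.exp(loss[task])
    print(f"{task}: exp({loss[task]}) = {recomputed:.3f}, logged ppl = {ppl[task]}")

# "all" row: compare against the unweighted mean of the per-task rows.
mean_loss = sum(loss.values()) / len(loss)
mean_ppl = sum(ppl.values()) / len(ppl)
print(f"mean loss = {mean_loss:.4f}, logged all loss = {all_loss}")
print(f"mean ppl  = {mean_ppl:.3f}, logged all ppl  = {all_ppl}")
print(f"exp(all loss) = {math.exp(all_loss):.3f}, which does not match the logged all ppl")
```

When comparing runs, this means the `all` perplexity weighs the external and internal tasks equally, even though the validation sets have 44 and 127 examples respectively.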


# Mutator runs below

Mutators modify the examples of a dataset before the model sees them. The runs below add mutators to the Blender base config.

## 5. Blender base + word_shuffle

### Datasets with sampling weights:
- internal: 6
- external: 3
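
As a rough illustration of what these weights mean, the sketch below normalises them into sampling probabilities and draws tasks accordingly. It assumes the weights act as relative probabilities of drawing each task, which is how ParlAI documents `multitask_weights`; the sampling loop itself is only a stand-in, not ParlAI's code.

```python
import random
from collections import Counter

# Same strings that are passed to run_training in this section.
tasks = "internal,external".split(",")
weights = [float(w) for w in "6,3".split(",")]

# Normalise the relative weights into probabilities.
total = sum(weights)
probs = {task: w / total for task, w in zip(tasks, weights)}
print(probs)

# Stand-in for the multitask sampler: draw a task for each of 9000 examples.
rng = random.Random(0)
draws = Counter(rng.choices(tasks, weights=weights, k=9000))
print(draws)
```

With weights 6 and 3, about two thirds of the training examples should come from internal. The first training report of this run is consistent with that: 178 of 256 examples are internal.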


### Mutators
- word_shuffle
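
To make the effect of this mutator concrete, here is a minimal sketch of word shuffling on a single utterance. It only illustrates the idea and is not ParlAI's implementation; the helper `shuffle_words`, the example sentence, and the fixed seed are all hypothetical.

```python
import random


def shuffle_words(text: str, seed: int = 0) -> str:
    """Return `text` with its whitespace-separated words in random order."""
    words = text.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)


original = "how are you doing today"
shuffled = shuffle_words(original)
print(original)
print(shuffled)
```

The shuffled text keeps the same words but loses their order, so the model has to rely less on word order in the context.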


In [8]:
tasks = 'internal,external'
weights = '6,3'
mutators = 'word_shuffle'

run_training(tasks=tasks, weights=weights, mutators=mutators)

22:48:00 | building dictionary first...
22:48:00 | No model with opt yet at: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-word_shuffle/model(.opt)
22:48:00 | your model is being loaded with opts that do not exist in the model you are initializing the weights with: allow_missing_init_opts: False,download_path: None,loglevel: info,dynamic_batching: None,verbose: False,is_debug: False,datapath: /home/alex/ParlaiEmely/ParlAI/data,eval_dynamic_batching: None,num_workers: 0,max_train_steps: -1,log_every_n_steps: 50,validation_every_n_steps: -1,load_from_checkpoint: True,tensorboard_logdir: None,wandb_log: True,wandb_name: None,wandb_project: parlaiemely,wandb_entity: None,mutators: word_shuffle,n_encoder_layers: -1,n_decoder_layers: -1,model_parallel: False,beam_block_full_context: True,beam_delay: 30,beam_block_list_filename: None,temperature: 1.0,interactive_mode: False,history_reversed: False,history_add_global_end_token: None,special_tok_lst: None,bpe_vocab

22:48:01 |     topk: 10
22:48:01 |     topp: 0.9
22:48:01 |     truncate: -1
22:48:01 |     update_freq: 1
22:48:01 |     use_reply: label
22:48:01 |     validation_cutoff: 1.0
22:48:01 |     validation_every_n_epochs: 0.25
22:48:01 |     validation_every_n_secs: -1
22:48:01 |     validation_every_n_steps: -1
22:48:01 |     validation_max_exs: 20000
22:48:01 |     validation_metric: ppl
22:48:01 |     validation_metric_mode: min
22:48:01 |     validation_patience: 15
22:48:01 |     validation_share_agent: False
22:48:01 |     variant: xlm
22:48:01 |     verbose: False
22:48:01 |     wandb_entity: None
22:48:01 |     wandb_log: True
22:48:01 |     wandb_name: None
22:48:01 |     wandb_project: parlaiemely
22:48:01 |     warmup_rate: 0.0001
22:48:01 |     warmup_updates: -1
22:48:01 |     weight_decay: None
22:48:01 | Current ParlAI commit: adb7ed134aec01437ef8d4e6bf450827fd0fbdc2
22:48:01 | creating task(s): internal,external
22:48:01 | Loading ParlAI text data: /home/alex/ParlaiEmely/P

22:48:18 | training...
22:48:20 | time:19s total_exs:256 total_steps:16 epochs:0.26
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      55.25     1 896.3  8856       0          0 158.1  256             32768  12.59    .2659 14.08  3.14 1e-06 224.9   
   external 53.28                         0          0         78                                   14.15 3.358               
   internal 57.22                         0          0        178                                   14.01 2.922               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all       2222       0          0 23.65      .3658         0                   16 1121 11078 10.04  
   external             0          0 28.72      .3207         0                                        
   internal             0          0 18.57      .4110         0

22:48:20 | creating task(s): in



22:48:40 | running eval: valid
22:48:55 | eval completed in 15.25s
22:48:55 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .00631 53.69 635.2 518.2       0          0 11.51  171 .1426    .2005 14.68 3.013 1e-06   190   155   
   external         0  .00406 62.36                   0          0         44 .1217          15.89 3.242                     
   internal         0  .00856 45.02                   0          0        127 .1635          13.46 2.783                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 20.88    .1411      .3895         0                   32 825.2 673.3  
   external       0          0 25.58    .1177      .3691         0                                   
   internal       0          0 16.17    .1645      .4099         0
22:48:55 | saving model check

22:49:57 | time:116s total_exs:1536 total_steps:96 epochs:1.59
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      55.16     1 883.2  9251       0          0 167.6  256             16384  12.26    .2636 14.81 2.951 1e-06   236   
   external 54.99                         0          0         97                                   15.07 3.062               
   internal 55.33                         0          0        159                                   14.55 2.839               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all       2472       0          0 19.24      .3967         0                   96 1119 11722 10.48  
   external             0          0 21.38      .3851         0                                        
   internal             0          0  17.1      .4084         0

22:49:57 | running eval: valid
22:50:11 | eval compl



22:50:51 | running eval: valid
22:51:05 | eval completed in 13.39s
22:51:05 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .001983 53.69 635.2 598.6       0          0  13.3  171 .1757    .2005 14.68 2.904 1e-06   190   179   
   external         0 2.095e-06 62.36                   0          0         44 .1620          15.89 3.153                     
   internal         0   .003964 45.02                   0          0        127 .1893          13.46 2.655                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 18.81    .1751      .4143         0                  144 825.2 777.6  
   external       0          0 23.41    .1588      .3906         0                                   
   internal       0          0 14.22    .1914      .4380         0
22:51:05 | saving mod

22:51:59 | running eval: valid
22:52:12 | eval completed in 12.77s
22:52:12 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .002668 53.69 635.2 614.7       0          0 13.66  171 .1836    .2005 14.68 2.877 1e-06   190 183.9   
   external         0 1.018e-05 62.36                   0          0         44 .1642          15.89 3.135                     
   internal         0   .005326 45.02                   0          0        127 .2031          13.46 2.619                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 18.36    .1858      .4155         0                  208 825.2 798.6  
   external       0          0    23    .1631      .3906         0                                   
   internal       0          0 13.73    .2085      .4404         0
22:52:12 | saving mod

22:53:07 | running eval: valid
22:53:19 | eval completed in 12.44s
22:53:19 | valid:
             accuracy   bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .002825 53.69 635.2 643.3       0          0 14.29  171 .1781    .2004 14.68 2.859 1e-06   190 192.4   
   external         0 6.56e-06 62.36                   0          0         44 .1551          15.89 3.126                     
   internal         0  .005643 45.02                   0          0        127 .2012          13.46 2.592                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 18.07    .1782      .4120         0                  272 825.2 835.7  
   external       0          0 22.78    .1547      .3877         0                                   
   internal       0          0 13.36    .2017      .4363         0
22:53:19 | saving model c

22:54:11 | running eval: valid
22:54:23 | eval completed in 12.48s
22:54:23 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .002491 53.69 635.2 632.6       0          0 14.05  171 .1762    .2003 14.68 2.827 1e-06   190 189.2   
   external         0 2.382e-06 62.36                   0          0         44 .1485          15.89 3.076                     
   internal         0   .004979 45.02                   0          0        127 .2039          13.46 2.579                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 17.42    .1749      .4193         0                  336 825.2 821.8  
   external       0          0 21.66    .1490      .3977         0                                   
   internal       0          0 13.18    .2009      .4409         0
22:54:23 | saving mod

22:55:17 | running eval: valid
22:55:30 | eval completed in 12.32s
22:55:30 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .005871 53.69 635.2 641.6       0          0 14.25  171 .1816    .2005 14.68 2.805 1e-06   190 191.9   
   external         0 4.116e-06 62.36                   0          0         44 .1477          15.89 3.058                     
   internal         0    .01174 45.02                   0          0        127 .2156          13.46 2.552                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 17.06    .1777      .4194         0                  400 825.2 833.6  
   external       0          0 21.29    .1423      .3863         0                                   
   internal       0          0 12.83    .2131      .4526         0
22:55:30 | saving mod

22:56:20 | running eval: valid
22:56:32 | eval completed in 12.00s
22:56:32 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .003071 53.69 635.2 657.8       0          0 14.61  171 .1880    .2005 14.68 2.795 1e-06   190 196.7   
   external         0 .001314 62.36                   0          0         44 .1595          15.89 3.069                     
   internal         0 .004828 45.02                   0          0        127 .2165          13.46 2.521                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.98    .1846      .4284         0                  464 825.2 854.5  
   external       0          0 21.53    .1540      .4006         0                                   
   internal       0          0 12.44    .2151      .4561         0
22:56:32 | saving model check

22:57:24 | running eval: valid
22:57:36 | eval completed in 11.71s
22:57:36 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .004731 53.69 635.2 678.2       0          0 15.06  171 .1807    .2004 14.68 2.784 1e-06   190 202.8   
   external         0 5.003e-06 62.36                   0          0         44 .1429          15.89  3.05                     
   internal         0   .009458 45.02                   0          0        127 .2185          13.46 2.519                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 16.76    .1785      .4220         0                  528 825.2  881  
   external       0          0 21.11    .1400      .3920         0                                  
   internal       0          0 12.42    .2171      .4520         0
22:57:36 | saving model 

22:58:24 | running eval: valid
22:58:36 | eval completed in 11.69s
22:58:36 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .003629 53.69 635.2 670.7       0          0  14.9  171 .1838    .2005 14.68  2.79 5e-07   190 200.6   
   external         0 1.046e-05 62.36                   0          0         44 .1379          15.89 3.064                     
   internal         0   .007247 45.02                   0          0        127 .2298          13.46 2.515                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.89    .1771      .4259         0                  592 825.2 871.3  
   external       0          0 21.42    .1313      .4020         0                                   
   internal       0          0 12.37    .2228      .4497         0
22:58:36 | saving mod

22:59:27 | running eval: valid
22:59:39 | eval completed in 11.58s
22:59:39 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .005087 53.69 635.2 678.5       0          0 15.07  171 .1800    .2005 14.68 2.775 5e-07   190   203   
   external         0 1.565e-05 62.36                   0          0         44 .1373          15.89 3.036                     
   internal         0    .01016 45.02                   0          0        127 .2227          13.46 2.513                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.58    .1758      .4246         0                  656 825.2 881.5  
   external       0          0 20.82    .1376      .3977         0                                   
   internal       0          0 12.34    .2141      .4515         0
22:59:39 | saving mod

23:00:28 | running eval: valid
23:00:40 | eval completed in 11.82s
23:00:40 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  \
   all              0   .003545 53.69 635.2 665.6       0          0 14.78  171 .1819    .2003 14.68  2.76 2.5e-07   190   
   external         0 9.659e-06 62.36                   0          0         44 .1502          15.89 3.024                 
   internal         0    .00708 45.02                   0          0        127 .2136          13.46 2.495                 
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      199.1       0          0 16.35    .1801      .4306         0                  720 825.2 864.6  
   external             0          0 20.58    .1461      .3963         0                                   
   internal             0          0 12.12    .2140      .4649         0
23:00:40 | sa

23:01:30 | running eval: valid
23:01:42 | eval completed in 11.82s
23:01:42 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  ltps  \
   all              0 .008636 53.69 635.2 671.3       0          0 14.91  171 .1872    .2005 14.68 2.747 2.5e-07   190 200.8   
   external         0 .008132 62.36                   0          0         44 .1657          15.89 2.999                       
   internal         0  .00914 45.02                   0          0        127 .2087          13.46 2.494                       
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.09    .1787      .4374         0                  784 825.2 872.2  
   external       0          0 20.07    .1524      .4163         0                                   
   internal       0          0  12.1    .2050      .4585         0
23:01:42 | saving mod

23:02:33 | running eval: valid
23:02:45 | eval completed in 11.98s
23:02:45 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  \
   all              0   .004347 53.69 635.2 661.4       0          0 14.69  171 .1899    .2005 14.68 2.744 2.5e-07   190   
   external         0 4.241e-06 62.36                   0          0         44 .1601          15.89 2.997                 
   internal         0   .008691 45.02                   0          0        127 .2198          13.46 2.491                 
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      197.8       0          0 16.05    .1854      .4347         0                  848 825.2 859.3  
   external             0          0 20.03    .1526      .4092         0                                   
   internal             0          0 12.07    .2181      .4602         0
23:02:45 | sa

23:03:35 | running eval: valid
23:03:47 | eval completed in 11.91s
23:03:47 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  \
   all              0    .00282 53.69 635.2 657.1       0          0  14.6  171 .1826    .2005 14.68 2.773 2.5e-07   190   
   external         0 5.001e-06 62.36                   0          0         44 .1497          15.89 3.039                 
   internal         0   .005634 45.02                   0          0        127 .2155          13.46 2.508                 
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      196.5       0          0 16.58    .1759      .4259         0                  912 825.2 853.7  
   external             0          0 20.88    .1424      .3891         0                                   
   internal             0          0 12.28    .2094      .4626         0
23:03:47 | sa

23:04:35 | running eval: valid
23:04:47 | eval completed in 11.91s
23:04:47 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  ltps  \
   all              0 .005785 53.69 635.2 672.9       0          0 14.95  171 .1854    .2004 14.68 2.755 1.25e-07   190 201.3   
   external         0 .001318 62.36                   0          0         44 .1561          15.89  3.02                        
   internal         0  .01025 45.02                   0          0        127 .2146          13.46 2.489                        
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.27    .1883      .4349         0                  976 825.2 874.2  
   external       0          0 20.49    .1564      .4020         0                                   
   internal       0          0 12.05    .2202      .4678         0
23:04:47 | saving

23:05:37 | running eval: valid
23:05:49 | eval completed in 11.68s
23:05:49 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  ltps  \
   all              0 .008198 53.69 635.2 680.5       0          0 15.12  171 .1869    .2003 14.68 2.764 6.25e-08   190 203.6   
   external         0 .004014 62.36                   0          0         44 .1602          15.89  3.05                        
   internal         0  .01238 45.02                   0          0        127 .2136          13.46 2.477                        
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.51    .1795      .4295         0                 1040 825.2 884.1  
   external       0          0 21.12    .1508      .3934         0                                   
   internal       0          0 11.91    .2081      .4655         0
23:05:49 | saving

23:06:38 | running eval: valid
23:06:49 | eval completed in 11.75s
23:06:49 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  \
   all              0   .003207 53.69 635.2 665.9       0          0 14.79  171 .1819    .2004 14.68 2.746 6.25e-08   190   
   external         0 5.012e-06 62.36                   0          0         44 .1502          15.89 3.011                  
   internal         0   .006409 45.02                   0          0        127 .2135          13.46 2.482                  
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all      199.2       0          0 16.13    .1787      .4338         0                 1104 825.2  865  
   external             0          0  20.3    .1408      .4092         0                                  
   internal             0          0 11.96    .2165      .4585         0
23:06:49 | s

23:07:38 | running eval: valid
23:07:49 | eval completed in 11.55s
23:07:49 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .00385 53.69 635.2 674.5       0          0 14.98  171 .1838    .2004 14.68 2.762 3.125e-08   190   
   external         0 .004012 62.36                   0          0         44 .1514          15.89 3.026                   
   internal         0 .003687 45.02                   0          0        127 .2163          13.46 2.498                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      201.7       0          0 16.39    .1766      .4282         0                 1168 825.2 876.2  
   external             0          0 20.62    .1410      .3991         0                                   
   internal             0          0 12.16    .2122      .4573         0
23:07:49 | sa

23:08:38 | running eval: valid
23:08:50 | eval completed in 11.80s
23:08:50 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0    .00713 53.69 635.2   668       0          0 14.84  171 .1784    .2003 14.68 2.761 1.562e-08   190   
   external         0 1.428e-05 62.36                   0          0         44 .1431          15.89 3.031                   
   internal         0    .01425 45.02                   0          0        127 .2136          13.46 2.491                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      199.8       0          0 16.39    .1760      .4362         0                 1232 825.2 867.8  
   external             0          0 20.71    .1338      .4063         0                                   
   internal             0          0 12.08    .2182      .4661         0
23:08

internal/exs/train,184.0
exs/train,256.0
internal/clen/train,52.1413
internal/ctrunc/train,0.0
internal/ctrunclen/train,0.0
internal/llen/train,14.11957
internal/ltrunc/train,0.0
internal/ltrunclen/train,0.0
internal/loss/train,2.60868
internal/ppl/train,13.58111


internal/exs/train,▇▃▅▅▆▂▆▅▃▅▄▅▆▄▃▆▅▄▇▂▇▅▄▃▆▅▃▄▄▂▆▃▅▃▆▄▃▁▄█
exs/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/clen/train,▅▅▃▄▄▄▂▄▅▃▅▆▁▁█▃▆▂▁▆▄▅▄▆▆▃▂▇▄▄▄▆▂▇▃▄▅▇▁▁
internal/ctrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ctrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/llen/train,▃▅▃▆▃█▅▄▅▁▆▅▅▁▂▃█▇▃▂▂▄▄▄▅▁▄▅▆▄▅▂▃█▃▆▆▇▅▄
internal/ltrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ltrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/loss/train,█▇▇▇▆▇▄▅▅▅▄▄▃▂▃▂▃▂▃▄▃▂▅▃▃▂▁▄▄▁▃▂▂▄▁▄▃▃▃▄
internal/ppl/train,█▇▆▇▆▆▄▅▄▅▄▃▃▂▃▂▂▂▃▄▃▂▄▃▂▁▁▄▄▁▂▂▂▃▁▄▃▃▃▃
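
A note on reading the validation reports above: the per-task `ppl` values appear to be the exponential of the per-task `loss`, and the `all` row appears to be the unweighted mean of the two task rows. The sketch below checks this against the 23:01:42 report. The averaging rule is inferred from the logged numbers, not taken from the ParlAI documentation, and the variable names are illustrative only.

```python
import math

# Per-task validation loss copied from the 23:01:42 validation report.
task_loss = {"external": 2.999, "internal": 2.494}

# Assumption: perplexity is exp(mean token loss) for each task.
task_ppl = {task: math.exp(loss) for task, loss in task_loss.items()}

# Assumption inferred from the log: the "all" row is an unweighted
# mean over tasks, for both loss and ppl.
all_loss = sum(task_loss.values()) / len(task_loss)
all_ppl = sum(task_ppl.values()) / len(task_ppl)

for task, ppl in task_ppl.items():
    print(f"{task}: loss={task_loss[task]:.3f} ppl={ppl:.2f}")
print(f"all: loss={all_loss:.4f} ppl={all_ppl:.2f}")
```

The computed values agree with the logged ones (20.07, 12.1 and 16.09) up to rounding of the logged loss. If the rule holds, the `all` perplexity used for early stopping weights the 44 external examples as heavily as the 127 internal ones.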


## 6. Blender base + last_turn

### Datasets with sampling weights:
- internal: 6
- external: 3


### Mutators
- last_turn
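
As a rough illustration of what the 6:3 weighting means, the sketch below normalizes the weights into sampling probabilities and the expected per-task share of 256 training examples. It assumes the weights are applied proportionally, which is how ParlAI's `multitask_weights` option is understood to behave. The variable names are illustrative and not part of `run_training`.

```python
# Same strings that are passed to run_training for this run.
tasks = "internal,external"
weights = "6,3"

task_names = tasks.split(",")
raw_weights = [float(w) for w in weights.split(",")]

# Assumption: each task is drawn with probability weight / sum(weights).
total = sum(raw_weights)
probs = {name: w / total for name, w in zip(task_names, raw_weights)}

# Expected number of examples per task out of 256 training examples.
n_examples = 256
expected = {name: p * n_examples for name, p in probs.items()}

for name in task_names:
    print(f"{name}: p={probs[name]:.3f} expected_exs={expected[name]:.1f}")
```

With these weights roughly two thirds of the training examples should come from `internal`, which is consistent with the 170/86 split in the first training report of this run.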

In [9]:
tasks = 'internal,external'
weights = '6,3'
mutators = 'last_turn'

run_training(tasks=tasks, weights=weights, mutators=mutators)

23:09:46 | building dictionary first...
23:09:46 | No model with opt yet at: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model(.opt)
23:09:46 | your model is being loaded with opts that do not exist in the model you are initializing the weights with: allow_missing_init_opts: False,download_path: None,loglevel: info,dynamic_batching: None,verbose: False,is_debug: False,datapath: /home/alex/ParlaiEmely/ParlAI/data,eval_dynamic_batching: None,num_workers: 0,max_train_steps: -1,log_every_n_steps: 50,validation_every_n_steps: -1,load_from_checkpoint: True,tensorboard_logdir: None,wandb_log: True,wandb_name: None,wandb_project: parlaiemely,wandb_entity: None,mutators: last_turn,preserve_context: True,n_encoder_layers: -1,n_decoder_layers: -1,model_parallel: False,beam_block_full_context: True,beam_delay: 30,beam_block_list_filename: None,temperature: 1.0,interactive_mode: False,history_reversed: False,history_add_global_end_token: None,special_tok_ls

23:09:47 |     text_truncate: 512
23:09:47 |     topk: 10
23:09:47 |     topp: 0.9
23:09:47 |     truncate: -1
23:09:47 |     update_freq: 1
23:09:47 |     use_reply: label
23:09:47 |     validation_cutoff: 1.0
23:09:47 |     validation_every_n_epochs: 0.25
23:09:47 |     validation_every_n_secs: -1
23:09:47 |     validation_every_n_steps: -1
23:09:47 |     validation_max_exs: 20000
23:09:47 |     validation_metric: ppl
23:09:47 |     validation_metric_mode: min
23:09:47 |     validation_patience: 15
23:09:47 |     validation_share_agent: False
23:09:47 |     variant: xlm
23:09:47 |     verbose: False
23:09:47 |     wandb_entity: None
23:09:47 |     wandb_log: True
23:09:47 |     wandb_name: None
23:09:47 |     wandb_project: parlaiemely
23:09:47 |     warmup_rate: 0.0001
23:09:47 |     warmup_updates: -1
23:09:47 |     weight_decay: None
23:09:47 | Current ParlAI commit: adb7ed134aec01437ef8d4e6bf450827fd0fbdc2
23:09:48 | creating task(s): internal,external
23:09:48 | Loading ParlAI t

23:10:05 | training...
23:10:06 | time:19s total_exs:256 total_steps:16 epochs:0.26
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all      40.08 .9375 636.8  6665       0          0 167.4  256             29696    .2574 15.35 2.923 1e-06 239.4  2505   
   external 40.92                         0          0         86                            16.52 3.167                     
   internal 39.24                         0          0        170                            14.17 2.679                     
             ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb  tps   ups  
   all            0          0 19.15      .4068         0                   16 876.2 9171 10.53  
   external       0          0 23.73      .3786         0                                        
   internal       0          0 14.57      .4350         0





23:10:06 | creating task(s): internal
23:10:06 | Loading ParlAI text data: /home/alex/ParlaiEmely/ParlAI/data/internal/valid.txt
23:10:06 | creating task(s): external
23:10:06 | Loading ParlAI text data: /home/alex/ParlaiEmely/ParlAI/data/external/valid.txt
23:10:06 | running eval: valid
23:10:24 | eval completed in 17.28s
23:10:24 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .003699  38.7 458.4 340.3       0          0 10.48  171 .1694    .2004 14.68 2.855 1e-06   190   141   
   external         0 .005627 44.91                   0          0         44 .1624          15.89 3.125                     
   internal         0  .00177 32.49                   0          0        127 .1764          13.46 2.586                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 18.02    .1744  

23:11:06 | new best ppl: 17.84 (previous best was 18.02)
23:11:06 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model
23:11:09 | time:81s total_exs:1280 total_steps:80 epochs:1.32
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      39.14     1 621.1  7132       0          0 183.7  256             16384  12.91    .2593 15.08  2.92 1e-06 235.9   
   external 40.24                         0          0         91                                   16.23 3.182               
   internal 38.03                         0          0        165                                   13.92 2.657               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb  tps   ups  
   all       2709       0          0 19.18      .4056         0                   80 856.9 9841 11.49  
   external             0      

23:11:58 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model.checkpoint
23:12:00 | new best ppl: 16.89 (previous best was 17.09)
23:12:00 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model
23:12:01 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model.checkpoint
23:12:04 | time:136s total_exs:2304 total_steps:144 epochs:2.38
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      40.85     1 649.3  7176       0          0 176.8  256             16384  12.64    .2343 14.89 2.689 1e-06 233.1   
   external 41.56                         0          0         79                                   15.72 2.802               
   internal 40.15                         0          0        177                               

23:12:50 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01552 39.61   519   409       0          0 12.61  102 .2382    .2006 14.75 2.753 1e-06 222.4 175.3   
   external         0  .02347 46.77                   0          0         22 .2374          15.59 3.039                     
   internal         0 .007577 32.44                   0          0         80 .2389           13.9 2.466                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.33    .2432      .4423         0                  192 741.4 584.3  
   external       0          0 20.89    .2507      .4052         0                                   
   internal       0          0 11.78    .2357      .4793         0
23:12:50 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,ex



23:13:09 | running eval: valid
23:13:18 | eval completed in 8.28s
23:13:18 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007401 39.61   519   411       0          0 12.67  102 .2322    .2004 14.75  2.74 1e-06 222.4 176.1   
   external         0 .007226 46.77                   0          0         22 .2234          15.59 3.026                     
   internal         0 .007577 32.44                   0          0         80 .2410           13.9 2.454                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.12    .2382      .4461         0                  224 741.4 587.1  
   external       0          0 20.61    .2383      .4111         0                                   
   internal       0          0 11.64    .2382      .4811         0
23:13:18 | saving model checkp

23:13:59 | running eval: valid
23:14:07 | eval completed in 7.99s
23:14:07 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01552 39.61   519   423       0          0 13.04  102 .2198    .2007 14.75 2.721 1e-06 222.4 181.3   
   external         0  .02347 46.77                   0          0         22 .2010          15.59 3.006                     
   internal         0 .007579 32.44                   0          0         80 .2387           13.9 2.435                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.81    .2259      .4456         0                  288 741.4 604.3  
   external       0          0 20.21    .2117      .4111         0                                   
   internal       0          0 11.42    .2401      .4802         0
23:14:07 | saving model checkp

23:14:49 | running eval: valid
23:14:57 | eval completed in 7.85s
23:14:57 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007402 39.61   519 431.2       0          0 13.29  102 .2170    .2007 14.75 2.706 1e-06 222.4 184.8   
   external         0 .007226 46.77                   0          0         22 .1941          15.59 2.992                     
   internal         0 .007579 32.44                   0          0         80 .2399           13.9  2.42                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.59    .2276      .4456         0                  352 741.4 615.9  
   external       0          0 19.93    .2145      .4111         0                                   
   internal       0          0 11.25    .2407      .4802         0
23:14:57 | saving model checkp

23:15:39 | running eval: valid
23:15:47 | eval completed in 7.72s
23:15:47 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007404 39.61   519 436.6       0          0 13.46  102 .2359    .2004 14.75 2.693 1e-06 222.4 187.1   
   external         0 .007226 46.77                   0          0         22 .2215          15.59 2.978                     
   internal         0 .007583 32.44                   0          0         80 .2502           13.9 2.409                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.38    .2448      .4489         0                  416 741.4 623.8  
   external       0          0 19.64    .2339      .4140         0                                   
   internal       0          0 11.13    .2557      .4838         0
23:15:47 | saving model checkp

23:16:28 | running eval: valid
23:16:36 | eval completed in 7.55s
23:16:36 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01553 39.61   519 448.8       0          0 13.84  102 .2468    .2006 14.75 2.684 1e-06 222.4 192.3   
   external         0  .02347 46.77                   0          0         22 .2402          15.59 2.968                     
   internal         0 .007583 32.44                   0          0         80 .2533           13.9 2.399                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.24    .2535      .4527         0                  480 741.4 641.1  
   external       0          0 19.46    .2509      .4198         0                                   
   internal       0          0 11.01    .2561      .4856         0
23:16:36 | saving model checkp

23:17:17 | running eval: valid
23:17:24 | eval completed in 7.44s
23:17:24 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01832 39.61   519 454.6       0          0 14.01  102 .2503    .2007 14.75 2.674 1e-06 222.4 194.8   
   external         0  .02347 46.77                   0          0         22 .2402          15.59 2.958                     
   internal         0  .01316 32.44                   0          0         80 .2603           13.9  2.39                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.08    .2580      .4561         0                  544 741.4 649.4  
   external       0          0 19.25    .2509      .4257         0                                   
   internal       0          0 10.91    .2651      .4865         0
23:17:24 | saving model checkp

23:18:04 | running eval: valid
23:18:12 | eval completed in 7.41s
23:18:12 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01972 39.61   519 457.1       0          0 14.09  102 .2538    .2006 14.75 2.666 1e-06 222.4 195.9   
   external         0  .02347 46.77                   0          0         22 .2402          15.59  2.95                     
   internal         0  .01597 32.44                   0          0         80 .2674           13.9 2.381                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 14.96    .2597      .4556         0                  608 741.4  653  
   external       0          0 19.11    .2509      .4257         0                                  
   internal       0          0 10.82    .2685      .4856         0
23:18:12 | saving model checkpoin

23:18:51 | running eval: valid
23:18:59 | eval completed in 7.24s
23:18:59 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01832 39.61   519 466.2       0          0 14.37  102 .2415    .2006 14.75  2.66 1e-06 222.4 199.8   
   external         0  .02347 46.77                   0          0         22 .2326          15.59 2.945                     
   internal         0  .01317 32.44                   0          0         80 .2505           13.9 2.376                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 14.88    .2516      .4580         0                  672 741.4  666  
   external       0          0 19.01    .2490      .4286         0                                  
   internal       0          0 10.76    .2542      .4874         0
23:18:59 | saving model checkpoin

23:19:39 | running eval: valid
23:19:46 | eval completed in 7.19s
23:19:46 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .0102 39.61   519 470.1       0          0 14.49  102 .2443    .2004 14.75 2.654 1e-06 222.4 201.4   
   external         0 .007226 46.77                   0          0         22 .2340          15.59 2.938                     
   internal         0  .01317 32.44                   0          0         80 .2546           13.9  2.37                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.78    .2509      .4562         0                  736 741.4 671.5  
   external       0          0 18.87    .2454      .4286         0                                   
   internal       0          0 10.69    .2563      .4838         0
23:19:46 | saving model checkp

23:20:25 | running eval: valid
23:20:33 | eval completed in 7.26s
23:20:33 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01917 39.61   519 470.5       0          0  14.5  102 .2401    .2006 14.75 2.647 1e-06 222.4 201.6   
   external         0  .02206 46.77                   0          0         22 .2219          15.59  2.93                     
   internal         0  .01627 32.44                   0          0         80 .2584           13.9 2.364                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.68    .2518      .4596         0                  800 741.4 672.1  
   external       0          0 18.73    .2426      .4344         0                                   
   internal       0          0 10.64    .2610      .4847         0
23:20:33 | saving model checkp

23:21:12 | running eval: valid
23:21:19 | eval completed in 7.19s
23:21:19 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .009858 39.61   519 472.1       0          0 14.55  102 .2245    .2005 14.75 2.642 1e-06 222.4 202.3   
   external         0 1.395e-05 46.77                   0          0         22 .1815          15.59 2.925                     
   internal         0     .0197 32.44                   0          0         80 .2675           13.9 2.359                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.61    .2398      .4596         0                  864 741.4 674.4  
   external       0          0 18.64    .2086      .4344         0                                   
   internal       0          0 10.58    .2710      .4847         0
23:21:19 | saving mode

23:22:00 | time:732s total_exs:14848 total_steps:928 epochs:15.35
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      41.09     1 645.1  6984       0          0 173.2  256              8192  11.23    .2554 14.46 2.472 1e-06 230.9   
   external 43.11                         0          0         79                                   14.53 2.533               
   internal 39.07                         0          0        177                                   14.38 2.411               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb  tps   ups  
   all       2500       0          0 11.87      .4747         0                  928 875.9 9484 10.83  
   external             0          0 12.59      .4695         0                                        
   internal             0          0 11.14      .4800         0

23:22:00 | running eval: valid
23:22:07 | eval co

23:22:41 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model.checkpoint
23:22:42 | new best ppl: 14.45 (previous best was 14.48)
23:22:42 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model
23:22:45 | time:777s total_exs:15872 total_steps:992 epochs:16.41
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      41.68     1   666  7952       0          0   191  256              8192   10.9    .2593 15.19 2.485 1e-06 240.4   
   external 42.03                         0          0        107                                   16.21 2.656               
   internal 41.34                         0          0        149                                   14.17 2.313               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb   tps   up

23:23:28 | eval completed in 7.13s
23:23:28 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .009968 39.61   519 473.3       0          0 14.59  102 .2264    .2004 14.75 2.628 1e-06 222.4 202.8   
   external         0 1.395e-05 46.77                   0          0         22 .1770          15.59 2.908                     
   internal         0    .01992 32.44                   0          0         80 .2757           13.9 2.348                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.39    .2429      .4611         0                 1040 741.4 676.1  
   external       0          0 18.31    .2052      .4402         0                                   
   internal       0          0 10.47    .2806      .4820         0
23:23:28 | saving model checkpoint: /home/alex/Parlai

23:24:06 | running eval: valid
23:24:13 | eval completed in 7.10s
23:24:13 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02025 39.61   519 477.8       0          0 14.73  102 .2321    .2004 14.75 2.624 1e-06 222.4 204.7   
   external         0  .01625 46.77                   0          0         22 .1886          15.59 2.903                     
   internal         0  .02424 32.44                   0          0         80 .2756           13.9 2.345                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.33    .2487      .4626         0                 1104 741.4 682.5  
   external       0          0 18.23    .2179      .4431         0                                   
   internal       0          0 10.43    .2795      .4820         0
23:24:13 | saving model checkp

23:24:52 | running eval: valid
23:24:59 | eval completed in 7.03s
23:24:59 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0    .01223 39.61   519 480.9       0          0 14.82  102 .2285    .2006 14.75  2.62 1e-06 222.4 206.1   
   external         0 1.395e-05 46.77                   0          0         22 .1858          15.59 2.899                     
   internal         0    .02446 32.44                   0          0         80 .2711           13.9 2.342                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 14.28    .2478      .4621         0                 1168 741.4  687  
   external       0          0 18.15    .2169      .4431         0                                  
   internal       0          0  10.4    .2787      .4811         0
23:24:59 | saving model c

23:25:39 | time:952s total_exs:19712 total_steps:1232 epochs:20.38
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      40.23     1 641.3  6880       0          0 171.6  256              8192  11.19    .2574 14.21 2.365 1e-06 226.7   
   external 40.74                         0          0         91                                   14.37 2.561               
   internal 39.72                         0          0        165                                   14.05 2.169               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb  tps   ups  
   all       2432       0          0 10.84      .4964         0                 1232  868 9312 10.73  
   external             0          0 12.94      .4740         0                                       
   internal             0          0 8.747      .5188         0

23:25:39 | running eval: valid
23:25:46 | eval comp

23:26:21 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model.checkpoint
23:26:22 | new best ppl: 14.17 (previous best was 14.18)
23:26:22 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model
23:26:25 | time:997s total_exs:20736 total_steps:1296 epochs:21.44
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all       40.6     1   639  6811       0          0 170.5  256              8192  10.31    .2574 15.16 2.249 1e-06 237.4   
   external  42.9                         0          0         91                                   16.26  2.44               
   internal  38.3                         0          0        165                                   14.05 2.059               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb  tps   up

23:27:08 | eval completed in 7.00s
23:27:08 | valid:
             accuracy   bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .01648 39.61   519 480.9       0          0 14.83  102 .2355    .2003 14.75 2.611 1e-06 222.4 206.1   
   external         0 1.71e-05 46.77                   0          0         22 .1978          15.59 2.886                     
   internal         0   .03295 32.44                   0          0         80 .2732           13.9 2.335                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 14.13    .2509      .4606         0                 1344 741.4  687  
   external       0          0 17.92    .2235      .4373         0                                  
   internal       0          0 10.33    .2783      .4838         0
23:27:08 | saving model checkpoint: /home/alex/ParlaiEmely/m

23:27:46 | running eval: valid
23:27:53 | eval completed in 7.01s
23:27:53 | valid:
             accuracy   bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .01355 39.61   519 480.2       0          0  14.8  102 .2377    .2005 14.75 2.609 1e-06 222.4 205.8   
   external         0 1.71e-05 46.77                   0          0         22 .1978          15.59 2.886                     
   internal         0   .02709 32.44                   0          0         80 .2777           13.9 2.333                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.11    .2532      .4638         0                 1408 741.4 685.9  
   external       0          0 17.91    .2235      .4402         0                                   
   internal       0          0 10.31    .2829      .4874         0
23:27:53 | saving model ch

23:28:31 | running eval: valid
23:28:38 | eval completed in 7.05s
23:28:38 | valid:
             accuracy   bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .01429 39.61   519   476       0          0 14.67  102 .2373    .2007 14.75 2.607 1e-06 222.4   204   
   external         0 1.71e-05 46.77                   0          0         22 .1978          15.59 2.882                     
   internal         0   .02856 32.44                   0          0         80 .2769           13.9 2.332                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.07    .2517      .4619         0                 1472 741.4 679.9  
   external       0          0 17.84    .2235      .4373         0                                   
   internal       0          0  10.3    .2800      .4865         0
23:28:38 | saving model ch

23:29:19 | running eval: valid
23:29:26 | eval completed in 7.09s
23:29:26 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02368 39.61   519 473.1       0          0 14.58  102 .2470    .2006 14.75 2.604 1e-06 222.4 202.7   
   external         0  .01436 46.77                   0          0         22 .2170          15.59 2.879                     
   internal         0   .0330 32.44                   0          0         80 .2770           13.9  2.33                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.03    .2606      .4628         0                 1536 741.4 675.8  
   external       0          0 17.79    .2401      .4373         0                                   
   internal       0          0 10.28    .2811      .4883         0
23:29:26 | saving model checkp

23:30:06 | running eval: valid
23:30:13 | eval completed in 7.06s
23:30:13 | valid:
             accuracy   bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .01651 39.61   519 474.9       0          0 14.64  102 .2416    .2005 14.75 2.602 1e-06 222.4 203.5   
   external         0 1.71e-05 46.77                   0          0         22 .2059          15.59 2.877                     
   internal         0    .0330 32.44                   0          0         80 .2772           13.9 2.328                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.01    .2529      .4646         0                 1600 741.4 678.4  
   external       0          0 17.76    .2235      .4373         0                                   
   internal       0          0 10.25    .2822      .4919         0
23:30:13 | saving model ch

23:30:51 | running eval: valid
23:30:58 | eval completed in 7.09s
23:30:58 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0    .01286 39.61   519 470.2       0          0 14.49  102 .2251    .2006 14.75   2.6 1e-06 222.4 201.5   
   external         0 3.182e-06 46.77                   0          0         22 .1814          15.59 2.874                     
   internal         0    .02572 32.44                   0          0         80 .2688           13.9 2.326                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.97    .2389      .4637         0                 1664 741.4 671.6  
   external       0          0 17.71    .2014      .4373         0                                   
   internal       0          0 10.24    .2764      .4901         0
23:30:58 | saving mode

23:31:38 | time:1311s total_exs:27648 total_steps:1728 epochs:28.59
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      39.31     1 627.1  7006       0          0 178.8  256              8192   10.3    .2593 16.02 2.263 1e-06 248.3   
   external 39.77                         0          0         94                                    17.9 2.388               
   internal 38.86                         0          0        162                                   14.14 2.139               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb  tps   ups  
   all       2774       0          0  9.69      .5053         0                 1728 875.4 9781 11.18  
   external             0          0 10.89      .4896         0                                        
   internal             0          0 8.492      .5210         0

23:31:38 | running eval: valid
23:31:45 | eval 

23:32:19 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model.checkpoint
23:32:20 | new best ppl: 13.88 (previous best was 13.88)
23:32:21 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model
23:32:22 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-last_turn/model.checkpoint
23:32:25 | time:1357s total_exs:28672 total_steps:1792 epochs:29.65
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      40.82     1 659.5  7148       0          0 173.4  256              8192  10.29    .2554 15.03 2.249 1e-06 237.2   
   external  39.4                         0          0         92                                   15.76 2.285               
   internal 42.24                         0          0        164                           

23:33:06 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02887 39.61   519   475       0          0 14.64  102 .2449    .2006 14.75 2.592 1e-06 222.4 203.5   
   external         0  .03059 46.77                   0          0         22 .2161          15.59 2.863                     
   internal         0  .02714 32.44                   0          0         80 .2738           13.9 2.321                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.85    .2579      .4628         0                 1840 741.4 678.5  
   external       0          0 17.51    .2370      .4373         0                                   
   internal       0          0 10.19    .2787      .4883         0
23:33:06 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,ex

23:33:46 | running eval: valid
23:33:53 | eval completed in 7.03s
23:33:53 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02672 39.61   519   477       0          0 14.71  102 .2542    .2006 14.75  2.59 1e-06 222.4 204.4   
   external         0  .03062 46.77                   0          0         22 .2371          15.59 2.861                     
   internal         0  .02281 32.44                   0          0         80 .2714           13.9 2.319                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.82    .2634      .4623         0                 1904 741.4 681.4  
   external       0          0 17.47    .2518      .4344         0                                   
   internal       0          0 10.17    .2750      .4901         0
23:33:53 | saving model checkp

23:34:31 | running eval: valid
23:34:38 | eval completed in 7.08s
23:34:38 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01934 39.61   519 472.7       0          0 14.57  102 .2500    .2007 14.75 2.588 1e-06 222.4 202.5   
   external         0  .01437 46.77                   0          0         22 .2214          15.59 2.859                     
   internal         0  .02432 32.44                   0          0         80 .2786           13.9 2.318                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.79    .2607      .4623         0                 1968 741.4 675.2  
   external       0          0 17.44    .2394      .4344         0                                   
   internal       0          0 10.15    .2819      .4901         0
23:34:38 | saving model checkp

23:35:18 | running eval: valid
23:35:25 | eval completed in 7.03s
23:35:25 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01935 39.61   519 475.7       0          0 14.67  102 .2518    .2004 14.75 2.585 1e-06 222.4 203.9   
   external         0  .01437 46.77                   0          0         22 .2214          15.59 2.855                     
   internal         0  .02433 32.44                   0          0         80 .2821           13.9 2.315                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.74    .2631      .4666         0                 2032 741.4 679.6  
   external       0          0 17.37    .2394      .4431         0                                   
   internal       0          0 10.12    .2867      .4901         0
23:35:25 | saving model checkp

23:36:05 | running eval: valid
23:36:12 | eval completed in 7.00s
23:36:12 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01935 39.61   519 477.2       0          0 14.71  102 .2575    .2003 14.75 2.584 1e-06 222.4 204.5   
   external         0  .01438 46.77                   0          0         22 .2333          15.59 2.853                     
   internal         0  .02433 32.44                   0          0         80 .2818           13.9 2.314                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.73    .2689      .4680         0                 2096 741.4 681.7  
   external       0          0 17.35    .2495      .4431         0                                   
   internal       0          0 10.11    .2882      .4928         0
23:36:12 | saving model checkp

23:36:51 | running eval: valid
23:36:58 | eval completed in 7.00s
23:36:58 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01935 39.61   519 479.5       0          0 14.78  102 .2575    .2004 14.75 2.582 1e-06 222.4 205.5   
   external         0  .01438 46.77                   0          0         22 .2258          15.59 2.851                     
   internal         0  .02433 32.44                   0          0         80 .2891           13.9 2.313                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 13.71    .2649      .4670         0                 2160 741.4  685  
   external       0          0 17.31    .2385      .4402         0                                  
   internal       0          0 10.11    .2914      .4937         0
23:36:58 | saving model checkpoin



23:37:25 | running eval: valid
23:37:32 | eval completed in 7.01s
23:37:32 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .0223 39.61   519 478.2       0          0 14.74  102 .2519    .2005 14.75 2.582 1e-06 222.4 204.9   
   external         0  .01437 46.77                   0          0         22 .2246          15.59 2.851                     
   internal         0  .03023 32.44                   0          0         80 .2791           13.9 2.312                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  13.7    .2611      .4670         0                 2208 741.4 683.2  
   external       0          0 17.31    .2385      .4402         0                                   
   internal       0          0  10.1    .2838      .4937         0
23:37:32 | saving model checkp

23:38:11 | running eval: valid
23:38:18 | eval completed in 7.06s
23:38:18 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02304 39.61   519 475.1       0          0 14.64  102 .2501    .2004 14.75 2.581 1e-06 222.4 203.6   
   external         0  .01437 46.77                   0          0         22 .2139          15.59 2.849                     
   internal         0  .03172 32.44                   0          0         80 .2863           13.9 2.312                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.68    .2597      .4712         0                 2272 741.4 678.6  
   external       0          0 17.27    .2284      .4461         0                                   
   internal       0          0  10.1    .2910      .4964         0
23:38:18 | saving model checkp

23:38:56 | running eval: valid
23:39:03 | eval completed in 7.03s
23:39:03 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02087 39.61   519   477       0          0 14.71  102 .2471    .2007 14.75  2.58 1e-06 222.4 204.4   
   external         0  .01437 46.77                   0          0         22 .2139          15.59 2.848                     
   internal         0  .02738 32.44                   0          0         80 .2804           13.9 2.312                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.67    .2572      .4717         0                 2336 741.4 681.4  
   external       0          0 17.25    .2284      .4461         0                                   
   internal       0          0 10.09    .2860      .4973         0
23:39:03 | saving model checkp

23:39:40 | running eval: valid
23:39:47 | eval completed in 7.07s
23:39:47 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .0290 39.61   519 474.5       0          0 14.63  102 .2557    .2004 14.75  2.58 1e-06 222.4 203.3   
   external         0  .03061 46.77                   0          0         22 .2278          15.59 2.849                     
   internal         0  .02738 32.44                   0          0         80 .2836           13.9 2.311                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.68    .2651      .4693         0                 2400 741.4 677.9  
   external       0          0 17.27    .2408      .4431         0                                   
   internal       0          0 10.09    .2894      .4955         0
23:39:48 | saving model checkp

23:40:21 | running eval: valid
23:40:28 | eval completed in 7.05s
23:40:28 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .0290 39.61   519 473.7       0          0  14.6  102 .2534    .2005 14.75  2.58 5e-07 222.4   203   
   external         0  .03061 46.77                   0          0         22 .2278          15.59 2.848                     
   internal         0  .02738 32.44                   0          0         80 .2791           13.9 2.311                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.67    .2632      .4708         0                 2464 741.4 676.6  
   external       0          0 17.26    .2408      .4461         0                                   
   internal       0          0 10.08    .2856      .4955         0
23:40:29 | saving model checkp

23:41:02 | running eval: valid
23:41:09 | eval completed in 7.15s
23:41:09 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  ltps  \
   all              0   .0290 39.61   519 468.5       0          0 14.44  102 .2533    .2004 14.75 2.579 2.5e-07 222.4 200.8   
   external         0  .03061 46.77                   0          0         22 .2278          15.59 2.848                       
   internal         0  .02738 32.44                   0          0         80 .2789           13.9  2.31                       
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.67    .2629      .4703         0                 2528 741.4 669.3  
   external       0          0 17.26    .2408      .4461         0                                   
   internal       0          0 10.08    .2850      .4946         0
23:41:09 | saving mode

23:41:44 | running eval: valid
23:41:52 | eval completed in 7.08s
23:41:52 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  ltps  \
   all              0  .02824 39.61   519 473.3       0          0 14.59  102 .2545    .2004 14.75 2.579 1.25e-07 222.4 202.8   
   external         0  .03061 46.77                   0          0         22 .2278          15.59 2.848                        
   internal         0  .02587 32.44                   0          0         80 .2812           13.9  2.31                        
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.66    .2633      .4708         0                 2592 741.4 676.1  
   external       0          0 17.25    .2408      .4461         0                                   
   internal       0          0 10.07    .2858      .4955         0
23:41:52 | saving 

23:42:27 | running eval: valid
23:42:34 | eval completed in 7.10s
23:42:34 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss       lr  ltpb  ltps  \
   all              0  .02824 39.61   519 471.8       0          0 14.55  102 .2541    .2006 14.75 2.579 6.25e-08 222.4 202.2   
   external         0  .03061 46.77                   0          0         22 .2278          15.59 2.848                        
   internal         0  .02587 32.44                   0          0         80 .2805           13.9  2.31                        
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 13.66    .2629      .4708         0                 2656 741.4  674  
   external       0          0 17.26    .2408      .4461         0                                  
   internal       0          0 10.07    .2851      .4955         0
23:42:34 | saving mod

23:43:08 | running eval: valid
23:43:15 | eval completed in 7.13s
23:43:15 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .02824 39.61   519 469.8       0          0 14.48  102 .2540    .2004 14.75 2.579 3.125e-08 222.4   
   external         0  .03061 46.77                   0          0         22 .2278          15.59 2.848                   
   internal         0  .02587 32.44                   0          0         80 .2803           13.9  2.31                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      201.3       0          0 13.66    .2629      .4703         0                 2720 741.4 671.1  
   external             0          0 17.25    .2408      .4461         0                                   
   internal             0          0 10.07    .2851      .4946         0
23:43:15 | sav



23:43:18 | running eval: valid
23:43:25 | eval completed in 7.09s
23:43:25 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .02824 39.61   519 472.4       0          0 14.56  102 .2551    .2005 14.75 2.579 3.125e-08 222.4   
   external         0  .03061 46.77                   0          0         22 .2278          15.59 2.848                   
   internal         0  .02587 32.44                   0          0         80 .2824           13.9  2.31                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      202.4       0          0 13.66    .2637      .4708         0                 2736 741.4 674.8  
   external             0          0 17.26    .2408      .4461         0                                   
   internal             0          0 10.07    .2866      .4955         0
23:43:25 | sav

23:44:00 | running eval: valid
23:44:07 | eval completed in 7.05s
23:44:07 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .02824 39.61   519 475.7       0          0 14.66  102 .2540    .2004 14.75 2.579 1.562e-08 222.4   
   external         0  .03061 46.77                   0          0         22 .2278          15.59 2.848                   
   internal         0  .02587 32.44                   0          0         80 .2803           13.9  2.31                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      203.8       0          0 13.66    .2629      .4703         0                 2800 741.4 679.5  
   external             0          0 17.25    .2408      .4461         0                                   
   internal             0          0 10.07    .2851      .4946         0
23:44:07 | sav

internal/exs/train,180.0
exs/train,256.0
internal/clen/train,40.42222
internal/ctrunc/train,0.0
internal/ctrunclen/train,0.0
internal/llen/train,13.96667
internal/ltrunc/train,0.0
internal/ltrunclen/train,0.0
internal/loss/train,2.03321
internal/ppl/train,7.63859


internal/exs/train,▇▆▆▆▃▅█▅▆▇▄█▃▇▁▆▆▄▄▇▂▆▅▄▄▄▆▆▇▆▅▆▄▆▇█▅▅█▇
exs/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/clen/train,▂▃▁▂▅▂▃▃▂▂▅▅▄▄▇▃▄▄▃▁▃▂▇▆▄█▃▇▄▄▆▄▇▃▃▅▆▇▄▃
internal/ctrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ctrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/llen/train,▄█▄▅▃▇▂▄▆▆▄▆▃▆▅▃▃▃▁▆▅▄▃▇▄▅▄▆▄▄▂▃██▆▄▂▄▅▃
internal/ltrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ltrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/loss/train,█▇▇▆▅▆▅▆▅▆▅▅▅▆▅▄▄▄▃▅▄▃▃▄▄▄▄▄▃▃▃▃▃▄▃▃▁▂▂▃
internal/ppl/train,█▇▆▅▅▅▅▅▄▅▄▄▄▅▄▃▃▃▂▄▃▃▃▄▃▄▃▃▂▂▂▂▂▃▂▂▁▂▂▂
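
A quick way to sanity-check how to read the validation tables above. This is a minimal sketch using only numbers copied from the last validation report of the `last_turn` run; the variable names are made up for illustration. It checks three things that the logged values are consistent with: per-task `ppl` equals `exp(loss)`, the `all` row matches an unweighted mean over the two tasks (so the 6,3 training weights do not seem to enter the validation aggregate), and the logged `lr` values follow repeated halving from the initial `1e-06`.

```python
import math

# Values copied from the final validation report of the last_turn run.
valid_loss = {"external": 2.848, "internal": 2.31}
logged_ppl = {"external": 17.25, "internal": 10.07, "all": 13.66}
logged_loss_all = 2.579

# Per-task perplexity recomputed as exp(loss).
ppl = {task: math.exp(loss) for task, loss in valid_loss.items()}

# Unweighted (macro) average over tasks, compared against the "all" row.
ppl_all = sum(ppl.values()) / len(ppl)
loss_all = sum(valid_loss.values()) / len(valid_loss)

# Learning rates as they appear in successive validation reports.
logged_lr = [1e-06, 5e-07, 2.5e-07, 1.25e-07, 6.25e-08, 3.125e-08, 1.562e-08]
# Assumption for illustration: each reduction multiplies the lr by 0.5.
halved_lr = [1e-06 * 0.5 ** k for k in range(len(logged_lr))]

for task, value in ppl.items():
    print(f"{task}: exp(loss)={value:.2f} logged ppl={logged_ppl[task]}")
print(f"all: mean ppl={ppl_all:.2f} logged={logged_ppl['all']}")
print(f"all: mean loss={loss_all:.3f} logged={logged_loss_all}")
for logged, computed in zip(logged_lr, halved_lr):
    print(f"lr logged={logged:.4g} halved={computed:.4g}")
```

Small differences in the last digit come from the rounding in the log output.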


## 7. Blender base + flatten

In [10]:
tasks = 'internal,external'
weights = '6,3'
mutators = 'flatten'

run_training(tasks=tasks, weights=weights, mutators=mutators)

23:45:04 | building dictionary first...
23:45:04 | No model with opt yet at: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model(.opt)
23:45:04 | your model is being loaded with opts that do not exist in the model you are initializing the weights with: allow_missing_init_opts: False,download_path: None,loglevel: info,dynamic_batching: None,verbose: False,is_debug: False,datapath: /home/alex/ParlaiEmely/ParlAI/data,eval_dynamic_batching: None,num_workers: 0,max_train_steps: -1,log_every_n_steps: 50,validation_every_n_steps: -1,load_from_checkpoint: True,tensorboard_logdir: None,wandb_log: True,wandb_name: None,wandb_project: parlaiemely,wandb_entity: None,mutators: flatten,n_encoder_layers: -1,n_decoder_layers: -1,model_parallel: False,beam_block_full_context: True,beam_delay: 30,beam_block_list_filename: None,temperature: 1.0,interactive_mode: False,history_reversed: False,history_add_global_end_token: None,special_tok_lst: None,bpe_vocab: None,bpe

23:45:05 |     topk: 10
23:45:05 |     topp: 0.9
23:45:05 |     truncate: -1
23:45:05 |     update_freq: 1
23:45:05 |     use_reply: label
23:45:05 |     validation_cutoff: 1.0
23:45:05 |     validation_every_n_epochs: 0.25
23:45:05 |     validation_every_n_secs: -1
23:45:05 |     validation_every_n_steps: -1
23:45:05 |     validation_max_exs: 20000
23:45:05 |     validation_metric: ppl
23:45:05 |     validation_metric_mode: min
23:45:05 |     validation_patience: 15
23:45:05 |     validation_share_agent: False
23:45:05 |     variant: xlm
23:45:05 |     verbose: False
23:45:05 |     wandb_entity: None
23:45:05 |     wandb_log: True
23:45:05 |     wandb_name: None
23:45:05 |     wandb_project: parlaiemely
23:45:05 |     warmup_rate: 0.0001
23:45:05 |     warmup_updates: -1
23:45:05 |     weight_decay: None
23:45:05 | Current ParlAI commit: adb7ed134aec01437ef8d4e6bf450827fd0fbdc2
23:45:05 | creating task(s): internal,external
23:45:05 | Loading ParlAI text data: /home/alex/ParlaiEmely/P

23:45:23 | training...
23:45:24 | time:19s total_exs:256 total_steps:16 epochs:0.26
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      51.15     1 823.3  8775       0          0 170.5  256             32768  12.37    .2710 14.82 2.857 1e-06 236.2   
   external 50.28                         0          0         83                                   14.96 2.944               
   internal 52.02                         0          0        173                                   14.67  2.77               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all       2518       0          0 17.48      .4039         0                   16 1060 11293 10.78  
   external             0          0    19      .3905         0                                        
   internal             0          0 15.97      .4173         0

23:45:24 | creating task(s): in



23:45:58 | running eval: valid
23:46:08 | eval completed in 9.88s
23:46:08 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .006064 39.61   519 347.8       0          0 10.72  102 .1786    .2007 14.75 2.857 1e-06 222.4 149.1   
   external         0 .007206 46.77                   0          0         22 .1675          15.59 3.143                     
   internal         0 .004923 32.44                   0          0         80 .1897           13.9 2.572                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 18.13    .1870      .4224         0                   48 741.4 496.9  
   external       0          0 23.16    .1875      .3907         0                                   
   internal       0          0 13.09    .1865      .4541         0
23:46:08 | saving model checkp

23:46:50 | running eval: valid
23:46:59 | eval completed in 9.14s
23:46:59 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01402 39.61   519 375.8       0          0 11.59  102 .2273    .2006 14.75 2.811 5e-07 222.4   161   
   external         0  .02047 46.77                   0          0         22 .2235          15.59 3.096                     
   internal         0 .007572 32.44                   0          0         80 .2312           13.9 2.526                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 17.31    .2303      .4294         0                  112 741.4 536.8  
   external       0          0 22.11    .2395      .3965         0                                   
   internal       0          0  12.5    .2210      .4622         0
23:46:59 | saving model checkp

23:47:42 | running eval: valid
23:47:51 | eval completed in 8.93s
23:47:51 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007398 39.61   519 385.3       0          0 11.88  102 .2138    .2003 14.75 2.788 5e-07 222.4 165.1   
   external         0 .007224 46.77                   0          0         22 .2035          15.59 3.073                     
   internal         0 .007572 32.44                   0          0         80 .2242           13.9 2.502                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.91    .2195      .4377         0                  176 741.4 550.4  
   external       0          0 21.62    .2244      .4023         0                                   
   internal       0          0  12.2    .2147      .4730         0
23:47:51 | saving model checkp

23:48:37 | running eval: valid
23:48:45 | eval completed in 8.53s
23:48:45 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007406 39.61   519   404       0          0 12.46  102 .2250    .2005 14.75 2.769 5e-07 222.4 173.1   
   external         0 .007234 46.77                   0          0         22 .2173          15.59 3.055                     
   internal         0 .007578 32.44                   0          0         80 .2326           13.9 2.484                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  16.6    .2297      .4447         0                  240 741.4 577.2  
   external       0          0 21.22    .2294      .4111         0                                   
   internal       0          0 11.98    .2301      .4784         0
23:48:45 | saving model checkp

23:49:29 | running eval: valid
23:49:38 | eval completed in 8.44s
23:49:38 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01552 39.61   519 405.1       0          0 12.49  102 .2395    .2003 14.75 2.754 5e-07 222.4 173.6   
   external         0  .02347 46.77                   0          0         22 .2400          15.59 3.039                     
   internal         0 .007577 32.44                   0          0         80 .2390           13.9 2.469                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.35    .2440      .4456         0                  304 741.4 578.7  
   external       0          0 20.89    .2525      .4111         0                                   
   internal       0          0 11.81    .2355      .4802         0
23:49:38 | saving model checkp

23:50:20 | running eval: valid
23:50:28 | eval completed in 8.26s
23:50:28 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007401 39.61   519 408.4       0          0 12.59  102 .2308    .2004 14.75 2.741 5e-07 222.4   175   
   external         0 .007226 46.77                   0          0         22 .2234          15.59 3.026                     
   internal         0 .007576 32.44                   0          0         80 .2381           13.9 2.456                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 16.14    .2360      .4458         0                  368 741.4 583.5  
   external       0          0 20.62    .2383      .4140         0                                   
   internal       0          0 11.66    .2337      .4775         0
23:50:28 | saving model checkp

23:51:11 | running eval: valid
23:51:19 | eval completed in 7.99s
23:51:19 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007401 39.61   519 416.7       0          0 12.85  102 .2403    .2005 14.75 2.731 5e-07 222.4 178.6   
   external         0 .007226 46.77                   0          0         22 .2349          15.59 3.015                     
   internal         0 .007577 32.44                   0          0         80 .2458           13.9 2.446                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.97    .2478      .4462         0                  432 741.4 595.3  
   external       0          0  20.4    .2534      .4140         0                                   
   internal       0          0 11.54    .2422      .4784         0
23:51:19 | saving model checkp

23:52:00 | running eval: valid
23:52:08 | eval completed in 7.81s
23:52:08 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007401 39.61   519 428.7       0          0 13.21  102 .2360    .2006 14.75 2.721 5e-07 222.4 183.7   
   external         0 .007226 46.77                   0          0         22 .2257          15.59 3.006                     
   internal         0 .007577 32.44                   0          0         80 .2463           13.9 2.437                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.82    .2392      .4443         0                  496 741.4 612.4  
   external       0          0  20.2    .2339      .4111         0                                   
   internal       0          0 11.44    .2444      .4775         0
23:52:08 | saving model checkp



23:52:13 | running eval: valid
23:52:21 | eval completed in 7.85s
23:52:21 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007401 39.61   519 427.7       0          0 13.19  102 .2379    .1935 14.75 2.719 5e-07 222.4 183.3   
   external         0 .007226 46.77                   0          0         22 .2301          15.59 3.003                     
   internal         0 .007577 32.44                   0          0         80 .2457           13.9 2.435                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 15.78    .2423      .4447         0                  512 741.4  611  
   external       0          0 20.15    .2390      .4111         0                                  
   internal       0          0 11.42    .2457      .4784         0
23:52:21 | saving model checkpoin

23:53:01 | running eval: valid
23:53:09 | eval completed in 7.73s
23:53:09 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007401 39.61   519 435.6       0          0 13.43  102 .2399    .2006 14.75 2.712 5e-07 222.4 186.7   
   external         0 .007226 46.77                   0          0         22 .2316          15.59 2.996                     
   internal         0 .007577 32.44                   0          0         80 .2482           13.9 2.428                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.67    .2455      .4471         0                  576 741.4 622.3  
   external       0          0    20    .2435      .4140         0                                   
   internal       0          0 11.34    .2475      .4802         0
23:53:09 | saving model checkp

23:53:52 | running eval: valid
23:53:59 | eval completed in 7.65s
23:53:59 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007402 39.61   519 439.6       0          0 13.55  102 .2395    .2006 14.75 2.705 5e-07 222.4 188.4   
   external         0 .007226 46.77                   0          0         22 .2316          15.59 2.989                     
   internal         0 .007579 32.44                   0          0         80 .2474           13.9 2.421                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.56    .2453      .4467         0                  640 741.4 627.9  
   external       0          0 19.87    .2435      .4140         0                                   
   internal       0          0 11.26    .2471      .4793         0
23:53:59 | saving model checkp

23:54:41 | running eval: valid
23:54:48 | eval completed in 7.66s
23:54:48 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007402 39.61   519 439.5       0          0 13.55  102 .2355    .2004 14.75 2.698 5e-07 222.4 188.3   
   external         0 .007226 46.77                   0          0         22 .2269          15.59 2.982                     
   internal         0 .007579 32.44                   0          0         80 .2441           13.9 2.415                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.46    .2412      .4480         0                  704 741.4 627.9  
   external       0          0 19.74    .2390      .4140         0                                   
   internal       0          0 11.18    .2434      .4820         0
23:54:49 | saving model checkp



23:54:53 | running eval: valid
23:55:00 | eval completed in 7.67s
23:55:00 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007402 39.61   519 438.7       0          0 13.52  102 .2328    .1935 14.75 2.697 5e-07 222.4   188   
   external         0 .007226 46.77                   0          0         22 .2215          15.59  2.98                     
   internal         0 .007579 32.44                   0          0         80 .2441           13.9 2.413                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.43    .2387      .4471         0                  720 741.4 626.7  
   external       0          0 19.69    .2339      .4140         0                                   
   internal       0          0 11.17    .2434      .4802         0
23:55:00 | saving model checkp

23:55:40 | running eval: valid
23:55:48 | eval completed in 7.62s
23:55:48 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007405 39.61   519 444.8       0          0 13.71  102 .2424    .2005 14.75 2.691 5e-07 222.4 190.6   
   external         0 .007226 46.77                   0          0         22 .2254          15.59 2.974                     
   internal         0 .007585 32.44                   0          0         80 .2594           13.9 2.408                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.34    .2482      .4480         0                  784 741.4 635.4  
   external       0          0 19.57    .2385      .4140         0                                   
   internal       0          0 11.11    .2579      .4820         0
23:55:48 | saving model checkp

23:56:30 | running eval: valid
23:56:37 | eval completed in 7.57s
23:56:37 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007404 39.61   519 447.2       0          0 13.78  102 .2403    .2005 14.75 2.686 5e-07 222.4 191.6   
   external         0 .007226 46.77                   0          0         22 .2254          15.59 2.969                     
   internal         0 .007583 32.44                   0          0         80 .2553           13.9 2.403                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.26    .2468      .4499         0                  848 741.4 638.8  
   external       0          0 19.47    .2385      .4169         0                                   
   internal       0          0 11.05    .2552      .4829         0
23:56:37 | saving model checkp

23:57:19 | running eval: valid
23:57:27 | eval completed in 7.52s
23:57:27 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007405 39.61   519 451.3       0          0 13.91  102 .2367    .2003 14.75 2.682 5e-07 222.4 193.4   
   external         0 .007226 46.77                   0          0         22 .2194          15.59 2.965                     
   internal         0 .007584 32.44                   0          0         80 .2541           13.9 2.398                     
             ltrunc  ltrunclen  ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.2    .2416      .4509         0                  912 741.4 644.6  
   external       0          0 19.4    .2309      .4198         0                                   
   internal       0          0   11    .2523      .4820         0
23:57:27 | saving model checkpoint

23:58:07 | running eval: valid
23:58:15 | eval completed in 7.60s
23:58:15 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007406 39.61   519 449.1       0          0 13.84  102 .2414    .2005 14.75 2.677 5e-07 222.4 192.4   
   external         0 .007226 46.77                   0          0         22 .2194          15.59 2.959                     
   internal         0 .007586 32.44                   0          0         80 .2634           13.9 2.394                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.12    .2449      .4547         0                  976 741.4 641.5  
   external       0          0 19.29    .2309      .4257         0                                   
   internal       0          0 10.95    .2589      .4838         0
23:58:15 | saving model checkp

23:58:56 | running eval: valid
23:59:04 | eval completed in 7.53s
23:59:04 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 .007408 39.61   519   449       0          0 13.84  102 .2420    .2006 14.75 2.673 5e-07 222.4 192.4   
   external         0 .007226 46.77                   0          0         22 .2194          15.59 2.955                     
   internal         0  .00759 32.44                   0          0         80 .2647           13.9  2.39                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 15.06    .2458      .4552         0                 1040 741.4 641.4  
   external       0          0  19.2    .2309      .4257         0                                   
   internal       0          0 10.91    .2607      .4847         0
23:59:04 | saving model checkp

23:59:43 | running eval: valid
23:59:51 | eval completed in 7.41s
23:59:51 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01832 39.61   519 457.6       0          0 14.11  102 .2516    .2005 14.75 2.669 5e-07 222.4 196.1   
   external         0  .02347 46.77                   0          0         22 .2402          15.59 2.951                     
   internal         0  .01317 32.44                   0          0         80 .2629           13.9 2.386                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0    15    .2565      .4571         0                 1104 741.4 653.8  
   external       0          0 19.13    .2509      .4286         0                                   
   internal       0          0 10.87    .2621      .4856         0
23:59:51 | saving model checkp

00:00:31 | running eval: valid
00:00:38 | eval completed in 7.40s
00:00:38 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01832 39.61   519   458       0          0 14.12  102 .2477    .2004 14.75 2.665 5e-07 222.4 196.2   
   external         0  .02347 46.77                   0          0         22 .2395          15.59 2.948                     
   internal         0  .01316 32.44                   0          0         80 .2560           13.9 2.382                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.94    .2519      .4575         0                 1168 741.4 654.2  
   external       0          0 19.06    .2484      .4286         0                                   
   internal       0          0 10.83    .2554      .4865         0
00:00:38 | saving model checkp

00:01:20 | running eval: valid
00:01:27 | eval completed in 7.37s
00:01:27 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01832 39.61   519 461.1       0          0 14.21  102 .2513    .2004 14.75 2.661 5e-07 222.4 197.6   
   external         0  .02347 46.77                   0          0         22 .2456          15.59 2.942                     
   internal         0  .01317 32.44                   0          0         80 .2571           13.9 2.379                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.88    .2565      .4571         0                 1232 741.4 658.6  
   external       0          0 18.96    .2559      .4286         0                                   
   internal       0          0 10.79    .2571      .4856         0
00:01:27 | saving model checkp

00:02:06 | running eval: valid
00:02:13 | eval completed in 7.48s
00:02:13 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01972 39.61   519 452.6       0          0 13.95  102 .2547    .2004 14.75 2.657 5e-07 222.4   194   
   external         0  .02347 46.77                   0          0         22 .2456          15.59 2.939                     
   internal         0  .01597 32.44                   0          0         80 .2639           13.9 2.376                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.83    .2613      .4594         0                 1296 741.4 646.6  
   external       0          0  18.9    .2559      .4315         0                                   
   internal       0          0 10.76    .2667      .4874         0
00:02:13 | saving model checkp

00:02:55 | running eval: valid
00:03:02 | eval completed in 7.33s
00:03:02 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02089 39.61   519   458       0          0 14.12  102 .2562    .2005 14.75 2.654 5e-07 222.4 196.2   
   external         0  .02397 46.77                   0          0         22 .2483          15.59 2.936                     
   internal         0  .01782 32.44                   0          0         80 .2642           13.9 2.373                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.78    .2677      .4628         0                 1360 741.4 654.2  
   external       0          0 18.83    .2670      .4373         0                                   
   internal       0          0 10.73    .2684      .4883         0
00:03:02 | saving model checkp

00:03:43 | running eval: valid
00:03:50 | eval completed in 7.32s
00:03:50 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02089 39.61   519 458.5       0          0 14.14  102 .2509    .2006 14.75 2.651 5e-07 222.4 196.5   
   external         0  .02397 46.77                   0          0         22 .2326          15.59 2.933                     
   internal         0  .01782 32.44                   0          0         80 .2692           13.9  2.37                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 14.74    .2595      .4633         0                 1424 741.4  655  
   external       0          0 18.78    .2465      .4373         0                                  
   internal       0          0  10.7    .2726      .4892         0
00:03:50 | saving model checkpoin

00:04:28 | running eval: valid
00:04:36 | eval completed in 7.22s
00:04:36 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02806 39.61   519 466.4       0          0 14.38  102 .2553    .2005 14.75 2.649 5e-07 222.4 199.9   
   external         0  .03831 46.77                   0          0         22 .2445          15.59 2.931                     
   internal         0  .01782 32.44                   0          0         80 .2660           13.9 2.367                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  14.7    .2686      .4624         0                 1488 741.4 666.3  
   external       0          0 18.74    .2645      .4373         0                                   
   internal       0          0 10.67    .2727      .4874         0
00:04:36 | saving model checkp

00:05:17 | running eval: valid
00:05:24 | eval completed in 7.22s
00:05:24 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03074 39.61   519 465.8       0          0 14.36  102 .2575    .2006 14.75 2.646 5e-07 222.4 199.6   
   external         0  .03831 46.77                   0          0         22 .2463          15.59 2.927                     
   internal         0  .02318 32.44                   0          0         80 .2688           13.9 2.365                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.66    .2722      .4633         0                 1552 741.4 665.4  
   external       0          0 18.67    .2686      .4373         0                                   
   internal       0          0 10.64    .2758      .4892         0
00:05:24 | saving model checkp

00:06:04 | running eval: valid
00:06:12 | eval completed in 7.29s
00:06:12 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03286 39.61   519 462.6       0          0 14.26  102 .2551    .2005 14.75 2.643 5e-07 222.4 198.2   
   external         0  .03831 46.77                   0          0         22 .2445          15.59 2.924                     
   internal         0  .02741 32.44                   0          0         80 .2657           13.9 2.362                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.62    .2682      .4629         0                 1616 741.4 660.8  
   external       0          0 18.62    .2645      .4402         0                                   
   internal       0          0 10.61    .2719      .4856         0
00:06:12 | saving model checkp

00:06:50 | running eval: valid
00:06:58 | eval completed in 7.26s
00:06:58 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02993 39.61   519 466.1       0          0 14.37  102 .2595    .2009 14.75  2.64 5e-07 222.4 199.7   
   external         0  .03831 46.77                   0          0         22 .2445          15.59 2.921                     
   internal         0  .02155 32.44                   0          0         80 .2745           13.9  2.36                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.57    .2721      .4629         0                 1680 741.4 665.9  
   external       0          0 18.56    .2645      .4402         0                                   
   internal       0          0 10.59    .2797      .4856         0
00:06:58 | saving model checkp

00:07:38 | running eval: valid
00:07:45 | eval completed in 7.22s
00:07:45 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03286 39.61   519 468.4       0          0 14.44  102 .2584    .2005 14.75 2.638 5e-07 222.4 200.7   
   external         0  .03831 46.77                   0          0         22 .2450          15.59 2.918                     
   internal         0  .02741 32.44                   0          0         80 .2718           13.9 2.358                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.54    .2717      .4610         0                 1744 741.4 669.2  
   external       0          0  18.5    .2645      .4373         0                                   
   internal       0          0 10.57    .2790      .4847         0
00:07:46 | saving model checkp

00:08:26 | running eval: valid
00:08:33 | eval completed in 7.18s
00:08:33 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03169 39.61   519 469.8       0          0 14.48  102 .2493    .2006 14.75 2.635 5e-07 222.4 201.3   
   external         0   .0306 46.77                   0          0         22 .2248          15.59 2.915                     
   internal         0  .03277 32.44                   0          0         80 .2738           13.9 2.356                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  14.5    .2590      .4610         0                 1808 741.4 671.2  
   external       0          0 18.44    .2383      .4373         0                                   
   internal       0          0 10.55    .2796      .4847         0
00:08:33 | saving model checkp

00:09:11 | running eval: valid
00:09:18 | eval completed in 7.19s
00:09:18 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02977 39.61   519 467.5       0          0 14.41  102 .2390    .2004 14.75 2.634 5e-07 222.4 200.3   
   external         0   .0306 46.77                   0          0         22 .2073          15.59 2.913                     
   internal         0  .02893 32.44                   0          0         80 .2707           13.9 2.354                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.47    .2530      .4610         0                 1872 741.4 667.8  
   external       0          0 18.41    .2318      .4373         0                                   
   internal       0          0 10.53    .2742      .4847         0
00:09:19 | saving model checkp

00:10:00 | running eval: valid
00:10:08 | eval completed in 7.22s
00:10:08 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02901 39.61   519   466       0          0 14.36  102 .2426    .2004 14.75 2.632 5e-07 222.4 199.7   
   external         0   .0306 46.77                   0          0         22 .2115          15.59 2.912                     
   internal         0  .02741 32.44                   0          0         80 .2736           13.9 2.352                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.45    .2579      .4610         0                 1936 741.4 665.6  
   external       0          0 18.39    .2353      .4373         0                                   
   internal       0          0 10.51    .2805      .4847         0
00:10:08 | saving model checkp

00:10:49 | running eval: valid
00:10:56 | eval completed in 7.21s
00:10:56 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03117 39.61   519 466.3       0          0 14.37  102 .2394    .2004 14.75  2.63 5e-07 222.4 199.8   
   external         0   .0306 46.77                   0          0         22 .2073          15.59 2.909                     
   internal         0  .03173 32.44                   0          0         80 .2714           13.9 2.351                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.42    .2552      .4610         0                 2000 741.4 666.1  
   external       0          0 18.34    .2318      .4373         0                                   
   internal       0          0 10.49    .2786      .4847         0
00:10:56 | saving model checkp

00:11:35 | running eval: valid
00:11:42 | eval completed in 7.06s
00:11:42 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02479 39.61   519 482.1       0          0 14.86  102 .2363    .2003 14.75 2.628 5e-07 222.4 206.6   
   external         0  .01626 46.77                   0          0         22 .1957          15.59 2.907                     
   internal         0  .03333 32.44                   0          0         80 .2769           13.9 2.349                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.39    .2514      .4606         0                 2064 741.4 688.8  
   external       0          0  18.3    .2217      .4373         0                                   
   internal       0          0 10.48    .2810      .4838         0
00:11:42 | saving model checkp

00:12:21 | running eval: valid
00:12:28 | eval completed in 7.11s
00:12:28 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02884 39.61   519 476.9       0          0  14.7  102 .2385    .2006 14.75 2.626 5e-07 222.4 204.4   
   external         0   .0306 46.77                   0          0         22 .2073          15.59 2.905                     
   internal         0  .02708 32.44                   0          0         80 .2697           13.9 2.347                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.36    .2538      .4616         0                 2128 741.4 681.3  
   external       0          0 18.26    .2318      .4402         0                                   
   internal       0          0 10.46    .2759      .4829         0
00:12:28 | saving model checkp

00:13:07 | running eval: valid
00:13:14 | eval completed in 7.22s
00:13:14 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03178 39.61   519 468.3       0          0 14.44  102 .2426    .2008 14.75 2.624 5e-07 222.4 200.7   
   external         0   .0306 46.77                   0          0         22 .2133          15.59 2.902                     
   internal         0  .03295 32.44                   0          0         80 .2719           13.9 2.345                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 14.33    .2576      .4616         0                 2192 741.4  669  
   external       0          0 18.22    .2394      .4402         0                                  
   internal       0          0 10.44    .2759      .4829         0
00:13:14 | saving model checkpoin

00:13:55 | running eval: valid
00:14:02 | eval completed in 7.09s
00:14:02 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03178 39.61   519 476.7       0          0  14.7  102 .2426    .2006 14.75 2.622 5e-07 222.4 204.3   
   external         0   .0306 46.77                   0          0         22 .2119          15.59   2.9                     
   internal         0  .03295 32.44                   0          0         80 .2733           13.9 2.344                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 14.29    .2576      .4616         0                 2256 741.4  681  
   external       0          0 18.17    .2394      .4402         0                                  
   internal       0          0 10.42    .2759      .4829         0
00:14:02 | saving model checkpoin



00:14:17 | running eval: valid
00:14:25 | eval completed in 7.13s
00:14:25 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03178 39.61   519 473.6       0          0  14.6  102 .2417    .2004 14.75 2.621 5e-07 222.4   203   
   external         0   .0306 46.77                   0          0         22 .2119          15.59 2.899                     
   internal         0  .03295 32.44                   0          0         80 .2715           13.9 2.343                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.28    .2576      .4616         0                 2288 741.4 676.6  
   external       0          0 18.15    .2394      .4402         0                                   
   internal       0          0 10.42    .2759      .4829         0
00:14:25 | saving model checkp

00:15:05 | running eval: valid
00:15:12 | eval completed in 7.12s
00:15:12 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03251 39.61   519 474.8       0          0 14.64  102 .2417    .2005 14.75 2.619 5e-07 222.4 203.4   
   external         0  .03059 46.77                   0          0         22 .2106          15.59 2.896                     
   internal         0  .03443 32.44                   0          0         80 .2727           13.9 2.342                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.25    .2537      .4625         0                 2352 741.4 678.2  
   external       0          0 18.11    .2280      .4402         0                                   
   internal       0          0  10.4    .2794      .4847         0
00:15:12 | saving model checkp

00:15:51 | running eval: valid
00:15:58 | eval completed in 7.08s
00:15:58 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03251 39.61   519 474.2       0          0 14.62  102 .2334    .2006 14.75 2.617 5e-07 222.4 203.2   
   external         0  .03059 46.77                   0          0         22 .1955          15.59 2.895                     
   internal         0  .03443 32.44                   0          0         80 .2712           13.9  2.34                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.23    .2491      .4625         0                 2416 741.4 677.5  
   external       0          0 18.09    .2229      .4402         0                                   
   internal       0          0 10.38    .2753      .4847         0
00:15:58 | saving model checkp

00:16:38 | running eval: valid
00:16:45 | eval completed in 7.10s
00:16:45 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03252 39.61   519 473.2       0          0 14.59  102 .2507    .2004 14.75 2.616 5e-07 222.4 202.8   
   external         0  .03061 46.77                   0          0         22 .2204          15.59 2.893                     
   internal         0  .03443 32.44                   0          0         80 .2811           13.9 2.338                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 14.21    .2647      .4629         0                 2480 741.4  676  
   external       0          0 18.05    .2435      .4402         0                                  
   internal       0          0 10.36    .2859      .4856         0
00:16:45 | saving model checkpoin

00:17:25 | running eval: valid
00:17:32 | eval completed in 7.12s
00:17:32 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03252 39.61   519 471.3       0          0 14.53  102 .2496    .2006 14.75 2.614 5e-07 222.4   202   
   external         0  .03061 46.77                   0          0         22 .2204          15.59 2.891                     
   internal         0  .03443 32.44                   0          0         80 .2789           13.9 2.337                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.18    .2653      .4629         0                 2544 741.4 673.3  
   external       0          0    18    .2435      .4402         0                                   
   internal       0          0 10.35    .2871      .4856         0
00:17:32 | saving model checkp

00:18:12 | running eval: valid
00:18:19 | eval completed in 7.14s
00:18:19 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03252 39.61   519 469.8       0          0 14.48  102 .2484    .2004 14.75 2.612 5e-07 222.4 201.3   
   external         0  .03061 46.77                   0          0         22 .2239          15.59 2.889                     
   internal         0  .03443 32.44                   0          0         80 .2729           13.9 2.336                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.16    .2635      .4629         0                 2608 741.4 671.1  
   external       0          0 17.97    .2460      .4402         0                                   
   internal       0          0 10.34    .2810      .4856         0
00:18:19 | saving model checkp

00:18:58 | running eval: valid
00:19:06 | eval completed in 7.13s
00:19:06 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03252 39.61   519 471.7       0          0 14.54  102 .2524    .2006 14.75 2.611 5e-07 222.4 202.1   
   external         0  .03061 46.77                   0          0         22 .2292          15.59 2.887                     
   internal         0  .03443 32.44                   0          0         80 .2757           13.9 2.335                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.14    .2667      .4629         0                 2672 741.4 673.9  
   external       0          0 17.95    .2511      .4402         0                                   
   internal       0          0 10.33    .2824      .4856         0
00:19:06 | saving model checkp

00:19:46 | running eval: valid
00:19:53 | eval completed in 7.10s
00:19:53 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03252 39.61   519 473.6       0          0  14.6  102 .2524    .2006 14.75  2.61 5e-07 222.4 202.9   
   external         0  .03061 46.77                   0          0         22 .2292          15.59 2.886                     
   internal         0  .03442 32.44                   0          0         80 .2757           13.9 2.333                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.11    .2664      .4629         0                 2736 741.4 676.6  
   external       0          0 17.91    .2511      .4402         0                                   
   internal       0          0 10.31    .2817      .4856         0
00:19:53 | saving model checkp

00:20:31 | running eval: valid
00:20:38 | eval completed in 7.17s
00:20:38 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03252 39.61   519 465.8       0          0 14.36  102 .2493    .2005 14.75 2.608 5e-07 222.4 199.6   
   external         0  .03061 46.77                   0          0         22 .2239          15.59 2.884                     
   internal         0  .03442 32.44                   0          0         80 .2748           13.9 2.333                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0  14.1    .2638      .4619         0                 2800 741.4 665.4  
   external       0          0 17.88    .2460      .4373         0                                   
   internal       0          0 10.31    .2817      .4865         0
00:20:38 | saving model checkp

00:21:16 | running eval: valid
00:21:23 | eval completed in 7.07s
00:21:23 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03252 39.61   519 475.1       0          0 14.65  102 .2498    .2005 14.75 2.607 5e-07 222.4 203.6   
   external         0  .03061 46.77                   0          0         22 .2239          15.59 2.883                     
   internal         0  .03442 32.44                   0          0         80 .2757           13.9 2.332                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.08    .2628      .4624         0                 2864 741.4 678.7  
   external       0          0 17.87    .2460      .4373         0                                   
   internal       0          0  10.3    .2796      .4874         0
00:21:23 | saving model checkp

00:22:03 | running eval: valid
00:22:10 | eval completed in 7.10s
00:22:10 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03251 39.61   519 472.6       0          0 14.57  102 .2521    .2004 14.75 2.606 5e-07 222.4 202.5   
   external         0   .0306 46.77                   0          0         22 .2227          15.59 2.882                     
   internal         0  .03443 32.44                   0          0         80 .2816           13.9 2.331                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.07    .2656      .4628         0                 2928 741.4 675.2  
   external       0          0 17.86    .2460      .4373         0                                   
   internal       0          0 10.28    .2851      .4883         0
00:22:11 | saving model checkp

00:22:49 | running eval: valid
00:22:56 | eval completed in 7.11s
00:22:56 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .03251 39.61   519 472.7       0          0 14.57  102 .2517    .2006 14.75 2.605 5e-07 222.4 202.6   
   external         0   .0306 46.77                   0          0         22 .2227          15.59 2.881                     
   internal         0  .03443 32.44                   0          0         80 .2808           13.9  2.33                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.05    .2647      .4633         0                 2992 741.4 675.3  
   external       0          0 17.83    .2460      .4373         0                                   
   internal       0          0 10.27    .2834      .4892         0
00:22:56 | saving model checkp

00:23:35 | running eval: valid
00:23:42 | eval completed in 7.06s
00:23:42 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02439 39.61   519   476       0          0 14.68  102 .2444    .2004 14.75 2.603 5e-07 222.4   204   
   external         0  .01435 46.77                   0          0         22 .2095          15.59 2.879                     
   internal         0  .03442 32.44                   0          0         80 .2793           13.9 2.328                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 14.03    .2572      .4623         0                 3056 741.4  680  
   external       0          0 17.79    .2295      .4344         0                                  
   internal       0          0 10.26    .2850      .4901         0
00:23:42 | saving model checkpoin

00:24:22 | running eval: valid
00:24:29 | eval completed in 7.25s
00:24:29 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02439 39.61   519 463.7       0          0 14.29  102 .2448    .2004 14.75 2.603 5e-07 222.4 198.7   
   external         0  .01435 46.77                   0          0         22 .2095          15.59 2.878                     
   internal         0  .03443 32.44                   0          0         80 .2801           13.9 2.328                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 14.01    .2563      .4618         0                 3120 741.4 662.3  
   external       0          0 17.78    .2295      .4344         0                                   
   internal       0          0 10.25    .2832      .4892         0
00:24:29 | saving model checkp

00:25:06 | running eval: valid
00:25:13 | eval completed in 7.06s
00:25:13 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02003 39.61   519 474.6       0          0 14.63  102 .2414    .2003 14.75 2.601 5e-07 222.4 203.4   
   external         0  .01435 46.77                   0          0         22 .2095          15.59 2.876                     
   internal         0  .02572 32.44                   0          0         80 .2732           13.9 2.326                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 13.99    .2547      .4618         0                 3184 741.4  678  
   external       0          0 17.75    .2295      .4344         0                                  
   internal       0          0 10.24    .2799      .4892         0
00:25:13 | saving model checkpoin

00:25:52 | running eval: valid
00:25:59 | eval completed in 7.07s
00:25:59 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02004 39.61   519 474.8       0          0 14.64  102 .2491    .2005 14.75   2.6 5e-07 222.4 203.4   
   external         0  .01437 46.77                   0          0         22 .2268          15.59 2.874                     
   internal         0  .02572 32.44                   0          0         80 .2715           13.9 2.325                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.97    .2605      .4627         0                 3248 741.4 678.2  
   external       0          0 17.71    .2425      .4344         0                                   
   internal       0          0 10.23    .2786      .4910         0
00:25:59 | saving model checkp

00:26:38 | running eval: valid
00:26:45 | eval completed in 7.04s
00:26:45 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02004 39.61   519 478.6       0          0 14.75  102 .2519    .2004 14.75 2.599 5e-07 222.4 205.1   
   external         0  .01437 46.77                   0          0         22 .2344          15.59 2.873                     
   internal         0  .02572 32.44                   0          0         80 .2694           13.9 2.324                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.96    .2640      .4641         0                 3312 741.4 683.7  
   external       0          0  17.7    .2490      .4344         0                                   
   internal       0          0 10.22    .2790      .4937         0
00:26:45 | saving model checkp

00:27:23 | running eval: valid
00:27:30 | eval completed in 7.02s
00:27:30 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02004 39.61   519 479.1       0          0 14.77  102 .2521    .2005 14.75 2.597 5e-07 222.4 205.3   
   external         0  .01437 46.77                   0          0         22 .2344          15.59 2.872                     
   internal         0  .02572 32.44                   0          0         80 .2698           13.9 2.323                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.94    .2636      .4636         0                 3376 741.4 684.4  
   external       0          0 17.67    .2490      .4344         0                                   
   internal       0          0 10.21    .2782      .4928         0
00:27:30 | saving model checkp

00:28:09 | running eval: valid
00:28:16 | eval completed in 7.21s
00:28:16 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02004 39.61   519 463.6       0          0 14.29  102 .2518    .2004 14.75 2.596 5e-07 222.4 198.7   
   external         0  .01437 46.77                   0          0         22 .2344          15.59 2.871                     
   internal         0  .02572 32.44                   0          0         80 .2692           13.9 2.322                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.92    .2626      .4636         0                 3440 741.4 662.3  
   external       0          0 17.65    .2490      .4344         0                                   
   internal       0          0 10.19    .2762      .4928         0
00:28:16 | saving model checkp

00:28:57 | running eval: valid
00:29:04 | eval completed in 7.05s
00:29:04 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02146 39.61   519 474.6       0          0 14.63  102 .2539    .2004 14.75 2.595 5e-07 222.4 203.4   
   external         0  .01437 46.77                   0          0         22 .2313          15.59 2.869                     
   internal         0  .02856 32.44                   0          0         80 .2766           13.9 2.321                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0  13.9    .2621      .4636         0                 3504 741.4  678  
   external       0          0 17.63    .2430      .4344         0                                  
   internal       0          0 10.18    .2811      .4928         0
00:29:04 | saving model checkpoin

00:29:42 | running eval: valid
00:29:49 | eval completed in 7.02s
00:29:49 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0   .0208 39.61   519 475.7       0          0 14.67  102 .2502    .2006 14.75 2.593 5e-07 222.4 203.9   
   external         0  .01437 46.77                   0          0         22 .2237          15.59 2.867                     
   internal         0  .02723 32.44                   0          0         80 .2767           13.9 2.319                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.87    .2590      .4645         0                 3568 741.4 679.6  
   external       0          0 17.58    .2365      .4344         0                                   
   internal       0          0 10.17    .2814      .4946         0
00:29:49 | saving model checkp

00:30:27 | running eval: valid
00:30:34 | eval completed in 7.07s
00:30:34 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02151 39.61   519 470.5       0          0 14.51  102 .2497    .2008 14.75 2.593 5e-07 222.4 201.6   
   external         0  .01437 46.77                   0          0         22 .2237          15.59 2.867                     
   internal         0  .02865 32.44                   0          0         80 .2757           13.9 2.318                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.87    .2602      .4636         0                 3632 741.4 672.2  
   external       0          0 17.58    .2365      .4344         0                                   
   internal       0          0 10.16    .2838      .4928         0
00:30:34 | saving model checkp

00:31:12 | running eval: valid
00:31:19 | eval completed in 6.97s
00:31:19 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02151 39.61   519   479       0          0 14.77  102 .2504    .2006 14.75 2.592 5e-07 222.4 205.2   
   external         0  .01437 46.77                   0          0         22 .2237          15.59 2.865                     
   internal         0  .02865 32.44                   0          0         80 .2770           13.9 2.318                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.85    .2589      .4632         0                 3696 741.4 684.2  
   external       0          0 17.56    .2365      .4344         0                                   
   internal       0          0 10.15    .2813      .4919         0
00:31:19 | saving model checkp

00:31:58 | running eval: valid
00:32:05 | eval completed in 6.99s
00:32:05 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01935 39.61   519 476.5       0          0 14.69  102 .2484    .2007 14.75 2.589 5e-07 222.4 204.2   
   external         0  .01437 46.77                   0          0         22 .2237          15.59 2.862                     
   internal         0  .02433 32.44                   0          0         80 .2731           13.9 2.316                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.81    .2591      .4632         0                 3760 741.4 680.6  
   external       0          0 17.49    .2365      .4344         0                                   
   internal       0          0 10.14    .2817      .4919         0
00:32:05 | saving model checkp

00:32:42 | running eval: valid
00:32:50 | eval completed in 7.09s
00:32:50 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01935 39.61   519 469.2       0          0 14.46  102 .2548    .2006 14.75 2.588 5e-07 222.4 201.1   
   external         0  .01437 46.77                   0          0         22 .2338          15.59 2.861                     
   internal         0  .02433 32.44                   0          0         80 .2758           13.9 2.316                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.81    .2633      .4645         0                 3824 741.4 670.2  
   external       0          0 17.48    .2449      .4344         0                                   
   internal       0          0 10.13    .2817      .4946         0
00:32:50 | saving model checkp

00:33:29 | running eval: valid
00:33:36 | eval completed in 6.99s
00:33:36 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01935 39.61   519   478       0          0 14.73  102 .2539    .2005 14.75 2.587 5e-07 222.4 204.8   
   external         0  .01437 46.77                   0          0         22 .2338          15.59  2.86                     
   internal         0  .02433 32.44                   0          0         80 .2741           13.9 2.314                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.79    .2656      .4660         0                 3888 741.4 682.8  
   external       0          0 17.46    .2449      .4373         0                                   
   internal       0          0 10.12    .2863      .4946         0
00:33:36 | saving model checkp

00:34:15 | running eval: valid
00:34:22 | eval completed in 7.04s
00:34:22 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02747 39.61   519 476.9       0          0  14.7  102 .2604    .2005 14.75 2.586 5e-07 222.4 204.4   
   external         0  .03061 46.77                   0          0         22 .2448          15.59 2.858                     
   internal         0  .02433 32.44                   0          0         80 .2761           13.9 2.314                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.77    .2719      .4674         0                 3952 741.4 681.3  
   external       0          0 17.43    .2579      .4402         0                                   
   internal       0          0 10.11    .2860      .4946         0
00:34:22 | saving model checkp

00:34:59 | running eval: valid
00:35:06 | eval completed in 6.99s
00:35:06 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .02747 39.61   519 480.9       0          0 14.83  102 .2596    .2004 14.75 2.585 5e-07 222.4 206.1   
   external         0  .03061 46.77                   0          0         22 .2420          15.59 2.858                     
   internal         0  .02433 32.44                   0          0         80 .2772           13.9 2.313                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb  tps  
   all            0          0 13.76    .2720      .4674         0                 4016 741.4  687  
   external       0          0 17.42    .2559      .4402         0                                  
   internal       0          0  10.1    .2881      .4946         0
00:35:06 | saving model checkpoin

00:35:45 | running eval: valid
00:35:52 | eval completed in 7.03s
00:35:52 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0  .01935 39.61   519   478       0          0 14.73  102 .2545    .2006 14.75 2.585 5e-07 222.4 204.8   
   external         0  .01437 46.77                   0          0         22 .2288          15.59 2.857                     
   internal         0  .02433 32.44                   0          0         80 .2802           13.9 2.313                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.76    .2639      .4660         0                 4080 741.4 682.8  
   external       0          0 17.41    .2394      .4373         0                                   
   internal       0          0  10.1    .2884      .4946         0
00:35:52 | saving model checkp

00:36:27 | running eval: valid
00:36:34 | eval completed in 7.05s
00:36:34 | valid:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  ltps  \
   all              0  .01935 39.61   519   477       0          0  14.7  102 .2541    .2005 14.75 2.585 2.5e-07 222.4 204.4   
   external         0  .01437 46.77                   0          0         22 .2316          15.59 2.857                       
   internal         0  .02433 32.44                   0          0         80 .2766           13.9 2.312                       
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 13.75    .2638      .4660         0                 4144 741.4 681.4  
   external       0          0  17.4    .2414      .4373         0                                   
   internal       0          0  10.1    .2861      .4946         0
00:36:34 | saving mode

00:37:14 | time:3128s total_exs:67328 total_steps:4208 epochs:69.63
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss      lr  ltpb  \
   all      39.85     1 640.8  4174       0          0 104.2  256              2048  10.13    .2531  14.9 2.089 2.5e-07 234.5   
   external  39.3                         0          0         81                                   15.58 2.125                 
   internal  40.4                         0          0        175                                   14.23 2.054                 
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb  tps   ups  
   all       1528       0          0 8.085      .5389   .005714                 4208 875.3 5702 6.516  
   external             0          0 8.375      .5436         0                                        
   internal             0          0 7.796      .5341    .01143

00:37:14 | running eval: valid
00:37:21

00:37:52 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:37:54 | did not beat best ppl: 13.7452 impatience: 3
00:37:54 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:37:56 | time:3171s total_exs:68352 total_steps:4272 epochs:70.68
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss      lr  ltpb  \
   all      37.54     1 612.4  6969       0          0 182.1  256              2048  10.49    .2595 14.79 2.211 2.5e-07 233.2   
   external 35.62                         0          0         79                                   15.35 2.262                 
   internal 39.46                         0          0        177                                   14.23 2.159                 
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   t

00:38:36 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:38:37 | new best ppl: 13.74 (previous best was 13.74)
00:38:37 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model
00:38:40 | time:3214s total_exs:69376 total_steps:4336 epochs:71.74
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss       lr  \
   all      40.34     1 637.8  8205       0          0 205.8  256              4096  10.81    .2307  13.7 2.196 1.25e-07   
   external 41.53                         0          0         76                                   13.66 2.247            
   internal 39.16                         0          0        180                                   13.73 2.144            
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb   tps   ups  
   a

00:39:20 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
Epoch   274: reducing learning rate of group 0 to 6.2500e-08.
00:39:21 | new best ppl: 13.74 (previous best was 13.74)
00:39:21 | saving best valid model: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model
00:39:25 | time:3259s total_exs:70400 total_steps:4400 epochs:72.80
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss       lr  \
   all      42.03     1 669.3  4517       0          0   108  256              4096  10.29    .2359 14.24 2.047 6.25e-08   
   external  42.8                         0          0         96                                   14.48 2.019            
   internal 41.25                         0          0        160                                   14.01 2.074            
             ltpb  ltps  ltrunc  ltrunclen   ppl  tok

00:40:02 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
Epoch   278: reducing learning rate of group 0 to 3.1250e-08.
00:40:03 | did not beat best ppl: 13.7365 impatience: 4
00:40:03 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:40:06 | time:3300s total_exs:71424 total_steps:4464 epochs:73.86
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      40.17     1 641.9  7230       0          0 180.2  256              4096   10.3    .2556 14.53  2.14 3.125e-08   
   external 40.35                         0          0         97                                   15.37 2.199             
   internal 39.98                         0          0        159                                   13.69 2.082             
             ltpb  ltps  ltrunc  ltrunclen

00:40:47 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:40:48 | did not beat best ppl: 13.7337 impatience: 1
00:40:50 | time:3344s total_exs:72448 total_steps:4528 epochs:74.92
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      38.99     1   625  7384       0          0   189  256              4096  10.53    .2531 14.11 2.182 3.125e-08   
   external 38.75                         0          0         87                                   14.86 2.317             
   internal 39.22                         0          0        169                                   13.35 2.047             
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb   tps   ups  
   all      221.8  2621       0          0 8.944      .5327         0                 4528 846.8 10005 11.82  
   external   

00:41:28 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:41:29 | did not beat best ppl: 13.7337 impatience: 5
00:41:31 | time:3385s total_exs:73472 total_steps:4592 epochs:75.98
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      39.29     1 628.4  8182       0          0 208.3  256              4096  9.917    .2574 16.08 2.184 1.562e-08   
   external 39.35                         0          0         89                                   17.65 2.332             
   internal 39.24                         0          0        167                                    14.5 2.036             
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all      249.6  3249       0          0 8.982      .5106         0                 4592  878 11432 13.03  
   external     

00:42:10 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:42:11 | did not beat best ppl: 13.7329 impatience: 1
00:42:11 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:42:14 | time:3428s total_exs:74496 total_steps:4656 epochs:77.04
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      39.96     1 638.8  6781       0          0 169.8  256              4096  10.16    .2595 15.66 2.166 1.562e-08   
   external 40.07                         0          0         83                                   17.43 2.304             
   internal 39.86                         0          0        173                                   13.89 2.028             
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb  tps   

00:42:52 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:42:53 | did not beat best ppl: 13.7329 impatience: 5
00:42:55 | time:3469s total_exs:75520 total_steps:4720 epochs:78.10
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      41.82     1 658.2  7095       0          0 172.4  256              4096  10.34    .2378 14.45 2.126 1.562e-08   
   external 44.07                         0          0         89                                    14.6 2.194             
   internal 39.58                         0          0        167                                   14.31 2.059             
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb  tps   ups  
   all      230.6  2485       0          0 8.404      .5419         0                 4720 888.8 9580 10.78  
   external     

00:43:33 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:43:35 | did not beat best ppl: 13.7329 impatience: 9
00:43:36 | time:3511s total_exs:76544 total_steps:4784 epochs:79.16
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      39.77     1 628.4  6516       0          0 165.9  256              4096  10.11    .2556  14.6 2.022 1.562e-08   
   external 41.43                         0          0         90                                   14.81 1.985             
   internal  38.1                         0          0        166                                    14.4 2.059             
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb  tps   ups  
   all      232.7  2413       0          0 7.559      .5325   .003012                 4784 861.1 8929 10.38  
   external     

00:44:14 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:44:16 | did not beat best ppl: 13.7329 impatience: 13
00:44:16 | saving model checkpoint: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-6,3-flatten/model.checkpoint
00:44:19 | time:3553s total_exs:77568 total_steps:4848 epochs:80.22
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss        lr  \
   all      39.49     1 632.8  6909       0          0 174.7  256              4096  10.62    .2595 13.88 2.062 1.562e-08   
   external 39.28                         0          0         93                                   14.83 2.147             
   internal 39.71                         0          0        163                                   12.93 1.978             
             ltpb  ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates   tpb  tps  

00:44:51 | creating task(s): internal
00:44:51 | Loading ParlAI text data: /home/alex/ParlaiEmely/ParlAI/data/internal/test.txt
00:44:51 | creating task(s): external
00:44:51 | Loading ParlAI text data: /home/alex/ParlaiEmely/ParlAI/data/external/test.txt
00:44:51 | running eval: test
00:45:03 | eval completed in 12.00s
00:45:03 | test:
             accuracy  bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss        lr  ltpb  \
   all              0  .01769 53.69 635.2 648.7       0          0 14.41  171 .2260    .1934 14.68 2.555 1.562e-08   190   
   external         0  .01406 62.36                   0          0         44 .1901          15.89 2.851                   
   internal         0  .02132 45.02                   0          0        127 .2618          13.46 2.258                   
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all        194       0          0 13.44    .2296  

metric,last value
internal/exs/train,180.0
exs/train,256.0
internal/clen/train,38.10556
internal/ctrunc/train,0.0
internal/ctrunclen/train,0.0
internal/llen/train,13.72778
internal/ltrunc/train,0.0
internal/ltrunclen/train,0.0
internal/loss/train,2.0545
internal/ppl/train,7.80295


metric,history (sparkline)
internal/exs/train,▅▃▅▅▅▄▄▃▅▆▇▆▅▄▃▅▅▃▇▄█▄▆▄▆▃▆▅▅▇▄█▄▄▁▃▁▃▇▇
exs/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/clen/train,█▅▅▃▃▃▃▃▂▃▂▃▅▄▂▂▃▄▂▁▃▂▁▃▃▃▃▄▁▂▄▁▅▂▃▂▃▃▁▁
internal/ctrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ctrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/llen/train,▄▅█▄▁▅▁▅▅▂▃▅▇▅▃▄▄▅▆▃▅▆▅▃▁▅▅▂▄▆▂▃▃▆▄▁▄▆▅▂
internal/ltrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ltrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/loss/train,▇▇█▇▆▆▆▄▆▅▅▅▅▇▅▄▅▃▄▄▅▄▃▃▂▃▃▂▂▂▂▃▃▃▃▁▂▂▃▂
internal/ppl/train,▇▇█▇▅▆▅▄▅▄▅▄▄▆▄▄▄▃▃▃▄▃▂▂▂▃▃▁▂▂▁▂▂▃▃▁▂▂▂▂
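
A quick consistency check on the validation tables of the run above: each per-task `ppl` should equal `exp(loss)`, and the `all` row looks like an unweighted mean of the two task rows. The sketch below uses the rounded figures from the validation table logged at 00:35:06. The macro-average reading is inferred from the numbers and is an assumption, not something the logs state.

```python
import math

# Rounded values copied from the validation table logged at 00:35:06.
loss = {'external': 2.858, 'internal': 2.313}
ppl = {'external': 17.42, 'internal': 10.1}
all_loss, all_ppl = 2.585, 13.76

# Per-task perplexity recomputed from the logged loss.
recomputed = {task: math.exp(value) for task, value in loss.items()}

# Assumption: the "all" row is an unweighted (macro) mean over tasks.
macro_loss = sum(loss.values()) / len(loss)
macro_ppl = sum(ppl.values()) / len(ppl)

for task in loss:
    print(f"{task}: exp(loss) = {recomputed[task]:.2f}, logged ppl = {ppl[task]}")
print(f"macro loss = {macro_loss:.3f} (logged {all_loss})")
print(f"macro ppl = {macro_ppl:.2f} (logged {all_ppl})")
```

If this reading is right, the `all` perplexity is not `exp` of the `all` loss, so the two columns should not be converted into each other when comparing runs.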


## 8. Blender base + mixed mutators

### Datasets with sampling weights:
- internal: 2
- external: 1
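
To make the weights concrete, here is a minimal sketch of the sampling fractions they imply, assuming the weights are simply normalized into per-task sampling probabilities (an assumption about how the multitask weights are used). The helper name `sampling_fractions` is hypothetical. The observed counts are taken from the first training report logged for this run (180 internal and 76 external examples out of 256).

```python
def sampling_fractions(weights):
    """Normalize raw task weights into expected sampling fractions."""
    total = sum(weights.values())
    return {task: weight / total for task, weight in weights.items()}

# Weights of 2:1 and 6:3 describe the same ratio.
fractions = sampling_fractions({'internal': 2, 'external': 1})

# Example counts from the first training report of this run.
observed = {'internal': 180 / 256, 'external': 76 / 256}

for task, expected in fractions.items():
    print(f"{task}: expected {expected:.3f}, observed {observed[task]:.3f}")
```

A single report covers only 256 examples, so the observed split is expected to drift a little around the 2/3 and 1/3 targets.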


### Mutators
- None
- word_shuffle
- flatten
- last_turn
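
For intuition, the toy functions below imitate what the three named mutators do to a dialogue. They are a simplified illustration written for this notebook, not ParlAI's implementation; the function signatures and the `(text, label)` episode format are assumptions made for the example.

```python
import random

def word_shuffle(text, rng):
    """Randomly reorder the words of the context text."""
    words = text.split(' ')
    rng.shuffle(words)
    return ' '.join(words)

def last_turn(text):
    """Keep only the final newline-separated turn of the context."""
    return text.split('\n')[-1]

def flatten(episode):
    """Turn a multi-turn episode into single-turn examples whose text
    carries the full history so far, joined by newlines."""
    history, examples = [], []
    for text, label in episode:
        history.append(text)
        examples.append(('\n'.join(history), label))
        history.append(label)
    return examples

# Hypothetical two-turn episode of (text, label) pairs.
episode = [('hi', 'hello'), ('how are you?', 'fine')]
flat = flatten(episode)
print(flat[1][0])                                  # full history as one context
print(last_turn(flat[1][0]))                       # only the latest turn
print(word_shuffle(flat[1][0].replace('\n', ' '), random.Random(0)))
```

Note that `flatten` and `last_turn` pull in opposite directions (more context versus less), so the order in which mutators are applied matters for what the model finally sees.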

In [11]:
tasks = 'internal,external'
weights = '2,1'
mutators = 'word_shuffle,flatten,last_turn'

run_training(tasks=tasks, weights=weights, mutators=mutators)

00:45:30 | building dictionary first...
00:45:30 | No model with opt yet at: /home/alex/ParlaiEmely/models/model-runs/blender-internal,external-2,1-word_shuffle,flatten,last_turn/model(.opt)
00:45:30 | your model is being loaded with opts that do not exist in the model you are initializing the weights with: allow_missing_init_opts: False,download_path: None,loglevel: info,dynamic_batching: None,verbose: False,is_debug: False,datapath: /home/alex/ParlaiEmely/ParlAI/data,eval_dynamic_batching: None,num_workers: 0,max_train_steps: -1,log_every_n_steps: 50,validation_every_n_steps: -1,load_from_checkpoint: True,tensorboard_logdir: None,wandb_log: True,wandb_name: None,wandb_project: parlaiemely,wandb_entity: None,mutators: word_shuffle,flatten,last_turn,preserve_context: True,n_encoder_layers: -1,n_decoder_layers: -1,model_parallel: False,beam_block_full_context: True,beam_delay: 30,beam_block_list_filename: None,temperature: 1.0,interactive_mode: False,history_reversed: False,history

00:45:32 |     task: internal,external
00:45:32 |     temperature: 1.0
00:45:32 |     tensorboard_log: True
00:45:32 |     tensorboard_logdir: None
00:45:32 |     text_truncate: 512
00:45:32 |     topk: 10
00:45:32 |     topp: 0.9
00:45:32 |     truncate: -1
00:45:32 |     update_freq: 1
00:45:32 |     use_reply: label
00:45:32 |     validation_cutoff: 1.0
00:45:32 |     validation_every_n_epochs: 0.25
00:45:32 |     validation_every_n_secs: -1
00:45:32 |     validation_every_n_steps: -1
00:45:32 |     validation_max_exs: 20000
00:45:32 |     validation_metric: ppl
00:45:32 |     validation_metric_mode: min
00:45:32 |     validation_patience: 15
00:45:32 |     validation_share_agent: False
00:45:32 |     variant: xlm
00:45:32 |     verbose: False
00:45:32 |     wandb_entity: None
00:45:32 |     wandb_log: True
00:45:32 |     wandb_name: None
00:45:32 |     wandb_project: parlaiemely
00:45:32 |     warmup_rate: 0.0001
00:45:32 |     warmup_updates: -1
00:45:32 |     weight_decay: None
0

00:45:49 | training...
00:45:51 | time:19s total_exs:256 total_steps:16 epochs:0.26
             clen  clip  ctpb  ctps  ctrunc  ctrunclen  exps  exs  fp16_loss_scalar  gnorm  gpu_mem  llen  loss    lr  ltpb  \
   all      53.54     1 851.8  8734       0          0   164  256             32768  12.28    .2710 15.17  3.09 1e-06 236.7   
   external 54.28                         0          0         76                                   16.11 3.321               
   internal  52.8                         0          0        180                                   14.24 2.859               
             ltps  ltrunc  ltrunclen   ppl  token_acc  token_em  total_train_updates  tpb   tps   ups  
   all       2427       0          0 22.56      .3836         0                   16 1088 11160 10.32  
   external             0          0 27.68      .3578         0                                        
   internal             0          0 17.45      .4093         0

00:45:51 | creating task(s): in

00:46:35 | running eval: valid
00:46:44 | eval completed in 8.92s
00:46:44 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 1.637e-06 39.61   519 391.9       0          0 12.08  102 .1363    .2006 14.75 3.115 1e-06 222.4 167.9   
   external         0 1.892e-08 46.77                   0          0         22 .1049          15.59 3.405                     
   internal         0 3.255e-06 32.44                   0          0         80 .1678           13.9 2.826                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 23.49    .1438      .3825         0                   64 741.4 559.8  
   external       0          0 30.11    .1241      .3586         0                                   
   internal       0          0 16.87    .1634      .4065         0
00:46:44 | saving mode

00:47:22 | running eval: valid
00:47:30 | eval completed in 8.48s
00:47:30 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss    lr  ltpb  ltps  \
   all              0 1.067e-05 39.61   519 407.5       0          0 12.56  102 .1539    .2005 14.75 3.071 5e-07 222.4 174.6   
   external         0 1.777e-05 46.77                   0          0         22 .1485          15.59 3.339                     
   internal         0 3.577e-06 32.44                   0          0         80 .1594           13.9 2.803                     
             ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all            0          0 22.34    .1544      .3933         0                  128 741.4 582.2  
   external       0          0 28.19    .1556      .3703         0                                   
   internal       0          0 16.49    .1533      .4164         0
00:47:30 | saving mode



00:48:07 | running eval: valid
00:48:15 | eval completed in 7.93s
00:48:15 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  \
   all              0   .002922 39.61   519 430.9       0          0 13.28  102 .1722    .2005 14.75 3.036 2.5e-07 222.4   
   external         0 2.562e-06 46.77                   0          0         22 .1542          15.59 3.303                 
   internal         0   .005842 32.44                   0          0         80 .1902           13.9 2.769                 
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      184.6       0          0 21.58    .1673      .3980         0                  192 741.4 615.5  
   external             0          0 27.21    .1508      .3761         0                                   
   internal             0          0 15.94    .1838      .4200         0
00:48:15 | sav



00:48:53 | running eval: valid
00:49:00 | eval completed in 7.71s
00:49:00 | valid:
             accuracy    bleu-4  clen  ctpb  ctps  ctrunc  ctrunclen  exps  exs    f1  gpu_mem  llen  loss      lr  ltpb  \
   all              0   .002468 39.61   519 456.5       0          0 14.07  102 .1913    .2005 14.75 3.056 2.5e-07 222.4   
   external         0 1.288e-05 46.77                   0          0         22 .1756          15.59 3.374                 
   internal         0   .004923 32.44                   0          0         80 .2069           13.9 2.738                 
             ltps  ltrunc  ltrunclen   ppl  rouge_L  token_acc  token_em  total_train_updates   tpb   tps  
   all      195.6       0          0 22.33    .1985      .3899         0                  256 741.4 652.1  
   external             0          0  29.2    .1915      .3644         0                                   
   internal             0          0 15.45    .2056      .4155         0
00:49:00 | sav

metric,last value
internal/exs/train,176.0
exs/train,256.0
internal/clen/train,37.60795
internal/ctrunc/train,0.0
internal/ctrunclen/train,0.0
internal/llen/train,13.47159
internal/ltrunc/train,0.0
internal/ltrunclen/train,0.0
internal/loss/train,2.68555
internal/ppl/train,14.66634


metric,history (sparkline)
internal/exs/train,█▃█▃▇▄▃█▄▆▄▅▁▆▆▆
exs/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/clen/train,█▆▇▅▅▃▃▃▂▂▂▃▂▁▂▁
internal/ctrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ctrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/llen/train,▄█▃▅▄▆▅▁▇▇▃▇▆▂▂▁
internal/ltrunc/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/ltrunclen/train,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
internal/loss/train,▅█▅▇▃▃▅▄▅▆▂▅▄▁▄▂
internal/ppl/train,▅█▅▇▃▃▅▄▄▅▂▄▄▁▄▁
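
The `lr` column in these logs halves each time the `reduceonplateau` scheduler fires (1e-06, 5e-07, 2.5e-07, and so on). The sketch below reproduces that sequence in plain Python, assuming a decay factor of 0.5 as suggested by the logged values; the factor is inferred from the logs rather than read from the configuration.

```python
# Assumption: each scheduler step multiplies the learning rate by 0.5.
initial_lr = 1e-06   # 'lr' from blender_opts
decay = 0.5

schedule = [initial_lr * decay ** step for step in range(7)]
for step, lr in enumerate(schedule):
    print(f"after {step} reductions: lr = {lr:.4e}")
```

After six reductions the rate is about 1.56e-08, which matches the smallest `lr` seen in the earlier run; at that point further updates change the weights very little.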
