# Training

This notebook contains all the commands for training/fine-tuning a suite of pair classifiers used in Fig. 2 of the following paper:

```
Hosseini, Nanni and Coll Ardanuy (2020), DeezyMatch: A Flexible Deep Learning Approach to Fuzzy String Matching, EMNLP: System Demonstrations.
```

See the `Fig2_EMNLP_inference` notebook, where these models are used for inference.

---

In this notebook:

* skyline1: trained on *OCR* dataset
* skyline2: trained on *WG:en+OCR* dataset
* baseline: trained on *WG:en* dataset

---

* model A: both embedding and recurrent units are frozen (i.e., their parameters are not updated during fine-tuning).
* model B: only the embedding layer is frozen. 
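As a rough illustration of what each freezing scheme leaves trainable, the sketch below subtracts the frozen layers from the per-layer parameter counts that DeezyMatch prints for the GRU models in the training logs of this notebook. It assumes that "embedding" corresponds to the `emb` layer and "recurrent units" to the `rnn_1` layer; the helper name `trainable_params` is hypothetical and not part of DeezyMatch.

```python
# Per-layer parameter counts copied from the GRU model summary printed in this notebook.
LAYER_PARAMS = {
    "emb": 452520,
    "rnn_1": 109440,
    "attn_step1": 7260,
    "attn_step2": 61,
    "fc1": 115320,
    "fc2": 242,
}


def trainable_params(frozen_layers):
    """Sum the parameters of every layer that is not frozen."""
    return sum(n for name, n in LAYER_PARAMS.items() if name not in frozen_layers)


total = trainable_params(frozen_layers=())
# Assumption: model A freezes `emb` and `rnn_1`, model B freezes only `emb`.
model_a = trainable_params(frozen_layers=("emb", "rnn_1"))
model_b = trainable_params(frozen_layers=("emb",))
print(total, model_a, model_b)
```

Under these assumptions, most of the network's parameters sit in the embedding layer, so both schemes update only a minority of the weights during fine-tuning.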

---

To show how fine-tuning and the choice of architecture affect model performance, we start from the baseline model and train a series of models, each including more training instances from the *OCR* training set.

The performance of these models is then assessed on the *OCR* test set. 

Refer to the paper for more information.
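The training cells in the sections below all follow the same pattern: one `dm_train` call with an input file, a dataset and a model name. As a compact overview, the sketch below collects those arguments in one table and loops over them. The `RUNS` table and the `train_all` helper are hypothetical conveniences, not part of DeezyMatch; the paths and model names are the ones used in the cells of this notebook.

```python
# (section label, input file, dataset, model name), as used in the cells of this notebook.
RUNS = [
    ("skyline1", "./inputs/input_dfm.yaml", "./dataset/ocr_trainval.txt", "ocr_001"),
    ("skyline1b", "./inputs/input_dfm_b.yaml", "./dataset/ocr_trainval.txt", "ocr_001b"),
    ("skyline2", "./inputs/input_dfm.yaml", "./dataset/wikigaz_en_ocr_trainval.txt", "wikigaz_en_ocr_gru_001"),
    ("skyline2b", "./inputs/input_dfm_b.yaml", "./dataset/wikigaz_en_ocr_trainval.txt", "wikigaz_en_ocr_gru_001b"),
    ("baseline1_gru", "./inputs/input_dfm.yaml", "./dataset/wikigaz_en_trainval.txt", "wikigaz_en_gru_001"),
    ("baseline1_lstm", "./inputs/input_dfm_lstm.yaml", "./dataset/wikigaz_en_trainval.txt", "wikigaz_en_lstm_001"),
]


def train_all(train_fn, runs=RUNS):
    """Call train_fn once per run and return the labels in training order."""
    done = []
    for label, input_file, dataset, model_name in runs:
        train_fn(input_file_path=input_file, dataset_path=dataset, model_name=model_name)
        done.append(label)
    return done


# Usage (requires DeezyMatch plus the input and dataset files on disk):
# from DeezyMatch import train as dm_train
# train_all(dm_train)
```

Passing the training function as an argument keeps the loop easy to dry-run with a stub before launching the actual, long-running jobs.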

## skyline1

In [2]:
from DeezyMatch import train as dm_train

In [3]:
# train a new model
dm_train(input_file_path="./inputs/input_dfm.yaml", 
         dataset_path="./dataset/ocr_trainval.txt", 
         model_name="ocr_001")



2020-09-10 09:40:33 lwm-embeddings [INFO] read input file: ./inputs/input_dfm.yaml
2020-09-10 09:40:33 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 09:40:33 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 09:40:33 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 09:40:33 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 09:40:33 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.07219839096069336
2020-09-10 09:40:33 lwm-embeddings [INFO] splits are as follow:
train    59221
val      25380
test         2
Name: split, dtype: int64

2020-09-10 09:40:36 lwm-embeddings [INFO] ******************************
2020-09-10 09:40:36 lwm-embeddings [INFO] **** (Bi-directional) GRU ****
2020-09-10 09:40:36 lwm-embeddings [INFO] ******************************
2020-09-10 09:40:36 lwm-embeddings [INFO] read inputs
2020-09-10 09:40:36 lwm-embeddings [INFO] create a two_parallel_rnns model
2020-09-10 09:40:38 lwm-embeddings [INFO] start fitting parameters
2020-09-10 09:40:38 lwm-embeddings [INFO] Number of batches: 926
2020-09-10 09:40:38 lwm-embeddings [INFO] Number of epochs: 10






Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
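The per-layer counts in this summary can be reproduced by hand. The sketch below is a sanity check rather than DeezyMatch code: it assumes the standard PyTorch weight layout for recurrent layers (input-to-hidden weights, hidden-to-hidden weights and two bias vectors per direction and layer, with 3 gates for a GRU), which matches the weight shapes listed above. The function names are hypothetical.

```python
def linear_params(in_features, out_features):
    """Weights plus bias of a fully connected layer."""
    return in_features * out_features + out_features


def rnn_params(input_size, hidden_size, num_layers, gates, bidirectional=True):
    """Parameter count of a PyTorch-style GRU (gates=3) or LSTM (gates=4)."""
    directions = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        # Layers after the first receive the concatenated outputs of all directions.
        layer_input = input_size if layer == 0 else hidden_size * directions
        per_direction = (
            gates * hidden_size * layer_input      # input-to-hidden weights
            + gates * hidden_size * hidden_size    # hidden-to-hidden weights
            + 2 * gates * hidden_size              # two bias vectors
        )
        total += directions * per_direction
    return total


counts = {
    "emb": 7542 * 60,
    "rnn_1": rnn_params(60, 60, num_layers=2, gates=3),
    "attn_step1": linear_params(120, 60),
    "attn_step2": linear_params(60, 1),
    "fc1": linear_params(960, 120),
    "fc2": linear_params(120, 2),
}
total = sum(counts.values())
print(counts, total)
```

Calling `rnn_params` with `gates=4` gives the corresponding count for the LSTM variant trained at the end of this notebook.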


2020-09-10 09:41:35 lwm-embeddings [INFO] 09/10/2020_09:41:35 -- Epoch: 1/10; Train; loss: 0.293; acc: 0.875; precision: 0.849, recall: 0.912, mac

2020-09-10 09:41:44 lwm-embeddings [INFO] 09/10/2020_09:41:44 -- Epoch: 1/10; Valid; loss: 0.177; acc: 0.935; precision: 0.919, recall: 0.955, macrof1: 0.935, weightedf1: 0.935
2020-09-10 09:41:44 lwm-embeddings [INFO] saving the model

2020-09-10 09:42:40 lwm-embeddings [INFO] 09/10/2020_09:42:40 -- Epoch: 2/10; Train; loss: 0.154; acc: 0.946; precision: 0.932, recall: 0.962, macrof1: 0.946, weightedf1: 0.946
2020-09-10 09:42:49 lwm-embeddings [INFO] 09/10/2020_09:42:49 -- Epoch: 2/10; Valid; loss: 0.134; acc: 0.951; precision: 0.931, recall: 0.974, macrof1: 0.951, weightedf1: 0.951
2020-09-10 09:42:49 lwm-embeddings [INFO] saving the model

2020-09-10 09:43:45 lwm-embeddings [INFO] 09/10/2020_09:43:45 -- Epoch: 3/10; Train; loss: 0.118; acc: 0.959; precision: 0.949, recall: 0.970, macrof1: 0.959, weightedf1: 0.959
2020-09-10 09:43:54 lwm-embeddings [INFO] 09/10/2020_09:43:54 -- Epoch: 3/10; Valid; loss: 0.130; acc: 0.955; precision: 0.950, recall: 0.962, macrof1: 0.955, weightedf1: 0.955
2020-09-10 09:43:54 lwm-embeddings [INFO] saving the model

2020-09-10 09:44:51 lwm-embeddings [INFO] 09/10/2020_09:44:51 -- Epoch: 4/10; Train; loss: 0.095; acc: 0.967; precision: 0.959, recall: 0.976, macrof1: 0.967, weightedf1: 0.967
2020-09-10 09:45:00 lwm-embeddings [INFO] 09/10/2020_09:45:00 -- Epoch: 4/10; Valid; loss: 0.133; acc: 0.956; precision: 0.941, recall: 0.972, macrof1: 0.956, weightedf1: 0.956
2020-09-10 09:45:00 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/ocr_001/ocr_001.model
2020-09-10 09:45:00 lwm-embeddings [INFO] saving the model
2020-09-10 09:45:00 lwm-embeddings [INFO] Early stopping at epoch: 4, selected epoch: 3




User time: 263.7606
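The split sizes and batch counts logged for this run can be cross-checked with a little arithmetic. The sketch below assumes a batch size of 64; that value is inferred from the logged batch counts (926 training and 397 validation batches) and is not read from the input YAML file, which is not shown here. The helper `n_batches` is hypothetical.

```python
import math

# Assumption: inferred from the logged batch counts, not taken from input_dfm.yaml.
BATCH_SIZE = 64


def n_batches(n_examples, batch_size=BATCH_SIZE):
    """Number of mini-batches needed to cover n_examples (last batch may be partial)."""
    return math.ceil(n_examples / batch_size)


# Split sizes as reported in the log above.
train, val, test = 59221, 25380, 2
total = train + val + test

train_fraction = round(train / total, 3)
val_fraction = round(val / total, 3)
print(train_fraction, val_fraction)
print(n_batches(train), n_batches(val))
```

The fractions suggest a roughly 70/30 train/validation split for this run, with only two examples held out as a placeholder test set.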


## skyline1b

In [75]:
# train a new model
dm_train(input_file_path="./inputs/input_dfm_b.yaml", 
         dataset_path="./dataset/ocr_trainval.txt", 
         model_name="ocr_001b")



2020-09-10 22:12:40 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_b.yaml
2020-09-10 22:12:40 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 22:12:40 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 22:12:40 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 22:12:40 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 22:12:40 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.08831453323364258
2020-09-10 22:12:40 lwm-embeddings [INFO] splits are as follow:
train    83755
val        846
test         2
Name: split, dtype: int64

2020-09-10 22:12:43 lwm-embeddings [INFO] ******************************
2020-09-10 22:12:43 lwm-embeddings [INFO] **** (Bi-directional) GRU ****
2020-09-10 22:12:43 lwm-embeddings [INFO] ******************************
2020-09-10 22:12:43 lwm-embeddings [INFO] read inputs
2020-09-10 22:12:43 lwm-embeddings [INFO] create a two_parallel_rnns model
2020-09-10 22:12:43 lwm-embeddings [INFO] start fitting parameters
2020-09-10 22:12:43 lwm-embeddings [INFO] Number of batches: 1309
2020-09-10 22:12:43 lwm-embeddings [INFO] Number of epochs: 3








Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 22:14:04 lwm-embeddings [INFO] 09/10/2020_22:14:04 -- Epoch: 1/3; Train; loss: 0.255; acc: 0.895; precision: 0.876, recall: 0.922, macr

2020-09-10 22:14:05 lwm-embeddings [INFO] 09/10/2020_22:14:05 -- Epoch: 1/3; Valid; loss: 0.135; acc: 0.957; precision: 0.943, recall: 0.974, macrof1: 0.957, weightedf1: 0.957
2020-09-10 22:14:05 lwm-embeddings [INFO] saving the model

2020-09-10 22:15:25 lwm-embeddings [INFO] 09/10/2020_22:15:25 -- Epoch: 2/3; Train; loss: 0.132; acc: 0.955; precision: 0.944, recall: 0.968, macrof1: 0.955, weightedf1: 0.955
2020-09-10 22:15:26 lwm-embeddings [INFO] 09/10/2020_22:15:26 -- Epoch: 2/3; Valid; loss: 0.106; acc: 0.959; precision: 0.945, recall: 0.974, macrof1: 0.959, weightedf1: 0.959
2020-09-10 22:15:26 lwm-embeddings [INFO] saving the model

2020-09-10 22:16:46 lwm-embeddings [INFO] 09/10/2020_22:16:46 -- Epoch: 3/3; Train; loss: 0.104; acc: 0.965; precision: 0.956, recall: 0.975, macrof1: 0.965, weightedf1: 0.965
2020-09-10 22:16:46 lwm-embeddings [INFO] 09/10/2020_22:16:46 -- Epoch: 3/3; Valid; loss: 0.093; acc: 0.961; precision: 0.951, recall: 0.972, macrof1: 0.961, weightedf1: 0.961
2020-09-10 22:16:46 lwm-embeddings [INFO] saving the model

2020-09-10 22:16:46 lwm-embeddings [INFO] saving the model with least valid loss (checkpoint: 3) at ./models/ocr_001b/ocr_001b.model



User time: 243.2404


## skyline2

In [4]:
# train a new model
dm_train(input_file_path="./inputs/input_dfm.yaml", 
         dataset_path="./dataset/wikigaz_en_ocr_trainval.txt", 
         model_name="wikigaz_en_ocr_gru_001")



2020-09-10 09:45:00 lwm-embeddings [INFO] read input file: ./inputs/input_dfm.yaml
2020-09-10 09:45:00 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 09:45:00 lwm-embeddings [INFO] read CSV file: ./dataset/wikigaz_en_ocr_trainval.txt
2020-09-10 09:45:05 lwm-embeddings [INFO] number of labels, True: 343520 and False: 343521
2020-09-10 09:45:05 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 09:45:06 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.5975685119628906
2020-09-10 09:45:06 lwm-embeddings [INFO] splits are as follow:
train    480927
val      206112
test          2
Name: split, dtype: int64

2020-09-10 09:45:29 lwm-embeddings [INFO] ******************************
2020-09-10 09:45:29 lwm-embeddings [INFO] **** (Bi-directional) GRU ****
2020-09-10 09:45:29 lwm-embeddings [INFO] ******************************
2020-09-10 09:45:29 lwm-embeddings [INFO] read inputs
2020-09-10 09:45:29 lwm-embeddings [INFO] create a two_parallel_rnns model
2020-09-10 09:45:29 lwm-embeddings [INFO] start fitting parameters
2020-09-10 09:45:29 lwm-embeddings [INFO] Number of batches: 7515
2020-09-10 09:45:29 lwm-embeddings [INFO] Number of epochs: 10






Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 10:10:17 lwm-embeddings [INFO] 09/10/2020_10:10:17 -- Epoch: 1/10; Train; loss: 0.321; acc: 0.860; precision: 0.849, recall: 0.875, mac

2020-09-10 10:14:06 lwm-embeddings [INFO] 09/10/2020_10:14:06 -- Epoch: 1/10; Valid; loss: 0.255; acc: 0.896; precision: 0.909, recall: 0.880, macrof1: 0.896, weightedf1: 0.896
2020-09-10 10:14:06 lwm-embeddings [INFO] saving the model

2020-09-10 10:32:38 lwm-embeddings [INFO] 09/10/2020_10:32:38 -- Epoch: 2/10; Train; loss: 0.219; acc: 0.909; precision: 0.904, recall: 0.914, macrof1: 0.909, weightedf1: 0.909
2020-09-10 10:34:04 lwm-embeddings [INFO] 09/10/2020_10:34:04 -- Epoch: 2/10; Valid; loss: 0.212; acc: 0.911; precision: 0.901, recall: 0.924, macrof1: 0.911, weightedf1: 0.911
2020-09-10 10:34:04 lwm-embeddings [INFO] saving the model

2020-09-10 10:43:38 lwm-embeddings [INFO] 09/10/2020_10:43:38 -- Epoch: 3/10; Train; loss: 0.184; acc: 0.924; precision: 0.922, recall: 0.927, macrof1: 0.924, weightedf1: 0.924
2020-09-10 10:45:03 lwm-embeddings [INFO] 09/10/2020_10:45:03 -- Epoch: 3/10; Valid; loss: 0.195; acc: 0.919; precision: 0.906, recall: 0.936, macrof1: 0.919, weightedf1: 0.919
2020-09-10 10:45:03 lwm-embeddings [INFO] saving the model

2020-09-10 10:54:41 lwm-embeddings [INFO] 09/10/2020_10:54:41 -- Epoch: 4/10; Train; loss: 0.164; acc: 0.933; precision: 0.931, recall: 0.936, macrof1: 0.933, weightedf1: 0.933
2020-09-10 10:56:05 lwm-embeddings [INFO] 09/10/2020_10:56:05 -- Epoch: 4/10; Valid; loss: 0.196; acc: 0.920; precision: 0.903, recall: 0.941, macrof1: 0.920, weightedf1: 0.920
2020-09-10 10:56:05 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ocr_gru_001/wikigaz_en_ocr_gru_001.model
2020-09-10 10:56:05 lwm-embeddings [INFO] saving the model
2020-09-10 10:56:05 lwm-embeddings [INFO] Early stopping at epoch: 4, selected epoch: 3




User time: 4235.8061
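Both the skyline1 and skyline2 runs above stop at epoch 4 and keep the checkpoint from epoch 3. That behaviour is consistent with a simple early-stopping rule on the validation loss, sketched below. This is an illustrative reimplementation, not DeezyMatch's own code: the function name is hypothetical, and the default `patience=1` is an assumption chosen to match these logs (the actual setting lives in the input YAML file, which is not shown here).

```python
def early_stopping(valid_losses, patience=1):
    """Return (stopped_epoch, selected_epoch), both 1-indexed.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs; the selected epoch is the one with the
    lowest validation loss seen so far. If the rule never triggers,
    stopped_epoch is None.
    """
    best_loss = float("inf")
    best_epoch = None
    bad_epochs = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Validation losses copied from the skyline1 and skyline2 logs above.
skyline1 = early_stopping([0.177, 0.134, 0.130, 0.133])
skyline2 = early_stopping([0.255, 0.212, 0.195, 0.196])
print(skyline1, skyline2)
```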


## skyline2b

In [76]:
# train a new model
dm_train(input_file_path="./inputs/input_dfm_b.yaml", 
         dataset_path="./dataset/wikigaz_en_ocr_trainval.txt", 
         model_name="wikigaz_en_ocr_gru_001b")



2020-09-10 22:16:46 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_b.yaml
2020-09-10 22:16:47 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 22:16:47 lwm-embeddings [INFO] read CSV file: ./dataset/wikigaz_en_ocr_trainval.txt
2020-09-10 22:16:52 lwm-embeddings [INFO] number of labels, True: 343520 and False: 343521
2020-09-10 22:16:52 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 22:16:53 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.6039412021636963
2020-09-10 22:16:53 lwm-embeddings [INFO] splits are as follow:
train    680169
val        6870
test          2
Name: split, dtype: int64

2020-09-10 22:17:16 lwm-embeddings [INFO] ******************************
2020-09-10 22:17:16 lwm-embeddings [INFO] **** (Bi-directional) GRU ****
2020-09-10 22:17:16 lwm-embeddings [INFO] ******************************
2020-09-10 22:17:16 lwm-embeddings [INFO] read inputs
2020-09-10 22:17:16 lwm-embeddings [INFO] create a two_parallel_rnns model
2020-09-10 22:17:16 lwm-embeddings [INFO] start fitting parameters
2020-09-10 22:17:16 lwm-embeddings [INFO] Number of batches: 10628
2020-09-10 22:17:16 lwm-embeddings [INFO] Number of epochs: 3






Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 22:31:09 lwm-embeddings [INFO] 09/10/2020_22:31:09 -- Epoch: 1/3; Train; loss: 0.302; acc: 0.869; precision: 0.858, recall: 0.883, macr

2020-09-10 22:31:12 lwm-embeddings [INFO] 09/10/2020_22:31:12 -- Epoch: 1/3; Valid; loss: 0.226; acc: 0.909; precision: 0.915, recall: 0.902, macrof1: 0.909, weightedf1: 0.909
2020-09-10 22:31:12 lwm-embeddings [INFO] saving the model

2020-09-10 22:44:40 lwm-embeddings [INFO] 09/10/2020_22:44:40 -- Epoch: 2/3; Train; loss: 0.205; acc: 0.914; precision: 0.911, recall: 0.918, macrof1: 0.914, weightedf1: 0.914
2020-09-10 22:44:43 lwm-embeddings [INFO] 09/10/2020_22:44:43 -- Epoch: 2/3; Valid; loss: 0.188; acc: 0.918; precision: 0.922, recall: 0.914, macrof1: 0.918, weightedf1: 0.918
2020-09-10 22:44:43 lwm-embeddings [INFO] saving the model

2020-09-10 22:58:12 lwm-embeddings [INFO] 09/10/2020_22:58:12 -- Epoch: 3/3; Train; loss: 0.175; acc: 0.928; precision: 0.927, recall: 0.930, macrof1: 0.928, weightedf1: 0.928
2020-09-10 22:58:14 lwm-embeddings [INFO] 09/10/2020_22:58:14 -- Epoch: 3/3; Valid; loss: 0.189; acc: 0.920; precision: 0.901, recall: 0.944, macrof1: 0.920, weightedf1: 0.920
2020-09-10 22:58:14 lwm-embeddings [INFO] saving the model

2020-09-10 22:58:14 lwm-embeddings [INFO] saving the model with least valid loss (checkpoint: 2) at ./models/wikigaz_en_ocr_gru_001b/wikigaz_en_ocr_gru_001b.model



User time: 2459.0160
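In the two `b` runs above, all 3 epochs are completed and the logs report that the model "with least valid loss" is saved. A minimal sketch of that selection step follows; the helper `best_checkpoint` is hypothetical (not a DeezyMatch function), and breaking ties in favour of the earliest epoch is an assumption.

```python
def best_checkpoint(valid_losses):
    """1-indexed epoch with the lowest validation loss (earliest epoch wins ties)."""
    best_index = min(range(len(valid_losses)), key=valid_losses.__getitem__)
    return best_index + 1


# Validation losses copied from the skyline1b and skyline2b logs above.
skyline1b = best_checkpoint([0.135, 0.106, 0.093])
skyline2b = best_checkpoint([0.226, 0.188, 0.189])
print(skyline1b, skyline2b)
```

For skyline2b the last epoch is not the best one, which is why the saved checkpoint differs from the final epoch in that run.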


## baseline1_gru

In [5]:
# train a new model
dm_train(input_file_path="./inputs/input_dfm.yaml", 
         dataset_path="./dataset/wikigaz_en_trainval.txt", 
         model_name="wikigaz_en_gru_001")



2020-09-10 10:56:05 lwm-embeddings [INFO] read input file: ./inputs/input_dfm.yaml
2020-09-10 10:56:05 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 10:56:05 lwm-embeddings [INFO] read CSV file: ./dataset/wikigaz_en_trainval.txt
2020-09-10 10:56:10 lwm-embeddings [INFO] number of labels, True: 301219 and False: 301219
2020-09-10 10:56:10 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 10:56:11 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.5371513366699219
2020-09-10 10:56:11 lwm-embeddings [INFO] splits are as follow:
train    421706
val      180730
test          2
Name: split, dtype: int64

2020-09-10 10:56:31 lwm-embeddings [INFO] ******************************
2020-09-10 10:56:31 lwm-embeddings [INFO] **** (Bi-directional) GRU ****
2020-09-10 10:56:31 lwm-embeddings [INFO] ******************************
2020-09-10 10:56:31 lwm-embeddings [INFO] read inputs
2020-09-10 10:56:31 lwm-embeddings [INFO] create a two_parallel_rnns model
2020-09-10 10:56:31 lwm-embeddings [INFO] start fitting parameters
2020-09-10 10:56:31 lwm-embeddings [INFO] Number of batches: 6590
2020-09-10 10:56:31 lwm-embeddings [INFO] Number of epochs: 10






Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 11:05:20 lwm-embeddings [INFO] 09/10/2020_11:05:20 -- Epoch: 1/10; Train; loss: 0.280; acc: 0.883; precision: 0.875, recall: 0.894, mac

2020-09-10 11:06:31 lwm-embeddings [INFO] 09/10/2020_11:06:31 -- Epoch: 1/10; Valid; loss: 0.203; acc: 0.918; precision: 0.912, recall: 0.924, macrof1: 0.918, weightedf1: 0.918
2020-09-10 11:06:31 lwm-embeddings [INFO] saving the model

2020-09-10 11:14:44 lwm-embeddings [INFO] 09/10/2020_11:14:44 -- Epoch: 2/10; Train; loss: 0.181; acc: 0.927; precision: 0.926, recall: 0.928, macrof1: 0.927, weightedf1: 0.927
2020-09-10 11:15:55 lwm-embeddings [INFO] 09/10/2020_11:15:55 -- Epoch: 2/10; Valid; loss: 0.173; acc: 0.932; precision: 0.931, recall: 0.932, macrof1: 0.932, weightedf1: 0.932
2020-09-10 11:15:55 lwm-embeddings [INFO] saving the model

2020-09-10 11:24:08 lwm-embeddings [INFO] 09/10/2020_11:24:08 -- Epoch: 3/10; Train; loss: 0.153; acc: 0.939; precision: 0.940, recall: 0.937, macrof1: 0.939, weightedf1: 0.939
2020-09-10 11:25:19 lwm-embeddings [INFO] 09/10/2020_11:25:19 -- Epoch: 3/10; Valid; loss: 0.168; acc: 0.934; precision: 0.934, recall: 0.933, macrof1: 0.934, weightedf1: 0.934
2020-09-10 11:25:19 lwm-embeddings [INFO] saving the model

2020-09-10 11:33:33 lwm-embeddings [INFO] 09/10/2020_11:33:33 -- Epoch: 4/10; Train; loss: 0.135; acc: 0.945; precision: 0.947, recall: 0.943, macrof1: 0.945, weightedf1: 0.945
2020-09-10 11:34:43 lwm-embeddings [INFO] 09/10/2020_11:34:43 -- Epoch: 4/10; Valid; loss: 0.160; acc: 0.937; precision: 0.936, recall: 0.938, macrof1: 0.937, weightedf1: 0.937
2020-09-10 11:34:43 lwm-embeddings [INFO] saving the model

2020-09-10 11:42:45 lwm-embeddings [INFO] 09/10/2020_11:42:45 -- Epoch: 5/10; Train; loss: 0.123; acc: 0.950; precision: 0.953, recall: 0.948, macrof1: 0.950, weightedf1: 0.950
2020-09-10 11:43:52 lwm-embeddings [INFO] 09/10/2020_11:43:52 -- Epoch: 5/10; Valid; loss: 0.167; acc: 0.936; precision: 0.941, recall: 0.930, macrof1: 0.936, weightedf1: 0.936
2020-09-10 11:43:52 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model
2020-09-10 11:43:52 lwm-embeddings [INFO] saving the model
2020-09-10 11:43:52 lwm-embeddings [INFO] Early stopping at epoch: 5, selected epoch: 4




User time: 2841.4835


## baseline1_lstm

In [6]:
# train a new model
dm_train(input_file_path="./inputs/input_dfm_lstm.yaml", 
         dataset_path="./dataset/wikigaz_en_trainval.txt", 
         model_name="wikigaz_en_lstm_001")


2020-09-10 11:43:52 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_lstm.yaml
2020-09-10 11:43:52 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 11:43:52 lwm-embeddings [INFO] read CSV file: ./dataset/wikigaz_en_trainval.txt
2020-09-10 11:43:57 lwm-embeddings [INFO] number of labels, True: 301219 and False: 301219
2020-09-10 11:43:57 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 11:43:58 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.5019485950469971
2020-09-10 11:43:58 lwm-embeddings [INFO] splits are as follow:
train    421706
val      180730
test          2
Name: split, dtype: int64

2020-09-10 11:44:18 lwm-embeddings [INFO] ******************************
2020-09-10 11:44:18 lwm-embeddings [INFO] **** (Bi-directional) LSTM ****
2020-09-10 11:44:18 lwm-embeddings [INFO] ******************************
2020-09-10 11:44:18 lwm-embeddings [INFO] read inputs
2020-09-10 11:44:18 lwm-embeddings [INFO] create a two_parallel_rnns model
2020-09-10 11:44:18 lwm-embeddings [INFO] start fitting parameters
2020-09-10 11:44:18 lwm-embeddings [INFO] Number of batches: 6590
2020-09-10 11:44:18 lwm-embeddings [INFO] Number of epochs: 10


Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
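
The `parameters=` figures in the model summaries printed in this notebook can be reproduced by hand. The sketch below assumes the standard PyTorch layout for recurrent layers: per layer and direction, one input-to-hidden matrix, one hidden-to-hidden matrix and two bias vectors, each stacked over the gates of the cell (1 for RNN, 3 for GRU, 4 for LSTM). The helper name `recurrent_params` is hypothetical and not part of DeezyMatch.

```python
# Number of stacked gate blocks per cell type (PyTorch convention).
GATES = {"RNN": 1, "GRU": 3, "LSTM": 4}


def recurrent_params(cell, input_size=60, hidden_size=60,
                     num_layers=2, bidirectional=True):
    """Count the parameters of a stacked (bi-directional) recurrent layer."""
    gates = GATES[cell]
    directions = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        # Layers after the first receive the concatenated outputs of both directions.
        layer_input = input_size if layer == 0 else hidden_size * directions
        weights = gates * hidden_size * (layer_input + hidden_size)  # weight_ih + weight_hh
        biases = 2 * gates * hidden_size                             # bias_ih + bias_hh
        total += directions * (weights + biases)
    return total


# Non-recurrent layers, with sizes taken from the model summary.
other_layers = (
    7542 * 60          # emb
    + 120 * 60 + 60    # attn_step1
    + 60 * 1 + 1       # attn_step2
    + 960 * 120 + 120  # fc1
    + 120 * 2 + 2      # fc2
)

for cell in ("RNN", "GRU", "LSTM"):
    rnn = recurrent_params(cell)
    print(f"{cell}: rnn_1={rnn}, total={rnn + other_layers}")
```

Under these assumptions the three architectures differ only in `rnn_1`, which is why the totals differ by multiples of the per-gate cost.
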


[92m2020-09-10 11:56:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_11:56:42 -- Epoch: 1/10; Train; loss: 0.286; acc: 0.880; precision: 0.871, recall: 0.891, ma

[92m2020-09-10 11:59:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_11:59:05 -- Epoch: 1/10; Valid; loss: 0.220; acc: 0.911; precision: 0.906, recall: 0.917, macrof1: 0.911, weightedf1: 0.911[0m
[92m2020-09-10 11:59:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 12:15:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_12:15:11 -- Epoch: 2/10; Train; loss: 0.194; acc: 0.921; precision: 0.919, recall: 0.923, macrof1: 0.921, weightedf1: 0.921[0m


[92m2020-09-10 12:18:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_12:18:24 -- Epoch: 2/10; Valid; loss: 0.189; acc: 0.924; precision: 0.932, recall: 0.915, macrof1: 0.924, weightedf1: 0.924[0m
[92m2020-09-10 12:18:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 12:26:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_12:26:57 -- Epoch: 3/10; Train; loss: 0.162; acc: 0.934; precision: 0.934, recall: 0.934, macrof1: 0.934, weightedf1: 0.934[0m


[92m2020-09-10 12:28:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_12:28:12 -- Epoch: 3/10; Valid; loss: 0.171; acc: 0.931; precision: 0.936, recall: 0.926, macrof1: 0.931, weightedf1: 0.931[0m
[92m2020-09-10 12:28:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 12:41:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_12:41:48 -- Epoch: 4/10; Train; loss: 0.142; acc: 0.943; precision: 0.944, recall: 0.942, macrof1: 0.943, weightedf1: 0.943[0m


[92m2020-09-10 12:43:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_12:43:03 -- Epoch: 4/10; Valid; loss: 0.164; acc: 0.935; precision: 0.934, recall: 0.935, macrof1: 0.935, weightedf1: 0.935[0m
[92m2020-09-10 12:43:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 12:58:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_12:58:27 -- Epoch: 5/10; Train; loss: 0.128; acc: 0.949; precision: 0.949, recall: 0.948, macrof1: 0.949, weightedf1: 0.949[0m


[92m2020-09-10 13:01:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_13:01:29 -- Epoch: 5/10; Valid; loss: 0.163; acc: 0.935; precision: 0.932, recall: 0.939, macrof1: 0.935, weightedf1: 0.935[0m
[92m2020-09-10 13:01:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 13:15:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_13:15:44 -- Epoch: 6/10; Train; loss: 0.115; acc: 0.954; precision: 0.955, recall: 0.953, macrof1: 0.954, weightedf1: 0.954[0m


[92m2020-09-10 13:19:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_13:19:50 -- Epoch: 6/10; Valid; loss: 0.166; acc: 0.938; precision: 0.945, recall: 0.932, macrof1: 0.938, weightedf1: 0.938[0m
[92m2020-09-10 13:19:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model[0m
[92m2020-09-10 13:19:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 13:19:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 5732.4270


## baseline1_rnn

In [7]:
# train the RNN baseline from scratch on the WG:en dataset
dm_train(input_file_path="./inputs/input_dfm_rnn.yaml", 
         dataset_path="./dataset/wikigaz_en_trainval.txt", 
         model_name="wikigaz_en_rnn_001")


[92m2020-09-10 13:19:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn.yaml[0m
[92m2020-09-10 13:19:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 13:19:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/wikigaz_en_trainval.txt[0m
[92m2020-09-10 13:19:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 301219 and False: 301219[0m
[92m2020-09-10 13:19:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 13:19:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.498227596282959[0m
[92m2020-09-10 13:19:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train    421706
val      180730
test          2
Name: split, dtype: int64[0m
[92m2020-09-10 13:19:56[0m [95mlwm-embeddings[0m [1m[90m[INFO

                                                                       




[92m2020-09-10 13:20:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 13:20:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 13:20:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 13:20:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread inputs[0m
[92m2020-09-10 13:20:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mcreate a two_parallel_rnns model[0m
[92m2020-09-10 13:20:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mstart fitting parameters[0m
[92m2020-09-10 13:20:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mNumber of batches: 6590[0m
[92m2020-09-10 13:20:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mNumber of epochs: 10[0m


Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 13:29:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_13:29:41 -- Epoch: 1/10; Train; loss: 0.325; acc: 0.861; precision: 0.854, recall: 0.870, macrof1: 0.861, weig

[92m2020-09-10 13:30:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_13:30:49 -- Epoch: 1/10; Valid; loss: 0.263; acc: 0.891; precision: 0.893, recall: 0.888, macrof1: 0.891, weightedf1: 0.891[0m
[92m2020-09-10 13:30:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 13:40:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_13:40:04 -- Epoch: 2/10; Train; loss: 0.241; acc: 0.900; precision: 0.905, recall: 0.893, macrof1: 0.900, weightedf1: 0.900[0m


[92m2020-09-10 13:42:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_13:42:13 -- Epoch: 2/10; Valid; loss: 0.233; acc: 0.902; precision: 0.902, recall: 0.901, macrof1: 0.902, weightedf1: 0.902[0m
[92m2020-09-10 13:42:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 13:54:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_13:54:22 -- Epoch: 3/10; Train; loss: 0.218; acc: 0.909; precision: 0.915, recall: 0.901, macrof1: 0.909, weightedf1: 0.909[0m


[92m2020-09-10 13:56:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_13:56:30 -- Epoch: 3/10; Valid; loss: 0.223; acc: 0.906; precision: 0.916, recall: 0.896, macrof1: 0.906, weightedf1: 0.906[0m
[92m2020-09-10 13:56:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 14:09:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_14:09:19 -- Epoch: 4/10; Train; loss: 0.217; acc: 0.910; precision: 0.917, recall: 0.901, macrof1: 0.910, weightedf1: 0.910[0m


[92m2020-09-10 14:10:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_14:10:27 -- Epoch: 4/10; Valid; loss: 0.254; acc: 0.894; precision: 0.906, recall: 0.880, macrof1: 0.894, weightedf1: 0.894[0m
[92m2020-09-10 14:10:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model[0m
[92m2020-09-10 14:10:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 14:10:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 4, selected epoch: 3[0m




User time: 3011.0482
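
In the baseline runs above, training stops at the first epoch whose validation loss fails to improve, and the checkpoint from the preceding best epoch is kept. The sketch below reproduces that behaviour under the assumption of a patience of one epoch. The actual criterion is configured in the input YAML file, which is not shown here, and `early_stop` is a hypothetical helper rather than a DeezyMatch function. The loss values are the rounded figures from the logs.

```python
def early_stop(valid_losses, patience=1):
    """Return (stop_epoch, selected_epoch), both 1-indexed.

    stop_epoch is None when no stop is triggered before the losses run out.
    ASSUMPTION: patience=1 is inferred from the logs, not read from the config.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Validation losses per epoch, copied from the "Valid" log lines.
lstm_valid = [0.220, 0.189, 0.171, 0.164, 0.163, 0.166]
rnn_valid = [0.263, 0.233, 0.223, 0.254]

print(early_stop(lstm_valid))  # compare with "Early stopping at epoch: 6, selected epoch: 5"
print(early_stop(rnn_valid))   # compare with "Early stopping at epoch: 4, selected epoch: 3"
```
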


## Fine-Tune, model A, GRU

In [11]:
from DeezyMatch import finetune as dm_finetune

# fine-tune the pretrained GRU baseline (pretrained_model_path, pretrained_vocab_path) on 250 OCR training examples
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name="wikigaz_en_ft_ocr_gru_v001_n250",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=250
           )

[92m2020-09-10 16:12:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml[0m
[92m2020-09-10 16:12:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 16:12:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 16:12:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 16:12:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 16:12:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.06539654731750488[0m
[92m2020-09-10 16:12:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58971
val             25380
train             250
test                2
Name: split, dtype: int64[0m
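
With `n_train_examples` set, only that many rows are used for training and the remainder of the would-be training split is reported as `not_assigned`. The split sizes above can be checked with simple bookkeeping. This is a sketch of the accounting only, not of how DeezyMatch draws the rows, and `not_assigned` is a hypothetical helper name.

```python
# Label counts copied from the "number of labels" log line.
TOTAL_PAIRS = 42301 + 42302

# Sizes copied from the split table.
N_VAL = 25380
N_TEST = 2


def not_assigned(n_train_examples, total=TOTAL_PAIRS, n_val=N_VAL, n_test=N_TEST):
    """Rows left unused once validation, test and the requested training rows are taken."""
    return total - n_val - n_test - n_train_examples


print(not_assigned(250))
print(not_assigned(500))
```
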
[92m2020-09-10 16:12:27[

[92m2020-09-10 16:12:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 16:12:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 16:12:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 16:12:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m

Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
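
For model A, the `requires_grad` listing above leaves only `fc1` and `fc2` trainable. In this run the attention layers are listed as frozen as well, alongside the embedding and recurrent layers. Combining that listing with the per-layer counts in the summary gives the number of parameters actually updated during fine-tuning. The dictionaries below are transcribed from the output and are not produced by a DeezyMatch call.

```python
# Per-layer parameter counts, copied from the model summary.
n_params = {
    "emb": 452520,
    "rnn_1": 109440,
    "attn_step1": 7260,
    "attn_step2": 61,
    "fc1": 115320,
    "fc2": 242,
}

# Trainable flags per layer, collapsed from the "List all parameters" output.
requires_grad = {
    "emb": False,
    "rnn_1": False,
    "attn_step1": False,
    "attn_step2": False,
    "fc1": True,
    "fc2": True,
}

total = sum(n_params.values())
trainable = sum(n for name, n in n_params.items() if requires_grad[name])
print(f"trainable: {trainable} of {total} ({trainable / total:.1%})")
```

Under this setting roughly one sixth of the parameters are updated, all of them in the two fully connected layers.
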


[92m2020-09-10 16:12:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:12:30 -- Epoch: 1/20; Train; loss: 1.609; acc: 0.476; precision: 0.474, recall: 0.432, mac

[92m2020-09-10 16:12:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:12:38 -- Epoch: 1/20; Valid; loss: 1.509; acc: 0.480; precision: 0.479, recall: 0.467, macrof1: 0.480, weightedf1: 0.480[0m
[92m2020-09-10 16:12:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:12:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:12:38 -- Epoch: 2/20; Train; loss: 1.279; acc: 0.508; precision: 0.509, recall: 0.456, macrof1: 0.507, weightedf1: 0.507[0m


[92m2020-09-10 16:12:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:12:47 -- Epoch: 2/20; Valid; loss: 1.358; acc: 0.495; precision: 0.495, recall: 0.484, macrof1: 0.495, weightedf1: 0.495[0m
[92m2020-09-10 16:12:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:12:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:12:47 -- Epoch: 3/20; Train; loss: 1.045; acc: 0.556; precision: 0.562, recall: 0.504, macrof1: 0.555, weightedf1: 0.555[0m


[92m2020-09-10 16:12:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:12:56 -- Epoch: 3/20; Valid; loss: 1.238; acc: 0.512; precision: 0.512, recall: 0.516, macrof1: 0.512, weightedf1: 0.512[0m
[92m2020-09-10 16:12:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:12:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:12:56 -- Epoch: 4/20; Train; loss: 0.860; acc: 0.612; precision: 0.625, recall: 0.560, macrof1: 0.611, weightedf1: 0.611[0m


[92m2020-09-10 16:13:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:13:05 -- Epoch: 4/20; Valid; loss: 1.147; acc: 0.525; precision: 0.524, recall: 0.537, macrof1: 0.525, weightedf1: 0.525[0m
[92m2020-09-10 16:13:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:13:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:13:05 -- Epoch: 5/20; Train; loss: 0.741; acc: 0.648; precision: 0.661, recall: 0.608, macrof1: 0.647, weightedf1: 0.647[0m


[92m2020-09-10 16:13:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:13:13 -- Epoch: 5/20; Valid; loss: 1.074; acc: 0.538; precision: 0.537, recall: 0.558, macrof1: 0.538, weightedf1: 0.538[0m
[92m2020-09-10 16:13:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:13:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:13:14 -- Epoch: 6/20; Train; loss: 0.632; acc: 0.692; precision: 0.711, recall: 0.648, macrof1: 0.691, weightedf1: 0.691[0m


[92m2020-09-10 16:13:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:13:22 -- Epoch: 6/20; Valid; loss: 1.018; acc: 0.549; precision: 0.546, recall: 0.574, macrof1: 0.548, weightedf1: 0.548[0m
[92m2020-09-10 16:13:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:13:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:13:22 -- Epoch: 7/20; Train; loss: 0.540; acc: 0.732; precision: 0.750, recall: 0.696, macrof1: 0.732, weightedf1: 0.732[0m


[92m2020-09-10 16:13:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:13:31 -- Epoch: 7/20; Valid; loss: 0.976; acc: 0.559; precision: 0.556, recall: 0.588, macrof1: 0.558, weightedf1: 0.558[0m
[92m2020-09-10 16:13:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:13:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:13:31 -- Epoch: 8/20; Train; loss: 0.475; acc: 0.780; precision: 0.802, recall: 0.744, macrof1: 0.780, weightedf1: 0.780[0m


[92m2020-09-10 16:13:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:13:39 -- Epoch: 8/20; Valid; loss: 0.944; acc: 0.566; precision: 0.561, recall: 0.602, macrof1: 0.565, weightedf1: 0.565[0m
[92m2020-09-10 16:13:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:13:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:13:40 -- Epoch: 9/20; Train; loss: 0.431; acc: 0.800; precision: 0.821, recall: 0.768, macrof1: 0.800, weightedf1: 0.800[0m


[92m2020-09-10 16:13:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:13:48 -- Epoch: 9/20; Valid; loss: 0.920; acc: 0.575; precision: 0.568, recall: 0.620, macrof1: 0.574, weightedf1: 0.574[0m
[92m2020-09-10 16:13:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:13:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:13:48 -- Epoch: 10/20; Train; loss: 0.390; acc: 0.840; precision: 0.840, recall: 0.840, macrof1: 0.840, weightedf1: 0.840[0m


[92m2020-09-10 16:13:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:13:57 -- Epoch: 10/20; Valid; loss: 0.903; acc: 0.582; precision: 0.574, recall: 0.631, macrof1: 0.581, weightedf1: 0.581[0m
[92m2020-09-10 16:13:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:13:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:13:57 -- Epoch: 11/20; Train; loss: 0.352; acc: 0.856; precision: 0.856, recall: 0.856, macrof1: 0.856, weightedf1: 0.856[0m


[92m2020-09-10 16:14:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:14:05 -- Epoch: 11/20; Valid; loss: 0.891; acc: 0.588; precision: 0.579, recall: 0.641, macrof1: 0.587, weightedf1: 0.587[0m
[92m2020-09-10 16:14:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:14:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:14:06 -- Epoch: 12/20; Train; loss: 0.324; acc: 0.868; precision: 0.859, recall: 0.880, macrof1: 0.868, weightedf1: 0.868[0m


[92m2020-09-10 16:14:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:14:14 -- Epoch: 12/20; Valid; loss: 0.883; acc: 0.594; precision: 0.584, recall: 0.651, macrof1: 0.592, weightedf1: 0.592[0m
[92m2020-09-10 16:14:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:14:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:14:14 -- Epoch: 13/20; Train; loss: 0.293; acc: 0.900; precision: 0.891, recall: 0.912, macrof1: 0.900, weightedf1: 0.900[0m


[92m2020-09-10 16:14:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:14:23 -- Epoch: 13/20; Valid; loss: 0.878; acc: 0.599; precision: 0.588, recall: 0.662, macrof1: 0.598, weightedf1: 0.598[0m
[92m2020-09-10 16:14:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:14:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:14:23 -- Epoch: 14/20; Train; loss: 0.266; acc: 0.912; precision: 0.899, recall: 0.928, macrof1: 0.912, weightedf1: 0.912[0m


[92m2020-09-10 16:14:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:14:32 -- Epoch: 14/20; Valid; loss: 0.874; acc: 0.604; precision: 0.592, recall: 0.666, macrof1: 0.602, weightedf1: 0.602[0m
[92m2020-09-10 16:14:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:14:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:14:32 -- Epoch: 15/20; Train; loss: 0.241; acc: 0.928; precision: 0.921, recall: 0.936, macrof1: 0.928, weightedf1: 0.928[0m


[92m2020-09-10 16:14:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:14:40 -- Epoch: 15/20; Valid; loss: 0.874; acc: 0.609; precision: 0.596, recall: 0.673, macrof1: 0.607, weightedf1: 0.607[0m
[92m2020-09-10 16:14:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:14:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:14:41 -- Epoch: 16/20; Train; loss: 0.220; acc: 0.932; precision: 0.909, recall: 0.960, macrof1: 0.932, weightedf1: 0.932[0m


[92m2020-09-10 16:14:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:14:49 -- Epoch: 16/20; Valid; loss: 0.875; acc: 0.613; precision: 0.601, recall: 0.675, macrof1: 0.612, weightedf1: 0.612[0m
[92m2020-09-10 16:14:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 15) at ./models/wikigaz_en_ft_ocr_gru_v001_n250/wikigaz_en_ft_ocr_gru_v001_n250.model[0m
[92m2020-09-10 16:14:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 16:14:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 16, selected epoch: 15[0m




User time: 139.8609
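
The fine-tuning cells in this section differ only in the number of *OCR* training examples, which also appears in the model name. A possible way to avoid repeating the cell is to build the keyword arguments in a small helper and loop over the sizes. The helper `finetune_kwargs` is hypothetical, only the two sizes shown in this part of the notebook are listed, and the `dm_finetune` call is left commented out so that running the sketch does not start training.

```python
PRETRAINED_DIR = "./models/wikigaz_en_gru_001"

# Sizes used in the fine-tuning cells of this section; extend as needed.
FT_SIZES = (250, 500)


def finetune_kwargs(n_ft_examples):
    """Keyword arguments for one model A (GRU) fine-tuning run."""
    return dict(
        input_file_path="./inputs/input_dfm_gru_model_A.yaml",
        dataset_path="./dataset/ocr_trainval.txt",
        model_name=f"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}",
        pretrained_model_path=f"{PRETRAINED_DIR}/wikigaz_en_gru_001.model",
        pretrained_vocab_path=f"{PRETRAINED_DIR}/wikigaz_en_gru_001.vocab",
        n_train_examples=n_ft_examples,
    )


for n_ft_examples in FT_SIZES:
    kwargs = finetune_kwargs(n_ft_examples)
    print(kwargs["model_name"])
    # from DeezyMatch import finetune as dm_finetune
    # dm_finetune(**kwargs)  # uncomment to launch the run
```
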


In [12]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 500

# fine-tune the pretrained GRU baseline (pretrained_model_path, pretrained_vocab_path) on n_ft_examples OCR training examples
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 16:14:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml[0m
[92m2020-09-10 16:14:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 16:14:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 16:14:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 16:14:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 16:14:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.06559967994689941[0m
[92m2020-09-10 16:14:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58721
val             25380
train             500
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 16:14:50[

[92m2020-09-10 16:14:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 16:14:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 16:14:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 16:14:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m

Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 16:14:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:14:53 -- Epoch: 1/20; Train; loss: 1.595; acc: 0.450; precision: 0.444, recall: 0.396, mac

[92m2020-09-10 16:15:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:15:01 -- Epoch: 1/20; Valid; loss: 1.339; acc: 0.499; precision: 0.499, recall: 0.494, macrof1: 0.499, weightedf1: 0.499[0m
[92m2020-09-10 16:15:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 16:15:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:15:01 -- Epoch: 2/20; Train; loss: 1.143; acc: 0.530; precision: 0.531, recall: 0.520, macrof1: 0.530, weightedf1: 0.530[0m


[92m2020-09-10 16:15:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:15:10 -- Epoch: 2/20; Valid; loss: 1.110; acc: 0.532; precision: 0.529, recall: 0.582, macrof1: 0.531, weightedf1: 0.531[0m
[92m2020-09-10 16:15:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 16:15:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:15:10 -- Epoch: 3/20; Train; loss: 0.893; acc: 0.572; precision: 0.571, recall: 0.576, macrof1: 0.572, weightedf1: 0.572[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:15:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:15:19 -- Epoch: 3/20; Valid; loss: 0.961; acc: 0.556; precision: 0.551, recall: 0.603, macrof1: 0.555, weightedf1: 0.555[0m
[92m2020-09-10 16:15:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 16:15:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:15:19 -- Epoch: 4/20; Train; loss: 0.727; acc: 0.624; precision: 0.624, recall: 0.624, macrof1: 0.624, weightedf1: 0.624[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:15:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:15:28 -- Epoch: 4/20; Valid; loss: 0.870; acc: 0.579; precision: 0.572, recall: 0.628, macrof1: 0.578, weightedf1: 0.578[0m
[92m2020-09-10 16:15:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 16:15:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:15:28 -- Epoch: 5/20; Train; loss: 0.627; acc: 0.684; precision: 0.687, recall: 0.676, macrof1: 0.684, weightedf1: 0.684[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:15:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:15:37 -- Epoch: 5/20; Valid; loss: 0.806; acc: 0.597; precision: 0.590, recall: 0.633, macrof1: 0.596, weightedf1: 0.596[0m
[92m2020-09-10 16:15:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 16:15:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:15:37 -- Epoch: 6/20; Train; loss: 0.556; acc: 0.712; precision: 0.719, recall: 0.696, macrof1: 0.712, weightedf1: 0.712[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:15:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:15:46 -- Epoch: 6/20; Valid; loss: 0.768; acc: 0.613; precision: 0.604, recall: 0.657, macrof1: 0.612, weightedf1: 0.612[0m
[92m2020-09-10 16:15:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 16:15:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:15:46 -- Epoch: 7/20; Train; loss: 0.492; acc: 0.752; precision: 0.765, recall: 0.728, macrof1: 0.752, weightedf1: 0.752[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:15:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:15:55 -- Epoch: 7/20; Valid; loss: 0.743; acc: 0.627; precision: 0.616, recall: 0.673, macrof1: 0.626, weightedf1: 0.626[0m
[92m2020-09-10 16:15:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 16:15:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:15:55 -- Epoch: 8/20; Train; loss: 0.448; acc: 0.804; precision: 0.817, recall: 0.784, macrof1: 0.804, weightedf1: 0.804[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:16:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:16:03 -- Epoch: 8/20; Valid; loss: 0.726; acc: 0.642; precision: 0.627, recall: 0.699, macrof1: 0.641, weightedf1: 0.641[0m
[92m2020-09-10 16:16:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 16:16:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:16:04 -- Epoch: 9/20; Train; loss: 0.401; acc: 0.826; precision: 0.830, recall: 0.820, macrof1: 0.826, weightedf1: 0.826[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:16:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:16:12 -- Epoch: 9/20; Valid; loss: 0.713; acc: 0.652; precision: 0.636, recall: 0.713, macrof1: 0.651, weightedf1: 0.651[0m
[92m2020-09-10 16:16:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 16:16:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:16:12 -- Epoch: 10/20; Train; loss: 0.364; acc: 0.844; precision: 0.841, recall: 0.848, macrof1: 0.844, weightedf1: 0.844[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:16:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:16:21 -- Epoch: 10/20; Valid; loss: 0.701; acc: 0.664; precision: 0.650, recall: 0.712, macrof1: 0.663, weightedf1: 0.663[0m
[92m2020-09-10 16:16:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 16:16:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:16:21 -- Epoch: 11/20; Train; loss: 0.334; acc: 0.864; precision: 0.876, recall: 0.848, macrof1: 0.864, weightedf1: 0.864[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:16:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:16:30 -- Epoch: 11/20; Valid; loss: 0.694; acc: 0.674; precision: 0.665, recall: 0.703, macrof1: 0.674, weightedf1: 0.674[0m
[92m2020-09-10 16:16:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 16:16:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:16:30 -- Epoch: 12/20; Train; loss: 0.301; acc: 0.894; precision: 0.896, recall: 0.892, macrof1: 0.894, weightedf1: 0.894[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:16:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:16:38 -- Epoch: 12/20; Valid; loss: 0.692; acc: 0.684; precision: 0.668, recall: 0.730, macrof1: 0.683, weightedf1: 0.683[0m
[92m2020-09-10 16:16:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 16:16:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:16:39 -- Epoch: 13/20; Train; loss: 0.268; acc: 0.916; precision: 0.913, recall: 0.920, macrof1: 0.916, weightedf1: 0.916[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:16:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:16:47 -- Epoch: 13/20; Valid; loss: 0.692; acc: 0.690; precision: 0.674, recall: 0.737, macrof1: 0.689, weightedf1: 0.689[0m
[92m2020-09-10 16:16:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 12) at ./models/wikigaz_en_ft_ocr_gru_v001_n500/wikigaz_en_ft_ocr_gru_v001_n500.model[0m
[92m2020-09-10 16:16:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 16:16:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 13, selected epoch: 12[0m




User time: 115.1197
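
The run above stops at epoch 13 and keeps the checkpoint from epoch 12. That outcome is consistent with a patience-based rule on the validation loss, sketched below. This is an illustration only, not DeezyMatch's own code: the function `select_checkpoint` and the patience of 1 are assumptions, and the losses are the rounded values copied from the log.

```python
def select_checkpoint(valid_losses, patience=1):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    stale = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best_epoch
    # No early stop: all epochs were run.
    return len(valid_losses), best_epoch


# Validation losses logged for the n500 run (epochs 1 to 13).
n500_valid = [1.339, 1.110, 0.961, 0.870, 0.806, 0.768, 0.743,
              0.726, 0.713, 0.701, 0.694, 0.692, 0.692]
stop_epoch, best_epoch = select_checkpoint(n500_valid)
print(stop_epoch, best_epoch)  # 13 12
```

Because the logged losses are rounded to three decimals, the tie between epochs 12 and 13 may hide a small difference in the unrounded values.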


In [13]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 1000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 16:16:47 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_gru_model_A.yaml
2020-09-10 16:16:47 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 16:16:47 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 16:16:48 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 16:16:48 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 16:16:48 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.0675957202911377
2020-09-10 16:16:48 lwm-embeddings [INFO] splits are as follow:
not_assigned    58221
val             25380
train            1000
test                2
Name: split, dtype: int64
2020-09-10 16:16:50 lwm-embeddings [INFO] skipping 0 lines

List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True
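
The listing above pairs each parameter name with a flag that appears to be its trainable state: in this run only `fc1` and `fc2` are updated. A small sketch of what that implies for the number of trainable parameters, using the per-layer counts from the model summary. The dictionaries are written out by hand for illustration and are not read from a DeezyMatch model.

```python
# Per-layer parameter counts copied from the model summary.
layer_params = {
    "emb": 452520,
    "rnn_1": 109440,
    "attn_step1": 7260,
    "attn_step2": 61,
    "fc1": 115320,
    "fc2": 242,
}
# Flags from the parameter listing, collapsed to one entry per layer.
trainable = {
    "emb": False,
    "rnn_1": False,
    "attn_step1": False,
    "attn_step2": False,
    "fc1": True,
    "fc2": True,
}

n_trainable = sum(n for name, n in layer_params.items() if trainable[name])
n_total = sum(layer_params.values())
print(n_trainable, n_total, round(n_trainable / n_total, 3))
```

Under these flags, about one sixth of the 684843 parameters are updated during fine-tuning.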



2020-09-10 16:16:50 lwm-embeddings [INFO] ******************************
2020-09-10 16:16:50 lwm-embeddings [INFO] **** (Bi-directional) GRU ****
2020-09-10 16:16:50 lwm-embeddings [INFO] ******************************


Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 16:16:51 lwm-embeddings [INFO] 09/10/2020_16:16:51 -- Epoch: 1/20; Train; loss: 1.366; acc: 0.484; precision: 0.484, recall: 0.490, mac

2020-09-10 16:16:59 lwm-embeddings [INFO] 09/10/2020_16:16:59 -- Epoch: 1/20; Valid; loss: 1.083; acc: 0.522; precision: 0.522, recall: 0.515, macrof1: 0.522, weightedf1: 0.522
2020-09-10 16:16:59 lwm-embeddings [INFO] saving the model

2020-09-10 16:17:00 lwm-embeddings [INFO] 09/10/2020_16:17:00 -- Epoch: 2/20; Train; loss: 0.862; acc: 0.568; precision: 0.571, recall: 0.548, macrof1: 0.568, weightedf1: 0.568
2020-09-10 16:17:09 lwm-embeddings [INFO] 09/10/2020_16:17:09 -- Epoch: 2/20; Valid; loss: 0.832; acc: 0.572; precision: 0.573, recall: 0.563, macrof1: 0.572, weightedf1: 0.572
2020-09-10 16:17:09 lwm-embeddings [INFO] saving the model

2020-09-10 16:17:09 lwm-embeddings [INFO] 09/10/2020_16:17:09 -- Epoch: 3/20; Train; loss: 0.658; acc: 0.646; precision: 0.643, recall: 0.658, macrof1: 0.646, weightedf1: 0.646
2020-09-10 16:17:18 lwm-embeddings [INFO] 09/10/2020_16:17:18 -- Epoch: 3/20; Valid; loss: 0.733; acc: 0.614; precision: 0.605, recall: 0.655, macrof1: 0.613, weightedf1: 0.613
2020-09-10 16:17:18 lwm-embeddings [INFO] saving the model

2020-09-10 16:17:18 lwm-embeddings [INFO] 09/10/2020_16:17:18 -- Epoch: 4/20; Train; loss: 0.566; acc: 0.701; precision: 0.695, recall: 0.716, macrof1: 0.701, weightedf1: 0.701
2020-09-10 16:17:27 lwm-embeddings [INFO] 09/10/2020_16:17:27 -- Epoch: 4/20; Valid; loss: 0.680; acc: 0.646; precision: 0.639, recall: 0.668, macrof1: 0.645, weightedf1: 0.645
2020-09-10 16:17:27 lwm-embeddings [INFO] saving the model

2020-09-10 16:17:27 lwm-embeddings [INFO] 09/10/2020_16:17:27 -- Epoch: 5/20; Train; loss: 0.500; acc: 0.753; precision: 0.760, recall: 0.740, macrof1: 0.753, weightedf1: 0.753
2020-09-10 16:17:36 lwm-embeddings [INFO] 09/10/2020_16:17:36 -- Epoch: 5/20; Valid; loss: 0.643; acc: 0.673; precision: 0.667, recall: 0.693, macrof1: 0.673, weightedf1: 0.673
2020-09-10 16:17:36 lwm-embeddings [INFO] saving the model

2020-09-10 16:17:36 lwm-embeddings [INFO] 09/10/2020_16:17:36 -- Epoch: 6/20; Train; loss: 0.445; acc: 0.800; precision: 0.795, recall: 0.808, macrof1: 0.800, weightedf1: 0.800
2020-09-10 16:17:45 lwm-embeddings [INFO] 09/10/2020_16:17:45 -- Epoch: 6/20; Valid; loss: 0.619; acc: 0.694; precision: 0.688, recall: 0.710, macrof1: 0.694, weightedf1: 0.694
2020-09-10 16:17:45 lwm-embeddings [INFO] saving the model

2020-09-10 16:17:46 lwm-embeddings [INFO] 09/10/2020_16:17:46 -- Epoch: 7/20; Train; loss: 0.400; acc: 0.835; precision: 0.825, recall: 0.850, macrof1: 0.835, weightedf1: 0.835
2020-09-10 16:17:54 lwm-embeddings [INFO] 09/10/2020_16:17:54 -- Epoch: 7/20; Valid; loss: 0.603; acc: 0.712; precision: 0.709, recall: 0.720, macrof1: 0.712, weightedf1: 0.712
2020-09-10 16:17:54 lwm-embeddings [INFO] saving the model

2020-09-10 16:17:55 lwm-embeddings [INFO] 09/10/2020_16:17:55 -- Epoch: 8/20; Train; loss: 0.351; acc: 0.865; precision: 0.860, recall: 0.872, macrof1: 0.865, weightedf1: 0.865
2020-09-10 16:18:03 lwm-embeddings [INFO] 09/10/2020_16:18:03 -- Epoch: 8/20; Valid; loss: 0.589; acc: 0.724; precision: 0.725, recall: 0.722, macrof1: 0.724, weightedf1: 0.724
2020-09-10 16:18:03 lwm-embeddings [INFO] saving the model

2020-09-10 16:18:04 lwm-embeddings [INFO] 09/10/2020_16:18:04 -- Epoch: 9/20; Train; loss: 0.308; acc: 0.894; precision: 0.876, recall: 0.918, macrof1: 0.894, weightedf1: 0.894
2020-09-10 16:18:12 lwm-embeddings [INFO] 09/10/2020_16:18:12 -- Epoch: 9/20; Valid; loss: 0.581; acc: 0.732; precision: 0.730, recall: 0.736, macrof1: 0.732, weightedf1: 0.732
2020-09-10 16:18:12 lwm-embeddings [INFO] saving the model

2020-09-10 16:18:13 lwm-embeddings [INFO] 09/10/2020_16:18:13 -- Epoch: 10/20; Train; loss: 0.268; acc: 0.912; precision: 0.901, recall: 0.926, macrof1: 0.912, weightedf1: 0.912
2020-09-10 16:18:22 lwm-embeddings [INFO] 09/10/2020_16:18:22 -- Epoch: 10/20; Valid; loss: 0.573; acc: 0.743; precision: 0.742, recall: 0.744, macrof1: 0.743, weightedf1: 0.743
2020-09-10 16:18:22 lwm-embeddings [INFO] saving the model

2020-09-10 16:18:22 lwm-embeddings [INFO] 09/10/2020_16:18:22 -- Epoch: 11/20; Train; loss: 0.231; acc: 0.928; precision: 0.908, recall: 0.952, macrof1: 0.928, weightedf1: 0.928
2020-09-10 16:18:31 lwm-embeddings [INFO] 09/10/2020_16:18:31 -- Epoch: 11/20; Valid; loss: 0.572; acc: 0.747; precision: 0.754, recall: 0.733, macrof1: 0.747, weightedf1: 0.747
2020-09-10 16:18:31 lwm-embeddings [INFO] saving the model

2020-09-10 16:18:31 lwm-embeddings [INFO] 09/10/2020_16:18:31 -- Epoch: 12/20; Train; loss: 0.201; acc: 0.949; precision: 0.939, recall: 0.960, macrof1: 0.949, weightedf1: 0.949
2020-09-10 16:18:39 lwm-embeddings [INFO] 09/10/2020_16:18:39 -- Epoch: 12/20; Valid; loss: 0.572; acc: 0.753; precision: 0.762, recall: 0.737, macrof1: 0.753, weightedf1: 0.753
2020-09-10 16:18:39 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 11) at ./models/wikigaz_en_ft_ocr_gru_v001_n1000/wikigaz_en_ft_ocr_gru_v001_n1000.model
2020-09-10 16:18:40 lwm-embeddings [INFO] saving the model
2020-09-10 16:18:40 lwm-embeddings [INFO] Early stopping at epoch: 12, selected epoch: 11




User time: 109.2621
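
The split table printed at the start of each fine-tuning run can be checked against the label counts. The sketch below assumes, based only on the logged numbers, that `n_train_examples` caps the training split and that the unused training rows are reported as `not_assigned`. The helper `expected_not_assigned` is hypothetical.

```python
N_TRUE, N_FALSE = 42301, 42302   # label counts from the log
N_VAL, N_TEST = 25380, 2         # split sizes from the log


def expected_not_assigned(n_train_examples):
    """Rows left over once train, val and test are accounted for."""
    total = N_TRUE + N_FALSE
    return total - N_VAL - N_TEST - n_train_examples


for n in (1000, 2000, 4000):
    print(n, expected_not_assigned(n))
```

For 1000 training examples this gives 58221, which matches the `not_assigned` row logged for the run above.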


In [14]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 2000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 16:18:40 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_gru_model_A.yaml
2020-09-10 16:18:40 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 16:18:40 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 16:18:40 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 16:18:40 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 16:18:40 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.060266733169555664
2020-09-10 16:18:40 lwm-embeddings [INFO] splits are as follow:
not_assigned    57221
val             25380
train            2000
test                2
Name: split, dtype: int64
2020-09-10 16:18:42 lwm-embeddings [INFO] skipping 0 lines

List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



2020-09-10 16:18:42 lwm-embeddings [INFO] ******************************
2020-09-10 16:18:42 lwm-embeddings [INFO] **** (Bi-directional) GRU ****
2020-09-10 16:18:42 lwm-embeddings [INFO] ******************************


Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 16:18:43 lwm-embeddings [INFO] 09/10/2020_16:18:43 -- Epoch: 1/20; Train; loss: 1.167; acc: 0.515; precision: 0.515, recall: 0.521, mac

2020-09-10 16:18:52 lwm-embeddings [INFO] 09/10/2020_16:18:52 -- Epoch: 1/20; Valid; loss: 0.843; acc: 0.569; precision: 0.569, recall: 0.563, macrof1: 0.569, weightedf1: 0.569
2020-09-10 16:18:52 lwm-embeddings [INFO] saving the model

2020-09-10 16:18:53 lwm-embeddings [INFO] 09/10/2020_16:18:53 -- Epoch: 2/20; Train; loss: 0.687; acc: 0.634; precision: 0.625, recall: 0.669, macrof1: 0.634, weightedf1: 0.634
2020-09-10 16:19:01 lwm-embeddings [INFO] 09/10/2020_16:19:01 -- Epoch: 2/20; Valid; loss: 0.670; acc: 0.644; precision: 0.633, recall: 0.687, macrof1: 0.644, weightedf1: 0.644
2020-09-10 16:19:01 lwm-embeddings [INFO] saving the model

2020-09-10 16:19:02 lwm-embeddings [INFO] 09/10/2020_16:19:02 -- Epoch: 3/20; Train; loss: 0.560; acc: 0.712; precision: 0.710, recall: 0.716, macrof1: 0.712, weightedf1: 0.712
2020-09-10 16:19:10 lwm-embeddings [INFO] 09/10/2020_16:19:10 -- Epoch: 3/20; Valid; loss: 0.606; acc: 0.691; precision: 0.665, recall: 0.770, macrof1: 0.689, weightedf1: 0.689
2020-09-10 16:19:10 lwm-embeddings [INFO] saving the model

2020-09-10 16:19:11 lwm-embeddings [INFO] 09/10/2020_16:19:11 -- Epoch: 4/20; Train; loss: 0.492; acc: 0.763; precision: 0.740, recall: 0.812, macrof1: 0.763, weightedf1: 0.763
2020-09-10 16:19:20 lwm-embeddings [INFO] 09/10/2020_16:19:20 -- Epoch: 4/20; Valid; loss: 0.565; acc: 0.727; precision: 0.733, recall: 0.715, macrof1: 0.727, weightedf1: 0.727
2020-09-10 16:19:20 lwm-embeddings [INFO] saving the model

2020-09-10 16:19:21 lwm-embeddings [INFO] 09/10/2020_16:19:21 -- Epoch: 5/20; Train; loss: 0.423; acc: 0.814; precision: 0.801, recall: 0.834, macrof1: 0.813, weightedf1: 0.813
2020-09-10 16:19:29 lwm-embeddings [INFO] 09/10/2020_16:19:29 -- Epoch: 5/20; Valid; loss: 0.535; acc: 0.749; precision: 0.736, recall: 0.777, macrof1: 0.749, weightedf1: 0.749
2020-09-10 16:19:29 lwm-embeddings [INFO] saving the model

2020-09-10 16:19:30 lwm-embeddings [INFO] 09/10/2020_16:19:30 -- Epoch: 6/20; Train; loss: 0.365; acc: 0.852; precision: 0.833, recall: 0.881, macrof1: 0.852, weightedf1: 0.852
2020-09-10 16:19:39 lwm-embeddings [INFO] 09/10/2020_16:19:39 -- Epoch: 6/20; Valid; loss: 0.515; acc: 0.763; precision: 0.741, recall: 0.809, macrof1: 0.763, weightedf1: 0.763
2020-09-10 16:19:39 lwm-embeddings [INFO] saving the model

2020-09-10 16:19:40 lwm-embeddings [INFO] 09/10/2020_16:19:40 -- Epoch: 7/20; Train; loss: 0.307; acc: 0.881; precision: 0.861, recall: 0.908, macrof1: 0.881, weightedf1: 0.881
2020-09-10 16:19:48 lwm-embeddings [INFO] 09/10/2020_16:19:48 -- Epoch: 7/20; Valid; loss: 0.498; acc: 0.779; precision: 0.779, recall: 0.781, macrof1: 0.779, weightedf1: 0.779
2020-09-10 16:19:48 lwm-embeddings [INFO] saving the model

2020-09-10 16:19:49 lwm-embeddings [INFO] 09/10/2020_16:19:49 -- Epoch: 8/20; Train; loss: 0.266; acc: 0.908; precision: 0.896, recall: 0.924, macrof1: 0.908, weightedf1: 0.908
2020-09-10 16:19:57 lwm-embeddings [INFO] 09/10/2020_16:19:57 -- Epoch: 8/20; Valid; loss: 0.494; acc: 0.784; precision: 0.776, recall: 0.799, macrof1: 0.784, weightedf1: 0.784
2020-09-10 16:19:57 lwm-embeddings [INFO] saving the model

2020-09-10 16:19:58 lwm-embeddings [INFO] 09/10/2020_16:19:58 -- Epoch: 9/20; Train; loss: 0.226; acc: 0.937; precision: 0.918, recall: 0.960, macrof1: 0.937, weightedf1: 0.937
2020-09-10 16:20:05 lwm-embeddings [INFO] 09/10/2020_16:20:05 -- Epoch: 9/20; Valid; loss: 0.491; acc: 0.790; precision: 0.781, recall: 0.805, macrof1: 0.790, weightedf1: 0.790
2020-09-10 16:20:05 lwm-embeddings [INFO] saving the model

2020-09-10 16:20:06 lwm-embeddings [INFO] 09/10/2020_16:20:06 -- Epoch: 10/20; Train; loss: 0.186; acc: 0.950; precision: 0.932, recall: 0.970, macrof1: 0.949, weightedf1: 0.949
2020-09-10 16:20:14 lwm-embeddings [INFO] 09/10/2020_16:20:14 -- Epoch: 10/20; Valid; loss: 0.492; acc: 0.793; precision: 0.779, recall: 0.820, macrof1: 0.793, weightedf1: 0.793
2020-09-10 16:20:14 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 9) at ./models/wikigaz_en_ft_ocr_gru_v001_n2000/wikigaz_en_ft_ocr_gru_v001_n2000.model
2020-09-10 16:20:14 lwm-embeddings [INFO] saving the model
2020-09-10 16:20:14 lwm-embeddings [INFO] Early stopping at epoch: 10, selected epoch: 9




User time: 92.1158
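
The fine-tuning cells in this notebook differ only in `n_ft_examples`, so the calls could also be generated in a loop. A minimal sketch: the helper `finetune_kwargs` and the constant `PRETRAINED` are hypothetical and only rebuild the keyword arguments already used in the cells, and the actual `dm_finetune` call is left commented out because it needs the pretrained model and the dataset on disk.

```python
PRETRAINED = "./models/wikigaz_en_gru_001/wikigaz_en_gru_001"


def finetune_kwargs(n_ft_examples):
    """Keyword arguments for one model A fine-tuning run."""
    return dict(
        input_file_path="./inputs/input_dfm_gru_model_A.yaml",
        dataset_path="./dataset/ocr_trainval.txt",
        model_name=f"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}",
        pretrained_model_path=f"{PRETRAINED}.model",
        pretrained_vocab_path=f"{PRETRAINED}.vocab",
        n_train_examples=n_ft_examples,
    )


runs = [finetune_kwargs(n) for n in (1000, 2000, 4000)]
for kwargs in runs:
    print(kwargs["model_name"])
    # from DeezyMatch import finetune as dm_finetune
    # dm_finetune(**kwargs)
```

Keeping one cell per run, as this notebook does, has the advantage that each log stays next to the call that produced it.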


In [15]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 4000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 16:20:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml[0m
[92m2020-09-10 16:20:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 16:20:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 16:20:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 16:20:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 16:20:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.051637887954711914[0m
[92m2020-09-10 16:20:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    55221
val             25380
train            4000
test                2
Name: split, dtype: int64[0m

[92m2020-09-10 16:20:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 16:20:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 16:20:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 16:20:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 16:20:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:20:19 -- Epoch: 1/20; Train; loss: 0.938; acc: 0.582; precision: 0.581, recall: 0.584, mac

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:20:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:20:27 -- Epoch: 1/20; Valid; loss: 0.660; acc: 0.652; precision: 0.644, recall: 0.679, macrof1: 0.652, weightedf1: 0.652[0m
[92m2020-09-10 16:20:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 16:20:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:20:29 -- Epoch: 2/20; Train; loss: 0.549; acc: 0.727; precision: 0.724, recall: 0.734, macrof1: 0.727, weightedf1: 0.727[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:20:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:20:37 -- Epoch: 2/20; Valid; loss: 0.537; acc: 0.745; precision: 0.752, recall: 0.730, macrof1: 0.745, weightedf1: 0.745[0m
[92m2020-09-10 16:20:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 16:20:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:20:39 -- Epoch: 3/20; Train; loss: 0.447; acc: 0.795; precision: 0.799, recall: 0.789, macrof1: 0.795, weightedf1: 0.795[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:20:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:20:46 -- Epoch: 3/20; Valid; loss: 0.482; acc: 0.781; precision: 0.786, recall: 0.773, macrof1: 0.781, weightedf1: 0.781[0m
[92m2020-09-10 16:20:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 16:20:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:20:48 -- Epoch: 4/20; Train; loss: 0.376; acc: 0.836; precision: 0.832, recall: 0.843, macrof1: 0.836, weightedf1: 0.836[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:20:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:20:56 -- Epoch: 4/20; Valid; loss: 0.447; acc: 0.802; precision: 0.805, recall: 0.798, macrof1: 0.802, weightedf1: 0.802[0m
[92m2020-09-10 16:20:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 16:20:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:20:58 -- Epoch: 5/20; Train; loss: 0.324; acc: 0.860; precision: 0.850, recall: 0.873, macrof1: 0.860, weightedf1: 0.860[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:21:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:21:06 -- Epoch: 5/20; Valid; loss: 0.430; acc: 0.814; precision: 0.805, recall: 0.829, macrof1: 0.814, weightedf1: 0.814[0m
[92m2020-09-10 16:21:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 16:21:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:21:08 -- Epoch: 6/20; Train; loss: 0.275; acc: 0.895; precision: 0.887, recall: 0.906, macrof1: 0.895, weightedf1: 0.895[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:21:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:21:16 -- Epoch: 6/20; Valid; loss: 0.424; acc: 0.819; precision: 0.801, recall: 0.849, macrof1: 0.819, weightedf1: 0.819[0m
[92m2020-09-10 16:21:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 16:21:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:21:17 -- Epoch: 7/20; Train; loss: 0.228; acc: 0.919; precision: 0.906, recall: 0.934, macrof1: 0.919, weightedf1: 0.919[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:21:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:21:25 -- Epoch: 7/20; Valid; loss: 0.418; acc: 0.826; precision: 0.822, recall: 0.832, macrof1: 0.826, weightedf1: 0.826[0m
[92m2020-09-10 16:21:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 16:21:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:21:27 -- Epoch: 8/20; Train; loss: 0.187; acc: 0.941; precision: 0.929, recall: 0.954, macrof1: 0.940, weightedf1: 0.940[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:21:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:21:35 -- Epoch: 8/20; Valid; loss: 0.418; acc: 0.830; precision: 0.831, recall: 0.829, macrof1: 0.830, weightedf1: 0.830[0m
[92m2020-09-10 16:21:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 7) at ./models/wikigaz_en_ft_ocr_gru_v001_n4000/wikigaz_en_ft_ocr_gru_v001_n4000.model[0m
[92m2020-09-10 16:21:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 16:21:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 8, selected epoch: 7[0m




User time: 77.7573
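
The run above stops at epoch 8 and keeps the checkpoint from epoch 7. The sketch below reproduces that behaviour under an assumed rule: stop as soon as the validation loss fails to strictly improve on the best value so far (a patience of 1). The rule is inferred from the log and is not taken from the DeezyMatch source. `early_stop` and `patience` are hypothetical names. The losses are the rounded values printed in the log, so a tie between epochs 7 and 8 counts as no improvement.

```python
def early_stop(valid_losses, patience=1):
    """Return (stop_epoch, selected_epoch), both 1-indexed.

    Training stops once `patience` consecutive epochs fail to strictly improve
    on the best validation loss. If that never happens, stop_epoch is None.
    """
    best_loss = float("inf")
    best_epoch = None
    bad_epochs = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Validation losses of the n=4000 run, as rounded in the log above.
valid_losses_n4000 = [0.660, 0.537, 0.482, 0.447, 0.430, 0.424, 0.418, 0.418]
stop_epoch, selected_epoch = early_stop(valid_losses_n4000)
print(stop_epoch, selected_epoch)
```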


In [16]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 8000

# Fine-tune the pretrained model (pretrained_model_path and pretrained_vocab_path)
# on n_ft_examples training examples from dataset_path.
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 16:21:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml[0m
[92m2020-09-10 16:21:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 16:21:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 16:21:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 16:21:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 16:21:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.0495457649230957[0m
[92m2020-09-10 16:21:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    51221
val             25380
train            8000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 16:21:36[0

                                                    

[92m2020-09-10 16:21:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 16:21:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 16:21:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 16:21:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 16:21:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:21:41 -- Epoch: 1/20; Train; loss: 0.786; acc: 0.627; precision: 0.622, recall: 0.648, mac

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:21:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:21:49 -- Epoch: 1/20; Valid; loss: 0.540; acc: 0.738; precision: 0.732, recall: 0.751, macrof1: 0.738, weightedf1: 0.738[0m
[92m2020-09-10 16:21:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

[92m2020-09-10 16:21:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:21:53 -- Epoch: 2/20; Train; loss: 0.461; acc: 0.788; precision: 0.773, recall: 0.815, macrof1: 0.787, weightedf1: 0.787[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:22:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:22:01 -- Epoch: 2/20; Valid; loss: 0.437; acc: 0.804; precision: 0.785, recall: 0.838, macrof1: 0.804, weightedf1: 0.804[0m
[92m2020-09-10 16:22:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

[92m2020-09-10 16:22:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:22:04 -- Epoch: 3/20; Train; loss: 0.370; acc: 0.841; precision: 0.827, recall: 0.862, macrof1: 0.841, weightedf1: 0.841[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:22:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:22:12 -- Epoch: 3/20; Valid; loss: 0.396; acc: 0.829; precision: 0.825, recall: 0.836, macrof1: 0.829, weightedf1: 0.829[0m
[92m2020-09-10 16:22:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

[92m2020-09-10 16:22:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:22:16 -- Epoch: 4/20; Train; loss: 0.309; acc: 0.875; precision: 0.860, recall: 0.895, macrof1: 0.875, weightedf1: 0.875[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:22:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:22:24 -- Epoch: 4/20; Valid; loss: 0.379; acc: 0.837; precision: 0.810, recall: 0.881, macrof1: 0.837, weightedf1: 0.837[0m
[92m2020-09-10 16:22:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

[92m2020-09-10 16:22:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:22:27 -- Epoch: 5/20; Train; loss: 0.258; acc: 0.898; precision: 0.883, recall: 0.917, macrof1: 0.898, weightedf1: 0.898[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:22:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:22:35 -- Epoch: 5/20; Valid; loss: 0.363; acc: 0.849; precision: 0.838, recall: 0.867, macrof1: 0.849, weightedf1: 0.849[0m
[92m2020-09-10 16:22:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

[92m2020-09-10 16:22:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:22:39 -- Epoch: 6/20; Train; loss: 0.210; acc: 0.926; precision: 0.915, recall: 0.940, macrof1: 0.926, weightedf1: 0.926[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:22:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:22:46 -- Epoch: 6/20; Valid; loss: 0.362; acc: 0.854; precision: 0.847, recall: 0.863, macrof1: 0.854, weightedf1: 0.854[0m
[92m2020-09-10 16:22:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

[92m2020-09-10 16:22:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:22:50 -- Epoch: 7/20; Train; loss: 0.172; acc: 0.945; precision: 0.933, recall: 0.958, macrof1: 0.944, weightedf1: 0.944[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:22:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:22:58 -- Epoch: 7/20; Valid; loss: 0.373; acc: 0.854; precision: 0.850, recall: 0.861, macrof1: 0.854, weightedf1: 0.854[0m
[92m2020-09-10 16:22:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 6) at ./models/wikigaz_en_ft_ocr_gru_v001_n8000/wikigaz_en_ft_ocr_gru_v001_n8000.model[0m
[92m2020-09-10 16:22:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 16:22:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 7, selected epoch: 6[0m




User time: 80.4513
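
The parameter listing printed for each run marks only `fc1` and `fc2` as trainable, and the model summary gives the weight shapes of every layer. The following sketch recomputes the per-layer and total parameter counts from those shapes and derives the share of parameters that is actually updated in this setup. The shapes are copied from the summary above. The variable names are chosen here for illustration only.

```python
from math import prod

# Weight shapes copied from the model summary printed above.
layer_shapes = {
    "emb": [(7542, 60)],
    "rnn_1": (
        [(180, 60), (180, 60), (180,), (180,)] * 2      # layer 0, both directions
        + [(180, 120), (180, 60), (180,), (180,)] * 2   # layer 1, both directions
    ),
    "attn_step1": [(60, 120), (60,)],
    "attn_step2": [(1, 60), (1,)],
    "fc1": [(120, 960), (120,)],
    "fc2": [(2, 120), (2,)],
}

# Layers flagged True in the "List all parameters in the model" output.
trainable_layers = {"fc1", "fc2"}

counts = {
    name: sum(prod(shape) for shape in shapes)
    for name, shapes in layer_shapes.items()
}
total = sum(counts.values())
trainable = sum(counts[name] for name in trainable_layers)

print(counts)
print(total, trainable, round(trainable / total, 3))
```

The total agrees with the `Total number of params` line in the output, and roughly one sixth of the parameters are trainable here.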


In [17]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 16000

# Fine-tune the pretrained model (pretrained_model_path and pretrained_vocab_path)
# on n_ft_examples training examples from dataset_path.
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 16:22:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml[0m
[92m2020-09-10 16:22:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 16:22:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 16:22:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 16:22:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 16:22:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.047933101654052734[0m
[92m2020-09-10 16:22:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    43221
val             25380
train           16000
test                2
Name: split, dtype: int64[0m

[92m2020-09-10 16:23:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 16:23:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 16:23:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 16:23:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 16:23:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:23:08 -- Epoch: 1/20; Train; loss: 0.624; acc: 0.705; precision: 0.700, recall: 0.715, mac

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:23:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:23:16 -- Epoch: 1/20; Valid; loss: 0.434; acc: 0.807; precision: 0.837, recall: 0.761, macrof1: 0.806, weightedf1: 0.806[0m
[92m2020-09-10 16:23:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 16:23:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:23:23 -- Epoch: 2/20; Train; loss: 0.367; acc: 0.843; precision: 0.834, recall: 0.856, macrof1: 0.843, weightedf1: 0.843[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:23:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:23:31 -- Epoch: 2/20; Valid; loss: 0.356; acc: 0.847; precision: 0.833, recall: 0.869, macrof1: 0.847, weightedf1: 0.847[0m
[92m2020-09-10 16:23:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 16:23:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:23:38 -- Epoch: 3/20; Train; loss: 0.295; acc: 0.881; precision: 0.868, recall: 0.898, macrof1: 0.881, weightedf1: 0.881[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:23:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:23:46 -- Epoch: 3/20; Valid; loss: 0.322; acc: 0.865; precision: 0.849, recall: 0.888, macrof1: 0.865, weightedf1: 0.865[0m
[92m2020-09-10 16:23:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 16:23:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:23:53 -- Epoch: 4/20; Train; loss: 0.243; acc: 0.906; precision: 0.893, recall: 0.922, macrof1: 0.906, weightedf1: 0.906[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:24:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:24:01 -- Epoch: 4/20; Valid; loss: 0.308; acc: 0.873; precision: 0.855, recall: 0.897, macrof1: 0.873, weightedf1: 0.873[0m
[92m2020-09-10 16:24:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 16:24:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:24:08 -- Epoch: 5/20; Train; loss: 0.197; acc: 0.927; precision: 0.918, recall: 0.938, macrof1: 0.927, weightedf1: 0.927[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:24:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:24:16 -- Epoch: 5/20; Valid; loss: 0.307; acc: 0.877; precision: 0.859, recall: 0.903, macrof1: 0.877, weightedf1: 0.877[0m
[92m2020-09-10 16:24:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 16:24:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:24:23 -- Epoch: 6/20; Train; loss: 0.158; acc: 0.944; precision: 0.935, recall: 0.954, macrof1: 0.944, weightedf1: 0.944[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:24:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:24:31 -- Epoch: 6/20; Valid; loss: 0.312; acc: 0.878; precision: 0.882, recall: 0.874, macrof1: 0.878, weightedf1: 0.878[0m
[92m2020-09-10 16:24:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_gru_v001_n16000/wikigaz_en_ft_ocr_gru_v001_n16000.model[0m
[92m2020-09-10 16:24:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 16:24:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 89.9560
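
The split sizes reported for each run follow from simple arithmetic: whatever is not used for validation, test, or the requested training examples ends up in `not_assigned`. The sketch below reproduces those numbers and the progress-bar totals (number of batches per epoch). The batch size of 64 is an assumption inferred from those totals. It is not stated anywhere in this notebook. `split_summary` is a hypothetical helper.

```python
from math import ceil

n_true, n_false = 42301, 42302  # label counts reported in the log
n_val, n_test = 25380, 2        # split sizes reported in the log
batch_size = 64                 # ASSUMPTION: inferred from the progress-bar totals


def split_summary(n_train):
    """Return split sizes and batch counts for a given number of training examples."""
    total = n_true + n_false
    return {
        "train": n_train,
        "not_assigned": total - n_val - n_test - n_train,
        "train_batches": ceil(n_train / batch_size),
        "val_batches": ceil(n_val / batch_size),
    }


for n in (4000, 8000, 16000, 32000):
    print(n, split_summary(n))
```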


In [18]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 32000

# Fine-tune the pretrained model (pretrained_model_path and pretrained_vocab_path)
# on n_ft_examples training examples from dataset_path.
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 16:24:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_A.yaml[0m
[92m2020-09-10 16:24:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 16:24:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 16:24:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 16:24:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 16:24:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05030536651611328[0m
[92m2020-09-10 16:24:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train           32000
not_assigned    27221
val             25380
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 16:24:32[

s1 padding:   0%|          | 0/32000 [00:00<?, ?it/s]

[92m2020-09-10 16:24:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 16:24:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 16:24:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 16:24:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 16:24:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:24:48 -- Epoch: 1/20; Train; loss: 0.504; acc: 0.767; precision: 0.756, recall: 0.788, mac

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:24:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:24:56 -- Epoch: 1/20; Valid; loss: 0.357; acc: 0.850; precision: 0.838, recall: 0.869, macrof1: 0.850, weightedf1: 0.850[0m
[92m2020-09-10 16:24:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))

[92m2020-09-10 16:25:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:25:10 -- Epoch: 2/20; Train; loss: 0.307; acc: 0.873; precision: 0.861, recall: 0.889, macrof1: 0.873, weightedf1: 0.873[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:25:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:25:18 -- Epoch: 2/20; Valid; loss: 0.300; acc: 0.878; precision: 0.873, recall: 0.883, macrof1: 0.878, weightedf1: 0.878[0m
[92m2020-09-10 16:25:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))

[92m2020-09-10 16:25:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:25:32 -- Epoch: 3/20; Train; loss: 0.245; acc: 0.904; precision: 0.895, recall: 0.916, macrof1: 0.904, weightedf1: 0.904[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:25:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:25:40 -- Epoch: 3/20; Valid; loss: 0.277; acc: 0.887; precision: 0.870, recall: 0.911, macrof1: 0.887, weightedf1: 0.887[0m
[92m2020-09-10 16:25:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))

[92m2020-09-10 16:25:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:25:54 -- Epoch: 4/20; Train; loss: 0.197; acc: 0.925; precision: 0.916, recall: 0.936, macrof1: 0.925, weightedf1: 0.925[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:26:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:26:02 -- Epoch: 4/20; Valid; loss: 0.271; acc: 0.892; precision: 0.886, recall: 0.900, macrof1: 0.892, weightedf1: 0.892[0m
[92m2020-09-10 16:26:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))

[92m2020-09-10 16:26:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:26:16 -- Epoch: 5/20; Train; loss: 0.161; acc: 0.940; precision: 0.932, recall: 0.950, macrof1: 0.940, weightedf1: 0.940[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:26:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:26:24 -- Epoch: 5/20; Valid; loss: 0.268; acc: 0.896; precision: 0.894, recall: 0.898, macrof1: 0.896, weightedf1: 0.896[0m
[92m2020-09-10 16:26:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))

[92m2020-09-10 16:26:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_16:26:39 -- Epoch: 6/20; Train; loss: 0.128; acc: 0.955; precision: 0.949, recall: 0.962, macrof1: 0.955, weightedf1: 0.955[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 16:26:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_16:26:47 -- Epoch: 6/20; Valid; loss: 0.282; acc: 0.895; precision: 0.882, recall: 0.912, macrof1: 0.895, weightedf1: 0.895[0m
[92m2020-09-10 16:26:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_gru_v001_n32000/wikigaz_en_ft_ocr_gru_v001_n32000.model[0m
[92m2020-09-10 16:26:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 16:26:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 132.1225
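
The run above stopped at epoch 6 and kept the epoch-5 checkpoint, because the validation loss rose from 0.268 to 0.282. The sketch below illustrates that selection logic; it is not DeezyMatch's implementation. The `early_stop` helper and its `patience` parameter are hypothetical names, and the patience of 1 is inferred from the log rather than read from the input file. The losses are those logged for epochs 2 to 6.

```python
def early_stop(valid_losses, patience=1):
    """Return (stop_epoch, best_epoch) for a dict {epoch: validation loss}.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs; the epoch with the lowest loss is selected.
    """
    best_epoch, best_loss, bad_epochs = None, float("inf"), 0
    for epoch in sorted(valid_losses):
        loss = valid_losses[epoch]
        if loss < best_loss:
            best_epoch, best_loss, bad_epochs = epoch, loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    return None, best_epoch  # ran out of epochs without stopping early


# Validation losses of epochs 2-6, copied from the log above.
losses = {2: 0.300, 3: 0.277, 4: 0.271, 5: 0.268, 6: 0.282}
stop_epoch, best_epoch = early_stop(losses, patience=1)
print(stop_epoch, best_epoch)
```

A larger `patience` would let training continue past a single bad epoch, at the cost of extra epochs when the loss never recovers.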


In [22]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 64000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_A_no_early_stopping.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 17:24:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_A_no_early_stopping.yaml[0m
[92m2020-09-10 17:24:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 17:24:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 17:24:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 17:24:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 17:24:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.06342124938964844[0m
[92m2020-09-10 17:24:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train    64000
val      20603
Name: split, dtype: int64[0m

[92m2020-09-10 17:24:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 17:24:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 17:24:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 17:24:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m



HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
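
In this fine-tuning run only `fc1` and `fc2` are marked `True` in the parameter list, so every other layer stays frozen. The sketch below recomputes the per-layer counts from the weight shapes printed in the summary and derives the trainable share. The `shapes` and `trainable` names are illustrative and not DeezyMatch objects.

```python
from math import prod

# Weight shapes copied from the model summary above.
rnn_shapes = (
    # layer 0, forward and reverse: weight_ih, weight_hh, bias_ih, bias_hh
    [(180, 60), (180, 60), (180,), (180,)] * 2
    # layer 1 reads the 120-dim bidirectional output of layer 0
    + [(180, 120), (180, 60), (180,), (180,)] * 2
)
shapes = {
    "emb": [(7542, 60)],
    "rnn_1": rnn_shapes,
    "attn_step1": [(60, 120), (60,)],
    "attn_step2": [(1, 60), (1,)],
    "fc1": [(120, 960), (120,)],
    "fc2": [(2, 120), (2,)],
}
# Flags as listed above for this run: only fc1 and fc2 are updated.
trainable = {"fc1", "fc2"}

counts = {name: sum(prod(s) for s in ws) for name, ws in shapes.items()}
total = sum(counts.values())
n_trainable = sum(n for name, n in counts.items() if name in trainable)

print(counts["rnn_1"], total, n_trainable)
print(f"trainable share: {n_trainable / total:.1%}")
```

The totals match the summary (684843 parameters overall, 109440 in `rnn_1`), and roughly one sixth of the weights are updated during fine-tuning.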


[92m2020-09-10 17:24:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:24:55 -- Epoch: 1/10; Train; loss: 0.412; acc: 0.818; precision: 0.810, recall: 0.832, mac

HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 17:25:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:25:01 -- Epoch: 1/10; Valid; loss: 0.289; acc: 0.883; precision: 0.868, recall: 0.903, macrof1: 0.883, weightedf1: 0.883[0m
[92m2020-09-10 17:25:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 17:25:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:25:26 -- Epoch: 2/10; Train; loss: 0.253; acc: 0.899; precision: 0.889, recall: 0.912, macrof1: 0.899, weightedf1: 0.899[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 17:25:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:25:33 -- Epoch: 2/10; Valid; loss: 0.245; acc: 0.902; precision: 0.887, recall: 0.921, macrof1: 0.902, weightedf1: 0.902[0m
[92m2020-09-10 17:25:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 17:25:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:25:58 -- Epoch: 3/10; Train; loss: 0.201; acc: 0.921; precision: 0.913, recall: 0.931, macrof1: 0.921, weightedf1: 0.921[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 17:26:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:26:04 -- Epoch: 3/10; Valid; loss: 0.231; acc: 0.909; precision: 0.888, recall: 0.937, macrof1: 0.909, weightedf1: 0.909[0m
[92m2020-09-10 17:26:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 17:26:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:26:30 -- Epoch: 4/10; Train; loss: 0.164; acc: 0.937; precision: 0.929, recall: 0.947, macrof1: 0.937, weightedf1: 0.937[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 17:26:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:26:36 -- Epoch: 4/10; Valid; loss: 0.224; acc: 0.914; precision: 0.888, recall: 0.947, macrof1: 0.914, weightedf1: 0.914[0m
[92m2020-09-10 17:26:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 17:27:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:27:02 -- Epoch: 5/10; Train; loss: 0.137; acc: 0.948; precision: 0.942, recall: 0.955, macrof1: 0.948, weightedf1: 0.948[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 17:27:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:27:08 -- Epoch: 5/10; Valid; loss: 0.226; acc: 0.914; precision: 0.904, recall: 0.927, macrof1: 0.914, weightedf1: 0.914[0m
[92m2020-09-10 17:27:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 17:27:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:27:33 -- Epoch: 6/10; Train; loss: 0.113; acc: 0.958; precision: 0.952, recall: 0.965, macrof1: 0.958, weightedf1: 0.958[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 17:27:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:27:39 -- Epoch: 6/10; Valid; loss: 0.229; acc: 0.916; precision: 0.907, recall: 0.928, macrof1: 0.916, weightedf1: 0.916[0m
[92m2020-09-10 17:27:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 17:28:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:28:05 -- Epoch: 7/10; Train; loss: 0.094; acc: 0.967; precision: 0.962, recall: 0.972, macrof1: 0.967, weightedf1: 0.967[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 17:28:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:28:11 -- Epoch: 7/10; Valid; loss: 0.246; acc: 0.914; precision: 0.912, recall: 0.917, macrof1: 0.914, weightedf1: 0.914[0m
[92m2020-09-10 17:28:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 17:28:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:28:37 -- Epoch: 8/10; Train; loss: 0.078; acc: 0.973; precision: 0.968, recall: 0.978, macrof1: 0.973, weightedf1: 0.973[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 17:28:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:28:43 -- Epoch: 8/10; Valid; loss: 0.258; acc: 0.914; precision: 0.915, recall: 0.914, macrof1: 0.914, weightedf1: 0.914[0m
[92m2020-09-10 17:28:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 17:29:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:29:08 -- Epoch: 9/10; Train; loss: 0.066; acc: 0.977; precision: 0.973, recall: 0.982, macrof1: 0.977, weightedf1: 0.977[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 17:29:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:29:15 -- Epoch: 9/10; Valid; loss: 0.282; acc: 0.915; precision: 0.893, recall: 0.942, macrof1: 0.915, weightedf1: 0.915[0m
[92m2020-09-10 17:29:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 17:29:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:29:42 -- Epoch: 10/10; Train; loss: 0.056; acc: 0.981; precision: 0.978, recall: 0.984, macrof1: 0.981, weightedf1: 0.981[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 17:29:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:29:49 -- Epoch: 10/10; Valid; loss: 0.292; acc: 0.915; precision: 0.904, recall: 0.929, macrof1: 0.915, weightedf1: 0.915[0m
[92m2020-09-10 17:29:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m

[92m2020-09-10 17:29:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_gru_v001_n64000/wikigaz_en_ft_ocr_gru_v001_n64000.model[0m



User time: 320.0446
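
With early stopping disabled, all 10 epochs run and the checkpoint with the lowest validation loss is kept (checkpoint 4 in the run above). The sketch below shows one way to recover that choice from a saved log. The regular expression and the `best_checkpoint` helper are assumptions written for the log format shown above, with the colour codes already stripped; they are not part of DeezyMatch.

```python
import re

# Matches e.g. "Epoch: 4/10; Valid; loss: 0.224; acc: 0.914; ..."
VALID_RE = re.compile(r"Epoch: (\d+)/\d+; Valid; loss: ([0-9.]+);")


def best_checkpoint(log_text):
    """Return (epoch, loss) of the lowest validation loss found in a log."""
    records = [(int(e), float(l)) for e, l in VALID_RE.findall(log_text)]
    # min() keeps the earliest epoch when two losses tie.
    return min(records, key=lambda r: r[1])


# A few lines from the run above, shortened to the fields the regex needs.
log_text = """
Epoch: 3/10; Valid; loss: 0.231; acc: 0.909;
Epoch: 4/10; Train; loss: 0.164; acc: 0.937;
Epoch: 4/10; Valid; loss: 0.224; acc: 0.914;
Epoch: 5/10; Valid; loss: 0.226; acc: 0.914;
"""
print(best_checkpoint(log_text))
```

Training lines are skipped because the pattern requires the `Valid;` marker, so the lower training losses cannot be picked by mistake.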


In [21]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 84000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_A_no_early_stopping.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 17:18:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_A_no_early_stopping.yaml[0m
[92m2020-09-10 17:18:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 17:18:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 17:18:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 17:18:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 17:18:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.050637245178222656[0m
[92m2020-09-10 17:18:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train    84000
val        603
Name: split, dtype: int64[0m

[92m2020-09-10 17:18:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 17:18:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 17:18:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 17:18:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m

HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 17:19:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:19:10 -- Epoch: 1/10; Train; loss: 0.384; acc: 0.834; precision: 0.827, recall: 0.845, mac

HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 17:19:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:19:10 -- Epoch: 1/10; Valid; loss: 0.252; acc: 0.910; precision: 0.933, recall: 0.884, macrof1: 0.910, weightedf1: 0.910[0m
[92m2020-09-10 17:19:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 17:19:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:19:46 -- Epoch: 2/10; Train; loss: 0.235; acc: 0.906; precision: 0.896, recall: 0.919, macrof1: 0.906, weightedf1: 0.906[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 17:19:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:19:46 -- Epoch: 2/10; Valid; loss: 0.191; acc: 0.924; precision: 0.913, recall: 0.937, macrof1: 0.924, weightedf1: 0.924[0m
[92m2020-09-10 17:19:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 17:20:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:20:23 -- Epoch: 3/10; Train; loss: 0.186; acc: 0.928; precision: 0.920, recall: 0.938, macrof1: 0.928, weightedf1: 0.928[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 17:20:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:20:23 -- Epoch: 3/10; Valid; loss: 0.183; acc: 0.932; precision: 0.933, recall: 0.930, macrof1: 0.932, weightedf1: 0.932[0m
[92m2020-09-10 17:20:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 17:20:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:20:59 -- Epoch: 4/10; Train; loss: 0.154; acc: 0.941; precision: 0.933, recall: 0.951, macrof1: 0.941, weightedf1: 0.941[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 17:20:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:20:59 -- Epoch: 4/10; Valid; loss: 0.195; acc: 0.930; precision: 0.954, recall: 0.904, macrof1: 0.930, weightedf1: 0.930[0m
[92m2020-09-10 17:20:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 17:21:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:21:36 -- Epoch: 5/10; Train; loss: 0.130; acc: 0.951; precision: 0.944, recall: 0.959, macrof1: 0.951, weightedf1: 0.951[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 17:21:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:21:36 -- Epoch: 5/10; Valid; loss: 0.185; acc: 0.925; precision: 0.913, recall: 0.940, macrof1: 0.925, weightedf1: 0.925[0m
[92m2020-09-10 17:21:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 17:22:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:22:10 -- Epoch: 6/10; Train; loss: 0.109; acc: 0.959; precision: 0.953, recall: 0.966, macrof1: 0.959, weightedf1: 0.959[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 17:22:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:22:10 -- Epoch: 6/10; Valid; loss: 0.207; acc: 0.937; precision: 0.928, recall: 0.947, macrof1: 0.937, weightedf1: 0.937[0m
[92m2020-09-10 17:22:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 17:22:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:22:44 -- Epoch: 7/10; Train; loss: 0.093; acc: 0.966; precision: 0.961, recall: 0.972, macrof1: 0.966, weightedf1: 0.966[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 17:22:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:22:44 -- Epoch: 7/10; Valid; loss: 0.206; acc: 0.925; precision: 0.927, recall: 0.924, macrof1: 0.925, weightedf1: 0.925[0m
[92m2020-09-10 17:22:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 17:23:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:23:18 -- Epoch: 8/10; Train; loss: 0.079; acc: 0.971; precision: 0.966, recall: 0.976, macrof1: 0.971, weightedf1: 0.971[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 17:23:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:23:18 -- Epoch: 8/10; Valid; loss: 0.210; acc: 0.929; precision: 0.901, recall: 0.963, macrof1: 0.929, weightedf1: 0.929[0m
[92m2020-09-10 17:23:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 17:23:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:23:51 -- Epoch: 9/10; Train; loss: 0.069; acc: 0.976; precision: 0.972, recall: 0.980, macrof1: 0.975, weightedf1: 0.975[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 17:23:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:23:52 -- Epoch: 9/10; Valid; loss: 0.216; acc: 0.929; precision: 0.919, recall: 0.940, macrof1: 0.929, weightedf1: 0.929[0m
[92m2020-09-10 17:23:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 17:24:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:24:25 -- Epoch: 10/10; Train; loss: 0.059; acc: 0.980; precision: 0.976, recall: 0.983, macrof1: 0.980, weightedf1: 0.980[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 17:24:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:24:25 -- Epoch: 10/10; Valid; loss: 0.232; acc: 0.937; precision: 0.946, recall: 0.927, macrof1: 0.937, weightedf1: 0.937[0m
[92m2020-09-10 17:24:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m

[92m2020-09-10 17:24:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_gru_v001_n84000/wikigaz_en_ft_ocr_gru_v001_n84000.model[0m



User time: 351.9607
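
With `n_train_examples=84000`, only 603 of the 84603 pairs remain for validation, so the validation metrics of this run rest on 10 batches per epoch and fluctuate more than in the smaller runs. The progress-bar maxima in the outputs above (1000 and 322 for the 64000-example run, 1313 and 10 here) are consistent with a batch size of 64. That value is inferred from the logs, not read from the input file. A quick check under that assumption:

```python
import math

BATCH_SIZE = 64  # assumption: inferred from the progress bars, not from the YAML file
TOTAL_PAIRS = 42301 + 42302  # True + False labels reported in the log


def batches(n_examples, batch_size=BATCH_SIZE):
    """Number of batches per epoch when the last partial batch is kept."""
    return math.ceil(n_examples / batch_size)


for n_train in (64000, 84000):
    n_val = TOTAL_PAIRS - n_train
    print(n_train, batches(n_train), n_val, batches(n_val))
```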


## Fine-Tune, model A, LSTM

In [25]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 250

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 17:31:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml[0m
[92m2020-09-10 17:31:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 17:31:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 17:31:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 17:31:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 17:31:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.04691624641418457[0m
[92m2020-09-10 17:31:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58971
val             25380
train             250
test                2
Name: split, dtype: int64[0m

[92m2020-09-10 17:31:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 17:31:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 17:31:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 17:31:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
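
The LSTM variant has 721323 parameters against 684843 for the GRU, and the whole difference sits in `rnn_1` (145920 versus 109440). This follows from the gate count: a PyTorch LSTM stacks four gate blocks in each weight matrix and a GRU three, which is why the shapes above start with 240 rather than 180. A short sketch of the count, where `rnn_params` is a hypothetical helper:

```python
def rnn_params(n_gates, input_size, hidden_size, num_layers=2, bidirectional=True):
    """Parameter count of a stacked GRU (n_gates=3) or LSTM (n_gates=4)."""
    directions = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        # Layers after the first read the concatenated directions of the previous one.
        in_size = input_size if layer == 0 else hidden_size * directions
        # weight_ih + weight_hh + bias_ih + bias_hh for one direction
        per_direction = n_gates * hidden_size * (in_size + hidden_size + 2)
        total += directions * per_direction
    return total


gru = rnn_params(3, input_size=60, hidden_size=60)
lstm = rnn_params(4, input_size=60, hidden_size=60)
print(gru, lstm, lstm - gru)
```

The difference between the two recurrent blocks equals the difference between the two reported totals, since the embedding, attention and fully connected layers have identical shapes in both models.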


[92m2020-09-10 17:31:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:31:53 -- Epoch: 1/20; Train; loss: 1.620; acc: 0.448; precision: 0.446, recall: 0.432, ma

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:32:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:32:02 -- Epoch: 1/20; Valid; loss: 1.699; acc: 0.472; precision: 0.474, recall: 0.509, macrof1: 0.471, weightedf1: 0.471[0m
[92m2020-09-10 17:32:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 17:32:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:32:02 -- Epoch: 2/20; Train; loss: 1.309; acc: 0.492; precision: 0.492, recall: 0.472, macrof1: 0.492, weightedf1: 0.492[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:32:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:32:13 -- Epoch: 2/20; Valid; loss: 1.589; acc: 0.487; precision: 0.488, recall: 0.526, macrof1: 0.486, weightedf1: 0.486[0m
[92m2020-09-10 17:32:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 17:32:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:32:13 -- Epoch: 3/20; Train; loss: 1.089; acc: 0.524; precision: 0.525, recall: 0.504, macrof1: 0.524, weightedf1: 0.524[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:32:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:32:23 -- Epoch: 3/20; Valid; loss: 1.493; acc: 0.503; precision: 0.503, recall: 0.549, macrof1: 0.502, weightedf1: 0.502[0m
[92m2020-09-10 17:32:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 17:32:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:32:23 -- Epoch: 4/20; Train; loss: 0.899; acc: 0.620; precision: 0.625, recall: 0.600, macrof1: 0.620, weightedf1: 0.620[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:32:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:32:33 -- Epoch: 4/20; Valid; loss: 1.411; acc: 0.516; precision: 0.515, recall: 0.566, macrof1: 0.515, weightedf1: 0.515[0m
[92m2020-09-10 17:32:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 17:32:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:32:33 -- Epoch: 5/20; Train; loss: 0.749; acc: 0.684; precision: 0.689, recall: 0.672, macrof1: 0.684, weightedf1: 0.684[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:32:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:32:42 -- Epoch: 5/20; Valid; loss: 1.343; acc: 0.526; precision: 0.523, recall: 0.578, macrof1: 0.524, weightedf1: 0.524[0m
[92m2020-09-10 17:32:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 17:32:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:32:43 -- Epoch: 6/20; Train; loss: 0.635; acc: 0.732; precision: 0.734, recall: 0.728, macrof1: 0.732, weightedf1: 0.732[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:32:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:32:51 -- Epoch: 6/20; Valid; loss: 1.286; acc: 0.536; precision: 0.532, recall: 0.600, macrof1: 0.535, weightedf1: 0.535[0m
[92m2020-09-10 17:32:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 17:32:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:32:51 -- Epoch: 7/20; Train; loss: 0.534; acc: 0.772; precision: 0.766, recall: 0.784, macrof1: 0.772, weightedf1: 0.772[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:33:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:33:00 -- Epoch: 7/20; Valid; loss: 1.241; acc: 0.546; precision: 0.540, recall: 0.616, macrof1: 0.543, weightedf1: 0.543[0m
[92m2020-09-10 17:33:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


2020-09-10 17:33:00 lwm-embeddings [INFO] 09/10/2020_17:33:00 -- Epoch: 8/20; Train; loss: 0.463; acc: 0.804; precision: 0.802, recall: 0.808, macrof1: 0.804, weightedf1: 0.804
2020-09-10 17:33:08 lwm-embeddings [INFO] 09/10/2020_17:33:08 -- Epoch: 8/20; Valid; loss: 1.201; acc: 0.553; precision: 0.546, recall: 0.627, macrof1: 0.551, weightedf1: 0.551
2020-09-10 17:33:08 lwm-embeddings [INFO] saving the model
2020-09-10 17:33:09 lwm-embeddings [INFO] 09/10/2020_17:33:09 -- Epoch: 9/20; Train; loss: 0.400; acc: 0.860; precision: 0.846, recall: 0.880, macrof1: 0.860, weightedf1: 0.860
2020-09-10 17:33:17 lwm-embeddings [INFO] 09/10/2020_17:33:17 -- Epoch: 9/20; Valid; loss: 1.168; acc: 0.562; precision: 0.554, recall: 0.638, macrof1: 0.559, weightedf1: 0.559
2020-09-10 17:33:17 lwm-embeddings [INFO] saving the model
2020-09-10 17:33:17 lwm-embeddings [INFO] 09/10/2020_17:33:17 -- Epoch: 10/20; Train; loss: 0.348; acc: 0.884; precision: 0.858, recall: 0.920, macrof1: 0.884, weightedf1: 0.884
2020-09-10 17:33:26 lwm-embeddings [INFO] 09/10/2020_17:33:26 -- Epoch: 10/20; Valid; loss: 1.141; acc: 0.569; precision: 0.559, recall: 0.646, macrof1: 0.566, weightedf1: 0.566
2020-09-10 17:33:26 lwm-embeddings [INFO] saving the model
2020-09-10 17:33:26 lwm-embeddings [INFO] 09/10/2020_17:33:26 -- Epoch: 11/20; Train; loss: 0.308; acc: 0.896; precision: 0.878, recall: 0.920, macrof1: 0.896, weightedf1: 0.896
2020-09-10 17:33:35 lwm-embeddings [INFO] 09/10/2020_17:33:35 -- Epoch: 11/20; Valid; loss: 1.119; acc: 0.574; precision: 0.564, recall: 0.652, macrof1: 0.572, weightedf1: 0.572
2020-09-10 17:33:35 lwm-embeddings [INFO] saving the model
2020-09-10 17:33:35 lwm-embeddings [INFO] 09/10/2020_17:33:35 -- Epoch: 12/20; Train; loss: 0.275; acc: 0.916; precision: 0.894, recall: 0.944, macrof1: 0.916, weightedf1: 0.916
2020-09-10 17:33:43 lwm-embeddings [INFO] 09/10/2020_17:33:43 -- Epoch: 12/20; Valid; loss: 1.100; acc: 0.579; precision: 0.569, recall: 0.654, macrof1: 0.577, weightedf1: 0.577
2020-09-10 17:33:43 lwm-embeddings [INFO] saving the model
2020-09-10 17:33:43 lwm-embeddings [INFO] 09/10/2020_17:33:43 -- Epoch: 13/20; Train; loss: 0.245; acc: 0.928; precision: 0.908, recall: 0.952, macrof1: 0.928, weightedf1: 0.928
2020-09-10 17:33:52 lwm-embeddings [INFO] 09/10/2020_17:33:52 -- Epoch: 13/20; Valid; loss: 1.086; acc: 0.584; precision: 0.573, recall: 0.658, macrof1: 0.581, weightedf1: 0.581
2020-09-10 17:33:52 lwm-embeddings [INFO] saving the model
2020-09-10 17:33:52 lwm-embeddings [INFO] 09/10/2020_17:33:52 -- Epoch: 14/20; Train; loss: 0.212; acc: 0.940; precision: 0.923, recall: 0.960, macrof1: 0.940, weightedf1: 0.940
2020-09-10 17:34:01 lwm-embeddings [INFO] 09/10/2020_17:34:01 -- Epoch: 14/20; Valid; loss: 1.074; acc: 0.587; precision: 0.576, recall: 0.664, macrof1: 0.585, weightedf1: 0.585
2020-09-10 17:34:01 lwm-embeddings [INFO] saving the model
2020-09-10 17:34:01 lwm-embeddings [INFO] 09/10/2020_17:34:01 -- Epoch: 15/20; Train; loss: 0.191; acc: 0.960; precision: 0.939, recall: 0.984, macrof1: 0.960, weightedf1: 0.960
2020-09-10 17:34:09 lwm-embeddings [INFO] 09/10/2020_17:34:09 -- Epoch: 15/20; Valid; loss: 1.067; acc: 0.592; precision: 0.579, recall: 0.670, macrof1: 0.589, weightedf1: 0.589
2020-09-10 17:34:09 lwm-embeddings [INFO] saving the model
2020-09-10 17:34:10 lwm-embeddings [INFO] 09/10/2020_17:34:10 -- Epoch: 16/20; Train; loss: 0.171; acc: 0.968; precision: 0.953, recall: 0.984, macrof1: 0.968, weightedf1: 0.968
2020-09-10 17:34:18 lwm-embeddings [INFO] 09/10/2020_17:34:18 -- Epoch: 16/20; Valid; loss: 1.060; acc: 0.596; precision: 0.583, recall: 0.674, macrof1: 0.593, weightedf1: 0.593
2020-09-10 17:34:18 lwm-embeddings [INFO] saving the model
2020-09-10 17:34:18 lwm-embeddings [INFO] 09/10/2020_17:34:18 -- Epoch: 17/20; Train; loss: 0.152; acc: 0.972; precision: 0.961, recall: 0.984, macrof1: 0.972, weightedf1: 0.972
2020-09-10 17:34:27 lwm-embeddings [INFO] 09/10/2020_17:34:27 -- Epoch: 17/20; Valid; loss: 1.055; acc: 0.598; precision: 0.585, recall: 0.674, macrof1: 0.596, weightedf1: 0.596
2020-09-10 17:34:27 lwm-embeddings [INFO] saving the model
2020-09-10 17:34:27 lwm-embeddings [INFO] 09/10/2020_17:34:27 -- Epoch: 18/20; Train; loss: 0.141; acc: 0.976; precision: 0.961, recall: 0.992, macrof1: 0.976, weightedf1: 0.976
2020-09-10 17:34:36 lwm-embeddings [INFO] 09/10/2020_17:34:36 -- Epoch: 18/20; Valid; loss: 1.051; acc: 0.601; precision: 0.588, recall: 0.674, macrof1: 0.599, weightedf1: 0.599
2020-09-10 17:34:36 lwm-embeddings [INFO] saving the model
2020-09-10 17:34:36 lwm-embeddings [INFO] 09/10/2020_17:34:36 -- Epoch: 19/20; Train; loss: 0.129; acc: 0.980; precision: 0.969, recall: 0.992, macrof1: 0.980, weightedf1: 0.980
2020-09-10 17:34:44 lwm-embeddings [INFO] 09/10/2020_17:34:44 -- Epoch: 19/20; Valid; loss: 1.047; acc: 0.604; precision: 0.591, recall: 0.676, macrof1: 0.602, weightedf1: 0.602
2020-09-10 17:34:44 lwm-embeddings [INFO] saving the model
2020-09-10 17:34:45 lwm-embeddings [INFO] 09/10/2020_17:34:45 -- Epoch: 20/20; Train; loss: 0.116; acc: 0.988; precision: 0.984, recall: 0.992, macrof1: 0.988, weightedf1: 0.988
2020-09-10 17:34:54 lwm-embeddings [INFO] 09/10/2020_17:34:54 -- Epoch: 20/20; Valid; loss: 1.046; acc: 0.607; precision: 0.594, recall: 0.677, macrof1: 0.605, weightedf1: 0.605
2020-09-10 17:34:54 lwm-embeddings [INFO] saving the model
2020-09-10 17:34:54 lwm-embeddings [INFO] saving the model with least valid loss (checkpoint: 20) at ./models/wikigaz_en_ft_ocr_lstm_v001_n250/wikigaz_en_ft_ocr_lstm_v001_n250.model



User time: 180.7035
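
The per-epoch lines above follow a fixed pattern, so they can be turned into records, for example to check which epoch had the lowest validation loss. The following is a minimal sketch and not part of DeezyMatch: the helper names `strip_ansi` and `parse_epoch_line` are hypothetical, and the regular expressions assume the line layout shown in this output.

```python
import re

# ANSI colour codes, with or without the leading ESC byte (some exports
# keep only the "[92m"-style remainder).
ANSI_RE = re.compile(r"\x1b?\[[0-9;]*m")

# Assumed layout: "... -- Epoch: 19/20; Valid; loss: 1.047; acc: 0.604; ..."
EPOCH_RE = re.compile(
    r"Epoch: (?P<epoch>\d+)/\d+; (?P<phase>Train|Valid); "
    r"loss: (?P<loss>\d+\.\d+); acc: (?P<acc>\d+\.\d+)"
)

def strip_ansi(line):
    """Remove colour codes so the remaining text can be matched."""
    return ANSI_RE.sub("", line)

def parse_epoch_line(line):
    """Return epoch, phase, loss and accuracy for a metrics line, else None."""
    match = EPOCH_RE.search(strip_ansi(line))
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "phase": match["phase"],
        "loss": float(match["loss"]),
        "acc": float(match["acc"]),
    }

# Message parts of the last three validation lines of the run above,
# plus one line that carries no metrics.
log_lines = [
    "09/10/2020_17:34:36 -- Epoch: 18/20; Valid; loss: 1.051; acc: 0.601; precision: 0.588, recall: 0.674, macrof1: 0.599, weightedf1: 0.599",
    "09/10/2020_17:34:44 -- Epoch: 19/20; Valid; loss: 1.047; acc: 0.604; precision: 0.591, recall: 0.676, macrof1: 0.602, weightedf1: 0.602",
    "09/10/2020_17:34:54 -- Epoch: 20/20; Valid; loss: 1.046; acc: 0.607; precision: 0.594, recall: 0.677, macrof1: 0.605, weightedf1: 0.605",
    "saving the model",
]
records = [r for r in map(parse_epoch_line, log_lines) if r is not None]
best = min((r for r in records if r["phase"] == "Valid"), key=lambda r: r["loss"])
print(best)
```

For these three lines the minimum falls on epoch 20, in line with the checkpoint named in the final log line of the run.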


In [26]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 500  # number of OCR training pairs used for fine-tuning

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 17:34:54 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_lstm_model_A.yaml
2020-09-10 17:34:54 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 17:34:54 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 17:34:54 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 17:34:54 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 17:34:54 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.04954886436462402
2020-09-10 17:34:54 lwm-embeddings [INFO] splits are as follow:
not_assigned    58721
val             25380
train             500
test                2
Name: split, dtype: int64
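
The split sizes reported above can be checked against the label counts: the four groups add up to the number of pairs in the file, and the validation set is close to 30% of them. A minimal check using only numbers printed in this output (the variable names are illustrative; reading `not_assigned` as the pairs left unused by this run is an assumption):

```python
# Values copied from the log output above.
n_true, n_false = 42301, 42302
splits = {"not_assigned": 58721, "val": 25380, "train": 500, "test": 2}

total = n_true + n_false
# The group sizes account for every labelled pair in the file.
assert sum(splits.values()) == total

val_fraction = splits["val"] / total
print(f"total pairs: {total}")
print(f"validation fraction: {val_fraction:.3f}")
print(f"train pairs used for fine-tuning: {splits['train']}")
```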
2020-09-10 17:34:56 lwm-embeddings [INFO] skipping 0 lines

List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True
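
The listing above pairs each parameter name with a flag. Given the description of model A, the flag appears to say whether the parameter is updated during fine-tuning (in PyTorch terms, its `requires_grad` value; that reading is an assumption). A short sketch that groups a sample of the rows by layer, using only the standard library:

```python
from collections import defaultdict

# A sample of rows copied from the parameter listing above (name, flag).
listing = """\
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True
"""

flags_by_layer = defaultdict(set)
for row in listing.splitlines():
    name, flag = row.split()
    layer = name.split(".")[0]  # "rnn_1.weight_ih_l0" -> "rnn_1"
    flags_by_layer[layer].add(flag == "True")

# A layer counts as frozen only if all of its listed parameters are False.
frozen = sorted(layer for layer, flags in flags_by_layer.items() if flags == {False})
trainable = sorted(layer for layer, flags in flags_by_layer.items() if flags == {True})
print("frozen:", frozen)
print("trainable:", trainable)
```

In this run only the two fully connected layers remain trainable; the embedding, recurrent and attention layers are all listed as `False`.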



2020-09-10 17:34:56 lwm-embeddings [INFO] ******************************
2020-09-10 17:34:56 lwm-embeddings [INFO] **** (Bi-directional) LSTM ****
2020-09-10 17:34:56 lwm-embeddings [INFO] ******************************

Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
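
The per-layer counts in the summary above follow directly from the printed weight shapes. The sketch below recomputes them and the total, and also reports how many parameters sit in `fc1` and `fc2`, the only layers flagged `True` in the parameter listing. `count_params` is a hypothetical helper, not a DeezyMatch function.

```python
from math import prod

def count_params(shapes):
    """Sum the number of entries over a list of weight shapes."""
    return sum(prod(shape) for shape in shapes)

# Weight shapes copied from the model summary above.
lstm_layer0 = [(240, 60), (240, 60), (240,), (240,)]   # one direction
lstm_layer1 = [(240, 120), (240, 60), (240,), (240,)]  # one direction
layers = {
    "emb": [(7542, 60)],
    "rnn_1": 2 * lstm_layer0 + 2 * lstm_layer1,  # forward and reverse
    "attn_step1": [(60, 120), (60,)],
    "attn_step2": [(1, 60), (1,)],
    "fc1": [(120, 960), (120,)],
    "fc2": [(2, 120), (2,)],
}

counts = {name: count_params(shapes) for name, shapes in layers.items()}
total = sum(counts.values())
trainable = counts["fc1"] + counts["fc2"]
print(counts)
print(f"total: {total}, in fc1 + fc2: {trainable} ({trainable / total:.1%})")
```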


2020-09-10 17:34:57 lwm-embeddings [INFO] 09/10/2020_17:34:57 -- Epoch: 1/20; Train; loss: 1.562; acc: 0.460; precision: 0.458, recall: 0.440, ma
2020-09-10 17:35:05 lwm-embeddings [INFO] 09/10/2020_17:35:05 -- Epoch: 1/20; Valid; loss: 1.557; acc: 0.492; precision: 0.493, recall: 0.543, macrof1: 0.491, weightedf1: 0.491
2020-09-10 17:35:05 lwm-embeddings [INFO] saving the model
2020-09-10 17:35:06 lwm-embeddings [INFO] 09/10/2020_17:35:06 -- Epoch: 2/20; Train; loss: 1.176; acc: 0.522; precision: 0.522, recall: 0.524, macrof1: 0.522, weightedf1: 0.522
2020-09-10 17:35:14 lwm-embeddings [INFO] 09/10/2020_17:35:14 -- Epoch: 2/20; Valid; loss: 1.362; acc: 0.522; precision: 0.520, recall: 0.563, macrof1: 0.521, weightedf1: 0.521
2020-09-10 17:35:14 lwm-embeddings [INFO] saving the model
2020-09-10 17:35:14 lwm-embeddings [INFO] 09/10/2020_17:35:14 -- Epoch: 3/20; Train; loss: 0.929; acc: 0.606; precision: 0.602, recall: 0.628, macrof1: 0.606, weightedf1: 0.606
2020-09-10 17:35:23 lwm-embeddings [INFO] 09/10/2020_17:35:23 -- Epoch: 3/20; Valid; loss: 1.222; acc: 0.548; precision: 0.542, recall: 0.615, macrof1: 0.546, weightedf1: 0.546
2020-09-10 17:35:23 lwm-embeddings [INFO] saving the model
2020-09-10 17:35:23 lwm-embeddings [INFO] 09/10/2020_17:35:23 -- Epoch: 4/20; Train; loss: 0.752; acc: 0.660; precision: 0.652, recall: 0.688, macrof1: 0.660, weightedf1: 0.660
2020-09-10 17:35:32 lwm-embeddings [INFO] 09/10/2020_17:35:32 -- Epoch: 4/20; Valid; loss: 1.114; acc: 0.567; precision: 0.560, recall: 0.629, macrof1: 0.565, weightedf1: 0.565
2020-09-10 17:35:32 lwm-embeddings [INFO] saving the model
2020-09-10 17:35:32 lwm-embeddings [INFO] 09/10/2020_17:35:32 -- Epoch: 5/20; Train; loss: 0.628; acc: 0.708; precision: 0.695, recall: 0.740, macrof1: 0.708, weightedf1: 0.708
2020-09-10 17:35:41 lwm-embeddings [INFO] 09/10/2020_17:35:41 -- Epoch: 5/20; Valid; loss: 1.036; acc: 0.583; precision: 0.574, recall: 0.645, macrof1: 0.582, weightedf1: 0.582
2020-09-10 17:35:41 lwm-embeddings [INFO] saving the model
2020-09-10 17:35:41 lwm-embeddings [INFO] 09/10/2020_17:35:41 -- Epoch: 6/20; Train; loss: 0.542; acc: 0.760; precision: 0.748, recall: 0.784, macrof1: 0.760, weightedf1: 0.760
2020-09-10 17:35:50 lwm-embeddings [INFO] 09/10/2020_17:35:50 -- Epoch: 6/20; Valid; loss: 0.978; acc: 0.596; precision: 0.587, recall: 0.653, macrof1: 0.595, weightedf1: 0.595
2020-09-10 17:35:50 lwm-embeddings [INFO] saving the model
2020-09-10 17:35:50 lwm-embeddings [INFO] 09/10/2020_17:35:50 -- Epoch: 7/20; Train; loss: 0.471; acc: 0.788; precision: 0.775, recall: 0.812, macrof1: 0.788, weightedf1: 0.788
2020-09-10 17:35:58 lwm-embeddings [INFO] 09/10/2020_17:35:58 -- Epoch: 7/20; Valid; loss: 0.935; acc: 0.609; precision: 0.598, recall: 0.666, macrof1: 0.608, weightedf1: 0.608
2020-09-10 17:35:58 lwm-embeddings [INFO] saving the model
2020-09-10 17:35:59 lwm-embeddings [INFO] 09/10/2020_17:35:59 -- Epoch: 8/20; Train; loss: 0.413; acc: 0.830; precision: 0.821, recall: 0.844, macrof1: 0.830, weightedf1: 0.830
2020-09-10 17:36:07 lwm-embeddings [INFO] 09/10/2020_17:36:07 -- Epoch: 8/20; Valid; loss: 0.902; acc: 0.620; precision: 0.608, recall: 0.674, macrof1: 0.619, weightedf1: 0.619
2020-09-10 17:36:07 lwm-embeddings [INFO] saving the model
2020-09-10 17:36:08 lwm-embeddings [INFO] 09/10/2020_17:36:08 -- Epoch: 9/20; Train; loss: 0.361; acc: 0.858; precision: 0.843, recall: 0.880, macrof1: 0.858, weightedf1: 0.858
2020-09-10 17:36:16 lwm-embeddings [INFO] 09/10/2020_17:36:16 -- Epoch: 9/20; Valid; loss: 0.879; acc: 0.629; precision: 0.616, recall: 0.687, macrof1: 0.628, weightedf1: 0.628
2020-09-10 17:36:16 lwm-embeddings [INFO] saving the model
2020-09-10 17:36:16 lwm-embeddings [INFO] 09/10/2020_17:36:16 -- Epoch: 10/20; Train; loss: 0.323; acc: 0.884; precision: 0.861, recall: 0.916, macrof1: 0.884, weightedf1: 0.884
2020-09-10 17:36:25 lwm-embeddings [INFO] 09/10/2020_17:36:25 -- Epoch: 10/20; Valid; loss: 0.858; acc: 0.636; precision: 0.623, recall: 0.689, macrof1: 0.635, weightedf1: 0.635
2020-09-10 17:36:25 lwm-embeddings [INFO] saving the model
2020-09-10 17:36:25 lwm-embeddings [INFO] 09/10/2020_17:36:25 -- Epoch: 11/20; Train; loss: 0.285; acc: 0.904; precision: 0.888, recall: 0.924, macrof1: 0.904, weightedf1: 0.904
2020-09-10 17:36:34 lwm-embeddings [INFO] 09/10/2020_17:36:34 -- Epoch: 11/20; Valid; loss: 0.843; acc: 0.644; precision: 0.631, recall: 0.693, macrof1: 0.643, weightedf1: 0.643
2020-09-10 17:36:34 lwm-embeddings [INFO] saving the model
2020-09-10 17:36:34 lwm-embeddings [INFO] 09/10/2020_17:36:34 -- Epoch: 12/20; Train; loss: 0.254; acc: 0.922; precision: 0.904, recall: 0.944, macrof1: 0.922, weightedf1: 0.922
2020-09-10 17:36:43 lwm-embeddings [INFO] 09/10/2020_17:36:43 -- Epoch: 12/20; Valid; loss: 0.832; acc: 0.652; precision: 0.639, recall: 0.696, macrof1: 0.651, weightedf1: 0.651
2020-09-10 17:36:43 lwm-embeddings [INFO] saving the model
2020-09-10 17:36:43 lwm-embeddings [INFO] 09/10/2020_17:36:43 -- Epoch: 13/20; Train; loss: 0.230; acc: 0.936; precision: 0.926, recall: 0.948, macrof1: 0.936, weightedf1: 0.936
2020-09-10 17:36:51 lwm-embeddings [INFO] 09/10/2020_17:36:51 -- Epoch: 13/20; Valid; loss: 0.820; acc: 0.658; precision: 0.647, recall: 0.697, macrof1: 0.658, weightedf1: 0.658
2020-09-10 17:36:51 lwm-embeddings [INFO] saving the model
2020-09-10 17:36:52 lwm-embeddings [INFO] 09/10/2020_17:36:52 -- Epoch: 14/20; Train; loss: 0.204; acc: 0.946; precision: 0.937, recall: 0.956, macrof1: 0.946, weightedf1: 0.946
2020-09-10 17:37:00 lwm-embeddings [INFO] 09/10/2020_17:37:00 -- Epoch: 14/20; Valid; loss: 0.811; acc: 0.665; precision: 0.653, recall: 0.703, macrof1: 0.664, weightedf1: 0.664
2020-09-10 17:37:00 lwm-embeddings [INFO] saving the model
2020-09-10 17:37:01 lwm-embeddings [INFO] 09/10/2020_17:37:01 -- Epoch: 15/20; Train; loss: 0.184; acc: 0.958; precision: 0.953, recall: 0.964, macrof1: 0.958, weightedf1: 0.958
2020-09-10 17:37:09 lwm-embeddings [INFO] 09/10/2020_17:37:09 -- Epoch: 15/20; Valid; loss: 0.803; acc: 0.672; precision: 0.659, recall: 0.710, macrof1: 0.671, weightedf1: 0.671
2020-09-10 17:37:09 lwm-embeddings [INFO] saving the model
2020-09-10 17:37:09 lwm-embeddings [INFO] 09/10/2020_17:37:09 -- Epoch: 16/20; Train; loss: 0.163; acc: 0.972; precision: 0.961, recall: 0.984, macrof1: 0.972, weightedf1: 0.972
2020-09-10 17:37:18 lwm-embeddings [INFO] 09/10/2020_17:37:18 -- Epoch: 16/20; Valid; loss: 0.796; acc: 0.676; precision: 0.664, recall: 0.714, macrof1: 0.676, weightedf1: 0.676
2020-09-10 17:37:18 lwm-embeddings [INFO] saving the model
2020-09-10 17:37:18 lwm-embeddings [INFO] 09/10/2020_17:37:18 -- Epoch: 17/20; Train; loss: 0.146; acc: 0.980; precision: 0.976, recall: 0.984, macrof1: 0.980, weightedf1: 0.980
2020-09-10 17:37:27 lwm-embeddings [INFO] 09/10/2020_17:37:27 -- Epoch: 17/20; Valid; loss: 0.789; acc: 0.681; precision: 0.670, recall: 0.712, macrof1: 0.681, weightedf1: 0.681
2020-09-10 17:37:27 lwm-embeddings [INFO] saving the model
2020-09-10 17:37:27 lwm-embeddings [INFO] 09/10/2020_17:37:27 -- Epoch: 18/20; Train; loss: 0.130; acc: 0.990; precision: 0.988, recall: 0.992, macrof1: 0.990, weightedf1: 0.990
2020-09-10 17:37:36 lwm-embeddings [INFO] 09/10/2020_17:37:36 -- Epoch: 18/20; Valid; loss: 0.786; acc: 0.685; precision: 0.673, recall: 0.719, macrof1: 0.685, weightedf1: 0.685
2020-09-10 17:37:36 lwm-embeddings [INFO] saving the model
2020-09-10 17:37:36 lwm-embeddings [INFO] 09/10/2020_17:37:36 -- Epoch: 19/20; Train; loss: 0.116; acc: 0.996; precision: 0.992, recall: 1.000, macrof1: 0.996, weightedf1: 0.996
2020-09-10 17:37:45 lwm-embeddings [INFO] 09/10/2020_17:37:45 -- Epoch: 19/20; Valid; loss: 0.781; acc: 0.690; precision: 0.680, recall: 0.716, macrof1: 0.690, weightedf1: 0.690
2020-09-10 17:37:45 lwm-embeddings [INFO] saving the model
2020-09-10 17:37:45 lwm-embeddings [INFO] 09/10/2020_17:37:45 -- Epoch: 20/20; Train; loss: 0.104; acc: 0.996; precision: 0.992, recall: 1.000, macrof1: 0.996, weightedf1: 0.996
2020-09-10 17:37:54 lwm-embeddings [INFO] 09/10/2020_17:37:54 -- Epoch: 20/20; Valid; loss: 0.780; acc: 0.693; precision: 0.682, recall: 0.724, macrof1: 0.693, weightedf1: 0.693
2020-09-10 17:37:54 lwm-embeddings [INFO] saving the model
2020-09-10 17:37:54 lwm-embeddings [INFO] saving the model with least valid loss (checkpoint: 20) at ./models/wikigaz_en_ft_ocr_lstm_v001_n500/wikigaz_en_ft_ocr_lstm_v001_n500.model



User time: 177.3186
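
The fine-tuning cells in this part of the notebook differ only in `n_ft_examples` (250, 500 and 1000 in the runs shown here). As a sketch, the shared arguments can be built by one helper and the runs driven from a loop. `finetune_kwargs` is a hypothetical name introduced for illustration; the paths and argument names are the ones used in the cells.

```python
def finetune_kwargs(n_ft_examples):
    """Build the keyword arguments used by the fine-tuning cells for one sample size."""
    pretrained = "./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001"
    return dict(
        input_file_path="./inputs/input_dfm_lstm_model_A.yaml",
        dataset_path="./dataset/ocr_trainval.txt",
        model_name=f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
        pretrained_model_path=f"{pretrained}.model",
        pretrained_vocab_path=f"{pretrained}.vocab",
        n_train_examples=n_ft_examples,
    )

for n in (250, 500, 1000):
    kwargs = finetune_kwargs(n)
    print(kwargs["model_name"])
    # With DeezyMatch installed, the run itself would be:
    # from DeezyMatch import finetune as dm_finetune
    # dm_finetune(**kwargs)
```

Keeping one cell per run, as the notebook does, has the advantage that each run's output stays next to the call that produced it.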


In [27]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 1000  # number of OCR training pairs used for fine-tuning

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 17:37:54 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_lstm_model_A.yaml
2020-09-10 17:37:54 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 17:37:54 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 17:37:54 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 17:37:54 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 17:37:54 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.054633378982543945
2020-09-10 17:37:54 lwm-embeddings [INFO] splits are as follow:
not_assigned    58221
val             25380
train            1000
test                2
Name: split, dtype: int64
2020-09-10 17:37:56 lwm-embeddings [INFO] skipping 0 lines

List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



2020-09-10 17:37:57 lwm-embeddings [INFO] ******************************
2020-09-10 17:37:57 lwm-embeddings [INFO] **** (Bi-directional) LSTM ****
2020-09-10 17:37:57 lwm-embeddings [INFO] ******************************

Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 17:37:57 lwm-embeddings [INFO] 09/10/2020_17:37:57 -- Epoch: 1/20; Train; loss: 1.514; acc: 0.493; precision: 0.493, recall: 0.514, ma
2020-09-10 17:38:06 lwm-embeddings [INFO] 09/10/2020_17:38:06 -- Epoch: 1/20; Valid; loss: 1.341; acc: 0.520; precision: 0.519, recall: 0.559, macrof1: 0.520, weightedf1: 0.520
2020-09-10 17:38:06 lwm-embeddings [INFO] saving the model
2020-09-10 17:38:07 lwm-embeddings [INFO] 09/10/2020_17:38:07 -- Epoch: 2/20; Train; loss: 1.017; acc: 0.595; precision: 0.592, recall: 0.614, macrof1: 0.595, weightedf1: 0.595
2020-09-10 17:38:15 lwm-embeddings [INFO] 09/10/2020_17:38:15 -- Epoch: 2/20; Valid; loss: 1.063; acc: 0.570; precision: 0.567, recall: 0.590, macrof1: 0.570, weightedf1: 0.570
2020-09-10 17:38:15 lwm-embeddings [INFO] saving the model
2020-09-10 17:38:16 lwm-embeddings [INFO] 09/10/2020_17:38:16 -- Epoch: 3/20; Train; loss: 0.759; acc: 0.654; precision: 0.650, recall: 0.666, macrof1: 0.654, weightedf1: 0.654
2020-09-10 17:38:24 lwm-embeddings [INFO] 09/10/2020_17:38:24 -- Epoch: 3/20; Valid; loss: 0.906; acc: 0.607; precision: 0.601, recall: 0.634, macrof1: 0.607, weightedf1: 0.607
2020-09-10 17:38:24 lwm-embeddings [INFO] saving the model
HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:38:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:38:25 -- Epoch: 4/20; Train; loss: 0.599; acc: 0.716; precision: 0.705, recall: 0.744, macrof1: 0.716, weightedf1: 0.716[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:38:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:38:33 -- Epoch: 4/20; Valid; loss: 0.816; acc: 0.634; precision: 0.624, recall: 0.674, macrof1: 0.633, weightedf1: 0.633[0m
[92m2020-09-10 17:38:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:38:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:38:34 -- Epoch: 5/20; Train; loss: 0.495; acc: 0.761; precision: 0.753, recall: 0.776, macrof1: 0.761, weightedf1: 0.761[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:38:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:38:43 -- Epoch: 5/20; Valid; loss: 0.752; acc: 0.659; precision: 0.655, recall: 0.673, macrof1: 0.659, weightedf1: 0.659[0m
[92m2020-09-10 17:38:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:38:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:38:43 -- Epoch: 6/20; Train; loss: 0.422; acc: 0.812; precision: 0.811, recall: 0.814, macrof1: 0.812, weightedf1: 0.812[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:38:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:38:52 -- Epoch: 6/20; Valid; loss: 0.705; acc: 0.678; precision: 0.674, recall: 0.689, macrof1: 0.678, weightedf1: 0.678[0m
[92m2020-09-10 17:38:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:38:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:38:52 -- Epoch: 7/20; Train; loss: 0.365; acc: 0.852; precision: 0.851, recall: 0.854, macrof1: 0.852, weightedf1: 0.852[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:39:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:39:01 -- Epoch: 7/20; Valid; loss: 0.668; acc: 0.696; precision: 0.691, recall: 0.708, macrof1: 0.696, weightedf1: 0.696[0m
[92m2020-09-10 17:39:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:39:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:39:01 -- Epoch: 8/20; Train; loss: 0.316; acc: 0.883; precision: 0.878, recall: 0.890, macrof1: 0.883, weightedf1: 0.883[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:39:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:39:10 -- Epoch: 8/20; Valid; loss: 0.645; acc: 0.710; precision: 0.709, recall: 0.713, macrof1: 0.710, weightedf1: 0.710[0m
[92m2020-09-10 17:39:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:39:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:39:10 -- Epoch: 9/20; Train; loss: 0.271; acc: 0.912; precision: 0.902, recall: 0.924, macrof1: 0.912, weightedf1: 0.912[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:39:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:39:19 -- Epoch: 9/20; Valid; loss: 0.627; acc: 0.722; precision: 0.718, recall: 0.733, macrof1: 0.722, weightedf1: 0.722[0m
[92m2020-09-10 17:39:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:39:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:39:20 -- Epoch: 10/20; Train; loss: 0.237; acc: 0.928; precision: 0.928, recall: 0.928, macrof1: 0.928, weightedf1: 0.928[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:39:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:39:28 -- Epoch: 10/20; Valid; loss: 0.615; acc: 0.733; precision: 0.733, recall: 0.731, macrof1: 0.733, weightedf1: 0.733[0m
[92m2020-09-10 17:39:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:39:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:39:29 -- Epoch: 11/20; Train; loss: 0.213; acc: 0.944; precision: 0.934, recall: 0.956, macrof1: 0.944, weightedf1: 0.944[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:39:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:39:38 -- Epoch: 11/20; Valid; loss: 0.605; acc: 0.740; precision: 0.737, recall: 0.748, macrof1: 0.740, weightedf1: 0.740[0m
[92m2020-09-10 17:39:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:39:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:39:38 -- Epoch: 12/20; Train; loss: 0.185; acc: 0.957; precision: 0.942, recall: 0.974, macrof1: 0.957, weightedf1: 0.957[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:39:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:39:47 -- Epoch: 12/20; Valid; loss: 0.598; acc: 0.748; precision: 0.745, recall: 0.754, macrof1: 0.748, weightedf1: 0.748[0m
[92m2020-09-10 17:39:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:39:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:39:47 -- Epoch: 13/20; Train; loss: 0.166; acc: 0.965; precision: 0.959, recall: 0.972, macrof1: 0.965, weightedf1: 0.965[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:39:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:39:56 -- Epoch: 13/20; Valid; loss: 0.592; acc: 0.753; precision: 0.754, recall: 0.752, macrof1: 0.753, weightedf1: 0.753[0m
[92m2020-09-10 17:39:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:39:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:39:56 -- Epoch: 14/20; Train; loss: 0.146; acc: 0.970; precision: 0.959, recall: 0.982, macrof1: 0.970, weightedf1: 0.970[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:40:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:40:05 -- Epoch: 14/20; Valid; loss: 0.591; acc: 0.758; precision: 0.756, recall: 0.762, macrof1: 0.758, weightedf1: 0.758[0m
[92m2020-09-10 17:40:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:40:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:40:05 -- Epoch: 15/20; Train; loss: 0.131; acc: 0.975; precision: 0.967, recall: 0.984, macrof1: 0.975, weightedf1: 0.975[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:40:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:40:14 -- Epoch: 15/20; Valid; loss: 0.589; acc: 0.761; precision: 0.764, recall: 0.756, macrof1: 0.761, weightedf1: 0.761[0m
[92m2020-09-10 17:40:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:40:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:40:14 -- Epoch: 16/20; Train; loss: 0.115; acc: 0.983; precision: 0.974, recall: 0.992, macrof1: 0.983, weightedf1: 0.983[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:40:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:40:23 -- Epoch: 16/20; Valid; loss: 0.588; acc: 0.764; precision: 0.761, recall: 0.769, macrof1: 0.764, weightedf1: 0.764[0m
[92m2020-09-10 17:40:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

[92m2020-09-10 17:40:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:40:24 -- Epoch: 17/20; Train; loss: 0.103; acc: 0.989; precision: 0.982, recall: 0.996, macrof1: 0.989, weightedf1: 0.989[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:40:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:40:32 -- Epoch: 17/20; Valid; loss: 0.591; acc: 0.766; precision: 0.768, recall: 0.762, macrof1: 0.766, weightedf1: 0.766[0m
[92m2020-09-10 17:40:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 16) at ./models/wikigaz_en_ft_ocr_lstm_v001_n1000/wikigaz_en_ft_ocr_lstm_v001_n1000.model[0m
[92m2020-09-10 17:40:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 17:40:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 17, selected epoch: 16[0m




User time: 155.5745
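
The run above stops at epoch 17 and keeps the checkpoint from epoch 16, the epoch with the lowest validation loss. The following minimal sketch reproduces that selection from the logged validation losses. The helper name `select_checkpoint` and the `patience=1` setting are assumptions made for illustration; they are not taken from DeezyMatch's code or from the input file used here.

```python
def select_checkpoint(valid_losses, patience=1):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs; the selected epoch is the one with
    the lowest validation loss seen up to that point.
    """
    best_epoch, best_loss, stale = 0, float("inf"), 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_epoch, best_loss, stale = epoch, loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best_epoch
    # No early stop was triggered: all epochs were run.
    return len(valid_losses), best_epoch


# Validation losses logged for epochs 1-17 of the run above.
valid_losses = [1.341, 1.063, 0.906, 0.816, 0.752, 0.705, 0.668, 0.645, 0.627,
                0.615, 0.605, 0.598, 0.592, 0.591, 0.589, 0.588, 0.591]

stop_epoch, best_epoch = select_checkpoint(valid_losses)
print(f"Early stopping at epoch: {stop_epoch}, selected epoch: {best_epoch}")
# prints "Early stopping at epoch: 17, selected epoch: 16"
```

The logged behaviour is consistent with stopping at the first epoch whose validation loss does not improve, but the log alone does not confirm the configured patience.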


In [28]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training instances used for fine-tuning
n_ft_examples = 2000

# fine-tune the pretrained model stored at pretrained_model_path and pretrained_vocab_path
# using the model A input file (embedding and recurrent units frozen)
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_A.yaml",
            dataset_path="./dataset/ocr_trainval.txt",
            model_name=f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model",
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 17:40:32 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_lstm_model_A.yaml
2020-09-10 17:40:32 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 17:40:32 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 17:40:33 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 17:40:33 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 17:40:33 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.05253171920776367
2020-09-10 17:40:33 lwm-embeddings [INFO] splits are as follow:
not_assigned    57221
val             25380
train            2000
test                2
Name: split, dtype: int64
2020-09-10 17:40:33

2020-09-10 17:40:35 lwm-embeddings [INFO] skipping 0 lines



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True
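
The listing above pairs each parameter name with a flag. `False` appears to mark parameters that are frozen during fine-tuning and `True` those that are updated; reading the flag as PyTorch's `requires_grad` is an assumption, not something the log states. A small sketch that groups the listing by layer (the function name `layer_flags` is hypothetical):

```python
# Parameter listing copied from the output above.
listing = """
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True
"""


def layer_flags(text):
    """Map each layer name to True only if all of its parameters are flagged True."""
    flags = {}
    for line in text.strip().splitlines():
        name, flag = line.split()
        layer = name.split(".")[0]  # e.g. "rnn_1.weight_ih_l0" -> "rnn_1"
        flags.setdefault(layer, []).append(flag == "True")
    return {layer: all(values) for layer, values in flags.items()}


for layer, updated in layer_flags(listing).items():
    print(f"{layer:<12} {'updated' if updated else 'frozen'}")
```

For this run only `fc1` and `fc2` are reported as updated, which matches the description of model A (frozen embedding and recurrent units); the attention layers are also flagged `False` here.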



2020-09-10 17:40:35 lwm-embeddings [INFO] ******************************
2020-09-10 17:40:35 lwm-embeddings [INFO] **** (Bi-directional) LSTM ****
2020-09-10 17:40:35 lwm-embeddings [INFO] ******************************




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
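
The per-layer counts in the summary above can be checked against the reported total, and combined with the parameter listing to estimate how much of the model is actually updated in this run. The sketch below assumes the usual conventions for embedding and fully connected layers (`num_embeddings * dim` and `in_features * out_features + out_features`), and assumes that only `fc1` and `fc2` are trainable, as the `True` flags in the listing suggest.

```python
def embedding_params(num_embeddings, dim):
    # One dim-sized vector per vocabulary entry.
    return num_embeddings * dim


def linear_params(in_features, out_features):
    # Weight matrix plus one bias per output unit.
    return in_features * out_features + out_features


layer_params = {
    "emb": embedding_params(7542, 60),
    "rnn_1": 145920,  # copied from the summary above
    "attn_step1": linear_params(120, 60),
    "attn_step2": linear_params(60, 1),
    "fc1": linear_params(960, 120),
    "fc2": linear_params(120, 2),
}

total = sum(layer_params.values())
trainable = layer_params["fc1"] + layer_params["fc2"]  # assumption: only fc1/fc2 are updated

print(f"total parameters:     {total}")
print(f"trainable parameters: {trainable} ({trainable / total:.1%})")
```

Under these assumptions the sum reproduces the logged total of 721323, of which 115562 parameters (roughly 16%) belong to the two fully connected layers.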


2020-09-10 17:40:36 lwm-embeddings [INFO] 09/10/2020_17:40:36 -- Epoch: 1/20; Train; loss: 1.323; acc: 0.522; precision: 0.522, recall: 0.528, ma
2020-09-10 17:40:45 lwm-embeddings [INFO] 09/10/2020_17:40:45 -- Epoch: 1/20; Valid; loss: 1.067; acc: 0.574; precision: 0.570, recall: 0.601, macrof1: 0.574, weightedf1: 0.574
2020-09-10 17:40:45 lwm-embeddings [INFO] saving the model

2020-09-10 17:40:46 lwm-embeddings [INFO] 09/10/2020_17:40:46 -- Epoch: 2/20; Train; loss: 0.783; acc: 0.653; precision: 0.641, recall: 0.699, macrof1: 0.653, weightedf1: 0.653
2020-09-10 17:40:54 lwm-embeddings [INFO] 09/10/2020_17:40:54 -- Epoch: 2/20; Valid; loss: 0.792; acc: 0.641; precision: 0.633, recall: 0.667, macrof1: 0.640, weightedf1: 0.640
2020-09-10 17:40:54 lwm-embeddings [INFO] saving the model

2020-09-10 17:40:55 lwm-embeddings [INFO] 09/10/2020_17:40:55 -- Epoch: 3/20; Train; loss: 0.561; acc: 0.731; precision: 0.726, recall: 0.743, macrof1: 0.731, weightedf1: 0.731
2020-09-10 17:41:04 lwm-embeddings [INFO] 09/10/2020_17:41:04 -- Epoch: 3/20; Valid; loss: 0.665; acc: 0.689; precision: 0.683, recall: 0.704, macrof1: 0.689, weightedf1: 0.689
2020-09-10 17:41:04 lwm-embeddings [INFO] saving the model

2020-09-10 17:41:05 lwm-embeddings [INFO] 09/10/2020_17:41:05 -- Epoch: 4/20; Train; loss: 0.456; acc: 0.790; precision: 0.788, recall: 0.793, macrof1: 0.790, weightedf1: 0.790
2020-09-10 17:41:13 lwm-embeddings [INFO] 09/10/2020_17:41:13 -- Epoch: 4/20; Valid; loss: 0.593; acc: 0.726; precision: 0.715, recall: 0.750, macrof1: 0.725, weightedf1: 0.725
2020-09-10 17:41:13 lwm-embeddings [INFO] saving the model

2020-09-10 17:41:14 lwm-embeddings [INFO] 09/10/2020_17:41:14 -- Epoch: 5/20; Train; loss: 0.385; acc: 0.845; precision: 0.840, recall: 0.852, macrof1: 0.845, weightedf1: 0.845
2020-09-10 17:41:23 lwm-embeddings [INFO] 09/10/2020_17:41:23 -- Epoch: 5/20; Valid; loss: 0.551; acc: 0.751; precision: 0.752, recall: 0.747, macrof1: 0.751, weightedf1: 0.751
2020-09-10 17:41:23 lwm-embeddings [INFO] saving the model

2020-09-10 17:41:24 lwm-embeddings [INFO] 09/10/2020_17:41:24 -- Epoch: 6/20; Train; loss: 0.329; acc: 0.876; precision: 0.876, recall: 0.877, macrof1: 0.876, weightedf1: 0.876
2020-09-10 17:41:32 lwm-embeddings [INFO] 09/10/2020_17:41:32 -- Epoch: 6/20; Valid; loss: 0.518; acc: 0.768; precision: 0.764, recall: 0.776, macrof1: 0.768, weightedf1: 0.768
2020-09-10 17:41:32 lwm-embeddings [INFO] saving the model

2020-09-10 17:41:33 lwm-embeddings [INFO] 09/10/2020_17:41:33 -- Epoch: 7/20; Train; loss: 0.286; acc: 0.897; precision: 0.889, recall: 0.907, macrof1: 0.897, weightedf1: 0.897
2020-09-10 17:41:42 lwm-embeddings [INFO] 09/10/2020_17:41:42 -- Epoch: 7/20; Valid; loss: 0.499; acc: 0.781; precision: 0.778, recall: 0.785, macrof1: 0.781, weightedf1: 0.781
2020-09-10 17:41:42 lwm-embeddings [INFO] saving the model

2020-09-10 17:41:43 lwm-embeddings [INFO] 09/10/2020_17:41:43 -- Epoch: 8/20; Train; loss: 0.247; acc: 0.916; precision: 0.915, recall: 0.917, macrof1: 0.916, weightedf1: 0.916
2020-09-10 17:41:52 lwm-embeddings [INFO] 09/10/2020_17:41:52 -- Epoch: 8/20; Valid; loss: 0.486; acc: 0.789; precision: 0.782, recall: 0.803, macrof1: 0.789, weightedf1: 0.789
2020-09-10 17:41:52 lwm-embeddings [INFO] saving the model

2020-09-10 17:41:53 lwm-embeddings [INFO] 09/10/2020_17:41:53 -- Epoch: 9/20; Train; loss: 0.216; acc: 0.933; precision: 0.930, recall: 0.938, macrof1: 0.933, weightedf1: 0.933
2020-09-10 17:42:01 lwm-embeddings [INFO] 09/10/2020_17:42:01 -- Epoch: 9/20; Valid; loss: 0.476; acc: 0.797; precision: 0.795, recall: 0.801, macrof1: 0.797, weightedf1: 0.797
2020-09-10 17:42:01 lwm-embeddings [INFO] saving the model

2020-09-10 17:42:02 lwm-embeddings [INFO] 09/10/2020_17:42:02 -- Epoch: 10/20; Train; loss: 0.189; acc: 0.946; precision: 0.940, recall: 0.953, macrof1: 0.946, weightedf1: 0.946
2020-09-10 17:42:11 lwm-embeddings [INFO] 09/10/2020_17:42:11 -- Epoch: 10/20; Valid; loss: 0.468; acc: 0.804; precision: 0.810, recall: 0.794, macrof1: 0.804, weightedf1: 0.804
2020-09-10 17:42:11 lwm-embeddings [INFO] saving the model

2020-09-10 17:42:12 lwm-embeddings [INFO] 09/10/2020_17:42:12 -- Epoch: 11/20; Train; loss: 0.164; acc: 0.958; precision: 0.956, recall: 0.959, macrof1: 0.957, weightedf1: 0.957
2020-09-10 17:42:20 lwm-embeddings [INFO] 09/10/2020_17:42:20 -- Epoch: 11/20; Valid; loss: 0.466; acc: 0.807; precision: 0.802, recall: 0.815, macrof1: 0.807, weightedf1: 0.807
2020-09-10 17:42:20 lwm-embeddings [INFO] saving the model

2020-09-10 17:42:21 lwm-embeddings [INFO] 09/10/2020_17:42:21 -- Epoch: 12/20; Train; loss: 0.147; acc: 0.966; precision: 0.963, recall: 0.968, macrof1: 0.965, weightedf1: 0.965
2020-09-10 17:42:30 lwm-embeddings [INFO] 09/10/2020_17:42:30 -- Epoch: 12/20; Valid; loss: 0.462; acc: 0.811; precision: 0.808, recall: 0.815, macrof1: 0.811, weightedf1: 0.811
2020-09-10 17:42:30 lwm-embeddings [INFO] saving the model

2020-09-10 17:42:31 lwm-embeddings [INFO] 09/10/2020_17:42:31 -- Epoch: 13/20; Train; loss: 0.123; acc: 0.978; precision: 0.977, recall: 0.979, macrof1: 0.978, weightedf1: 0.978
2020-09-10 17:42:40 lwm-embeddings [INFO] 09/10/2020_17:42:40 -- Epoch: 13/20; Valid; loss: 0.463; acc: 0.815; precision: 0.816, recall: 0.813, macrof1: 0.815, weightedf1: 0.815
2020-09-10 17:42:40 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 12) at ./models/wikigaz_en_ft_ocr_lstm_v001_n2000/wikigaz_en_ft_ocr_lstm_v001_n2000.model
2020-09-10 17:42:40 lwm-embeddings [INFO] saving the model
2020-09-10 17:42:40 lwm-embeddings [INFO] Early stopping at epoch: 13, selected epoch: 12




User time: 124.7803
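
The split table printed at the start of this run can be checked with simple arithmetic: the dataset holds 42301 + 42302 labelled pairs, and every pair not used for training, validation, or testing is reported as `not_assigned`. A short sketch of that bookkeeping follows; the function name is hypothetical and the relationship is inferred from the logged numbers rather than from DeezyMatch's source.

```python
def not_assigned(n_true, n_false, n_train, n_val, n_test):
    """Pairs left over after the train, validation, and test splits are drawn."""
    return n_true + n_false - n_train - n_val - n_test


# Label counts and split sizes as reported in the log for this notebook.
N_TRUE, N_FALSE = 42301, 42302
N_VAL, N_TEST = 25380, 2

for n_train in (2000, 4000):
    leftover = not_assigned(N_TRUE, N_FALSE, n_train, N_VAL, N_TEST)
    print(f"n_train_examples={n_train}: not_assigned={leftover}")
```

The values computed this way (57221 for 2000 training examples and 55221 for 4000) agree with the `not_assigned` rows in the corresponding split tables, so increasing `n_train_examples` only moves pairs out of the unassigned pool while the validation set stays fixed at 25380 pairs.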


In [29]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training instances used for fine-tuning
n_ft_examples = 4000

# fine-tune the pretrained model stored at pretrained_model_path and pretrained_vocab_path
# using the model A input file (embedding and recurrent units frozen)
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_A.yaml",
            dataset_path="./dataset/ocr_trainval.txt",
            model_name=f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model",
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 17:42:40 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_lstm_model_A.yaml
2020-09-10 17:42:40 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 17:42:40 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 17:42:40 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 17:42:40 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 17:42:40 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.05044984817504883
2020-09-10 17:42:40 lwm-embeddings [INFO] splits are as follow:
not_assigned    55221
val             25380
train            4000
test                2
Name: split, dtype: int64
2020-09-10 17:42:40

2020-09-10 17:42:42 lwm-embeddings [INFO] skipping 0 lines



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



2020-09-10 17:42:43 lwm-embeddings [INFO] ******************************
2020-09-10 17:42:43 lwm-embeddings [INFO] **** (Bi-directional) LSTM ****
2020-09-10 17:42:43 lwm-embeddings [INFO] ******************************




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
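
The `rnn_1` entry in the summary above lists sixteen weight tensors whose shapes follow the standard layout of a two-layer bidirectional LSTM: four gates per direction, an input-to-hidden and a hidden-to-hidden matrix, and two bias vectors. The sketch below derives the parameter count from those shapes. It assumes the PyTorch `nn.LSTM` convention, in which the second layer receives the concatenated forward and backward outputs of the first; the function name is hypothetical.

```python
def lstm_param_count(input_size, hidden_size, num_layers, bidirectional):
    """Count the parameters of an LSTM laid out as in PyTorch's nn.LSTM."""
    directions = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        # Layers after the first read the outputs of all directions below them.
        layer_input = input_size if layer == 0 else hidden_size * directions
        per_direction = (
            4 * hidden_size * layer_input    # weight_ih: (4 * hidden, layer_input)
            + 4 * hidden_size * hidden_size  # weight_hh: (4 * hidden, hidden)
            + 2 * 4 * hidden_size            # bias_ih and bias_hh: (4 * hidden,) each
        )
        total += directions * per_direction
    return total


# LSTM(60, 60, num_layers=2, bidirectional=True) as printed in the summary above.
print(lstm_param_count(input_size=60, hidden_size=60, num_layers=2, bidirectional=True))
# prints 145920
```

With a hidden size of 60, `4 * hidden_size` gives the 240 rows seen in every weight shape, and the `(240, 120)` matrices belong to the second layer, whose input is the 120-dimensional concatenation of both directions.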


[92m2020-09-10 17:42:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:42:46 -- Epoch: 1/20; Train; loss: 1.086; acc: 0.581; precision: 0.577, recall: 0.607, ma

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:42:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:42:56 -- Epoch: 1/20; Valid; loss: 0.793; acc: 0.637; precision: 0.641, recall: 0.624, macrof1: 0.637, weightedf1: 0.637[0m
[92m2020-09-10 17:42:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 17:42:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:42:58 -- Epoch: 2/20; Train; loss: 0.556; acc: 0.736; precision: 0.732, recall: 0.745, macrof1: 0.736, weightedf1: 0.736[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:43:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:43:08 -- Epoch: 2/20; Valid; loss: 0.577; acc: 0.728; precision: 0.735, recall: 0.715, macrof1: 0.728, weightedf1: 0.728[0m
[92m2020-09-10 17:43:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 17:43:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:43:10 -- Epoch: 3/20; Train; loss: 0.412; acc: 0.819; precision: 0.814, recall: 0.828, macrof1: 0.819, weightedf1: 0.819[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:43:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:43:20 -- Epoch: 3/20; Valid; loss: 0.492; acc: 0.779; precision: 0.773, recall: 0.790, macrof1: 0.779, weightedf1: 0.779[0m
[92m2020-09-10 17:43:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 17:43:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:43:22 -- Epoch: 4/20; Train; loss: 0.337; acc: 0.864; precision: 0.855, recall: 0.876, macrof1: 0.864, weightedf1: 0.864[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:43:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:43:31 -- Epoch: 4/20; Valid; loss: 0.455; acc: 0.801; precision: 0.791, recall: 0.819, macrof1: 0.801, weightedf1: 0.801[0m
[92m2020-09-10 17:43:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 17:43:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:43:33 -- Epoch: 5/20; Train; loss: 0.282; acc: 0.895; precision: 0.889, recall: 0.902, macrof1: 0.895, weightedf1: 0.895[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:43:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:43:41 -- Epoch: 5/20; Valid; loss: 0.431; acc: 0.813; precision: 0.793, recall: 0.848, macrof1: 0.813, weightedf1: 0.813[0m
[92m2020-09-10 17:43:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 17:43:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:43:43 -- Epoch: 6/20; Train; loss: 0.242; acc: 0.914; precision: 0.899, recall: 0.933, macrof1: 0.914, weightedf1: 0.914[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:43:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:43:52 -- Epoch: 6/20; Valid; loss: 0.424; acc: 0.824; precision: 0.837, recall: 0.804, macrof1: 0.824, weightedf1: 0.824[0m
[92m2020-09-10 17:43:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 17:43:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:43:54 -- Epoch: 7/20; Train; loss: 0.205; acc: 0.932; precision: 0.925, recall: 0.941, macrof1: 0.932, weightedf1: 0.932[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 17:44:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:44:02 -- Epoch: 7/20; Valid; loss: 0.406; acc: 0.830; precision: 0.826, recall: 0.837, macrof1: 0.830, weightedf1: 0.830[0m
[92m2020-09-10 17:44:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 17:44:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:44:04 -- Epoch: 8/20; Train; loss: 0.176; acc: 0.950; precision: 0.941, recall: 0.960, macrof1: 0.950, weightedf1: 0.950[0m


[92m2020-09-10 17:44:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:44:13 -- Epoch: 8/20; Valid; loss: 0.407; acc: 0.834; precision: 0.836, recall: 0.831, macrof1: 0.834, weightedf1: 0.834[0m
[92m2020-09-10 17:44:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 7) at ./models/wikigaz_en_ft_ocr_lstm_v001_n4000/wikigaz_en_ft_ocr_lstm_v001_n4000.model[0m
[92m2020-09-10 17:44:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 17:44:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 8, selected epoch: 7[0m




User time: 90.2244
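
The fine-tuning cells in this part of the notebook differ only in `n_ft_examples` and, for the 64000-instance run, in the input file. A minimal sketch of how those arguments could be generated in one place. `finetune_kwargs` is a hypothetical helper, not part of DeezyMatch, and the paths are copied from the cells in this notebook.

```python
def finetune_kwargs(n_ft_examples, early_stopping=True):
    """Build the keyword arguments of the model A fine-tuning cells (hypothetical helper)."""
    # The 64000-instance cell points at a separate "_no_early_stopping" input file.
    suffix = "" if early_stopping else "_no_early_stopping"
    return {
        "input_file_path": f"./inputs/input_dfm_lstm_model_A{suffix}.yaml",
        "dataset_path": "./dataset/ocr_trainval.txt",
        "model_name": f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
        "pretrained_model_path": "./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model",
        "pretrained_vocab_path": "./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
        "n_train_examples": n_ft_examples,
    }

# Sizes taken from the cells in this part of the notebook.
runs = [finetune_kwargs(n, early_stopping=(n != 64000))
        for n in (8000, 16000, 32000, 64000)]

for kwargs in runs:
    print(kwargs["model_name"], "<-", kwargs["input_file_path"])
    # To launch a run (requires DeezyMatch and the files above):
    # from DeezyMatch import finetune as dm_finetune
    # dm_finetune(**kwargs)
```

Keeping the arguments in a plain dictionary makes it easy to check names and paths before starting a long run.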


In [30]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training instances used for fine-tuning
n_ft_examples = 8000

# fine-tune the pretrained model stored at pretrained_model_path, reusing the vocabulary at pretrained_vocab_path
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 17:44:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml[0m
[92m2020-09-10 17:44:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 17:44:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 17:44:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 17:44:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 17:44:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.049421072006225586[0m
[92m2020-09-10 17:44:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    51221
val             25380
train            8000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 17:44:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 17:44:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 17:44:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 17:44:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m



Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 17:44:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:44:20 -- Epoch: 1/20; Train; loss: 0.865; acc: 0.637; precision: 0.631, recall: 0.659, ma

[92m2020-09-10 17:44:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:44:28 -- Epoch: 1/20; Valid; loss: 0.557; acc: 0.737; precision: 0.754, recall: 0.704, macrof1: 0.737, weightedf1: 0.737[0m
[92m2020-09-10 17:44:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:44:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:44:32 -- Epoch: 2/20; Train; loss: 0.437; acc: 0.805; precision: 0.801, recall: 0.811, macrof1: 0.805, weightedf1: 0.805[0m


[92m2020-09-10 17:44:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:44:41 -- Epoch: 2/20; Valid; loss: 0.426; acc: 0.813; precision: 0.804, recall: 0.827, macrof1: 0.813, weightedf1: 0.813[0m
[92m2020-09-10 17:44:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:44:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:44:44 -- Epoch: 3/20; Train; loss: 0.344; acc: 0.859; precision: 0.851, recall: 0.871, macrof1: 0.859, weightedf1: 0.859[0m


[92m2020-09-10 17:44:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:44:53 -- Epoch: 3/20; Valid; loss: 0.378; acc: 0.841; precision: 0.831, recall: 0.856, macrof1: 0.841, weightedf1: 0.841[0m
[92m2020-09-10 17:44:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:44:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:44:57 -- Epoch: 4/20; Train; loss: 0.288; acc: 0.890; precision: 0.880, recall: 0.902, macrof1: 0.890, weightedf1: 0.890[0m


[92m2020-09-10 17:45:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:45:05 -- Epoch: 4/20; Valid; loss: 0.351; acc: 0.854; precision: 0.848, recall: 0.863, macrof1: 0.854, weightedf1: 0.854[0m
[92m2020-09-10 17:45:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:45:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:45:09 -- Epoch: 5/20; Train; loss: 0.242; acc: 0.911; precision: 0.899, recall: 0.925, macrof1: 0.911, weightedf1: 0.911[0m


[92m2020-09-10 17:45:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:45:18 -- Epoch: 5/20; Valid; loss: 0.340; acc: 0.862; precision: 0.859, recall: 0.865, macrof1: 0.862, weightedf1: 0.862[0m
[92m2020-09-10 17:45:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:45:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:45:21 -- Epoch: 6/20; Train; loss: 0.206; acc: 0.928; precision: 0.920, recall: 0.939, macrof1: 0.928, weightedf1: 0.928[0m


[92m2020-09-10 17:45:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:45:30 -- Epoch: 6/20; Valid; loss: 0.345; acc: 0.862; precision: 0.885, recall: 0.832, macrof1: 0.862, weightedf1: 0.862[0m
[92m2020-09-10 17:45:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_lstm_v001_n8000/wikigaz_en_ft_ocr_lstm_v001_n8000.model[0m
[92m2020-09-10 17:45:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 17:45:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 73.9374
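
The parameter listing and the model summary printed above can be cross-checked with a little arithmetic. The sketch below recomputes the per-layer parameter counts from the shapes in the summary and sums the layers marked `True` (trainable) in the listing. It is plain Python rather than DeezyMatch code, and the helper names are illustrative.

```python
def linear_params(n_in, n_out):
    # Weight matrix plus bias vector.
    return n_in * n_out + n_out

def lstm_params(input_size, hidden_size, num_layers, bidirectional):
    # Per direction and layer: 4 gates, each with input weights,
    # recurrent weights and two bias vectors (the shapes listed in the summary).
    directions = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        layer_in = input_size if layer == 0 else hidden_size * directions
        per_direction = 4 * hidden_size * (layer_in + hidden_size + 2)
        total += per_direction * directions
    return total

counts = {
    "emb": 7542 * 60,
    "rnn_1": lstm_params(60, 60, num_layers=2, bidirectional=True),
    "attn_step1": linear_params(120, 60),
    "attn_step2": linear_params(60, 1),
    "fc1": linear_params(960, 120),
    "fc2": linear_params(120, 2),
}

# In the listing above only fc1 and fc2 are marked True.
trainable = counts["fc1"] + counts["fc2"]
total = sum(counts.values())
print(f"total={total}, trainable={trainable} ({trainable / total:.1%})")
```

With these shapes the total agrees with the 721323 parameters reported in the summary, of which 115562 (the two fully connected layers) are updated during fine-tuning.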


In [31]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training instances used for fine-tuning
n_ft_examples = 16000

# fine-tune the pretrained model stored at pretrained_model_path, reusing the vocabulary at pretrained_vocab_path
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 17:45:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml[0m
[92m2020-09-10 17:45:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 17:45:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 17:45:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 17:45:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 17:45:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.0449368953704834[0m
[92m2020-09-10 17:45:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    43221
val             25380
train           16000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 17:45:31[

s1 padding:   0%|          | 0/16000 [00:00<?, ?it/s]

[92m2020-09-10 17:45:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 17:45:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 17:45:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 17:45:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m



Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 17:45:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:45:40 -- Epoch: 1/20; Train; loss: 0.679; acc: 0.710; precision: 0.705, recall: 0.721, ma

[92m2020-09-10 17:45:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:45:49 -- Epoch: 1/20; Valid; loss: 0.419; acc: 0.817; precision: 0.816, recall: 0.819, macrof1: 0.817, weightedf1: 0.817[0m
[92m2020-09-10 17:45:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:45:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:45:56 -- Epoch: 2/20; Train; loss: 0.349; acc: 0.855; precision: 0.846, recall: 0.867, macrof1: 0.855, weightedf1: 0.855[0m


[92m2020-09-10 17:46:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:46:05 -- Epoch: 2/20; Valid; loss: 0.343; acc: 0.857; precision: 0.842, recall: 0.879, macrof1: 0.857, weightedf1: 0.857[0m
[92m2020-09-10 17:46:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:46:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:46:12 -- Epoch: 3/20; Train; loss: 0.280; acc: 0.892; precision: 0.882, recall: 0.907, macrof1: 0.892, weightedf1: 0.892[0m


[92m2020-09-10 17:46:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:46:21 -- Epoch: 3/20; Valid; loss: 0.321; acc: 0.867; precision: 0.886, recall: 0.842, macrof1: 0.866, weightedf1: 0.866[0m
[92m2020-09-10 17:46:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:46:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:46:28 -- Epoch: 4/20; Train; loss: 0.234; acc: 0.914; precision: 0.905, recall: 0.924, macrof1: 0.914, weightedf1: 0.914[0m


[92m2020-09-10 17:46:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:46:37 -- Epoch: 4/20; Valid; loss: 0.300; acc: 0.877; precision: 0.858, recall: 0.905, macrof1: 0.877, weightedf1: 0.877[0m
[92m2020-09-10 17:46:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:46:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:46:45 -- Epoch: 5/20; Train; loss: 0.197; acc: 0.929; precision: 0.920, recall: 0.939, macrof1: 0.929, weightedf1: 0.929[0m


[92m2020-09-10 17:46:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:46:53 -- Epoch: 5/20; Valid; loss: 0.290; acc: 0.883; precision: 0.888, recall: 0.877, macrof1: 0.883, weightedf1: 0.883[0m
[92m2020-09-10 17:46:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:47:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:47:01 -- Epoch: 6/20; Train; loss: 0.166; acc: 0.943; precision: 0.935, recall: 0.952, macrof1: 0.943, weightedf1: 0.943[0m


[92m2020-09-10 17:47:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:47:09 -- Epoch: 6/20; Valid; loss: 0.285; acc: 0.885; precision: 0.879, recall: 0.894, macrof1: 0.885, weightedf1: 0.885[0m
[92m2020-09-10 17:47:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:47:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:47:17 -- Epoch: 7/20; Train; loss: 0.137; acc: 0.955; precision: 0.947, recall: 0.963, macrof1: 0.955, weightedf1: 0.955[0m


[92m2020-09-10 17:47:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:47:25 -- Epoch: 7/20; Valid; loss: 0.291; acc: 0.888; precision: 0.865, recall: 0.920, macrof1: 0.888, weightedf1: 0.888[0m
[92m2020-09-10 17:47:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 6) at ./models/wikigaz_en_ft_ocr_lstm_v001_n16000/wikigaz_en_ft_ocr_lstm_v001_n16000.model[0m
[92m2020-09-10 17:47:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 17:47:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 7, selected epoch: 6[0m




User time: 112.5512
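
The early-stopping messages in the logs above are consistent with a rule that stops as soon as the validation loss fails to improve and keeps the checkpoint with the lowest validation loss. The sketch below replays such a rule on the validation losses logged for the n16000 run. The patience of one epoch is an assumption inferred from the logs, not a value read from the input file, and `replay_early_stopping` is a hypothetical helper.

```python
def replay_early_stopping(valid_losses, patience=1):
    """Return (stop_epoch, selected_epoch), both 1-indexed.

    Training stops once the validation loss has not improved
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = None
    bad_epochs = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    # The rule never triggered: all epochs were used.
    return len(valid_losses), best_epoch

# Validation losses of the n16000 run, epochs 1 to 7, copied from the log above.
losses_n16000 = [0.419, 0.343, 0.321, 0.300, 0.290, 0.285, 0.291]
stop_epoch, selected_epoch = replay_early_stopping(losses_n16000)
print(f"early stopping at epoch {stop_epoch}, selected epoch {selected_epoch}")
```

For this run the replay stops at epoch 7 and selects epoch 6, which agrees with the logged message.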


In [32]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training instances used for fine-tuning
n_ft_examples = 32000

# fine-tune the pretrained model stored at pretrained_model_path, reusing the vocabulary at pretrained_vocab_path
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 17:47:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_A.yaml[0m
[92m2020-09-10 17:47:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 17:47:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 17:47:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 17:47:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 17:47:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.04964113235473633[0m
[92m2020-09-10 17:47:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train           32000
not_assigned    27221
val             25380
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 17:47:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 17:47:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 17:47:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 17:47:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m



Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 17:47:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:47:44 -- Epoch: 1/20; Train; loss: 0.522; acc: 0.779; precision: 0.772, recall: 0.792, ma

[92m2020-09-10 17:47:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:47:52 -- Epoch: 1/20; Valid; loss: 0.338; acc: 0.860; precision: 0.854, recall: 0.869, macrof1: 0.860, weightedf1: 0.860[0m
[92m2020-09-10 17:47:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:48:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:48:08 -- Epoch: 2/20; Train; loss: 0.286; acc: 0.885; precision: 0.876, recall: 0.898, macrof1: 0.885, weightedf1: 0.885[0m


[92m2020-09-10 17:48:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:48:17 -- Epoch: 2/20; Valid; loss: 0.293; acc: 0.878; precision: 0.851, recall: 0.917, macrof1: 0.878, weightedf1: 0.878[0m
[92m2020-09-10 17:48:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:48:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:48:32 -- Epoch: 3/20; Train; loss: 0.231; acc: 0.909; precision: 0.900, recall: 0.921, macrof1: 0.909, weightedf1: 0.909[0m


[92m2020-09-10 17:48:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:48:41 -- Epoch: 3/20; Valid; loss: 0.271; acc: 0.890; precision: 0.863, recall: 0.926, macrof1: 0.889, weightedf1: 0.889[0m
[92m2020-09-10 17:48:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:48:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:48:56 -- Epoch: 4/20; Train; loss: 0.190; acc: 0.926; precision: 0.918, recall: 0.935, macrof1: 0.926, weightedf1: 0.926[0m


[92m2020-09-10 17:49:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:49:04 -- Epoch: 4/20; Valid; loss: 0.258; acc: 0.899; precision: 0.881, recall: 0.922, macrof1: 0.899, weightedf1: 0.899[0m
[92m2020-09-10 17:49:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:49:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:49:19 -- Epoch: 5/20; Train; loss: 0.157; acc: 0.942; precision: 0.933, recall: 0.951, macrof1: 0.942, weightedf1: 0.942[0m


[92m2020-09-10 17:49:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:49:28 -- Epoch: 5/20; Valid; loss: 0.250; acc: 0.904; precision: 0.899, recall: 0.912, macrof1: 0.904, weightedf1: 0.904[0m
[92m2020-09-10 17:49:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:49:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:49:43 -- Epoch: 6/20; Train; loss: 0.128; acc: 0.954; precision: 0.947, recall: 0.963, macrof1: 0.954, weightedf1: 0.954[0m


[92m2020-09-10 17:49:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:49:51 -- Epoch: 6/20; Valid; loss: 0.252; acc: 0.906; precision: 0.893, recall: 0.923, macrof1: 0.906, weightedf1: 0.906[0m
[92m2020-09-10 17:49:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_lstm_v001_n32000/wikigaz_en_ft_ocr_lstm_v001_n32000.model[0m
[92m2020-09-10 17:49:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 17:49:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 142.6903
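
To compare runs, the per-epoch metrics can be pulled out of log lines like the ones above. A minimal sketch using the standard `re` module. The pattern is written against the line format shown in this notebook and also drops the residual color codes (`[92m`, `[0m`, and so on), so it is an assumption about that format rather than a DeezyMatch utility; `parse_epoch_line` is a hypothetical helper.

```python
import re

# Residual color codes as they appear in this export, with or without the ESC byte.
ANSI_RE = re.compile(r"\x1b?\[[0-9;]*m")
EPOCH_RE = re.compile(
    r"Epoch: (?P<epoch>\d+)/(?P<n_epochs>\d+); (?P<phase>Train|Valid); "
    r"loss: (?P<loss>[\d.]+); acc: (?P<acc>[\d.]+)"
)

def parse_epoch_line(line):
    """Return a dict of metrics for an epoch line, or None for any other line."""
    match = EPOCH_RE.search(ANSI_RE.sub("", line))
    if match is None:
        return None
    row = match.groupdict()
    return {
        "epoch": int(row["epoch"]),
        "n_epochs": int(row["n_epochs"]),
        "phase": row["phase"],
        "loss": float(row["loss"]),
        "acc": float(row["acc"]),
    }

# One validation line of the n32000 run, copied from the log above.
line = ("[92m2020-09-10 17:49:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m "
        "[1;31m09/10/2020_17:49:28 -- Epoch: 5/20; Valid; loss: 0.250; acc: 0.904; "
        "precision: 0.899, recall: 0.912, macrof1: 0.904, weightedf1: 0.904[0m")
print(parse_epoch_line(line))
```

Lines that do not report an epoch (for example `saving the model`) yield `None`, so the function can be mapped over a whole log and the results filtered.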


In [33]:
from DeezyMatch import finetune as dm_finetune

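# number of OCR training instances used for fine-tuning
n_ft_examples = 64000

# fine-tune the pretrained model stored at pretrained_model_path, reusing the vocabulary at pretrained_vocab_path
# (this run uses the separate "_no_early_stopping" input file)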
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_A_no_early_stopping.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 17:49:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_A_no_early_stopping.yaml[0m
[92m2020-09-10 17:49:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 17:49:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 17:49:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 17:49:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 17:49:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.0503392219543457[0m
[92m2020-09-10 17:49:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train    64000
val      20603
Name: split, dtype: int64[0m
[92m2020-09-10 17:49:52[0m [95mlwm-embeddings[0m [1m[90m[INF

length s2:   0%|          | 0/64000 [00:00<?, ?it/s]

[92m2020-09-10 17:49:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 17:49:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 17:49:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 17:49:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m



Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 17:50:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:50:25 -- Epoch: 1/10; Train; loss: 0.419; acc: 0.823; precision: 0.813, recall: 0.840, ma

[92m2020-09-10 17:50:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:50:32 -- Epoch: 1/10; Valid; loss: 0.279; acc: 0.888; precision: 0.882, recall: 0.895, macrof1: 0.888, weightedf1: 0.888[0m
[92m2020-09-10 17:50:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:51:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:51:02 -- Epoch: 2/10; Train; loss: 0.244; acc: 0.904; precision: 0.894, recall: 0.916, macrof1: 0.904, weightedf1: 0.904[0m


[92m2020-09-10 17:51:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:51:08 -- Epoch: 2/10; Valid; loss: 0.232; acc: 0.908; precision: 0.902, recall: 0.914, macrof1: 0.908, weightedf1: 0.908[0m
[92m2020-09-10 17:51:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:51:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:51:39 -- Epoch: 3/10; Train; loss: 0.195; acc: 0.925; precision: 0.916, recall: 0.937, macrof1: 0.925, weightedf1: 0.925[0m


[92m2020-09-10 17:51:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:51:46 -- Epoch: 3/10; Valid; loss: 0.216; acc: 0.914; precision: 0.892, recall: 0.942, macrof1: 0.914, weightedf1: 0.914[0m
[92m2020-09-10 17:51:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 17:52:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:52:16 -- Epoch: 4/10; Train; loss: 0.161; acc: 0.940; precision: 0.931, recall: 0.949, macrof1: 0.940, weightedf1: 0.940[0m



[92m2020-09-10 17:52:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:52:23 -- Epoch: 4/10; Valid; loss: 0.204; acc: 0.921; precision: 0.905, recall: 0.941, macrof1: 0.921, weightedf1: 0.921[0m
[92m2020-09-10 17:52:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 17:52:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:52:53 -- Epoch: 5/10; Train; loss: 0.133; acc: 0.951; precision: 0.942, recall: 0.960, macrof1: 0.951, weightedf1: 0.951[0m



[92m2020-09-10 17:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:53:00 -- Epoch: 5/10; Valid; loss: 0.203; acc: 0.923; precision: 0.913, recall: 0.935, macrof1: 0.923, weightedf1: 0.923[0m
[92m2020-09-10 17:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 17:53:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:53:30 -- Epoch: 6/10; Train; loss: 0.111; acc: 0.959; precision: 0.952, recall: 0.967, macrof1: 0.959, weightedf1: 0.959[0m



[92m2020-09-10 17:53:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:53:37 -- Epoch: 6/10; Valid; loss: 0.208; acc: 0.925; precision: 0.922, recall: 0.927, macrof1: 0.925, weightedf1: 0.925[0m
[92m2020-09-10 17:53:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 17:54:07[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:54:07 -- Epoch: 7/10; Train; loss: 0.093; acc: 0.967; precision: 0.961, recall: 0.973, macrof1: 0.967, weightedf1: 0.967[0m



[92m2020-09-10 17:54:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:54:14 -- Epoch: 7/10; Valid; loss: 0.209; acc: 0.926; precision: 0.923, recall: 0.928, macrof1: 0.926, weightedf1: 0.926[0m
[92m2020-09-10 17:54:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 17:54:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:54:44 -- Epoch: 8/10; Train; loss: 0.077; acc: 0.974; precision: 0.968, recall: 0.979, macrof1: 0.974, weightedf1: 0.974[0m



[92m2020-09-10 17:54:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:54:51 -- Epoch: 8/10; Valid; loss: 0.218; acc: 0.925; precision: 0.914, recall: 0.939, macrof1: 0.925, weightedf1: 0.925[0m
[92m2020-09-10 17:54:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 17:55:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:55:21 -- Epoch: 9/10; Train; loss: 0.063; acc: 0.979; precision: 0.974, recall: 0.984, macrof1: 0.979, weightedf1: 0.979[0m



[92m2020-09-10 17:55:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:55:28 -- Epoch: 9/10; Valid; loss: 0.236; acc: 0.924; precision: 0.927, recall: 0.921, macrof1: 0.924, weightedf1: 0.924[0m
[92m2020-09-10 17:55:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 17:55:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:55:58 -- Epoch: 10/10; Train; loss: 0.052; acc: 0.983; precision: 0.980, recall: 0.986, macrof1: 0.983, weightedf1: 0.983[0m



[92m2020-09-10 17:56:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:56:05 -- Epoch: 10/10; Valid; loss: 0.250; acc: 0.923; precision: 0.918, recall: 0.929, macrof1: 0.923, weightedf1: 0.923[0m
[92m2020-09-10 17:56:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m

[92m2020-09-10 17:56:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_lstm_v001_n64000/wikigaz_en_ft_ocr_lstm_v001_n64000.model[0m



User time: 369.7571
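
The per-epoch lines in the log above follow a fixed pattern, so they can be collected into records for plotting or comparison. The following is a minimal sketch in plain Python, not part of DeezyMatch: the names `ANSI_CODE`, `METRICS` and `parse_metrics` are hypothetical, and the regular expression assumes the line layout shown in this notebook's output.

```python
import re

# ANSI colour codes such as "[92m" or "[0m". The ESC byte is optional because
# it may be lost when the notebook output is exported to plain text.
ANSI_CODE = re.compile(r"\x1b?\[[0-9;]*m")

# Layout assumed from the log lines printed in this notebook.
METRICS = re.compile(
    r"Epoch: (?P<epoch>\d+)/(?P<n_epochs>\d+); (?P<phase>Train|Valid); "
    r"loss: (?P<loss>[\d.]+); acc: (?P<acc>[\d.]+); "
    r"precision: (?P<precision>[\d.]+), recall: (?P<recall>[\d.]+), "
    r"macrof1: (?P<macrof1>[\d.]+), weightedf1: (?P<weightedf1>[\d.]+)"
)


def parse_metrics(line):
    """Return the metrics of one epoch log line as a dict, or None if absent."""
    match = METRICS.search(ANSI_CODE.sub("", line))
    if match is None:
        return None
    fields = match.groupdict()
    row = {
        "epoch": int(fields.pop("epoch")),
        "n_epochs": int(fields.pop("n_epochs")),
        "phase": fields.pop("phase"),
    }
    # The remaining fields are all numeric metrics.
    row.update({name: float(value) for name, value in fields.items()})
    return row


line = (
    "[92m2020-09-10 17:56:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m "
    "[1;31m09/10/2020_17:56:05 -- Epoch: 10/10; Valid; loss: 0.250; acc: 0.923; "
    "precision: 0.918, recall: 0.929, macrof1: 0.923, weightedf1: 0.923[0m"
)
row = parse_metrics(line)
print(row["phase"], row["epoch"], row["loss"], row["macrof1"])
```

Lines that do not carry a complete set of metrics (for example `saving the model`, or a log line cut off mid-way) return `None` and can be skipped.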


In [34]:
from DeezyMatch import finetune as dm_finetune

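# number of OCR training pairs used for fine-tuning
n_ft_examples = 84000

# fine-tune the pretrained LSTM model stored at pretrained_model_path and pretrained_vocab_path
# (model A: embedding and recurrent layers stay frozen; this input file runs without early stopping)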
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_A_no_early_stopping.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 17:56:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_A_no_early_stopping.yaml[0m
[92m2020-09-10 17:56:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 17:56:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 17:56:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 17:56:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 17:56:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.04768061637878418[0m
[92m2020-09-10 17:56:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train    84000
val        603
Name: split, dtype: int64[0m


[92m2020-09-10 17:56:07[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 17:56:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 17:56:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 17:56:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m





Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
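
The counts in this summary can be checked by hand. The sketch below is a plain-Python sanity check, assuming PyTorch's usual weight layout for recurrent layers (one input-to-hidden matrix, one hidden-to-hidden matrix and two bias vectors per direction, each stacked over the gates). The helper `recurrent_params` is hypothetical and is not part of DeezyMatch.

```python
def recurrent_params(gates, input_size, hidden_size, num_layers, directions=2):
    """Count recurrent-layer parameters, assuming PyTorch's weight layout.

    gates is 4 for an LSTM and 1 for a vanilla RNN (assumption, not DeezyMatch code).
    """
    total = 0
    for layer in range(num_layers):
        # Layers after the first read the concatenated outputs of both directions.
        layer_input = input_size if layer == 0 else hidden_size * directions
        # weight_ih + weight_hh + bias_ih + bias_hh for one direction
        per_direction = gates * hidden_size * (layer_input + hidden_size + 2)
        total += directions * per_direction
    return total


layer_counts = {
    "emb": 7542 * 60,
    "rnn_1": recurrent_params(gates=4, input_size=60, hidden_size=60, num_layers=2),
    "attn_step1": 120 * 60 + 60,
    "attn_step2": 60 * 1 + 1,
    "fc1": 960 * 120 + 120,
    "fc2": 120 * 2 + 2,
}
total_params = sum(layer_counts.values())

# In the parameter listing of this run, only fc1 and fc2 are flagged True.
trainable_params = layer_counts["fc1"] + layer_counts["fc2"]
trainable_share = trainable_params / total_params

print(layer_counts["rnn_1"], total_params, trainable_params)
```

Under these assumptions the per-layer values agree with the summary, and roughly 16% of the parameters (those of `fc1` and `fc2`) are updated during fine-tuning of model A.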


[92m2020-09-10 17:56:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:56:47 -- Epoch: 1/10; Train; loss: 0.382; acc: 0.841; precision: 0.832, recall: 0.855, ma


[92m2020-09-10 17:56:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:56:48 -- Epoch: 1/10; Valid; loss: 0.229; acc: 0.907; precision: 0.910, recall: 0.904, macrof1: 0.907, weightedf1: 0.907[0m
[92m2020-09-10 17:56:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 17:57:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:57:31 -- Epoch: 2/10; Train; loss: 0.223; acc: 0.913; precision: 0.903, recall: 0.926, macrof1: 0.913, weightedf1: 0.913[0m



[92m2020-09-10 17:57:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:57:31 -- Epoch: 2/10; Valid; loss: 0.206; acc: 0.930; precision: 0.930, recall: 0.930, macrof1: 0.930, weightedf1: 0.930[0m
[92m2020-09-10 17:57:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 17:58:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:58:14 -- Epoch: 3/10; Train; loss: 0.178; acc: 0.932; precision: 0.922, recall: 0.943, macrof1: 0.932, weightedf1: 0.932[0m



[92m2020-09-10 17:58:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:58:14 -- Epoch: 3/10; Valid; loss: 0.183; acc: 0.924; precision: 0.907, recall: 0.944, macrof1: 0.924, weightedf1: 0.924[0m
[92m2020-09-10 17:58:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 17:58:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:58:54 -- Epoch: 4/10; Train; loss: 0.145; acc: 0.945; precision: 0.937, recall: 0.954, macrof1: 0.945, weightedf1: 0.945[0m



[92m2020-09-10 17:58:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:58:54 -- Epoch: 4/10; Valid; loss: 0.171; acc: 0.934; precision: 0.931, recall: 0.937, macrof1: 0.934, weightedf1: 0.934[0m
[92m2020-09-10 17:58:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 17:59:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_17:59:34 -- Epoch: 5/10; Train; loss: 0.123; acc: 0.955; precision: 0.948, recall: 0.962, macrof1: 0.955, weightedf1: 0.955[0m



[92m2020-09-10 17:59:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_17:59:34 -- Epoch: 5/10; Valid; loss: 0.154; acc: 0.937; precision: 0.931, recall: 0.944, macrof1: 0.937, weightedf1: 0.937[0m
[92m2020-09-10 17:59:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:00:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:00:14 -- Epoch: 6/10; Train; loss: 0.104; acc: 0.962; precision: 0.956, recall: 0.969, macrof1: 0.962, weightedf1: 0.962[0m



[92m2020-09-10 18:00:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:00:14 -- Epoch: 6/10; Valid; loss: 0.169; acc: 0.935; precision: 0.943, recall: 0.927, macrof1: 0.935, weightedf1: 0.935[0m
[92m2020-09-10 18:00:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:00:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:00:53 -- Epoch: 7/10; Train; loss: 0.088; acc: 0.968; precision: 0.963, recall: 0.974, macrof1: 0.968, weightedf1: 0.968[0m



[92m2020-09-10 18:00:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:00:54 -- Epoch: 7/10; Valid; loss: 0.183; acc: 0.929; precision: 0.911, recall: 0.950, macrof1: 0.929, weightedf1: 0.929[0m
[92m2020-09-10 18:00:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:01:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:01:38 -- Epoch: 8/10; Train; loss: 0.074; acc: 0.974; precision: 0.969, recall: 0.979, macrof1: 0.974, weightedf1: 0.974[0m



[92m2020-09-10 18:01:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:01:38 -- Epoch: 8/10; Valid; loss: 0.178; acc: 0.945; precision: 0.944, recall: 0.947, macrof1: 0.945, weightedf1: 0.945[0m
[92m2020-09-10 18:01:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:02:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:02:21 -- Epoch: 9/10; Train; loss: 0.064; acc: 0.978; precision: 0.974, recall: 0.982, macrof1: 0.978, weightedf1: 0.978[0m



[92m2020-09-10 18:02:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:02:21 -- Epoch: 9/10; Valid; loss: 0.182; acc: 0.935; precision: 0.920, recall: 0.953, macrof1: 0.935, weightedf1: 0.935[0m
[92m2020-09-10 18:02:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:03:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:03:01 -- Epoch: 10/10; Train; loss: 0.054; acc: 0.982; precision: 0.978, recall: 0.986, macrof1: 0.982, weightedf1: 0.982[0m



[92m2020-09-10 18:03:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:03:02 -- Epoch: 10/10; Valid; loss: 0.199; acc: 0.934; precision: 0.917, recall: 0.953, macrof1: 0.934, weightedf1: 0.934[0m
[92m2020-09-10 18:03:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m

[92m2020-09-10 18:03:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_lstm_v001_n84000/wikigaz_en_ft_ocr_lstm_v001_n84000.model[0m



User time: 413.5969
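
The final log message of this run states that checkpoint 5 was kept because it has the least validation loss. A minimal sketch of that selection, using the validation values printed above for the `n84000` run (the variable names are illustrative, and DeezyMatch's own selection code may differ):

```python
# Validation loss and accuracy per epoch, copied from the log of the n84000 run.
valid_loss = {1: 0.229, 2: 0.206, 3: 0.183, 4: 0.171, 5: 0.154,
              6: 0.169, 7: 0.183, 8: 0.178, 9: 0.182, 10: 0.199}
valid_acc = {1: 0.907, 2: 0.930, 3: 0.924, 4: 0.934, 5: 0.937,
             6: 0.935, 7: 0.929, 8: 0.945, 9: 0.935, 10: 0.934}

# Epoch with the lowest validation loss (the criterion named in the log).
best_by_loss = min(valid_loss, key=valid_loss.get)

# Epoch with the highest validation accuracy, for comparison only.
best_by_acc = max(valid_acc, key=valid_acc.get)

print(best_by_loss, best_by_acc)
```

The two criteria do not pick the same epoch here: the loss is lowest at epoch 5, while accuracy peaks at epoch 8. With only 603 validation pairs in this run, differences of this size should be read with caution.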


## Fine-Tune, model A, RNN

In [35]:
from DeezyMatch import finetune as dm_finetune

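# number of OCR training pairs used for fine-tuning
n_ft_examples = 250

# fine-tune the pretrained RNN model stored at pretrained_model_path and pretrained_vocab_path
# (model A: embedding and recurrent layers stay frozen)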
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:03:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml[0m
[92m2020-09-10 18:03:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:03:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:03:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:03:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:03:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.0529019832611084[0m
[92m2020-09-10 18:03:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58971
val             25380
train             250
test                2
Name: split, dtype: int64[0m


[92m2020-09-10 18:03:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:03:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:03:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 18:03:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m





Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:03:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:03:05 -- Epoch: 1/20; Train; loss: 1.058; acc: 0.520; precision: 0.522, recall: 0.480, macrof1: 0.519, weig


[92m2020-09-10 18:03:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:03:12 -- Epoch: 1/20; Valid; loss: 1.051; acc: 0.498; precision: 0.498, recall: 0.514, macrof1: 0.498, weightedf1: 0.498[0m
[92m2020-09-10 18:03:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:03:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:03:13 -- Epoch: 2/20; Train; loss: 0.857; acc: 0.564; precision: 0.566, recall: 0.552, macrof1: 0.564, weightedf1: 0.564[0m



[92m2020-09-10 18:03:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:03:20 -- Epoch: 2/20; Valid; loss: 0.944; acc: 0.517; precision: 0.515, recall: 0.582, macrof1: 0.515, weightedf1: 0.515[0m
[92m2020-09-10 18:03:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:03:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:03:21 -- Epoch: 3/20; Train; loss: 0.728; acc: 0.608; precision: 0.602, recall: 0.640, macrof1: 0.608, weightedf1: 0.608[0m



[92m2020-09-10 18:03:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:03:28 -- Epoch: 3/20; Valid; loss: 0.872; acc: 0.536; precision: 0.530, recall: 0.634, macrof1: 0.531, weightedf1: 0.531[0m
[92m2020-09-10 18:03:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:03:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:03:29 -- Epoch: 4/20; Train; loss: 0.643; acc: 0.644; precision: 0.630, recall: 0.696, macrof1: 0.643, weightedf1: 0.643[0m



[92m2020-09-10 18:03:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:03:36 -- Epoch: 4/20; Valid; loss: 0.821; acc: 0.546; precision: 0.538, recall: 0.648, macrof1: 0.541, weightedf1: 0.541[0m
[92m2020-09-10 18:03:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:03:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:03:37 -- Epoch: 5/20; Train; loss: 0.578; acc: 0.668; precision: 0.657, recall: 0.704, macrof1: 0.668, weightedf1: 0.668[0m



[92m2020-09-10 18:03:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:03:44 -- Epoch: 5/20; Valid; loss: 0.786; acc: 0.555; precision: 0.547, recall: 0.643, macrof1: 0.551, weightedf1: 0.551[0m
[92m2020-09-10 18:03:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:03:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:03:44 -- Epoch: 6/20; Train; loss: 0.531; acc: 0.716; precision: 0.705, recall: 0.744, macrof1: 0.716, weightedf1: 0.716[0m



[92m2020-09-10 18:03:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:03:52 -- Epoch: 6/20; Valid; loss: 0.766; acc: 0.567; precision: 0.557, recall: 0.647, macrof1: 0.564, weightedf1: 0.564[0m
[92m2020-09-10 18:03:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:03:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:03:52 -- Epoch: 7/20; Train; loss: 0.496; acc: 0.748; precision: 0.738, recall: 0.768, macrof1: 0.748, weightedf1: 0.748[0m



[92m2020-09-10 18:04:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:04:00 -- Epoch: 7/20; Valid; loss: 0.752; acc: 0.575; precision: 0.566, recall: 0.643, macrof1: 0.573, weightedf1: 0.573[0m
[92m2020-09-10 18:04:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:04:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:04:00 -- Epoch: 8/20; Train; loss: 0.468; acc: 0.772; precision: 0.774, recall: 0.768, macrof1: 0.772, weightedf1: 0.772[0m



[92m2020-09-10 18:04:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:04:08 -- Epoch: 8/20; Valid; loss: 0.744; acc: 0.583; precision: 0.575, recall: 0.632, macrof1: 0.582, weightedf1: 0.582[0m
[92m2020-09-10 18:04:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:04:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:04:08 -- Epoch: 9/20; Train; loss: 0.444; acc: 0.792; precision: 0.797, recall: 0.784, macrof1: 0.792, weightedf1: 0.792[0m



[92m2020-09-10 18:04:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:04:16 -- Epoch: 9/20; Valid; loss: 0.740; acc: 0.592; precision: 0.585, recall: 0.630, macrof1: 0.591, weightedf1: 0.591[0m
[92m2020-09-10 18:04:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:04:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:04:17 -- Epoch: 10/20; Train; loss: 0.415; acc: 0.812; precision: 0.831, recall: 0.784, macrof1: 0.812, weightedf1: 0.812[0m



[92m2020-09-10 18:04:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:04:26 -- Epoch: 10/20; Valid; loss: 0.739; acc: 0.601; precision: 0.595, recall: 0.630, macrof1: 0.600, weightedf1: 0.600[0m
[92m2020-09-10 18:04:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:04:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:04:26 -- Epoch: 11/20; Train; loss: 0.392; acc: 0.828; precision: 0.842, recall: 0.808, macrof1: 0.828, weightedf1: 0.828[0m



[92m2020-09-10 18:04:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:04:34 -- Epoch: 11/20; Valid; loss: 0.743; acc: 0.610; precision: 0.603, recall: 0.644, macrof1: 0.609, weightedf1: 0.609[0m
[92m2020-09-10 18:04:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 10) at ./models/wikigaz_en_ft_ocr_rnn_v001_n250/wikigaz_en_ft_ocr_rnn_v001_n250.model[0m
[92m2020-09-10 18:04:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:04:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 11, selected epoch: 10[0m




User time: 90.0004
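
The run above stops at epoch 11 and keeps the checkpoint from epoch 10. The sketch below shows a generic patience-based early-stopping rule that reproduces this behaviour on the logged validation losses. It is an illustration only: the function `early_stop` is hypothetical, and `patience=1` is an assumption chosen to match the log, since the actual setting lives in the YAML input file, which is not shown here.

```python
def early_stop(valid_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    stop_epoch is None if the rule never triggers.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Validation losses of epochs 1 to 11, copied from the log of the n250 run.
losses = [1.051, 0.944, 0.872, 0.821, 0.786, 0.766,
          0.752, 0.744, 0.740, 0.739, 0.743]
stop_epoch, best_epoch = early_stop(losses, patience=1)
print(stop_epoch, best_epoch)
```

With these values the validation loss improves at every epoch up to epoch 10 (0.739) and rises at epoch 11 (0.743), so the rule stops at epoch 11 and selects epoch 10, consistent with the log message.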


In [36]:
from DeezyMatch import finetune as dm_finetune

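# number of OCR training pairs used for fine-tuning
n_ft_examples = 500

# fine-tune the pretrained RNN model stored at pretrained_model_path and pretrained_vocab_path
# (model A: embedding and recurrent layers stay frozen)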
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:04:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml[0m
[92m2020-09-10 18:04:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:04:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:04:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:04:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:04:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.054543495178222656[0m
[92m2020-09-10 18:04:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58721
val             25380
train             500
test                2
Name: split, dtype: int64[0m


[92m2020-09-10 18:04:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:04:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:04:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 18:04:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m





Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:04:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:04:38 -- Epoch: 1/20; Train; loss: 1.054; acc: 0.492; precision: 0.492, recall: 0.492, macrof1: 0.492, weig


[92m2020-09-10 18:04:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:04:47 -- Epoch: 1/20; Valid; loss: 0.930; acc: 0.511; precision: 0.510, recall: 0.551, macrof1: 0.510, weightedf1: 0.510[0m
[92m2020-09-10 18:04:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m



[92m2020-09-10 18:04:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:04:47 -- Epoch: 2/20; Train; loss: 0.804; acc: 0.554; precision: 0.552, recall: 0.576, macrof1: 0.554, weightedf1: 0.554[0m



[92m2020-09-10 18:04:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:04:56 -- Epoch: 2/20; Valid; loss: 0.791; acc: 0.534; precision: 0.530, recall: 0.614, macrof1: 0.531, weightedf1: 0.531[0m
[92m2020-09-10 18:04:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 18:04:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:04:56 -- Epoch: 3/20; Train; loss: 0.687; acc: 0.592; precision: 0.584, recall: 0.640, macrof1: 0.591, weightedf1: 0.591[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 18:05:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:05:05 -- Epoch: 3/20; Valid; loss: 0.728; acc: 0.554; precision: 0.545, recall: 0.664, macrof1: 0.549, weightedf1: 0.549[0m
[92m2020-09-10 18:05:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 18:05:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:05:05 -- Epoch: 4/20; Train; loss: 0.626; acc: 0.626; precision: 0.613, recall: 0.684, macrof1: 0.625, weightedf1: 0.625[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 18:05:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:05:14 -- Epoch: 4/20; Valid; loss: 0.694; acc: 0.573; precision: 0.560, recall: 0.676, macrof1: 0.568, weightedf1: 0.568[0m
[92m2020-09-10 18:05:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


2020-09-10 18:05:14 lwm-embeddings [INFO] 09/10/2020_18:05:14 -- Epoch: 5/20; Train; loss: 0.587; acc: 0.666; precision: 0.655, recall: 0.700, macrof1: 0.666, weightedf1: 0.666
2020-09-10 18:05:22 lwm-embeddings [INFO] 09/10/2020_18:05:22 -- Epoch: 5/20; Valid; loss: 0.673; acc: 0.593; precision: 0.580, recall: 0.674, macrof1: 0.591, weightedf1: 0.591
2020-09-10 18:05:22 lwm-embeddings [INFO] saving the model
2020-09-10 18:05:22 lwm-embeddings [INFO] 09/10/2020_18:05:22 -- Epoch: 6/20; Train; loss: 0.551; acc: 0.684; precision: 0.676, recall: 0.708, macrof1: 0.684, weightedf1: 0.684
2020-09-10 18:05:30 lwm-embeddings [INFO] 09/10/2020_18:05:30 -- Epoch: 6/20; Valid; loss: 0.660; acc: 0.618; precision: 0.608, recall: 0.668, macrof1: 0.617, weightedf1: 0.617
2020-09-10 18:05:30 lwm-embeddings [INFO] saving the model
2020-09-10 18:05:30 lwm-embeddings [INFO] 09/10/2020_18:05:30 -- Epoch: 7/20; Train; loss: 0.520; acc: 0.726; precision: 0.738, recall: 0.700, macrof1: 0.726, weightedf1: 0.726
2020-09-10 18:05:38 lwm-embeddings [INFO] 09/10/2020_18:05:38 -- Epoch: 7/20; Valid; loss: 0.649; acc: 0.644; precision: 0.643, recall: 0.647, macrof1: 0.644, weightedf1: 0.644
2020-09-10 18:05:38 lwm-embeddings [INFO] saving the model
2020-09-10 18:05:38 lwm-embeddings [INFO] 09/10/2020_18:05:38 -- Epoch: 8/20; Train; loss: 0.486; acc: 0.734; precision: 0.751, recall: 0.700, macrof1: 0.734, weightedf1: 0.734
2020-09-10 18:05:46 lwm-embeddings [INFO] 09/10/2020_18:05:46 -- Epoch: 8/20; Valid; loss: 0.642; acc: 0.661; precision: 0.667, recall: 0.644, macrof1: 0.661, weightedf1: 0.661
2020-09-10 18:05:46 lwm-embeddings [INFO] saving the model
2020-09-10 18:05:46 lwm-embeddings [INFO] 09/10/2020_18:05:46 -- Epoch: 9/20; Train; loss: 0.456; acc: 0.778; precision: 0.779, recall: 0.776, macrof1: 0.778, weightedf1: 0.778
2020-09-10 18:05:54 lwm-embeddings [INFO] 09/10/2020_18:05:54 -- Epoch: 9/20; Valid; loss: 0.638; acc: 0.671; precision: 0.669, recall: 0.677, macrof1: 0.671, weightedf1: 0.671
2020-09-10 18:05:54 lwm-embeddings [INFO] saving the model
2020-09-10 18:05:54 lwm-embeddings [INFO] 09/10/2020_18:05:54 -- Epoch: 10/20; Train; loss: 0.421; acc: 0.798; precision: 0.802, recall: 0.792, macrof1: 0.798, weightedf1: 0.798
2020-09-10 18:06:02 lwm-embeddings [INFO] 09/10/2020_18:06:02 -- Epoch: 10/20; Valid; loss: 0.632; acc: 0.683; precision: 0.687, recall: 0.671, macrof1: 0.683, weightedf1: 0.683
2020-09-10 18:06:02 lwm-embeddings [INFO] saving the model
2020-09-10 18:06:02 lwm-embeddings [INFO] 09/10/2020_18:06:02 -- Epoch: 11/20; Train; loss: 0.390; acc: 0.818; precision: 0.824, recall: 0.808, macrof1: 0.818, weightedf1: 0.818
2020-09-10 18:06:10 lwm-embeddings [INFO] 09/10/2020_18:06:10 -- Epoch: 11/20; Valid; loss: 0.631; acc: 0.688; precision: 0.690, recall: 0.683, macrof1: 0.688, weightedf1: 0.688
2020-09-10 18:06:10 lwm-embeddings [INFO] saving the model
2020-09-10 18:06:11 lwm-embeddings [INFO] 09/10/2020_18:06:11 -- Epoch: 12/20; Train; loss: 0.357; acc: 0.842; precision: 0.841, recall: 0.844, macrof1: 0.842, weightedf1: 0.842
2020-09-10 18:06:18 lwm-embeddings [INFO] 09/10/2020_18:06:18 -- Epoch: 12/20; Valid; loss: 0.635; acc: 0.695; precision: 0.702, recall: 0.677, macrof1: 0.695, weightedf1: 0.695
2020-09-10 18:06:18 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 11) at ./models/wikigaz_en_ft_ocr_rnn_v001_n500/wikigaz_en_ft_ocr_rnn_v001_n500.model
2020-09-10 18:06:18 lwm-embeddings [INFO] saving the model
2020-09-10 18:06:18 lwm-embeddings [INFO] Early stopping at epoch: 12, selected epoch: 11




User time: 100.9527
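
The n=500 run above stopped at epoch 12 and kept the checkpoint from epoch 11, the epoch with the lowest validation loss. The following is a minimal sketch of that selection rule, not DeezyMatch's own implementation: `select_checkpoint` is a hypothetical helper, and the patience of one epoch is an assumption inferred from this log rather than read from the input YAML.

```python
def select_checkpoint(valid_losses, patience=1):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once `patience` consecutive epochs fail to improve
    on the lowest validation loss seen so far (assumed rule).
    """
    best_loss = float("inf")
    best_epoch = 0
    bad_epochs = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    # No early stop triggered: training ran through every epoch.
    return len(valid_losses), best_epoch


# Validation losses logged for epochs 1-12 of the n=500 run.
losses_n500 = [0.930, 0.791, 0.728, 0.694, 0.673, 0.660,
               0.649, 0.642, 0.638, 0.632, 0.631, 0.635]
stop_epoch, best_epoch = select_checkpoint(losses_n500)
print(stop_epoch, best_epoch)
```

With these losses the helper reports a stop at epoch 12 and selects epoch 11, matching the log.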


In [37]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 1000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 18:06:18 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_rnn_model_A.yaml
2020-09-10 18:06:18 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 18:06:18 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 18:06:19 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 18:06:19 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 18:06:19 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.045389652252197266
2020-09-10 18:06:19 lwm-embeddings [INFO] splits are as follow:
not_assigned    58221
val             25380
train            1000
test                2
Name: split, dtype: int64
2020-09-10 18:06:21 lwm-embeddings [INFO] skipping 0 lines

List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True
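
The listing above marks only `fc1` and `fc2` as trainable, which is the model A setup. As a rough illustration of how small the trainable share is, the sketch below adds up the per-layer parameter counts from the model summary printed for this architecture. The dictionary is copied by hand from that summary; nothing here calls DeezyMatch or PyTorch.

```python
# Per-layer parameter counts copied from the printed model summary.
layer_params = {
    "emb": 452520,
    "rnn_1": 36480,
    "attn_step1": 7260,
    "attn_step2": 61,
    "fc1": 115320,
    "fc2": 242,
}
# Layers whose parameters are listed as True (trainable) for model A.
trainable_layers = {"fc1", "fc2"}

total = sum(layer_params.values())
trainable = sum(n for name, n in layer_params.items() if name in trainable_layers)
share = round(100 * trainable / total, 1)
print(total, trainable, share)
```

The total agrees with the reported 611883 parameters, of which 115562 (about 18.9%) are updated during fine-tuning.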



2020-09-10 18:06:21 lwm-embeddings [INFO] ******************************
2020-09-10 18:06:21 lwm-embeddings [INFO] **** (Bi-directional) RNN ****
2020-09-10 18:06:21 lwm-embeddings [INFO] ******************************

Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)

2020-09-10 18:06:22 lwm-embeddings [INFO] 09/10/2020_18:06:22 -- Epoch: 1/20; Train; loss: 0.965; acc: 0.499; precision: 0.499, recall: 0.492, macrof1: 0.499, weig
2020-09-10 18:06:30 lwm-embeddings [INFO] 09/10/2020_18:06:30 -- Epoch: 1/20; Valid; loss: 0.803; acc: 0.529; precision: 0.525, recall: 0.601, macrof1: 0.526, weightedf1: 0.526
2020-09-10 18:06:30 lwm-embeddings [INFO] saving the model
2020-09-10 18:06:30 lwm-embeddings [INFO] 09/10/2020_18:06:30 -- Epoch: 2/20; Train; loss: 0.689; acc: 0.591; precision: 0.578, recall: 0.674, macrof1: 0.588, weightedf1: 0.588
2020-09-10 18:06:38 lwm-embeddings [INFO] 09/10/2020_18:06:38 -- Epoch: 2/20; Valid; loss: 0.696; acc: 0.568; precision: 0.557, recall: 0.660, macrof1: 0.564, weightedf1: 0.564
2020-09-10 18:06:38 lwm-embeddings [INFO] saving the model
2020-09-10 18:06:38 lwm-embeddings [INFO] 09/10/2020_18:06:38 -- Epoch: 3/20; Train; loss: 0.612; acc: 0.633; precision: 0.621, recall: 0.684, macrof1: 0.632, weightedf1: 0.632
2020-09-10 18:06:46 lwm-embeddings [INFO] 09/10/2020_18:06:46 -- Epoch: 3/20; Valid; loss: 0.656; acc: 0.614; precision: 0.607, recall: 0.650, macrof1: 0.614, weightedf1: 0.614
2020-09-10 18:06:46 lwm-embeddings [INFO] saving the model
2020-09-10 18:06:47 lwm-embeddings [INFO] 09/10/2020_18:06:47 -- Epoch: 4/20; Train; loss: 0.564; acc: 0.694; precision: 0.690, recall: 0.704, macrof1: 0.694, weightedf1: 0.694
2020-09-10 18:06:56 lwm-embeddings [INFO] 09/10/2020_18:06:56 -- Epoch: 4/20; Valid; loss: 0.632; acc: 0.658; precision: 0.655, recall: 0.665, macrof1: 0.658, weightedf1: 0.658
2020-09-10 18:06:56 lwm-embeddings [INFO] saving the model
2020-09-10 18:06:56 lwm-embeddings [INFO] 09/10/2020_18:06:56 -- Epoch: 5/20; Train; loss: 0.520; acc: 0.722; precision: 0.715, recall: 0.738, macrof1: 0.722, weightedf1: 0.722
2020-09-10 18:07:05 lwm-embeddings [INFO] 09/10/2020_18:07:05 -- Epoch: 5/20; Valid; loss: 0.614; acc: 0.683; precision: 0.689, recall: 0.665, macrof1: 0.683, weightedf1: 0.683
2020-09-10 18:07:05 lwm-embeddings [INFO] saving the model
2020-09-10 18:07:05 lwm-embeddings [INFO] 09/10/2020_18:07:05 -- Epoch: 6/20; Train; loss: 0.479; acc: 0.755; precision: 0.775, recall: 0.718, macrof1: 0.755, weightedf1: 0.755
2020-09-10 18:07:14 lwm-embeddings [INFO] 09/10/2020_18:07:14 -- Epoch: 6/20; Valid; loss: 0.600; acc: 0.695; precision: 0.696, recall: 0.694, macrof1: 0.695, weightedf1: 0.695
2020-09-10 18:07:14 lwm-embeddings [INFO] saving the model
2020-09-10 18:07:15 lwm-embeddings [INFO] 09/10/2020_18:07:15 -- Epoch: 7/20; Train; loss: 0.440; acc: 0.784; precision: 0.780, recall: 0.792, macrof1: 0.784, weightedf1: 0.784
2020-09-10 18:07:24 lwm-embeddings [INFO] 09/10/2020_18:07:24 -- Epoch: 7/20; Valid; loss: 0.593; acc: 0.706; precision: 0.710, recall: 0.696, macrof1: 0.706, weightedf1: 0.706
2020-09-10 18:07:24 lwm-embeddings [INFO] saving the model
2020-09-10 18:07:24 lwm-embeddings [INFO] 09/10/2020_18:07:24 -- Epoch: 8/20; Train; loss: 0.393; acc: 0.820; precision: 0.817, recall: 0.824, macrof1: 0.820, weightedf1: 0.820
2020-09-10 18:07:33 lwm-embeddings [INFO] 09/10/2020_18:07:33 -- Epoch: 8/20; Valid; loss: 0.591; acc: 0.714; precision: 0.708, recall: 0.726, macrof1: 0.714, weightedf1: 0.714
2020-09-10 18:07:33 lwm-embeddings [INFO] saving the model
2020-09-10 18:07:33 lwm-embeddings [INFO] 09/10/2020_18:07:33 -- Epoch: 9/20; Train; loss: 0.361; acc: 0.841; precision: 0.865, recall: 0.808, macrof1: 0.841, weightedf1: 0.841
2020-09-10 18:07:42 lwm-embeddings [INFO] 09/10/2020_18:07:42 -- Epoch: 9/20; Valid; loss: 0.585; acc: 0.721; precision: 0.713, recall: 0.742, macrof1: 0.721, weightedf1: 0.721
2020-09-10 18:07:42 lwm-embeddings [INFO] saving the model
2020-09-10 18:07:43 lwm-embeddings [INFO] 09/10/2020_18:07:43 -- Epoch: 10/20; Train; loss: 0.327; acc: 0.872; precision: 0.852, recall: 0.900, macrof1: 0.872, weightedf1: 0.872
2020-09-10 18:07:51 lwm-embeddings [INFO] 09/10/2020_18:07:51 -- Epoch: 10/20; Valid; loss: 0.589; acc: 0.723; precision: 0.734, recall: 0.699, macrof1: 0.723, weightedf1: 0.723
2020-09-10 18:07:51 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 9) at ./models/wikigaz_en_ft_ocr_rnn_v001_n1000/wikigaz_en_ft_ocr_rnn_v001_n1000.model
2020-09-10 18:07:51 lwm-embeddings [INFO] saving the model
2020-09-10 18:07:51 lwm-embeddings [INFO] Early stopping at epoch: 10, selected epoch: 9




User time: 89.8996
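
As a sanity check on the split table logged for this n=1000 run, the four splits should add up to the number of labelled pairs read from `ocr_trainval.txt` (42301 True plus 42302 False). The sketch below verifies that and also estimates the number of batches per epoch. The batch size of 64 is an assumption; it is not shown anywhere in this output.

```python
import math

# Split sizes and label counts copied from the log of the n=1000 run.
splits = {"not_assigned": 58221, "val": 25380, "train": 1000, "test": 2}
n_true, n_false = 42301, 42302

# Every labelled pair should land in exactly one split.
assert sum(splits.values()) == n_true + n_false

# ASSUMPTION: batch size of 64 (not shown in this output).
ASSUMED_BATCH_SIZE = 64
batches = {
    name: math.ceil(splits[name] / ASSUMED_BATCH_SIZE)
    for name in ("train", "val")
}
print(batches)
```

Under that assumption, each epoch processes 16 training batches and 397 validation batches.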


In [38]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 2000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 18:07:51 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_rnn_model_A.yaml
2020-09-10 18:07:51 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 18:07:51 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 18:07:52 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 18:07:52 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 18:07:52 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.05057549476623535
2020-09-10 18:07:52 lwm-embeddings [INFO] splits are as follow:
not_assigned    57221
val             25380
train            2000
test                2
Name: split, dtype: int64
2020-09-10 18:07:54 lwm-embeddings [INFO] skipping 0 lines

List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



2020-09-10 18:07:54 lwm-embeddings [INFO] ******************************
2020-09-10 18:07:54 lwm-embeddings [INFO] **** (Bi-directional) RNN ****
2020-09-10 18:07:54 lwm-embeddings [INFO] ******************************

Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)

2020-09-10 18:07:55 lwm-embeddings [INFO] 09/10/2020_18:07:55 -- Epoch: 1/20; Train; loss: 0.814; acc: 0.550; precision: 0.545, recall: 0.607, macrof1: 0.549, weig
2020-09-10 18:08:03 lwm-embeddings [INFO] 09/10/2020_18:08:03 -- Epoch: 1/20; Valid; loss: 0.676; acc: 0.595; precision: 0.581, recall: 0.684, macrof1: 0.592, weightedf1: 0.592
2020-09-10 18:08:03 lwm-embeddings [INFO] saving the model
2020-09-10 18:08:04 lwm-embeddings [INFO] 09/10/2020_18:08:04 -- Epoch: 2/20; Train; loss: 0.602; acc: 0.667; precision: 0.652, recall: 0.717, macrof1: 0.666, weightedf1: 0.666
2020-09-10 18:08:11 lwm-embeddings [INFO] 09/10/2020_18:08:11 -- Epoch: 2/20; Valid; loss: 0.603; acc: 0.691; precision: 0.698, recall: 0.671, macrof1: 0.690, weightedf1: 0.690
2020-09-10 18:08:11 lwm-embeddings [INFO] saving the model
2020-09-10 18:08:12 lwm-embeddings [INFO] 09/10/2020_18:08:12 -- Epoch: 3/20; Train; loss: 0.531; acc: 0.723; precision: 0.717, recall: 0.736, macrof1: 0.723, weightedf1: 0.723
2020-09-10 18:08:20 lwm-embeddings [INFO] 09/10/2020_18:08:20 -- Epoch: 3/20; Valid; loss: 0.574; acc: 0.714; precision: 0.736, recall: 0.667, macrof1: 0.713, weightedf1: 0.713
2020-09-10 18:08:20 lwm-embeddings [INFO] saving the model
2020-09-10 18:08:21 lwm-embeddings [INFO] 09/10/2020_18:08:21 -- Epoch: 4/20; Train; loss: 0.486; acc: 0.756; precision: 0.753, recall: 0.761, macrof1: 0.756, weightedf1: 0.756
2020-09-10 18:08:29 lwm-embeddings [INFO] 09/10/2020_18:08:29 -- Epoch: 4/20; Valid; loss: 0.554; acc: 0.733; precision: 0.719, recall: 0.766, macrof1: 0.733, weightedf1: 0.733
2020-09-10 18:08:29 lwm-embeddings [INFO] saving the model
2020-09-10 18:08:30 lwm-embeddings [INFO] 09/10/2020_18:08:30 -- Epoch: 5/20; Train; loss: 0.442; acc: 0.785; precision: 0.777, recall: 0.799, macrof1: 0.785, weightedf1: 0.785
2020-09-10 18:08:38 lwm-embeddings [INFO] 09/10/2020_18:08:38 -- Epoch: 5/20; Valid; loss: 0.543; acc: 0.743; precision: 0.718, recall: 0.801, macrof1: 0.743, weightedf1: 0.743
2020-09-10 18:08:38 lwm-embeddings [INFO] saving the model
2020-09-10 18:08:39 lwm-embeddings [INFO] 09/10/2020_18:08:39 -- Epoch: 6/20; Train; loss: 0.402; acc: 0.809; precision: 0.800, recall: 0.824, macrof1: 0.809, weightedf1: 0.809
2020-09-10 18:08:46 lwm-embeddings [INFO] 09/10/2020_18:08:46 -- Epoch: 6/20; Valid; loss: 0.545; acc: 0.747; precision: 0.714, recall: 0.826, macrof1: 0.746, weightedf1: 0.746
2020-09-10 18:08:46 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_rnn_v001_n2000/wikigaz_en_ft_ocr_rnn_v001_n2000.model
2020-09-10 18:08:46 lwm-embeddings [INFO] saving the model
2020-09-10 18:08:46 lwm-embeddings [INFO] Early stopping at epoch: 6, selected epoch: 5




User time: 52.4146
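
The model A fine-tuning cells in this part of the notebook differ only in `n_ft_examples`. One way to avoid repeating the cell is to build the keyword arguments in a helper and loop over the sizes. This is a sketch: the helper name `finetune_kwargs` is hypothetical, while the argument names and paths are the ones used in the fine-tuning cells. The `dm_finetune` call is left commented out so that the sketch runs without DeezyMatch installed.

```python
def finetune_kwargs(n_ft_examples, version="001"):
    """Build the dm_finetune keyword arguments for one fine-tuning size."""
    pretrained = f"./models/wikigaz_en_rnn_{version}/wikigaz_en_rnn_{version}"
    return {
        "input_file_path": "./inputs/input_dfm_rnn_model_A.yaml",
        "dataset_path": "./dataset/ocr_trainval.txt",
        "model_name": f"wikigaz_en_ft_ocr_rnn_v{version}_n{n_ft_examples}",
        "pretrained_model_path": f"{pretrained}.model",
        "pretrained_vocab_path": f"{pretrained}.vocab",
        "n_train_examples": n_ft_examples,
    }


for n in (1000, 2000, 4000):
    kwargs = finetune_kwargs(n)
    print(kwargs["model_name"])
    # Uncomment to launch the run (requires DeezyMatch and the data files):
    # from DeezyMatch import finetune as dm_finetune
    # dm_finetune(**kwargs)
```

Keeping the arguments in one place makes it harder for a copied cell to drift, for example by changing `n_ft_examples` without changing the model name.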


In [39]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 4000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_A.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 18:08:46 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_rnn_model_A.yaml
2020-09-10 18:08:46 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 18:08:46 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 18:08:47 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 18:08:47 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 18:08:47 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.05385422706604004
2020-09-10 18:08:47 lwm-embeddings [INFO] splits are as follow:
not_assigned    55221
val             25380
train            4000
test                2
Name: split, dtype: int64
2020-09-10 18:08:49 lwm-embeddings [INFO] skipping 0 lines

List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



2020-09-10 18:08:50 lwm-embeddings [INFO] ******************************
2020-09-10 18:08:50 lwm-embeddings [INFO] **** (Bi-directional) RNN ****
2020-09-10 18:08:50 lwm-embeddings [INFO] ******************************

Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)

2020-09-10 18:08:51 lwm-embeddings [INFO] 09/10/2020_18:08:51 -- Epoch: 1/20; Train; loss: 0.742; acc: 0.588; precision: 0.582, recall: 0.629, macrof1: 0.588, weig
2020-09-10 18:08:59 lwm-embeddings [INFO] 09/10/2020_18:08:59 -- Epoch: 1/20; Valid; loss: 0.610; acc: 0.679; precision: 0.689, recall: 0.653, macrof1: 0.679, weightedf1: 0.679
2020-09-10 18:08:59 lwm-embeddings [INFO] saving the model
2020-09-10 18:09:01 lwm-embeddings [INFO] 09/10/2020_18:09:01 -- Epoch: 2/20; Train; loss: 0.548; acc: 0.720; precision: 0.723, recall: 0.714, macrof1: 0.720, weightedf1: 0.720
2020-09-10 18:09:09 lwm-embeddings [INFO] 09/10/2020_18:09:09 -- Epoch: 2/20; Valid; loss: 0.551; acc: 0.729; precision: 0.744, recall: 0.699, macrof1: 0.729, weightedf1: 0.729
2020-09-10 18:09:09 lwm-embeddings [INFO] saving the model
2020-09-10 18:09:11 lwm-embeddings [INFO] 09/10/2020_18:09:11 -- Epoch: 3/20; Train; loss: 0.490; acc: 0.764; precision: 0.760, recall: 0.771, macrof1: 0.764, weightedf1: 0.764
2020-09-10 18:09:19 lwm-embeddings [INFO] 09/10/2020_18:09:19 -- Epoch: 3/20; Valid; loss: 0.524; acc: 0.748; precision: 0.710, recall: 0.837, macrof1: 0.746, weightedf1: 0.746
2020-09-10 18:09:19 lwm-embeddings [INFO] saving the model
[92m2020-09-10 18:09:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:09:21 -- Epoch: 4/20; Train; loss: 0.438; acc: 0.802; precision: 0.791, recall: 0.822, macrof1: 0.802, weightedf1: 0.802[0m


[92m2020-09-10 18:09:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:09:29 -- Epoch: 4/20; Valid; loss: 0.496; acc: 0.769; precision: 0.775, recall: 0.758, macrof1: 0.769, weightedf1: 0.769[0m
[92m2020-09-10 18:09:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:09:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:09:31 -- Epoch: 5/20; Train; loss: 0.389; acc: 0.826; precision: 0.810, recall: 0.851, macrof1: 0.826, weightedf1: 0.826[0m


[92m2020-09-10 18:09:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:09:39 -- Epoch: 5/20; Valid; loss: 0.481; acc: 0.777; precision: 0.774, recall: 0.782, macrof1: 0.777, weightedf1: 0.777[0m
[92m2020-09-10 18:09:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:09:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:09:41 -- Epoch: 6/20; Train; loss: 0.347; acc: 0.853; precision: 0.837, recall: 0.875, macrof1: 0.852, weightedf1: 0.852[0m


[92m2020-09-10 18:09:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:09:49 -- Epoch: 6/20; Valid; loss: 0.483; acc: 0.781; precision: 0.796, recall: 0.754, macrof1: 0.781, weightedf1: 0.781[0m
[92m2020-09-10 18:09:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_rnn_v001_n4000/wikigaz_en_ft_ocr_rnn_v001_n4000.model[0m
[92m2020-09-10 18:09:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:09:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 59.4692
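
The run above stops at epoch 6 and keeps the checkpoint from epoch 5, the epoch with the lowest validation loss. The sketch below reproduces that selection from the logged validation losses. It assumes training halts the first time the validation loss fails to improve (a patience of one epoch), which is consistent with this log but not confirmed from the DeezyMatch source; `select_checkpoint` and `patience` are hypothetical names.

```python
def select_checkpoint(valid_losses, patience=1):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Assumption: training stops once `patience` consecutive epochs fail to
    improve on the best validation loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = 0
    bad_epochs = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    # No early stop was triggered: report the last epoch and the best one.
    return len(valid_losses), best_epoch


# Validation losses for epochs 1-6 of the n=4000 run logged above.
losses_n4000 = [0.610, 0.551, 0.524, 0.496, 0.481, 0.483]
stop_epoch, best_epoch = select_checkpoint(losses_n4000)
print(stop_epoch, best_epoch)  # 6 5
```

A larger `patience` would let training continue through short plateaus at the cost of extra epochs.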


In [40]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training pairs used for fine-tuning
n_ft_examples = 8000

# Fine-tune the pretrained WG:en model stored at pretrained_model_path and
# pretrained_vocab_path (model A config: embedding and recurrent units frozen).
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_A.yaml",
            dataset_path="./dataset/ocr_trainval.txt",
            model_name=f"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model",
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples)

[92m2020-09-10 18:09:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml[0m
[92m2020-09-10 18:09:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:09:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:09:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:09:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:09:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05468893051147461[0m
[92m2020-09-10 18:09:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    51221
val             25380
train            8000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 18:09:50[

                                                    

[92m2020-09-10 18:09:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:09:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:09:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 18:09:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m



Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:09:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:09:56 -- Epoch: 1/20; Train; loss: 0.649; acc: 0.654; precision: 0.645, recall: 0.686, macrof1: 0.654, weig

[92m2020-09-10 18:10:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:10:04 -- Epoch: 1/20; Valid; loss: 0.534; acc: 0.741; precision: 0.725, recall: 0.774, macrof1: 0.740, weightedf1: 0.740[0m
[92m2020-09-10 18:10:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:10:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:10:08 -- Epoch: 2/20; Train; loss: 0.487; acc: 0.769; precision: 0.756, recall: 0.795, macrof1: 0.769, weightedf1: 0.769[0m


[92m2020-09-10 18:10:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:10:16 -- Epoch: 2/20; Valid; loss: 0.487; acc: 0.768; precision: 0.726, recall: 0.861, macrof1: 0.766, weightedf1: 0.766[0m
[92m2020-09-10 18:10:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:10:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:10:19 -- Epoch: 3/20; Train; loss: 0.425; acc: 0.805; precision: 0.790, recall: 0.833, macrof1: 0.805, weightedf1: 0.805[0m


[92m2020-09-10 18:10:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:10:27 -- Epoch: 3/20; Valid; loss: 0.444; acc: 0.797; precision: 0.778, recall: 0.831, macrof1: 0.797, weightedf1: 0.797[0m
[92m2020-09-10 18:10:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:10:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:10:31 -- Epoch: 4/20; Train; loss: 0.372; acc: 0.835; precision: 0.816, recall: 0.864, macrof1: 0.835, weightedf1: 0.835[0m


[92m2020-09-10 18:10:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:10:39 -- Epoch: 4/20; Valid; loss: 0.429; acc: 0.807; precision: 0.792, recall: 0.832, macrof1: 0.807, weightedf1: 0.807[0m
[92m2020-09-10 18:10:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:10:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:10:43 -- Epoch: 5/20; Train; loss: 0.326; acc: 0.863; precision: 0.846, recall: 0.886, macrof1: 0.862, weightedf1: 0.862[0m


[92m2020-09-10 18:10:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:10:51 -- Epoch: 5/20; Valid; loss: 0.426; acc: 0.812; precision: 0.795, recall: 0.841, macrof1: 0.812, weightedf1: 0.812[0m
[92m2020-09-10 18:10:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:10:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:10:54 -- Epoch: 6/20; Train; loss: 0.291; acc: 0.879; precision: 0.864, recall: 0.898, macrof1: 0.879, weightedf1: 0.879[0m


[92m2020-09-10 18:11:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:11:02 -- Epoch: 6/20; Valid; loss: 0.435; acc: 0.810; precision: 0.767, recall: 0.889, macrof1: 0.808, weightedf1: 0.808[0m
[92m2020-09-10 18:11:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_rnn_v001_n8000/wikigaz_en_ft_ocr_rnn_v001_n8000.model[0m
[92m2020-09-10 18:11:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:11:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 70.3708
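
The `True`/`False` flags in the parameter list printed for this run mark which tensors are updated during fine-tuning: with the model A config only `fc1` and `fc2` are flagged `True`. As a rough check, the sketch below recomputes the per-layer parameter counts from the weight shapes in the model summary and splits them into trainable and frozen totals. The dictionary layout and variable names are illustrative only and are not part of DeezyMatch.

```python
from math import prod

# Weight shapes copied from the model summary printed above.
rnn_layer0 = [(60, 60), (60, 60), (60,), (60,)]   # weight_ih, weight_hh, bias_ih, bias_hh
rnn_layer1 = [(60, 120), (60, 60), (60,), (60,)]  # second layer sees both directions (120)
layer_shapes = {
    "emb": [(7542, 60)],
    "rnn_1": 2 * rnn_layer0 + 2 * rnn_layer1,     # forward and reverse direction per layer
    "attn_step1": [(60, 120), (60,)],
    "attn_step2": [(1, 60), (1,)],
    "fc1": [(120, 960), (120,)],
    "fc2": [(2, 120), (2,)],
}
# Layers flagged True in the parameter list (model A).
trainable_layers = {"fc1", "fc2"}

counts = {name: sum(prod(shape) for shape in shapes)
          for name, shapes in layer_shapes.items()}
total = sum(counts.values())
trainable = sum(n for name, n in counts.items() if name in trainable_layers)
frozen = total - trainable

print(counts)
print(total, trainable, frozen)  # 611883 115562 496321
```

The total matches the `Total number of params: 611883` line in the log. Under this reading, roughly 19% of the parameters are updated when fine-tuning model A.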


In [41]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training pairs used for fine-tuning
n_ft_examples = 16000

# Fine-tune the pretrained WG:en model stored at pretrained_model_path and
# pretrained_vocab_path (model A config: embedding and recurrent units frozen).
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_A.yaml",
            dataset_path="./dataset/ocr_trainval.txt",
            model_name=f"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model",
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples)

[92m2020-09-10 18:11:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml[0m
[92m2020-09-10 18:11:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:11:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:11:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:11:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:11:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.0556797981262207[0m
[92m2020-09-10 18:11:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    43221
val             25380
train           16000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 18:11:03[0

[92m2020-09-10 18:11:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:11:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:11:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 18:11:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m



Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:11:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:11:13 -- Epoch: 1/20; Train; loss: 0.574; acc: 0.707; precision: 0.696, recall: 0.734, macrof1: 0.707, weig

[92m2020-09-10 18:11:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:11:21 -- Epoch: 1/20; Valid; loss: 0.479; acc: 0.777; precision: 0.760, recall: 0.807, macrof1: 0.776, weightedf1: 0.776[0m
[92m2020-09-10 18:11:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:11:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:11:28 -- Epoch: 2/20; Train; loss: 0.428; acc: 0.803; precision: 0.790, recall: 0.827, macrof1: 0.803, weightedf1: 0.803[0m


[92m2020-09-10 18:11:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:11:36 -- Epoch: 2/20; Valid; loss: 0.426; acc: 0.809; precision: 0.813, recall: 0.802, macrof1: 0.809, weightedf1: 0.809[0m
[92m2020-09-10 18:11:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:11:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:11:43 -- Epoch: 3/20; Train; loss: 0.368; acc: 0.837; precision: 0.822, recall: 0.861, macrof1: 0.837, weightedf1: 0.837[0m


[92m2020-09-10 18:11:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:11:51 -- Epoch: 3/20; Valid; loss: 0.397; acc: 0.825; precision: 0.800, recall: 0.867, macrof1: 0.825, weightedf1: 0.825[0m
[92m2020-09-10 18:11:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:11:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:11:58 -- Epoch: 4/20; Train; loss: 0.323; acc: 0.863; precision: 0.849, recall: 0.884, macrof1: 0.863, weightedf1: 0.863[0m


[92m2020-09-10 18:12:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:12:06 -- Epoch: 4/20; Valid; loss: 0.384; acc: 0.834; precision: 0.814, recall: 0.866, macrof1: 0.834, weightedf1: 0.834[0m
[92m2020-09-10 18:12:07[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:12:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:12:14 -- Epoch: 5/20; Train; loss: 0.290; acc: 0.878; precision: 0.866, recall: 0.894, macrof1: 0.878, weightedf1: 0.878[0m


[92m2020-09-10 18:12:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:12:22 -- Epoch: 5/20; Valid; loss: 0.384; acc: 0.834; precision: 0.799, recall: 0.894, macrof1: 0.834, weightedf1: 0.834[0m
[92m2020-09-10 18:12:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:12:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:12:29 -- Epoch: 6/20; Train; loss: 0.259; acc: 0.893; precision: 0.879, recall: 0.912, macrof1: 0.893, weightedf1: 0.893[0m


[92m2020-09-10 18:12:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:12:37 -- Epoch: 6/20; Valid; loss: 0.387; acc: 0.839; precision: 0.830, recall: 0.855, macrof1: 0.839, weightedf1: 0.839[0m
[92m2020-09-10 18:12:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_rnn_v001_n16000/wikigaz_en_ft_ocr_rnn_v001_n16000.model[0m
[92m2020-09-10 18:12:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:12:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 91.5719
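
The fine-tuning cells in this part of the notebook repeat the same `dm_finetune` call and change only `n_ft_examples` (the largest run also switches the input file). One way to cut the repetition is to build the keyword arguments in a small helper. This is a sketch only: `finetune_kwargs` is a hypothetical helper, while the argument names and paths are copied from the cells.

```python
def finetune_kwargs(n_ft_examples,
                    input_file_path="./inputs/input_dfm_rnn_model_A.yaml"):
    """Build the keyword arguments for one model-A fine-tuning run."""
    pretrained = "./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001"
    return {
        "input_file_path": input_file_path,
        "dataset_path": "./dataset/ocr_trainval.txt",
        "model_name": f"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}",
        "pretrained_model_path": f"{pretrained}.model",
        "pretrained_vocab_path": f"{pretrained}.vocab",
        "n_train_examples": n_ft_examples,
    }


for n in (8000, 16000, 32000):
    kwargs = finetune_kwargs(n)
    print(kwargs["model_name"])
    # With DeezyMatch installed, each run would then be launched with:
    # dm_finetune(**kwargs)
```

Keeping one cell per run, as the notebook does, has the advantage that each log stays attached to the call that produced it.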


In [42]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training pairs used for fine-tuning
n_ft_examples = 32000

# Fine-tune the pretrained WG:en model stored at pretrained_model_path and
# pretrained_vocab_path (model A config: embedding and recurrent units frozen).
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_A.yaml",
            dataset_path="./dataset/ocr_trainval.txt",
            model_name=f"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model",
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples)

[92m2020-09-10 18:12:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_A.yaml[0m
[92m2020-09-10 18:12:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:12:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:12:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:12:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:12:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.06780815124511719[0m
[92m2020-09-10 18:12:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train           32000
not_assigned    27221
val             25380
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 18:12:38[

[92m2020-09-10 18:12:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:12:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:12:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 18:12:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m



Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:12:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:12:55 -- Epoch: 1/20; Train; loss: 0.509; acc: 0.753; precision: 0.743, recall: 0.775, macrof1: 0.753, weig

[92m2020-09-10 18:13:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:13:03 -- Epoch: 1/20; Valid; loss: 0.406; acc: 0.819; precision: 0.803, recall: 0.846, macrof1: 0.819, weightedf1: 0.819[0m
[92m2020-09-10 18:13:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:13:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:13:17 -- Epoch: 2/20; Train; loss: 0.372; acc: 0.838; precision: 0.821, recall: 0.863, macrof1: 0.838, weightedf1: 0.838[0m


[92m2020-09-10 18:13:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:13:25 -- Epoch: 2/20; Valid; loss: 0.367; acc: 0.838; precision: 0.822, recall: 0.863, macrof1: 0.838, weightedf1: 0.838[0m
[92m2020-09-10 18:13:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:13:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:13:40 -- Epoch: 3/20; Train; loss: 0.327; acc: 0.861; precision: 0.845, recall: 0.884, macrof1: 0.861, weightedf1: 0.861[0m


[92m2020-09-10 18:13:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:13:48 -- Epoch: 3/20; Valid; loss: 0.352; acc: 0.849; precision: 0.833, recall: 0.872, macrof1: 0.849, weightedf1: 0.849[0m
[92m2020-09-10 18:13:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:14:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:14:02 -- Epoch: 4/20; Train; loss: 0.291; acc: 0.877; precision: 0.861, recall: 0.898, macrof1: 0.876, weightedf1: 0.876[0m


[92m2020-09-10 18:14:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:14:10 -- Epoch: 4/20; Valid; loss: 0.344; acc: 0.852; precision: 0.845, recall: 0.863, macrof1: 0.852, weightedf1: 0.852[0m
[92m2020-09-10 18:14:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:14:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:14:23 -- Epoch: 5/20; Train; loss: 0.264; acc: 0.890; precision: 0.876, recall: 0.909, macrof1: 0.890, weightedf1: 0.890[0m


[92m2020-09-10 18:14:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:14:31 -- Epoch: 5/20; Valid; loss: 0.350; acc: 0.852; precision: 0.824, recall: 0.895, macrof1: 0.852, weightedf1: 0.852[0m
[92m2020-09-10 18:14:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v001_n32000/wikigaz_en_ft_ocr_rnn_v001_n32000.model[0m
[92m2020-09-10 18:14:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:14:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 5, selected epoch: 4[0m




User time: 110.5102
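
Each epoch is logged with its loss, accuracy, precision, recall and F1 scores. To compare runs across fine-tuning sizes, these values can be pulled out of the log text with a regular expression. The sketch below assumes the line format shown in the logs above; `METRIC_LINE` and `parse_metric_line` are hypothetical names, and only the epoch, phase, loss and accuracy fields are extracted.

```python
import re

# Matches the metric part of a log line, e.g.
# "Epoch: 4/20; Valid; loss: 0.344; acc: 0.852; ..."
METRIC_LINE = re.compile(
    r"Epoch: (?P<epoch>\d+)/(?P<n_epochs>\d+); (?P<phase>Train|Valid); "
    r"loss: (?P<loss>[\d.]+); acc: (?P<acc>[\d.]+)"
)


def parse_metric_line(line):
    """Return epoch, phase, loss and accuracy from one log line, or None."""
    match = METRIC_LINE.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "phase": match["phase"],
        "loss": float(match["loss"]),
        "acc": float(match["acc"]),
    }


# Metric text of the epoch-4 validation line from the n=32000 run above.
line = ("09/10/2020_18:14:10 -- Epoch: 4/20; Valid; loss: 0.344; acc: 0.852; "
        "precision: 0.845, recall: 0.863, macrof1: 0.852, weightedf1: 0.852")
print(parse_metric_line(line))
```

Because the pattern uses `search`, the timestamp and colour-code prefix of each log line can be left in place; lines without metrics return `None`.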


In [43]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training pairs used for fine-tuning
n_ft_examples = 64000

# Fine-tune the pretrained WG:en model stored at pretrained_model_path and
# pretrained_vocab_path (model A config, here using the input file without early stopping).
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_A_no_early_stopping.yaml",
            dataset_path="./dataset/ocr_trainval.txt",
            model_name=f"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model",
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples)

[92m2020-09-10 18:14:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_A_no_early_stopping.yaml[0m
[92m2020-09-10 18:14:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:14:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:14:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:14:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:14:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05356478691101074[0m
[92m2020-09-10 18:14:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train    64000
val      20603
Name: split, dtype: int64[0m
[92m2020-09-10 18:14:32[0m [95mlwm-embeddings[0m [1m[90m[INF

[92m2020-09-10 18:14:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:14:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:14:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 18:14:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m



Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:15:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:15:02 -- Epoch: 1/10; Train; loss: 0.448; acc: 0.791; precision: 0.777, recall: 0.817, macrof1: 0.791, weig

[92m2020-09-10 18:15:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:15:09 -- Epoch: 1/10; Valid; loss: 0.386; acc: 0.829; precision: 0.862, recall: 0.783, macrof1: 0.828, weightedf1: 0.828[0m
[92m2020-09-10 18:15:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:15:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:15:37 -- Epoch: 2/10; Train; loss: 0.337; acc: 0.855; precision: 0.839, recall: 0.878, macrof1: 0.854, weightedf1: 0.854[0m


[92m2020-09-10 18:15:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:15:43 -- Epoch: 2/10; Valid; loss: 0.346; acc: 0.849; precision: 0.809, recall: 0.915, macrof1: 0.849, weightedf1: 0.849[0m
[92m2020-09-10 18:15:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:16:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:16:10 -- Epoch: 3/10; Train; loss: 0.298; acc: 0.873; precision: 0.858, recall: 0.895, macrof1: 0.873, weightedf1: 0.873[0m


[92m2020-09-10 18:16:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:16:17 -- Epoch: 3/10; Valid; loss: 0.323; acc: 0.862; precision: 0.850, recall: 0.880, macrof1: 0.862, weightedf1: 0.862[0m
[92m2020-09-10 18:16:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:16:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:16:45 -- Epoch: 4/10; Train; loss: 0.271; acc: 0.885; precision: 0.870, recall: 0.905, macrof1: 0.885, weightedf1: 0.885[0m


[92m2020-09-10 18:16:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:16:51 -- Epoch: 4/10; Valid; loss: 0.324; acc: 0.861; precision: 0.836, recall: 0.898, macrof1: 0.860, weightedf1: 0.860[0m
[92m2020-09-10 18:16:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:17:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:17:20 -- Epoch: 5/10; Train; loss: 0.250; acc: 0.893; precision: 0.880, recall: 0.911, macrof1: 0.893, weightedf1: 0.893[0m


[92m2020-09-10 18:17:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:17:27 -- Epoch: 5/10; Valid; loss: 0.323; acc: 0.864; precision: 0.829, recall: 0.918, macrof1: 0.864, weightedf1: 0.864[0m
[92m2020-09-10 18:17:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:17:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:17:56 -- Epoch: 6/10; Train; loss: 0.233; acc: 0.901; precision: 0.889, recall: 0.917, macrof1: 0.901, weightedf1: 0.901[0m


[92m2020-09-10 18:18:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:18:04 -- Epoch: 6/10; Valid; loss: 0.319; acc: 0.867; precision: 0.865, recall: 0.869, macrof1: 0.867, weightedf1: 0.867[0m
[92m2020-09-10 18:18:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:18:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:18:33 -- Epoch: 7/10; Train; loss: 0.217; acc: 0.909; precision: 0.898, recall: 0.924, macrof1: 0.909, weightedf1: 0.909[0m


[92m2020-09-10 18:18:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:18:39 -- Epoch: 7/10; Valid; loss: 0.337; acc: 0.863; precision: 0.832, recall: 0.910, macrof1: 0.863, weightedf1: 0.863[0m
[92m2020-09-10 18:18:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:19:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:19:11 -- Epoch: 8/10; Train; loss: 0.203; acc: 0.915; precision: 0.904, recall: 0.928, macrof1: 0.915, weightedf1: 0.915[0m


[92m2020-09-10 18:19:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:19:22 -- Epoch: 8/10; Valid; loss: 0.336; acc: 0.867; precision: 0.833, recall: 0.917, macrof1: 0.867, weightedf1: 0.867[0m
[92m2020-09-10 18:19:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:20:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:20:00 -- Epoch: 9/10; Train; loss: 0.190; acc: 0.922; precision: 0.912, recall: 0.935, macrof1: 0.922, weightedf1: 0.922[0m


[92m2020-09-10 18:20:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:20:08 -- Epoch: 9/10; Valid; loss: 0.339; acc: 0.869; precision: 0.857, recall: 0.886, macrof1: 0.869, weightedf1: 0.869[0m
[92m2020-09-10 18:20:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:20:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:20:37 -- Epoch: 10/10; Train; loss: 0.180; acc: 0.926; precision: 0.917, recall: 0.937, macrof1: 0.926, weightedf1: 0.926[0m


[92m2020-09-10 18:20:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:20:43 -- Epoch: 10/10; Valid; loss: 0.359; acc: 0.861; precision: 0.879, recall: 0.837, macrof1: 0.861, weightedf1: 0.861[0m
[92m2020-09-10 18:20:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m

[92m2020-09-10 18:20:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model with least valid loss (checkpoint: 6) at ./models/wikigaz_en_ft_ocr_rnn_v001_n64000/wikigaz_en_ft_ocr_rnn_v001_n64000.model[0m



User time: 368.7503
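
The last log line of this run reports that checkpoint 6 had the lowest validation loss. As a quick sanity check, the minimal sketch below copies the ten validation losses from the `Valid` lines above and picks the epoch with the smallest value. The variable names are illustrative and not part of DeezyMatch; the snippet only re-derives the selection from the printed numbers.

```python
# Validation losses for epochs 1..10, copied from the "Valid" log lines above
valid_loss = [0.386, 0.346, 0.323, 0.324, 0.323, 0.319, 0.337, 0.336, 0.339, 0.359]

# Epochs are 1-indexed in the log, list positions are 0-indexed
best_epoch = min(range(len(valid_loss)), key=valid_loss.__getitem__) + 1

print(f"epoch with least valid loss: {best_epoch} (loss={valid_loss[best_epoch - 1]:.3f})")
# prints: epoch with least valid loss: 6 (loss=0.319)
```

Note that validation accuracy peaks later (0.869 at epoch 9), so selecting by loss and selecting by accuracy would give different checkpoints for this run.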


In [44]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 84000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_A_no_early_stopping.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v001_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:20:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_A_no_early_stopping.yaml[0m
[92m2020-09-10 18:20:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:20:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:20:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:20:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:20:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05586600303649902[0m
[92m2020-09-10 18:20:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train    84000
val        603
Name: split, dtype: int64[0m
[92m2020-09-10 18:20:44[0m [95mlwm-embeddings[0m [1m[90m[INF

[92m2020-09-10 18:20:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 False
rnn_1.weight_hh_l0 False
rnn_1.bias_ih_l0 False
rnn_1.bias_hh_l0 False
rnn_1.weight_ih_l0_reverse False
rnn_1.weight_hh_l0_reverse False
rnn_1.bias_ih_l0_reverse False
rnn_1.bias_hh_l0_reverse False
rnn_1.weight_ih_l1 False
rnn_1.weight_hh_l1 False
rnn_1.bias_ih_l1 False
rnn_1.bias_hh_l1 False
rnn_1.weight_ih_l1_reverse False
rnn_1.weight_hh_l1_reverse False
rnn_1.bias_ih_l1_reverse False
rnn_1.bias_hh_l1_reverse False
attn_step1.weight False
attn_step1.bias False
attn_step2.weight False
attn_step2.bias False
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:20:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:20:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 18:20:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2




Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:21:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:21:43 -- Epoch: 1/10; Train; loss: 0.427; acc: 0.805; precision: 0.790, recall: 0.830, macrof1: 0.804, weig

[92m2020-09-10 18:21:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:21:43 -- Epoch: 1/10; Valid; loss: 0.365; acc: 0.836; precision: 0.824, recall: 0.854, macrof1: 0.836, weightedf1: 0.836[0m
[92m2020-09-10 18:21:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:22:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:22:21 -- Epoch: 2/10; Train; loss: 0.323; acc: 0.861; precision: 0.845, recall: 0.884, macrof1: 0.861, weightedf1: 0.861[0m


[92m2020-09-10 18:22:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:22:21 -- Epoch: 2/10; Valid; loss: 0.321; acc: 0.864; precision: 0.829, recall: 0.917, macrof1: 0.864, weightedf1: 0.864[0m
[92m2020-09-10 18:22:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:23:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:23:01 -- Epoch: 3/10; Train; loss: 0.289; acc: 0.878; precision: 0.863, recall: 0.898, macrof1: 0.878, weightedf1: 0.878[0m


[92m2020-09-10 18:23:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:23:01 -- Epoch: 3/10; Valid; loss: 0.325; acc: 0.864; precision: 0.833, recall: 0.910, macrof1: 0.864, weightedf1: 0.864[0m
[92m2020-09-10 18:23:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:23:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:23:57 -- Epoch: 4/10; Train; loss: 0.265; acc: 0.889; precision: 0.875, recall: 0.907, macrof1: 0.889, weightedf1: 0.889[0m


[92m2020-09-10 18:23:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:23:58 -- Epoch: 4/10; Valid; loss: 0.312; acc: 0.876; precision: 0.865, recall: 0.890, macrof1: 0.876, weightedf1: 0.876[0m
[92m2020-09-10 18:23:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:24:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:24:36 -- Epoch: 5/10; Train; loss: 0.245; acc: 0.897; precision: 0.885, recall: 0.914, macrof1: 0.897, weightedf1: 0.897[0m


[92m2020-09-10 18:24:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:24:36 -- Epoch: 5/10; Valid; loss: 0.318; acc: 0.872; precision: 0.852, recall: 0.900, macrof1: 0.872, weightedf1: 0.872[0m
[92m2020-09-10 18:24:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:25:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:25:16 -- Epoch: 6/10; Train; loss: 0.230; acc: 0.904; precision: 0.893, recall: 0.919, macrof1: 0.904, weightedf1: 0.904[0m


[92m2020-09-10 18:25:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:25:16 -- Epoch: 6/10; Valid; loss: 0.324; acc: 0.871; precision: 0.845, recall: 0.907, macrof1: 0.870, weightedf1: 0.870[0m
[92m2020-09-10 18:25:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:25:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:25:59 -- Epoch: 7/10; Train; loss: 0.216; acc: 0.910; precision: 0.899, recall: 0.924, macrof1: 0.910, weightedf1: 0.910[0m


[92m2020-09-10 18:26:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:26:00 -- Epoch: 7/10; Valid; loss: 0.329; acc: 0.869; precision: 0.849, recall: 0.897, macrof1: 0.869, weightedf1: 0.869[0m
[92m2020-09-10 18:26:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:26:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:26:52 -- Epoch: 8/10; Train; loss: 0.204; acc: 0.915; precision: 0.904, recall: 0.928, macrof1: 0.915, weightedf1: 0.915[0m


[92m2020-09-10 18:26:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:26:52 -- Epoch: 8/10; Valid; loss: 0.382; acc: 0.859; precision: 0.827, recall: 0.907, macrof1: 0.859, weightedf1: 0.859[0m
[92m2020-09-10 18:26:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:27:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:27:32 -- Epoch: 9/10; Train; loss: 0.194; acc: 0.920; precision: 0.910, recall: 0.931, macrof1: 0.920, weightedf1: 0.920[0m


[92m2020-09-10 18:27:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:27:32 -- Epoch: 9/10; Valid; loss: 0.337; acc: 0.876; precision: 0.862, recall: 0.894, macrof1: 0.876, weightedf1: 0.876[0m
[92m2020-09-10 18:27:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:28:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:28:29 -- Epoch: 10/10; Train; loss: 0.184; acc: 0.924; precision: 0.914, recall: 0.935, macrof1: 0.924, weightedf1: 0.924[0m


[92m2020-09-10 18:28:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:28:29 -- Epoch: 10/10; Valid; loss: 0.381; acc: 0.867; precision: 0.860, recall: 0.877, macrof1: 0.867, weightedf1: 0.867[0m
[92m2020-09-10 18:28:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m

[92m2020-09-10 18:28:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v001_n84000/wikigaz_en_ft_ocr_rnn_v001_n84000.model[0m



User time: 461.5594
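
The parameter listing for this model A run marks everything except `fc1` and `fc2` as `False`, so besides the embedding and recurrent layers the two attention layers are frozen here as well. The sketch below is a rough bookkeeping check, not DeezyMatch code: it recomputes the per-layer parameter counts from the weight shapes in the model summary above (the `rnn_1` count is taken directly from the summary) and splits them into trainable and frozen parts. The dictionary and variable names are made up for illustration.

```python
# Per-layer parameter counts, derived from the shapes in the model summary above
param_counts = {
    "emb": 7542 * 60,             # Embedding(7542, 60)
    "rnn_1": 36480,               # value reported in the summary for the 2-layer bi-RNN
    "attn_step1": 120 * 60 + 60,  # Linear(120 -> 60) weights + bias
    "attn_step2": 60 * 1 + 1,     # Linear(60 -> 1) weights + bias
    "fc1": 960 * 120 + 120,       # Linear(960 -> 120) weights + bias
    "fc2": 120 * 2 + 2,           # Linear(120 -> 2) weights + bias
}

# Layers flagged True in the "List all parameters in the model" output of this run
trainable_layers = {"fc1", "fc2"}

total = sum(param_counts.values())
n_trainable = sum(n for name, n in param_counts.items() if name in trainable_layers)
n_frozen = total - n_trainable

print(f"total={total}, trainable={n_trainable}, frozen={n_frozen}")
```

The total agrees with the `Total number of params: 611883` line in the log, and only about 19% of those parameters are updated during this fine-tuning run.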


## Fine-Tune, model B, GRU

In [45]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 250

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_B.yaml",
            dataset_path="./dataset/ocr_trainval.txt",
            model_name=f"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model",
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:28:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml[0m
[92m2020-09-10 18:28:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:28:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:28:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:28:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:28:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05349993705749512[0m
[92m2020-09-10 18:28:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58971
val             25380
train             250
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 18:28:29[

[92m2020-09-10 18:28:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:28:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:28:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 18:28:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:28:32[




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:28:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:28:32 -- Epoch: 1/20; Train; loss: 1.552; acc: 0.460; precision: 0.456, recall: 0.416, mac

[92m2020-09-10 18:28:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:28:40 -- Epoch: 1/20; Valid; loss: 1.395; acc: 0.483; precision: 0.481, recall: 0.436, macrof1: 0.482, weightedf1: 0.482[0m
[92m2020-09-10 18:28:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:28:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:28:41 -- Epoch: 2/20; Train; loss: 0.882; acc: 0.612; precision: 0.637, recall: 0.520, macrof1: 0.609, weightedf1: 0.609[0m


[92m2020-09-10 18:28:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:28:49 -- Epoch: 2/20; Valid; loss: 1.187; acc: 0.507; precision: 0.508, recall: 0.483, macrof1: 0.507, weightedf1: 0.507[0m
[92m2020-09-10 18:28:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:28:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:28:49 -- Epoch: 3/20; Train; loss: 0.596; acc: 0.724; precision: 0.764, recall: 0.648, macrof1: 0.722, weightedf1: 0.722[0m


[92m2020-09-10 18:28:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:28:57 -- Epoch: 3/20; Valid; loss: 1.060; acc: 0.532; precision: 0.530, recall: 0.558, macrof1: 0.532, weightedf1: 0.532[0m
[92m2020-09-10 18:28:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:28:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:28:58 -- Epoch: 4/20; Train; loss: 0.449; acc: 0.808; precision: 0.824, recall: 0.784, macrof1: 0.808, weightedf1: 0.808[0m


[92m2020-09-10 18:29:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:29:06 -- Epoch: 4/20; Valid; loss: 0.989; acc: 0.549; precision: 0.543, recall: 0.618, macrof1: 0.547, weightedf1: 0.547[0m
[92m2020-09-10 18:29:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:29:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:29:06 -- Epoch: 5/20; Train; loss: 0.354; acc: 0.856; precision: 0.845, recall: 0.872, macrof1: 0.856, weightedf1: 0.856[0m


[92m2020-09-10 18:29:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:29:15 -- Epoch: 5/20; Valid; loss: 0.952; acc: 0.564; precision: 0.554, recall: 0.654, macrof1: 0.560, weightedf1: 0.560[0m
[92m2020-09-10 18:29:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:29:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:29:15 -- Epoch: 6/20; Train; loss: 0.292; acc: 0.912; precision: 0.893, recall: 0.936, macrof1: 0.912, weightedf1: 0.912[0m


[92m2020-09-10 18:29:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:29:23 -- Epoch: 6/20; Valid; loss: 0.930; acc: 0.576; precision: 0.564, recall: 0.664, macrof1: 0.572, weightedf1: 0.572[0m
[92m2020-09-10 18:29:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:29:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:29:23 -- Epoch: 7/20; Train; loss: 0.237; acc: 0.932; precision: 0.915, recall: 0.952, macrof1: 0.932, weightedf1: 0.932[0m


[92m2020-09-10 18:29:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:29:32 -- Epoch: 7/20; Valid; loss: 0.921; acc: 0.587; precision: 0.575, recall: 0.668, macrof1: 0.584, weightedf1: 0.584[0m
[92m2020-09-10 18:29:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:29:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:29:32 -- Epoch: 8/20; Train; loss: 0.195; acc: 0.948; precision: 0.918, recall: 0.984, macrof1: 0.948, weightedf1: 0.948[0m


[92m2020-09-10 18:29:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:29:40 -- Epoch: 8/20; Valid; loss: 0.922; acc: 0.597; precision: 0.585, recall: 0.664, macrof1: 0.595, weightedf1: 0.595[0m
[92m2020-09-10 18:29:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 7) at ./models/wikigaz_en_ft_ocr_gru_v002_n250/wikigaz_en_ft_ocr_gru_v002_n250.model[0m
[92m2020-09-10 18:29:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:29:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 8, selected epoch: 7[0m




User time: 68.0159
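
Unlike the `no_early_stopping` runs above, this run stopped at epoch 8 of 20 and kept the epoch 7 checkpoint. The sketch below shows one simple stopping rule that reproduces this behaviour from the logged validation losses: stop as soon as the loss fails to improve for `patience` consecutive epochs. The function `early_stop` and the value `patience=1` are assumptions inferred from the log; the criterion DeezyMatch actually applies is configured in `input_dfm_gru_model_B.yaml`, which is not shown here.

```python
def early_stop(valid_losses, patience=1):
    """Return (stop_epoch, selected_epoch), both 1-indexed.

    stop_epoch is None if the stopping rule never triggers.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            # New best validation loss: remember it and reset the counter
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Validation losses for epochs 1..8 of the n250 run, copied from the log above
losses_n250 = [1.395, 1.187, 1.060, 0.989, 0.952, 0.930, 0.921, 0.922]
print(early_stop(losses_n250))  # prints: (8, 7)
```

With only 250 training examples the validation loss is still above 0.9 when training stops, while training loss has already dropped to 0.195, which is consistent with the model overfitting the small fine-tuning set.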


In [46]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 500

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:29:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml[0m
[92m2020-09-10 18:29:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:29:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:29:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:29:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:29:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.048142194747924805[0m
[92m2020-09-10 18:29:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58721
val             25380
train             500
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 18:29:41

[92m2020-09-10 18:29:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:29:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:29:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 18:29:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:29:43[




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:29:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:29:43 -- Epoch: 1/20; Train; loss: 1.492; acc: 0.472; precision: 0.468, recall: 0.408, mac

[92m2020-09-10 18:29:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:29:52 -- Epoch: 1/20; Valid; loss: 1.160; acc: 0.513; precision: 0.514, recall: 0.489, macrof1: 0.513, weightedf1: 0.513[0m
[92m2020-09-10 18:29:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:29:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:29:52 -- Epoch: 2/20; Train; loss: 0.790; acc: 0.618; precision: 0.628, recall: 0.580, macrof1: 0.617, weightedf1: 0.617[0m


[92m2020-09-10 18:30:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:30:00 -- Epoch: 2/20; Valid; loss: 0.900; acc: 0.563; precision: 0.555, recall: 0.635, macrof1: 0.560, weightedf1: 0.560[0m
[92m2020-09-10 18:30:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:30:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:30:01 -- Epoch: 3/20; Train; loss: 0.555; acc: 0.726; precision: 0.716, recall: 0.748, macrof1: 0.726, weightedf1: 0.726[0m


[92m2020-09-10 18:30:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:30:09 -- Epoch: 3/20; Valid; loss: 0.794; acc: 0.597; precision: 0.581, recall: 0.695, macrof1: 0.593, weightedf1: 0.593[0m
[92m2020-09-10 18:30:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:30:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:30:09 -- Epoch: 4/20; Train; loss: 0.448; acc: 0.806; precision: 0.782, recall: 0.848, macrof1: 0.806, weightedf1: 0.806[0m


[92m2020-09-10 18:30:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:30:18 -- Epoch: 4/20; Valid; loss: 0.746; acc: 0.622; precision: 0.604, recall: 0.706, macrof1: 0.619, weightedf1: 0.619[0m
[92m2020-09-10 18:30:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:30:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:30:18 -- Epoch: 5/20; Train; loss: 0.371; acc: 0.850; precision: 0.849, recall: 0.852, macrof1: 0.850, weightedf1: 0.850[0m


[92m2020-09-10 18:30:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:30:26 -- Epoch: 5/20; Valid; loss: 0.728; acc: 0.641; precision: 0.626, recall: 0.699, macrof1: 0.639, weightedf1: 0.639[0m
[92m2020-09-10 18:30:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:30:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:30:27 -- Epoch: 6/20; Train; loss: 0.303; acc: 0.902; precision: 0.900, recall: 0.904, macrof1: 0.902, weightedf1: 0.902[0m


[92m2020-09-10 18:30:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:30:35 -- Epoch: 6/20; Valid; loss: 0.735; acc: 0.658; precision: 0.641, recall: 0.717, macrof1: 0.656, weightedf1: 0.656[0m
[92m2020-09-10 18:30:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_gru_v002_n500/wikigaz_en_ft_ocr_gru_v002_n500.model[0m
[92m2020-09-10 18:30:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:30:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 52.2374
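
The per-epoch summaries in these logs follow a fixed pattern, so they can be pulled into a table for comparing runs. The following is a minimal sketch, not part of DeezyMatch: the helper name `parse_metrics` and the regular expression are assumptions made for illustration, and the sample lines are copied from the run above with timestamps and colour codes omitted.

```python
import re

# Matches the metric part of an epoch summary line; timestamps and
# colour codes before or after it are ignored by using search().
METRIC_RE = re.compile(
    r"Epoch: (?P<epoch>\d+)/(?P<total>\d+); (?P<phase>Train|Valid); "
    r"loss: (?P<loss>[\d.]+); acc: (?P<acc>[\d.]+)"
)


def parse_metrics(lines):
    """Return one dict per line that contains an epoch summary (hypothetical helper)."""
    rows = []
    for line in lines:
        match = METRIC_RE.search(line)
        if match:
            rows.append({
                "epoch": int(match["epoch"]),
                "phase": match["phase"],
                "loss": float(match["loss"]),
                "acc": float(match["acc"]),
            })
    return rows


# Two summary lines from the run above (epoch 5, train and validation).
sample = [
    "Epoch: 5/20; Train; loss: 0.371; acc: 0.850; precision: 0.849, recall: 0.852, macrof1: 0.850, weightedf1: 0.850",
    "Epoch: 5/20; Valid; loss: 0.728; acc: 0.641; precision: 0.626, recall: 0.699, macrof1: 0.639, weightedf1: 0.639",
]
rows = parse_metrics(sample)
valid_rows = [r for r in rows if r["phase"] == "Valid"]
print(valid_rows)
```

Filtering on `phase == "Valid"` gives the validation curve that the early-stopping messages refer to.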


In [47]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 1000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:30:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml[0m
[92m2020-09-10 18:30:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:30:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:30:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:30:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:30:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05417013168334961[0m
[92m2020-09-10 18:30:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58221
val             25380
train            1000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 18:30:36[

                                                    

[92m2020-09-10 18:30:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:30:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:30:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 18:30:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:30:38[




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
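
The per-layer parameter counts in the summary above can be checked by hand. The sketch below recomputes them from the printed layer shapes, assuming the standard PyTorch layout for `Embedding`, `GRU` (three gates, two bias vectors per direction) and `Linear`; the helper names are illustrative and not part of DeezyMatch.

```python
def embedding_params(vocab_size, dim):
    return vocab_size * dim


def gru_params(input_size, hidden_size, num_layers, bidirectional):
    """Parameter count for a GRU: three gates, input and hidden biases per direction."""
    directions = 2 if bidirectional else 1
    total = 0
    layer_input = input_size
    for _ in range(num_layers):
        per_direction = 3 * hidden_size * (layer_input + hidden_size) + 2 * 3 * hidden_size
        total += directions * per_direction
        # Deeper layers receive the concatenated outputs of both directions.
        layer_input = hidden_size * directions
    return total


def linear_params(in_features, out_features):
    return in_features * out_features + out_features


counts = {
    "emb": embedding_params(7542, 60),
    "rnn_1": gru_params(60, 60, num_layers=2, bidirectional=True),
    "attn_step1": linear_params(120, 60),
    "attn_step2": linear_params(60, 1),
    "fc1": linear_params(960, 120),
    "fc2": linear_params(120, 2),
}
total = sum(counts.values())
# Model B freezes only the embedding layer (emb.weight False in the listing above).
trainable_model_b = total - counts["emb"]
print(counts, total, trainable_model_b)
```

The total matches the logged 684843 parameters. With the embedding frozen, 232323 of them remain trainable in model B.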


[92m2020-09-10 18:30:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:30:39 -- Epoch: 1/20; Train; loss: 1.237; acc: 0.504; precision: 0.504, recall: 0.482, mac

[92m2020-09-10 18:30:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:30:47 -- Epoch: 1/20; Valid; loss: 0.864; acc: 0.561; precision: 0.559, recall: 0.579, macrof1: 0.561, weightedf1: 0.561[0m
[92m2020-09-10 18:30:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:30:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:30:48 -- Epoch: 2/20; Train; loss: 0.626; acc: 0.660; precision: 0.652, recall: 0.688, macrof1: 0.660, weightedf1: 0.660[0m


[92m2020-09-10 18:30:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:30:57 -- Epoch: 2/20; Valid; loss: 0.690; acc: 0.624; precision: 0.609, recall: 0.694, macrof1: 0.622, weightedf1: 0.622[0m
[92m2020-09-10 18:30:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:30:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:30:57 -- Epoch: 3/20; Train; loss: 0.491; acc: 0.765; precision: 0.738, recall: 0.822, macrof1: 0.764, weightedf1: 0.764[0m


[92m2020-09-10 18:31:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:31:06 -- Epoch: 3/20; Valid; loss: 0.643; acc: 0.658; precision: 0.642, recall: 0.714, macrof1: 0.657, weightedf1: 0.657[0m
[92m2020-09-10 18:31:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:31:07[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:31:07 -- Epoch: 4/20; Train; loss: 0.398; acc: 0.830; precision: 0.829, recall: 0.832, macrof1: 0.830, weightedf1: 0.830[0m


[92m2020-09-10 18:31:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:31:19 -- Epoch: 4/20; Valid; loss: 0.621; acc: 0.690; precision: 0.688, recall: 0.696, macrof1: 0.690, weightedf1: 0.690[0m
[92m2020-09-10 18:31:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:31:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:31:20 -- Epoch: 5/20; Train; loss: 0.315; acc: 0.881; precision: 0.883, recall: 0.878, macrof1: 0.881, weightedf1: 0.881[0m


[92m2020-09-10 18:31:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:31:31 -- Epoch: 5/20; Valid; loss: 0.607; acc: 0.716; precision: 0.705, recall: 0.741, macrof1: 0.715, weightedf1: 0.715[0m
[92m2020-09-10 18:31:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:31:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:31:32 -- Epoch: 6/20; Train; loss: 0.234; acc: 0.925; precision: 0.928, recall: 0.922, macrof1: 0.925, weightedf1: 0.925[0m


[92m2020-09-10 18:31:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:31:43 -- Epoch: 6/20; Valid; loss: 0.598; acc: 0.738; precision: 0.726, recall: 0.764, macrof1: 0.738, weightedf1: 0.738[0m
[92m2020-09-10 18:31:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:31:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:31:45 -- Epoch: 7/20; Train; loss: 0.161; acc: 0.964; precision: 0.958, recall: 0.970, macrof1: 0.964, weightedf1: 0.964[0m


[92m2020-09-10 18:31:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:31:54 -- Epoch: 7/20; Valid; loss: 0.598; acc: 0.748; precision: 0.750, recall: 0.742, macrof1: 0.748, weightedf1: 0.748[0m
[92m2020-09-10 18:31:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 6) at ./models/wikigaz_en_ft_ocr_gru_v002_n1000/wikigaz_en_ft_ocr_gru_v002_n1000.model[0m
[92m2020-09-10 18:31:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:31:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 7, selected epoch: 6[0m




User time: 75.6827
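
The fine-tuning cells in this section differ only in `n_ft_examples`, so the sweep could also be driven from a single loop. The sketch below only builds the keyword arguments, mirroring the cells in this notebook; `finetune_kwargs` is a hypothetical helper, and the actual call is left commented out because it needs DeezyMatch, the pretrained model and the dataset on disk.

```python
def finetune_kwargs(n_ft_examples):
    """Arguments used by the model B fine-tuning cells for a given training-set size."""
    return {
        "input_file_path": "./inputs/input_dfm_gru_model_B.yaml",
        "dataset_path": "./dataset/ocr_trainval.txt",
        "model_name": f"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}",
        "pretrained_model_path": "./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model",
        "pretrained_vocab_path": "./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
        "n_train_examples": n_ft_examples,
    }


# Training-set sizes used by the cells in this section.
sizes = [500, 1000, 2000, 4000, 8000]
runs = [finetune_kwargs(n) for n in sizes]

for kwargs in runs:
    print(kwargs["model_name"])
    # To launch a run, uncomment the two lines below:
    # from DeezyMatch import finetune as dm_finetune
    # dm_finetune(**kwargs)
```

Keeping one cell per size, as this notebook does, has the advantage that each run's log stays next to the command that produced it.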


In [48]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 2000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:31:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml[0m
[92m2020-09-10 18:31:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:31:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:31:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:31:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:31:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05487489700317383[0m
[92m2020-09-10 18:31:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    57221
val             25380
train            2000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 18:31:55[

[92m2020-09-10 18:31:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:31:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:31:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 18:31:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:31:57[




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:31:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:31:59 -- Epoch: 1/20; Train; loss: 1.002; acc: 0.548; precision: 0.548, recall: 0.551, mac

[92m2020-09-10 18:32:07[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:32:07 -- Epoch: 1/20; Valid; loss: 0.675; acc: 0.628; precision: 0.603, recall: 0.755, macrof1: 0.622, weightedf1: 0.622[0m
[92m2020-09-10 18:32:07[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:32:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:32:09 -- Epoch: 2/20; Train; loss: 0.556; acc: 0.708; precision: 0.686, recall: 0.765, macrof1: 0.707, weightedf1: 0.707[0m


[92m2020-09-10 18:32:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:32:17 -- Epoch: 2/20; Valid; loss: 0.584; acc: 0.695; precision: 0.679, recall: 0.743, macrof1: 0.695, weightedf1: 0.695[0m
[92m2020-09-10 18:32:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:32:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:32:19 -- Epoch: 3/20; Train; loss: 0.437; acc: 0.805; precision: 0.798, recall: 0.816, macrof1: 0.805, weightedf1: 0.805[0m


[92m2020-09-10 18:32:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:32:27 -- Epoch: 3/20; Valid; loss: 0.540; acc: 0.744; precision: 0.743, recall: 0.746, macrof1: 0.744, weightedf1: 0.744[0m
[92m2020-09-10 18:32:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:32:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:32:29 -- Epoch: 4/20; Train; loss: 0.323; acc: 0.871; precision: 0.875, recall: 0.864, macrof1: 0.870, weightedf1: 0.870[0m


[92m2020-09-10 18:32:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:32:37 -- Epoch: 4/20; Valid; loss: 0.506; acc: 0.777; precision: 0.779, recall: 0.773, macrof1: 0.777, weightedf1: 0.777[0m
[92m2020-09-10 18:32:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:32:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:32:39 -- Epoch: 5/20; Train; loss: 0.221; acc: 0.922; precision: 0.931, recall: 0.911, macrof1: 0.922, weightedf1: 0.922[0m


[92m2020-09-10 18:32:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:32:47 -- Epoch: 5/20; Valid; loss: 0.491; acc: 0.796; precision: 0.794, recall: 0.800, macrof1: 0.796, weightedf1: 0.796[0m
[92m2020-09-10 18:32:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:32:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:32:49 -- Epoch: 6/20; Train; loss: 0.140; acc: 0.962; precision: 0.967, recall: 0.957, macrof1: 0.962, weightedf1: 0.962[0m


[92m2020-09-10 18:32:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:32:57 -- Epoch: 6/20; Valid; loss: 0.503; acc: 0.809; precision: 0.801, recall: 0.822, macrof1: 0.809, weightedf1: 0.809[0m
[92m2020-09-10 18:32:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_gru_v002_n2000/wikigaz_en_ft_ocr_gru_v002_n2000.model[0m
[92m2020-09-10 18:32:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:32:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 60.4684
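
The early-stopping messages in these logs are consistent with a simple rule: stop as soon as the validation loss fails to improve, and keep the checkpoint with the lowest loss. The sketch below illustrates that rule with `patience=1`. This is an assumption inferred from the logs, not DeezyMatch's implementation; the actual criterion is set in the input YAML file, which is not shown here.

```python
def select_checkpoint(valid_losses, patience=1):
    """Return (stop_epoch, selected_epoch), both 1-indexed.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs; the epoch with the lowest loss is kept.
    """
    best_epoch, best_loss, bad_epochs = 0, float("inf"), 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_epoch, best_loss, bad_epochs = epoch, loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    # No early stop: all epochs were run.
    return len(valid_losses), best_epoch


# Validation losses logged for the n=2000 run above, epochs 1 to 6.
losses_n2000 = [0.675, 0.584, 0.540, 0.506, 0.491, 0.503]
print(select_checkpoint(losses_n2000))  # (6, 5)
```

This reproduces the logged "Early stopping at epoch: 6, selected epoch: 5". The logged losses are rounded to three decimals, so runs where two consecutive epochs print the same value cannot be replayed exactly from the log alone.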


In [49]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 4000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:32:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml[0m
[92m2020-09-10 18:32:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:32:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:32:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:32:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:32:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05280113220214844[0m
[92m2020-09-10 18:32:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    55221
val             25380
train            4000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 18:32:58[

[92m2020-09-10 18:33:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:33:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:33:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 18:33:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:33:01[




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:33:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:33:04 -- Epoch: 1/20; Train; loss: 0.804; acc: 0.617; precision: 0.610, recall: 0.649, mac

[92m2020-09-10 18:33:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:33:12 -- Epoch: 1/20; Valid; loss: 0.582; acc: 0.692; precision: 0.670, recall: 0.757, macrof1: 0.691, weightedf1: 0.691[0m
[92m2020-09-10 18:33:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:33:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:33:16 -- Epoch: 2/20; Train; loss: 0.472; acc: 0.770; precision: 0.757, recall: 0.794, macrof1: 0.770, weightedf1: 0.770[0m


[92m2020-09-10 18:33:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:33:24 -- Epoch: 2/20; Valid; loss: 0.484; acc: 0.775; precision: 0.812, recall: 0.717, macrof1: 0.775, weightedf1: 0.775[0m
[92m2020-09-10 18:33:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:33:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:33:28 -- Epoch: 3/20; Train; loss: 0.325; acc: 0.868; precision: 0.876, recall: 0.858, macrof1: 0.868, weightedf1: 0.868[0m


[92m2020-09-10 18:33:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:33:38 -- Epoch: 3/20; Valid; loss: 0.413; acc: 0.824; precision: 0.819, recall: 0.831, macrof1: 0.824, weightedf1: 0.824[0m
[92m2020-09-10 18:33:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:33:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:33:44 -- Epoch: 4/20; Train; loss: 0.208; acc: 0.930; precision: 0.928, recall: 0.931, macrof1: 0.930, weightedf1: 0.930[0m


[92m2020-09-10 18:33:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:33:57 -- Epoch: 4/20; Valid; loss: 0.391; acc: 0.840; precision: 0.852, recall: 0.822, macrof1: 0.840, weightedf1: 0.840[0m
[92m2020-09-10 18:33:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:34:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:34:03 -- Epoch: 5/20; Train; loss: 0.125; acc: 0.966; precision: 0.965, recall: 0.967, macrof1: 0.966, weightedf1: 0.966[0m


[92m2020-09-10 18:34:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:34:15 -- Epoch: 5/20; Valid; loss: 0.391; acc: 0.848; precision: 0.850, recall: 0.845, macrof1: 0.848, weightedf1: 0.848[0m
[92m2020-09-10 18:34:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:34:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:34:21 -- Epoch: 6/20; Train; loss: 0.071; acc: 0.987; precision: 0.987, recall: 0.987, macrof1: 0.987, weightedf1: 0.987[0m


[92m2020-09-10 18:34:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:34:33 -- Epoch: 6/20; Valid; loss: 0.410; acc: 0.853; precision: 0.837, recall: 0.878, macrof1: 0.853, weightedf1: 0.853[0m
[92m2020-09-10 18:34:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_gru_v002_n4000/wikigaz_en_ft_ocr_gru_v002_n4000.model[0m
[92m2020-09-10 18:34:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:34:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 92.7731
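
The split tables printed at the start of each run can be cross-checked against the label counts. Every run reads the same 84603 rows (42301 True, 42302 False) and reserves the same validation and test rows, so `n_train_examples` only moves rows between `train` and `not_assigned`. The sketch below redoes that arithmetic; `expected_splits` is a hypothetical helper, not a DeezyMatch function.

```python
def expected_splits(n_true, n_false, n_val, n_test, n_train_examples):
    """Rows left unassigned once val, test and the requested training rows are taken out."""
    total = n_true + n_false
    not_assigned = total - n_val - n_test - n_train_examples
    if not_assigned < 0:
        raise ValueError("requested more training examples than are available")
    return {
        "not_assigned": not_assigned,
        "val": n_val,
        "train": n_train_examples,
        "test": n_test,
    }


# Label counts and val/test sizes as printed in the logs above.
for n in (1000, 2000, 4000):
    print(n, expected_splits(42301, 42302, 25380, 2, n))
```

For 1000, 2000 and 4000 training examples this gives 58221, 57221 and 55221 unassigned rows, matching the logged tables.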


In [50]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 8000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:34:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml[0m
[92m2020-09-10 18:34:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:34:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:34:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:34:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:34:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.0547635555267334[0m
[92m2020-09-10 18:34:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    51221
val             25380
train            8000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 18:34:34[0

[92m2020-09-10 18:34:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:34:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:34:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 18:34:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:34:37[




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:34:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:34:48 -- Epoch: 1/20; Train; loss: 0.664; acc: 0.671; precision: 0.660, recall: 0.705, mac

[92m2020-09-10 18:35:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:35:00 -- Epoch: 1/20; Valid; loss: 0.459; acc: 0.789; precision: 0.807, recall: 0.758, macrof1: 0.788, weightedf1: 0.788[0m
[92m2020-09-10 18:35:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:35:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:35:11 -- Epoch: 2/20; Train; loss: 0.351; acc: 0.849; precision: 0.849, recall: 0.849, macrof1: 0.849, weightedf1: 0.849[0m


[92m2020-09-10 18:35:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:35:23 -- Epoch: 2/20; Valid; loss: 0.339; acc: 0.861; precision: 0.859, recall: 0.862, macrof1: 0.861, weightedf1: 0.861[0m
[92m2020-09-10 18:35:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:35:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:35:35 -- Epoch: 3/20; Train; loss: 0.215; acc: 0.920; precision: 0.918, recall: 0.922, macrof1: 0.920, weightedf1: 0.920[0m


[92m2020-09-10 18:35:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:35:47 -- Epoch: 3/20; Valid; loss: 0.306; acc: 0.878; precision: 0.885, recall: 0.868, macrof1: 0.878, weightedf1: 0.878[0m
[92m2020-09-10 18:35:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:35:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:35:58 -- Epoch: 4/20; Train; loss: 0.129; acc: 0.960; precision: 0.958, recall: 0.961, macrof1: 0.959, weightedf1: 0.959[0m


[92m2020-09-10 18:36:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:36:11 -- Epoch: 4/20; Valid; loss: 0.299; acc: 0.886; precision: 0.882, recall: 0.891, macrof1: 0.886, weightedf1: 0.886[0m
[92m2020-09-10 18:36:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:36:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:36:23 -- Epoch: 5/20; Train; loss: 0.070; acc: 0.982; precision: 0.980, recall: 0.985, macrof1: 0.982, weightedf1: 0.982[0m


[92m2020-09-10 18:36:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:36:34 -- Epoch: 5/20; Valid; loss: 0.319; acc: 0.890; precision: 0.882, recall: 0.902, macrof1: 0.890, weightedf1: 0.890[0m
[92m2020-09-10 18:36:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_gru_v002_n8000/wikigaz_en_ft_ocr_gru_v002_n8000.model[0m
[92m2020-09-10 18:36:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:36:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 5, selected epoch: 4[0m




User time: 117.6755
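
The run above stops at epoch 5 and keeps the checkpoint from epoch 4, the epoch with the lowest validation loss. The following is a minimal sketch of that selection logic, assuming a patience of one epoch. That value is inferred from the log; the actual setting lives in the input YAML file, which is not shown here. The helper `early_stop` is hypothetical and is not part of DeezyMatch.

```python
def early_stop(valid_losses, patience=1):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once the validation loss has failed to improve
    on the best value seen so far for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    bad_epochs = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    # No early stop was triggered: report the last epoch and the best one.
    return len(valid_losses), best_epoch


# Validation losses copied from the log above (epochs 1 to 5).
losses = [0.459, 0.339, 0.306, 0.299, 0.319]
stop_epoch, best_epoch = early_stop(losses)
print(stop_epoch, best_epoch)  # prints: 5 4
```

The result matches the log line `Early stopping at epoch: 5, selected epoch: 4`.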


In [51]:
from DeezyMatch import finetune as dm_finetune

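# number of OCR training instances used for fine-tuning
n_ft_examples = 16000

# fine-tune the pretrained model stored at pretrained_model_path and pretrained_vocab_path
# (model B: only the embedding layer is frozen)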
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:36:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml[0m
[92m2020-09-10 18:36:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:36:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:36:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:36:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:36:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05351972579956055[0m
[92m2020-09-10 18:36:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    43221
val             25380
train           16000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 18:36:35[

[92m2020-09-10 18:36:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:36:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:36:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 18:36:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:36:38[




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:37:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:37:00 -- Epoch: 1/20; Train; loss: 0.526; acc: 0.755; precision: 0.747, recall: 0.774, mac

[92m2020-09-10 18:37:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:37:12 -- Epoch: 1/20; Valid; loss: 0.340; acc: 0.862; precision: 0.890, recall: 0.825, macrof1: 0.861, weightedf1: 0.861[0m
[92m2020-09-10 18:37:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:37:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:37:35 -- Epoch: 2/20; Train; loss: 0.244; acc: 0.904; precision: 0.898, recall: 0.910, macrof1: 0.904, weightedf1: 0.904[0m


[92m2020-09-10 18:37:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:37:48 -- Epoch: 2/20; Valid; loss: 0.256; acc: 0.899; precision: 0.897, recall: 0.902, macrof1: 0.899, weightedf1: 0.899[0m
[92m2020-09-10 18:37:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:38:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:38:10 -- Epoch: 3/20; Train; loss: 0.145; acc: 0.949; precision: 0.945, recall: 0.953, macrof1: 0.949, weightedf1: 0.949[0m


[92m2020-09-10 18:38:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:38:23 -- Epoch: 3/20; Valid; loss: 0.235; acc: 0.911; precision: 0.903, recall: 0.921, macrof1: 0.911, weightedf1: 0.911[0m
[92m2020-09-10 18:38:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:38:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:38:45 -- Epoch: 4/20; Train; loss: 0.084; acc: 0.975; precision: 0.971, recall: 0.978, macrof1: 0.975, weightedf1: 0.975[0m


[92m2020-09-10 18:38:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:38:57 -- Epoch: 4/20; Valid; loss: 0.242; acc: 0.914; precision: 0.916, recall: 0.912, macrof1: 0.914, weightedf1: 0.914[0m
[92m2020-09-10 18:38:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_gru_v002_n16000/wikigaz_en_ft_ocr_gru_v002_n16000.model[0m
[92m2020-09-10 18:38:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:38:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 4, selected epoch: 3[0m




User time: 139.5836
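
The parameter counts in the model summary above can be reproduced by hand. The sketch below uses the standard weight shapes of PyTorch's `Embedding`, `GRU` and `Linear` layers, with the layer sizes read off the summary. The helper names are hypothetical, and reading the `False` next to `emb.weight` as "not trainable" is an assumption based on the description of model B.

```python
def gru_params(input_size, hidden_size, num_layers, bidirectional):
    """Parameter count of a GRU with PyTorch's weight layout (3 gates)."""
    directions = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        # Layers after the first receive the concatenated outputs of all directions.
        in_size = input_size if layer == 0 else hidden_size * directions
        per_direction = (
            3 * hidden_size * in_size        # weight_ih
            + 3 * hidden_size * hidden_size  # weight_hh
            + 2 * 3 * hidden_size            # bias_ih and bias_hh
        )
        total += directions * per_direction
    return total


def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


counts = {
    "emb": 7542 * 60,
    "rnn_1": gru_params(60, 60, num_layers=2, bidirectional=True),
    "attn_step1": linear_params(120, 60),
    "attn_step2": linear_params(60, 1),
    "fc1": linear_params(960, 120),
    "fc2": linear_params(120, 2),
}
total = sum(counts.values())
# Assumption: the frozen embedding is the only layer excluded from training.
trainable = total - counts["emb"]
print(counts["rnn_1"], total, trainable)  # prints: 109440 684843 232323
```

Under that assumption, roughly two thirds of the 684,843 parameters sit in the frozen embedding layer, so fine-tuning model B updates about 232 thousand parameters.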


In [52]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 32000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:38:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_B.yaml[0m
[92m2020-09-10 18:38:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:38:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:38:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:38:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:38:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.06592154502868652[0m
[92m2020-09-10 18:38:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train           32000
not_assigned    27221
val             25380
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 18:38:58[

[92m2020-09-10 18:39:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:39:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:39:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 18:39:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:39:01[




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:39:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:39:45 -- Epoch: 1/20; Train; loss: 0.405; acc: 0.819; precision: 0.812, recall: 0.830, mac

[92m2020-09-10 18:39:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:39:58 -- Epoch: 1/20; Valid; loss: 0.259; acc: 0.895; precision: 0.864, recall: 0.937, macrof1: 0.895, weightedf1: 0.895[0m
[92m2020-09-10 18:39:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:40:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:40:43 -- Epoch: 2/20; Train; loss: 0.179; acc: 0.933; precision: 0.927, recall: 0.940, macrof1: 0.933, weightedf1: 0.933[0m


[92m2020-09-10 18:40:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:40:56 -- Epoch: 2/20; Valid; loss: 0.203; acc: 0.922; precision: 0.896, recall: 0.956, macrof1: 0.922, weightedf1: 0.922[0m
[92m2020-09-10 18:40:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:41:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:41:28 -- Epoch: 3/20; Train; loss: 0.105; acc: 0.964; precision: 0.958, recall: 0.970, macrof1: 0.964, weightedf1: 0.964[0m


[92m2020-09-10 18:41:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:41:36 -- Epoch: 3/20; Valid; loss: 0.190; acc: 0.932; precision: 0.931, recall: 0.933, macrof1: 0.932, weightedf1: 0.932[0m
[92m2020-09-10 18:41:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:42:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:42:05 -- Epoch: 4/20; Train; loss: 0.063; acc: 0.980; precision: 0.977, recall: 0.984, macrof1: 0.980, weightedf1: 0.980[0m


[92m2020-09-10 18:42:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:42:13 -- Epoch: 4/20; Valid; loss: 0.192; acc: 0.937; precision: 0.934, recall: 0.940, macrof1: 0.937, weightedf1: 0.937[0m
[92m2020-09-10 18:42:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_gru_v002_n32000/wikigaz_en_ft_ocr_gru_v002_n32000.model[0m
[92m2020-09-10 18:42:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 18:42:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 4, selected epoch: 3[0m




User time: 192.2439
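
The split summaries of the two runs above account for every labelled pair in `ocr_trainval.txt`. A quick consistency check, with all numbers copied from the logs, shows that raising `n_train_examples` from 16,000 to 32,000 only moves rows out of the `not_assigned` pool, while the validation split stays at 25,380 pairs. The variable names are illustrative.

```python
total_pairs = 42301 + 42302  # True + False labels reported in the log

# Split sizes copied from the two split summaries above.
splits = {
    16000: {"train": 16000, "not_assigned": 43221, "val": 25380, "test": 2},
    32000: {"train": 32000, "not_assigned": 27221, "val": 25380, "test": 2},
}

train_pools = []
for n_train, sizes in splits.items():
    # Every pair lands in exactly one split.
    assert sum(sizes.values()) == total_pairs
    # Rows available for training, whether or not this run used them.
    train_pools.append(sizes["train"] + sizes["not_assigned"])

print(total_pairs, train_pools)  # prints: 84603 [59221, 59221]
```

Because the validation split is identical in both runs, their validation metrics can be compared directly.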


In [53]:
from DeezyMatch import finetune as dm_finetune

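# number of OCR training instances used for fine-tuning
n_ft_examples = 64000

# fine-tune the pretrained model stored at pretrained_model_path and pretrained_vocab_path;
# this run uses the no-early-stopping variant of the model B config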
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_B_no_early_stopping.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:42:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_B_no_early_stopping.yaml[0m
[92m2020-09-10 18:42:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:42:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:42:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:42:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:42:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.0485692024230957[0m
[92m2020-09-10 18:42:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train    64000
val      20603
Name: split, dtype: int64[0m
[92m2020-09-10 18:42:14[0m [95mlwm-embeddings[0m [1m[90m[INFO

[92m2020-09-10 18:42:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:42:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:42:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 18:42:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:42:17[




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:43:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:43:14 -- Epoch: 1/10; Train; loss: 0.306; acc: 0.870; precision: 0.862, recall: 0.882, mac

[92m2020-09-10 18:43:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:43:20 -- Epoch: 1/10; Valid; loss: 0.183; acc: 0.930; precision: 0.932, recall: 0.927, macrof1: 0.930, weightedf1: 0.930[0m
[92m2020-09-10 18:43:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:44:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:44:17 -- Epoch: 2/10; Train; loss: 0.134; acc: 0.951; precision: 0.944, recall: 0.959, macrof1: 0.951, weightedf1: 0.951[0m


[92m2020-09-10 18:44:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:44:24 -- Epoch: 2/10; Valid; loss: 0.142; acc: 0.949; precision: 0.938, recall: 0.961, macrof1: 0.949, weightedf1: 0.949[0m
[92m2020-09-10 18:44:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:45:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:45:20 -- Epoch: 3/10; Train; loss: 0.086; acc: 0.971; precision: 0.966, recall: 0.976, macrof1: 0.971, weightedf1: 0.971[0m


[92m2020-09-10 18:45:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:45:28 -- Epoch: 3/10; Valid; loss: 0.143; acc: 0.949; precision: 0.961, recall: 0.935, macrof1: 0.949, weightedf1: 0.949[0m
[92m2020-09-10 18:45:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:46:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:46:58 -- Epoch: 4/10; Train; loss: 0.059; acc: 0.981; precision: 0.978, recall: 0.984, macrof1: 0.981, weightedf1: 0.981[0m


[92m2020-09-10 18:47:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:47:09 -- Epoch: 4/10; Valid; loss: 0.141; acc: 0.952; precision: 0.952, recall: 0.952, macrof1: 0.952, weightedf1: 0.952[0m
[92m2020-09-10 18:47:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:48:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:48:44 -- Epoch: 5/10; Train; loss: 0.042; acc: 0.987; precision: 0.985, recall: 0.988, macrof1: 0.987, weightedf1: 0.987[0m


[92m2020-09-10 18:48:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:48:55 -- Epoch: 5/10; Valid; loss: 0.162; acc: 0.951; precision: 0.935, recall: 0.969, macrof1: 0.951, weightedf1: 0.951[0m
[92m2020-09-10 18:48:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:50:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:50:30 -- Epoch: 6/10; Train; loss: 0.032; acc: 0.990; precision: 0.988, recall: 0.992, macrof1: 0.990, weightedf1: 0.990[0m


[92m2020-09-10 18:50:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:50:42 -- Epoch: 6/10; Valid; loss: 0.162; acc: 0.953; precision: 0.955, recall: 0.950, macrof1: 0.953, weightedf1: 0.953[0m
[92m2020-09-10 18:50:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:52:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:52:19 -- Epoch: 7/10; Train; loss: 0.028; acc: 0.991; precision: 0.990, recall: 0.992, macrof1: 0.991, weightedf1: 0.991[0m


[92m2020-09-10 18:52:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:52:28 -- Epoch: 7/10; Valid; loss: 0.175; acc: 0.953; precision: 0.945, recall: 0.961, macrof1: 0.953, weightedf1: 0.953[0m
[92m2020-09-10 18:52:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:54:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:54:05 -- Epoch: 8/10; Train; loss: 0.027; acc: 0.992; precision: 0.991, recall: 0.993, macrof1: 0.992, weightedf1: 0.992[0m


[92m2020-09-10 18:54:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:54:17 -- Epoch: 8/10; Valid; loss: 0.174; acc: 0.953; precision: 0.956, recall: 0.950, macrof1: 0.953, weightedf1: 0.953[0m
[92m2020-09-10 18:54:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:55:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:55:51 -- Epoch: 9/10; Train; loss: 0.021; acc: 0.993; precision: 0.992, recall: 0.994, macrof1: 0.993, weightedf1: 0.993[0m


[92m2020-09-10 18:56:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:56:03 -- Epoch: 9/10; Valid; loss: 0.182; acc: 0.954; precision: 0.942, recall: 0.967, macrof1: 0.954, weightedf1: 0.954[0m
[92m2020-09-10 18:56:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 18:57:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:57:26 -- Epoch: 10/10; Train; loss: 0.022; acc: 0.993; precision: 0.992, recall: 0.994, macrof1: 0.993, weightedf1: 0.993[0m


[92m2020-09-10 18:57:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:57:36 -- Epoch: 10/10; Valid; loss: 0.199; acc: 0.952; precision: 0.938, recall: 0.969, macrof1: 0.952, weightedf1: 0.952[0m
[92m2020-09-10 18:57:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m

[92m2020-09-10 18:57:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_gru_v002_n64000/wikigaz_en_ft_ocr_gru_v002_n64000.model[0m



User time: 919.6673
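
With early stopping disabled, this run trains for all 10 epochs and then keeps the checkpoint with the lowest validation loss. A minimal sketch of that selection, using the validation losses copied from the log above (the variable names are illustrative):

```python
# Validation losses for epochs 1 to 10, copied from the log above.
valid_losses = [0.183, 0.142, 0.143, 0.141, 0.162, 0.162, 0.175, 0.174, 0.182, 0.199]

# Index of the smallest loss, converted to a 1-indexed epoch number.
best_epoch = min(range(len(valid_losses)), key=valid_losses.__getitem__) + 1
print(best_epoch)  # prints: 4
```

This agrees with the `checkpoint: 4` reported in the log. After epoch 4 the validation loss rises from 0.141 to 0.199 while the training loss keeps falling to 0.022, which is the usual sign of overfitting.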


In [54]:
from DeezyMatch import finetune as dm_finetune

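# number of OCR training instances used for fine-tuning;
# with 84,000 of the 84,603 pairs used for training, only 603 remain for validation
n_ft_examples = 84000

# fine-tune the pretrained model stored at pretrained_model_path and pretrained_vocab_path;
# this run uses the no-early-stopping variant of the model B config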
dm_finetune(input_file_path="./inputs/input_dfm_gru_model_B_no_early_stopping.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_gru_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_gru_001/wikigaz_en_gru_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 18:57:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_gru_model_B_no_early_stopping.yaml[0m
[92m2020-09-10 18:57:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 18:57:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 18:57:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 18:57:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 18:57:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05806612968444824[0m
[92m2020-09-10 18:57:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train    84000
val        603
Name: split, dtype: int64[0m
[92m2020-09-10 18:57:37[0m [95mlwm-embeddings[0m [1m[90m[INF

[92m2020-09-10 18:57:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 18:57:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:57:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) GRU ****[0m
[92m2020-09-10 18:57:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 18:57:40[




Total number of params: 684843

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): GRU(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((180, 60), (180, 60), (180,), (180,), (180, 60), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,), (180, 120), (180, 60), (180,), (180,)), parameters=109440
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 18:59:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_18:59:42 -- Epoch: 1/10; Train; loss: 0.272; acc: 0.887; precision: 0.880, recall: 0.896, mac

[92m2020-09-10 18:59:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_18:59:42 -- Epoch: 1/10; Valid; loss: 0.144; acc: 0.947; precision: 0.932, recall: 0.963, macrof1: 0.947, weightedf1: 0.947[0m
[92m2020-09-10 18:59:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:01:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:01:44 -- Epoch: 2/10; Train; loss: 0.121; acc: 0.957; precision: 0.950, recall: 0.965, macrof1: 0.957, weightedf1: 0.957[0m


[92m2020-09-10 19:01:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:01:44 -- Epoch: 2/10; Valid; loss: 0.098; acc: 0.960; precision: 0.957, recall: 0.963, macrof1: 0.960, weightedf1: 0.960[0m
[92m2020-09-10 19:01:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:03:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:03:43 -- Epoch: 3/10; Train; loss: 0.081; acc: 0.972; precision: 0.968, recall: 0.977, macrof1: 0.972, weightedf1: 0.972[0m


[92m2020-09-10 19:03:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:03:44 -- Epoch: 3/10; Valid; loss: 0.106; acc: 0.959; precision: 0.939, recall: 0.980, macrof1: 0.959, weightedf1: 0.959[0m
[92m2020-09-10 19:03:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 19:05:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:05:45 -- Epoch: 4/10; Train; loss: 0.058; acc: 0.981; precision: 0.978, recall: 0.985, macrof1: 0.981, weightedf1: 0.981[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 19:05:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:05:46 -- Epoch: 4/10; Valid; loss: 0.104; acc: 0.960; precision: 0.945, recall: 0.977, macrof1: 0.960, weightedf1: 0.960[0m
[92m2020-09-10 19:05:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 19:07:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:07:16 -- Epoch: 5/10; Train; loss: 0.042; acc: 0.987; precision: 0.984, recall: 0.989, macrof1: 0.987, weightedf1: 0.987[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 19:07:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:07:16 -- Epoch: 5/10; Valid; loss: 0.097; acc: 0.972; precision: 0.958, recall: 0.987, macrof1: 0.972, weightedf1: 0.972[0m
[92m2020-09-10 19:07:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 19:08:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:08:31 -- Epoch: 6/10; Train; loss: 0.036; acc: 0.989; precision: 0.987, recall: 0.990, macrof1: 0.989, weightedf1: 0.989[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 19:08:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:08:31 -- Epoch: 6/10; Valid; loss: 0.101; acc: 0.967; precision: 0.946, recall: 0.990, macrof1: 0.967, weightedf1: 0.967[0m
[92m2020-09-10 19:08:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 19:09:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:09:47 -- Epoch: 7/10; Train; loss: 0.031; acc: 0.990; precision: 0.989, recall: 0.991, macrof1: 0.990, weightedf1: 0.990[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 19:09:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:09:47 -- Epoch: 7/10; Valid; loss: 0.094; acc: 0.964; precision: 0.957, recall: 0.970, macrof1: 0.964, weightedf1: 0.964[0m
[92m2020-09-10 19:09:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 19:11:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:11:08 -- Epoch: 8/10; Train; loss: 0.026; acc: 0.992; precision: 0.991, recall: 0.993, macrof1: 0.992, weightedf1: 0.992[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 19:11:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:11:09 -- Epoch: 8/10; Valid; loss: 0.105; acc: 0.967; precision: 0.958, recall: 0.977, macrof1: 0.967, weightedf1: 0.967[0m
[92m2020-09-10 19:11:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 19:12:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:12:31 -- Epoch: 9/10; Train; loss: 0.024; acc: 0.993; precision: 0.991, recall: 0.994, macrof1: 0.992, weightedf1: 0.992[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 19:12:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:12:31 -- Epoch: 9/10; Valid; loss: 0.093; acc: 0.973; precision: 0.961, recall: 0.987, macrof1: 0.973, weightedf1: 0.973[0m
[92m2020-09-10 19:12:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 19:13:46[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:13:46 -- Epoch: 10/10; Train; loss: 0.024; acc: 0.993; precision: 0.991, recall: 0.994, macrof1: 0.993, weightedf1: 0.993[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 19:13:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:13:47 -- Epoch: 10/10; Valid; loss: 0.108; acc: 0.970; precision: 0.955, recall: 0.987, macrof1: 0.970, weightedf1: 0.970[0m
[92m2020-09-10 19:13:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m

[92m2020-09-10 19:13:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model with least valid loss (checkpoint: 9) at ./models/wikigaz_en_ft_ocr_gru_v002_n84000/wikigaz_en_ft_ocr_gru_v002_n84000.model[0m



User time: 966.3750
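
The final message above reports that checkpoint 9 has the lowest validation loss. One way to recover that from the log text is sketched below: strip the ANSI colour codes, extract the `Valid` entries with a regular expression, and take the minimum. The regular expressions and the helper `parse_valid_losses` are illustrative assumptions about the log layout shown here and are not part of DeezyMatch. The sample lines are abridged copies of the last four validation entries.

```python
import re

# Matches colour codes such as "[1;31m" or "[0m", with or without the ESC byte.
ANSI = re.compile(r"\x1b?\[[0-9;]*m")
# Captures the epoch number, the epoch total and the validation loss.
VALID = re.compile(r"Epoch: (\d+)/(\d+); Valid; loss: ([0-9.]+)")


def parse_valid_losses(log_text):
    """Return {epoch: validation loss} for every 'Valid' entry in the log."""
    clean = ANSI.sub("", log_text)
    return {int(m.group(1)): float(m.group(3)) for m in VALID.finditer(clean)}


# Abridged validation lines from the run above (epochs 7 to 10).
sample_log = """
[1;31m09/10/2020_19:09:47 -- Epoch: 7/10; Valid; loss: 0.094; acc: 0.964[0m
[1;31m09/10/2020_19:11:09 -- Epoch: 8/10; Valid; loss: 0.105; acc: 0.967[0m
[1;31m09/10/2020_19:12:31 -- Epoch: 9/10; Valid; loss: 0.093; acc: 0.973[0m
[1;31m09/10/2020_19:13:47 -- Epoch: 10/10; Valid; loss: 0.108; acc: 0.970[0m
"""

losses = parse_valid_losses(sample_log)
best_epoch = min(losses, key=losses.get)
print(best_epoch, losses[best_epoch])
```

On these four entries the minimum falls on epoch 9, in agreement with the checkpoint named in the log.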


## Fine-Tune, model B, LSTM

In [55]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training pairs used for fine-tuning
n_ft_examples = 250

# Fine-tune the pretrained wikigaz_en_lstm_001 model (model B: only the embedding layer is frozen).
# The pretrained weights and vocabulary are read from pretrained_model_path and pretrained_vocab_path.
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_B.yaml",
            dataset_path="./dataset/ocr_trainval.txt",
            model_name=f"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model",
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:13:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml[0m
[92m2020-09-10 19:13:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:13:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:13:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:13:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:13:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05909252166748047[0m
[92m2020-09-10 19:13:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58971
val             25380
train             250
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:13:47

[92m2020-09-10 19:13:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:13:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:13:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 19:13:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:13:50

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
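
The parameter counts in the summary above can be reproduced by hand. The sketch below recomputes them from the layer shapes printed in the log, assuming the standard PyTorch LSTM layout (four gates per direction, with separate input and hidden bias vectors). The helpers `linear_params` and `lstm_params` are illustrative and not part of DeezyMatch.

```python
def linear_params(in_features, out_features):
    """Weights plus bias of a fully connected layer."""
    return in_features * out_features + out_features


def lstm_params(input_size, hidden_size, num_layers, bidirectional):
    """Parameter count of an LSTM with four gates and two bias vectors per direction."""
    num_directions = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        # Layers after the first receive the concatenated outputs of all directions.
        layer_input = input_size if layer == 0 else hidden_size * num_directions
        weights = 4 * hidden_size * (layer_input + hidden_size)
        biases = 2 * 4 * hidden_size
        total += num_directions * (weights + biases)
    return total


# Shapes taken from the model summary printed above.
counts = {
    "emb": 7542 * 60,
    "rnn_1": lstm_params(60, 60, num_layers=2, bidirectional=True),
    "attn_step1": linear_params(120, 60),
    "attn_step2": linear_params(60, 1),
    "fc1": linear_params(960, 120),
    "fc2": linear_params(120, 2),
}

total_params = sum(counts.values())
# Model B freezes only the embedding layer (emb.weight False in the listing).
trainable_params = total_params - counts["emb"]
print(counts)
print(total_params, trainable_params)
```

The per-layer values match the summary (for example 145920 for `rnn_1`), and the total is 721323. With `emb.weight` frozen, 268803 parameters remain trainable in model B.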


[92m2020-09-10 19:13:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:13:50 -- Epoch: 1/20; Train; loss: 1.577; acc: 0.448; precision: 0.446, recall: 0.432, ma

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:13:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:13:59 -- Epoch: 1/20; Valid; loss: 1.609; acc: 0.480; precision: 0.482, recall: 0.519, macrof1: 0.479, weightedf1: 0.479[0m
[92m2020-09-10 19:13:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 19:13:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:13:59 -- Epoch: 2/20; Train; loss: 0.977; acc: 0.580; precision: 0.588, recall: 0.536, macrof1: 0.579, weightedf1: 0.579[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:14:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:14:08 -- Epoch: 2/20; Valid; loss: 1.437; acc: 0.505; precision: 0.505, recall: 0.541, macrof1: 0.505, weightedf1: 0.505[0m
[92m2020-09-10 19:14:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 19:14:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:14:08 -- Epoch: 3/20; Train; loss: 0.644; acc: 0.724; precision: 0.733, recall: 0.704, macrof1: 0.724, weightedf1: 0.724[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:14:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:14:17 -- Epoch: 3/20; Valid; loss: 1.318; acc: 0.525; precision: 0.523, recall: 0.583, macrof1: 0.524, weightedf1: 0.524[0m
[92m2020-09-10 19:14:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 19:14:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:14:17 -- Epoch: 4/20; Train; loss: 0.438; acc: 0.816; precision: 0.826, recall: 0.800, macrof1: 0.816, weightedf1: 0.816[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:14:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:14:26 -- Epoch: 4/20; Valid; loss: 1.233; acc: 0.542; precision: 0.536, recall: 0.620, macrof1: 0.539, weightedf1: 0.539[0m
[92m2020-09-10 19:14:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 19:14:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:14:27 -- Epoch: 5/20; Train; loss: 0.319; acc: 0.876; precision: 0.856, recall: 0.904, macrof1: 0.876, weightedf1: 0.876[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:14:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:14:36 -- Epoch: 5/20; Valid; loss: 1.177; acc: 0.554; precision: 0.545, recall: 0.650, macrof1: 0.550, weightedf1: 0.550[0m
[92m2020-09-10 19:14:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 19:14:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:14:36 -- Epoch: 6/20; Train; loss: 0.237; acc: 0.912; precision: 0.887, recall: 0.944, macrof1: 0.912, weightedf1: 0.912[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:14:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:14:45 -- Epoch: 6/20; Valid; loss: 1.139; acc: 0.566; precision: 0.554, recall: 0.674, macrof1: 0.561, weightedf1: 0.561[0m
[92m2020-09-10 19:14:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 19:14:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:14:45 -- Epoch: 7/20; Train; loss: 0.179; acc: 0.940; precision: 0.923, recall: 0.960, macrof1: 0.940, weightedf1: 0.940[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:14:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:14:54 -- Epoch: 7/20; Valid; loss: 1.113; acc: 0.574; precision: 0.561, recall: 0.683, macrof1: 0.569, weightedf1: 0.569[0m
[92m2020-09-10 19:14:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 19:14:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:14:54 -- Epoch: 8/20; Train; loss: 0.145; acc: 0.972; precision: 0.961, recall: 0.984, macrof1: 0.972, weightedf1: 0.972[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:15:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:15:03 -- Epoch: 8/20; Valid; loss: 1.093; acc: 0.580; precision: 0.566, recall: 0.683, macrof1: 0.576, weightedf1: 0.576[0m
[92m2020-09-10 19:15:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 19:15:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:15:03 -- Epoch: 9/20; Train; loss: 0.115; acc: 0.988; precision: 0.984, recall: 0.992, macrof1: 0.988, weightedf1: 0.988[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:15:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:15:12 -- Epoch: 9/20; Valid; loss: 1.080; acc: 0.587; precision: 0.573, recall: 0.680, macrof1: 0.583, weightedf1: 0.583[0m
[92m2020-09-10 19:15:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 19:15:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:15:13 -- Epoch: 10/20; Train; loss: 0.095; acc: 0.992; precision: 0.992, recall: 0.992, macrof1: 0.992, weightedf1: 0.992[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:15:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:15:22 -- Epoch: 10/20; Valid; loss: 1.073; acc: 0.592; precision: 0.579, recall: 0.675, macrof1: 0.589, weightedf1: 0.589[0m
[92m2020-09-10 19:15:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 19:15:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:15:22 -- Epoch: 11/20; Train; loss: 0.078; acc: 0.996; precision: 0.992, recall: 1.000, macrof1: 0.996, weightedf1: 0.996[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:15:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:15:31 -- Epoch: 11/20; Valid; loss: 1.073; acc: 0.596; precision: 0.583, recall: 0.676, macrof1: 0.593, weightedf1: 0.593[0m
[92m2020-09-10 19:15:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=4.0), HTML(value='')))

[92m2020-09-10 19:15:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:15:31 -- Epoch: 12/20; Train; loss: 0.066; acc: 0.996; precision: 0.992, recall: 1.000, macrof1: 0.996, weightedf1: 0.996[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:15:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:15:40 -- Epoch: 12/20; Valid; loss: 1.076; acc: 0.599; precision: 0.585, recall: 0.678, macrof1: 0.596, weightedf1: 0.596[0m
[92m2020-09-10 19:15:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 11) at ./models/wikigaz_en_ft_ocr_lstm_v002_n250/wikigaz_en_ft_ocr_lstm_v002_n250.model[0m
[92m2020-09-10 19:15:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:15:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 12, selected epoch: 11[0m




User time: 110.5851
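
The progress-bar totals in the output above (`max=4.0` for the training pass and `max=397.0` for the validation pass) are consistent with a batch size of 64 in which the final partial batch is kept. The batch size is not printed in this log, so 64 is an assumption inferred from those totals, and `n_batches` is a hypothetical helper.

```python
import math


def n_batches(n_examples, batch_size=64):
    """Number of batches when the last, possibly partial, batch is kept."""
    return math.ceil(n_examples / batch_size)


# Fine-tuning set sizes used in this section.
for n in (250, 500, 1000):
    print(n, n_batches(n))

# Validation split size reported in the log.
print(25380, n_batches(25380))
```

Under this assumption, 250 training pairs give 4 batches and the 25380 validation pairs give 397. The same rule gives 8 and 16 batches for the 500- and 1000-example runs.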


In [56]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training pairs used for fine-tuning
n_ft_examples = 500

# Fine-tune the pretrained wikigaz_en_lstm_001 model (model B: only the embedding layer is frozen).
# The pretrained weights and vocabulary are read from pretrained_model_path and pretrained_vocab_path.
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_B.yaml",
            dataset_path="./dataset/ocr_trainval.txt",
            model_name=f"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model",
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:15:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml[0m
[92m2020-09-10 19:15:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:15:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:15:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:15:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:15:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.056529998779296875[0m
[92m2020-09-10 19:15:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58721
val             25380
train             500
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:15:41

[92m2020-09-10 19:15:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:15:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:15:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 19:15:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:15:43

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:15:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:15:44 -- Epoch: 1/20; Train; loss: 1.497; acc: 0.456; precision: 0.454, recall: 0.436, ma

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:15:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:15:53 -- Epoch: 1/20; Valid; loss: 1.416; acc: 0.508; precision: 0.507, recall: 0.543, macrof1: 0.507, weightedf1: 0.507[0m
[92m2020-09-10 19:15:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 19:15:53[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:15:53 -- Epoch: 2/20; Train; loss: 0.832; acc: 0.638; precision: 0.636, recall: 0.644, macrof1: 0.638, weightedf1: 0.638[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:16:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:16:02 -- Epoch: 2/20; Valid; loss: 1.148; acc: 0.553; precision: 0.547, recall: 0.621, macrof1: 0.551, weightedf1: 0.551[0m
[92m2020-09-10 19:16:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 19:16:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:16:03 -- Epoch: 3/20; Train; loss: 0.523; acc: 0.774; precision: 0.762, recall: 0.796, macrof1: 0.774, weightedf1: 0.774[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:16:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:16:11 -- Epoch: 3/20; Valid; loss: 1.008; acc: 0.585; precision: 0.573, recall: 0.666, macrof1: 0.582, weightedf1: 0.582[0m
[92m2020-09-10 19:16:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 19:16:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:16:12 -- Epoch: 4/20; Train; loss: 0.390; acc: 0.844; precision: 0.826, recall: 0.872, macrof1: 0.844, weightedf1: 0.844[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:16:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:16:21 -- Epoch: 4/20; Valid; loss: 0.939; acc: 0.605; precision: 0.589, recall: 0.693, macrof1: 0.602, weightedf1: 0.602[0m
[92m2020-09-10 19:16:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 19:16:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:16:21 -- Epoch: 5/20; Train; loss: 0.301; acc: 0.892; precision: 0.871, recall: 0.920, macrof1: 0.892, weightedf1: 0.892[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:16:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:16:30 -- Epoch: 5/20; Valid; loss: 0.895; acc: 0.618; precision: 0.604, recall: 0.687, macrof1: 0.617, weightedf1: 0.617[0m
[92m2020-09-10 19:16:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 19:16:31[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:16:31 -- Epoch: 6/20; Train; loss: 0.231; acc: 0.934; precision: 0.922, recall: 0.948, macrof1: 0.934, weightedf1: 0.934[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:16:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:16:40 -- Epoch: 6/20; Valid; loss: 0.877; acc: 0.632; precision: 0.618, recall: 0.690, macrof1: 0.631, weightedf1: 0.631[0m
[92m2020-09-10 19:16:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 19:16:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:16:40 -- Epoch: 7/20; Train; loss: 0.178; acc: 0.968; precision: 0.957, recall: 0.980, macrof1: 0.968, weightedf1: 0.968[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:16:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:16:49 -- Epoch: 7/20; Valid; loss: 0.875; acc: 0.642; precision: 0.627, recall: 0.704, macrof1: 0.641, weightedf1: 0.641[0m
[92m2020-09-10 19:16:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 19:16:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:16:49 -- Epoch: 8/20; Train; loss: 0.142; acc: 0.976; precision: 0.969, recall: 0.984, macrof1: 0.976, weightedf1: 0.976[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:16:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:16:58 -- Epoch: 8/20; Valid; loss: 0.874; acc: 0.652; precision: 0.638, recall: 0.704, macrof1: 0.651, weightedf1: 0.651[0m
[92m2020-09-10 19:16:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=8.0), HTML(value='')))

[92m2020-09-10 19:16:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:16:59 -- Epoch: 9/20; Train; loss: 0.112; acc: 0.992; precision: 0.992, recall: 0.992, macrof1: 0.992, weightedf1: 0.992[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:17:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:17:08 -- Epoch: 9/20; Valid; loss: 0.878; acc: 0.659; precision: 0.643, recall: 0.715, macrof1: 0.658, weightedf1: 0.658[0m
[92m2020-09-10 19:17:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 8) at ./models/wikigaz_en_ft_ocr_lstm_v002_n500/wikigaz_en_ft_ocr_lstm_v002_n500.model[0m
[92m2020-09-10 19:17:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:17:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 9, selected epoch: 8[0m




User time: 84.4243
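
The log above stops at epoch 9 and keeps the checkpoint from epoch 8. A minimal sketch of a stopping rule that reproduces this on the logged validation losses follows. It assumes training halts at the first epoch whose validation loss fails to improve on the best value so far. The actual criterion is set in the input YAML file, which is not shown here, and `early_stop` is a hypothetical helper rather than a DeezyMatch function.

```python
def early_stop(valid_losses, patience=1):
    """Return (epoch at which training stops, epoch with the lowest loss).

    Training stops once `patience` consecutive epochs fail to improve on
    the best validation loss seen so far. Epochs are numbered from 1.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # No early stop: training ran for every epoch supplied.
    return len(valid_losses), best_epoch


# Validation losses logged for the 500-example run above (epochs 1 to 9).
valid_losses_n500 = [1.416, 1.148, 1.008, 0.939, 0.895, 0.877, 0.875, 0.874, 0.878]
stopped_at, selected = early_stop(valid_losses_n500)
print(stopped_at, selected)
```

The logged losses are rounded to three decimals, so ties in the rounded values cannot be resolved from the log alone. In the 250-example run, epochs 10 and 11 both show 1.073, and the log selects epoch 11.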


In [57]:
from DeezyMatch import finetune as dm_finetune

# number of OCR training pairs used for fine-tuning
n_ft_examples = 1000

# Fine-tune the pretrained wikigaz_en_lstm_001 model (model B: only the embedding layer is frozen).
# The pretrained weights and vocabulary are read from pretrained_model_path and pretrained_vocab_path.
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_B.yaml",
            dataset_path="./dataset/ocr_trainval.txt",
            model_name=f"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model",
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:17:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml[0m
[92m2020-09-10 19:17:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:17:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:17:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:17:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:17:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.060021400451660156[0m
[92m2020-09-10 19:17:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58221
val             25380
train            1000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:17:09

[92m2020-09-10 19:17:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:17:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:17:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 19:17:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:17:11

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 19:17:12 lwm-embeddings [INFO] 09/10/2020_19:17:12 -- Epoch: 1/20; Train; loss: 1.376; acc: 0.505; precision: 0.505, recall: 0.486, ma

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:17:21 lwm-embeddings [INFO] 09/10/2020_19:17:21 -- Epoch: 1/20; Valid; loss: 1.094; acc: 0.557; precision: 0.554, recall: 0.584, macrof1: 0.556, weightedf1: 0.556
2020-09-10 19:17:21 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

2020-09-10 19:17:22 lwm-embeddings [INFO] 09/10/2020_19:17:22 -- Epoch: 2/20; Train; loss: 0.672; acc: 0.710; precision: 0.692, recall: 0.756, macrof1: 0.709, weightedf1: 0.709


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:17:31 lwm-embeddings [INFO] 09/10/2020_19:17:31 -- Epoch: 2/20; Valid; loss: 0.833; acc: 0.620; precision: 0.606, recall: 0.686, macrof1: 0.619, weightedf1: 0.619
2020-09-10 19:17:31 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

2020-09-10 19:17:32 lwm-embeddings [INFO] 09/10/2020_19:17:32 -- Epoch: 3/20; Train; loss: 0.455; acc: 0.802; precision: 0.797, recall: 0.810, macrof1: 0.802, weightedf1: 0.802


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:17:41 lwm-embeddings [INFO] 09/10/2020_19:17:41 -- Epoch: 3/20; Valid; loss: 0.735; acc: 0.657; precision: 0.651, recall: 0.677, macrof1: 0.657, weightedf1: 0.657
2020-09-10 19:17:41 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

2020-09-10 19:17:41 lwm-embeddings [INFO] 09/10/2020_19:17:41 -- Epoch: 4/20; Train; loss: 0.338; acc: 0.875; precision: 0.857, recall: 0.900, macrof1: 0.875, weightedf1: 0.875


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:17:50 lwm-embeddings [INFO] 09/10/2020_19:17:50 -- Epoch: 4/20; Valid; loss: 0.696; acc: 0.681; precision: 0.670, recall: 0.716, macrof1: 0.681, weightedf1: 0.681
2020-09-10 19:17:50 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

2020-09-10 19:17:51 lwm-embeddings [INFO] 09/10/2020_19:17:51 -- Epoch: 5/20; Train; loss: 0.262; acc: 0.916; precision: 0.906, recall: 0.928, macrof1: 0.916, weightedf1: 0.916


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:18:00 lwm-embeddings [INFO] 09/10/2020_19:18:00 -- Epoch: 5/20; Valid; loss: 0.677; acc: 0.702; precision: 0.698, recall: 0.714, macrof1: 0.702, weightedf1: 0.702
2020-09-10 19:18:00 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

2020-09-10 19:18:01 lwm-embeddings [INFO] 09/10/2020_19:18:01 -- Epoch: 6/20; Train; loss: 0.205; acc: 0.947; precision: 0.944, recall: 0.950, macrof1: 0.947, weightedf1: 0.947


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:18:09 lwm-embeddings [INFO] 09/10/2020_19:18:09 -- Epoch: 6/20; Valid; loss: 0.666; acc: 0.717; precision: 0.710, recall: 0.734, macrof1: 0.717, weightedf1: 0.717
2020-09-10 19:18:09 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

2020-09-10 19:18:10 lwm-embeddings [INFO] 09/10/2020_19:18:10 -- Epoch: 7/20; Train; loss: 0.159; acc: 0.962; precision: 0.955, recall: 0.970, macrof1: 0.962, weightedf1: 0.962


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:18:19 lwm-embeddings [INFO] 09/10/2020_19:18:19 -- Epoch: 7/20; Valid; loss: 0.657; acc: 0.727; precision: 0.721, recall: 0.742, macrof1: 0.727, weightedf1: 0.727
2020-09-10 19:18:19 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

2020-09-10 19:18:20 lwm-embeddings [INFO] 09/10/2020_19:18:20 -- Epoch: 8/20; Train; loss: 0.123; acc: 0.982; precision: 0.976, recall: 0.988, macrof1: 0.982, weightedf1: 0.982


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:18:28 lwm-embeddings [INFO] 09/10/2020_19:18:28 -- Epoch: 8/20; Valid; loss: 0.653; acc: 0.737; precision: 0.736, recall: 0.739, macrof1: 0.737, weightedf1: 0.737
2020-09-10 19:18:28 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=16.0), HTML(value='')))

2020-09-10 19:18:29 lwm-embeddings [INFO] 09/10/2020_19:18:29 -- Epoch: 9/20; Train; loss: 0.094; acc: 0.987; precision: 0.984, recall: 0.990, macrof1: 0.987, weightedf1: 0.987


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:18:38 lwm-embeddings [INFO] 09/10/2020_19:18:38 -- Epoch: 9/20; Valid; loss: 0.655; acc: 0.745; precision: 0.741, recall: 0.755, macrof1: 0.745, weightedf1: 0.745
2020-09-10 19:18:38 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 8) at ./models/wikigaz_en_ft_ocr_lstm_v002_n1000/wikigaz_en_ft_ocr_lstm_v002_n1000.model
2020-09-10 19:18:38 lwm-embeddings [INFO] saving the model
2020-09-10 19:18:38 lwm-embeddings [INFO] Early stopping at epoch: 9, selected epoch: 8




User time: 87.2856
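
The run above stops at epoch 9 and keeps the checkpoint from epoch 8, the epoch with the lowest validation loss. The sketch below reproduces that selection from the logged validation losses. It assumes a patience of one epoch, which is inferred from the log rather than read from the input YAML file, and `early_stop` is a hypothetical helper, not DeezyMatch's own routine.

```python
# Validation losses of the n=1000 fine-tuning run, epochs 1 to 9, copied from the log.
valid_losses = [1.094, 0.833, 0.735, 0.696, 0.677, 0.666, 0.657, 0.653, 0.655]

def early_stop(losses, patience=1):
    """Return (stop_epoch, selected_epoch), both 1-indexed.

    Training stops once `patience` epochs have passed without a new
    minimum of the validation loss; the selected epoch is that minimum.
    """
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(losses, start=1):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # No stop was triggered: report the last epoch and the best one seen.
    return len(losses), best_epoch

stop_epoch, selected_epoch = early_stop(valid_losses)
print(f"Early stopping at epoch: {stop_epoch}, selected epoch: {selected_epoch}")
```

With the assumed patience, the result matches the logged message (stop at epoch 9, select epoch 8).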


In [58]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 2000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 19:18:38 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_lstm_model_B.yaml
2020-09-10 19:18:38 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 19:18:38 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 19:18:39 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 19:18:39 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 19:18:39 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.06361913681030273
2020-09-10 19:18:39 lwm-embeddings [INFO] splits are as follow:
not_assigned    57221
val             25380
train            2000
test                2
Name: split, dtype: int64
2020-09-10 19:18:39

s1 padding:   0%|          | 0/25380 [00:00<?, ?it/s]

2020-09-10 19:18:41 lwm-embeddings [INFO] skipping 0 lines
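
The split table for this run adds up to the 84603 labelled pairs reported in the log. The check below also shows how a larger `n_train_examples` only moves rows from `not_assigned` to `train`, while `val` and `test` stay fixed. All numbers are copied from the logs in this notebook; the variable names are illustrative.

```python
# Label counts and split sizes copied from the log of the n=2000 run.
label_counts = {"True": 42301, "False": 42302}
splits = {"not_assigned": 57221, "val": 25380, "train": 2000, "test": 2}

total_pairs = sum(label_counts.values())
print("total pairs:", total_pairs)
print("sum of splits:", sum(splits.values()))

# not_assigned values logged for the fine-tuning runs in this notebook.
logged_not_assigned = {2000: 57221, 4000: 55221, 8000: 51221, 16000: 43221}

for n_train, logged in logged_not_assigned.items():
    # Whatever is not used for val, test, or train remains unassigned.
    expected = total_pairs - splits["val"] - splits["test"] - n_train
    print(f"n_train={n_train:>5}: expected not_assigned={expected}, logged={logged}")
```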


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True
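
In the listing above, only `emb.weight` is marked `False`, which corresponds to model B (frozen embedding layer, everything else trainable). As a rough illustration, the sketch below combines these flags with the per-layer parameter counts from the model summary printed in this notebook to estimate how many parameters are actually updated. The dictionary layout is an assumption made for the example.

```python
# Per-layer parameter counts from the model summary, paired with the
# trainable flag shown in the parameter listing (model B).
layers = {
    "emb":        {"parameters": 452520, "trainable": False},
    "rnn_1":      {"parameters": 145920, "trainable": True},
    "attn_step1": {"parameters": 7260,   "trainable": True},
    "attn_step2": {"parameters": 61,     "trainable": True},
    "fc1":        {"parameters": 115320, "trainable": True},
    "fc2":        {"parameters": 242,    "trainable": True},
}

n_total = sum(layer["parameters"] for layer in layers.values())
n_trainable = sum(
    layer["parameters"] for layer in layers.values() if layer["trainable"]
)
trainable_share = n_trainable / n_total

print(f"total parameters:     {n_total}")
print(f"trainable parameters: {n_trainable}")
print(f"trainable share:      {trainable_share:.1%}")
```

Freezing the embedding layer therefore leaves well under half of the 721323 parameters to be updated during fine-tuning.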



2020-09-10 19:18:41 lwm-embeddings [INFO] ******************************
2020-09-10 19:18:41 lwm-embeddings [INFO] **** (Bi-directional) LSTM ****
2020-09-10 19:18:41 lwm-embeddings [INFO] ******************************
2020-09-10 19:18:41

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))
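
The `max` values of the progress bars give the number of batches per epoch. The first bar (`max=20.0`) matches the 20 epochs, and the others match the batch counts of the training and validation sets. A short check, assuming a batch size of 64: this value is inferred from the bar maxima in this notebook and is not read from the input YAML file.

```python
import math

# Assumption: batch size inferred from the progress-bar maxima, not from the YAML file.
BATCH_SIZE = 64

def n_batches(n_examples, batch_size=BATCH_SIZE):
    """Number of batches per epoch when the last, partial batch is kept."""
    return math.ceil(n_examples / batch_size)

# Set size -> progress-bar maximum observed in this notebook
# (four training-set sizes plus the 25380 validation pairs).
observed = {1000: 16, 2000: 32, 4000: 63, 8000: 125, 25380: 397}

for n_examples, bar_max in observed.items():
    print(f"{n_examples:>6} examples -> {n_batches(n_examples)} batches (bar max: {bar_max})")
```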




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 19:18:43 lwm-embeddings [INFO] 09/10/2020_19:18:43 -- Epoch: 1/20; Train; loss: 1.142; acc: 0.568; precision: 0.565, recall: 0.593, ma

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:18:51 lwm-embeddings [INFO] 09/10/2020_19:18:51 -- Epoch: 1/20; Valid; loss: 0.808; acc: 0.632; precision: 0.623, recall: 0.668, macrof1: 0.632, weightedf1: 0.632
2020-09-10 19:18:51 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))

2020-09-10 19:18:53 lwm-embeddings [INFO] 09/10/2020_19:18:53 -- Epoch: 2/20; Train; loss: 0.533; acc: 0.754; precision: 0.735, recall: 0.794, macrof1: 0.754, weightedf1: 0.754


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:19:02 lwm-embeddings [INFO] 09/10/2020_19:19:02 -- Epoch: 2/20; Valid; loss: 0.639; acc: 0.698; precision: 0.695, recall: 0.705, macrof1: 0.698, weightedf1: 0.698
2020-09-10 19:19:02 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))

2020-09-10 19:19:04 lwm-embeddings [INFO] 09/10/2020_19:19:04 -- Epoch: 3/20; Train; loss: 0.387; acc: 0.843; precision: 0.833, recall: 0.858, macrof1: 0.843, weightedf1: 0.843


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:19:12 lwm-embeddings [INFO] 09/10/2020_19:19:12 -- Epoch: 3/20; Valid; loss: 0.576; acc: 0.741; precision: 0.743, recall: 0.736, macrof1: 0.741, weightedf1: 0.741
2020-09-10 19:19:12 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))

2020-09-10 19:19:14 lwm-embeddings [INFO] 09/10/2020_19:19:14 -- Epoch: 4/20; Train; loss: 0.284; acc: 0.903; precision: 0.901, recall: 0.906, macrof1: 0.903, weightedf1: 0.903


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:19:22 lwm-embeddings [INFO] 09/10/2020_19:19:22 -- Epoch: 4/20; Valid; loss: 0.539; acc: 0.770; precision: 0.764, recall: 0.781, macrof1: 0.770, weightedf1: 0.770
2020-09-10 19:19:22 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))

2020-09-10 19:19:24 lwm-embeddings [INFO] 09/10/2020_19:19:24 -- Epoch: 5/20; Train; loss: 0.216; acc: 0.935; precision: 0.932, recall: 0.938, macrof1: 0.935, weightedf1: 0.935


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:19:33 lwm-embeddings [INFO] 09/10/2020_19:19:33 -- Epoch: 5/20; Valid; loss: 0.514; acc: 0.787; precision: 0.780, recall: 0.799, macrof1: 0.787, weightedf1: 0.787
2020-09-10 19:19:33 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))

2020-09-10 19:19:34 lwm-embeddings [INFO] 09/10/2020_19:19:34 -- Epoch: 6/20; Train; loss: 0.156; acc: 0.964; precision: 0.962, recall: 0.966, macrof1: 0.964, weightedf1: 0.964


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:19:43 lwm-embeddings [INFO] 09/10/2020_19:19:43 -- Epoch: 6/20; Valid; loss: 0.501; acc: 0.800; precision: 0.801, recall: 0.797, macrof1: 0.800, weightedf1: 0.800
2020-09-10 19:19:43 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))

2020-09-10 19:19:44 lwm-embeddings [INFO] 09/10/2020_19:19:44 -- Epoch: 7/20; Train; loss: 0.111; acc: 0.977; precision: 0.977, recall: 0.976, macrof1: 0.976, weightedf1: 0.976


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:19:53 lwm-embeddings [INFO] 09/10/2020_19:19:53 -- Epoch: 7/20; Valid; loss: 0.496; acc: 0.810; precision: 0.807, recall: 0.815, macrof1: 0.810, weightedf1: 0.810
2020-09-10 19:19:53 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))

2020-09-10 19:19:55 lwm-embeddings [INFO] 09/10/2020_19:19:55 -- Epoch: 8/20; Train; loss: 0.081; acc: 0.989; precision: 0.986, recall: 0.992, macrof1: 0.989, weightedf1: 0.989


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:20:03 lwm-embeddings [INFO] 09/10/2020_19:20:03 -- Epoch: 8/20; Valid; loss: 0.497; acc: 0.815; precision: 0.815, recall: 0.816, macrof1: 0.815, weightedf1: 0.815
2020-09-10 19:20:03 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 7) at ./models/wikigaz_en_ft_ocr_lstm_v002_n2000/wikigaz_en_ft_ocr_lstm_v002_n2000.model
2020-09-10 19:20:03 lwm-embeddings [INFO] saving the model
2020-09-10 19:20:03 lwm-embeddings [INFO] Early stopping at epoch: 8, selected epoch: 7




User time: 82.0991
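
To compare runs, the per-epoch log lines can be turned into structured records. The sketch below parses one line of the format shown above with the standard `re` module. It first removes leftover color codes such as `[92m` or `[0m`, which appear in parts of this notebook's output. `parse_epoch_line` is a hypothetical helper written for this example and is not part of DeezyMatch.

```python
import re

# Leftover terminal color codes, with or without the leading escape character.
COLOR_CODE = re.compile(r"\x1b?\[\d+(?:;\d+)*m")
EPOCH_LINE = re.compile(
    r"Epoch: (?P<epoch>\d+)/(?P<n_epochs>\d+); (?P<phase>Train|Valid); (?P<metrics>.*)"
)
METRIC = re.compile(r"(\w+): (\d+\.\d+)")

def parse_epoch_line(line):
    """Return a dict with epoch, phase, and metrics, or None for other log lines."""
    match = EPOCH_LINE.search(COLOR_CODE.sub("", line))
    if match is None:
        return None
    record = {"epoch": int(match["epoch"]), "phase": match["phase"]}
    # Metrics are separated by ';' or ','; truncated trailing names are ignored.
    record.update({name: float(value) for name, value in METRIC.findall(match["metrics"])})
    return record

sample = (
    "[92m2020-09-10 19:20:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m "
    "[1;31m09/10/2020_19:20:03 -- Epoch: 8/20; Valid; loss: 0.497; acc: 0.815; "
    "precision: 0.815, recall: 0.816, macrof1: 0.815, weightedf1: 0.815[0m"
)
record = parse_epoch_line(sample)
print(record)
```

Lines that do not report epoch metrics, such as `saving the model`, yield `None` and can be skipped when collecting records.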


In [59]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 4000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 19:20:03 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_lstm_model_B.yaml
2020-09-10 19:20:03 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 19:20:03 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 19:20:04 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 19:20:04 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 19:20:04 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.04894113540649414
2020-09-10 19:20:04 lwm-embeddings [INFO] splits are as follow:
not_assigned    55221
val             25380
train            4000
test                2
Name: split, dtype: int64
2020-09-10 19:20:04

length s2:   0%|          | 0/25380 [00:00<?, ?it/s]

2020-09-10 19:20:06 lwm-embeddings [INFO] skipping 0 lines


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



2020-09-10 19:20:06 lwm-embeddings [INFO] ******************************
2020-09-10 19:20:06 lwm-embeddings [INFO] **** (Bi-directional) LSTM ****
2020-09-10 19:20:06 lwm-embeddings [INFO] ******************************
2020-09-10 19:20:06

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 19:20:09 lwm-embeddings [INFO] 09/10/2020_19:20:09 -- Epoch: 1/20; Train; loss: 0.899; acc: 0.624; precision: 0.617, recall: 0.657, ma

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:20:18 lwm-embeddings [INFO] 09/10/2020_19:20:18 -- Epoch: 1/20; Valid; loss: 0.619; acc: 0.701; precision: 0.707, recall: 0.686, macrof1: 0.700, weightedf1: 0.700
2020-09-10 19:20:18 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

2020-09-10 19:20:21 lwm-embeddings [INFO] 09/10/2020_19:20:21 -- Epoch: 2/20; Train; loss: 0.429; acc: 0.812; precision: 0.805, recall: 0.824, macrof1: 0.812, weightedf1: 0.812


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:20:30 lwm-embeddings [INFO] 09/10/2020_19:20:30 -- Epoch: 2/20; Valid; loss: 0.494; acc: 0.778; precision: 0.765, recall: 0.802, macrof1: 0.778, weightedf1: 0.778
2020-09-10 19:20:30 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

2020-09-10 19:20:33 lwm-embeddings [INFO] 09/10/2020_19:20:33 -- Epoch: 3/20; Train; loss: 0.294; acc: 0.885; precision: 0.881, recall: 0.892, macrof1: 0.885, weightedf1: 0.885


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:20:42 lwm-embeddings [INFO] 09/10/2020_19:20:42 -- Epoch: 3/20; Valid; loss: 0.430; acc: 0.816; precision: 0.817, recall: 0.815, macrof1: 0.816, weightedf1: 0.816
2020-09-10 19:20:42 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

2020-09-10 19:20:45 lwm-embeddings [INFO] 09/10/2020_19:20:45 -- Epoch: 4/20; Train; loss: 0.202; acc: 0.932; precision: 0.929, recall: 0.935, macrof1: 0.932, weightedf1: 0.932


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:20:53 lwm-embeddings [INFO] 09/10/2020_19:20:53 -- Epoch: 4/20; Valid; loss: 0.398; acc: 0.837; precision: 0.833, recall: 0.844, macrof1: 0.837, weightedf1: 0.837
2020-09-10 19:20:53 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

2020-09-10 19:20:57 lwm-embeddings [INFO] 09/10/2020_19:20:57 -- Epoch: 5/20; Train; loss: 0.139; acc: 0.964; precision: 0.960, recall: 0.968, macrof1: 0.963, weightedf1: 0.963


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:21:05 lwm-embeddings [INFO] 09/10/2020_19:21:05 -- Epoch: 5/20; Valid; loss: 0.385; acc: 0.848; precision: 0.852, recall: 0.844, macrof1: 0.848, weightedf1: 0.848
2020-09-10 19:21:05 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

2020-09-10 19:21:08 lwm-embeddings [INFO] 09/10/2020_19:21:08 -- Epoch: 6/20; Train; loss: 0.092; acc: 0.981; precision: 0.978, recall: 0.984, macrof1: 0.981, weightedf1: 0.981


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:21:17 lwm-embeddings [INFO] 09/10/2020_19:21:17 -- Epoch: 6/20; Valid; loss: 0.384; acc: 0.854; precision: 0.861, recall: 0.844, macrof1: 0.854, weightedf1: 0.854
2020-09-10 19:21:17 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

2020-09-10 19:21:20 lwm-embeddings [INFO] 09/10/2020_19:21:20 -- Epoch: 7/20; Train; loss: 0.059; acc: 0.993; precision: 0.992, recall: 0.993, macrof1: 0.992, weightedf1: 0.992


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:21:29 lwm-embeddings [INFO] 09/10/2020_19:21:29 -- Epoch: 7/20; Valid; loss: 0.392; acc: 0.858; precision: 0.868, recall: 0.845, macrof1: 0.858, weightedf1: 0.858
2020-09-10 19:21:29 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 6) at ./models/wikigaz_en_ft_ocr_lstm_v002_n4000/wikigaz_en_ft_ocr_lstm_v002_n4000.model
2020-09-10 19:21:29 lwm-embeddings [INFO] saving the model
2020-09-10 19:21:29 lwm-embeddings [INFO] Early stopping at epoch: 7, selected epoch: 6




User time: 82.6551
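
The fine-tuning cells in this part of the notebook differ only in `n_ft_examples`, so the same runs could be driven from a single loop. A minimal sketch, reusing the paths and keyword arguments from the cells: `finetune_kwargs` and `run_finetuning` are hypothetical helpers, and the example performs a dry run with a stand-in callable instead of launching training.

```python
def finetune_kwargs(n_ft_examples):
    """Keyword arguments of one fine-tuning call, as used in the notebook cells."""
    return {
        "input_file_path": "./inputs/input_dfm_lstm_model_B.yaml",
        "dataset_path": "./dataset/ocr_trainval.txt",
        "model_name": f"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}",
        "pretrained_model_path": "./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model",
        "pretrained_vocab_path": "./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
        "n_train_examples": n_ft_examples,
    }

def run_finetuning(sizes, finetune_fn):
    """Call finetune_fn once per training-set size; return the model names in order."""
    names = []
    for n_ft_examples in sizes:
        kwargs = finetune_kwargs(n_ft_examples)
        finetune_fn(**kwargs)
        names.append(kwargs["model_name"])
    return names

# Dry run with a stand-in callable. To train, pass the real function instead:
#   from DeezyMatch import finetune as dm_finetune
#   run_finetuning(sizes, dm_finetune)
sizes = (2000, 4000, 8000, 16000)
model_names = run_finetuning(sizes, lambda **kwargs: None)
print(model_names)
```

Keeping one cell per size, as the notebook does, has the advantage that each run's log stays attached to its own cell.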


In [60]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 8000

# fine-tune a pretrained model stored at pretrained_model_path and pretrained_vocab_path 
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 19:21:29 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_lstm_model_B.yaml
2020-09-10 19:21:29 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 19:21:29 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 19:21:29 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 19:21:29 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 19:21:29 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.04610013961791992
2020-09-10 19:21:29 lwm-embeddings [INFO] splits are as follow:
not_assigned    51221
val             25380
train            8000
test                2
Name: split, dtype: int64
2020-09-10 19:21:29

length s1:   0%|          | 0/25380 [00:00<?, ?it/s]

2020-09-10 19:21:31 lwm-embeddings [INFO] skipping 0 lines


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



2020-09-10 19:21:32 lwm-embeddings [INFO] ******************************
2020-09-10 19:21:32 lwm-embeddings [INFO] **** (Bi-directional) LSTM ****
2020-09-10 19:21:32 lwm-embeddings [INFO] ******************************
2020-09-10 19:21:32

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 19:21:38 lwm-embeddings [INFO] 09/10/2020_19:21:38 -- Epoch: 1/20; Train; loss: 0.715; acc: 0.688; precision: 0.678, recall: 0.715, ma

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:21:47 lwm-embeddings [INFO] 09/10/2020_19:21:47 -- Epoch: 1/20; Valid; loss: 0.453; acc: 0.801; precision: 0.818, recall: 0.775, macrof1: 0.801, weightedf1: 0.801
2020-09-10 19:21:47 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

2020-09-10 19:21:53 lwm-embeddings [INFO] 09/10/2020_19:21:53 -- Epoch: 2/20; Train; loss: 0.328; acc: 0.867; precision: 0.861, recall: 0.876, macrof1: 0.867, weightedf1: 0.867


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:22:02 lwm-embeddings [INFO] 09/10/2020_19:22:02 -- Epoch: 2/20; Valid; loss: 0.334; acc: 0.863; precision: 0.860, recall: 0.868, macrof1: 0.863, weightedf1: 0.863
2020-09-10 19:22:02 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

2020-09-10 19:22:08 lwm-embeddings [INFO] 09/10/2020_19:22:08 -- Epoch: 3/20; Train; loss: 0.211; acc: 0.925; precision: 0.915, recall: 0.936, macrof1: 0.925, weightedf1: 0.925


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:22:17 lwm-embeddings [INFO] 09/10/2020_19:22:17 -- Epoch: 3/20; Valid; loss: 0.286; acc: 0.889; precision: 0.884, recall: 0.895, macrof1: 0.889, weightedf1: 0.889
2020-09-10 19:22:17 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

2020-09-10 19:22:24 lwm-embeddings [INFO] 09/10/2020_19:22:24 -- Epoch: 4/20; Train; loss: 0.136; acc: 0.959; precision: 0.952, recall: 0.967, macrof1: 0.959, weightedf1: 0.959


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:22:32 lwm-embeddings [INFO] 09/10/2020_19:22:32 -- Epoch: 4/20; Valid; loss: 0.270; acc: 0.898; precision: 0.892, recall: 0.905, macrof1: 0.898, weightedf1: 0.898
2020-09-10 19:22:32 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

2020-09-10 19:22:39 lwm-embeddings [INFO] 09/10/2020_19:22:39 -- Epoch: 5/20; Train; loss: 0.084; acc: 0.979; precision: 0.975, recall: 0.983, macrof1: 0.979, weightedf1: 0.979


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:22:47 lwm-embeddings [INFO] 09/10/2020_19:22:47 -- Epoch: 5/20; Valid; loss: 0.267; acc: 0.902; precision: 0.887, recall: 0.922, macrof1: 0.902, weightedf1: 0.902
2020-09-10 19:22:47 lwm-embeddings [INFO] saving the model


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

2020-09-10 19:22:54 lwm-embeddings [INFO] 09/10/2020_19:22:54 -- Epoch: 6/20; Train; loss: 0.050; acc: 0.991; precision: 0.989, recall: 0.993, macrof1: 0.991, weightedf1: 0.991


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

2020-09-10 19:23:03 lwm-embeddings [INFO] 09/10/2020_19:23:03 -- Epoch: 6/20; Valid; loss: 0.274; acc: 0.905; precision: 0.904, recall: 0.906, macrof1: 0.905, weightedf1: 0.905
2020-09-10 19:23:03 lwm-embeddings [INFO] saving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_lstm_v002_n8000/wikigaz_en_ft_ocr_lstm_v002_n8000.model
2020-09-10 19:23:03 lwm-embeddings [INFO] saving the model
2020-09-10 19:23:03 lwm-embeddings [INFO] Early stopping at epoch: 6, selected epoch: 5




User time: 91.0168
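
The epoch lines in these logs follow a fixed pattern, so the metrics can be pulled out for plotting or comparison. The snippet below is a minimal sketch and not part of DeezyMatch: `parse_epoch_line`, `ANSI_RE` and `EPOCH_RE` are hypothetical names, and the regular expressions only assume the line layout visible in this output.

```python
import re

# Matches ANSI colour codes such as "[92m" or "[0m"; the leading ESC byte is
# optional because it may have been lost when the notebook was exported.
ANSI_RE = re.compile(r"\x1b?\[[0-9;]*m")

# Matches the metric part of an epoch line as it appears in this log.
EPOCH_RE = re.compile(
    r"Epoch: (?P<epoch>\d+)/(?P<total>\d+); (?P<split>Train|Valid); "
    r"loss: (?P<loss>[\d.]+); acc: (?P<acc>[\d.]+)"
)


def parse_epoch_line(line):
    """Return epoch metrics from one log line, or None if it is not an epoch line."""
    match = EPOCH_RE.search(ANSI_RE.sub("", line))
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "total_epochs": int(match["total"]),
        "split": match["split"],
        "loss": float(match["loss"]),
        "acc": float(match["acc"]),
    }


# Validation line for epoch 5 of the run above, copied from the log.
sample = (
    "[92m2020-09-10 19:22:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m "
    "[1;31m09/10/2020_19:22:47 -- Epoch: 5/20; Valid; loss: 0.267; acc: 0.902; "
    "precision: 0.887, recall: 0.922, macrof1: 0.902, weightedf1: 0.902[0m"
)
print(parse_epoch_line(sample))
```

Lines that are cut off before the `acc` field, like some of the first-epoch training lines in this output, simply return `None`.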


In [61]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 16000

# fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path)
# on n_ft_examples OCR training pairs; model B keeps only the embedding layer frozen
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:23:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml[0m
[92m2020-09-10 19:23:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:23:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:23:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:23:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:23:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05091238021850586[0m
[92m2020-09-10 19:23:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    43221
val             25380
train           16000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:23:03

[92m2020-09-10 19:23:05[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:23:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:23:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 19:23:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:23:06

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:23:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:23:19 -- Epoch: 1/20; Train; loss: 0.557; acc: 0.760; precision: 0.750, recall: 0.780, ma

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:23:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:23:28 -- Epoch: 1/20; Valid; loss: 0.324; acc: 0.866; precision: 0.858, recall: 0.877, macrof1: 0.866, weightedf1: 0.866[0m
[92m2020-09-10 19:23:28[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 19:23:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:23:41 -- Epoch: 2/20; Train; loss: 0.238; acc: 0.910; precision: 0.901, recall: 0.922, macrof1: 0.910, weightedf1: 0.910[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:23:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:23:49 -- Epoch: 2/20; Valid; loss: 0.242; acc: 0.905; precision: 0.892, recall: 0.922, macrof1: 0.905, weightedf1: 0.905[0m
[92m2020-09-10 19:23:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 19:24:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:24:02 -- Epoch: 3/20; Train; loss: 0.146; acc: 0.951; precision: 0.942, recall: 0.961, macrof1: 0.951, weightedf1: 0.951[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:24:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:24:11 -- Epoch: 3/20; Valid; loss: 0.217; acc: 0.919; precision: 0.908, recall: 0.932, macrof1: 0.919, weightedf1: 0.919[0m
[92m2020-09-10 19:24:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 19:24:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:24:24 -- Epoch: 4/20; Train; loss: 0.089; acc: 0.974; precision: 0.968, recall: 0.980, macrof1: 0.974, weightedf1: 0.974[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:24:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:24:32 -- Epoch: 4/20; Valid; loss: 0.216; acc: 0.923; precision: 0.921, recall: 0.926, macrof1: 0.923, weightedf1: 0.923[0m
[92m2020-09-10 19:24:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 19:24:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:24:45 -- Epoch: 5/20; Train; loss: 0.050; acc: 0.988; precision: 0.985, recall: 0.991, macrof1: 0.988, weightedf1: 0.988[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:24:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:24:54 -- Epoch: 5/20; Valid; loss: 0.221; acc: 0.926; precision: 0.921, recall: 0.931, macrof1: 0.926, weightedf1: 0.926[0m
[92m2020-09-10 19:24:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_lstm_v002_n16000/wikigaz_en_ft_ocr_lstm_v002_n16000.model[0m
[92m2020-09-10 19:24:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:24:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 5, selected epoch: 4[0m




User time: 108.1344
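
The log above reports "Early stopping at epoch: 5, selected epoch: 4". The sketch below reproduces that selection from the validation losses. It is an illustration, not DeezyMatch's code: `select_checkpoint` is a hypothetical helper, and the patience of 1 is an assumption inferred from the log, since the YAML input file is not shown here.

```python
def select_checkpoint(valid_losses, patience=1):
    """Return (selected_epoch, last_epoch_run), both 1-based.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs; the epoch with the least loss is selected.
    """
    best_epoch, best_loss, bad_epochs = 0, float("inf"), 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_epoch, best_loss, bad_epochs = epoch, loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return best_epoch, epoch
    return best_epoch, len(valid_losses)


# Validation losses copied from the n=16000 run above (epochs 1 to 5).
losses_n16000 = [0.324, 0.242, 0.217, 0.216, 0.221]
print(select_checkpoint(losses_n16000))  # (4, 5)
```

Passing a patience at least as large as the number of epochs makes the helper scan every epoch and return the one with the least validation loss.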


In [62]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 32000

# fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path)
# on n_ft_examples OCR training pairs; model B keeps only the embedding layer frozen
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:24:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_B.yaml[0m
[92m2020-09-10 19:24:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:24:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:24:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:24:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:24:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05561685562133789[0m
[92m2020-09-10 19:24:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train           32000
not_assigned    27221
val             25380
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:24:55

[92m2020-09-10 19:24:56[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:24:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:24:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 19:24:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:24:57

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:25:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:25:24 -- Epoch: 1/20; Train; loss: 0.414; acc: 0.826; precision: 0.814, recall: 0.844, ma

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:25:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:25:33 -- Epoch: 1/20; Valid; loss: 0.241; acc: 0.909; precision: 0.902, recall: 0.918, macrof1: 0.909, weightedf1: 0.909[0m
[92m2020-09-10 19:25:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))

[92m2020-09-10 19:25:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:25:59 -- Epoch: 2/20; Train; loss: 0.171; acc: 0.938; precision: 0.928, recall: 0.948, macrof1: 0.938, weightedf1: 0.938[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:26:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:26:08 -- Epoch: 2/20; Valid; loss: 0.187; acc: 0.931; precision: 0.915, recall: 0.951, macrof1: 0.931, weightedf1: 0.931[0m
[92m2020-09-10 19:26:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))

[92m2020-09-10 19:26:34[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:26:34 -- Epoch: 3/20; Train; loss: 0.102; acc: 0.967; precision: 0.958, recall: 0.976, macrof1: 0.967, weightedf1: 0.967[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:26:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:26:43 -- Epoch: 3/20; Valid; loss: 0.179; acc: 0.938; precision: 0.928, recall: 0.950, macrof1: 0.938, weightedf1: 0.938[0m
[92m2020-09-10 19:26:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))

[92m2020-09-10 19:27:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:27:09 -- Epoch: 4/20; Train; loss: 0.062; acc: 0.981; precision: 0.976, recall: 0.986, macrof1: 0.981, weightedf1: 0.981[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:27:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:27:18 -- Epoch: 4/20; Valid; loss: 0.186; acc: 0.940; precision: 0.932, recall: 0.950, macrof1: 0.940, weightedf1: 0.940[0m
[92m2020-09-10 19:27:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_lstm_v002_n32000/wikigaz_en_ft_ocr_lstm_v002_n32000.model[0m
[92m2020-09-10 19:27:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:27:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 4, selected epoch: 3[0m




User time: 140.5165
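
The parameter counts in the model summary above can be checked by hand. The sketch below recomputes them from the layer shapes printed in the summary, using the standard PyTorch layout for `Linear` and `LSTM` weights. `linear_params` and `lstm_params` are hypothetical helper names. The trainable count assumes that the True/False column in the parameter list is the trainable flag, which is consistent with the description of model B (embedding frozen).

```python
def linear_params(in_features, out_features):
    # Weight matrix plus bias vector.
    return in_features * out_features + out_features


def lstm_params(input_size, hidden_size, num_layers, bidirectional):
    # Each layer and direction holds weight_ih (4h x in), weight_hh (4h x h)
    # and two bias vectors of length 4h.
    directions = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        # Layers after the first receive the concatenated outputs of all directions.
        layer_in = input_size if layer == 0 else hidden_size * directions
        per_direction = 4 * hidden_size * (layer_in + hidden_size) + 2 * 4 * hidden_size
        total += directions * per_direction
    return total


counts = {
    "emb": 7542 * 60,
    "rnn_1": lstm_params(60, 60, num_layers=2, bidirectional=True),
    "attn_step1": linear_params(120, 60),
    "attn_step2": linear_params(60, 1),
    "fc1": linear_params(960, 120),
    "fc2": linear_params(120, 2),
}
total = sum(counts.values())
trainable = total - counts["emb"]  # emb.weight is listed as False (frozen)
print(counts)
print("total:", total, "trainable:", trainable)
```

The total agrees with the 721323 reported in the summary. With the embedding frozen, 268803 parameters remain to be updated during fine-tuning.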


In [63]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 64000

# fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path)
# on n_ft_examples OCR training pairs; this run uses the model B config without early stopping
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_B_no_early_stopping.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:27:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_B_no_early_stopping.yaml[0m
[92m2020-09-10 19:27:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:27:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:27:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:27:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:27:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05287814140319824[0m
[92m2020-09-10 19:27:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train    64000
val      20603
Name: split, dtype: int64[0m
[92m2020-09-10 19:27:19[0m [95mlwm-embeddings[0m [1m[90m[IN

[92m2020-09-10 19:27:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:27:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:27:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 19:27:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:27:22

HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:28:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:28:14 -- Epoch: 1/10; Train; loss: 0.308; acc: 0.875; precision: 0.863, recall: 0.892, ma

HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 19:28:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:28:21 -- Epoch: 1/10; Valid; loss: 0.167; acc: 0.936; precision: 0.915, recall: 0.962, macrof1: 0.936, weightedf1: 0.936[0m
[92m2020-09-10 19:28:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 19:29:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:29:12 -- Epoch: 2/10; Train; loss: 0.128; acc: 0.955; precision: 0.947, recall: 0.964, macrof1: 0.955, weightedf1: 0.955[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 19:29:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:29:19 -- Epoch: 2/10; Valid; loss: 0.138; acc: 0.949; precision: 0.932, recall: 0.969, macrof1: 0.949, weightedf1: 0.949[0m
[92m2020-09-10 19:29:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 19:30:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:30:10 -- Epoch: 3/10; Train; loss: 0.081; acc: 0.972; precision: 0.966, recall: 0.980, macrof1: 0.972, weightedf1: 0.972[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 19:30:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:30:18 -- Epoch: 3/10; Valid; loss: 0.122; acc: 0.958; precision: 0.944, recall: 0.973, macrof1: 0.958, weightedf1: 0.958[0m
[92m2020-09-10 19:30:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 19:31:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:31:09 -- Epoch: 4/10; Train; loss: 0.054; acc: 0.983; precision: 0.979, recall: 0.988, macrof1: 0.983, weightedf1: 0.983[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 19:31:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:31:16 -- Epoch: 4/10; Valid; loss: 0.127; acc: 0.954; precision: 0.957, recall: 0.950, macrof1: 0.954, weightedf1: 0.954[0m
[92m2020-09-10 19:31:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 19:32:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:32:08 -- Epoch: 5/10; Train; loss: 0.037; acc: 0.989; precision: 0.987, recall: 0.992, macrof1: 0.989, weightedf1: 0.989[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 19:32:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:32:16 -- Epoch: 5/10; Valid; loss: 0.134; acc: 0.959; precision: 0.948, recall: 0.972, macrof1: 0.959, weightedf1: 0.959[0m
[92m2020-09-10 19:32:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 19:33:07[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:33:07 -- Epoch: 6/10; Train; loss: 0.027; acc: 0.992; precision: 0.990, recall: 0.995, macrof1: 0.992, weightedf1: 0.992[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 19:33:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:33:14 -- Epoch: 6/10; Valid; loss: 0.143; acc: 0.959; precision: 0.958, recall: 0.960, macrof1: 0.959, weightedf1: 0.959[0m
[92m2020-09-10 19:33:14[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 19:34:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:34:06 -- Epoch: 7/10; Train; loss: 0.021; acc: 0.994; precision: 0.993, recall: 0.995, macrof1: 0.994, weightedf1: 0.994[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 19:34:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:34:13 -- Epoch: 7/10; Valid; loss: 0.155; acc: 0.960; precision: 0.948, recall: 0.974, macrof1: 0.960, weightedf1: 0.960[0m
[92m2020-09-10 19:34:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 19:35:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:35:04 -- Epoch: 8/10; Train; loss: 0.019; acc: 0.995; precision: 0.994, recall: 0.996, macrof1: 0.995, weightedf1: 0.995[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 19:35:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:35:11 -- Epoch: 8/10; Valid; loss: 0.161; acc: 0.960; precision: 0.957, recall: 0.963, macrof1: 0.960, weightedf1: 0.960[0m
[92m2020-09-10 19:35:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 19:36:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:36:02 -- Epoch: 9/10; Train; loss: 0.015; acc: 0.996; precision: 0.995, recall: 0.997, macrof1: 0.996, weightedf1: 0.996[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 19:36:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:36:09 -- Epoch: 9/10; Valid; loss: 0.169; acc: 0.960; precision: 0.947, recall: 0.975, macrof1: 0.960, weightedf1: 0.960[0m
[92m2020-09-10 19:36:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1000.0), HTML(value='')))

[92m2020-09-10 19:37:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:37:01 -- Epoch: 10/10; Train; loss: 0.016; acc: 0.995; precision: 0.995, recall: 0.996, macrof1: 0.995, weightedf1: 0.995[0m


HBox(children=(FloatProgress(value=0.0, max=322.0), HTML(value='')))

[92m2020-09-10 19:37:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:37:08 -- Epoch: 10/10; Valid; loss: 0.186; acc: 0.958; precision: 0.942, recall: 0.975, macrof1: 0.958, weightedf1: 0.958[0m
[92m2020-09-10 19:37:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m

[92m2020-09-10 19:37:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_lstm_v002_n64000/wikigaz_en_ft_ocr_lstm_v002_n64000.model[0m



User time: 586.1828
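
The progress-bar totals (`max=...`) in the runs above are consistent with a batch size of 64, and each split table sums to the full set of labelled pairs. The check below is a small sketch of that arithmetic. The batch size is an assumption inferred from the logs, not read from the YAML input files, and `n_batches` is a hypothetical helper.

```python
import math

BATCH_SIZE = 64  # assumption: inferred from the log, not read from the YAML file


def n_batches(n_examples, batch_size=BATCH_SIZE):
    # The last, partially filled batch still counts as one step.
    return math.ceil(n_examples / batch_size)


# (number of examples, batches reported by the progress bars above)
observed = [(16000, 250), (32000, 500), (64000, 1000), (25380, 397), (20603, 322)]
for n_examples, reported in observed:
    assert n_batches(n_examples) == reported

# Each split table sums to the 84603 labelled pairs (42301 True + 42302 False).
n_pairs = 42301 + 42302
assert 43221 + 25380 + 16000 + 2 == n_pairs  # n=16000 run
assert 27221 + 25380 + 32000 + 2 == n_pairs  # n=32000 run
assert 64000 + 20603 == n_pairs              # n=64000 run
```

In the n=64000 run no pairs are left unassigned, so the validation set shrinks from 25380 to 20603 pairs and the validation losses are not measured on the same set as in the smaller runs.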


In [64]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 84000

# fine-tune the pretrained model (pretrained_model_path, pretrained_vocab_path)
# on n_ft_examples OCR training pairs; this run uses the model B config without early stopping
dm_finetune(input_file_path="./inputs/input_dfm_lstm_model_B_no_early_stopping.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_lstm_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_lstm_001/wikigaz_en_lstm_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:37:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_lstm_model_B_no_early_stopping.yaml[0m
[92m2020-09-10 19:37:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:37:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:37:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:37:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:37:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.051722049713134766[0m
[92m2020-09-10 19:37:09[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train    84000
val        603
Name: split, dtype: int64[0m
[92m2020-09-10 19:37:09[0m [95mlwm-embeddings[0m [1m[90m[I

[92m2020-09-10 19:37:11[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:37:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:37:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) LSTM ****[0m
[92m2020-09-10 19:37:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:37:12

HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))




Total number of params: 721323

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): LSTM(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((240, 60), (240, 60), (240,), (240,), (240, 60), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,), (240, 120), (240, 60), (240,), (240,)), parameters=145920
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:38:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:38:19 -- Epoch: 1/10; Train; loss: 0.268; acc: 0.894; precision: 0.882, recall: 0.909, ma

HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 19:38:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:38:19 -- Epoch: 1/10; Valid; loss: 0.156; acc: 0.945; precision: 0.921, recall: 0.973, macrof1: 0.945, weightedf1: 0.945[0m
[92m2020-09-10 19:38:19[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 19:39:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:39:27 -- Epoch: 2/10; Train; loss: 0.114; acc: 0.961; precision: 0.952, recall: 0.970, macrof1: 0.961, weightedf1: 0.961[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 19:39:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:39:27 -- Epoch: 2/10; Valid; loss: 0.109; acc: 0.965; precision: 0.949, recall: 0.983, macrof1: 0.965, weightedf1: 0.965[0m
[92m2020-09-10 19:39:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 19:40:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:40:35 -- Epoch: 3/10; Train; loss: 0.074; acc: 0.975; precision: 0.969, recall: 0.982, macrof1: 0.975, weightedf1: 0.975[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 19:40:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:40:35 -- Epoch: 3/10; Valid; loss: 0.099; acc: 0.964; precision: 0.957, recall: 0.970, macrof1: 0.964, weightedf1: 0.964[0m
[92m2020-09-10 19:40:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 19:41:47[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:41:47 -- Epoch: 4/10; Train; loss: 0.049; acc: 0.984; precision: 0.980, recall: 0.989, macrof1: 0.984, weightedf1: 0.984[0m


HBox(children=(FloatProgress(value=0.0, max=10.0), HTML(value='')))

[92m2020-09-10 19:41:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:41:48 -- Epoch: 4/10; Valid; loss: 0.093; acc: 0.968; precision: 0.961, recall: 0.977, macrof1: 0.968, weightedf1: 0.968[0m
[92m2020-09-10 19:41:48[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=1313.0), HTML(value='')))

[92m2020-09-10 19:43:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:43:01 -- Epoch: 5/10; Train; loss: 0.035; acc: 0.989; precision: 0.987, recall: 0.992, macrof1: 0.989, weightedf1: 0.989[0m


[92m2020-09-10 19:43:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:43:01 -- Epoch: 5/10; Valid; loss: 0.137; acc: 0.967; precision: 0.958, recall: 0.977, macrof1: 0.967, weightedf1: 0.967[0m
[92m2020-09-10 19:43:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:44:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:44:15 -- Epoch: 6/10; Train; loss: 0.027; acc: 0.992; precision: 0.990, recall: 0.994, macrof1: 0.992, weightedf1: 0.992[0m


[92m2020-09-10 19:44:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:44:16 -- Epoch: 6/10; Valid; loss: 0.153; acc: 0.965; precision: 0.949, recall: 0.983, macrof1: 0.965, weightedf1: 0.965[0m
[92m2020-09-10 19:44:16[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:45:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:45:25 -- Epoch: 7/10; Train; loss: 0.023; acc: 0.993; precision: 0.992, recall: 0.995, macrof1: 0.993, weightedf1: 0.993[0m


[92m2020-09-10 19:45:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:45:25 -- Epoch: 7/10; Valid; loss: 0.121; acc: 0.967; precision: 0.946, recall: 0.990, macrof1: 0.967, weightedf1: 0.967[0m
[92m2020-09-10 19:45:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:46:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:46:33 -- Epoch: 8/10; Train; loss: 0.019; acc: 0.995; precision: 0.993, recall: 0.996, macrof1: 0.995, weightedf1: 0.995[0m


[92m2020-09-10 19:46:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:46:33 -- Epoch: 8/10; Valid; loss: 0.224; acc: 0.965; precision: 0.949, recall: 0.983, macrof1: 0.965, weightedf1: 0.965[0m
[92m2020-09-10 19:46:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:47:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:47:42 -- Epoch: 9/10; Train; loss: 0.018; acc: 0.995; precision: 0.994, recall: 0.996, macrof1: 0.995, weightedf1: 0.995[0m


[92m2020-09-10 19:47:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:47:42 -- Epoch: 9/10; Valid; loss: 0.196; acc: 0.965; precision: 0.955, recall: 0.977, macrof1: 0.965, weightedf1: 0.965[0m
[92m2020-09-10 19:47:42[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:48:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:48:51 -- Epoch: 10/10; Train; loss: 0.016; acc: 0.996; precision: 0.995, recall: 0.997, macrof1: 0.996, weightedf1: 0.996[0m


[92m2020-09-10 19:48:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:48:52 -- Epoch: 10/10; Valid; loss: 0.173; acc: 0.964; precision: 0.954, recall: 0.973, macrof1: 0.964, weightedf1: 0.964[0m
[92m2020-09-10 19:48:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m

[92m2020-09-10 19:48:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_lstm_v002_n84000/wikigaz_en_ft_ocr_lstm_v002_n84000.model[0m



User time: 699.9799
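
The last log line above reports that the checkpoint with the least validation loss (checkpoint 4) is the one that gets saved. A minimal sketch of how that selection could be reproduced from the log text, assuming the `Epoch: i/N; Valid; loss: x;` line format shown in this notebook. `valid_losses` and `best_epoch` are hypothetical helper names and are not part of DeezyMatch:

```python
import re
from typing import Dict, Iterable

# Matches only the validation lines; "Train" lines are ignored.
VALID_RE = re.compile(r"Epoch: (\d+)/\d+; Valid; loss: ([0-9.]+);")


def valid_losses(log_lines: Iterable[str]) -> Dict[int, float]:
    """Map epoch number to validation loss for every matching log line."""
    losses = {}
    for line in log_lines:
        match = VALID_RE.search(line)
        if match:
            losses[int(match.group(1))] = float(match.group(2))
    return losses


def best_epoch(losses: Dict[int, float]) -> int:
    """Return the epoch with the smallest validation loss."""
    return min(losses, key=losses.get)


# Three validation lines copied from the log above (ANSI colour codes omitted).
sample = [
    "09/10/2020_19:40:35 -- Epoch: 3/10; Valid; loss: 0.099; acc: 0.964; precision: 0.957, recall: 0.970, macrof1: 0.964, weightedf1: 0.964",
    "09/10/2020_19:41:48 -- Epoch: 4/10; Valid; loss: 0.093; acc: 0.968; precision: 0.961, recall: 0.977, macrof1: 0.968, weightedf1: 0.968",
    "09/10/2020_19:43:01 -- Epoch: 5/10; Valid; loss: 0.137; acc: 0.967; precision: 0.958, recall: 0.977, macrof1: 0.967, weightedf1: 0.967",
]
print(best_epoch(valid_losses(sample)))  # 4
```

The regular expression uses `search` rather than `match`, so it also works on raw lines that still carry the timestamp and colour prefixes.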


## Fine-Tune, model B, RNN

In [65]:
from DeezyMatch import finetune as dm_finetune

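# number of OCR training pairs to fine-tune on (reported as the "train" split in the log)
n_ft_examples = 250

# fine-tune the pretrained model stored at pretrained_model_path, using the vocabulary at pretrained_vocab_path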
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:48:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml[0m
[92m2020-09-10 19:48:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:48:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:48:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:48:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:48:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05935072898864746[0m
[92m2020-09-10 19:48:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58971
val             25380
train             250
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:48:52[

[92m2020-09-10 19:48:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:48:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:48:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 19:48:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:48:55[




Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:48:55[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:48:55 -- Epoch: 1/20; Train; loss: 1.026; acc: 0.520; precision: 0.520, recall: 0.528, macrof1: 0.520, weig

[92m2020-09-10 19:49:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:49:03 -- Epoch: 1/20; Valid; loss: 0.999; acc: 0.508; precision: 0.507, recall: 0.582, macrof1: 0.505, weightedf1: 0.505[0m
[92m2020-09-10 19:49:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:49:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:49:03 -- Epoch: 2/20; Train; loss: 0.662; acc: 0.628; precision: 0.621, recall: 0.656, macrof1: 0.628, weightedf1: 0.628[0m


[92m2020-09-10 19:49:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:49:12 -- Epoch: 2/20; Valid; loss: 0.896; acc: 0.528; precision: 0.523, recall: 0.614, macrof1: 0.524, weightedf1: 0.524[0m
[92m2020-09-10 19:49:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:49:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:49:12 -- Epoch: 3/20; Train; loss: 0.518; acc: 0.740; precision: 0.724, recall: 0.776, macrof1: 0.740, weightedf1: 0.740[0m


[92m2020-09-10 19:49:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:49:20 -- Epoch: 3/20; Valid; loss: 0.845; acc: 0.542; precision: 0.536, recall: 0.632, macrof1: 0.538, weightedf1: 0.538[0m
[92m2020-09-10 19:49:20[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:49:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:49:21 -- Epoch: 4/20; Train; loss: 0.427; acc: 0.776; precision: 0.745, recall: 0.840, macrof1: 0.775, weightedf1: 0.775[0m


[92m2020-09-10 19:49:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:49:29 -- Epoch: 4/20; Valid; loss: 0.824; acc: 0.557; precision: 0.548, recall: 0.652, macrof1: 0.553, weightedf1: 0.553[0m
[92m2020-09-10 19:49:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:49:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:49:29 -- Epoch: 5/20; Train; loss: 0.364; acc: 0.848; precision: 0.813, recall: 0.904, macrof1: 0.848, weightedf1: 0.848[0m


[92m2020-09-10 19:49:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:49:37 -- Epoch: 5/20; Valid; loss: 0.825; acc: 0.574; precision: 0.563, recall: 0.658, macrof1: 0.571, weightedf1: 0.571[0m
[92m2020-09-10 19:49:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v002_n250/wikigaz_en_ft_ocr_rnn_v002_n250.model[0m
[92m2020-09-10 19:49:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:49:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 5, selected epoch: 4[0m




User time: 42.7793
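
The run above stops at epoch 5 and keeps the checkpoint from epoch 4, the last epoch at which the validation loss improved. A minimal sketch of an early-stopping rule that is consistent with this log is given below. The patience of 1 is an assumption inferred from the log. The actual criterion is controlled by DeezyMatch and its input file and may differ. `early_stop` is a hypothetical helper name:

```python
from typing import Optional, Sequence, Tuple


def early_stop(valid_losses: Sequence[float],
               patience: int = 1) -> Tuple[Optional[int], int]:
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs. stop_epoch is None if that never happens.
    """
    best_loss = float("inf")
    best_epoch = 0
    bad_epochs = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            # New best checkpoint: remember it and reset the counter.
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Validation losses of the n=250 run above, epochs 1 to 5.
print(early_stop([0.999, 0.896, 0.845, 0.824, 0.825]))  # (5, 4)
```

With so few fine-tuning examples the training loss keeps falling while the validation loss flattens out, which is the usual reason for stopping early.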


In [66]:
from DeezyMatch import finetune as dm_finetune

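# number of OCR training pairs to fine-tune on (reported as the "train" split in the log)
n_ft_examples = 500

# fine-tune the pretrained model stored at pretrained_model_path, using the vocabulary at pretrained_vocab_path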
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:49:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml[0m
[92m2020-09-10 19:49:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:49:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:49:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:49:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:49:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05241036415100098[0m
[92m2020-09-10 19:49:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58721
val             25380
train             500
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:49:38[

[92m2020-09-10 19:49:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:49:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:49:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 19:49:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:49:40[




Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:49:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:49:41 -- Epoch: 1/20; Train; loss: 1.019; acc: 0.500; precision: 0.500, recall: 0.532, macrof1: 0.499, weig

[92m2020-09-10 19:49:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:49:49 -- Epoch: 1/20; Valid; loss: 0.844; acc: 0.519; precision: 0.517, recall: 0.593, macrof1: 0.517, weightedf1: 0.517[0m
[92m2020-09-10 19:49:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:49:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:49:49 -- Epoch: 2/20; Train; loss: 0.661; acc: 0.622; precision: 0.606, recall: 0.700, macrof1: 0.620, weightedf1: 0.620[0m


[92m2020-09-10 19:49:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:49:57 -- Epoch: 2/20; Valid; loss: 0.733; acc: 0.553; precision: 0.543, recall: 0.676, macrof1: 0.547, weightedf1: 0.547[0m
[92m2020-09-10 19:49:57[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:49:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:49:58 -- Epoch: 3/20; Train; loss: 0.566; acc: 0.702; precision: 0.674, recall: 0.784, macrof1: 0.700, weightedf1: 0.700[0m


[92m2020-09-10 19:50:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:50:06 -- Epoch: 3/20; Valid; loss: 0.698; acc: 0.576; precision: 0.560, recall: 0.707, macrof1: 0.568, weightedf1: 0.568[0m
[92m2020-09-10 19:50:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:50:06[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:50:06 -- Epoch: 4/20; Train; loss: 0.496; acc: 0.730; precision: 0.705, recall: 0.792, macrof1: 0.729, weightedf1: 0.729[0m


[92m2020-09-10 19:50:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:50:15 -- Epoch: 4/20; Valid; loss: 0.688; acc: 0.591; precision: 0.578, recall: 0.673, macrof1: 0.588, weightedf1: 0.588[0m
[92m2020-09-10 19:50:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:50:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:50:15 -- Epoch: 5/20; Train; loss: 0.433; acc: 0.782; precision: 0.785, recall: 0.776, macrof1: 0.782, weightedf1: 0.782[0m


[92m2020-09-10 19:50:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:50:23 -- Epoch: 5/20; Valid; loss: 0.699; acc: 0.610; precision: 0.600, recall: 0.655, macrof1: 0.609, weightedf1: 0.609[0m
[92m2020-09-10 19:50:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v002_n500/wikigaz_en_ft_ocr_rnn_v002_n500.model[0m
[92m2020-09-10 19:50:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:50:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 5, selected epoch: 4[0m




User time: 42.7849
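
The model summary printed above reports 611883 parameters in total, while the parameter listing marks `emb.weight` as frozen. The short sketch below re-derives the per-layer counts from the printed layer shapes and shows how many parameters remain trainable. It assumes the standard parameter layout of a vanilla (Elman) RNN, with one input-to-hidden matrix, one hidden-to-hidden matrix and two bias vectors per layer and direction. The helper names are hypothetical:

```python
def embedding_params(vocab_size: int, dim: int) -> int:
    """An embedding layer is a single vocab_size x dim matrix."""
    return vocab_size * dim


def linear_params(in_features: int, out_features: int) -> int:
    """Weight matrix plus bias vector."""
    return in_features * out_features + out_features


def rnn_params(input_size: int, hidden_size: int,
               num_layers: int, bidirectional: bool) -> int:
    """Vanilla RNN: W_ih, W_hh and two biases per layer and direction."""
    directions = 2 if bidirectional else 1
    total = 0
    for layer in range(num_layers):
        # Layers after the first receive the concatenated directions as input.
        in_size = input_size if layer == 0 else hidden_size * directions
        per_direction = (in_size * hidden_size
                         + hidden_size * hidden_size
                         + 2 * hidden_size)
        total += directions * per_direction
    return total


# Shapes taken from the model summary printed above.
counts = {
    "emb": embedding_params(7542, 60),
    "rnn_1": rnn_params(60, 60, num_layers=2, bidirectional=True),
    "attn_step1": linear_params(120, 60),
    "attn_step2": linear_params(60, 1),
    "fc1": linear_params(960, 120),
    "fc2": linear_params(120, 2),
}
total = sum(counts.values())
trainable = total - counts["emb"]  # emb.weight is listed as frozen (False)
print(total, trainable)  # 611883 159363
```

Under these assumptions the embedding layer accounts for 452520 of the 611883 parameters, so freezing it leaves 159363 parameters to be updated during fine-tuning.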


In [67]:
from DeezyMatch import finetune as dm_finetune

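# number of OCR training pairs to fine-tune on (reported as the "train" split in the log)
n_ft_examples = 1000

# fine-tune the pretrained model stored at pretrained_model_path, using the vocabulary at pretrained_vocab_path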
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:50:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml[0m
[92m2020-09-10 19:50:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:50:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:50:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:50:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:50:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05342698097229004[0m
[92m2020-09-10 19:50:24[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    58221
val             25380
train            1000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:50:24[

                                                   

[92m2020-09-10 19:50:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:50:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:50:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 19:50:26[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:50:26[




Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:50:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:50:27 -- Epoch: 1/20; Train; loss: 0.895; acc: 0.511; precision: 0.510, recall: 0.552, macrof1: 0.510, weig

[92m2020-09-10 19:50:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:50:35 -- Epoch: 1/20; Valid; loss: 0.737; acc: 0.550; precision: 0.542, recall: 0.641, macrof1: 0.546, weightedf1: 0.546[0m
[92m2020-09-10 19:50:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:50:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:50:36 -- Epoch: 2/20; Train; loss: 0.615; acc: 0.657; precision: 0.627, recall: 0.776, macrof1: 0.652, weightedf1: 0.652[0m


[92m2020-09-10 19:50:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:50:45 -- Epoch: 2/20; Valid; loss: 0.669; acc: 0.598; precision: 0.578, recall: 0.727, macrof1: 0.591, weightedf1: 0.591[0m
[92m2020-09-10 19:50:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:50:45[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:50:45 -- Epoch: 3/20; Train; loss: 0.526; acc: 0.702; precision: 0.680, recall: 0.762, macrof1: 0.701, weightedf1: 0.701[0m


[92m2020-09-10 19:50:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:50:54 -- Epoch: 3/20; Valid; loss: 0.642; acc: 0.631; precision: 0.619, recall: 0.685, macrof1: 0.630, weightedf1: 0.630[0m
[92m2020-09-10 19:50:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:50:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:50:54 -- Epoch: 4/20; Train; loss: 0.437; acc: 0.777; precision: 0.767, recall: 0.796, macrof1: 0.777, weightedf1: 0.777[0m


[92m2020-09-10 19:51:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:51:03 -- Epoch: 4/20; Valid; loss: 0.628; acc: 0.664; precision: 0.653, recall: 0.697, macrof1: 0.663, weightedf1: 0.663[0m
[92m2020-09-10 19:51:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:51:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:51:04 -- Epoch: 5/20; Train; loss: 0.352; acc: 0.846; precision: 0.837, recall: 0.860, macrof1: 0.846, weightedf1: 0.846[0m


[92m2020-09-10 19:51:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:51:12 -- Epoch: 5/20; Valid; loss: 0.624; acc: 0.691; precision: 0.700, recall: 0.668, macrof1: 0.691, weightedf1: 0.691[0m
[92m2020-09-10 19:51:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:51:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:51:12 -- Epoch: 6/20; Train; loss: 0.264; acc: 0.894; precision: 0.917, recall: 0.866, macrof1: 0.894, weightedf1: 0.894[0m


[92m2020-09-10 19:51:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:51:21 -- Epoch: 6/20; Valid; loss: 0.646; acc: 0.715; precision: 0.706, recall: 0.739, macrof1: 0.715, weightedf1: 0.715[0m
[92m2020-09-10 19:51:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 5) at ./models/wikigaz_en_ft_ocr_rnn_v002_n1000/wikigaz_en_ft_ocr_rnn_v002_n1000.model[0m
[92m2020-09-10 19:51:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:51:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 6, selected epoch: 5[0m




User time: 54.3822
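
The fine-tuning cells in this section differ only in `n_ft_examples`. One way to express the whole sweep compactly is sketched below. The keyword arguments are the ones used in the `dm_finetune` calls of this notebook. `finetune_kwargs` and `run_sweep` are hypothetical helper names, and the fine-tuning function is passed in as an argument so that the sketch can be tried without DeezyMatch installed:

```python
from typing import Any, Callable, Dict, Iterable, List


def finetune_kwargs(n_ft_examples: int) -> Dict[str, Any]:
    """Keyword arguments matching the dm_finetune calls in this section."""
    return {
        "input_file_path": "./inputs/input_dfm_rnn_model_B.yaml",
        "dataset_path": "./dataset/ocr_trainval.txt",
        "model_name": f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
        "pretrained_model_path": "./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model",
        "pretrained_vocab_path": "./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
        "n_train_examples": n_ft_examples,
    }


def run_sweep(finetune_fn: Callable[..., Any],
              sizes: Iterable[int]) -> List[str]:
    """Call finetune_fn once per training-set size and return the model names."""
    model_names = []
    for n_ft_examples in sizes:
        kwargs = finetune_kwargs(n_ft_examples)
        finetune_fn(**kwargs)
        model_names.append(kwargs["model_name"])
    return model_names


# In the notebook this would be (requires DeezyMatch):
#   from DeezyMatch import finetune as dm_finetune
#   run_sweep(dm_finetune, [250, 500, 1000, 2000])
```

Keeping one cell per size, as this notebook does, has the advantage that each run's log stays attached to the cell that produced it.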


In [68]:
from DeezyMatch import finetune as dm_finetune

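# number of OCR training pairs to fine-tune on (reported as the "train" split in the log)
n_ft_examples = 2000

# fine-tune the pretrained model stored at pretrained_model_path, using the vocabulary at pretrained_vocab_path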
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:51:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml[0m
[92m2020-09-10 19:51:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:51:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:51:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:51:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:51:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05343437194824219[0m
[92m2020-09-10 19:51:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    57221
val             25380
train            2000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:51:21[

                                                    

[92m2020-09-10 19:51:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:51:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:51:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 19:51:23[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:51:23[




Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:51:25[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:51:25 -- Epoch: 1/20; Train; loss: 0.785; acc: 0.559; precision: 0.550, recall: 0.644, macrof1: 0.556, weig

[92m2020-09-10 19:51:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:51:33 -- Epoch: 1/20; Valid; loss: 0.654; acc: 0.602; precision: 0.576, recall: 0.775, macrof1: 0.589, weightedf1: 0.589[0m
[92m2020-09-10 19:51:33[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


[92m2020-09-10 19:51:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:51:35 -- Epoch: 2/20; Train; loss: 0.559; acc: 0.686; precision: 0.661, recall: 0.766, macrof1: 0.685, weightedf1: 0.685[0m


[92m2020-09-10 19:51:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:51:43 -- Epoch: 2/20; Valid; loss: 0.598; acc: 0.682; precision: 0.685, recall: 0.673, macrof1: 0.682, weightedf1: 0.682[0m
[92m2020-09-10 19:51:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))

[92m2020-09-10 19:51:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:51:44 -- Epoch: 3/20; Train; loss: 0.445; acc: 0.777; precision: 0.783, recall: 0.768, macrof1: 0.777, weightedf1: 0.777[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:51:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:51:52 -- Epoch: 3/20; Valid; loss: 0.543; acc: 0.745; precision: 0.751, recall: 0.733, macrof1: 0.745, weightedf1: 0.745[0m
[92m2020-09-10 19:51:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))

[92m2020-09-10 19:51:54[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:51:54 -- Epoch: 4/20; Train; loss: 0.315; acc: 0.863; precision: 0.885, recall: 0.833, macrof1: 0.862, weightedf1: 0.862[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:52:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:52:02 -- Epoch: 4/20; Valid; loss: 0.531; acc: 0.767; precision: 0.765, recall: 0.772, macrof1: 0.767, weightedf1: 0.767[0m
[92m2020-09-10 19:52:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=32.0), HTML(value='')))

[92m2020-09-10 19:52:04[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:52:04 -- Epoch: 5/20; Train; loss: 0.218; acc: 0.916; precision: 0.930, recall: 0.901, macrof1: 0.916, weightedf1: 0.916[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:52:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:52:12 -- Epoch: 5/20; Valid; loss: 0.542; acc: 0.779; precision: 0.779, recall: 0.779, macrof1: 0.779, weightedf1: 0.779[0m
[92m2020-09-10 19:52:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v002_n2000/wikigaz_en_ft_ocr_rnn_v002_n2000.model[0m
[92m2020-09-10 19:52:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:52:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 5, selected epoch: 4[0m




User time: 48.5973
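
The early-stopping messages above report the checkpoint with the least validation loss. As a minimal sketch (not DeezyMatch's internal code), the selected epoch can be recovered from the validation losses logged for this run. The helper name `select_epoch` is hypothetical.

```python
# Validation losses copied from the "Valid" log lines of the run above (epochs 1-5).
valid_losses = [0.654, 0.598, 0.543, 0.531, 0.542]


def select_epoch(losses):
    """Return the 1-based epoch with the smallest validation loss (hypothetical helper)."""
    best_index = min(range(len(losses)), key=losses.__getitem__)
    return best_index + 1


selected = select_epoch(valid_losses)
print(f"selected epoch: {selected}")  # the log reports "selected epoch: 4"
```

Only the choice of checkpoint is reproduced here; the rule that decides when to stop training is set in the input YAML file and is not shown in these logs.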


In [69]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 4000

# Fine-tune the pretrained model at pretrained_model_path (vocabulary at pretrained_vocab_path)
# on n_ft_examples training examples, using the model B configuration.
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:52:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml[0m
[92m2020-09-10 19:52:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:52:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:52:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:52:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:52:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.059014320373535156[0m
[92m2020-09-10 19:52:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    55221
val             25380
train            4000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:52:13


[92m2020-09-10 19:52:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:52:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:52:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 19:52:15[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:52:15[

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))




Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:52:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:52:18 -- Epoch: 1/20; Train; loss: 0.684; acc: 0.605; precision: 0.590, recall: 0.689, macrof1: 0.602, weig

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:52:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:52:27 -- Epoch: 1/20; Valid; loss: 0.584; acc: 0.696; precision: 0.709, recall: 0.665, macrof1: 0.696, weightedf1: 0.696[0m
[92m2020-09-10 19:52:27[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 19:52:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:52:30 -- Epoch: 2/20; Train; loss: 0.475; acc: 0.765; precision: 0.770, recall: 0.756, macrof1: 0.765, weightedf1: 0.765[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:52:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:52:38 -- Epoch: 2/20; Valid; loss: 0.492; acc: 0.770; precision: 0.781, recall: 0.750, macrof1: 0.770, weightedf1: 0.770[0m
[92m2020-09-10 19:52:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 19:52:41[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:52:41 -- Epoch: 3/20; Train; loss: 0.341; acc: 0.849; precision: 0.848, recall: 0.851, macrof1: 0.849, weightedf1: 0.849[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:52:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:52:49 -- Epoch: 3/20; Valid; loss: 0.451; acc: 0.800; precision: 0.801, recall: 0.800, macrof1: 0.800, weightedf1: 0.800[0m
[92m2020-09-10 19:52:49[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=63.0), HTML(value='')))

[92m2020-09-10 19:52:52[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:52:52 -- Epoch: 4/20; Train; loss: 0.230; acc: 0.911; precision: 0.914, recall: 0.908, macrof1: 0.911, weightedf1: 0.911[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:53:00 -- Epoch: 4/20; Valid; loss: 0.459; acc: 0.811; precision: 0.819, recall: 0.799, macrof1: 0.811, weightedf1: 0.811[0m
[92m2020-09-10 19:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_rnn_v002_n4000/wikigaz_en_ft_ocr_rnn_v002_n4000.model[0m
[92m2020-09-10 19:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 4, selected epoch: 3[0m




User time: 44.3694
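
The model summary above lists the weight shapes of every layer, so the reported parameter counts can be re-derived by hand. The sketch below does that arithmetic in plain Python with shapes copied from the log. It also computes how many parameters remain trainable when the embedding is frozen, assuming the `True`/`False` column in the parameter listing is each parameter's `requires_grad` flag.

```python
from math import prod

# Weight shapes copied from the model summary printed above.
layer_shapes = {
    "emb": [(7542, 60)],
    "rnn_1": [
        # layer 0, forward and reverse directions
        (60, 60), (60, 60), (60,), (60,),
        (60, 60), (60, 60), (60,), (60,),
        # layer 1, forward and reverse directions (input is 2 * 60 wide)
        (60, 120), (60, 60), (60,), (60,),
        (60, 120), (60, 60), (60,), (60,),
    ],
    "attn_step1": [(60, 120), (60,)],
    "attn_step2": [(1, 60), (1,)],
    "fc1": [(120, 960), (120,)],
    "fc2": [(2, 120), (2,)],
}

params_per_layer = {
    name: sum(prod(shape) for shape in shapes)
    for name, shapes in layer_shapes.items()
}
total_params = sum(params_per_layer.values())
# ASSUMPTION: "emb.weight False" in the listing above means the embedding is frozen.
trainable_params = total_params - params_per_layer["emb"]

print(params_per_layer)
print("total:", total_params)
print("trainable with frozen embedding:", trainable_params)
```

The per-layer values and the total agree with the `parameters=` fields and the `Total number of params` line in the log.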


In [70]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 8000

# Fine-tune the pretrained model at pretrained_model_path (vocabulary at pretrained_vocab_path)
# on n_ft_examples training examples, using the model B configuration.
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml[0m
[92m2020-09-10 19:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.052842140197753906[0m
[92m2020-09-10 19:53:00[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    51221
val             25380
train            8000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:53:00

                                                    

[92m2020-09-10 19:53:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:53:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:53:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 19:53:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:53:03[

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))




Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:53:08[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:53:08 -- Epoch: 1/20; Train; loss: 0.616; acc: 0.668; precision: 0.652, recall: 0.721, macrof1: 0.668, weig

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:53:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:53:17 -- Epoch: 1/20; Valid; loss: 0.478; acc: 0.775; precision: 0.744, recall: 0.838, macrof1: 0.774, weightedf1: 0.774[0m
[92m2020-09-10 19:53:17[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

[92m2020-09-10 19:53:22[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:53:22 -- Epoch: 2/20; Train; loss: 0.385; acc: 0.829; precision: 0.819, recall: 0.844, macrof1: 0.829, weightedf1: 0.829[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:53:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:53:30 -- Epoch: 2/20; Valid; loss: 0.400; acc: 0.821; precision: 0.795, recall: 0.865, macrof1: 0.821, weightedf1: 0.821[0m
[92m2020-09-10 19:53:30[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

[92m2020-09-10 19:53:36[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:53:36 -- Epoch: 3/20; Train; loss: 0.270; acc: 0.890; precision: 0.882, recall: 0.900, macrof1: 0.890, weightedf1: 0.890[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:53:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:53:44 -- Epoch: 3/20; Valid; loss: 0.370; acc: 0.841; precision: 0.826, recall: 0.865, macrof1: 0.841, weightedf1: 0.841[0m
[92m2020-09-10 19:53:44[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=125.0), HTML(value='')))

[92m2020-09-10 19:53:50[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:53:50 -- Epoch: 4/20; Train; loss: 0.186; acc: 0.931; precision: 0.925, recall: 0.939, macrof1: 0.931, weightedf1: 0.931[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:53:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:53:58 -- Epoch: 4/20; Valid; loss: 0.380; acc: 0.848; precision: 0.818, recall: 0.895, macrof1: 0.847, weightedf1: 0.847[0m
[92m2020-09-10 19:53:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_rnn_v002_n8000/wikigaz_en_ft_ocr_rnn_v002_n8000.model[0m
[92m2020-09-10 19:53:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:53:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 4, selected epoch: 3[0m




User time: 55.5615
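
The split sizes and progress-bar totals logged above can be cross-checked with a little arithmetic. The sketch below uses numbers copied from the n=8000 run. The batch size of 64 is an assumption inferred from the progress-bar totals; it is not stated anywhere in this output.

```python
from math import ceil

# Label counts and split sizes copied from the logs of the n=8000 run above.
n_true, n_false = 42301, 42302
splits = {"not_assigned": 51221, "val": 25380, "train": 8000, "test": 2}

# Every row of the dataset should land in exactly one split.
assert sum(splits.values()) == n_true + n_false

# ASSUMPTION: batch size 64, inferred from the progress-bar totals
# (max=125.0 for training, max=397.0 for validation); it is not shown in the logs.
batch_size = 64
train_batches = ceil(splits["train"] / batch_size)
val_batches = ceil(splits["val"] / batch_size)
print(train_batches, val_batches)
```

With that assumed batch size, the computed batch counts match the `max=` values of the training and validation progress bars.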


In [71]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 16000

# Fine-tune the pretrained model at pretrained_model_path (vocabulary at pretrained_vocab_path)
# on n_ft_examples training examples, using the model B configuration.
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:53:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml[0m
[92m2020-09-10 19:53:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:53:58[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:53:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:53:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:53:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.05814766883850098[0m
[92m2020-09-10 19:53:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
not_assigned    43221
val             25380
train           16000
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:53:59[


[92m2020-09-10 19:54:01[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:54:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:54:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 19:54:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:54:02[

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))




Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:54:13[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:54:13 -- Epoch: 1/20; Train; loss: 0.528; acc: 0.732; precision: 0.715, recall: 0.772, macrof1: 0.732, weig

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:54:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:54:21 -- Epoch: 1/20; Valid; loss: 0.386; acc: 0.831; precision: 0.807, recall: 0.868, macrof1: 0.830, weightedf1: 0.830[0m
[92m2020-09-10 19:54:21[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 19:54:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:54:32 -- Epoch: 2/20; Train; loss: 0.310; acc: 0.871; precision: 0.859, recall: 0.888, macrof1: 0.871, weightedf1: 0.871[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:54:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:54:40 -- Epoch: 2/20; Valid; loss: 0.327; acc: 0.863; precision: 0.875, recall: 0.848, macrof1: 0.863, weightedf1: 0.863[0m
[92m2020-09-10 19:54:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 19:54:51[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:54:51 -- Epoch: 3/20; Train; loss: 0.218; acc: 0.916; precision: 0.908, recall: 0.925, macrof1: 0.916, weightedf1: 0.916[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:54:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:54:59 -- Epoch: 3/20; Valid; loss: 0.309; acc: 0.878; precision: 0.868, recall: 0.891, macrof1: 0.878, weightedf1: 0.878[0m
[92m2020-09-10 19:54:59[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 19:55:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:55:10 -- Epoch: 4/20; Train; loss: 0.161; acc: 0.941; precision: 0.934, recall: 0.950, macrof1: 0.941, weightedf1: 0.941[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:55:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:55:18 -- Epoch: 4/20; Valid; loss: 0.300; acc: 0.887; precision: 0.886, recall: 0.889, macrof1: 0.887, weightedf1: 0.887[0m
[92m2020-09-10 19:55:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=250.0), HTML(value='')))

[92m2020-09-10 19:55:29[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:55:29 -- Epoch: 5/20; Train; loss: 0.116; acc: 0.958; precision: 0.953, recall: 0.964, macrof1: 0.958, weightedf1: 0.958[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:55:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:55:37 -- Epoch: 5/20; Valid; loss: 0.315; acc: 0.888; precision: 0.887, recall: 0.890, macrof1: 0.888, weightedf1: 0.888[0m
[92m2020-09-10 19:55:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 4) at ./models/wikigaz_en_ft_ocr_rnn_v002_n16000/wikigaz_en_ft_ocr_rnn_v002_n16000.model[0m
[92m2020-09-10 19:55:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:55:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 5, selected epoch: 4[0m




User time: 95.3452
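
To compare runs, the per-epoch metrics can be extracted from log lines like the ones above. The following is a rough sketch using only the standard library: it strips the colour codes and parses the `name: value` pairs. The helper name `parse_metrics` is hypothetical, and the sample line is copied from the epoch 4 validation entry of this run.

```python
import re

# Matches ANSI colour codes such as "\x1b[92m"; the escape byte is optional
# because it is often lost when notebook output is exported as plain text.
ANSI_CODE = re.compile(r"\x1b?\[[0-9;]*m")
EPOCH_LINE = re.compile(r"Epoch: (\d+)/(\d+); (Train|Valid); (.*)")


def parse_metrics(line):
    """Return (epoch, phase, metrics) for an epoch log line, or None (hypothetical helper)."""
    match = EPOCH_LINE.search(ANSI_CODE.sub("", line))
    if match is None:
        return None
    epoch, _total, phase, rest = match.groups()
    # Metrics are separated by ";" or ","; each item looks like "name: value".
    metrics = {}
    for item in re.split(r"[;,]", rest):
        name, sep, value = item.partition(":")
        if sep:
            try:
                metrics[name.strip()] = float(value)
            except ValueError:
                pass  # skip a field whose value was cut off
    return int(epoch), phase, metrics


line = ("[92m2020-09-10 19:55:18[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m"
        "09/10/2020_19:55:18 -- Epoch: 4/20; Valid; loss: 0.300; acc: 0.887; "
        "precision: 0.886, recall: 0.889, macrof1: 0.887, weightedf1: 0.887[0m")
print(parse_metrics(line))
```

Lines that are not epoch summaries (for example `saving the model`) return `None`, and fields truncated in the export are skipped rather than guessed.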


In [72]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 32000

# Fine-tune the pretrained model at pretrained_model_path (vocabulary at pretrained_vocab_path)
# on n_ft_examples training examples, using the model B configuration.
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_B.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

[92m2020-09-10 19:55:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread input file: ./inputs/input_dfm_rnn_model_B.yaml[0m
[92m2020-09-10 19:55:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mpytorch will use: cuda:1[0m
[92m2020-09-10 19:55:37[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mread CSV file: ./dataset/ocr_trainval.txt[0m
[92m2020-09-10 19:55:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32mnumber of labels, True: 42301 and False: 42302[0m
[92m2020-09-10 19:55:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mSplitting the Dataset[0m
[92m2020-09-10 19:55:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mfinish splitting the Dataset. User time: 0.04746866226196289[0m
[92m2020-09-10 19:55:38[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msplits are as follow:
train           32000
not_assigned    27221
val             25380
test                2
Name: split, dtype: int64[0m
[92m2020-09-10 19:55:38[


[92m2020-09-10 19:55:39[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mskipping 0 lines[0m


                                                                     



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



[92m2020-09-10 19:55:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:55:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m**** (Bi-directional) RNN ****[0m
[92m2020-09-10 19:55:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [95m******************************[0m
[92m2020-09-10 19:55:40[

HBox(children=(FloatProgress(value=0.0, max=20.0), HTML(value='')))

HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))




Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


[92m2020-09-10 19:56:02[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:56:02 -- Epoch: 1/20; Train; loss: 0.434; acc: 0.794; precision: 0.778, recall: 0.822, macrof1: 0.794, weig

HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:56:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:56:10 -- Epoch: 1/20; Valid; loss: 0.315; acc: 0.865; precision: 0.835, recall: 0.910, macrof1: 0.865, weightedf1: 0.865[0m
[92m2020-09-10 19:56:10[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))

[92m2020-09-10 19:56:32[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:56:32 -- Epoch: 2/20; Train; loss: 0.255; acc: 0.898; precision: 0.888, recall: 0.910, macrof1: 0.898, weightedf1: 0.898[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:56:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:56:40 -- Epoch: 2/20; Valid; loss: 0.262; acc: 0.896; precision: 0.886, recall: 0.910, macrof1: 0.896, weightedf1: 0.896[0m
[92m2020-09-10 19:56:40[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))

[92m2020-09-10 19:57:03[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:57:03 -- Epoch: 3/20; Train; loss: 0.193; acc: 0.926; precision: 0.918, recall: 0.936, macrof1: 0.926, weightedf1: 0.926[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:57:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:57:12 -- Epoch: 3/20; Valid; loss: 0.241; acc: 0.905; precision: 0.898, recall: 0.914, macrof1: 0.905, weightedf1: 0.905[0m
[92m2020-09-10 19:57:12[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m


HBox(children=(FloatProgress(value=0.0, max=500.0), HTML(value='')))

[92m2020-09-10 19:57:35[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [0;33m09/10/2020_19:57:35 -- Epoch: 4/20; Train; loss: 0.154; acc: 0.941; precision: 0.934, recall: 0.949, macrof1: 0.941, weightedf1: 0.941[0m


HBox(children=(FloatProgress(value=0.0, max=397.0), HTML(value='')))

[92m2020-09-10 19:57:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;31m09/10/2020_19:57:43 -- Epoch: 4/20; Valid; loss: 0.241; acc: 0.910; precision: 0.891, recall: 0.933, macrof1: 0.910, weightedf1: 0.910[0m
[92m2020-09-10 19:57:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model (early stopped) with least valid loss (checkpoint: 3) at ./models/wikigaz_en_ft_ocr_rnn_v002_n32000/wikigaz_en_ft_ocr_rnn_v002_n32000.model[0m
[92m2020-09-10 19:57:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [1;32msaving the model[0m
[92m2020-09-10 19:57:43[0m [95mlwm-embeddings[0m [1m[90m[INFO][0m [2;32mEarly stopping at epoch: 4, selected epoch: 3[0m




User time: 123.0453
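
The fine-tuning cells in this part of the notebook differ only in `n_ft_examples`, plus the input file for the 64000-example run. As a sketch, their arguments could be gathered in one place. The helper `finetune_kwargs` is hypothetical; the keyword names and paths are the ones used in the cells of this notebook.

```python
# Keyword arguments mirror the dm_finetune calls in the surrounding cells.
PRETRAINED_DIR = "./models/wikigaz_en_rnn_001"


def finetune_kwargs(n_ft_examples, input_file="./inputs/input_dfm_rnn_model_B.yaml"):
    """Build the keyword arguments for one fine-tuning run (hypothetical helper)."""
    return {
        "input_file_path": input_file,
        "dataset_path": "./dataset/ocr_trainval.txt",
        "model_name": f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
        "pretrained_model_path": f"{PRETRAINED_DIR}/wikigaz_en_rnn_001.model",
        "pretrained_vocab_path": f"{PRETRAINED_DIR}/wikigaz_en_rnn_001.vocab",
        "n_train_examples": n_ft_examples,
    }


runs = [finetune_kwargs(n) for n in (4000, 8000, 16000, 32000)]
# The 64000-example cell reads a different input file.
runs.append(finetune_kwargs(64000, "./inputs/input_dfm_rnn_model_B_no_early_stopping.yaml"))

for kwargs in runs:
    print(kwargs["model_name"], kwargs["input_file_path"])
```

Each dictionary can then be passed on as `dm_finetune(**kwargs)` after `from DeezyMatch import finetune as dm_finetune`, which keeps the runs in step if a path changes.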


In [73]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 64000

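# Fine-tune the pretrained model at pretrained_model_path (vocabulary at pretrained_vocab_path)
# on n_ft_examples training examples, using the model B configuration whose file name
# indicates that early stopping is disabled.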
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_B_no_early_stopping.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 19:57:43 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_rnn_model_B_no_early_stopping.yaml
2020-09-10 19:57:43 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 19:57:43 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 19:57:44 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 19:57:44 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 19:57:44 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.0603642463684082
2020-09-10 19:57:44 lwm-embeddings [INFO] splits are as follow:
train    64000
val      20603
Name: split, dtype: int64
2020-09-10 19:57:46 lwm-embeddings [INFO] skipping 0 lines



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



2020-09-10 19:57:47 lwm-embeddings [INFO] ******************************
2020-09-10 19:57:47 lwm-embeddings [INFO] **** (Bi-directional) RNN ****
2020-09-10 19:57:47 lwm-embeddings [INFO] ******************************

Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)
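
The per-layer counts in the summary above can be re-derived from the listed weight shapes. The sketch below does this in plain Python. The `shapes` dictionary and the `count_params` helper are illustrative names, not part of DeezyMatch, and the trainable count assumes that the `True`/`False` flag in the parameter listing marks whether a tensor is updated during fine-tuning.

```python
from math import prod

# Weight shapes copied from the printed model summary.
shapes = {
    "emb": [(7542, 60)],
    # Two directions for layer 0, then two directions for layer 1.
    "rnn_1": [(60, 60), (60, 60), (60,), (60,)] * 2
             + [(60, 120), (60, 60), (60,), (60,)] * 2,
    "attn_step1": [(60, 120), (60,)],
    "attn_step2": [(1, 60), (1,)],
    "fc1": [(120, 960), (120,)],
    "fc2": [(2, 120), (2,)],
}


def count_params(layer_shapes):
    """Sum the number of elements over all tensors of one layer."""
    return sum(prod(shape) for shape in layer_shapes)


per_layer = {name: count_params(s) for name, s in shapes.items()}
total = sum(per_layer.values())

# Assumption: emb.weight is frozen because it is listed with the flag False.
frozen = per_layer["emb"]
trainable = total - frozen

print(per_layer)
print("total:", total, "trainable:", trainable)
```

Under that assumption, fine-tuning in this configuration updates 159,363 of the 611,883 parameters, since the embedding matrix alone accounts for 452,520.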


2020-09-10 19:58:33 lwm-embeddings [INFO] 09/10/2020_19:58:33 -- Epoch: 1/10; Train; loss: 0.356; acc: 0.842; precision: 0.830, recall: 0.860, macrof1: 0.842, weig
2020-09-10 19:58:40 lwm-embeddings [INFO] 09/10/2020_19:58:40 -- Epoch: 1/10; Valid; loss: 0.254; acc: 0.897; precision: 0.871, recall: 0.932, macrof1: 0.897, weightedf1: 0.897
2020-09-10 19:58:40 lwm-embeddings [INFO] saving the model

2020-09-10 19:59:26 lwm-embeddings [INFO] 09/10/2020_19:59:26 -- Epoch: 2/10; Train; loss: 0.210; acc: 0.918; precision: 0.908, recall: 0.929, macrof1: 0.918, weightedf1: 0.918
2020-09-10 19:59:32 lwm-embeddings [INFO] 09/10/2020_19:59:32 -- Epoch: 2/10; Valid; loss: 0.212; acc: 0.920; precision: 0.912, recall: 0.928, macrof1: 0.920, weightedf1: 0.920
2020-09-10 19:59:32 lwm-embeddings [INFO] saving the model

2020-09-10 20:00:18 lwm-embeddings [INFO] 09/10/2020_20:00:18 -- Epoch: 3/10; Train; loss: 0.167; acc: 0.936; precision: 0.928, recall: 0.946, macrof1: 0.936, weightedf1: 0.936
2020-09-10 20:00:25 lwm-embeddings [INFO] 09/10/2020_20:00:25 -- Epoch: 3/10; Valid; loss: 0.202; acc: 0.925; precision: 0.911, recall: 0.941, macrof1: 0.925, weightedf1: 0.925
2020-09-10 20:00:25 lwm-embeddings [INFO] saving the model

2020-09-10 20:01:11 lwm-embeddings [INFO] 09/10/2020_20:01:11 -- Epoch: 4/10; Train; loss: 0.140; acc: 0.948; precision: 0.940, recall: 0.957, macrof1: 0.948, weightedf1: 0.948
2020-09-10 20:01:17 lwm-embeddings [INFO] 09/10/2020_20:01:17 -- Epoch: 4/10; Valid; loss: 0.203; acc: 0.922; precision: 0.915, recall: 0.931, macrof1: 0.922, weightedf1: 0.922
2020-09-10 20:01:17 lwm-embeddings [INFO] saving the model

2020-09-10 20:02:03 lwm-embeddings [INFO] 09/10/2020_20:02:03 -- Epoch: 5/10; Train; loss: 0.121; acc: 0.956; precision: 0.950, recall: 0.962, macrof1: 0.956, weightedf1: 0.956
2020-09-10 20:02:10 lwm-embeddings [INFO] 09/10/2020_20:02:10 -- Epoch: 5/10; Valid; loss: 0.185; acc: 0.934; precision: 0.925, recall: 0.944, macrof1: 0.934, weightedf1: 0.934
2020-09-10 20:02:10 lwm-embeddings [INFO] saving the model


2020-09-10 20:02:56 lwm-embeddings [INFO] 09/10/2020_20:02:56 -- Epoch: 6/10; Train; loss: 0.108; acc: 0.960; precision: 0.954, recall: 0.966, macrof1: 0.960, weightedf1: 0.960
2020-09-10 20:03:03 lwm-embeddings [INFO] 09/10/2020_20:03:03 -- Epoch: 6/10; Valid; loss: 0.186; acc: 0.935; precision: 0.916, recall: 0.958, macrof1: 0.935, weightedf1: 0.935
2020-09-10 20:03:03 lwm-embeddings [INFO] saving the model

2020-09-10 20:03:49 lwm-embeddings [INFO] 09/10/2020_20:03:49 -- Epoch: 7/10; Train; loss: 0.094; acc: 0.965; precision: 0.960, recall: 0.971, macrof1: 0.965, weightedf1: 0.965
2020-09-10 20:03:55 lwm-embeddings [INFO] 09/10/2020_20:03:55 -- Epoch: 7/10; Valid; loss: 0.188; acc: 0.935; precision: 0.929, recall: 0.942, macrof1: 0.935, weightedf1: 0.935
2020-09-10 20:03:55 lwm-embeddings [INFO] saving the model

2020-09-10 20:04:41 lwm-embeddings [INFO] 09/10/2020_20:04:41 -- Epoch: 8/10; Train; loss: 0.087; acc: 0.969; precision: 0.964, recall: 0.974, macrof1: 0.969, weightedf1: 0.969
2020-09-10 20:04:48 lwm-embeddings [INFO] 09/10/2020_20:04:48 -- Epoch: 8/10; Valid; loss: 0.180; acc: 0.938; precision: 0.930, recall: 0.947, macrof1: 0.938, weightedf1: 0.938
2020-09-10 20:04:48 lwm-embeddings [INFO] saving the model

2020-09-10 20:05:34 lwm-embeddings [INFO] 09/10/2020_20:05:34 -- Epoch: 9/10; Train; loss: 0.078; acc: 0.972; precision: 0.968, recall: 0.976, macrof1: 0.972, weightedf1: 0.972
2020-09-10 20:05:41 lwm-embeddings [INFO] 09/10/2020_20:05:41 -- Epoch: 9/10; Valid; loss: 0.201; acc: 0.935; precision: 0.941, recall: 0.928, macrof1: 0.935, weightedf1: 0.935
2020-09-10 20:05:41 lwm-embeddings [INFO] saving the model

2020-09-10 20:06:27 lwm-embeddings [INFO] 09/10/2020_20:06:27 -- Epoch: 10/10; Train; loss: 0.071; acc: 0.974; precision: 0.971, recall: 0.978, macrof1: 0.974, weightedf1: 0.974
2020-09-10 20:06:33 lwm-embeddings [INFO] 09/10/2020_20:06:33 -- Epoch: 10/10; Valid; loss: 0.194; acc: 0.936; precision: 0.931, recall: 0.942, macrof1: 0.936, weightedf1: 0.936
2020-09-10 20:06:33 lwm-embeddings [INFO] saving the model

2020-09-10 20:06:33 lwm-embeddings [INFO] saving the model with least valid loss (checkpoint: 8) at ./models/wikigaz_en_ft_ocr_rnn_v002_n64000/wikigaz_en_ft_ocr_rnn_v002_n64000.model



User time: 526.3548
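
The checkpoint reported at the end of this run (checkpoint 8) can be cross-checked against the per-epoch validation losses in the log. This is a minimal sketch that uses the rounded values as printed. `best_checkpoint` is a hypothetical helper, and DeezyMatch's own selection code may differ, for example in how it breaks ties.

```python
# Validation loss per epoch, copied from the "Valid" log lines of the
# n=64000 fine-tuning run (epochs 1 to 10).
valid_loss = [0.254, 0.212, 0.202, 0.203, 0.185, 0.186, 0.188, 0.180, 0.201, 0.194]


def best_checkpoint(losses):
    """Return the 1-based epoch with the lowest loss (the earliest one on ties)."""
    best_index = min(range(len(losses)), key=losses.__getitem__)
    return best_index + 1


best = best_checkpoint(valid_loss)
print("selected epoch:", best, "validation loss:", valid_loss[best - 1])
```

The sketch selects epoch 8 (validation loss 0.180), which agrees with the checkpoint named in the final log line. The training loss keeps falling through epoch 10, so the last epoch is not the one that gets saved as the final model.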


In [74]:
from DeezyMatch import finetune as dm_finetune

n_ft_examples = 84000

# Fine-tune the pretrained model (weights at pretrained_model_path, vocabulary at pretrained_vocab_path) on n_ft_examples training pairs
dm_finetune(input_file_path="./inputs/input_dfm_rnn_model_B_no_early_stopping.yaml", 
            dataset_path="./dataset/ocr_trainval.txt", 
            model_name=f"wikigaz_en_ft_ocr_rnn_v002_n{n_ft_examples}",
            pretrained_model_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.model", 
            pretrained_vocab_path="./models/wikigaz_en_rnn_001/wikigaz_en_rnn_001.vocab",
            n_train_examples=n_ft_examples
           )

2020-09-10 20:06:33 lwm-embeddings [INFO] read input file: ./inputs/input_dfm_rnn_model_B_no_early_stopping.yaml
2020-09-10 20:06:33 lwm-embeddings [INFO] pytorch will use: cuda:1
2020-09-10 20:06:33 lwm-embeddings [INFO] read CSV file: ./dataset/ocr_trainval.txt
2020-09-10 20:06:34 lwm-embeddings [INFO] number of labels, True: 42301 and False: 42302
2020-09-10 20:06:34 lwm-embeddings [INFO] Splitting the Dataset
2020-09-10 20:06:34 lwm-embeddings [INFO] finish splitting the Dataset. User time: 0.055474281311035156
2020-09-10 20:06:34 lwm-embeddings [INFO] splits are as follow:
train    84000
val        603
Name: split, dtype: int64
2020-09-10 20:06:36 lwm-embeddings [INFO] skipping 0 lines



List all parameters in the model
emb.weight False
rnn_1.weight_ih_l0 True
rnn_1.weight_hh_l0 True
rnn_1.bias_ih_l0 True
rnn_1.bias_hh_l0 True
rnn_1.weight_ih_l0_reverse True
rnn_1.weight_hh_l0_reverse True
rnn_1.bias_ih_l0_reverse True
rnn_1.bias_hh_l0_reverse True
rnn_1.weight_ih_l1 True
rnn_1.weight_hh_l1 True
rnn_1.bias_ih_l1 True
rnn_1.bias_hh_l1 True
rnn_1.weight_ih_l1_reverse True
rnn_1.weight_hh_l1_reverse True
rnn_1.bias_ih_l1_reverse True
rnn_1.bias_hh_l1_reverse True
attn_step1.weight True
attn_step1.bias True
attn_step2.weight True
attn_step2.bias True
fc1.weight True
fc1.bias True
fc2.weight True
fc2.bias True



2020-09-10 20:06:37 lwm-embeddings [INFO] ******************************
2020-09-10 20:06:37 lwm-embeddings [INFO] **** (Bi-directional) RNN ****
2020-09-10 20:06:37 lwm-embeddings [INFO] ******************************

Total number of params: 611883

two_parallel_rnns (
  (emb): Embedding(7542, 60), weights=((7542, 60),), parameters=452520
  (rnn_1): RNN(60, 60, num_layers=2, dropout=0.01, bidirectional=True), weights=((60, 60), (60, 60), (60,), (60,), (60, 60), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,), (60, 120), (60, 60), (60,), (60,)), parameters=36480
  (attn_step1): Linear(in_features=120, out_features=60, bias=True), weights=((60, 120), (60,)), parameters=7260
  (attn_step2): Linear(in_features=60, out_features=1, bias=True), weights=((1, 60), (1,)), parameters=61
  (fc1): Linear(in_features=960, out_features=120, bias=True), weights=((120, 960), (120,)), parameters=115320
  (fc2): Linear(in_features=120, out_features=2, bias=True), weights=((2, 120), (2,)), parameters=242
)


2020-09-10 20:07:37 lwm-embeddings [INFO] 09/10/2020_20:07:37 -- Epoch: 1/10; Train; loss: 0.326; acc: 0.858; precision: 0.845, recall: 0.877, macrof1: 0.858, weig
2020-09-10 20:07:37 lwm-embeddings [INFO] 09/10/2020_20:07:37 -- Epoch: 1/10; Valid; loss: 0.239; acc: 0.887; precision: 0.852, recall: 0.937, macrof1: 0.887, weightedf1: 0.887
2020-09-10 20:07:37 lwm-embeddings [INFO] saving the model

2020-09-10 20:08:38 lwm-embeddings [INFO] 09/10/2020_20:08:38 -- Epoch: 2/10; Train; loss: 0.193; acc: 0.925; precision: 0.916, recall: 0.936, macrof1: 0.925, weightedf1: 0.925
2020-09-10 20:08:38 lwm-embeddings [INFO] 09/10/2020_20:08:38 -- Epoch: 2/10; Valid; loss: 0.160; acc: 0.930; precision: 0.925, recall: 0.937, macrof1: 0.930, weightedf1: 0.930
2020-09-10 20:08:38 lwm-embeddings [INFO] saving the model

2020-09-10 20:09:38 lwm-embeddings [INFO] 09/10/2020_20:09:38 -- Epoch: 3/10; Train; loss: 0.158; acc: 0.941; precision: 0.932, recall: 0.951, macrof1: 0.940, weightedf1: 0.940
2020-09-10 20:09:38 lwm-embeddings [INFO] 09/10/2020_20:09:38 -- Epoch: 3/10; Valid; loss: 0.153; acc: 0.927; precision: 0.903, recall: 0.957, macrof1: 0.927, weightedf1: 0.927
2020-09-10 20:09:38 lwm-embeddings [INFO] saving the model

2020-09-10 20:10:39 lwm-embeddings [INFO] 09/10/2020_20:10:39 -- Epoch: 4/10; Train; loss: 0.134; acc: 0.951; precision: 0.944, recall: 0.960, macrof1: 0.951, weightedf1: 0.951
2020-09-10 20:10:39 lwm-embeddings [INFO] 09/10/2020_20:10:39 -- Epoch: 4/10; Valid; loss: 0.145; acc: 0.940; precision: 0.929, recall: 0.953, macrof1: 0.940, weightedf1: 0.940
2020-09-10 20:10:39 lwm-embeddings [INFO] saving the model

2020-09-10 20:11:39 lwm-embeddings [INFO] 09/10/2020_20:11:39 -- Epoch: 5/10; Train; loss: 0.122; acc: 0.955; precision: 0.949, recall: 0.963, macrof1: 0.955, weightedf1: 0.955
2020-09-10 20:11:39 lwm-embeddings [INFO] 09/10/2020_20:11:39 -- Epoch: 5/10; Valid; loss: 0.141; acc: 0.940; precision: 0.915, recall: 0.970, macrof1: 0.940, weightedf1: 0.940
2020-09-10 20:11:39 lwm-embeddings [INFO] saving the model


2020-09-10 20:12:39 lwm-embeddings [INFO] 09/10/2020_20:12:39 -- Epoch: 6/10; Train; loss: 0.112; acc: 0.959; precision: 0.953, recall: 0.967, macrof1: 0.959, weightedf1: 0.959
2020-09-10 20:12:39 lwm-embeddings [INFO] 09/10/2020_20:12:39 -- Epoch: 6/10; Valid; loss: 0.121; acc: 0.947; precision: 0.927, recall: 0.970, macrof1: 0.947, weightedf1: 0.947
2020-09-10 20:12:39 lwm-embeddings [INFO] saving the model

2020-09-10 20:13:39 lwm-embeddings [INFO] 09/10/2020_20:13:39 -- Epoch: 7/10; Train; loss: 0.100; acc: 0.964; precision: 0.958, recall: 0.970, macrof1: 0.964, weightedf1: 0.964
2020-09-10 20:13:40 lwm-embeddings [INFO] 09/10/2020_20:13:40 -- Epoch: 7/10; Valid; loss: 0.143; acc: 0.952; precision: 0.939, recall: 0.967, macrof1: 0.952, weightedf1: 0.952
2020-09-10 20:13:40 lwm-embeddings [INFO] saving the model

2020-09-10 20:14:40 lwm-embeddings [INFO] 09/10/2020_20:14:40 -- Epoch: 8/10; Train; loss: 0.090; acc: 0.967; precision: 0.961, recall: 0.974, macrof1: 0.967, weightedf1: 0.967
2020-09-10 20:14:40 lwm-embeddings [INFO] 09/10/2020_20:14:40 -- Epoch: 8/10; Valid; loss: 0.123; acc: 0.949; precision: 0.935, recall: 0.963, macrof1: 0.949, weightedf1: 0.949
2020-09-10 20:14:40 lwm-embeddings [INFO] saving the model

2020-09-10 20:15:41 lwm-embeddings [INFO] 09/10/2020_20:15:41 -- Epoch: 9/10; Train; loss: 0.086; acc: 0.969; precision: 0.964, recall: 0.974, macrof1: 0.969, weightedf1: 0.969
2020-09-10 20:15:41 lwm-embeddings [INFO] 09/10/2020_20:15:41 -- Epoch: 9/10; Valid; loss: 0.106; acc: 0.957; precision: 0.948, recall: 0.967, macrof1: 0.957, weightedf1: 0.957
2020-09-10 20:15:41 lwm-embeddings [INFO] saving the model

2020-09-10 20:16:41 lwm-embeddings [INFO] 09/10/2020_20:16:41 -- Epoch: 10/10; Train; loss: 0.080; acc: 0.971; precision: 0.966, recall: 0.976, macrof1: 0.971, weightedf1: 0.971
2020-09-10 20:16:41 lwm-embeddings [INFO] 09/10/2020_20:16:41 -- Epoch: 10/10; Valid; loss: 0.117; acc: 0.957; precision: 0.954, recall: 0.960, macrof1: 0.957, weightedf1: 0.957
2020-09-10 20:16:41 lwm-embeddings [INFO] saving the model

2020-09-10 20:16:41 lwm-embeddings [INFO] saving the model with least valid loss (checkpoint: 9) at ./models/wikigaz_en_ft_ocr_rnn_v002_n84000/wikigaz_en_ft_ocr_rnn_v002_n84000.model



User time: 604.0644
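
When these two runs were executed, the notebook's progress bars counted 1000 training and 322 validation batches per epoch for n=64000, and 1313 and 10 for n=84000. The sketch below checks that these totals are consistent with the logged split sizes. The batch size of 64 is an assumption inferred from those totals. It is not read from the YAML input file, which is not shown here, and `n_batches` is a hypothetical helper.

```python
from math import ceil

# Assumption: batch size inferred from the progress-bar totals,
# not taken from the (unseen) YAML input file.
BATCH_SIZE = 64

# (train, val) split sizes as logged for each fine-tuning run.
splits = {64000: (64000, 20603), 84000: (84000, 603)}


def n_batches(n_examples, batch_size=BATCH_SIZE):
    """Mini-batches per epoch when the last, partial batch is kept."""
    return ceil(n_examples / batch_size)


batches = {n: (n_batches(train), n_batches(val)) for n, (train, val) in splits.items()}
print(batches)
```

With 84,000 of the 84,603 pairs used for training, only 603 remain for validation, against 20,603 in the n=64000 run. The validation metrics of the two runs are therefore computed on different sets and are not directly comparable.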
