# T-NER: Model Training Example
An example of using [T-NER](https://github.com/asahi417/tner) to finetune and evaluate a language model on NER.

***Table of Contents***  
- [Finetuning & Evaluation on single dataset](https://colab.research.google.com/drive/1AlcTbEsp8W11yflT7SyT0L4C4HG6MXYr#scrollTo=23QyG8ypSILQ&line=2&uniqifier=1)
- [Finetuning & Evaluation on multiple datasets](https://colab.research.google.com/drive/1AlcTbEsp8W11yflT7SyT0L4C4HG6MXYr#scrollTo=L7R5qjXRdPWb&line=2&uniqifier=1)
- [Finetuning & Evaluation on a custom dataset](https://colab.research.google.com/drive/1AlcTbEsp8W11yflT7SyT0L4C4HG6MXYr#scrollTo=nB6i22foeCjV&line=1&uniqifier=1)

### Setup

In [1]:
# main package
%pip install tner -U
%pip list | grep tner

tner                          0.1.9


In [2]:
import logging
from tner import GridSearcher, TransformersNER

logging.basicConfig(format='%(asctime)s %(levelname)-8s %(message)s', level=logging.INFO, datefmt='%Y-%m-%d %H:%M:%S')
logger = logging.getLogger()
logger.setLevel(logging.INFO)

## Finetuning
Let's finetune `roberta-large` on `tner/btc`!
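
The `GridSearcher` call in the next cell passes lists for several hyperparameters, and its log reports `4 configs to try`. The sketch below reproduces that count, assuming the grid is the Cartesian product of the list-valued arguments; this assumption is consistent with the log but is not taken from the T-NER source. The name `search_space` is illustrative only.

```python
from itertools import product

# List-valued arguments copied from the GridSearcher call in this section.
search_space = {
    "gradient_accumulation_steps": [2],
    "crf": [True],
    "lr": [1e-3, 1e-4],
    "weight_decay": [None],
    "random_seed": [42],
    "lr_warmup_step_ratio": [0.1],
    "max_grad_norm": [None, 10],
}

# Cartesian product: one dict per candidate configuration.
configs = [
    dict(zip(search_space, values))
    for values in product(*search_space.values())
]

print(len(configs))  # 4
for config in configs:
    print(config["lr"], config["max_grad_norm"])
```

Only `lr` and `max_grad_norm` have two values each, so the grid has 2 x 2 = 4 entries. Going by the comments in the cell, each configuration is trained for `epoch_partial` epochs first, and the best `n_max_config` of them continue up to `epoch`.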


In [3]:
searcher = GridSearcher(
   checkpoint_dir='./ckpt_bert_bionlp2004',
   dataset="tner/btc",  # either of `dataset` (huggingface dataset) or `local_dataset` (custom dataset) should be given
   model="roberta-large",  # language model to fine-tune
   epoch=10,  # the total epoch (`L` in the figure)
   epoch_partial=5,  # the number of epoch at 1st stage (`M` in the figure)
   n_max_config=1,  # the number of models to pass to 2nd stage (`K` in the figure)
   batch_size=32,
   gradient_accumulation_steps=[2],
   crf=[True],
   lr=[1e-3, 1e-4],
   weight_decay=[None],
   random_seed=[42],
   lr_warmup_step_ratio=[0.1],
   max_grad_norm=[None, 10]
)
searcher.train()

INFO:root:INITIALIZE GRID SEARCHER: 4 configs to try
INFO:root:## 1st RUN: Configuration 0/4 ##
INFO:root:hyperparameters
INFO:root:	 * dataset: tner/btc
INFO:root:	 * dataset_split: train
INFO:root:	 * dataset_name: None
INFO:root:	 * local_dataset: None
INFO:root:	 * model: roberta-large
INFO:root:	 * crf: True
INFO:root:	 * max_length: 128
INFO:root:	 * epoch: 10
INFO:root:	 * batch_size: 32
INFO:root:	 * lr: 0.001
INFO:root:	 * random_seed: 42
INFO:root:	 * gradient_accumulation_steps: 2
INFO:root:	 * weight_decay: None
INFO:root:	 * lr_warmup_step_ratio: 0.1
INFO:root:	 * max_grad_norm: None


Downloading builder script:   0%|          | 0.00/3.81k [00:00<?, ?B/s]

Downloading and preparing dataset btc/btc to /root/.cache/huggingface/datasets/tner___btc/btc/1.0.0/ba7dfb92f97e0ae98ff48df9a35a6821285a4b3d3ed284bb177e17f7562c3c37...


Downloading data files:   0%|          | 0/3 [00:00<?, ?it/s]

Downloading data:   0%|          | 0.00/464k [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/1.33M [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/196k [00:00<?, ?B/s]

Extracting data files:   0%|          | 0/3 [00:00<?, ?it/s]

Generating train split: 0 examples [00:00, ? examples/s]

INFO:datasets_modules.datasets.tner--btc.ba7dfb92f97e0ae98ff48df9a35a6821285a4b3d3ed284bb177e17f7562c3c37.btc:generating examples from = /root/.cache/huggingface/datasets/downloads/f2cfc12e556282a47bd84b744336cebe8b91991572a47d159862eadbd863faee


Generating validation split: 0 examples [00:00, ? examples/s]

INFO:datasets_modules.datasets.tner--btc.ba7dfb92f97e0ae98ff48df9a35a6821285a4b3d3ed284bb177e17f7562c3c37.btc:generating examples from = /root/.cache/huggingface/datasets/downloads/95d25fc1504243df57f0989f29e46953fc62187453a4f3bdcfe6ba715687c0a6


Generating test split: 0 examples [00:00, ? examples/s]

INFO:datasets_modules.datasets.tner--btc.ba7dfb92f97e0ae98ff48df9a35a6821285a4b3d3ed284bb177e17f7562c3c37.btc:generating examples from = /root/.cache/huggingface/datasets/downloads/f2be6f07f5dbe588f52886a5f118fb1abebc00759ef7be685832f8347f729e50


Dataset btc downloaded and prepared to /root/.cache/huggingface/datasets/tner___btc/btc/1.0.0/ba7dfb92f97e0ae98ff48df9a35a6821285a4b3d3ed284bb177e17f7562c3c37. Subsequent calls will reuse this data.


  0%|          | 0/3 [00:00<?, ?it/s]

INFO:root:initialize language model with `roberta-large`


Downloading:   0%|          | 0.00/482 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/1.33G [00:00<?, ?B/s]

Some weights of the model checkpoint at roberta-large were not used when initializing RobertaForTokenClassification: ['lm_head.decoder.weight', 'lm_head.layer_norm.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.bias', 'lm_head.bias', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForTokenClassification were not initialized from the model checkpoint at roberta-large and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be ab

Downloading:   0%|          | 0.00/878k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/446k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/1.29M [00:00<?, ?B/s]

INFO:root:dataset preprocessing
INFO:root:encode all the data: 6338
INFO:root:preprocessed feature is saved at ./ckpt_bert_bionlp2004/model_jwdmql/cache/encoded_feature.pkl
INFO:root:start model training
INFO:root:	 * global step 50: loss: 610.94, lr: 0.0002525252525252525
INFO:root:	 * global step 100: loss: 490.2, lr: 0.000505050505050505
INFO:root:	 * global step 150: loss: 521.49, lr: 0.0007575757575757576
INFO:root:[epoch 0/10] average loss: 536.3, lr: 0.001
INFO:root:model saving at ./ckpt_bert_bionlp2004/model_jwdmql/epoch_1
INFO:root:saving model weight at ./ckpt_bert_bionlp2004/model_jwdmql/epoch_1
INFO:root:saving tokenizer at ./ckpt_bert_bionlp2004/model_jwdmql/epoch_1
INFO:root:optimizer saving at ./ckpt_bert_bionlp2004/model_jwdmql/optimizers/optimizer.1.pt
INFO:root:remove old optimizer files
INFO:root:	 * global step 50: loss: 582.65, lr: 0.0009719416386083053
INFO:root:	 * global step 100: loss: 570.31, lr: 0.0009438832772166106
INFO:root:	 * global step 150: loss: 569.

  0%|          | 0/3 [00:00<?, ?it/s]

INFO:root:initialize language model with `roberta-large`
Some weights of the model checkpoint at roberta-large were not used when initializing RobertaForTokenClassification: ['lm_head.decoder.weight', 'lm_head.layer_norm.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.bias', 'lm_head.bias', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForTokenClassification were not initialized from the model checkpoint at roberta-large and are newly initialized: ['classifier.bias', 'classifier.weight']
You should

  0%|          | 0/3 [00:00<?, ?it/s]

INFO:root:encode all the data: 1001
INFO:root:preprocessed feature is saved at ./ckpt_bert_bionlp2004/encoded/roberta-large.128.dev.True.validation.pkl
100%|██████████| 63/63 [00:30<00:00,  2.07it/s]
INFO:root:downloading `unified_label2id.json` from https://raw.githubusercontent.com/asahi417/tner/master/unified_label2id.json
INFO:root:map entity into shared label set {'location': ['LOCATION', 'LOC', 'location', 'Location'], 'organization': ['ORGANIZATION', 'ORG', 'organization'], 'person': ['PERSON', 'PSN', 'person', 'PER'], 'date': ['DATE', 'DAT', 'YEAR', 'Year'], 'time': ['TIME', 'TIM', 'Hours'], 'artifact': ['ARTIFACT', 'ART', 'artifact'], 'percent': ['PERCENT', 'PNT'], 'other': ['OTHER', 'MISC'], 'money': ['MONEY', 'MNY', 'Price'], 'corporation': ['corporation'], 'group': ['group', 'NORP'], 'product': ['product', 'PRODUCT'], 'rating': ['Rating', 'RATING'], 'amenity': ['Amenity'], 'restaurant': ['Restaurant_Name'], 'dish': ['Dish'], 'cuisine': ['Cuisine'], 'actor': ['ACTOR', 'Actor

  0%|          | 0/3 [00:00<?, ?it/s]

INFO:root:loading preprocessed feature from ./ckpt_bert_bionlp2004/encoded/roberta-large.128.dev.True.validation.pkl
100%|██████████| 63/63 [00:30<00:00,  2.04it/s]
INFO:root:map entity into shared label set {'location': ['LOCATION', 'LOC', 'location', 'Location'], 'organization': ['ORGANIZATION', 'ORG', 'organization'], 'person': ['PERSON', 'PSN', 'person', 'PER'], 'date': ['DATE', 'DAT', 'YEAR', 'Year'], 'time': ['TIME', 'TIM', 'Hours'], 'artifact': ['ARTIFACT', 'ART', 'artifact'], 'percent': ['PERCENT', 'PNT'], 'other': ['OTHER', 'MISC'], 'money': ['MONEY', 'MNY', 'Price'], 'corporation': ['corporation'], 'group': ['group', 'NORP'], 'product': ['product', 'PRODUCT'], 'rating': ['Rating', 'RATING'], 'amenity': ['Amenity'], 'restaurant': ['Restaurant_Name'], 'dish': ['Dish'], 'cuisine': ['Cuisine'], 'actor': ['ACTOR', 'Actor'], 'title': ['TITLE'], 'genre': ['GENRE', 'Genre'], 'director': ['DIRECTOR', 'Director'], 'song': ['SONG'], 'plot': ['PLOT', 'Plot'], 'review': ['REVIEW'], 'chara

  0%|          | 0/3 [00:00<?, ?it/s]

INFO:root:dataset preprocessing
INFO:root:load optimizer from ./ckpt_bert_bionlp2004/model_hexdvx/optimizers/optimizer.5.pt
INFO:root:optimizer is loading on cuda
INFO:root:load scheduler from ./ckpt_bert_bionlp2004/model_hexdvx/optimizers/optimizer.5.pt
INFO:root:loading preprocessed feature from ./ckpt_bert_bionlp2004/model_hexdvx/cache/encoded_feature.pkl
INFO:root:start model training
INFO:root:	 * global step 50: loss: 31.52, lr: 5.274971941638609e-05
INFO:root:	 * global step 100: loss: 32.02, lr: 4.994388327721661e-05
INFO:root:	 * global step 150: loss: 30.15, lr: 4.713804713804714e-05
INFO:root:[epoch 5/10] average loss: 28.9, lr: 4.4444444444444447e-05
INFO:root:model saving at ./ckpt_bert_bionlp2004/model_hexdvx/epoch_6
INFO:root:saving model weight at ./ckpt_bert_bionlp2004/model_hexdvx/epoch_6
INFO:root:saving tokenizer at ./ckpt_bert_bionlp2004/model_hexdvx/epoch_6
INFO:root:optimizer saving at ./ckpt_bert_bionlp2004/model_hexdvx/optimizers/optimizer.6.pt
INFO:root:remove

  0%|          | 0/3 [00:00<?, ?it/s]

INFO:root:loading preprocessed feature from ./ckpt_bert_bionlp2004/encoded/roberta-large.128.dev.True.validation.pkl
100%|██████████| 63/63 [00:31<00:00,  1.97it/s]
INFO:root:map entity into shared label set {'location': ['LOCATION', 'LOC', 'location', 'Location'], 'organization': ['ORGANIZATION', 'ORG', 'organization'], 'person': ['PERSON', 'PSN', 'person', 'PER'], 'date': ['DATE', 'DAT', 'YEAR', 'Year'], 'time': ['TIME', 'TIM', 'Hours'], 'artifact': ['ARTIFACT', 'ART', 'artifact'], 'percent': ['PERCENT', 'PNT'], 'other': ['OTHER', 'MISC'], 'money': ['MONEY', 'MNY', 'Price'], 'corporation': ['corporation'], 'group': ['group', 'NORP'], 'product': ['product', 'PRODUCT'], 'rating': ['Rating', 'RATING'], 'amenity': ['Amenity'], 'restaurant': ['Restaurant_Name'], 'dish': ['Dish'], 'cuisine': ['Cuisine'], 'actor': ['ACTOR', 'Actor'], 'title': ['TITLE'], 'genre': ['GENRE', 'Genre'], 'director': ['DIRECTOR', 'Director'], 'song': ['SONG'], 'plot': ['PLOT', 'Plot'], 'review': ['REVIEW'], 'chara


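The `lr` values in the training log above are consistent with a linear warmup followed by a linear decay, the same shape that `get_linear_schedule_with_warmup` in Hugging Face Transformers produces. The sketch below is a plain-Python reconstruction under that assumption, not T-NER's own code. The figure of 198 optimizer steps per epoch is inferred from the logged values and does not appear anywhere in the log, and the function name is hypothetical.

```python
def linear_warmup_decay_lr(step, peak_lr, total_steps, warmup_ratio):
    """Learning rate after `step` updates: linear warmup, then linear decay to zero."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0, total_steps - step) / (total_steps - warmup_steps)


STEPS_PER_EPOCH = 198  # assumption: inferred from the logged lr values
TOTAL_STEPS = 10 * STEPS_PER_EPOCH  # epoch=10

# 1st stage, lr=1e-3: step 50 of the first epoch (logged: 0.0002525252525252525)
print(linear_warmup_decay_lr(50, 1e-3, TOTAL_STEPS, 0.1))
# 1st stage, lr=1e-3: step 50 of the second epoch (logged: 0.0009719416386083053)
print(linear_warmup_decay_lr(STEPS_PER_EPOCH + 50, 1e-3, TOTAL_STEPS, 0.1))
# 2nd stage, lr=1e-4: step 50 after resuming from epoch 5 (logged: 5.274971941638609e-05)
print(linear_warmup_decay_lr(5 * STEPS_PER_EPOCH + 50, 1e-4, TOTAL_STEPS, 0.1))
```

With `lr_warmup_step_ratio=0.1` and 10 epochs, the warmup covers exactly the first epoch, which matches the `lr: 0.001` reported at the end of epoch 0.
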
### Evaluation
Now the best model is stored at `ckpt_bert_bionlp2004/best_model`, so let's load it and run evaluation on the test split.

First, we load the model and check the prediction.

In [4]:
model = TransformersNER("ckpt_bert_bionlp2004/best_model")
model.predict(["Jacob Collier is a Grammy awarded English artist from London"]) 

INFO:root:initialize language model with `ckpt_bert_bionlp2004/best_model`
INFO:root:use CRF
INFO:root:loading pre-trained CRF layer
INFO:root:label2id: {'B-LOC': 0, 'B-ORG': 1, 'B-PER': 2, 'I-LOC': 3, 'I-ORG': 4, 'I-PER': 5, 'O': 6}
INFO:root:device   : cuda
INFO:root:gpus     : 1
INFO:root:encode all the data: 1
100%|██████████| 1/1 [00:00<00:00,  6.94it/s]


{'prediction': [['B-PER',
   'I-PER',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'B-LOC']],
 'probability': [[0.9953048229217529,
   0.9996180534362793,
   0.9994888305664062,
   0.999830961227417,
   0.9918205738067627,
   0.9998334646224976,
   0.9804861545562744,
   0.9998149275779724,
   0.9991379976272583,
   0.9992963075637817]],
 'input': [['Jacob',
   'Collier',
   'is',
   'a',
   'Grammy',
   'awarded',
   'English',
   'artist',
   'from',
   'London']],
 'entity_prediction': [[{'type': 'PER',
    'entity': ['Jacob', 'Collier'],
    'position': [0, 1],
    'probability': [0.9953048229217529, 0.9996180534362793]},
   {'type': 'LOC',
    'entity': ['London'],
    'position': [9],
    'probability': [0.9992963075637817]}]]}
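
The `entity_prediction` field groups the token-level BIO tags in `prediction` into spans. The sketch below shows one way to do that grouping, using the tokens and tags from the output above. `bio_to_spans` is a hypothetical helper, not part of T-NER, and the library's own decoding may treat malformed tag sequences differently.

```python
def bio_to_spans(tokens, tags):
    """Group BIO tags into entity spans with type, tokens and token positions."""
    spans = []
    current = None
    for position, (token, tag) in enumerate(zip(tokens, tags)):
        prefix, _, entity_type = tag.partition("-")
        starts_new = prefix == "B" or (
            prefix == "I" and (current is None or current["type"] != entity_type)
        )
        if starts_new:
            # A stray I- tag with no matching open span is treated as a span start.
            current = {"type": entity_type, "entity": [token], "position": [position]}
            spans.append(current)
        elif prefix == "I":
            current["entity"].append(token)
            current["position"].append(position)
        else:
            # "O" closes any open span.
            current = None
    return spans


tokens = ["Jacob", "Collier", "is", "a", "Grammy", "awarded", "English", "artist", "from", "London"]
tags = ["B-PER", "I-PER", "O", "O", "O", "O", "O", "O", "O", "B-LOC"]
for span in bio_to_spans(tokens, tags):
    print(span)
```

For this sentence the helper yields a `PER` span at positions 0 and 1 and a `LOC` span at position 9, the same spans listed under `entity_prediction`.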

The model instance also has an `evaluate` method, which runs evaluation on a dataset.

In [5]:
metric = model.evaluate('tner/btc', dataset_split='test', batch_size=16)



  0%|          | 0/3 [00:00<?, ?it/s]

INFO:root:encode all the data: 2000
100%|██████████| 125/125 [01:02<00:00,  2.00it/s]
INFO:root:map entity into shared label set {'location': ['LOCATION', 'LOC', 'location', 'Location'], 'organization': ['ORGANIZATION', 'ORG', 'organization'], 'person': ['PERSON', 'PSN', 'person', 'PER'], 'date': ['DATE', 'DAT', 'YEAR', 'Year'], 'time': ['TIME', 'TIM', 'Hours'], 'artifact': ['ARTIFACT', 'ART', 'artifact'], 'percent': ['PERCENT', 'PNT'], 'other': ['OTHER', 'MISC'], 'money': ['MONEY', 'MNY', 'Price'], 'corporation': ['corporation'], 'group': ['group', 'NORP'], 'product': ['product', 'PRODUCT'], 'rating': ['Rating', 'RATING'], 'amenity': ['Amenity'], 'restaurant': ['Restaurant_Name'], 'dish': ['Dish'], 'cuisine': ['Cuisine'], 'actor': ['ACTOR', 'Actor'], 'title': ['TITLE'], 'genre': ['GENRE', 'Genre'], 'director': ['DIRECTOR', 'Director'], 'song': ['SONG'], 'plot': ['PLOT', 'Plot'], 'review': ['REVIEW'], 'character': ['CHARACTER'], 'ratings_average': ['RATINGS_AVERAGE'], 'trailer': ['TRAI

In [6]:
metric

{'micro/f1': 0.8289794496691048,
 'micro/f1_ci': {},
 'micro/recall': 0.8158135283363802,
 'micro/precision': 0.8425772952560774,
 'macro/f1': 0.7746975422092938,
 'macro/f1_ci': {},
 'macro/recall': 0.7608068393822244,
 'macro/precision': 0.7904438081295443,
 'per_entity_metric': {'location': {'f1': 0.736842105263158,
   'f1_ci': {},
   'precision': 0.7362637362637363,
   'recall': 0.7374213836477987},
  'organization': {'f1': 0.6774036115178135,
   'f1_ci': {},
   'precision': 0.7236704900938478,
   'recall': 0.636697247706422},
  'person': {'f1': 0.9098469098469099,
   'f1_ci': {},
   'precision': 0.9113971980310488,
   'recall': 0.9083018867924528}}}
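
As a quick sanity check on how these numbers relate, the sketch below recomputes two of them from the others: micro F1 as the harmonic mean of micro precision and recall, and macro F1 as the unweighted mean of the per-entity F1 scores. All inputs are copied from the `metric` output above; the variable names are illustrative.

```python
from statistics import mean

# Values copied from the `metric` output above.
micro_precision = 0.8425772952560774
micro_recall = 0.8158135283363802
per_entity_f1 = {
    "location": 0.736842105263158,
    "organization": 0.6774036115178135,
    "person": 0.9098469098469099,
}

# Micro F1: harmonic mean of the pooled precision and recall.
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)
# Macro F1: every entity type counts equally, regardless of frequency.
macro_f1 = mean(per_entity_f1.values())

print(round(micro_f1, 6), round(macro_f1, 6))  # 0.828979 0.774698
```

The gap between the two averages comes from `organization`, whose lower F1 weighs as much as `person` in the macro score.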

## Finetuning on multiple datasets
To finetune on multiple datasets, pass a list of dataset names to the `dataset` argument.

In [None]:
searcher = GridSearcher(
   checkpoint_dir='./ckpt_bert_multiple_dataset',
   dataset=["tner/wnut2017", "tner/fin"],  # either of `dataset` (huggingface dataset) or `local_dataset` (custom dataset) should be given
   model="distilbert-base-cased",  # language model to fine-tune
   epoch=10,  # the total epoch (`L` in the figure)
   epoch_partial=5,  # the number of epoch at 1st stage (`M` in the figure)
   n_max_config=1,  # the number of models to pass to 2nd stage (`K` in the figure)
   batch_size=32,
   gradient_accumulation_steps=[2],
   crf=[True],
   lr=[1e-3, 1e-4],
   weight_decay=[None],
   random_seed=[42],
   lr_warmup_step_ratio=[0.1],
   max_grad_norm=[None, 10]
)
searcher.train()

In [None]:
model = TransformersNER("ckpt_bert_multiple_dataset/best_model")
metric_wnut = model.evaluate('tner/wnut2017', dataset_split='test', batch_size=16)
metric_fin = model.evaluate('tner/fin', dataset_split='test', batch_size=16)

In [None]:
metric_wnut

In [None]:
metric_fin
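
Evaluating one model on several datasets only works if their label names line up. The `tner/btc` evaluation log earlier in this notebook shows entity types being mapped into a shared label set, which is consistent with the `per_entity_metric` keys reading `location`, `organization` and `person` while the model's `label2id` uses `LOC`, `ORG` and `PER`. The sketch below illustrates that kind of lookup with a few entries copied from the logged mapping. `to_shared_tag` is a hypothetical helper, and whether T-NER rewrites tags in exactly this way is an assumption.

```python
# A few entries copied from the "shared label set" mapping in the evaluation log.
shared_label_set = {
    "location": ["LOCATION", "LOC", "location", "Location"],
    "organization": ["ORGANIZATION", "ORG", "organization"],
    "person": ["PERSON", "PSN", "person", "PER"],
}

# Invert to alias -> shared name for direct lookup.
alias_to_shared = {
    alias: shared
    for shared, aliases in shared_label_set.items()
    for alias in aliases
}


def to_shared_tag(tag):
    """Map a BIO tag such as 'B-PER' to its shared form; 'O' and unknown types pass through."""
    prefix, sep, entity_type = tag.partition("-")
    if not sep:
        return tag
    return f"{prefix}-{alias_to_shared.get(entity_type, entity_type)}"


print([to_shared_tag(t) for t in ["B-PER", "I-PER", "O", "B-LOC", "B-corporation"]])
```

Entity types missing from the excerpt, such as `corporation` here, are left unchanged rather than dropped.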

## Finetuning on a custom dataset
The same workflow applies to a [custom dataset](https://github.com/asahi417/tner/tree/master/examples/custom_dataset_sample): download the sample files, then pass their paths via `local_dataset`.

In [None]:
!mkdir -p ./custom_data
!wget https://raw.githubusercontent.com/asahi417/tner/master/examples/local_dataset_sample/train.txt -O custom_data/train.txt
!wget https://raw.githubusercontent.com/asahi417/tner/master/examples/local_dataset_sample/valid.txt -O custom_data/valid.txt
!wget https://raw.githubusercontent.com/asahi417/tner/master/examples/local_dataset_sample/test.txt -O custom_data/test.txt

In [None]:
!head -n 5 custom_data/train.txt
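
The sketch below shows a small reader for checking such files before training. It assumes a CoNLL-style layout (one whitespace-separated `token tag` pair per line, with a blank line between sentences); that layout is an assumption here, so compare it with the `head` output first. `read_conll` and the sample text are hypothetical and are not taken from the sample files.

```python
def read_conll(lines):
    """Parse `token tag` lines into (tokens, tags) pairs, one pair per sentence."""
    sentences, tokens, tags = [], [], []
    for line in lines:
        parts = line.split()
        if not parts:  # a blank line closes the current sentence
            if tokens:
                sentences.append((tokens, tags))
                tokens, tags = [], []
            continue
        tokens.append(parts[0])   # first column: the token
        tags.append(parts[-1])    # last column: the tag
    if tokens:  # the file may not end with a blank line
        sentences.append((tokens, tags))
    return sentences


# Made-up sample in the assumed layout.
sample = """Jacob B-PER
Collier I-PER
sings O

London B-LOC
"""
print(read_conll(sample.splitlines()))
```

To run it on a real file, pass an open file handle, for example `read_conll(open("custom_data/train.txt"))`, and check that the sentence count and tag set look as expected.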

In [None]:
local_dataset = {"train": "custom_data/train.txt", "validation": "custom_data/valid.txt", "test": "custom_data/test.txt"}

In [None]:
searcher = GridSearcher(
   checkpoint_dir='./ckpt_bert_custom_dataset',
   local_dataset=local_dataset,
   model="distilbert-base-cased",  # language model to fine-tune
   epoch=2,  # the total epoch (`L` in the figure)
   epoch_partial=1,  # the number of epoch at 1st stage (`M` in the figure)
   n_max_config=1,  # the number of models to pass to 2nd stage (`K` in the figure)
   batch_size=4,
   gradient_accumulation_steps=[1],
   crf=[True],
   lr=[1e-4],
   weight_decay=[None],
   random_seed=[42],
   lr_warmup_step_ratio=[0.1],
   max_grad_norm=[None, 10]
)
searcher.train()

In [None]:
model = TransformersNER("ckpt_bert_custom_dataset/best_model")
metric = model.evaluate(local_dataset=local_dataset, dataset_split='test', batch_size=16)

In [None]:
metric