# Named Entity Recognition 

Named-entity recognition (NER) is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc. [[Wikipedia: Named Entity Recognition]](https://en.wikipedia.org/wiki/Named-entity_recognition)

We will fine-tune bert-base-indonesian-522M for the Named Entity Recognition (NER) task. For this purpose we use the [NERGRIT Corpus](https://github.com/grit-id/nergrit-corpus), which contains 321,757 lines of training data, 66,974 lines of test data and 64,208 lines of validation data. It uses the Inside-Outside-Beginning (IOB) format, where each line consists of a word and its label (category).
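
To make the IOB scheme concrete, here is a minimal sketch that turns entity spans into one tag per token. The function name `spans_to_iob` and the span representation (start, exclusive end, entity type) are illustrative assumptions, not part of the corpus tooling; the sentence is the one that appears in the first rows of the corrected training data.

```python
def spans_to_iob(tokens, spans):
    """Return one IOB tag per token.

    `spans` holds (start, end, entity_type) triples, with `end` exclusive.
    This helper is hypothetical and only illustrates the tagging scheme.
    """
    tags = ["O"] * len(tokens)                # everything is Outside by default
    for start, end, entity_type in spans:
        tags[start] = f"B-{entity_type}"      # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{entity_type}"      # remaining tokens of the entity
    return tags


tokens = ["Indonesia", "mengekspor", "produk", "industri", "skala",
          "besar", "ke", "Amerika", "Serikat", "."]
spans = [(0, 1, "GPE"), (7, 9, "GPE")]

for token, tag in zip(tokens, spans_to_iob(tokens, spans)):
    print(token, tag)
```

A single-token entity such as "Indonesia" only gets a `B-` tag, while "Amerika Serikat" gets `B-GPE` followed by `I-GPE`.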

The [NERGRIT Corpus](https://github.com/grit-id/nergrit-corpus) is a very valuable dataset for Indonesian NLP researchers. Unfortunately, the labels contain many typos and errors, so I spent some time analysing the errors, correcting them and reporting the [issue to their GitHub repository](https://github.com/grit-id/nergrit-corpus/issues/1). Since the license allows us to redistribute the dataset, I will also publish the original dataset together with its corrections. Currently the dataset is only available on [request](https://ner.grit.id/index.php/front/about) (click "Get NERGRIT Corpus").


## Transformers or Simpletransformers?

We will use Simple Transformers (`simpletransformers`) here to simplify training and inference.

In [1]:
from simpletransformers.ner import NERModel, NERArgs
import pandas as pd
import logging
import sys

In [2]:
logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)

We use the corrected dataset, which has fewer lines than the original (train: 309203, valid: 61680, test: 64568).

In [3]:
data_dir = "/dataset/nergrit-corpus/ner/data"
file_train_corrected = f'{data_dir}/train_corrected.txt'
file_valid_corrected = f'{data_dir}/valid_corrected.txt'
file_test_corrected = f'{data_dir}/test_corrected.txt'
file_labels_map = f'{data_dir}/labels-map.csv'

Simple Transformers expects the dataset either as a pandas DataFrame with the columns **sentence_id**, **words** and **labels**, or as a text file in CoNLL format. The **sentence_id** is a consecutive number that determines which words belong to a given sentence, i.e. all words of the same sentence must share the same sentence_id.
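
As an illustration of how a sentence_id relates to CoNLL-style input, the sketch below assigns ids to tokens read from lines where sentences are separated by blank lines. It assumes each non-empty line holds a word followed by its label; the function name `assign_sentence_ids` is hypothetical.

```python
def assign_sentence_ids(lines):
    """Turn blank-line-separated 'word label' lines into (sentence_id, word, label) rows."""
    rows = []
    sentence_id = 0
    seen_token = False            # True once the current sentence has at least one token
    for line in lines:
        fields = line.split()
        if not fields:            # a blank line closes the current sentence
            if seen_token:
                sentence_id += 1
                seen_token = False
            continue
        rows.append((sentence_id, fields[0], fields[-1]))
        seen_token = True
    return rows


sample = ["Indonesia B-GPE", "mengekspor O", "", "BI B-ORG", ": O"]
for row in assign_sentence_ids(sample):
    print(row)
```

The resulting rows can be passed to `pd.DataFrame(rows, columns=["sentence_id", "words", "labels"])` to obtain the DataFrame layout described above.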

In [4]:
train_data = []
with open(file_train_corrected) as fp:
    for line in fp:
        try:
            # Each line holds: sentence_id, word(s), label
            texts = line.split()
            sentence_id, words, labels = texts[0], ' '.join(texts[1:-1]), texts[-1]
            train_data.append([int(sentence_id), words, labels])
        except (IndexError, ValueError):
            # Skip empty or malformed lines
            pass

train_data = pd.DataFrame(
    train_data, columns=["sentence_id", "words", "labels"],
)

In [5]:
valid_data = []
with open(file_valid_corrected) as fp:
    for line in fp:
        try:
            # Each line holds: sentence_id, word(s), label
            texts = line.split()
            sentence_id, words, labels = texts[0], ' '.join(texts[1:-1]), texts[-1]
            valid_data.append([int(sentence_id), words, labels])
        except (IndexError, ValueError):
            # Skip empty or malformed lines
            pass

valid_data = pd.DataFrame(
    valid_data, columns=["sentence_id", "words", "labels"],
)

In [6]:
test_data = []
with open(file_test_corrected) as fp:
    for line in fp:
        try:
            # Each line holds: sentence_id, word(s), label
            texts = line.split()
            sentence_id, words, labels = texts[0], ' '.join(texts[1:-1]), texts[-1]
            test_data.append([int(sentence_id), words, labels])
        except (IndexError, ValueError):
            # Skip empty or malformed lines
            pass

test_data = pd.DataFrame(
    test_data, columns=["sentence_id", "words", "labels"],
)
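
The three loading cells above differ only in the file name, so the parsing could be factored into one helper. The following is a sketch under the assumption that each line has the form `sentence_id word(s) label`, as the cells imply; `parse_corrected_lines` is a hypothetical name. Unlike the cells, it also skips lines with fewer than three fields.

```python
def parse_corrected_lines(lines):
    """Parse 'sentence_id word(s) label' lines into [sentence_id, words, label] rows."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:       # empty or incomplete line
            continue
        try:
            sentence_id = int(fields[0])
        except ValueError:        # first field is not a number
            continue
        rows.append([sentence_id, " ".join(fields[1:-1]), fields[-1]])
    return rows


sample_lines = [
    "0 Indonesia B-GPE\n",
    "\n",
    "0 Amerika B-GPE\n",
    "0 Serikat I-GPE\n",
    "bad line here\n",
]
print(parse_corrected_lines(sample_lines))
```

With an open file handle `fp`, the same call would be `pd.DataFrame(parse_corrected_lines(fp), columns=["sentence_id", "words", "labels"])`.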

In [7]:
len(train_data),len(valid_data),len(test_data)

(309203, 61680, 64568)

In [8]:
train_data.head(10)

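| | sentence_id | words | labels |
|---|---|---|---|
| 0 | 0 | Indonesia | B-GPE |
| 1 | 0 | mengekspor | O |
| 2 | 0 | produk | O |
| 3 | 0 | industri | O |
| 4 | 0 | skala | O |
| 5 | 0 | besar | O |
| 6 | 0 | ke | O |
| 7 | 0 | Amerika | B-GPE |
| 8 | 0 | Serikat | I-GPE |
| 9 | 0 | . | O |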


In [9]:
valid_data.head(10)

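| | sentence_id | words | labels |
|---|---|---|---|
| 0 | 0 | Rupiah | O |
| 1 | 0 | Terus | O |
| 2 | 0 | Melemah | O |
| 3 | 0 | , | O |
| 4 | 0 | BI | B-ORG |
| 5 | 0 | : | O |
| 6 | 0 | Kebijakan | O |
| 7 | 0 | Moneter | O |
| 8 | 0 | Sudah | O |
| 9 | 0 | Tepat | O |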


## The Labels

The NERGRIT corpus contains 19 entity types, each with an Inside tag and a Beginning tag, plus one Outside tag, for a total of 39 categories. The entity codes have the following meanings:
1. 'CRD' --> Cardinal
1. 'DAT' --> Date
1. 'EVE' --> Event
1. 'FAC' --> Facility
1. 'GPE' --> Geopolitical Entity
1. 'LAW' --> Law Entity (such as Undang-Undang)
1. 'LOC' --> Location
1. 'MON' --> Money
1. 'NOR' --> Political Organization
1. 'ORD' --> Ordinal
1. 'ORG' --> Organization
1. 'PER' --> Person
1. 'PRC' --> Percent
1. 'PRD' --> Product
1. 'QTY' --> Quantity
1. 'REG' --> Religion
1. 'TIM' --> Time
1. 'WOA' --> Work of Art
1. 'LAN' --> Language
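
The count of 39 follows directly from the list above: 19 entity codes times two prefixes, plus the single `O` tag. A small sketch of that arithmetic (the variable names are illustrative):

```python
ENTITY_CODES = [
    "CRD", "DAT", "EVE", "FAC", "GPE", "LAW", "LOC", "MON", "NOR", "ORD",
    "ORG", "PER", "PRC", "PRD", "QTY", "REG", "TIM", "WOA", "LAN",
]

# One Beginning and one Inside tag per entity code, plus the Outside tag.
expected_labels = sorted(
    [f"{prefix}-{code}" for code in ENTITY_CODES for prefix in ("B", "I")] + ["O"]
)

print(len(ENTITY_CODES), len(expected_labels))
```

Comparing such a derived list with the labels read from the label map file is a cheap way to catch misspelled tags.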

In [10]:
df = pd.read_csv(file_labels_map, sep=' ', names=['X','Y'])
labels = list(set(df['Y']))
labels.sort()
len(labels)

39

In [11]:
labels

['B-CRD',
 'B-DAT',
 'B-EVE',
 'B-FAC',
 'B-GPE',
 'B-LAN',
 'B-LAW',
 'B-LOC',
 'B-MON',
 'B-NOR',
 'B-ORD',
 'B-ORG',
 'B-PER',
 'B-PRC',
 'B-PRD',
 'B-QTY',
 'B-REG',
 'B-TIM',
 'B-WOA',
 'I-CRD',
 'I-DAT',
 'I-EVE',
 'I-FAC',
 'I-GPE',
 'I-LAN',
 'I-LAW',
 'I-LOC',
 'I-MON',
 'I-NOR',
 'I-ORD',
 'I-ORG',
 'I-PER',
 'I-PRC',
 'I-PRD',
 'I-QTY',
 'I-REG',
 'I-TIM',
 'I-WOA',
 'O']

## The Training with bert-base-indonesian-522M

Since I have already pre-trained a BERT-base model on Indonesian Wikipedia, I want to evaluate its performance on this task.

In [12]:
# Configure the model
model_args = NERArgs()
model_args.num_train_epochs = 5
model_args.train_batch_size = 32
model_args.evaluate_during_training = True
model_args.output_dir = '/output/ner/bert-base-indonesian'
model_args.best_model_dir = '/output/ner/bert-base-indonesian/best_model'
model_args.overwrite_output_dir = True
model_args.fp16 = False
model_args.labels_list=labels
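
As a rough sanity check on these settings, the number of optimizer steps can be estimated from the batch size. The sketch below uses the sentence counts that appear in this notebook's training output (14586 training and 2885 validation sentences) and assumes the default `eval_batch_size` of 8 that the printed `NERArgs` reports; `num_batches` is a hypothetical helper.

```python
import math


def num_batches(num_examples, batch_size):
    """Batches needed to cover all examples (the last one may be partial)."""
    return math.ceil(num_examples / batch_size)


train_sentences = 14586   # from the training output of this notebook
valid_sentences = 2885    # from the evaluation output of this notebook

steps_per_epoch = num_batches(train_sentences, 32)   # train_batch_size = 32
total_steps = steps_per_epoch * 5                    # num_train_epochs = 5
eval_batches = num_batches(valid_sentences, 8)       # eval_batch_size = 8

print(steps_per_epoch, total_steps, eval_batches)    # 456 2280 361
```

These values agree with the 456 steps per epoch and 361 evaluation batches reported during training.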

In [13]:
model_args

NERArgs(adam_epsilon=1e-08, best_model_dir='/output/ner/bert-base-indonesian/best_model', cache_dir='cache_dir/', config={}, custom_layer_parameters=[], custom_parameter_groups=[], dataloader_num_workers=78, do_lower_case=False, dynamic_quantize=False, early_stopping_consider_epochs=False, early_stopping_delta=0, early_stopping_metric='eval_loss', early_stopping_metric_minimize=True, early_stopping_patience=3, encoding=None, eval_batch_size=8, evaluate_during_training=True, evaluate_during_training_silent=True, evaluate_during_training_steps=2000, evaluate_during_training_verbose=False, fp16=False, gradient_accumulation_steps=1, learning_rate=4e-05, local_rank=-1, logging_steps=50, manual_seed=None, max_grad_norm=1.0, max_seq_length=128, model_name=None, model_type=None, multiprocessing_chunksize=500, n_gpu=1, no_cache=False, no_save=False, num_train_epochs=5, output_dir='/output/ner/bert-base-indonesian', overwrite_output_dir=True, process_count=78, quantized_model=False, reprocess_in

In [14]:
model = NERModel(
    "bert", "cahya/bert-base-indonesian-522M", labels=labels, args=model_args
)

Some weights of the model checkpoint at cahya/bert-base-indonesian-522M were not used when initializing BertForTokenClassification: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight', 'cls.predictions.decoder.bias']
- This IS expected if you are initializing BertForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing BertForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of BertForTokenClassification were not initialized from the model checkpoint at cahya/bert-base-indonesia

In [15]:
# Train the model
model.train_model(train_data, eval_data=valid_data)

INFO:simpletransformers.ner.ner_model: Converting to features started.


[Progress-bar output omitted: 14586 training sentences converted to features; 5 epochs of 456 steps each; 6 evaluation runs on 2885 validation sentences (361 batches each).]

INFO:simpletransformers.ner.ner_model: Training of bert model complete. Saved to /output/ner/bert-base-indonesian.


In [16]:
# Evaluate the model with valid dataset
result, model_outputs, preds_list = model.eval_model(valid_data)

INFO:simpletransformers.ner.ner_model: Converting to features started.


[Progress-bar output omitted: 2885 sentences, 361 evaluation batches.]

INFO:simpletransformers.ner.ner_model:{'eval_loss': 0.39221380330451927, 'precision': 0.5722320247022497, 'recall': 0.6153207636665481, 'f1_score': 0.5929946860179418}


In [17]:
# Evaluate the model with test dataset
result, model_outputs, preds_list = model.eval_model(test_data)

INFO:simpletransformers.ner.ner_model: Converting to features started.


[Progress-bar output omitted: 3140 sentences, 393 evaluation batches.]

INFO:simpletransformers.ner.ner_model:{'eval_loss': 0.3846507728782319, 'precision': 0.5615047557977985, 'recall': 0.5910011248593926, 'f1_score': 0.575875486381323}


The result (F1-score: 57.6%) is somewhat disappointing, because NERGRIT claims that their model achieves an F1-score of 80%, well above my model's score.
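
For reference, the reported F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the test-set precision and recall logged above; `f1_from` is a hypothetical helper, not a Simple Transformers function.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Test-set values logged for bert-base-indonesian-522M
test_precision = 0.5615047557977985
test_recall = 0.5910011248593926

print(round(f1_from(test_precision, test_recall), 4))   # 0.5759
```

Because the harmonic mean is pulled toward the lower of the two values, the score stays close to the precision of about 56%.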

## The Training with xlm-roberta-base

I tried a multilingual model from Facebook, XLM-RoBERTa-base, which was pre-trained on 2.5 TB of filtered CommonCrawl data.

In [18]:
# Configure the model
model_args = NERArgs()
model_args.num_train_epochs = 5
model_args.train_batch_size = 32
model_args.evaluate_during_training = True
model_args.output_dir = '/output/ner/xlm-roberta-base'
model_args.best_model_dir = '/output/ner/xlm-roberta-base/best_model'
model_args.overwrite_output_dir = True
model_args.fp16 = False
model_args.labels_list=labels

In [19]:
model = NERModel(
    "xlmroberta", "xlm-roberta-base", labels=labels, args=model_args
)

Some weights of the model checkpoint at xlm-roberta-base were not used when initializing XLMRobertaForTokenClassification: ['lm_head.bias', 'lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight']
- This IS expected if you are initializing XLMRobertaForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing XLMRobertaForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of XLMRobertaForTokenClassification were not initialized from the model checkpoint at xlm-roberta-base and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-st

In [20]:
# Train the model
model.train_model(train_data, eval_data=valid_data)

INFO:simpletransformers.ner.ner_model: Converting to features started.


[Progress-bar output omitted: 14586 training sentences converted to features; 5 epochs of 456 steps each; 6 evaluation runs on 2885 validation sentences (361 batches each).]

INFO:simpletransformers.ner.ner_model: Training of xlmroberta model complete. Saved to /output/ner/xlm-roberta-base.


In [21]:
# Evaluate the model with valid dataset
result, model_outputs, preds_list = model.eval_model(valid_data)

INFO:simpletransformers.ner.ner_model: Converting to features started.


[Progress-bar output omitted: 2885 sentences, 361 evaluation batches.]

INFO:simpletransformers.ner.ner_model:{'eval_loss': 0.2178909464834717, 'precision': 0.823076923076923, 'recall': 0.8377224199288256, 'f1_score': 0.8303350970017637}


In [22]:
# Evaluate the model with test dataset
result, model_outputs, preds_list = model.eval_model(test_data)

INFO:simpletransformers.ner.ner_model: Converting to features started.


[Progress-bar output omitted: 3140 sentences, 393 evaluation batches.]

INFO:simpletransformers.ner.ner_model:{'eval_loss': 0.22545069865849812, 'precision': 0.817158774373259, 'recall': 0.8253432365518794, 'f1_score': 0.8212306141873355}


### Result

The result is great: an F1-score of 82.1% on the test set.

## The Training with xlm-roberta-large

Then I tried a second multilingual model from Facebook: XLM-RoBERTa-large.

In [23]:
# Configure the model
model_args = NERArgs()
model_args.num_train_epochs = 5
model_args.train_batch_size = 32
model_args.evaluate_during_training = True
model_args.output_dir = '/output/ner/xlm-roberta-large'
model_args.best_model_dir = '/output/ner/xlm-roberta-large/best_model'
model_args.overwrite_output_dir = True
model_args.fp16 = False
model_args.labels_list=labels

In [24]:
model = NERModel(
    "xlmroberta", "xlm-roberta-large", labels=labels, args=model_args
)

Some weights of the model checkpoint at xlm-roberta-large were not used when initializing XLMRobertaForTokenClassification: ['lm_head.bias', 'lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight']
- This IS expected if you are initializing XLMRobertaForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPretraining model).
- This IS NOT expected if you are initializing XLMRobertaForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of XLMRobertaForTokenClassification were not initialized from the model checkpoint at xlm-roberta-large and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-

In [25]:
# Train the model
model.train_model(train_data, eval_data=valid_data)

INFO:simpletransformers.ner.ner_model: Converting to features started.


[Progress-bar output omitted: 14586 training sentences converted to features; 5 epochs of 456 steps each; 6 evaluation runs on 2885 validation sentences (361 batches each).]

INFO:simpletransformers.ner.ner_model: Training of xlmroberta model complete. Saved to /output/ner/xlm-roberta-large.


In [26]:
# Evaluate the model with valid dataset
result, model_outputs, preds_list = model.eval_model(valid_data)

INFO:simpletransformers.ner.ner_model: Converting to features started.


[Progress-bar output omitted: 2885 sentences, 361 evaluation batches.]

INFO:simpletransformers.ner.ner_model:{'eval_loss': 0.21326923411646845, 'precision': 0.8359438449936187, 'recall': 0.8546856465005931, 'f1_score': 0.8452108628072027}


In [27]:
# Evaluate the model with test dataset
result, model_outputs, preds_list = model.eval_model(test_data)

INFO:simpletransformers.ner.ner_model: Converting to features started.


[Progress-bar output omitted: 3140 sentences, 393 evaluation batches.]

INFO:simpletransformers.ner.ner_model:{'eval_loss': 0.2183187341105143, 'precision': 0.8353854686633388, 'recall': 0.8475129417060545, 'f1_score': 0.8414055080721747}


### Result

Again, the result is great: the model achieves an F1-score of 84.14% on the test set. This suggests that the language model I pre-trained (bert-base-indonesian-522M) is still not good enough for this task. Most probably it needs more data for pre-training.

## Predict some Samples

In [28]:
# Make predictions with the model
texts = [
    "Gubernur Bank Indonesia Agus Martowardojo bersama jajaran deputi Gubernur Bank Indonesia menggelar konferensi pers usai Rapat Dewan Gubernur di Bank Indonesia, Jakarta, Kamis (17/5/2015)",
    "Selama 24 jam puncak Mahameru di Malang kebanjiran pendaki dari Wina",
]
predictions, raw_outputs = model.predict(texts)

INFO:simpletransformers.ner.ner_model: Converting to features started.


In [29]:
predictions

[[{'Gubernur': 'B-ORG'},
  {'Bank': 'I-ORG'},
  {'Indonesia': 'I-ORG'},
  {'Agus': 'B-PER'},
  {'Martowardojo': 'I-PER'},
  {'bersama': 'O'},
  {'jajaran': 'O'},
  {'deputi': 'B-ORG'},
  {'Gubernur': 'I-ORG'},
  {'Bank': 'I-ORG'},
  {'Indonesia': 'I-ORG'},
  {'menggelar': 'O'},
  {'konferensi': 'O'},
  {'pers': 'O'},
  {'usai': 'O'},
  {'Rapat': 'B-EVE'},
  {'Dewan': 'I-EVE'},
  {'Gubernur': 'I-EVE'},
  {'di': 'O'},
  {'Bank': 'B-ORG'},
  {'Indonesia,': 'I-ORG'},
  {'Jakarta,': 'B-GPE'},
  {'Kamis': 'B-DAT'},
  {'(17/5/2015)': 'I-DAT'}],
 [{'Selama': 'O'},
  {'24': 'B-QTY'},
  {'jam': 'I-QTY'},
  {'puncak': 'B-LOC'},
  {'Mahameru': 'I-LOC'},
  {'di': 'O'},
  {'Malang': 'B-GPE'},
  {'kebanjiran': 'O'},
  {'pendaki': 'O'},
  {'dari': 'O'},
  {'Wina': 'B-GPE'}]]
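
The per-word tags are often easier to read once consecutive `B-`/`I-` words are merged into whole entities. The following is a minimal sketch of such a post-processing step, assuming the prediction format shown above (one single-entry dict per word); `group_entities` is a hypothetical helper and not part of Simple Transformers.

```python
def group_entities(sentence_prediction):
    """Merge consecutive B-/I- tagged words into (text, entity_type) pairs."""
    entities = []
    current_words, current_type = [], None
    for item in sentence_prediction:
        # Each item is a one-entry dict such as {"Agus": "B-PER"}.
        (word, tag), = item.items()
        prefix, _, entity_type = tag.partition("-")
        continues = prefix == "I" and entity_type == current_type
        if not continues:
            # Close the entity collected so far, if any.
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [], None
        if prefix in ("B", "I"):
            # A stray I- tag without a matching B- tag starts a new entity.
            current_words.append(word)
            current_type = entity_type
    if current_words:
        entities.append((" ".join(current_words), current_type))
    return entities


second_sentence = [
    {"Selama": "O"}, {"24": "B-QTY"}, {"jam": "I-QTY"},
    {"puncak": "B-LOC"}, {"Mahameru": "I-LOC"}, {"di": "O"},
    {"Malang": "B-GPE"}, {"kebanjiran": "O"}, {"pendaki": "O"},
    {"dari": "O"}, {"Wina": "B-GPE"},
]
print(group_entities(second_sentence))
```

Applied to the notebook's output, this would be called as `[group_entities(p) for p in predictions]`. Note that punctuation attached to a word (for example "Jakarta,") stays part of the entity text, because the input was split on whitespace only.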