# Training and using a NER model

This notebook shows how to train a new named entity recognition (NER) model for toponym recognition, using the `transformers` library.

We start by importing some standard libraries and the `recogniser` module from the `t_res.geoparser` package:

In [None]:
import os
import sys

from t_res.geoparser import recogniser

Create a `myner` object of the `Recogniser` class.

> **Note:** The training and test sets for the NER model are two JSON files, one for training and one for testing, in which each item/line corresponds to one sentence. Each sentence dictionary has three key-value pairs (see the two examples below): `id` is the ID of the sentence (a string), `tokens` is the list of tokens into which the sentence has been split, and `ner_tags` is the list of per-token annotations (in BIO format). `tokens` and `ner_tags` must always have the same length.
> ```json
> {"id":"3896239_29","ner_tags":["O","B-STREET","I-STREET","O","O","O","B-BUILDING","I-BUILDING","O","O","O","O","O","O","O","O","O","O"],"tokens":[",","Old","Millgate",",","to","the","Collegiate","Church",",","where","they","arrived","a","little","after","ten","oclock","."]}
> {"id":"8262498_11","ner_tags":["O","O","O","O","O","O","O","O","O","O","O","B-LOC","O","B-LOC","O","O","O","O","O","O"],"tokens":["On","the","\u2018","JSth","November","the","ship","Santo","Christo",",","from","Monteveido","to","Cadiz",",","with","hides","and","copper","."]}
> ```
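
Since the format described in the note is easy to get subtly wrong, it can help to check a file before training on it. The following is a minimal sketch, not part of T-Res: `validate_ner_records` is a hypothetical helper, the sample records are made up for illustration, and the rule that an `I-` tag must follow a tag of the same entity type is a strict reading of BIO that your data may not need to satisfy.

```python
import json


def validate_ner_records(lines):
    """Check sentence records in the format described in the note above.

    `lines` is any iterable of JSON strings, one sentence per line.
    Returns a list of human-readable problems (empty if none are found).
    Hypothetical helper for illustration; it is not part of T-Res.
    """
    problems = []
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # Skip blank lines.
        record = json.loads(line)
        rec_id = record.get("id", f"<line {line_no}>")
        missing = {"id", "tokens", "ner_tags"} - set(record)
        if missing:
            problems.append(f"{rec_id}: missing keys {sorted(missing)}")
            continue
        tokens, tags = record["tokens"], record["ner_tags"]
        if len(tokens) != len(tags):
            problems.append(f"{rec_id}: {len(tokens)} tokens but {len(tags)} tags")
        previous = "O"
        for tag in tags:
            if tag != "O" and not tag.startswith(("B-", "I-")):
                problems.append(f"{rec_id}: tag {tag!r} is not in BIO format")
            elif tag.startswith("I-") and previous[2:] != tag[2:]:
                # Assumption: an I- tag must continue an entity of the same type.
                problems.append(
                    f"{rec_id}: {tag!r} does not continue an entity of the same type"
                )
            previous = tag
    return problems


# Made-up records: the first is well formed, the second has a length mismatch.
sample = [
    '{"id": "a_1", "tokens": ["Old", "Millgate", "."], "ner_tags": ["B-STREET", "I-STREET", "O"]}',
    '{"id": "a_2", "tokens": ["to", "Cadiz"], "ner_tags": ["O"]}',
]
for problem in validate_ner_records(sample):
    print(problem)
```

To check a real file, pass an open file handle instead of the list, for example `validate_ner_records(open(path, encoding="utf-8"))`, where `path` points to your training or test set.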

In [None]:
myner = recogniser.Recogniser(
    model="blb_lwm-ner-fine",
    train_dataset="../experiments/outputs/data/lwm/ner_fine_train.json",  # Path to the json file containing the training set (see note above).
    test_dataset="../experiments/outputs/data/lwm/ner_fine_dev.json",  # Path to the json file containing the test set (see note above).
    pipe=None,  # The NER pipeline will be stored here; leave it as None for now.
    base_model="Livingwithmachines/bert_1760_1900",  # Base model to fine-tune for NER. The value can be either
                                            # a local path to a model or a HuggingFace path.
                                            # In this case, we use the HuggingFace path
                                            # (https://huggingface.co/Livingwithmachines/bert_1760_1900). You can
                                            # choose any other model from the HuggingFace hub, as long as it was
                                            # trained on the "Fill-Mask" objective (filter by task).
    model_path="../resources/models/",  # Path where the NER model will be stored.
    training_args={
        "batch_size": 8,
        "num_train_epochs": 10,
        "learning_rate": 0.00005,
        "weight_decay": 0.0,
    }, # Training arguments: you can change them.
    overwrite_training=False,  # Set to True if you want to overwrite an existing model with the same name.
    do_test=True,  # Set to True to run the training in test mode (the string "_test" will be appended to your model name).
    load_from_hub=False,  # Whether the final model should be loaded from the HuggingFace hub.
)

Print the Recogniser:

In [None]:
print(myner)

Now train the model:

In [None]:
myner.train()

Now, to use the model you have just trained, you'll need to load a NER pipeline:

In [None]:
myner.pipe = myner.create_pipeline()

And, finally, use the newly trained model to predict the named entities in a sentence.

In [None]:
sentence = "A remarkable case of rattening has just occurred in the building trade at Sheffield."

predictions = myner.ner_predict(sentence)
# Note: if you trained the model in test mode, it will probably not identify "Sheffield" as a location.
print([pred for pred in predictions if pred["entity"] != "O"])
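
Because the tags are in BIO format, a multi-token place name is spread over several predictions (one `B-` tag followed by `I-` tags). The sketch below shows one way to merge them into whole entities. It assumes that each prediction is a dictionary with an `entity` key (as used above) and a `word` key holding the token text; the `word` key, the `group_entities` helper and the hand-written example predictions are assumptions for illustration, not actual model output.

```python
def group_entities(predictions):
    """Merge consecutive B-/I- tagged tokens into (text, label) pairs.

    Hypothetical helper: assumes each prediction has "entity" and "word" keys.
    Tokens are joined with a single space, which is a simplification.
    """
    entities = []
    current_words, current_label = [], None
    for pred in predictions:
        tag = pred["entity"]
        prefix, _, label = tag.partition("-")
        if prefix == "I" and label == current_label:
            # Continuation of the entity currently being built.
            current_words.append(pred["word"])
            continue
        if current_words:
            # Any other tag closes the entity currently being built.
            entities.append((" ".join(current_words), current_label))
        if tag == "O":
            current_words, current_label = [], None
        else:
            # A B- tag (or an orphan I- tag) starts a new entity.
            current_words, current_label = [pred["word"]], label
    if current_words:
        entities.append((" ".join(current_words), current_label))
    return entities


# Hand-written predictions for illustration only (not real model output).
example_predictions = [
    {"entity": "O", "word": "at"},
    {"entity": "B-BUILDING", "word": "Collegiate"},
    {"entity": "I-BUILDING", "word": "Church"},
    {"entity": "O", "word": ","},
    {"entity": "B-LOC", "word": "Sheffield"},
]
print(group_entities(example_predictions))
# → [('Collegiate Church', 'BUILDING'), ('Sheffield', 'LOC')]
```

If the predictions returned by `myner.ner_predict` use different key names, adapt the two dictionary lookups accordingly.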