# Workflow Examples

`NERDA` offers a simple, easy-to-use interface for fine-tuning and applying transformer-based models for Named Entity Recognition (NER). We refer to such models as '`NERDA` models'.

`NERDA` can be used in two ways. You can either (1) train your own customized `NERDA` model or (2) download and use one of our precooked `NERDA` models for inference.

## Train Your Own `NERDA` model

We want to fine-tune a transformer for English. 

First, we download [CoNLL-2003](https://www.clips.uantwerpen.be/conll2003/ner/), an English NER dataset with annotated named entities that we will use for training and evaluating our model.

In [4]:
# import the CoNLL-2003 helpers and download the dataset
from NERDA.datasets import get_conll_data, download_conll_data
download_conll_data()



The CoNLL-2003 dataset looks like this.

In [3]:
training = get_conll_data('train', 5)
validation = get_conll_data('valid', 5)
# example: print the first training sentence as word/tag pairs
sentence = training.get('sentences')[0]
tags = training.get('tags')[0]
print(" ".join(["{}/{}".format(word, tag) for word, tag in zip(sentence, tags)]))


If you provide your own dataset, it must have the same structure (a dictionary with 'sentences' and 'tags'), except that it does not have to follow the IOB tagging scheme (in which words that begin an entity are tagged with 'B-' and words 'inside' an entity are tagged with 'I-').
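As a minimal sketch of the structure described above, the snippet below builds a tiny hand-made dataset and checks that it is shaped like the output of `get_conll_data`. The helper `check_dataset` is hypothetical (it is not part of `NERDA`), the sentences and tags are invented for illustration, and the use of 'O' for words outside entities is an assumption.

```python
from typing import Dict, List


def check_dataset(dataset: Dict[str, List[List[str]]]) -> int:
    """Hypothetical helper: verify the 'sentences'/'tags' structure.

    Returns the number of sentences if the structure is consistent.
    """
    sentences = dataset["sentences"]
    tags = dataset["tags"]
    if len(sentences) != len(tags):
        raise ValueError("'sentences' and 'tags' must have the same length")
    for i, (words, word_tags) in enumerate(zip(sentences, tags)):
        if len(words) != len(word_tags):
            raise ValueError(f"sentence {i}: {len(words)} words but {len(word_tags)} tags")
    return len(sentences)


# Invented example data; 'O' as the tag for non-entity words is an assumption.
my_training = {
    "sentences": [
        ["Old", "MacDonald", "had", "a", "farm"],
        ["He", "lives", "in", "New", "York"],
    ],
    "tags": [
        ["B-PER", "I-PER", "O", "O", "O"],
        ["O", "O", "O", "B-LOC", "I-LOC"],
    ],
}

n_sentences = check_dataset(my_training)
print(n_sentences)
```

A dictionary that passes this check has one list of tags per sentence and one tag per word, which is the shape the training and validation examples in this section rely on.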

Instantiate a `NERDA` model for fine-tuning an English ELECTRA model. Note that this model configuration uses only 5 sentences for training to minimize execution time. The hyperparameters have also been chosen to minimize execution time. This example only serves to illustrate the functionality, i.e. the resulting model will suck.

In [None]:
from NERDA.models import NERDA
model = NERDA(dataset_training = training,
              dataset_validation = validation,
              transformer = 'google/electra-small-discriminator',
              hyperparameters = {'epochs' : 1,
                                 'warmup_steps' : 10,
                                 'train_batch_size': 5,
                                 'learning_rate': 0.0001})

By default the network architecture (built on top of the transformer) is analogous to the models in [Hvingelby et al. 2020](http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.565.pdf). 

The model can then be trained by invoking the `train` method.

In [None]:
model.train()

We can then compute the performance on a test set (limited to 5 sentences):

In [None]:
test = get_conll_data('test', 5)
model.evaluate_performance(test)

Unsurprisingly, the model sucks in this case due to the ludicrous specification.

Named entities in new texts can be predicted with the `predict` functions.

In [2]:
text = "Old MacDonald had a farm"
model.predict_text(text)

Needless to say, the predicted entities for this particular model are probably nonsensical.

`NERDA` has more handles that you can use. You can

1. provide your own dataset,
2. choose whichever pretrained transformer you would like to fine-tune,
3. provide your own set of hyperparameters, and
4. provide your own `torch` network (architecture).
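As a rough sketch of the hyperparameter handle, the snippet below builds a few candidate hyperparameter dictionaries using the same keys as the training example above. The candidate values are arbitrary and only illustrate the shape of the dictionary; they are not recommendations.

```python
from itertools import product

# Keys are taken from the training example above; the values are arbitrary.
base = {"epochs": 1, "warmup_steps": 10}
learning_rates = [1e-4, 5e-5]
batch_sizes = [5, 8]

# One dictionary per (learning rate, batch size) combination.
configs = [
    {**base, "learning_rate": lr, "train_batch_size": bs}
    for lr, bs in product(learning_rates, batch_sizes)
]

for config in configs:
    # Each dictionary could be passed as the `hyperparameters` argument
    # when instantiating a model as in the example above.
    print(config)
```

Training one model per dictionary and comparing the results on the validation set is one simple way to pick a configuration.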

We have trained a number of more reasonable model configurations that you can use out of the box. See the section below.

## Use a Precooked `NERDA` model

We have precooked a number of `NERDA` models that you can download
and use right off the shelf.

Here is an example.

Instantiate `EN_ELECTRA_EN`, an English ELECTRA model that has been fine-tuned for English NER.

In [None]:

from NERDA.precooked import EN_ELECTRA_EN
model = EN_ELECTRA_EN()



(Down)load network:

In [None]:

model.download_network()
model.load_network()


This model performs much better:

In [None]:
model.evaluate_performance(get_conll_data('test', 100))

Predict named entities in new texts:

In [None]:
text = 'Old MacDonald had a farm'
model.predict_text(text)


### List of Precooked Models

The table below shows the precooked `NERDA` models publicly available for download. We have trained models for Danish and English.


| **Model**       | **Language** | **Transformer** | **Dataset** | **F1-score** |
|-----------------|--------------|-----------------|-------------|--------------|
| `DA_BERT_ML`    | Danish       | [Multilingual BERT](https://huggingface.co/bert-base-multilingual-uncased) | [DaNE](https://github.com/alexandrainst/danlp/blob/master/docs/docs/datasets.md#dane) | xx.x |
| `DA_ELECTRA_DA` | Danish       | [Danish ELECTRA](https://huggingface.co/Maltehb/-l-ctra-danish-electra-small-uncased) | [DaNE](https://github.com/alexandrainst/danlp/blob/master/docs/docs/datasets.md#dane) | yy.y |
| `EN_BERT_ML`    | English      | [Multilingual BERT](https://huggingface.co/bert-base-multilingual-uncased) | [CoNLL-2003](https://www.clips.uantwerpen.be/conll2003/ner/) | zz.z |
| `EN_ELECTRA_EN` | English      | [English ELECTRA](https://huggingface.co/google/electra-small-discriminator) | [CoNLL-2003](https://www.clips.uantwerpen.be/conll2003/ner/) | pp.p |

**F1-score** is the micro-averaged F1-score across entity tags and is
evaluated on the respective test sets (which have not been used for training or
validation of the models).
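To make the term "micro-averaged" concrete, here is a small, self-contained illustration of how micro and macro averages of F1 differ. The per-tag counts are invented for the example, and the code is a generic sketch of the definitions, not the evaluation code that `NERDA` uses.

```python
from typing import Dict, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Invented counts per entity tag: (true positives, false positives, false negatives).
counts: Dict[str, Tuple[int, int, int]] = {
    "PER": (90, 5, 5),
    "ORG": (40, 20, 20),
    "LOC": (10, 10, 10),
}

# Micro average: pool the counts over all tags, then compute a single F1.
total_tp = sum(tp for tp, _, _ in counts.values())
total_fp = sum(fp for _, fp, _ in counts.values())
total_fn = sum(fn for _, _, fn in counts.values())
micro_f1 = f1(total_tp, total_fp, total_fn)

# Macro average: compute F1 per tag, then take the unweighted mean.
macro_f1 = sum(f1(*c) for c in counts.values()) / len(counts)

print(f"micro F1: {micro_f1:.3f}")
print(f"macro F1: {macro_f1:.3f}")
```

Because the micro average pools counts, frequent tags dominate it, whereas the macro average gives every tag the same weight regardless of how often it occurs.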

Note that we have not spent a lot of time on actually fine-tuning the models,
so there could be room for improvement. If you are able to improve the models,
we will be happy to hear from you and include your `NERDA` model.

## Performance (Obsolete)

The table below summarizes the performance, as measured by F1-scores, of the model
configurations that `NERDA` ships with.

| **Level**     | **MBERT** | **DABERT** | **ELECTRA** | **XLMROBERTA** | **DISTILMBERT** |
|---------------|-----------|------------|-------------|----------------|-----------------|
| B-PER         | 0.92      | 0.93       | 0.92        | 0.94           | 0.89            |      
| I-PER         | 0.97      | 0.99       | 0.97        | 0.99           | 0.96            |   
| B-ORG         | 0.68      | 0.79       | 0.65        | 0.78           | 0.66            |     
| I-ORG         | 0.67      | 0.79       | 0.72        | 0.77           | 0.61            |   
| B-LOC         | 0.86      | 0.85       | 0.79        | 0.87           | 0.80            |     
| I-LOC         | 0.33      | 0.32       | 0.44        | 0.24           | 0.29            |     
| B-MISC        | 0.73      | 0.74       | 0.61        | 0.77           | 0.70            |     
| I-MISC        | 0.70      | 0.86       | 0.65        | 0.91           | 0.61            |   
| **AVG_MICRO** | 0.81      | 0.85       | 0.79        | 0.86           | 0.78            |      
| **AVG_MACRO** | 0.73      | 0.78       | 0.72        | 0.78           | 0.69            |
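As a sanity check on the table above, the sketch below recomputes the **AVG_MACRO** row under the assumption that it is the unweighted mean of the eight per-tag F1-scores, rounded to two decimals. All numbers are copied from the table; **AVG_MICRO** cannot be recomputed this way because it depends on per-tag counts that the table does not show.

```python
from statistics import mean

# Per-tag F1-scores copied from the table, in row order:
# B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC.
per_tag_f1 = {
    "MBERT": [0.92, 0.97, 0.68, 0.67, 0.86, 0.33, 0.73, 0.70],
    "DABERT": [0.93, 0.99, 0.79, 0.79, 0.85, 0.32, 0.74, 0.86],
    "ELECTRA": [0.92, 0.97, 0.65, 0.72, 0.79, 0.44, 0.61, 0.65],
    "XLMROBERTA": [0.94, 0.99, 0.78, 0.77, 0.87, 0.24, 0.77, 0.91],
    "DISTILMBERT": [0.89, 0.96, 0.66, 0.61, 0.80, 0.29, 0.70, 0.61],
}

# AVG_MACRO row as reported in the table.
reported_macro = {
    "MBERT": 0.73,
    "DABERT": 0.78,
    "ELECTRA": 0.72,
    "XLMROBERTA": 0.78,
    "DISTILMBERT": 0.69,
}

# Assumption: AVG_MACRO is the plain mean of the per-tag scores.
recomputed_macro = {name: round(mean(scores), 2) for name, scores in per_tag_f1.items()}

for name, value in recomputed_macro.items():
    print(f"{name}: recomputed {value:.2f}, reported {reported_macro[name]:.2f}")
```

Under this assumption the recomputed values agree with the reported **AVG_MACRO** row for all five configurations.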