# Named entity recognition

Named entity recognition (NER) is the task of extracting short fragments of text, such as the names of people or places, and classifying them by type. Today, we will learn a new framework called FLAIR (we discussed this framework in our lecture).

First, we will run a pretrained model with the FLAIR framework.

**Assignment 1**
Please visit the FLAIR website and read the documentation on tagging entities: https://flairnlp.github.io/docs/tutorial-basics/tagging-entities . Use the code provided there to tag some example input.

```
from flair.data import Sentence
from flair.nn import Classifier

# make a sentence
sentence = Sentence('George Washington went to Washington.')

# load the NER tagger
tagger = Classifier.load('ner-large')

# run NER over sentence
tagger.predict(sentence)

# print the sentence with all annotations
print(sentence)
```

2025-05-06 23:26:44,270 SequenceTagger predicts: Dictionary with 20 tags: <unk>, O, S-ORG, S-MISC, B-PER, E-PER, S-LOC, B-ORG, E-ORG, I-PER, S-PER, B-MISC, I-MISC, E-MISC, I-ORG, B-LOC, E-LOC, I-LOC, <START>, <STOP>
Sentence[6]: "George Washington went to Washington." → ["George Washington"/PER, "Washington"/LOC]
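
The tag dictionary in the log line above can look cryptic at first. The prefixes appear to follow the common BIOES convention (B = begin, I = inside, E = end, S = single-token entity, O = outside any entity). The sketch below is plain Python and does not use FLAIR; the helper name `group_tags` is made up for illustration. It groups the 20 tags by entity type to show that they are just 4 entity types times 4 position prefixes, plus 4 special tags.

```python
from collections import defaultdict

# Tag dictionary copied from the log line printed by the tagger above.
TAGS = (
    "<unk>, O, S-ORG, S-MISC, B-PER, E-PER, S-LOC, B-ORG, E-ORG, I-PER, "
    "S-PER, B-MISC, I-MISC, E-MISC, I-ORG, B-LOC, E-LOC, I-LOC, <START>, <STOP>"
).split(", ")


def group_tags(tags):
    """Hypothetical helper: split tags into {entity_type: prefixes} and non-entity tags."""
    by_type = defaultdict(set)
    other = []
    for tag in tags:
        prefix, sep, entity_type = tag.partition("-")
        if sep and prefix in {"B", "I", "E", "S"}:
            by_type[entity_type].add(prefix)
        else:
            # Tags without a position prefix: O, <unk>, <START>, <STOP>
            other.append(tag)
    return dict(by_type), other


by_type, other = group_tags(TAGS)
for entity_type in sorted(by_type):
    print(entity_type, sorted(by_type[entity_type]))
print("other:", other)
```

Under this reading, "George Washington" is tagged B-PER, E-PER at the token level and merged into a single PER span, while the one-token "Washington" at the end is S-LOC.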


**(Optional)** Of course, most often we would like to train our own tagger. A detailed description of this process can be found in a great blog post (if you see a paywall, you can open the website in incognito mode).

https://medium.com/thecyphy/training-custom-ner-model-using-flair-df1f9ea9c762

However, training a custom FLAIR model is not required in this lab.


**Assignment 2** Named entity recognition models can also be built using BERT and the HuggingFace transformers library!

To see how we can use transformers to solve a NER problem, we will use a notebook provided by Niels Rogge from HuggingFace: https://github.com/NielsRogge/Transformers-Tutorials . The notebook we will use is uploaded to eKursy alongside this "main" notebook. Please follow the instructions in that other notebook and copy and paste the appropriate cell output as described below.


One of the code cells in that notebook is the following:
```
from seqeval.metrics import classification_report

print(classification_report(labels, predictions))
```
If you follow the tutorial successfully, these two lines will print the evaluation metrics. Copy and paste them into the cell below.

                 precision    recall  f1-score   support
             geo      0.79      0.88      0.83      4613
             gpe      0.89      0.89      0.89      1523
             org      0.71      0.56      0.63      2761
             per      0.78      0.81      0.79      2183
             tim      0.82      0.81      0.81      1772
       micro avg      0.79      0.79      0.79     12852
       macro avg      0.80      0.79      0.79     12852
    weighted avg      0.79      0.79      0.79     12852
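
To read the three "avg" rows, it helps to recompute two of them by hand. The sketch below takes the per-class precision and support values from the report above and derives the macro average (plain mean over classes) and the weighted average (mean weighted by support). The variable names are illustrative. Because the inputs are already rounded to two decimals, the results are only expected to agree with the report after rounding.

```python
# (precision, support) per entity type, copied from the report above.
rows = {
    "geo": (0.79, 4613),
    "gpe": (0.89, 1523),
    "org": (0.71, 2761),
    "per": (0.78, 2183),
    "tim": (0.82, 1772),
}

total_support = sum(support for _, support in rows.values())

# Macro average: every class counts equally, regardless of its size.
macro_precision = sum(p for p, _ in rows.values()) / len(rows)

# Weighted average: each class contributes in proportion to its support.
weighted_precision = sum(p * s for p, s in rows.values()) / total_support

print(f"total support:      {total_support}")
print(f"macro precision:    {macro_precision:.2f}")
print(f"weighted precision: {weighted_precision:.2f}")
```

The micro average cannot be recovered this way: it pools the raw counts of correct and predicted entities over all classes, and those counts are not part of the printed report.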


                 precision    recall  f1-score   support
             geo      0.84      0.83      0.84     11232
             gpe      0.93      0.89      0.91      3293
             org      0.61      0.64      0.62      6531
             per      0.76      0.79      0.77      5196
             tim      0.84      0.76      0.80      4360
       micro avg      0.78      0.78      0.78     30612
       macro avg      0.79      0.78      0.79     30612
    weighted avg      0.79      0.78      0.78     30612
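
The numbers in these reports are entity-level scores: seqeval counts a prediction as correct only when both the entity type and its boundaries match the gold annotation. The sketch below illustrates that idea in plain Python and is not seqeval's implementation. The helper names `extract_entities` and `entity_scores` and the toy tag sequences are made up, and the handling of a stray `I-` tag (treated as the start of a new entity) is an assumption.

```python
def extract_entities(tags):
    """Return the set of (type, start, end) spans in one IOB2 tag sequence (end exclusive)."""
    entities = set()
    start, current = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel "O" closes a trailing entity
        prefix, _, entity_type = tag.partition("-")
        continues = prefix == "I" and entity_type == current
        if current is not None and not continues:
            entities.add((current, start, i))
            start, current = None, None
        # Assumption: an I- tag with no open entity starts a new one.
        if prefix == "B" or (prefix == "I" and current is None):
            start, current = i, entity_type
    return entities


def entity_scores(gold, predicted):
    """Strict-match precision, recall and F1 over a list of tag sequences."""
    gold_spans, pred_spans = set(), set()
    for n, (g, p) in enumerate(zip(gold, predicted)):
        gold_spans |= {(n, *e) for e in extract_entities(g)}
        pred_spans |= {(n, *e) for e in extract_entities(p)}
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data: the predicted person entity is one token too short.
toy_gold = [["B-per", "I-per", "O", "O", "B-geo"]]
toy_pred = [["B-per", "O", "O", "O", "B-geo"]]
print(entity_scores(toy_gold, toy_pred))
```

In the toy example four of five tokens are tagged correctly, yet only one of the two entities matches exactly, which is why entity-level scores are usually lower than token-level accuracy.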