# FlairNLP

FlairNLP is a simple PyTorch-based framework that makes it easy to apply state-of-the-art natural language processing models, such as part-of-speech (PoS) tagging, named entity recognition (NER), emerging entity detection and text classification, to our own texts. It also supports a growing number of languages through its 'one model, many languages' taggers, i.e. single models that predict PoS or NER tags for input text in various languages.

The framework presents a simple interface for different types of word and document embeddings. This interface hides all embedding-specific complexity and allows users to mix and match various embeddings as a stacked embedding to improve performance.

The framework also implements standard model training and hyperparameter selection routines, as well as a data fetching module that can download publicly available NLP datasets and convert them into data structures for quick setup of experiments.


### Installation
To install Flair on your system, run:

pip install flair

For an Anaconda environment, follow the steps in the link below:
https://medium.com/@taras.priadka/how-to-install-flair-for-jupyter-notebook-755929c5f04f

Note: This requires PyTorch version 1.2.0. You can use the command below to install that version:

conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch

#### NLP Base Types

The two main objects discussed here are Sentence and Token.
A Sentence object holds a sentence that we may want to embed or tag. A sentence is a list of tokens.
We can either pass a sentence that is already tokenised (a whitespace-tokenised string) or let Flair tokenise it, using its default tokeniser or a custom one.

Both cases are shown below:
1) A whitespace-tokenised sentence is taken and its tokens are counted and printed.
2) An untokenised sentence is taken, a tokeniser is passed, and the number of tokens in the string is counted.

In [33]:
from flair.data import Sentence
sentence = Sentence('Whitespaced tokenised string here .')
print(sentence)

# Access a token by its token id (1-based) or by its list index (0-based)
print(sentence.get_token(4))
print(sentence[3])

print("\nPrint all the tokens in the sentence")
for token in sentence:
    print(token)

Sentence: "Whitespaced tokenised string here ." - 5 Tokens
Token: 4 here
Token: 4 here

Print all the tokens in the sentence
Token: 1 Whitespaced
Token: 2 tokenised
Token: 3 string
Token: 4 here
Token: 5 .


In [21]:
# Untokenised string: there is no space between 'here' and '.'
# Using the default tokeniser
sentence = Sentence('Untokenised string here.', use_tokenizer=True)
print(sentence)

# Passing a custom tokeniser
from flair.data import Sentence, segtok_tokenizer
sentence = Sentence('Untokenised string here.', use_tokenizer=segtok_tokenizer)
print(sentence)

Sentence: "Untokenised string here ." - 4 Tokens
Sentence: "Untokenised string here ." - 4 Tokens
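To make the difference between the two cases concrete, here is a minimal pure-Python sketch. It does not use Flair's tokenisers; `whitespace_tokenize` and `punctuation_aware_tokenize` are hypothetical helpers that only illustrate why an untokenised string yields a different token count once punctuation is split off.

```python
import re

def whitespace_tokenize(text):
    """Split on whitespace only; punctuation stays attached to words."""
    return text.split()

def punctuation_aware_tokenize(text):
    """Split into runs of word characters and single punctuation marks.

    This is a toy rule for illustration, not the rule segtok applies.
    """
    return re.findall(r"\w+|[^\w\s]", text)

text = "Untokenised string here."
whitespace_tokens = whitespace_tokenize(text)
punct_tokens = punctuation_aware_tokenize(text)

print(len(whitespace_tokens), whitespace_tokens)
print(len(punct_tokens), punct_tokens)
```

With whitespace splitting alone, 'here.' remains a single token, so the string has 3 tokens. Splitting off the full stop gives the 4 tokens reported above.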


#### Adding Tags to Tokens
A Token has fields for linguistic annotation, such as lemmas, part-of-speech tags or named entity tags.
Tags can be added to a token by specifying the tag type and the tag value.
In the example below, we add a tag of type 'ner' (named entity recognition) with the value 'place' to the word Texas and one with the value 'person' to Bob.

Each tag is of class Label, which consists of a value and a score. The score indicates confidence.
In the example below, all tags have a confidence of 1.0 because we added them manually. If a tag is predicted by a sequence labeler, the score indicates the classifier's confidence.

In [29]:
sentence = Sentence('Bob visited Texas')
sentence[2].add_tag('ner', 'place')
sentence[0].add_tag('ner', 'person')
# print the sentence with all tags.
print(sentence.to_tagged_string())

token = sentence[2]
tag = token.get_tag('ner')

# print the token along with the tag value and score
print(f'"{token}" is tagged as "{tag.value}" with confidence score "{tag.score}"')

Bob <person> visited Texas <place>
"Token: 3 Texas" is tagged as "place" with confidence score "1.0"

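The value/score pairing described above can be pictured with a small stand-in class. This is only a sketch: `ToyLabel` and `confident_labels` are hypothetical names and not part of Flair. It shows the default confidence of 1.0 for manually added tags and how a score could be used to filter predicted tags.

```python
from dataclasses import dataclass

@dataclass
class ToyLabel:
    """Hypothetical stand-in for a label: a value plus a confidence score."""
    value: str
    score: float = 1.0  # manually added tags get full confidence

def confident_labels(labels, threshold):
    """Keep only the labels whose score reaches the threshold."""
    return [label for label in labels if label.score >= threshold]

manual = ToyLabel("place")            # added by hand, score defaults to 1.0
predicted = ToyLabel("person", 0.62)  # made-up score for a predicted tag

kept = confident_labels([manual, predicted], threshold=0.9)
print(kept)
```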

#### Adding Labels to Sentences
We can add labels to our sentences, which is useful in text classification tasks.

In [32]:
sentence = Sentence('This is Computer Science Department of Texas A&M University College Station')

# add a label to a sentence
sentence.add_label('Education')

# add more than one label to a sentence
sentence.add_labels(['College', 'Education'])

# add labels while initialising a sentence
sentence = Sentence('This is Computer Science Department of Texas A&M University College Station', labels=['College', 'Education'])

print("Printing the labels")
for label in sentence.labels:
    print(label)

Printing the labels
College (1.0)
Education (1.0)


#### Tagging Texts with Pre-Trained Sequence Tagging Models

Flair provides various pre-trained models for named entity recognition, syntactic chunking, part-of-speech tagging and semantic frame detection. Each model is trained on a particular dataset and can be used to tag our own texts. Flair also provides some smaller models that run faster on CPU.

It also offers multilingual support, as it distributes models that can handle text in multiple languages.
The NER models are trained over 4 languages (English, German, Dutch and Spanish) and the PoS models over 12 languages (English, German, French, Italian, Dutch, Polish, Spanish, Swedish, Danish, Norwegian, Finnish and Czech).

Details of these models are given in the table below.

| ID | Task | Training Dataset |
| --- | --- | --- |
| 'ner' | 4-class Named Entity Recognition | CoNLL-03 |
| 'ner-ontonotes' | 18-class Named Entity Recognition | OntoNotes |
| 'pos' | Part-of-Speech Tagging | OntoNotes |
| 'chunk' | Syntactic Chunking | CoNLL-2000 |
| 'frame' | Semantic Frame Detection | PropBank 3.0 |
| 'ner-fast' | 4-class Named Entity Recognition | CoNLL-03 |
| 'ner-multi' | 4-class Named Entity Recognition | CoNLL-03 (4 languages) |
| 'de-ner' | German Language 4-class Named Entity Recognition | CoNLL-03 (German) |
| 'pos-multi' | Multilingual Part-of-Speech Tagging | Universal Dependency Treebank (12 languages) |

In the example below, the pre-trained model for named entity recognition (NER) is used. This model is trained on the English CoNLL-03 task and can recognize 4 different entity types.

A different pre-trained model can be used by passing the corresponding ID (from the table above) to the load method of the SequenceTagger class. We can then use the predict method of the tagger to add the predicted tags to the tokens of the given sentence.
Some sequence labelers also annotate spans that consist of more than one word. The get_spans method returns a list of these spans. Each span has a text, a value, a position in the sentence and a score that indicates prediction confidence. All of these details can be retrieved with the to_dict method.

In [46]:
from flair.models import SequenceTagger
tagger = SequenceTagger.load('ner')

sentence = Sentence('Bob Marley visited Texas .')
tagger.predict(sentence)

print("\nPrint sentence with predicted tags\n")
print(sentence.to_tagged_string())

print("\nPrint Span for each tag type")
for entity in sentence.get_spans('ner'):
    print(entity)
print("\nPrint Span Details ")
print(sentence.to_dict(tag_type='ner'))

2020-03-26 21:34:30,865 loading file C:\Users\DARKHO\.flair\models\en-ner-conll03-v0.4.pt

Print sentence with predicted tags

Bob <B-PER> Marley <E-PER> visited Texas <S-LOC> .

Print Span for each tag type
PER-span [1,2]: "Bob Marley"
LOC-span [4]: "Texas"

Print Span Details 
{'text': 'Bob Marley visited Texas .', 'labels': [], 'entities': [{'text': 'Bob Marley', 'start_pos': 0, 'end_pos': 10, 'type': 'PER', 'confidence': 0.9997925162315369}, {'text': 'Texas', 'start_pos': 19, 'end_pos': 24, 'type': 'LOC', 'confidence': 0.9999349117279053}]}
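The B-, E- and S- prefixes in the tagged string mark the beginning of a span, the end of a span and a single-token span. The following sketch shows how such token-level tags can be grouped into multi-word spans. It is an illustration under that assumption, not Flair's own implementation of get_spans, and `bioes_to_spans` is a hypothetical helper.

```python
def bioes_to_spans(tokens, tags):
    """Group BIOES-tagged tokens into (type, token_ids, text) spans.

    Token ids are 1-based, matching the token ids printed in this notebook.
    """
    spans = []
    current_ids, current_type = [], None
    for idx, (token, tag) in enumerate(zip(tokens, tags), start=1):
        if "-" not in tag:
            # 'O' (outside) or an unexpected tag: close any open span
            current_ids, current_type = [], None
            continue
        prefix, entity_type = tag.split("-", 1)
        if prefix == "S":
            spans.append((entity_type, [idx], token))
            current_ids, current_type = [], None
        elif prefix == "B":
            current_ids, current_type = [idx], entity_type
        elif prefix in ("I", "E") and current_type == entity_type:
            current_ids.append(idx)
            if prefix == "E":
                text = " ".join(tokens[i - 1] for i in current_ids)
                spans.append((entity_type, current_ids, text))
                current_ids, current_type = [], None
        else:
            # malformed sequence (e.g. I- without a preceding B-): drop it
            current_ids, current_type = [], None
    return spans

tokens = ["Bob", "Marley", "visited", "Texas", "."]
tags = ["B-PER", "E-PER", "O", "S-LOC", "O"]
spans = bioes_to_spans(tokens, tags)
for entity_type, ids, text in spans:
    print(f'{entity_type}-span {ids}: "{text}"')
```

For the sentence above, this groups tokens 1 and 2 into one PER span and token 4 into a LOC span, mirroring the spans printed by the tagger.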


Example for a German sentence: here the 'de-ner' model is used.

In [61]:
tagger = SequenceTagger.load('de-ner')
sentence = Sentence('Bob Marley ging nach Texas .')
tagger.predict(sentence)
print("\nPrint sentence with predicted tags\n")
print(sentence.to_tagged_string())

2020-03-26 22:31:04,539 loading file C:\Users\DARKHO\.flair\models\de-ner-conll03-v0.4.pt

Print sentence with predicted tags

Bob <B-PER> Marley <E-PER> ging nach Texas <S-LOC> .


Example for a multilingual text: here we take a text that combines English and German sentences. For this case we use the 'pos-multi' model.

In [48]:
tagger = SequenceTagger.load('pos-multi')
sentence = Sentence('Bob Marly visited Texas. Er ging reiten')
tagger.predict(sentence)
print("\nPrint sentence with predicted tags\n")
print(sentence.to_tagged_string())

2020-03-26 21:47:59,972 loading file C:\Users\DARKHO\.flair\models\pos-multi-v0.1.pt

Print sentence with predicted tags

Bob <PROPN> Marly <PROPN> visited <VERB> Texas. <PROPN> Er <ADV> ging <VERB> reiten <VERB>


#### Tagging with Pre-Trained Text Classification Models
Flair also provides some pre-trained text classification models. Details are given below.

| ID | Task | Training Dataset |
| --- | --- | --- |
| 'en-sentiment' | Detecting positive and negative sentiment in English | Movie reviews from IMDB |
| 'de-offensive-language' | Detecting offensive language in German | GermEval 2018 Task 1 |


Here, we take an example of detecting positive and negative reviews using the 'en-sentiment' model.

In [49]:
from flair.models import TextClassifier
classifier = TextClassifier.load('en-sentiment')
sentence = Sentence('This film doesnt make any sense. It is so bad that I am confused.')

classifier.predict(sentence)

# print sentence with predicted labels
print(sentence.labels)

2020-03-26 22:00:24,473 loading file C:\Users\DARKHO\.flair\models\imdb-v0.4.pt
[NEGATIVE (0.9980487823486328)]


#### Word Embeddings with Flair
Word embeddings are numerical vector representations of text, and the same text can have different numerical representations depending on the embedding used.
Flair provides interfaces that let us use and combine different word and document embeddings, such as Flair embeddings and BERT embeddings.

Classic word embeddings are static and word-level, meaning that each distinct word gets exactly one pre-computed embedding.

Contextual string embeddings are powerful embeddings that capture latent syntactic-semantic information that goes beyond standard word embeddings. The key differences are: (1) they are trained without any explicit notion of words and thus fundamentally model words as sequences of characters, and (2) they are contextualized by their surrounding text, meaning that the same word will have different embeddings depending on its contextual use.

Stacked embeddings are one of the most important concepts of this library. They combine different embeddings, for instance traditional embeddings together with contextual string embeddings. Stacked embeddings allow you to mix and match, and a combination of embeddings often gives the best results.

In [60]:
from flair.embeddings import WordEmbeddings, FlairEmbeddings, StackedEmbeddings
# classical word embedding
glove_embedding = WordEmbeddings('glove')

# Flair forward and backwards embeddings
flair_embedding_forward = FlairEmbeddings('news-forward')
flair_embedding_backward = FlairEmbeddings('news-backward')

# create a StackedEmbedding combining glove and forward/backward flair embeddings
stacked_embeddings = StackedEmbeddings([
                                        glove_embedding,
                                        flair_embedding_forward,
                                        flair_embedding_backward,
                                       ])
sentence = Sentence('The grass is green .')
stacked_embeddings.embed(sentence)
for token in sentence:
    print(token)
    print(token.embedding)

Token: 1 The
tensor([-0.0382, -0.2449,  0.7281,  ..., -0.0065, -0.0053,  0.0090])
Token: 2 grass
tensor([-0.8135,  0.9404, -0.2405,  ...,  0.0354, -0.0255, -0.0143])
Token: 3 is
tensor([-5.4264e-01,  4.1476e-01,  1.0322e+00,  ..., -5.3691e-04,
        -9.6750e-03, -2.7541e-02])
Token: 4 green
tensor([-0.6791,  0.3491, -0.2398,  ..., -0.0007, -0.1333,  0.0161])
Token: 5 .
tensor([-0.3398,  0.2094,  0.4635,  ...,  0.0005, -0.0177,  0.0032])
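Conceptually, stacking concatenates the vectors that each embedding produces for a token, so the stacked vector's length is the sum of the individual lengths. The sketch below illustrates this with plain Python lists. The lookup tables and their values are invented for the example, and `stack` is a hypothetical helper rather than a Flair function.

```python
# Hypothetical toy lookup tables standing in for two embedding types.
static_vectors = {"grass": [0.1, 0.2, 0.3], "green": [0.4, 0.5, 0.6]}
char_vectors = {"grass": [0.7, 0.8], "green": [0.9, 1.0]}

def stack(word, tables):
    """Concatenate the vectors that each table holds for the given word."""
    stacked = []
    for table in tables:
        stacked.extend(table[word])
    return stacked

stacked_grass = stack("grass", [static_vectors, char_vectors])

# A 3-dimensional and a 2-dimensional vector give a 5-dimensional one.
print(len(stacked_grass), stacked_grass)
```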


#### Training a Model (a Text Classifier or Sequence Labeling Model)
First, we need to load our training data. A collection of texts is called a Corpus, which is split into training, validation (dev) and test data. A Corpus can be used to train and test our model.

Flair provides a list of prepared datasets and automatically downloads and sets up the data the first time the corresponding constructor is called. An example of one such dataset is given below:

In [67]:
import flair.datasets
corpus = flair.datasets.UD_ENGLISH()
# print the number of Sentences in the train split
print(len(corpus.train))

# print the number of Sentences in the test split
print(len(corpus.test))

# print the number of Sentences in the dev split
print(len(corpus.dev))

2020-03-26 22:41:05,172 Reading data from C:\Users\DARKHO\.flair\datasets\ud_english
2020-03-26 22:41:05,173 Train: C:\Users\DARKHO\.flair\datasets\ud_english\en_ewt-ud-train.conllu
2020-03-26 22:41:05,174 Test: C:\Users\DARKHO\.flair\datasets\ud_english\en_ewt-ud-test.conllu
2020-03-26 22:41:05,175 Dev: C:\Users\DARKHO\.flair\datasets\ud_english\en_ewt-ud-dev.conllu
12543
2077
2002
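To make the idea of a corpus with three splits concrete, here is a small pure-Python sketch. `ToyCorpus` and `split_sentences` are hypothetical and not Flair classes, and the 80/10/10 ratio is an arbitrary choice for the example. The `downsample` helper keeps a random fraction of each split, similar in spirit to the downsample call used in the training section of this notebook.

```python
import random
from dataclasses import dataclass
from typing import List

@dataclass
class ToyCorpus:
    """Hypothetical container holding the three splits of a dataset."""
    train: List[str]
    dev: List[str]
    test: List[str]

    def downsample(self, fraction, seed=0):
        """Return a new corpus keeping a random fraction of each split."""
        rng = random.Random(seed)

        def keep(split):
            if not split:
                return []
            size = max(1, round(len(split) * fraction))
            return rng.sample(split, size)

        return ToyCorpus(keep(self.train), keep(self.dev), keep(self.test))

def split_sentences(sentences, dev_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle the sentences and cut them into train, dev and test parts."""
    shuffled = sentences[:]
    random.Random(seed).shuffle(shuffled)
    n_dev = round(len(shuffled) * dev_fraction)
    n_test = round(len(shuffled) * test_fraction)
    return ToyCorpus(
        train=shuffled[n_dev + n_test:],
        dev=shuffled[:n_dev],
        test=shuffled[n_dev:n_dev + n_test],
    )

sentences = [f"sentence {i}" for i in range(100)]
toy_corpus = split_sentences(sentences)
small_corpus = toy_corpus.downsample(0.1)

print(len(toy_corpus.train), len(toy_corpus.dev), len(toy_corpus.test))
print(len(small_corpus.train), len(small_corpus.dev), len(small_corpus.test))
```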


#### Reading Your Own Dataset
Instead of using a provided dataset, we can also load our own dataset in Flair and use it for training and testing.

We can load our own dataset using a ColumnCorpus object. Most sequence labeling datasets in NLP use some sort of column format in which each line is a word and each column is one level of linguistic annotation. For example, the first column can be the word, the second column the PoS tag and the third the NER tag. An empty line separates sentences. To read such a dataset, we have to define the column structure as a dictionary and instantiate a ColumnCorpus.

In [None]:
from flair.data import Corpus
from flair.datasets import ColumnCorpus

# define columns
columns = {0: 'text', 1: 'pos', 2: 'ner'}

# provide the data folder and the names of the train, test and validation (dev) files
data_folder = '/home/Users/Darakshan/IRS_spotlight/'
corpus: Corpus = ColumnCorpus(data_folder, columns,
                              train_file='train.txt',
                              test_file='test.txt',
                              dev_file='dev.txt')
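For reference, the sketch below shows what a file in this column format could look like and how the column dictionary maps each field to an annotation layer. The sample rows are invented and `read_column_data` is a hypothetical pure-Python reader. It is not how ColumnCorpus is implemented, only an illustration of the format.

```python
# Invented sample: one token per line, columns separated by whitespace,
# and an empty line between sentences.
sample = """\
Bob NNP B-PER
Marley NNP E-PER
visited VBD O
Texas NNP S-LOC

He PRP O
left VBD O
"""

# Same column structure as in the cell above: index -> annotation name
columns = {0: "text", 1: "pos", 2: "ner"}

def read_column_data(raw, columns):
    """Parse column-formatted text into a list of sentences.

    Each sentence is a list of dicts, one dict per token.
    """
    sentences, current = [], []
    for line in raw.splitlines():
        if not line.strip():
            # an empty line closes the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split()
        current.append({name: fields[i] for i, name in columns.items()})
    if current:
        sentences.append(current)
    return sentences

parsed = read_column_data(sample, columns)
print(len(parsed))
print(parsed[0][0])
```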

#### Training the Model
Below is an example of training a sequence labeling model for named entity recognition. For training we use one of the datasets provided by Flair, WNUT_17. It has six entity classes: Person, Location, Corporation, Product, Creative Work and Group. In our example, the dataset is downsampled to 10 percent of its size for faster computation. The GloVe embedding provided by Flair is used here.

We instantiate the model class, SequenceTagger, and pass it the embeddings to be used, the tag type of the model, the hidden layer size, the tag dictionary and a flag that enables a CRF layer (use_crf). The tag dictionary is built from the corpus using the make_tag_dictionary method. We pass 'ner' as the tag type because we want to train a named entity recognition sequence tagger.

The model and the corpus are then passed to a ModelTrainer, which trains the model with fixed hyperparameters. We can tune the model by changing hyperparameters such as the learning rate and the number of epochs.
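A tag dictionary is essentially a mapping from each distinct tag string in the corpus to an integer ID that the model can predict. The sketch below builds such a mapping from invented toy data. `build_tag_dictionary` is a hypothetical helper and does not reproduce Flair's make_tag_dictionary, whose printed result in this notebook also contains special entries such as <unk>, <START> and <STOP>.

```python
def build_tag_dictionary(tagged_sentences):
    """Assign an integer ID to every distinct tag, in order of appearance."""
    tag_to_id = {}
    for sentence in tagged_sentences:
        for _token, tag in sentence:
            if tag not in tag_to_id:
                tag_to_id[tag] = len(tag_to_id)
    return tag_to_id

# Invented toy data: each sentence is a list of (token, tag) pairs
toy_sentences = [
    [("Bob", "S-person"), ("visited", "O"), ("Texas", "S-location")],
    [("Apple", "S-corporation"), ("is", "O"), ("hiring", "O")],
]

tag_to_id = build_tag_dictionary(toy_sentences)
print(tag_to_id)
```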

In [81]:
from flair.data import Corpus
from flair.datasets import WNUT_17
from flair.embeddings import TokenEmbeddings, WordEmbeddings, StackedEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer
from typing import List
# loading the corpus
corpus: Corpus = WNUT_17().downsample(0.1)
print(corpus)
tag_type = 'ner'
# make the tag dictionary from the corpus
tag_dictionary = corpus.make_tag_dictionary(tag_type=tag_type)
print(tag_dictionary)
# initialize embeddings to be used 
embedding_types: List[TokenEmbeddings] = [

    WordEmbeddings('glove'),
]

embeddings: StackedEmbeddings = StackedEmbeddings(embeddings=embedding_types)

#initialize sequence tagger
tagger: SequenceTagger = SequenceTagger(hidden_size=256,
                                        embeddings=embeddings,
                                        tag_dictionary=tag_dictionary,
                                        tag_type=tag_type,
                                        use_crf=True)
#initialize trainer
trainer: ModelTrainer = ModelTrainer(tagger, corpus)
# start training
trainer.train('resources/taggers/example-ner',
              learning_rate=0.1,
              mini_batch_size=32,
              max_epochs=50)

2020-03-26 23:24:31,074 Reading data from C:\Users\DARKHO\.flair\datasets\wnut_17
2020-03-26 23:24:31,077 Train: C:\Users\DARKHO\.flair\datasets\wnut_17\wnut17train.conll
2020-03-26 23:24:31,080 Dev: C:\Users\DARKHO\.flair\datasets\wnut_17\emerging.dev.conll
2020-03-26 23:24:31,081 Test: C:\Users\DARKHO\.flair\datasets\wnut_17\emerging.test.annotated
Corpus: 339 train + 101 dev + 129 test sentences
Dictionary with 28 tags: <unk>, O, S-person, S-corporation, S-location, B-person, E-person, B-location, E-location, B-corporation, E-corporation, S-creative-work, B-creative-work, I-creative-work, E-creative-work, I-location, S-group, B-product, I-product, E-product, I-person, B-group, E-group, S-product, I-group, I-corporation, <START>, <STOP>
2020-03-26 23:24:35,685 ----------------------------------------------------------------------------------------------------
2020-03-26 23:24:35,686 Model: "SequenceTagger(
  (embeddings): StackedEmbeddings(
    (list_embedding_0): WordEmbeddings('glo

2020-03-26 23:25:23,730 epoch 5 - iter 2/11 - loss 5.98095655 - samples/sec: 45.32
2020-03-26 23:25:24,336 epoch 5 - iter 3/11 - loss 6.34879907 - samples/sec: 63.79
2020-03-26 23:25:24,932 epoch 5 - iter 4/11 - loss 5.93334746 - samples/sec: 63.16
2020-03-26 23:25:25,577 epoch 5 - iter 5/11 - loss 5.54926205 - samples/sec: 57.19
2020-03-26 23:25:26,189 epoch 5 - iter 6/11 - loss 5.47720019 - samples/sec: 61.23
2020-03-26 23:25:27,039 epoch 5 - iter 7/11 - loss 5.27363477 - samples/sec: 43.89
2020-03-26 23:25:27,608 epoch 5 - iter 8/11 - loss 5.20725775 - samples/sec: 67.41
2020-03-26 23:25:28,203 epoch 5 - iter 9/11 - loss 5.37558900 - samples/sec: 63.28
2020-03-26 23:25:28,846 epoch 5 - iter 10/11 - loss 5.55969872 - samples/sec: 58.55
2020-03-26 23:25:29,309 epoch 5 - iter 11/11 - loss 5.65306412 - samples/sec: 86.25
2020-03-26 23:25:29,408 ----------------------------------------------------------------------------------------------------
2020-03-26 23:25:29,410 EPOCH 5 done: loss 

2020-03-26 23:26:35,526 epoch 11 - iter 2/11 - loss 4.38022470 - samples/sec: 52.17
2020-03-26 23:26:36,363 epoch 11 - iter 3/11 - loss 5.07503859 - samples/sec: 43.30
2020-03-26 23:26:37,146 epoch 11 - iter 4/11 - loss 4.58468968 - samples/sec: 46.91
2020-03-26 23:26:37,746 epoch 11 - iter 5/11 - loss 4.08287988 - samples/sec: 63.66
2020-03-26 23:26:38,523 epoch 11 - iter 6/11 - loss 4.30796182 - samples/sec: 47.12
2020-03-26 23:26:39,290 epoch 11 - iter 7/11 - loss 4.67478858 - samples/sec: 47.53
2020-03-26 23:26:40,042 epoch 11 - iter 8/11 - loss 4.85528252 - samples/sec: 49.06
2020-03-26 23:26:40,678 epoch 11 - iter 9/11 - loss 5.03357991 - samples/sec: 58.55
2020-03-26 23:26:41,302 epoch 11 - iter 10/11 - loss 4.95337565 - samples/sec: 59.75
2020-03-26 23:26:41,833 epoch 11 - iter 11/11 - loss 5.01884762 - samples/sec: 72.63
2020-03-26 23:26:41,925 ----------------------------------------------------------------------------------------------------
2020-03-26 23:26:41,927 EPOCH 11 

2020-03-26 23:27:44,868 epoch 17 - iter 2/11 - loss 5.96031761 - samples/sec: 51.92
2020-03-26 23:27:45,565 epoch 17 - iter 3/11 - loss 5.84162712 - samples/sec: 53.57
2020-03-26 23:27:46,142 epoch 17 - iter 4/11 - loss 5.65354919 - samples/sec: 65.35
2020-03-26 23:27:46,720 epoch 17 - iter 5/11 - loss 5.20449018 - samples/sec: 65.48
2020-03-26 23:27:47,260 epoch 17 - iter 6/11 - loss 4.93186577 - samples/sec: 70.83
2020-03-26 23:27:47,852 epoch 17 - iter 7/11 - loss 4.99751588 - samples/sec: 63.79
2020-03-26 23:27:48,468 epoch 17 - iter 8/11 - loss 4.95588726 - samples/sec: 60.77
2020-03-26 23:27:49,110 epoch 17 - iter 9/11 - loss 4.78222182 - samples/sec: 58.23
2020-03-26 23:27:49,936 epoch 17 - iter 10/11 - loss 4.66078951 - samples/sec: 43.71
2020-03-26 23:27:50,610 epoch 17 - iter 11/11 - loss 4.85121053 - samples/sec: 55.51
2020-03-26 23:27:50,711 ----------------------------------------------------------------------------------------------------
2020-03-26 23:27:50,713 EPOCH 17 

2020-03-26 23:29:00,707 epoch 23 - iter 1/11 - loss 4.41720009 - samples/sec: 58.44
2020-03-26 23:29:01,370 epoch 23 - iter 2/11 - loss 4.76208043 - samples/sec: 56.79
2020-03-26 23:29:02,051 epoch 23 - iter 3/11 - loss 5.32447704 - samples/sec: 53.65
2020-03-26 23:29:02,637 epoch 23 - iter 4/11 - loss 5.36186349 - samples/sec: 64.69
2020-03-26 23:29:03,303 epoch 23 - iter 5/11 - loss 5.42072172 - samples/sec: 55.42
2020-03-26 23:29:03,952 epoch 23 - iter 6/11 - loss 5.11152105 - samples/sec: 57.19
2020-03-26 23:29:04,526 epoch 23 - iter 7/11 - loss 5.33630831 - samples/sec: 65.48
2020-03-26 23:29:05,156 epoch 23 - iter 8/11 - loss 5.13060927 - samples/sec: 58.76
2020-03-26 23:29:05,846 epoch 23 - iter 9/11 - loss 4.81907543 - samples/sec: 53.12
2020-03-26 23:29:06,535 epoch 23 - iter 10/11 - loss 4.61036334 - samples/sec: 53.83
2020-03-26 23:29:06,933 epoch 23 - iter 11/11 - loss 4.71408289 - samples/sec: 102.51
2020-03-26 23:29:07,023 -------------------------------------------------

2020-03-26 23:30:09,674 epoch 29 - iter 1/11 - loss 3.06639290 - samples/sec: 67.98
2020-03-26 23:30:10,299 epoch 29 - iter 2/11 - loss 3.69753551 - samples/sec: 59.09
2020-03-26 23:30:10,933 epoch 29 - iter 3/11 - loss 4.16075579 - samples/sec: 59.20
2020-03-26 23:30:11,518 epoch 29 - iter 4/11 - loss 3.63665193 - samples/sec: 64.56
2020-03-26 23:30:12,139 epoch 29 - iter 5/11 - loss 4.37444320 - samples/sec: 59.86
2020-03-26 23:30:12,827 epoch 29 - iter 6/11 - loss 4.52557512 - samples/sec: 53.12
2020-03-26 23:30:13,476 epoch 29 - iter 7/11 - loss 4.39445301 - samples/sec: 57.30
2020-03-26 23:30:14,141 epoch 29 - iter 8/11 - loss 4.45169255 - samples/sec: 55.70
2020-03-26 23:30:14,812 epoch 29 - iter 9/11 - loss 4.68458300 - samples/sec: 55.70
2020-03-26 23:30:15,537 epoch 29 - iter 10/11 - loss 4.77451117 - samples/sec: 50.85
2020-03-26 23:30:16,085 epoch 29 - iter 11/11 - loss 4.64633879 - samples/sec: 69.45
2020-03-26 23:30:16,171 --------------------------------------------------

2020-03-26 23:31:25,069 epoch 35 - iter 1/11 - loss 2.90722179 - samples/sec: 45.25
2020-03-26 23:31:25,818 epoch 35 - iter 2/11 - loss 3.51498508 - samples/sec: 49.29
2020-03-26 23:31:26,508 epoch 35 - iter 3/11 - loss 4.17668851 - samples/sec: 53.57
2020-03-26 23:31:27,195 epoch 35 - iter 4/11 - loss 4.49413133 - samples/sec: 54.11
2020-03-26 23:31:27,848 epoch 35 - iter 5/11 - loss 4.72742071 - samples/sec: 56.89
2020-03-26 23:31:28,571 epoch 35 - iter 6/11 - loss 4.77602228 - samples/sec: 50.45
2020-03-26 23:31:29,304 epoch 35 - iter 7/11 - loss 5.12288386 - samples/sec: 49.59
2020-03-26 23:31:29,982 epoch 35 - iter 8/11 - loss 4.77089012 - samples/sec: 54.02
2020-03-26 23:31:30,654 epoch 35 - iter 9/11 - loss 4.86593676 - samples/sec: 54.85
2020-03-26 23:31:31,303 epoch 35 - iter 10/11 - loss 4.76158216 - samples/sec: 57.50
2020-03-26 23:31:31,802 epoch 35 - iter 11/11 - loss 4.59224051 - samples/sec: 79.42
2020-03-26 23:31:31,895 --------------------------------------------------

2020-03-26 23:32:36,594 epoch 41 - iter 1/11 - loss 3.82429576 - samples/sec: 57.24
2020-03-26 23:32:37,760 epoch 41 - iter 2/11 - loss 4.69402993 - samples/sec: 30.71
2020-03-26 23:32:38,667 epoch 41 - iter 3/11 - loss 4.69954848 - samples/sec: 39.68
2020-03-26 23:32:39,653 epoch 41 - iter 4/11 - loss 4.45952392 - samples/sec: 36.37
2020-03-26 23:32:40,274 epoch 41 - iter 5/11 - loss 4.55916910 - samples/sec: 61.06
2020-03-26 23:32:40,904 epoch 41 - iter 6/11 - loss 4.71630939 - samples/sec: 59.86
2020-03-26 23:32:41,485 epoch 41 - iter 7/11 - loss 4.70949759 - samples/sec: 65.48
2020-03-26 23:32:42,076 epoch 41 - iter 8/11 - loss 4.79695046 - samples/sec: 64.30
2020-03-26 23:32:42,799 epoch 41 - iter 9/11 - loss 4.84136746 - samples/sec: 50.77
2020-03-26 23:32:43,415 epoch 41 - iter 10/11 - loss 4.64022799 - samples/sec: 61.58
2020-03-26 23:32:43,918 epoch 41 - iter 11/11 - loss 4.88631569 - samples/sec: 78.26
2020-03-26 23:32:44,010 --------------------------------------------------

{'test_score': 0.0,
 'dev_score_history': [0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0,
  0.0],
 'train_loss_history': [18.616629340431906,
  6.977936441248113,
  6.103660691868175,
  5.9199065728621045,
  5.653064120899547,
  5.406765482642434,
  5.209765672683716,
  5.154167803851041,
  5.020516308871183,
  5.193517945029519,
  5.018847617236051,
  4.947862300005826,
  5.008049639788541,
  4.682003498077393,
  4.6183999668468125,
  4.758621801029552,
  4.851210529153997,
  4.823190732435747,
  4.818995757536455,
  4.657577211206609,
  4.733274709094655,
  4.691777023402127,
  4.714082891290838,
  4.77879296649586,
  4.851414507085627,
  4.7708283771168105,
  4.7026496367021045,
  4.798913067037409,
  4.646338787945834,
  4.818568749861284,
  4.55134151198

In [84]:
model = SequenceTagger.load('resources/taggers/example-ner/final-model.pt')
sentence = Sentence('Bob Marley visited Texas')
model.predict(sentence)
print(sentence.to_tagged_string())

2020-03-26 23:34:22,497 loading file resources/taggers/example-ner/final-model.pt
Bob <B-PER> Marley <E-PER> visited Texas <S-LOC>


#### References
1. FLAIR: An Easy-to-Use Framework for State-of-the-Art NLP: https://www.aclweb.org/anthology/N19-4010/
2. Introduction to Flair for NLP: A Simple yet Powerful State-of-the-Art NLP Library: https://www.analyticsvidhya.com/blog/2019/02/flair-nlp-library-python/
3. Flair Tutorial: https://github.com/flairNLP/flair/blob/master/resources/docs/TUTORIAL_3_WORD_EMBEDDING.md
4. WNUT dataset: Leon Derczynski, Eric Nichols, Marieke van Erp, Nut Limsopatham (2017) “Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition”, in Proceedings of the 3rd Workshop on Noisy, User-generated Text.