# NLP Part of Speech Tags with Flair
* Notebook by Adam Lang
* Date: 8/8/2024

# OVERVIEW
* In this notebook we experiment with the Flair NLP library for part-of-speech (POS) tagging.
* Flair is an NLP library built on top of PyTorch. Like spaCy, NLTK, and TextBlob, it works "out of the box", but it also lets users apply contextual word embeddings for semantic understanding.

In [3]:
!pip install flair

Collecting flair
  Downloading flair-0.14.0-py3-none-any.whl.metadata (12 kB)
Collecting boto3>=1.20.27 (from flair)
  Downloading boto3-1.34.157-py3-none-any.whl.metadata (6.6 kB)
Collecting conllu<5.0.0,>=4.0 (from flair)
  Downloading conllu-4.5.3-py2.py3-none-any.whl.metadata (19 kB)
Collecting deprecated>=1.2.13 (from flair)
  Downloading Deprecated-1.2.14-py2.py3-none-any.whl.metadata (5.4 kB)
Collecting ftfy>=6.1.0 (from flair)
  Downloading ftfy-6.2.3-py3-none-any.whl.metadata (7.8 kB)
Collecting langdetect>=1.0.9 (from flair)
  Downloading langdetect-1.0.9.tar.gz (981 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 981.5/981.5 kB 15.6 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Collecting mpld3>=0.3 (from flair)
  Downloading mpld3-0.5.10-py3-none-any.whl.metadata (5.1 kB)
Collecting pptree>=3.1 (from flair)
  Downloading pptree-3.1.tar.gz (3.0 kB)
  Preparing metadata (setup.py) ... done
Col

In [4]:
#imports
from flair.models import SequenceTagger
from flair.data import Sentence

## Load a Flair model
* We can load the `pos-english` model directly from the Hugging Face Hub, as seen here: https://huggingface.co/flair/pos-english
* Numerous other models are listed in the Flair GitHub repository: https://github.com/flairNLP/flair/tree/master?tab=readme-ov-file

Note: Depending on your environment, you may be asked for a Hugging Face token when loading the model this way; public models such as this one can normally be downloaded without one.

In [5]:
# load the Flair POS tagger from the Hugging Face Hub
tagger = SequenceTagger.load('flair/pos-english')

pytorch_model.bin:   0%|          | 0.00/249M [00:00<?, ?B/s]

2024-08-08 20:33:15,128 SequenceTagger predicts: Dictionary with 53 tags: <unk>, O, UH, ,, VBD, PRP, VB, PRP$, NN, RB, ., DT, JJ, VBP, VBG, IN, CD, NNS, NNP, WRB, VBZ, WDT, CC, TO, MD, VBN, WP, :, RP, EX, JJR, FW, XX, HYPH, POS, RBR, JJS, PDT, NNPS, RBS, AFX, WP$, -LRB-, -RRB-, ``, '', LS, $, SYM, ADD
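The tag names in the dictionary above follow the Penn Treebank convention, which can be hard to read at a glance. As a small convenience, here is a sketch of a lookup table covering only a handful of the listed tags. The names `TAG_GLOSSARY` and `describe` are hypothetical helpers written for this notebook; they are not part of the Flair API.

```python
# A small glossary for a few of the Penn Treebank-style tags listed above.
# TAG_GLOSSARY and describe() are illustrative helpers, not Flair functions.
TAG_GLOSSARY = {
    "PRP": "personal pronoun",
    "PRP$": "possessive pronoun",
    "VBD": "verb, past tense",
    "VBP": "verb, non-3rd person singular present",
    "VB": "verb, base form",
    "NN": "noun, singular or mass",
    "WRB": "wh-adverb",
    ".": "sentence-final punctuation",
}


def describe(tag: str) -> str:
    """Return a readable description, or a fallback for tags not in the glossary."""
    return TAG_GLOSSARY.get(tag, f"unknown tag: {tag}")


# Print a few lookups, including one tag that the glossary does not cover.
for tag in ["PRP", "VBD", "NN", "XX"]:
    print(f"{tag:5} {describe(tag)}")
```

The glossary is deliberately partial; tags it does not cover fall through to the fallback string rather than raising an error.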


In [6]:
# example user query (hard-coded here in place of real user input)
user_input = "I lost my debit card. How do I report it?"

In [8]:
# Create a sentence object
sentence = Sentence(user_input)

# predict POS tags
tagger.predict(sentence)

# print each token with its predicted POS tag and confidence score
for token in sentence:
    print(token)

Token[0]: "I" → PRP (1.0000)
Token[1]: "lost" → VBD (1.0000)
Token[2]: "my" → PRP$ (1.0000)
Token[3]: "debit" → NN (1.0000)
Token[4]: "card" → NN (1.0000)
Token[5]: "." → . (0.9342)
Token[6]: "How" → WRB (1.0000)
Token[7]: "do" → VBP (0.9898)
Token[8]: "I" → PRP (1.0000)
Token[9]: "report" → VB (1.0000)
Token[10]: "it" → PRP (1.0000)
Token[11]: "?" → . (0.9999)
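Once the tags are available, a common next step is to filter tokens by tag or by confidence. The sketch below does this on plain `(text, tag, confidence)` triples copied by hand from the printed output above, so it runs without Flair. The helper names and the 0.99 cutoff are assumptions chosen for illustration. With a live `Sentence`, the same triples could presumably be built from each token's label (for example via `token.get_label(...)`), but treat that as an assumption to check against the Flair documentation.

```python
# (text, tag, confidence) triples copied from the printed output above.
tagged = [
    ("I", "PRP", 1.0000),
    ("lost", "VBD", 1.0000),
    ("my", "PRP$", 1.0000),
    ("debit", "NN", 1.0000),
    ("card", "NN", 1.0000),
    (".", ".", 0.9342),
    ("How", "WRB", 1.0000),
    ("do", "VBP", 0.9898),
    ("I", "PRP", 1.0000),
    ("report", "VB", 1.0000),
    ("it", "PRP", 1.0000),
    ("?", ".", 0.9999),
]


def words_with_tag_prefix(triples, prefixes):
    """Return the words whose tag starts with any of the given prefixes."""
    return [text for text, tag, _ in triples if tag.startswith(tuple(prefixes))]


def below_confidence(triples, threshold):
    """Return the triples whose confidence falls below the threshold."""
    return [t for t in triples if t[2] < threshold]


nouns = words_with_tag_prefix(tagged, ["NN"])
verbs = words_with_tag_prefix(tagged, ["VB"])
uncertain = below_confidence(tagged, 0.99)  # 0.99 is an arbitrary cutoff

print("nouns:", nouns)
print("verbs:", verbs)
print("uncertain:", uncertain)
```

Matching on a tag prefix groups related tags together, so `"VB"` picks up `VBD`, `VBP`, and `VB` in one pass.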


## Using Embeddings with Flair
* Flair supports numerous embedding models; see the documentation: https://github.com/flairNLP/flair/tree/master/resources/docs/embeddings
* Below we demo transformer document embeddings, using `roberta-base`.

In [13]:
from flair.embeddings import TransformerDocumentEmbeddings

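# initialize a document-level embedding backed by roberta-base
embedding = TransformerDocumentEmbeddings('roberta-base')

# create a sentence
sentence = Sentence('I lost my debit card. How do I report it?')

# compute a single embedding vector for the whole sentence (document-level, not per word)
embedding.embed(sentence)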

[Sentence[12]: "I lost my debit card. How do I report it?"]
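A document embedding is a single vector per sentence, and one typical use is comparing sentences by cosine similarity. The sketch below illustrates the arithmetic on made-up 4-dimensional vectors in plain Python. The vectors and the `cosine_similarity` helper are hypothetical and exist only for this example. In practice the vector would come from the embedded sentence (for example via `sentence.get_embedding()`) and would have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up toy vectors standing in for document embeddings of three sentences.
lost_card = [0.20, 0.70, 0.10, 0.50]
stolen_card = [0.25, 0.65, 0.05, 0.55]
weather = [0.90, -0.10, 0.40, -0.30]

print("lost vs stolen :", round(cosine_similarity(lost_card, stolen_card), 4))
print("lost vs weather:", round(cosine_similarity(lost_card, weather), 4))
```

Values close to 1 indicate vectors pointing in nearly the same direction; values near 0 or below indicate unrelated or opposing directions.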

In [14]:
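# predict POS tags again; the tagger relies on its own internal embeddings,
# so the document embedding computed above does not change the result
tagger.predict(sentence)

for token in sentence:
    print(token)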

Token[0]: "I" → PRP (1.0000)
Token[1]: "lost" → VBD (1.0000)
Token[2]: "my" → PRP$ (1.0000)
Token[3]: "debit" → NN (1.0000)
Token[4]: "card" → NN (1.0000)
Token[5]: "." → . (0.9342)
Token[6]: "How" → WRB (1.0000)
Token[7]: "do" → VBP (0.9898)
Token[8]: "I" → PRP (1.0000)
Token[9]: "report" → VB (1.0000)
Token[10]: "it" → PRP (1.0000)
Token[11]: "?" → . (0.9999)


# References
* Flair GitHub: https://github.com/flairNLP/flair/tree/master?tab=readme-ov-file
* Flair configs: https://github.com/flairNLP/flair/blob/master/resources/docs/EXPERIMENTS.md
* Pandey (2023). The Power of NLP with Flair: A Comprehensive Guide and Comparison with other libraries. https://medium.com/@pankaj_pandey/the-power-of-nlp-with-flair-a-comprehensive-guide-and-comparison-with-other-libraries-d99875595396