<a href="https://colab.research.google.com/github/souparnabose99/flair-sentiment-analysis/blob/main/Flair_Sentiment_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Installing Flair:

Flair is a simple natural language processing (NLP) library developed and open-sourced by Zalando Research. It builds directly on PyTorch, a widely used deep learning framework. The Zalando Research team has also released several pre-trained models, and the library supports the following NLP tasks:

* Named-Entity Recognition (NER): Recognises whether a word represents a person, a location, or another named entity in the text.
* Part-of-Speech Tagging (PoS): Tags each word in the given text with the part of speech it belongs to.
* Text Classification: Assigns the text to one of a set of predefined labels.
* Training Custom Models: Training our own models on custom data.

In [4]:
!pip install flair

Collecting flair
  Downloading https://files.pythonhosted.org/packages/f0/3a/1b46a0220d6176b22bcb9336619d1731301bc2c75fa926a9ef953e6e4d58/flair-0.8.0.post1-py3-none-any.whl (284kB)
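
The install log shows that the outputs in this notebook come from flair 0.8.0.post1, and results may differ with other releases. A minimal sketch for checking which version is present in the current environment follows; `installed_version` is a hypothetical helper name, and it relies only on the standard-library `importlib.metadata` module (Python 3.8 or later).

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version if flair is installed, otherwise "flair: None".
print("flair:", installed_version("flair"))
```

If the reported version differs, pinning the install (for example `!pip install flair==0.8.0.post1`) is one way to try to reproduce the outputs shown here.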

Steps:
* Initialize the model
* Tokenize the data
* Run the model on the tokenized data
* Format the outputs
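
The four steps above can be sketched as one helper. This is a minimal sketch and not part of the original notebook: `classify_text` and its `make_sentence` parameter are hypothetical names. It assumes only that the classifier exposes `predict(sentence)` and attaches its labels to the sentence in place, which is the behaviour of Flair's `TextClassifier` seen in the cells that follow. Passing the sentence constructor in as an argument keeps the helper importable even where Flair is not installed.

```python
def classify_text(text, model, make_sentence):
    """Run tokenize -> predict -> format on one string.

    model         -- an already loaded classifier exposing predict(sentence) (step 1)
    make_sentence -- callable that tokenizes a string, e.g. flair.data.Sentence
    Returns a (label, score) tuple for the top label.
    """
    sentence = make_sentence(text)   # step 2: tokenize the text
    model.predict(sentence)          # step 3: labels are attached in place
    top = sentence.labels[0]         # step 4: pull out the top label
    return top.value, float(top.score)
```

With the objects created later in this notebook, the call would look like `classify_text("I love to watch football!!", model, flair.data.Sentence)`.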

### Initializing the model:

In [6]:
import flair

model = flair.models.TextClassifier.load("en-sentiment")

2021-07-08 05:43:32,407 https://nlp.informatik.hu-berlin.de/resources/models/sentiment-curated-distilbert/sentiment-en-mix-distillbert_4.pt not found in cache, downloading to /tmp/tmp4iggsh8l
100%|██████████| 265512723/265512723 [00:30<00:00, 8782336.47B/s]
2021-07-08 05:44:03,487 copying /tmp/tmp4iggsh8l to cache at /root/.flair/models/sentiment-en-mix-distillbert_4.pt
2021-07-08 05:44:04,036 removing temp file /tmp/tmp4iggsh8l
2021-07-08 05:44:04,075 loading file /root/.flair/models/sentiment-en-mix-distillbert_4.pt


The en-sentiment model loaded here is based on DistilBERT, as the checkpoint name in the download log indicates.

### Tokenizing Text Data:

In [7]:
text = "I love to watch football"

sentence = flair.data.Sentence(text)
sentence

Sentence: "I love to watch football"   [− Tokens: 5]

In [8]:
text = "I love to watch football!!"

sentence = flair.data.Sentence(text)
sentence

Sentence: "I love to watch football ! !"   [− Tokens: 7]

In [9]:
sentence.to_tokenized_string()

'I love to watch football ! !'
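
To see exactly where the tokenizer split the text, the tokens can also be listed one by one. The following is a minimal sketch: `token_texts` is a hypothetical helper, and it assumes only that the sentence is iterable and that each token exposes a `.text` attribute, which is how Flair's `Sentence` and `Token` objects behave.

```python
def token_texts(sentence):
    """Return the surface text of every token in the sentence, in order."""
    return [token.text for token in sentence]
```

For the `sentence` above, `token_texts(sentence)` should list the same seven tokens that `to_tokenized_string()` joins with spaces, with each `!` as its own token.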

### Running the Model:

In [10]:
model.predict(sentence)
# predict() attaches its prediction to the Sentence object in place
sentence

Sentence: "I love to watch football ! !"   [− Tokens: 7  − Sentence-Labels: {'label': [POSITIVE (0.9933)]}]

In [11]:
text_2 = "Everone deserves an opportunity in life"
text_3 = "I don't like eating raw eggs."

sentence_2 = flair.data.Sentence(text_2)
sentence_3 = flair.data.Sentence(text_3)

print(sentence_2)
print(sentence_3)

Sentence: "Everone deserves an opportunity in life"   [− Tokens: 6]
Sentence: "I do n't like eating raw eggs ."   [− Tokens: 8]


In [12]:
model.predict(sentence_2)
model.predict(sentence_3)

print(sentence_2)
print(sentence_3)

Sentence: "Everone deserves an opportunity in life"   [− Tokens: 6  − Sentence-Labels: {'label': [POSITIVE (0.9984)]}]
Sentence: "I do n't like eating raw eggs ."   [− Tokens: 8  − Sentence-Labels: {'label': [NEGATIVE (0.9973)]}]
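
Calling `predict` once per sentence works, but `predict` also accepts a list of sentences, so several texts can be scored in a single call. The sketch below shows one way to do that; `classify_many` and `make_sentence` are hypothetical names, with `make_sentence` standing in for `flair.data.Sentence`.

```python
def classify_many(texts, model, make_sentence):
    """Score several strings with a single predict() call.

    Returns a list of (text, label, score) tuples in input order.
    """
    sentences = [make_sentence(text) for text in texts]
    if not sentences:
        return []
    model.predict(sentences)  # labels are attached to each sentence in place
    results = []
    for text, sentence in zip(texts, sentences):
        top = sentence.labels[0]
        results.append((text, top.value, float(top.score)))
    return results
```

With the notebook's objects, `classify_many([text_2, text_3], model, flair.data.Sentence)` would be expected to return the same labels as the two separate calls above.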


In [13]:
text_4 = "We work on daily basis"

sentence_4 = flair.data.Sentence(text_4)
print(sentence_4)

model.predict(sentence_4)
print(sentence_4)

Sentence: "We work on daily basis"   [− Tokens: 5]
Sentence: "We work on daily basis"   [− Tokens: 5  − Sentence-Labels: {'label': [POSITIVE (0.9936)]}]


### Output Formatting:

In [14]:
sentence_4.get_labels()

[POSITIVE (0.9936)]

In [15]:
sentence_4.get_labels()[0]

POSITIVE (0.9936)

In [16]:
type(sentence_4.get_labels()[0])

flair.data.Label

In [18]:
sentence_4.get_labels()[0].score

0.9935649037361145

In [19]:
sentence_4.get_labels()[0].value

'POSITIVE'

In [21]:
sentence_4.labels[0].score, sentence_4.labels[0].value

(0.9935649037361145, 'POSITIVE')
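
For downstream use it is often convenient to turn a label into plain Python data. The sketch below is one possible way to do so, not something Flair provides: `label_to_record` and `signed_score` are hypothetical helpers, and `signed_score` assumes the only label values are POSITIVE and NEGATIVE, the two seen in the outputs above.

```python
def label_to_record(label):
    """Convert a label object (with .value and .score) into a plain dict."""
    return {"label": label.value, "score": float(label.score)}


def signed_score(label):
    """Fold label and confidence into one number in [-1, 1].

    POSITIVE keeps the score as is, NEGATIVE flips its sign.
    Any other label value is rejected rather than guessed at.
    """
    if label.value == "POSITIVE":
        return float(label.score)
    if label.value == "NEGATIVE":
        return -float(label.score)
    raise ValueError(f"unexpected label value: {label.value!r}")
```

Applied to `sentence_4.labels[0]`, `label_to_record` would produce a dict holding the same value and score as the tuple above.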