# TweetNLP Introduction
This Colab notebook gives a short introduction to [`tweetnlp`](https://github.com/cardiffnlp/tweetnlp), a Python library of NLP models for tweets. In this tutorial, we cover the following applications on tweets:
- [Text Classification](https://colab.research.google.com/drive/1KLMaGFLmbXWeM9eWIYgGkRZS0d85RJLu#scrollTo=KAZYjeskBqL4&line=1&uniqifier=1): Sentiment/Hate/Irony/Emoji/Emotion, etc
- [Information Extraction](https://colab.research.google.com/drive/1KLMaGFLmbXWeM9eWIYgGkRZS0d85RJLu#scrollTo=WeREiLEjBlrj&line=1&uniqifier=1): Named Entity Recognition (NER)
- [Language Modeling](https://colab.research.google.com/drive/1KLMaGFLmbXWeM9eWIYgGkRZS0d85RJLu#scrollTo=COOoZHVAFCIG&line=1&uniqifier=1): Masked token prediction


## Installation
TweetNLP can be installed with pip or from source.


In [None]:
# Fix Colab Error
!pip install --upgrade google-cloud-storage

In [None]:
# via pip
!pip install tweetnlp

In [None]:
# # via source 
# !git clone https://github.com/cardiffnlp/tweetnlp
# %cd tweetnlp
# !pip install . -U

In [None]:
! pip list | grep tweetnlp

tweetnlp                      0.0.8


All you need to do is import `tweetnlp`:

In [None]:
import tweetnlp

## Tweet/Sentence Classification
The classification module consists of six different tasks (Sentiment Analysis, Irony Detection, Hate Detection, Offensive Detection, Emoji Prediction, and Emotion Analysis).
In each example, the model is instantiated with `tweetnlp.load("task-name")`, and predictions are obtained by passing a text or a list of texts.
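
Several of the multi-input examples in this notebook pass a `batch_size` argument alongside the list of texts. A plausible reading, assumed here rather than taken from the library's source, is that the list is split into consecutive chunks of at most `batch_size` texts that are processed one chunk at a time. The helper `chunk` below is a hypothetical illustration of that splitting, not part of `tweetnlp`.

```python
def chunk(texts, batch_size):
    """Yield consecutive slices of `texts` holding at most `batch_size` items."""
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


# Placeholder strings standing in for real tweets.
tweets = ["tweet a", "tweet b", "tweet c", "tweet d", "tweet e"]

# Five texts with batch_size=2 give two full batches and one leftover batch.
batches = list(chunk(tweets, batch_size=2))
print(batches)
```

Under this assumption, a larger `batch_size` means fewer, larger chunks, which usually trades memory for speed.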

### Topic Classification
The aim of this task is to assign topics to a tweet based on its content. The task is framed as a supervised multi-label classification problem where each tweet is assigned one or more topics from a total of 19 available topics. The topics were curated based on Twitter trends with the aim of being broad and general, and include classes such as arts and culture, music, or sports. Our internally annotated dataset contains over 10K manually labeled tweets.

In [None]:
model = tweetnlp.load('topic_classification')  # Or `model = tweetnlp.TopicClassification()`
model.topic("Jacob Collier is a Grammy-awarded English artist from London.")  # Or `model.predict`

{'label': ['celebrity_&_pop_culture', 'music'],
 'probability': {'arts_&_culture': 0.26981237530708313,
  'business_&_entrepreneurs': 0.01331131812185049,
  'celebrity_&_pop_culture': 0.9566839933395386,
  'diaries_&_daily_life': 0.021030567586421967,
  'family': 0.011442456394433975,
  'fashion_&_style': 0.06922706216573715,
  'film_tv_&_video': 0.14880865812301636,
  'fitness_&_health': 0.019434381276369095,
  'food_&_dining': 0.008309685625135899,
  'gaming': 0.00622565159574151,
  'learning_&_educational': 0.015360673889517784,
  'music': 0.9405961632728577,
  'news_&_social_concern': 0.4283846616744995,
  'other_hobbies': 0.023135246708989143,
  'relationships': 0.014804222621023655,
  'science_&_technology': 0.008933892473578453,
  'sports': 0.006143193691968918,
  'travel_&_adventure': 0.01694614253938198,
  'youth_&_student_life': 0.008365693502128124}}
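
Because the task is multi-label, the `label` list can be read as the set of topics whose probability clears a cut-off. The sketch below recovers the labels shown above from a few of the `probability` entries, assuming a cut-off of 0.5. Both the threshold value and the helper `labels_above` are illustrative assumptions, not confirmed library behaviour.

```python
# A few probabilities copied (rounded) from the output above.
probabilities = {
    "arts_&_culture": 0.2698,
    "celebrity_&_pop_culture": 0.9567,
    "film_tv_&_video": 0.1488,
    "music": 0.9406,
    "news_&_social_concern": 0.4284,
}


def labels_above(probs, threshold=0.5):
    """Return every topic whose probability is at least `threshold`, sorted by name."""
    return sorted(topic for topic, p in probs.items() if p >= threshold)


selected = labels_above(probabilities)                # assumed cut-off of 0.5
strict = labels_above(probabilities, threshold=0.95)  # stricter, hypothetical cut-off
print(selected)
print(strict)
```

Raising the cut-off keeps only the most confident topics, while lowering it would also admit borderline ones such as `news_&_social_concern`.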

In [None]:
# multiple inputs
model.topic(
    ["Yes, including Medicare and social security saving👍", "How many more days until opening day? 😩"],
    batch_size=2)  # Or `model.predict`

[{'label': ['news_&_social_concern'],
  'probability': {'arts_&_culture': 0.005358295049518347,
   'business_&_entrepreneurs': 0.03292378783226013,
   'celebrity_&_pop_culture': 0.005184014327824116,
   'diaries_&_daily_life': 0.11234606057405472,
   'family': 0.0035518307704478502,
   'fashion_&_style': 0.0009627862018533051,
   'film_tv_&_video': 0.007193188648670912,
   'fitness_&_health': 0.005338551942259073,
   'food_&_dining': 0.0016663989517837763,
   'gaming': 0.0020337889436632395,
   'learning_&_educational': 0.007214653305709362,
   'music': 0.0034923008643090725,
   'news_&_social_concern': 0.9829286336898804,
   'other_hobbies': 0.024717630818486214,
   'relationships': 0.003034521359950304,
   'science_&_technology': 0.01635609194636345,
   'sports': 0.0013534919125959277,
   'travel_&_adventure': 0.0031978609040379524,
   'youth_&_student_life': 0.0037030214443802834}},
 {'label': ['diaries_&_daily_life'],
  'probability': {'arts_&_culture': 0.020732583478093147,
   'bu

### Sentiment
The sentiment analysis task integrated in TweetNLP is a simplified version in which the goal is to predict the sentiment of a tweet as one of three labels: positive, neutral, or negative. The base dataset for English is the unified TweetEval version of the SemEval-2017 dataset from the task on Sentiment Analysis in Twitter.

In [None]:
# single input
model = tweetnlp.load('sentiment')  # Or `model = tweetnlp.Sentiment()` 
model.sentiment("Yes, including Medicare and social security saving👍")  # Or `model.predict`

Some weights of the model checkpoint at cardiffnlp/twitter-roberta-base-sentiment-latest were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.weight', 'roberta.pooler.dense.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


{'label': 'positive', 'probability': 0.8018065094947815}

In [None]:
# multiple inputs
model.sentiment(
    ["Yes, including Medicare and social security saving👍", "How many more days until opening day? 😩"],
    batch_size=2)  # Or `model.predict`

[{'label': 'positive', 'probability': 0.8018065690994263},
 {'label': 'neutral', 'probability': 0.4665716290473938}]
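
Batch outputs like the one above are plain lists of dictionaries, so they can be summarised with standard Python. The sketch below counts the predicted labels and flags low-confidence predictions. The cut-off `MIN_CONFIDENCE` is an illustrative assumption, not a library setting.

```python
from collections import Counter

# Batch output copied from the cell above (probabilities rounded).
predictions = [
    {"label": "positive", "probability": 0.8018},
    {"label": "neutral", "probability": 0.4666},
]

# How often each label was predicted across the batch.
label_counts = Counter(p["label"] for p in predictions)

# Indices of predictions whose probability falls below the cut-off.
MIN_CONFIDENCE = 0.5  # illustrative value, chosen only for this example
uncertain = [i for i, p in enumerate(predictions) if p["probability"] < MIN_CONFIDENCE]

print(label_counts)
print(uncertain)
```

Here the second tweet is flagged, which matches the intuition that its `neutral` prediction at roughly 0.47 is much less certain than the first tweet's `positive`.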

### Sentiment (Multilingual)
For languages other than English, we include the datasets integrated in UMSAB, namely Arabic, French, German, Hindi, Italian, Portuguese, and Spanish.

In [None]:
# single input
model = tweetnlp.load('sentiment_multilingual')  # Or `model = tweetnlp.SentimentMultilungual()` 
model.sentiment("天気が良いとやっぱり気持ち良いなあ✨")  # Or `model.predict`

{'label': 'positive', 'probability': 0.8903420567512512}

In [None]:
# multiple inputs
model = tweetnlp.load('sentiment_multilingual')  # Or `model = tweetnlp.SentimentMultilungual()` 
model.sentiment(["天気が良いとやっぱり気持ち良いなあ✨", "¡Hoy me he levantado contento y creo que va a ser un gran día! 😀"])  # Or `model.predict`

[{'label': 'positive', 'probability': 0.8903421759605408},
 {'label': 'positive', 'probability': 0.9387574791908264}]

### Irony Detection
This is a binary classification task where given a tweet, the goal is to detect whether it is ironic or not. It is based on the Irony Detection dataset from the SemEval 2018 task.

In [None]:
# single input
model = tweetnlp.load('irony')  # Or `model = tweetnlp.Irony()` 
model.irony('If you wanna look like a badass, have drama on social media')  # Or `model.predict`

{'label': 'irony', 'probability': 0.9160911440849304}

In [None]:
# multiple inputs
model.irony(
    ['If you wanna look like a badass, have drama on social media', '@user No sugar during christmas time? :( '],
    batch_size=2)  # Or `model.predict`

[{'label': 'irony', 'probability': 0.9160910248756409},
 {'label': 'irony', 'probability': 0.8415006995201111}]

### Hate Speech Detection
The hate speech task consists of detecting whether a tweet is hateful towards women or immigrants. It is based on the Detection of Hate Speech task at SemEval 2019.

In [None]:
# single input
model = tweetnlp.load('hate')  # Or `model = tweetnlp.Hate()` 
model.hate('Whoever just unfollowed me you a bitch')  # Or `model.predict`

{'label': 'not-hate', 'probability': 0.7263831496238708}

In [None]:
# multiple inputs
model.hate(
    ['I want their puma shoes but a bitch is broke and cant afford them', 'Whoever just unfollowed me you a bitch'],
    batch_size=2)  # Or `model.predict`

[{'label': 'hate', 'probability': 0.8846580386161804},
 {'label': 'not-hate', 'probability': 0.7263831496238708}]

### Offensive Language Identification
This task consists of identifying whether some form of offensive language is present in a tweet. For our benchmark, we rely on the SemEval 2019 OffensEval dataset.

In [None]:
# single input
model = tweetnlp.load('offensive')  # Or `model = tweetnlp.Offensive()` 
model.offensive("All two of them taste like ass. ")  # Or `model.predict`

{'label': 'offensive', 'probability': 0.8600459098815918}

In [None]:
# multiple inputs
model.offensive(
    ["All two of them taste like ass.", "Are we all ready to sit and watch Indakurate Passcott play football?"],
    batch_size=2)  # Or `model.predict`

[{'label': 'offensive', 'probability': 0.8357967138290405},
 {'label': 'not-offensive', 'probability': 0.9063022136688232}]

### Emoji
The goal of emoji prediction is to predict the final emoji on a given tweet. The dataset used to fine-tune our models is the TweetEval adaptation from the SemEval 2018 task on Emoji Prediction, including 20 emoji as labels (❤, 😍, 😂, 💕, 🔥, 😊, 😎, ✨, 💙, 😘, 📷, 🇺🇸, ☀, 💜, 😉, 💯, 😁, 🎄, 📸, 😜).

In [None]:
# single input
model = tweetnlp.load('emoji')  # Or `model = tweetnlp.Emoji()` 
model.emoji('Beautiful sunset last night from the pontoon @TupperLakeNY')  # Or `model.predict`

{'label': '😊', 'probability': 0.3179638981819153}

In [None]:
# multiple inputs
model.emoji(
    ['Beautiful sunset last night from the pontoon @TupperLakeNY', "Selfie Saturday @HarryCarays 7th Inning Stretch"],
    batch_size=2)  # Or `model.predict`

[{'label': '😊', 'probability': 0.31796368956565857},
 {'label': '😜', 'probability': 0.17138013243675232}]

### Emotion Recognition
Given a tweet, this task consists of associating it with its most appropriate emotion. As a reference dataset we use the SemEval 2018 task on Affect in Tweets, simplified to only four emotions used in TweetEval: anger, joy, sadness and optimism.

In [None]:
# single input
model = tweetnlp.load('emotion')  # Or `model = tweetnlp.Emotion()` 
model.emotion('I love swimming for the same reason I love meditating...the feeling of weightlessness.')  # Or `model.predict`

{'label': 'joy', 'probability': 0.7345258593559265}

In [None]:
# multiple inputs
model.emotion(
    ['I love swimming for the same reason I love meditating...the feeling of weightlessness.', 'May or may not have just pulled the legal card on these folks. #irritated'],
    batch_size=2)  # Or `model.predict`

[{'label': 'joy', 'probability': 0.7345259189605713},
 {'label': 'anger', 'probability': 0.9860954880714417}]

## Named Entity Recognition
This module consists of a named-entity recognition (NER) model specifically trained for tweets.
The model is instantiated with `tweetnlp.load("ner")`, and predictions are obtained by passing a text or a list of texts.

In [None]:
# single input
model = tweetnlp.load('ner')  # Or `model = tweetnlp.NER()` 
model.ner('Jacob Collier is a Grammy-awarded English artist from London.')  # Or `model.predict`

{'entity_prediction': [{'entity': ['Jacob', 'Collier'],
   'position': [0, 1],
   'probability': [0.960687518119812, 0.983401894569397],
   'type': 'person'},
  {'entity': ['London.'],
   'position': [8],
   'probability': [0.9398059248924255],
   'type': 'location'}],
 'input': ['Jacob',
  'Collier',
  'is',
  'a',
  'Grammy-awarded',
  'English',
  'artist',
  'from',
  'London.'],
 'prediction': ['B-person',
  'I-person',
  'O',
  'O',
  'O',
  'O',
  'O',
  'O',
  'B-location'],
 'probability': [0.960687518119812,
  0.983401894569397,
  0.9816870093345642,
  0.9896020293235779,
  0.44137847423553467,
  0.3758114278316498,
  0.8757674098014832,
  0.9786785244941711,
  0.9398059248924255]}
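
In the output above, `entity_prediction` groups the token-level `prediction` tags into entity spans. The sketch below shows one way such a grouping can be derived from BIO tags, using the `input` and `prediction` fields copied from the output. The helper `decode_bio` is hypothetical and is not necessarily how the library builds `entity_prediction`.

```python
def decode_bio(tokens, tags):
    """Group tokens into entity spans based on their BIO tags."""
    spans = []
    current = None
    for position, (token, tag) in enumerate(zip(tokens, tags)):
        if tag.startswith("B-"):
            # A B- tag always opens a new span.
            current = {"type": tag[2:], "entity": [token], "position": [position]}
            spans.append(current)
        elif tag.startswith("I-") and current is not None and current["type"] == tag[2:]:
            # An I- tag of the same type extends the open span.
            current["entity"].append(token)
            current["position"].append(position)
        else:
            # "O" tags (and stray I- tags) close any open span.
            current = None
    return spans


tokens = ["Jacob", "Collier", "is", "a", "Grammy-awarded",
          "English", "artist", "from", "London."]
tags = ["B-person", "I-person", "O", "O", "O", "O", "O", "O", "B-location"]

entities = decode_bio(tokens, tags)
print(entities)
```

The resulting spans carry the same `entity`, `position`, and `type` values as the output above, without the per-token probabilities.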

In [None]:
# multiple inputs
model.ner(
    ['Jacob Collier is a Grammy-awarded English artist from London.', 'Tweet NLP is a website to enable users to use cutting-edge language technologies in social media, irrespective of their level of expertise.'],
    batch_size=2)  # Or `model.predict`

{'entity_prediction': [[{'entity': ['Jacob', 'Collier'],
    'position': [0, 1],
    'probability': [0.960687518119812, 0.983401894569397],
    'type': 'person'},
   {'entity': ['London.'],
    'position': [8],
    'probability': [0.9398059248924255],
    'type': 'location'}],
  []],
 'input': [['Jacob',
   'Collier',
   'is',
   'a',
   'Grammy-awarded',
   'English',
   'artist',
   'from',
   'London.'],
  ['Tweet',
   'NLP',
   'is',
   'a',
   'website',
   'to',
   'enable',
   'users',
   'to',
   'use',
   'cutting-edge',
   'language',
   'technologies',
   'in',
   'social',
   'media,',
   'irrespective',
   'of',
   'their',
   'level',
   'of',
   'expertise.']],
 'prediction': [['B-person',
   'I-person',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'B-location'],
  ['O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O',
   'O']],
 'probability': [[0.9606875

## Language Modeling
A masked language model predicts the masked token in a given sentence. It is instantiated with `tweetnlp.load('language_model')`, and predictions are obtained by passing a text or a list of texts. Make sure that each text contains a `<mask>` token, which is the token the model is asked to predict.

In [None]:
# single input
model = tweetnlp.load('language_model')  # Or `model = tweetnlp.LanguageModel()` 
model.mask_prediction("So glad I'm <mask> vaccinated.")  # Or `model.predict`

{'best_scores': [1.036614630939292e-11,
  1.635246625608655e-11,
  1.981991480659584e-11,
  4.520017971021417e-10,
  1.0578216460999101e-05,
  0.0002495308290235698,
  2.3952572519192472e-05,
  1.8536458810558543e-05,
  2.8879259843961336e-05,
  5.781218987976899e-06],
 'best_sentences': ["So glad I'm fully vaccinated.",
  "So glad I'm getting vaccinated.",
  "So glad I'm not vaccinated.",
  "So glad I'm still vaccinated.",
  "So glad I'm already vaccinated.",
  "So glad I'm all vaccinated.",
  "So glad I'm being vaccinated.",
  "So glad I'm completely vaccinated.",
  "So glad I'm now vaccinated.",
  "So glad I'm finally vaccinated."],
 'best_tokens': ['fully',
  'getting',
  'not',
  'still',
  'already',
  'all',
  'being',
  'completely',
  'now',
  'finally']}
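
The `best_sentences` entries above are the input text with `<mask>` replaced by each entry of `best_tokens`. The sketch below reproduces that relationship for the first three candidates. The helper `fill_mask` is hypothetical and only illustrates the substitution step; the model's scoring is not involved.

```python
MASK = "<mask>"


def fill_mask(text, candidates):
    """Return one sentence per candidate, with the first `<mask>` replaced by it."""
    if MASK not in text:
        raise ValueError(f"input must contain a {MASK} token")
    return [text.replace(MASK, token, 1) for token in candidates]


# Candidate tokens copied from the start of `best_tokens` above.
sentences = fill_mask("So glad I'm <mask> vaccinated.", ["fully", "getting", "not"])
print(sentences)
```

The explicit check mirrors the requirement that every input contains a `<mask>` token.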

In [None]:
# multiple inputs
model.mask_prediction(
    ["So glad I'm <mask> vaccinated.", "I keep forgetting to bring a <mask>.", "Looking forward to watching <mask> Game tonight!"],
    batch_size=2)  # Or `model.predict`

[{'best_scores': [1.0366217433055436e-11,
   1.6352485338044787e-11,
   1.9819861030168084e-11,
   4.520022689469272e-10,
   1.0578208275546785e-05,
   0.00024953039246611297,
   2.395257615717128e-05,
   1.8536389688961208e-05,
   2.8879208912258036e-05,
   5.7812144405033905e-06],
  'best_sentences': ["So glad I'm fully vaccinated.",
   "So glad I'm getting vaccinated.",
   "So glad I'm not vaccinated.",
   "So glad I'm still vaccinated.",
   "So glad I'm already vaccinated.",
   "So glad I'm all vaccinated.",
   "So glad I'm being vaccinated.",
   "So glad I'm completely vaccinated.",
   "So glad I'm now vaccinated.",
   "So glad I'm finally vaccinated."],
  'best_tokens': ['fully',
   'getting',
   'not',
   'still',
   'already',
   'all',
   'being',
   'completely',
   'now',
   'finally']},
 {'best_scores': [3.027061240556961e-11,
   1.1211621214757272e-10,
   1.4037968704139203e-11,
   6.757234771725962e-09,
   3.2744906093284953e-06,
   2.747155576798832e-06,
   1.82777284862

## Tweet/Sentence Embedding
The tweet embedding model produces a fixed-length embedding for a tweet. The embedding represents the semantics of the tweet and can be used for semantic search over tweets via the similarity between embeddings. The model is instantiated with `tweetnlp.load('sentence_embedding')`, and embeddings are obtained by passing a text or a list of texts.
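
To make the idea of similarity-based search concrete, the sketch below ranks a small corpus of vectors against a query vector using cosine similarity. It uses toy 3-dimensional lists in place of real embeddings, and cosine similarity is an assumption here: this notebook does not state which measure `model.similarity` uses. The helpers `cosine_similarity` and `rank_by_similarity` are illustrative names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, corpus):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(corpus)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real tweet embeddings.
query = [1.0, 0.0, 0.0]
corpus = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]

ranking = rank_by_similarity(query, corpus)
print(ranking)
```

Cosine similarity ignores vector length, so the second corpus vector ranks first even though it is twice as long as the query.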

In [None]:
model = tweetnlp.load('sentence_embedding')

In [None]:
# Get sentence embedding
tweet = "I will never understand the decision making of the people of Alabama. Their new Senator is a definite downgrade. You have served with honor.  Well done."
vectors = model.embedding(tweet)
vectors.shape


(768,)

In [None]:
# Get sentence embedding (multiple inputs)
tweet_corpus = [
    "Free, fair elections are the lifeblood of our democracy. Charges of unfairness are serious. But calling an election unfair does not make it so. Charges require specific allegations and then proof. We have neither here.",
    "Trump appointed judge Stephanos Bibas ",
    "If your members can go to Puerto Rico they can get their asses back in the classroom. @CTULocal1",
    "@PolitiBunny @CTULocal1 Political leverage, science said schools could reopen, teachers and unions protested to keep'em closed and made demands for higher wages and benefits, they're usin Covid as a crutch at the expense of life and education.",
    "Congratulations to all the exporters on achieving record exports in Dec 2020 with a growth of 18 % over the previous year. Well done &amp; keep up this trend. A major pillar of our govt's economic policy is export enhancement &amp; we will provide full support to promote export culture.",
    "@ImranKhanPTI Pakistan seems a worst country in term of exporting facilities. I am a small business man and if I have to export a t-shirt having worth of $5 to USA or Europe. Postal cost will be around $30. How can we grow as an exporting country if this situation prevails. Think about it. #PM",
    "The thing that doesn’t sit right with me about “nothing good happened in 2020” is that it ignores the largest protest movement in our history. The beautiful, powerful Black Lives Matter uprising reached every corner of the country and should be central to our look back at 2020.",
    "@JoshuaPotash I kinda said that in the 2020 look back for @washingtonpost",
    "Is this a confirmation from Q that Lin is leaking declassified intelligence to the public? I believe so. If @realDonaldTrump didn’t approve of what @LLinWood is doing he would have let us know a lonnnnnng time ago. I’ve always wondered why Lin’s Twitter handle started with “LLin” https://t.co/0G7zClOmi2",
    "@ice_qued @realDonaldTrump @LLinWood Yeah 100%",
    "Tomorrow is my last day as Senator from Alabama.  I believe our opportunities are boundless when we find common ground. As we swear in a new Congress &amp; a new President, demand from them that they do just that &amp; build a stronger, more just society.  It’s been an honor to serve you." 
    "The mask cult can’t ever admit masks don’t work because their ideology is based on feeling like a “good person”  Wearing a mask makes them a “good person” &amp; anyone who disagrees w/them isn’t  They can’t tolerate any idea that makes them feel like their self-importance is unearned",
    "@ianmSC Beyond that, they put such huge confidence in masks so early with no strong evidence that they have any meaningful benefit, they don’t want to backtrack or admit they were wrong. They put the cart before the horse, now desperate to find any results that match their hypothesis.",
]
vectors = model.embedding(tweet_corpus, batch_size=3)
vectors.shape

(12, 768)

In [None]:
# Similarity search
sims = []
for n, i in enumerate(tweet_corpus):
  _sim = model.similarity(tweet, i)
  sims.append([n, _sim])
print(f'anchor tweet: {tweet}\n')
for m, (n, s) in enumerate(sorted(sims, key=lambda x: x[1], reverse=True)):
  print(f' - top {m}: {tweet_corpus[n]}\n - similarity: {s}\n')

anchor tweet: I will never understand the decision making of the people of Alabama. Their new Senator is a definite downgrade. You have served with honor.  Well done.

 - top 0: Is this a confirmation from Q that Lin is leaking declassified intelligence to the public? I believe so. If @realDonaldTrump didn’t approve of what @LLinWood is doing he would have let us know a lonnnnnng time ago. I’ve always wondered why Lin’s Twitter handle started with “LLin” https://t.co/0G7zClOmi2
 - similarity: 1.0787510714776494

 - top 1: Tomorrow is my last day as Senator from Alabama.  I believe our opportunities are boundless when we find common ground. As we swear in a new Congress &amp; a new President, demand from them that they do just that &amp; build a stronger, more just society.  It’s been an honor to serve you.The mask cult can’t ever admit masks don’t work because their ideology is based on feeling like a “good person”  Wearing a mask makes them a “good person” &amp; anyone who disagrees w/t

### Use Custom Model
To use another model, either local or from the Hugging Face model hub, simply provide the model path or alias when loading the model.

`tweetnlp.load('task', model='model-path/alias')`

Alternatively, any classification model can be used without specifying the task.


In [None]:
# classification model
model = tweetnlp.load(model='cardiffnlp/tweet-topic-19-single')
model.topic("Jacob Collier is a Grammy-awarded English artist from London.")

{'label': ['pop_culture'],
 'probability': {'arts_&_culture': 0.2651212811470032,
  'business_&_entrepreneurs': 0.2154519110918045,
  'daily_life': 0.48174718022346497,
  'pop_culture': 0.9943158030509949,
  'science_&_technology': 0.25025588274002075,
  'sports_&_gaming': 0.39875540137290955}}

In [None]:
# other tasks, e.g. NER
model = tweetnlp.load('ner', model='tner/xlm-roberta-base-conll2003')
model.ner("Jacob Collier is a Grammy-awarded English artist from London.")

{'entity_prediction': [{'entity': ['Jacob', 'Collier'],
   'position': [0, 1],
   'probability': [0.9996787309646606, 0.9997997879981995],
   'type': 'person'},
  {'entity': ['Grammy-awarded'],
   'position': [4],
   'probability': [0.9995971322059631],
   'type': 'other'},
  {'entity': ['English'],
   'position': [5],
   'probability': [0.9997655749320984],
   'type': 'other'},
  {'entity': ['London.'],
   'position': [8],
   'probability': [0.9997679591178894],
   'type': 'location'}],
 'input': ['Jacob',
  'Collier',
  'is',
  'a',
  'Grammy-awarded',
  'English',
  'artist',
  'from',
  'London.'],
 'prediction': ['B-person',
  'I-person',
  'O',
  'O',
  'B-other',
  'B-other',
  'O',
  'O',
  'B-location'],
 'probability': [0.9996787309646606,
  0.9997997879981995,
  0.9999878406524658,
  0.9999885559082031,
  0.9995971322059631,
  0.9997655749320984,
  0.9999864101409912,
  0.9999876022338867,
  0.9997679591178894]}