## Text Classification

In [1]:
from datasets import load_dataset

# Load our data
data = load_dataset("rotten_tomatoes")
data

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 8530
    })
    validation: Dataset({
        features: ['text', 'label'],
        num_rows: 1066
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 1066
    })
})

In [3]:
data['train'][0,-1]

{'text': ['the rock is destined to be the 21st century\'s new " conan " and that he\'s going to make a splash even greater than arnold schwarzenegger , jean-claud van damme or steven segal .',
  'things really get weird , though not particularly scary : the movie is all portent and no content .'],
 'label': [1, 0]}

- These short reviews are labeled as either positive (1) or negative (0), so we will focus on binary sentiment classification.
- BERT, a well-known encoder-only architecture, is a popular choice for creating task-specific and embedding models.
- In this task we will use a RoBERTa model fine-tuned on tweets for sentiment analysis to classify movie reviews.
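The tweet model predicts three classes (negative, neutral, positive), while the reviews carry only two labels. One possible way to reconcile the two, sketched below, is to ignore the neutral score and compare the remaining two. The helper name `to_binary_label` and the example scores are hypothetical and only illustrate the idea.

```python
def to_binary_label(scores):
    """Return 1 (positive) or 0 (negative) from three-class sentiment scores.

    `scores` is a list of {'label': ..., 'score': ...} dictionaries.
    The neutral score is ignored; a tie falls back to 0 (negative).
    """
    # Look scores up by label name instead of relying on list positions
    by_label = {entry["label"].lower(): entry["score"] for entry in scores}
    return int(by_label["positive"] > by_label["negative"])


# Illustrative values in the shape a three-class sentiment pipeline returns
example = [
    {"label": "negative", "score": 0.70},
    {"label": "neutral", "score": 0.25},
    {"label": "positive", "score": 0.05},
]
print(to_binary_label(example))
```

Looking scores up by label name rather than by index keeps the mapping correct even if the label order differs between models.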

In [5]:
from transformers import pipeline
# Path to our HF model
model_path = "cardiffnlp/twitter-roberta-base-sentiment-latest"
# Load model into pipeline
pipe = pipeline(
    model=model_path,
    tokenizer=model_path,
    return_all_scores=True,
    device="cuda"
)

Some weights of the model checkpoint at cardiffnlp/twitter-roberta-base-sentiment-latest were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Device set to use cuda


- Along with the model, the pipeline loads the tokenizer, which is responsible for converting input text into individual tokens.
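To make the idea of tokenization concrete, here is a toy sketch that splits text into tokens and maps them to integer IDs. It is not the model's actual tokenizer, which uses a learned subword vocabulary. The vocabulary `TOY_VOCAB` and the helpers `toy_tokenize` and `toy_encode` are hypothetical names made up for this illustration.

```python
import re

# Hypothetical toy vocabulary; a real tokenizer ships with tens of thousands of entries
TOY_VOCAB = {
    "<unk>": 0, "the": 1, "movie": 2, "is": 3, "all": 4,
    "portent": 5, "and": 6, "no": 7, "content": 8, ".": 9,
}


def toy_tokenize(text):
    """Split text into lowercase word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def toy_encode(text, vocab=TOY_VOCAB):
    """Map each token to its integer ID, using <unk> for tokens not in the vocabulary."""
    return [vocab.get(token, vocab["<unk>"]) for token in toy_tokenize(text)]


sentence = "the movie is all portent and no content ."
print(toy_tokenize(sentence))
print(toy_encode(sentence))
```

The model never sees raw text, only the integer IDs, which is why the tokenizer and the model must always be loaded from the same checkpoint.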


![Alt text for the image](images/bert.png)

In [6]:
import numpy as np
from tqdm import tqdm
from transformers.pipelines.pt_utils import KeyDataset

y_pred = []
for output in tqdm(pipe(KeyDataset(data['test'], 'text')),
                   total=len(data['test'])):
    # Each output is a list of dictionaries, one per label, each holding the label
    # name and its score. For example, [{'label': 'negative', 'score': 0.95},
    # {'label': 'neutral', 'score': 0.03}, {'label': 'positive', 'score': 0.02}].
    print(output)
    # Ignore the neutral score and compare only negative (index 0) and positive (index 2)
    negative_score = output[0]['score']
    positive_score = output[2]['score']
    # argmax yields 0 for negative and 1 for positive, matching the dataset labels
    assignment = np.argmax([negative_score, positive_score])
    y_pred.append(assignment)

  0%|▎                                                                                 | 4/1066 [00:00<02:11,  8.11it/s]

[{'label': 'negative', 'score': 0.005161251872777939}, {'label': 'neutral', 'score': 0.04023365676403046}, {'label': 'positive', 'score': 0.9546050429344177}]
[{'label': 'negative', 'score': 0.007706430274993181}, {'label': 'neutral', 'score': 0.10390972346067429}, {'label': 'positive', 'score': 0.8883838653564453}]
[{'label': 'negative', 'score': 0.7359185218811035}, {'label': 'neutral', 'score': 0.24242739379405975}, {'label': 'positive', 'score': 0.021654037758708}]
[{'label': 'negative', 'score': 0.0030380780808627605}, {'label': 'neutral', 'score': 0.06267133355140686}, {'label': 'positive', 'score': 0.9342905282974243}]
[{'label': 'negative', 'score': 0.11185602098703384}, {'label': 'neutral', 'score': 0.7006853818893433}, {'label': 'positive', 'score': 0.18745863437652588}]


...
[{'label': 'negative', 'score': 0.9003522396087646}, {'label': 'neutral', 'score': 0.0898905023932457}, {'label': 'positive', 'score': 0.009757296182215214}]
[{'label': 'negative', 'score': 0.8820740580558777}, {'label': 'neutral', 'score': 0.11158443987369537}, {'label': 'positive', 'score': 0.006341481581330299}]
[{'label': 'negative', 'score': 0.8912043571472168}, {'label': 'neutral', 'score': 0.10155647993087769}, {'label': 'positive', 'score': 0.007239179220050573}]
[{'label': 'negative', 'score': 0.8454972505569458}, {'label': 'neutral', 'score': 0.1486123651266098}, {'label': 'positive', 'score': 0.005890291649848223}]
[{'label': 'negative', 'score': 0.9303057789802551}, {'label': 'neutral', 'score': 0.06383926421403885}, {'label': 'positive', 'score': 0.005855020135641098}]
[{'label': 'negative', 'score': 0.92867869138717

 75%|███████████████████████████████████████████████████████████▏                   | 798/1066 [00:06<00:02, 129.11it/s]

[{'label': 'negative', 'score': 0.7245178818702698}, {'label': 'neutral', 'score': 0.2544167637825012}, {'label': 'positive', 'score': 0.02106533758342266}]
[{'label': 'negative', 'score': 0.770908534526825}, {'label': 'neutral', 'score': 0.2184525430202484}, {'label': 'positive', 'score': 0.010638881474733353}]
[{'label': 'negative', 'score': 0.9310694336891174}, {'label': 'neutral', 'score': 0.06131569296121597}, {'label': 'positive', 'score': 0.007614858448505402}]
[{'label': 'negative', 'score': 0.7240939140319824}, {'label': 'neutral', 'score': 0.25437530875205994}, {'label': 'positive', 'score': 0.021530751138925552}]
[{'label': 'negative', 'score': 0.4320542812347412}, {'label': 'neutral', 'score': 0.5376880168914795}, {'label': 'positive', 'score': 0.03025766648352146}]
[{'label': 'negative', 'score': 0.9448360800743103}, {'label': 'neutral', 'score': 0.04915868863463402}, {'label': 'positive', 'score': 0.006005282048135996}]
[{'label': 'negative', 'score': 0.8152976632118225},

 78%|█████████████████████████████████████████████████████████████▎                 | 827/1066 [00:06<00:01, 133.93it/s]

[{'label': 'negative', 'score': 0.8073555827140808}, {'label': 'neutral', 'score': 0.18571099638938904}, {'label': 'positive', 'score': 0.0069334604777395725}]
[{'label': 'negative', 'score': 0.015369881875813007}, {'label': 'neutral', 'score': 0.1592089682817459}, {'label': 'positive', 'score': 0.825421154499054}]
[{'label': 'negative', 'score': 0.7932720184326172}, {'label': 'neutral', 'score': 0.18703711032867432}, {'label': 'positive', 'score': 0.01969083398580551}]
[{'label': 'negative', 'score': 0.9422308802604675}, {'label': 'neutral', 'score': 0.05104883015155792}, {'label': 'positive', 'score': 0.006720321718603373}]
[{'label': 'negative', 'score': 0.0792994499206543}, {'label': 'neutral', 'score': 0.8753748536109924}, {'label': 'positive', 'score': 0.04532565549015999}]
[{'label': 'negative', 'score': 0.6911808848381042}, {'label': 'neutral', 'score': 0.2944207787513733}, {'label': 'positive', 'score': 0.014398427680134773}]
[{'label': 'negative', 'score': 0.8276733160018921}

 80%|███████████████████████████████████████████████████████████████▎               | 855/1066 [00:06<00:01, 129.78it/s]

[{'label': 'negative', 'score': 0.7800931334495544}, {'label': 'neutral', 'score': 0.20839384198188782}, {'label': 'positive', 'score': 0.01151303667575121}]
[{'label': 'negative', 'score': 0.7403817176818848}, {'label': 'neutral', 'score': 0.24144652485847473}, {'label': 'positive', 'score': 0.018171783536672592}]
[{'label': 'negative', 'score': 0.7552460432052612}, {'label': 'neutral', 'score': 0.23207485675811768}, {'label': 'positive', 'score': 0.012679052539169788}]
[{'label': 'negative', 'score': 0.3441169261932373}, {'label': 'neutral', 'score': 0.34691622853279114}, {'label': 'positive', 'score': 0.3089667856693268}]
[{'label': 'negative', 'score': 0.36228787899017334}, {'label': 'neutral', 'score': 0.49279865622520447}, {'label': 'positive', 'score': 0.14491340517997742}]
[{'label': 'negative', 'score': 0.9382104277610779}, {'label': 'neutral', 'score': 0.056459102779626846}, {'label': 'positive', 'score': 0.005330448038876057}]
[{'label': 'negative', 'score': 0.92742085456848

 83%|█████████████████████████████████████████████████████████████████▌             | 885/1066 [00:06<00:01, 138.62it/s]

[{'label': 'negative', 'score': 0.42185908555984497}, {'label': 'neutral', 'score': 0.46174052357673645}, {'label': 'positive', 'score': 0.11640039086341858}]
[{'label': 'negative', 'score': 0.03281262144446373}, {'label': 'neutral', 'score': 0.4124685227870941}, {'label': 'positive', 'score': 0.5547188520431519}]
[{'label': 'negative', 'score': 0.8400384783744812}, {'label': 'neutral', 'score': 0.13985437154769897}, {'label': 'positive', 'score': 0.020107217133045197}]
[{'label': 'negative', 'score': 0.5507612228393555}, {'label': 'neutral', 'score': 0.42891499400138855}, {'label': 'positive', 'score': 0.02032385766506195}]
[{'label': 'negative', 'score': 0.7694588303565979}, {'label': 'neutral', 'score': 0.21835735440254211}, {'label': 'positive', 'score': 0.012183811515569687}]
[{'label': 'negative', 'score': 0.8131421208381653}, {'label': 'neutral', 'score': 0.17745447158813477}, {'label': 'positive', 'score': 0.009403386153280735}]
[{'label': 'negative', 'score': 0.696773171424865

 86%|███████████████████████████████████████████████████████████████████▊           | 915/1066 [00:07<00:01, 137.45it/s]

[{'label': 'negative', 'score': 0.8793390393257141}, {'label': 'neutral', 'score': 0.1123805046081543}, {'label': 'positive', 'score': 0.008280430920422077}]
[{'label': 'negative', 'score': 0.9229700565338135}, {'label': 'neutral', 'score': 0.06925465166568756}, {'label': 'positive', 'score': 0.00777525594457984}]
[{'label': 'negative', 'score': 0.7009298801422119}, {'label': 'neutral', 'score': 0.28685885667800903}, {'label': 'positive', 'score': 0.012211265973746777}]
[{'label': 'negative', 'score': 0.030512990429997444}, {'label': 'neutral', 'score': 0.8974032402038574}, {'label': 'positive', 'score': 0.0720837265253067}]
[{'label': 'negative', 'score': 0.04378809407353401}, {'label': 'neutral', 'score': 0.48134371638298035}, {'label': 'positive', 'score': 0.47486814856529236}]
[{'label': 'negative', 'score': 0.3734916150569916}, {'label': 'neutral', 'score': 0.5057672262191772}, {'label': 'positive', 'score': 0.12074112147092819}]
[{'label': 'negative', 'score': 0.9326493144035339}

 88%|█████████████████████████████████████████████████████████████████████▉         | 943/1066 [00:07<00:00, 130.04it/s]

[{'label': 'negative', 'score': 0.9297144412994385}, {'label': 'neutral', 'score': 0.06451170891523361}, {'label': 'positive', 'score': 0.005773785058408976}]
[{'label': 'negative', 'score': 0.9485235810279846}, {'label': 'neutral', 'score': 0.047265663743019104}, {'label': 'positive', 'score': 0.0042107743211090565}]
[{'label': 'negative', 'score': 0.874767005443573}, {'label': 'neutral', 'score': 0.11902928352355957}, {'label': 'positive', 'score': 0.006203759927302599}]
[{'label': 'negative', 'score': 0.3949020802974701}, {'label': 'neutral', 'score': 0.5929179787635803}, {'label': 'positive', 'score': 0.012179940938949585}]
[{'label': 'negative', 'score': 0.0033645848743617535}, {'label': 'neutral', 'score': 0.26747509837150574}, {'label': 'positive', 'score': 0.7291603684425354}]
[{'label': 'negative', 'score': 0.6281706094741821}, {'label': 'neutral', 'score': 0.34716978669166565}, {'label': 'positive', 'score': 0.024659590795636177}]
[{'label': 'negative', 'score': 0.81156313419

 91%|████████████████████████████████████████████████████████████████████████▏      | 974/1066 [00:07<00:00, 138.81it/s]

[{'label': 'negative', 'score': 0.771221399307251}, {'label': 'neutral', 'score': 0.21172121167182922}, {'label': 'positive', 'score': 0.017057359218597412}]
[{'label': 'negative', 'score': 0.9438266158103943}, {'label': 'neutral', 'score': 0.050421807914972305}, {'label': 'positive', 'score': 0.005751685705035925}]
[{'label': 'negative', 'score': 0.9347560405731201}, {'label': 'neutral', 'score': 0.05975926294922829}, {'label': 'positive', 'score': 0.005484733264893293}]
[{'label': 'negative', 'score': 0.7568826675415039}, {'label': 'neutral', 'score': 0.22913117706775665}, {'label': 'positive', 'score': 0.013986160978674889}]
[{'label': 'negative', 'score': 0.38485249876976013}, {'label': 'neutral', 'score': 0.5800131559371948}, {'label': 'positive', 'score': 0.03513437882065773}]
[{'label': 'negative', 'score': 0.06337720900774002}, {'label': 'neutral', 'score': 0.8507140278816223}, {'label': 'positive', 'score': 0.08590869605541229}]
[{'label': 'negative', 'score': 0.81882596015930

 93%|█████████████████████████████████████████████████████████████████████████▏     | 988/1066 [00:07<00:00, 132.92it/s]

[{'label': 'negative', 'score': 0.46135032176971436}, {'label': 'neutral', 'score': 0.464282751083374}, {'label': 'positive', 'score': 0.07436694949865341}]
[{'label': 'negative', 'score': 0.10312633216381073}, {'label': 'neutral', 'score': 0.7551907300949097}, {'label': 'positive', 'score': 0.141682967543602}]
[{'label': 'negative', 'score': 0.779708743095398}, {'label': 'neutral', 'score': 0.20697245001792908}, {'label': 'positive', 'score': 0.013318746350705624}]
[{'label': 'negative', 'score': 0.8369063138961792}, {'label': 'neutral', 'score': 0.1543949842453003}, {'label': 'positive', 'score': 0.00869865994900465}]
[{'label': 'negative', 'score': 0.8966015577316284}, {'label': 'neutral', 'score': 0.0966915488243103}, {'label': 'positive', 'score': 0.006706927437335253}]
[{'label': 'negative', 'score': 0.28382372856140137}, {'label': 'neutral', 'score': 0.5606268644332886}, {'label': 'positive', 'score': 0.15554948151111603}]
[{'label': 'negative', 'score': 0.6144213676452637}, {'l

 95%|██████████████████████████████████████████████████████████████████████████▍   | 1017/1066 [00:07<00:00, 131.98it/s]

[{'label': 'negative', 'score': 0.9509932994842529}, {'label': 'neutral', 'score': 0.044022247195243835}, {'label': 'positive', 'score': 0.004984575789421797}]
[{'label': 'negative', 'score': 0.21134527027606964}, {'label': 'neutral', 'score': 0.6590261459350586}, {'label': 'positive', 'score': 0.12962861359119415}]
[{'label': 'negative', 'score': 0.8323310017585754}, {'label': 'neutral', 'score': 0.15459269285202026}, {'label': 'positive', 'score': 0.013076309114694595}]
[{'label': 'negative', 'score': 0.767214834690094}, {'label': 'neutral', 'score': 0.22272558510303497}, {'label': 'positive', 'score': 0.010059584863483906}]
[{'label': 'negative', 'score': 0.9139156937599182}, {'label': 'neutral', 'score': 0.07926183938980103}, {'label': 'positive', 'score': 0.006822498980909586}]
[{'label': 'negative', 'score': 0.4860875904560089}, {'label': 'neutral', 'score': 0.48318973183631897}, {'label': 'positive', 'score': 0.030722694471478462}]
[{'label': 'negative', 'score': 0.2312509864568

 98%|████████████████████████████████████████████████████████████████████████████▍ | 1045/1066 [00:08<00:00, 131.06it/s]

[{'label': 'negative', 'score': 0.06191986799240112}, {'label': 'neutral', 'score': 0.8112593293190002}, {'label': 'positive', 'score': 0.12682075798511505}]
[{'label': 'negative', 'score': 0.8747042417526245}, {'label': 'neutral', 'score': 0.11452040821313858}, {'label': 'positive', 'score': 0.010775321163237095}]
[{'label': 'negative', 'score': 0.18314765393733978}, {'label': 'neutral', 'score': 0.7090993523597717}, {'label': 'positive', 'score': 0.10775300115346909}]
[{'label': 'negative', 'score': 0.7373170256614685}, {'label': 'neutral', 'score': 0.24702365696430206}, {'label': 'positive', 'score': 0.015659259632229805}]
[{'label': 'negative', 'score': 0.8234097361564636}, {'label': 'neutral', 'score': 0.16186194121837616}, {'label': 'positive', 'score': 0.01472831517457962}]
[{'label': 'negative', 'score': 0.7951980829238892}, {'label': 'neutral', 'score': 0.16795265674591064}, {'label': 'positive', 'score': 0.03684930503368378}]
[{'label': 'negative', 'score': 0.6539247632026672

100%|██████████████████████████████████████████████████████████████████████████████| 1066/1066 [00:08<00:00, 128.01it/s]

[{'label': 'negative', 'score': 0.13469651341438293}, {'label': 'neutral', 'score': 0.4916401505470276}, {'label': 'positive', 'score': 0.37366339564323425}]
[{'label': 'negative', 'score': 0.9165812134742737}, {'label': 'neutral', 'score': 0.07223008573055267}, {'label': 'positive', 'score': 0.011188692413270473}]
[{'label': 'negative', 'score': 0.6963598132133484}, {'label': 'neutral', 'score': 0.28407543897628784}, {'label': 'positive', 'score': 0.019564734771847725}]
[{'label': 'negative', 'score': 0.9171780943870544}, {'label': 'neutral', 'score': 0.07556598633527756}, {'label': 'positive', 'score': 0.007256021723151207}]
[{'label': 'negative', 'score': 0.4067016541957855}, {'label': 'neutral', 'score': 0.473389208316803}, {'label': 'positive', 'score': 0.11990917474031448}]
[{'label': 'negative', 'score': 0.783981442451477}, {'label': 'neutral', 'score': 0.20310033857822418}, {'label': 'positive', 'score': 0.012918251566588879}]
[{'label': 'negative', 'score': 0.42400074005126953




In [8]:
# Evaluation
from sklearn.metrics import classification_report


def evaluate_performance(y_true, y_pred):
    """Create and print the classification report."""
    performance = classification_report(
        y_true, y_pred, target_names=["Negative Review", "Positive Review"]
    )
    print(performance)


evaluate_performance(data["test"]["label"], y_pred)

                 precision    recall  f1-score   support

Negative Review       0.76      0.88      0.81       533
Positive Review       0.86      0.72      0.78       533

       accuracy                           0.80      1066
      macro avg       0.81      0.80      0.80      1066
   weighted avg       0.81      0.80      0.80      1066



![Confusion matrix](images/confusion_matrix.png)

- **Precision**
- Of all the emails the model predicted as spam, how many were actually spam?
- Precision = TP/(TP+FP)
- A measure of the exactness or quality of your positive predictions. High precision means that when your model says something is positive, it is very likely correct.
- If your spam filter flagged 10 emails as spam and 8 of them were actually spam (2 were legitimate emails wrongly flagged), then your precision is 8/(8+2) = 0.8, or 80%.
- When is it crucial? When the cost of a **False Positive** is high.
- Spam detection: You want high precision because you don't want important (legitimate) emails to be marked as spam (false positives).
- Medical diagnosis (for a serious, rare disease): If a positive prediction leads to invasive and potentially harmful treatment, you want high precision to minimize false alarms.

- **Recall**
- Of all the emails that were actually spam, how many did the model correctly identify as spam?
- Recall = TP/(TP+FN)
- A measure of the completeness or coverage of your positive predictions. High recall means your model caught most of the actual positive cases.
- When is it crucial? When the cost of a **False Negative** is high.
- Fraud detection: You want high recall because you want to catch as many fraudulent transactions as possible, even if it means flagging some legitimate ones for review (false positives). Missing actual fraud (a false negative) is very costly.

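- **Accuracy**
- The number of correct predictions (true positives and true negatives) out of all predictions, which indicates the overall correctness of the model.
- Accuracy = (TP+TN)/(TP+TN+FP+FN)

- **F1 score**
- The harmonic mean of precision and recall, which balances the two into a single summary of the model's performance.
- F1 = 2 × Precision × Recall/(Precision + Recall)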
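
The four metrics can be worked through by hand on the spam example. The sketch below is only an illustration: the 8 true positives and 2 false positives come from the example above, while the 4 false negatives and 86 true negatives are hypothetical numbers added to make the example complete. The helper names `confusion_counts` and `summarize` are likewise made up for this sketch.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for one class treated as 'positive'."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def summarize(tp, fp, fn, tn):
    """Compute precision, recall, accuracy and F1 from the four counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1": f1}


# 1 = spam, 0 = legitimate.
# 8 TP and 2 FP follow the spam example; 4 FN and 86 TN are hypothetical.
y_true_toy = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 86
y_pred_toy = [1] * 8 + [1] * 2 + [0] * 4 + [0] * 86

counts = confusion_counts(y_true_toy, y_pred_toy)
scores = summarize(*counts)
print(counts)
print(scores)
```

With these assumed counts, accuracy is much higher than recall because the many true negatives dominate it, which is one reason accuracy alone can be misleading on imbalanced data.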



## Classification Tasks that Leverage Embeddings

- **Supervised classification**
- Instead of using a task-specific representation model directly for classification, we use an embedding model to generate features. Those features are then fed into a classifier, creating a two-step approach.
- A major benefit of this separation is that we do not need to fine-tune the embedding model, which can be costly. Instead, we can train a lightweight classifier, such as logistic regression, on the CPU.
- The embedding model is kept frozen when the embeddings are generated.

In [9]:
from sentence_transformers import SentenceTransformer

# Load model
model = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')

# Convert text to embeddings
train_embeddings = model.encode(data["train"]["text"], show_progress_bar=True)
test_embeddings = model.encode(data["test"]["text"], show_progress_bar=True)

Batches:   0%|          | 0/267 [00:00<?, ?it/s]

Batches:   0%|          | 0/34 [00:00<?, ?it/s]

In [10]:
train_embeddings.shape

(8530, 768)

- This shows that each of our 8,530 input documents has an embedding of dimension 768, so each embedding contains 768 numerical values.
- These embeddings are then fed into the classifier (here, logistic regression).

In [11]:
from sklearn.linear_model import LogisticRegression

# Train a Logistic Regression on our train embeddings
clf = LogisticRegression(random_state=42)
clf.fit(train_embeddings, data["train"]["label"])

In [12]:
# Predict labels for the held-out test embeddings
y_pred = clf.predict(test_embeddings)

In [13]:
evaluate_performance(data['test']['label'],y_pred)

                 precision    recall  f1-score   support

Negative Review       0.85      0.86      0.85       533
Positive Review       0.86      0.85      0.85       533

       accuracy                           0.85      1066
      macro avg       0.85      0.85      0.85      1066
   weighted avg       0.85      0.85      0.85      1066



By training a classifier on top of our embeddings, we managed to get an F1 score of 0.85! This demonstrates the possibilities of training a lightweight classifier while keeping the underlying embedding model frozen.
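
To make "frozen embeddings, trainable classifier" concrete, here is a minimal sketch of logistic regression trained with plain gradient descent on fixed feature vectors. It is an illustration only: the 3-dimensional `toy_embeddings` stand in for the real 768-dimensional vectors, the helpers `train_logreg` and `predict` are hypothetical, and scikit-learn's `LogisticRegression` uses a different, regularized solver rather than this loop.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(features, labels, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss.

    The feature vectors are never modified; only the classifier's
    parameters are updated, mirroring the frozen-embedding setup.
    """
    dim = len(features[0])
    n = len(features)
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss w.r.t. the logit
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(features, weights, bias):
    """Return 1 when the predicted probability is at least 0.5, else 0."""
    return [
        int(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5)
        for x in features
    ]


# Hypothetical, precomputed "embeddings" (1 = positive, 0 = negative review)
toy_embeddings = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.9, 0.0],
    [0.2, 0.8, 0.1],
]
toy_labels = [1, 1, 0, 0]

weights, bias = train_logreg(toy_embeddings, toy_labels)
toy_pred = predict(toy_embeddings, weights, bias)
print(toy_pred)
```

Only `weights` and `bias` change during training, so the cost is a small number of parameters per embedding dimension. That is why this step runs comfortably on a CPU.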

## What If We Do Not Have Labeled Data?