In [64]:
from transformers import pipeline
from datasets import load_dataset
from tqdm import tqdm
import pandas as pd
import plotly.express as px
import textwrap

In this notebook I'll showcase basic sentiment analysis on tweets using Hugging Face's `pipeline` interface.

In [83]:
classifier = pipeline(task="sentiment-analysis", model="distilbert/distilbert-base-uncased-finetuned-sst-2-english")

Device set to use cuda:0


Let's load the first 200 tweets of the test split and classify them.

In [84]:
dataset = load_dataset("carblacac/twitter-sentiment-analysis", split="test")

In [85]:
inputs = dataset['text'][:200]
truth = dataset['feeling'][:200]

In [86]:
answers = classifier(inputs)

In [88]:
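# One row per tweet: raw text, ground-truth label, raw pipeline output
df = pd.DataFrame({'text': inputs, 'truth': truth, 'output': answers})
# The dataset stores sentiment as 0/1; map it to the classifier's label names
df['truth'] = df['truth'].replace({0: 'NEGATIVE', 1: 'POSITIVE'})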

In [94]:
df['sentiment'] = df['output'].apply(lambda t: t['label'])
# The pipeline reports the confidence of the predicted label; convert it to P(NEGATIVE)
df['score'] = df['output'].apply(lambda t: t['score'] if t['label'] == 'NEGATIVE' else 1 - t['score'])
df

index,text,truth,output,sentiment,score,wrapped_text
0,@justineville ...yeahhh. ) i'm 39 tweets from ...,POSITIVE,"{'label': 'NEGATIVE', 'score': 0.9856041669845...",NEGATIVE,0.985604,@justineville ...yeahhh. ) i'm 39 tweets from<...
1,@ApplesnFeathers aww. Poor baby! On your only ...,NEGATIVE,"{'label': 'NEGATIVE', 'score': 0.9975548386573...",NEGATIVE,0.997555,@ApplesnFeathers aww. Poor baby! On your only ...
2,@joeymcintyre With my refunded $225 (Australia...,NEGATIVE,"{'label': 'NEGATIVE', 'score': 0.7486612200737}",NEGATIVE,0.748661,@joeymcintyre With my refunded $225 (Australia...
3,It's fine. Today sucks just because me those t...,NEGATIVE,"{'label': 'POSITIVE', 'score': 0.9985688924789...",POSITIVE,0.001431,It's fine. Today sucks just because me those<b...
4,"Im just chilling on psp and stuff, but sitting...",NEGATIVE,"{'label': 'NEGATIVE', 'score': 0.9953784942626...",NEGATIVE,0.995378,"Im just chilling on psp and stuff, but sitting..."
...,...,...,...,...,...,...
195,my bloodsugar is low but im late for class..s...,NEGATIVE,"{'label': 'NEGATIVE', 'score': 0.9951849579811...",NEGATIVE,0.995185,my bloodsugar is low but im late for class..s...
196,"Homw from camping, laundry and dinner done. T...",NEGATIVE,"{'label': 'NEGATIVE', 'score': 0.9930850267410...",NEGATIVE,0.993085,"Homw from camping, laundry and dinner done. T..."
197,@gilliganpierce Im sorry. at least u dont hav...,NEGATIVE,"{'label': 'NEGATIVE', 'score': 0.9988734126091...",NEGATIVE,0.998873,@gilliganpierce Im sorry. at least u dont hav...
198,@tearsasmith I saw your men's sandal question....,POSITIVE,"{'label': 'POSITIVE', 'score': 0.9949159622192...",POSITIVE,0.005084,@tearsasmith I saw your men's sandal question....
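Note that `score` in the table is the probability of the NEGATIVE class, not the pipeline's raw confidence. The pipeline returns the confidence of whichever label it picked, so POSITIVE predictions are flipped with `1 - score`. Here is a minimal sketch of that mapping; the helper name `negative_probability` and the two example outputs are made up for illustration and are not part of `transformers`.

```python
def negative_probability(output):
    """Map one pipeline result ({'label', 'score'}) to P(NEGATIVE)."""
    if output["label"] == "NEGATIVE":
        return output["score"]
    # For a two-class model the other class gets the remaining probability mass
    return 1 - output["score"]


# Illustrative outputs, not taken from the dataset
examples = [
    {"label": "NEGATIVE", "score": 0.75},
    {"label": "POSITIVE", "score": 0.75},
]
probs = [negative_probability(o) for o in examples]
print(probs)
```

With this convention, values near 1 mean "confidently negative" and values near 0 mean "confidently positive".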


Plotly Express tooltips do not wrap long text, so we insert `<br>` line breaks by hand to keep the tweets from being awkwardly cut off in the visualization.

In [95]:
df['wrapped_text'] = df['text'].apply(lambda x: '<br>'.join(textwrap.wrap(x, width=50)))
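To see what the wrapping produces, here is a small standalone sketch. The helper name `wrap_for_hover` is hypothetical and the sample string is just a repeated phrase used for illustration.

```python
import textwrap


def wrap_for_hover(text, width=50):
    """Split text into lines of at most `width` characters, joined with <br>."""
    lines = textwrap.wrap(text, width=width)
    return "<br>".join(lines)


# Illustrative long string (a short phrase repeated three times)
sample = ("busy busy! getting ready for my open house! " * 3).strip()
wrapped = wrap_for_hover(sample)
print(wrapped)
```

Each segment between `<br>` tags is at most 50 characters long, and Plotly renders each tag as a line break in the hover label.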

In [109]:
from sklearn.metrics import classification_report, ConfusionMatrixDisplay, confusion_matrix

report = classification_report(df['truth'], df['sentiment'])
print(report)

              precision    recall  f1-score   support

    NEGATIVE       0.65      0.85      0.74        98
    POSITIVE       0.79      0.57      0.66       102

    accuracy                           0.70       200
   macro avg       0.72      0.71      0.70       200
weighted avg       0.73      0.70      0.70       200



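The model performs moderately well (accuracy and macro F1 are both 0.70), although recall on POSITIVE tweets is only 0.57. It also seems poorly calibrated: it makes wrong predictions with very high confidence (row 0 above is misclassified with a score of 0.986). For ambiguous cases (e.g. "busy busy! getting ready for my open house!") the predicted probability should be closer to 0.5.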
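One way to inspect this is to list the rows where the prediction is wrong and the confidence is high. The sketch below works on plain dictionaries so it runs on its own; the function name `confident_errors`, the 0.95 threshold, and the toy rows are all assumptions made for illustration.

```python
def confident_errors(records, threshold=0.95):
    """Return rows that are misclassified with confidence >= threshold."""
    errors = []
    for row in records:
        # `score` is P(NEGATIVE), so confidence in the predicted label is max(p, 1 - p)
        confidence = max(row["score"], 1 - row["score"])
        if row["sentiment"] != row["truth"] and confidence >= threshold:
            errors.append(row)
    return errors


# Toy rows for illustration; with the notebook's data, pass df.to_dict("records")
rows = [
    {"truth": "POSITIVE", "sentiment": "NEGATIVE", "score": 0.99},  # wrong, confident
    {"truth": "NEGATIVE", "sentiment": "NEGATIVE", "score": 0.99},  # correct
    {"truth": "NEGATIVE", "sentiment": "POSITIVE", "score": 0.40},  # wrong, unsure
]
wrong = confident_errors(rows)
print(len(wrong), "confident errors out of", len(rows), "rows")
```

Reading the text of the returned rows is a quick way to check whether the errors are genuinely ambiguous tweets or clear mistakes.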

In [96]:
px.scatter(df, x='score', hover_data=['wrapped_text'], color='truth')

The model really likes to make extreme predictions. It's hard to give a quantitative estimate of the calibration because we only have binary labels; adding a "NEUTRAL" or "AMBIGUOUS" class to the dataset might help mitigate this problem.
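A coarse check is still possible with binary labels: bin the predicted P(NEGATIVE) values and compare the mean prediction in each bin with the observed fraction of NEGATIVE tweets, or summarize the gap with a Brier score. This is only a sketch under assumptions: the function names, the choice of 5 bins, and the toy numbers are illustrative, and with 200 tweets the per-bin estimates would be noisy.

```python
def brier_score(probs, labels):
    """Mean squared gap between predicted P(NEGATIVE) and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def reliability_table(probs, labels, n_bins=5):
    """Per non-empty bin: (bin index, count, mean prediction, observed rate)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    table = []
    for i, items in enumerate(bins):
        if items:
            mean_p = sum(p for p, _ in items) / len(items)
            rate = sum(y for _, y in items) / len(items)
            table.append((i, len(items), mean_p, rate))
    return table


# Toy values for illustration (label 1 = NEGATIVE). With the notebook's data:
# probs = df['score'].tolist(); labels = (df['truth'] == 'NEGATIVE').astype(int).tolist()
probs = [0.99, 0.98, 0.97, 0.02, 0.01, 0.55]
labels = [1, 0, 1, 0, 0, 1]
table = reliability_table(probs, labels)
brier = brier_score(probs, labels)
for row in table:
    print(row)
print("Brier score:", brier)
```

For a well-calibrated model the mean prediction and the observed rate in each bin would be close; a large gap in the extreme bins would support the impression from the scatter plot.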

In [108]:
px.histogram(df, x='score', nbins=20)