Here's a different model that can predict seven different emotions:
* anger 🤬
* disgust 🤢
* fear 😨
* joy 😀
* neutral 😐
* sadness 😭
* surprise 😲


In [52]:
from transformers import pipeline
# Model card: https://huggingface.co/j-hartmann/emotion-english-distilroberta-base
# top_k=None returns the scores for all seven labels, not just the best one.
classifier = pipeline("text-classification", model="j-hartmann/emotion-english-distilroberta-base", top_k=None)
classifier("I love this!")

[[{'label': 'joy', 'score': 0.9771687984466553},
  {'label': 'surprise', 'score': 0.008528688922524452},
  {'label': 'neutral', 'score': 0.005764583125710487},
  {'label': 'anger', 'score': 0.004419781267642975},
  {'label': 'sadness', 'score': 0.002092392183840275},
  {'label': 'disgust', 'score': 0.0016119900392368436},
  {'label': 'fear', 'score': 0.0004138521908316761}]]

In [53]:
classifier("That was absolutely disgusting")

[[{'label': 'disgust', 'score': 0.9869235157966614},
  {'label': 'anger', 'score': 0.004495014902204275},
  {'label': 'fear', 'score': 0.003422108246013522},
  {'label': 'neutral', 'score': 0.0023785908706486225},
  {'label': 'sadness', 'score': 0.0016001597978174686},
  {'label': 'surprise', 'score': 0.0008600382134318352},
  {'label': 'joy', 'score': 0.0003206325345672667}]]
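
Because `top_k=None` makes the pipeline return a score for every label, it can be handy to pull out only the strongest emotion. The helper below is a minimal sketch: the name `top_emotion` is hypothetical (it is not part of transformers), and it runs on a hard-coded copy of the output above, with scores rounded, so no model download is needed.

```python
def top_emotion(pipeline_output):
    """Return (label, score) of the highest-scoring emotion for the first input text."""
    # The pipeline returns one list of {'label', 'score'} dicts per input text.
    scores = pipeline_output[0]
    best = max(scores, key=lambda item: item['score'])
    return best['label'], best['score']


# Copy of the output shown above, with scores rounded for brevity.
sample_output = [[
    {'label': 'disgust', 'score': 0.9869},
    {'label': 'anger', 'score': 0.0045},
    {'label': 'fear', 'score': 0.0034},
    {'label': 'neutral', 'score': 0.0024},
    {'label': 'sadness', 'score': 0.0016},
    {'label': 'surprise', 'score': 0.0009},
    {'label': 'joy', 'score': 0.0003},
]]

print(top_emotion(sample_output))  # ('disgust', 0.9869)
```

With the real model, the same call would be `top_emotion(classifier("That was absolutely disgusting"))`.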

### Exercise 3. Explore Twitter Sentiment - How was ChatGPT perceived at the start of 2023?
\**Because you'd each need to create an individual Twitter account to scrape data, we'll use an existing dataset of real tweets instead.*

\**Because running these pretrained models takes quite a bit of time on a CPU, we'll only look at samples of 100 tweets in our investigations. If you have access to a GPU and want to run the investigation on all the data, feel free to do so.*



1. Create a dataframe called first_100_en_tweets that stores the first 100 tweets in English.
2. Create a dataframe called first_100_de_tweets that stores the first 100 tweets in German.
3. Add the sentiment value of each tweet to these dataframes (in a new "sentiment" column).

   *Head over to https://huggingface.co/models?pipeline_tag=text-classification and find a German-language sentiment model for the DE tweets.* For the English ones you can use the model given in the code below.
4. Separately calculate the average sentiment of the 100 EN tweets and of the 100 DE tweets, and see which audience seems to like ChatGPT more.
5. Find the user with the highest total LikeCount (filter by Language = en) and check whether their tweets are positive or negative on average.
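
Tasks 1 and 2, read literally as "the first 100 rows in file order", could be sketched as follows. The tiny dataframe here is made-up stand-in data that only borrows the Text and Language column names from the dataset, and the helper name `first_n_by_language` is hypothetical.

```python
import pandas as pd


def first_n_by_language(df, language, n=100):
    """Return the first n rows (in original order) whose Language equals `language`."""
    return df[df['Language'] == language].head(n).reset_index(drop=True)


# Made-up stand-in for the real dataset (same column names, invented rows).
toy_tweets = pd.DataFrame({
    'Text': ['hello', 'hallo', 'hi there', 'guten Tag', 'bonjour'],
    'Language': ['en', 'de', 'en', 'de', 'fr'],
})

first_100_en_tweets = first_n_by_language(toy_tweets, 'en')
first_100_de_tweets = first_n_by_language(toy_tweets, 'de')
print(first_100_en_tweets['Text'].tolist())  # ['hello', 'hi there']
print(first_100_de_tweets['Text'].tolist())  # ['hallo', 'guten Tag']
```

On the real data, the loaded tweets dataframe would take the place of `toy_tweets`. Note that the solution cells below make a different choice and take the 100 most-liked tweets per language.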

In [54]:
from transformers import pipeline
# English tweets sentiment model (alternative, left commented out):
# sentiment_pipeline = pipeline("text-classification", model='ProsusAI/finbert')

In [55]:
# This model labels tweets as LABEL_0 (negative), LABEL_1 (neutral) or LABEL_2 (positive)
sentiment_pipeline = pipeline("text-classification", model='cardiffnlp/twitter-roberta-base-sentiment')

In [56]:
import pandas as pd
tweets = pd.read_csv('chatgpt_tweets.csv')
tweets

Unnamed: 0,Datetime,Text,Username,ReplyCount,RetweetCount,LikeCount,Language
0,2023-01-22 13:44:34+00:00,ChatGPTで遊ぶの忘れてた！！\n書類作るコード書いてみてほしいのと、\nどこまで思考整...,mochico0123,1,0,5,ja
1,2023-01-22 13:44:39+00:00,@AlexandrovnaIng Prohibition of ChatGPT has be...,Caput_LupinumSG,1,0,5,en
2,2023-01-22 13:44:44+00:00,"Schaut Euch an, was @fobizz @DianaKnodel alles...",ciffi,0,0,4,de
3,2023-01-22 13:44:49+00:00,Bow down to chatGPT 🫡..... https://t.co/ENTSzi...,Vishwasrisiri,0,0,2,en
4,2023-01-22 13:44:52+00:00,"Profilinde vatan, Türkiye falan yazan bireyler...",0xGenetikciniz,0,0,4,tr
...,...,...,...,...,...,...,...
49996,2023-01-24 06:57:56+00:00,"#ChatGPT ist ein #Chatbot, der durch künstlich...",HorstKrieger,0,0,0,de
49997,2023-01-24 06:57:59+00:00,@r8r Ich hab mal die AI dazu befragt (ChatGPT)...,werpu,0,0,0,de
49998,2023-01-24 06:58:00+00:00,5 minuti di #chatGPT e ho capito che apprende ...,marcopiccinini,0,0,0,it
49999,2023-01-24 06:58:01+00:00,Portland Shop Uses ChatGPT To Tell Family Stor...,EuniceNyandat,0,0,0,en


In [57]:
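# Take the 100 most-liked English tweets (sorted by LikeCount, descending)
tweets_en = tweets[tweets['Language']=='en'].sort_values(by=['LikeCount'],ascending=False).head(100)
tweets_en = tweets_en.reset_index(drop=True)
tweets_en.index+=1  # number the rows from 1
# Run the model on each tweet and keep only the predicted label
tweets_en['Sentiment'] = tweets_en['Text'].apply(sentiment_pipeline).str[0]
tweets_en['Sentiment'] = tweets_en['Sentiment'].apply(lambda x: x['label'])
# Map the model's raw labels to readable names, in the Sentiment column only
tweets_en['Sentiment'] = tweets_en['Sentiment'].replace({'LABEL_0': 'negative', 'LABEL_1': 'neutral', 'LABEL_2': 'positive'})
tweets_en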

In [58]:
results_en=tweets_en['Sentiment'].value_counts()
results_en

neutral     38
positive    36
negative    26
Name: Sentiment, dtype: int64

In [91]:
# Score each label (negative=1, neutral=3, positive=5) and average over all scored tweets
average_en = (results_en['negative']*1 + results_en['neutral']*3 + results_en['positive']*5) / results_en.sum()
print(f"Score: {average_en} / 5.0")

Score: 3.2 / 5.0
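
The score above gives each label a point value (negative = 1, neutral = 3, positive = 5) and averages over all tweets. Below is a minimal sketch of that calculation as a reusable function. The names `sentiment_score` and `LABEL_POINTS` are hypothetical, and a plain dict stands in for the `value_counts()` result.

```python
# Point value per label, as used in the cell above.
LABEL_POINTS = {'negative': 1, 'neutral': 3, 'positive': 5}


def sentiment_score(counts):
    """Average point value on a 1-5 scale; a label missing from `counts` counts as zero."""
    total = sum(counts.get(label, 0) for label in LABEL_POINTS)
    if total == 0:
        raise ValueError("no labelled tweets to score")
    weighted = sum(points * counts.get(label, 0) for label, points in LABEL_POINTS.items())
    return weighted / total


# Label counts taken from the English results above.
print(sentiment_score({'neutral': 38, 'positive': 36, 'negative': 26}))  # 3.2
```

Since the function only relies on `.get`, it should also accept a pandas Series such as `results_en` directly, and it avoids a KeyError when one of the three labels never occurs.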


In [60]:
# German tweets sentiment model; it returns readable labels (negative / neutral / positive) directly
sentiment_pipeline = pipeline("text-classification", model='JP040/bert-german-sentiment-twitter')

In [61]:
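# Take the 100 most-liked German tweets (sorted by LikeCount, descending)
tweets_de = tweets[tweets['Language']=='de'].sort_values(by=['LikeCount'],ascending=False).head(100)
tweets_de = tweets_de.reset_index(drop=True)
tweets_de.index+=1  # number the rows from 1
# Run the model on each tweet and keep only the predicted label
tweets_de['Sentiment'] = tweets_de['Text'].apply(sentiment_pipeline).str[0]
tweets_de['Sentiment'] = tweets_de['Sentiment'].apply(lambda x: x['label'])
tweets_de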

Unnamed: 0,Datetime,Text,Username,ReplyCount,RetweetCount,LikeCount,Language,Sentiment
1,2023-01-22 17:37:34+00:00,"Lehrer wollen mit Bedenken, dass Schüler ihre ...",hrtgn,374,25,1608,de,neutral
2,2023-01-22 17:18:00+00:00,Ich habe ChatGPT einen Dialog zwischen Frank T...,janskudlarek,22,49,759,de,neutral
3,2023-01-22 15:56:39+00:00,Dieser Leitfaden zur Nutzung von #ChatGPT mit ...,an_annago,7,68,273,de,neutral
4,2023-01-23 17:42:59+00:00,#OpenAI (wichtigste Geldgeber: #ElonMusk &amp;...,wolff_ernst,21,56,203,de,neutral
5,2023-01-22 18:31:12+00:00,@hrtgn Da würde ich mir als Autor und Satirike...,retrostyle3000,2,3,176,de,neutral
...,...,...,...,...,...,...,...,...
96,2023-01-23 13:00:35+00:00,Da hat man mal Zeit und schon ist #ChatGPT dau...,BioSchwan,3,0,13,de,negative
97,2023-01-22 18:05:02+00:00,Der Gesundheitsminister wurde durch ChatGPT er...,johnny_raccoon,0,0,12,de,neutral
98,2023-01-23 09:50:51+00:00,Der Ausschuss hat eine Infosammlung zum Thema ...,AusschussG,0,3,12,de,neutral
99,2023-01-22 15:21:40+00:00,Ich habe #ChatGPT auch nach den Argumenten geg...,multitalentfrey,1,2,12,de,neutral


In [62]:
results_de=tweets_de['Sentiment'].value_counts()
results_de

neutral     80
negative    19
positive     1
Name: Sentiment, dtype: int64

In [63]:
# Same scoring as for the English tweets: negative=1, neutral=3, positive=5
average_de = (results_de['negative']*1 + results_de['neutral']*3 + results_de['positive']*5) / results_de.sum()
print(f"Score: {average_de} / 5.0")

Score: 2.64 / 5.0


In [72]:
# Total likes per user among the 100 most-liked English tweets; the first row is the top user
tweets_en.groupby('Username')['LikeCount'].sum().sort_values(ascending=False)

Username
GRDecter           72183
WatcherGuru        21673
mccormick_ted      16856
noor_siddiqui_     13290
emollick            9946
                   ...  
Ed_FilmBooth         339
MaxWinebach          334
mukundabhinav        334
DarrenJBeattie       327
iansmithfitness      326
Name: LikeCount, Length: 95, dtype: int64

In [85]:
results=tweets_en[tweets_en['Username']=='GRDecter']['Sentiment'].value_counts()
if 'negative' not in results:
    results['negative']=0
results

positive    2
neutral     1
negative    0
Name: Sentiment, dtype: int64

In [89]:
# Divide by the user's actual tweet count instead of a hard-coded 3
average = round((results['negative']*1 + results['neutral']*3 + results['positive']*5) / results.sum(), 2)
print(f"Score: {average} / 5.0")

Score: 4.33 / 5.0
