# Euphoria Twitter Analysis

## Scope:
Things I want to analyze or achieve with this notebook:

- Drop duplicates
- WordCloud
- Most tweeted-about characters
- Sentiment towards main characters  
- Most famous tweets about the show 


### Pre-Processing Tweets Steps

Tweets will be cleaned up using the following steps:
1. Removing punctuation
2. Tokenization - converting a sentence (full-text tweet) into a list of words
3. Removing stopwords
4. Lemmatization/Stemming - transforming any form of a word to its root word
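
The four steps above can be sketched with the standard library alone. This is only a rough illustration under stated assumptions: `STOPWORDS` is a tiny hypothetical list, and `naive_stem` is a crude stand-in for a real stemmer or lemmatizer, neither of which comes from this notebook.

```python
import string

# Hypothetical, deliberately tiny stopword list; a real run would use a fuller one
STOPWORDS = {"a", "the", "is", "to", "and", "of", "i", "that", "out"}

def naive_stem(word):
    # Crude suffix stripping, only a stand-in for a real stemmer or lemmatizer
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(tweet):
    # 1. Remove punctuation (this also strips '#' and '@')
    no_punct = tweet.translate(str.maketrans("", "", string.punctuation))
    # 2. Tokenize on whitespace after lowercasing
    tokens = no_punct.lower().split()
    # 3. Remove stopwords
    tokens = [t for t in tokens if t not in STOPWORDS]
    # 4. Reduce each token to a rough root form
    return [naive_stem(t) for t in tokens]

print(preprocess("#Euphoria I hope Maddy pulls the hair out that girl's scalp tonight"))
```

Note that step 1 removes `#`, so hashtags should be extracted before this cleaning runs if they are needed later.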

In [1]:
# import libraries 
import pandas as pd 
import regex as re

In [2]:
pd.set_option('display.max_colwidth', None)

In [3]:
tweets_df = pd.read_csv('euphoria_tweets_trimmed.csv', index_col=0)

In [4]:
tweets_df.head()

Unnamed: 0,tweet_full_text,tweet_created_at,tweet_id,tweet_username,tweet_user_screename,tweet_userid_str,tweet_user_location,tweet_user_verified,tweet_user_followers_count,tweet_user_friends_count,tweet_retweet_count,tweet_favorite_count,country_name,country_code
0,Really want a fez / Lexi parallel scene to when Spencer / Toby from PLL played Scrabble for the first time in the motel room and slept in the same bed. Is that just too much to ask ??? #Euphoria,2022-01-23 23:59:50,1.485402e+18,Katy,katykitty628,1.471191e+18,No Location,False,36.0,82.0,0.0,0.0,,
1,#Euphoria I hope Maddy pulls the hair out that girl's scalp tonight,2022-01-23 23:59:33,1.485402e+18,𝐕𝐢𝐫𝐠𝐞𝐚𝐮𝐱,_njauu,9.510643e+17,here,False,1586.0,221.0,0.0,0.0,India,IND
2,Between #AndJustLikeThat and #Euphoria @HBO is insisting on showing me more white dick then I've ever cared to see in my life. Sheesh.,2022-01-23 23:59:04,1.485402e+18,Steph Swinton,StephSwinton,23906170.0,New York,False,339.0,729.0,0.0,2.0,United States,USA
3,its #Euphoria day!!!!,2022-01-23 23:59:04,1.485402e+18,felic-titty,lilbootybbg,1.173761e+18,in therapy,False,49.0,145.0,0.0,2.0,,
4,YESSSIR DOING A QUICK STREAM BEFOE TONIGHT'S #euphoria Episode LETS BECOME THE ULTIMATE WEAPON LIVE @ 5:00PM #prototypegame #gamer #gaming #StreamersConnected #Streamcaster #SupportSmallerStreamers #SupportSmallStreams #Twitch #Youtube #twitchstreamer #roadtoaffiliate https://t.co/HMTCSzdC5Y,2022-01-23 23:58:56,1.485402e+18,Chris RossVlogs,ChrisRossvlogs,1.347083e+18,Mr. Hollywood,False,116.0,276.0,1.0,3.0,United States,USA


In [5]:
tweets_df.tail()

Unnamed: 0,tweet_full_text,tweet_created_at,tweet_id,tweet_username,tweet_user_screename,tweet_userid_str,tweet_user_location,tweet_user_verified,tweet_user_followers_count,tweet_user_friends_count,tweet_retweet_count,tweet_favorite_count,country_name,country_code
16976,I hate that I have a gut feeling that something awful is gonna happen with the Lexi and Fez situation 😭😭they’re so cute #Euphoria,2022-02-15 00:00:31,1.493375e+18,olivia,Liv_Jord,1.27821e+18,No Location,False,26.0,367.0,0.0,1.0,,
16977,What’s with all the adults on Euphoria just casually drinking with minors 😂 #EuphoriaHBO,2022-02-15 00:00:21,1.493375e+18,(toe•lu)🇳🇬,skiipTOmyLU,24566620.0,"Houston, TX",False,412.0,366.0,0.0,0.0,United States,USA
16978,“Episode eight is where we’ll get that sense of redemption.” Only two more episodes left this season of #Euphoria! https://t.co/kq0KAscb2P,2022-02-15 00:00:20,1.493375e+18,GRAZIA,graziatweets,20419040.0,New York,True,21726.0,794.0,1.0,0.0,United States,USA
16979,Rue be like #Euphoria https://t.co/E0Z6AEYyEb,2022-02-15 00:00:15,1.493375e+18,the business bitch,_babuba_,4765358000.0,No Location,False,1390.0,335.0,0.0,1.0,,
16980,Unique Rotating Ultra-Thin Steel Watch👉https://t.co/M3LvHmeHAf\nChoose The Best Gift For Your Valentine👉https://t.co/xoGrEKdDoi\nVisit Our website👉https://t.co/TYPGNjIVy1\nSubscribe Our Link👉https://t.co/HKL1DTiG4S\n\n#watches #Valentine #Euphoria #Odell #lssonlinemart #lsstrends https://t.co/z2wKCRxAUv,2022-02-15 00:00:01,1.493375e+18,Lucky Super Store,LSS_Store,1.062288e+18,"715 Main St, Asbury Park, NJ",False,1507.0,1477.0,0.0,0.0,United States,USA


In [6]:
tweets_df.shape

(186693, 14)

Next Steps: 
- Get the number of unique hashtags
- Then get the most popular hashtags and their counts
- Get the most popular characters that have been tweeted about 
    - Who is being tweeted about the most?
- Define stopwords 
- Then move on to:
1. Removing punctuation
2. Tokenization - converting a sentence (full-text tweet) into a list of words
3. Removing stopwords
4. Lemmatization/Stemming - transforming any form of a word to its root word

Don't forget to remove punctuation 

After all of the above, move on to data exploration and building visualizations in Python

In [7]:
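def getHashtags(tweet):
    # Raw string so that \w is read as a regex class, not a string escape
    tags = re.findall(r'#\w+', tweet)
    # Join the matches into one space-separated string
    un_listed_tags = " ".join(tags)
    return un_listed_tags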

In [8]:
tweets_df['hash_tags'] = tweets_df['tweet_full_text'].apply(getHashtags)

In [9]:
tweets_df.head()

Unnamed: 0,tweet_full_text,tweet_created_at,tweet_id,tweet_username,tweet_user_screename,tweet_userid_str,tweet_user_location,tweet_user_verified,tweet_user_followers_count,tweet_user_friends_count,tweet_retweet_count,tweet_favorite_count,country_name,country_code,hash_tags
0,Really want a fez / Lexi parallel scene to when Spencer / Toby from PLL played Scrabble for the first time in the motel room and slept in the same bed. Is that just too much to ask ??? #Euphoria,2022-01-23 23:59:50,1.485402e+18,Katy,katykitty628,1.471191e+18,No Location,False,36.0,82.0,0.0,0.0,,,#Euphoria
1,#Euphoria I hope Maddy pulls the hair out that girl's scalp tonight,2022-01-23 23:59:33,1.485402e+18,𝐕𝐢𝐫𝐠𝐞𝐚𝐮𝐱,_njauu,9.510643e+17,here,False,1586.0,221.0,0.0,0.0,India,IND,#Euphoria
2,Between #AndJustLikeThat and #Euphoria @HBO is insisting on showing me more white dick then I've ever cared to see in my life. Sheesh.,2022-01-23 23:59:04,1.485402e+18,Steph Swinton,StephSwinton,23906170.0,New York,False,339.0,729.0,0.0,2.0,United States,USA,#AndJustLikeThat #Euphoria
3,its #Euphoria day!!!!,2022-01-23 23:59:04,1.485402e+18,felic-titty,lilbootybbg,1.173761e+18,in therapy,False,49.0,145.0,0.0,2.0,,,#Euphoria
4,YESSSIR DOING A QUICK STREAM BEFOE TONIGHT'S #euphoria Episode LETS BECOME THE ULTIMATE WEAPON LIVE @ 5:00PM #prototypegame #gamer #gaming #StreamersConnected #Streamcaster #SupportSmallerStreamers #SupportSmallStreams #Twitch #Youtube #twitchstreamer #roadtoaffiliate https://t.co/HMTCSzdC5Y,2022-01-23 23:58:56,1.485402e+18,Chris RossVlogs,ChrisRossvlogs,1.347083e+18,Mr. Hollywood,False,116.0,276.0,1.0,3.0,United States,USA,#euphoria #prototypegame #gamer #gaming #StreamersConnected #Streamcaster #SupportSmallerStreamers #SupportSmallStreams #Twitch #Youtube #twitchstreamer #roadtoaffiliate


How many unique hashtags are there?

In [10]:
tweets_df['hash_tags'].nunique()

17508

In [11]:
tweets_df['hash_tags'].unique().tolist()

['#Euphoria',
 '#AndJustLikeThat #Euphoria',
 '#euphoria #prototypegame #gamer #gaming #StreamersConnected #Streamcaster #SupportSmallerStreamers #SupportSmallStreams #Twitch #Youtube #twitchstreamer #roadtoaffiliate',
 '#EuphoriaHBOMax #Euphoria #EuphoriaSeason2 #kat',
 '#MADDYPEREZ #maddy #euphoria #edit',
 '#EuphoriaHBOMax',
 '#EuphoriaDay #Euphoria',
 '#euphoria',
 '#EUPHORIA #Euphoria #EuphoriaHBOMax #EuphoriaSeason2',
 '#Euphoria #EuphoriaSeason2',
 '#LARvsTB #Euphoria #EuphoriaHBOMax',
 '#EuphoriaDay #EuphoriaHBOMax #Euphoria #EuphoriaSeason2',
 '#BuffaloBills #BillsvsChiefs #Bitcoin #Etherum #EuphoriaHBOMax #Euphoria',
 '#Euphoria #Ozark',
 '#EuphoriaHBOMax #Euphoria',
 '#LanaDelRey #WatercolorEyes #EuphoriaHBOMax',
 '#EuphoriaHBOMax #EuphoriaSeason2',
 '#21JumpStreet #Euphoria',
 '#EuphoriaHBOMax #americandad #FrancineSmith #EuphoriaDay #CantUnsee',
 '#EuphoriaHBOMax #PowerGhost',
 '#abouttime #equality #EuphoriaHBOMax #tiredofthetitties',
 '#OBSESSED #EuphoriaHBOMax',
 '#Euph

In [12]:
tweets_df[['hash_tags']].head(20)

Unnamed: 0,hash_tags
0,#Euphoria
1,#Euphoria
2,#AndJustLikeThat #Euphoria
3,#Euphoria
4,#euphoria #prototypegame #gamer #gaming #StreamersConnected #Streamcaster #SupportSmallerStreamers #SupportSmallStreams #Twitch #Youtube #twitchstreamer #roadtoaffiliate
5,#EuphoriaHBOMax #Euphoria #EuphoriaSeason2 #kat
6,#Euphoria
7,#MADDYPEREZ #maddy #euphoria #edit
8,#Euphoria
9,#EuphoriaHBOMax


Looking at the sample above, some values combine several hashtags into a single string. For example, row 2 holds `#AndJustLikeThat #Euphoria`, but those two hashtags should be counted separately. This also means the 17,508 figure from `nunique()` above counts unique combinations of hashtags, not unique hashtags.
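
As a side sketch, individual hashtags can also be counted without reshaping the DataFrame, by feeding each tweet's matches into a `collections.Counter`. The helper `count_hashtags` and its `case_sensitive` flag are hypothetical names introduced here, and the sample tweets are shortened from rows shown in this notebook. Folding case is an optional assumption, not something the notebook does.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#\w+")

def count_hashtags(tweets, case_sensitive=True):
    # Tally each hashtag separately, one tweet at a time
    counts = Counter()
    for text in tweets:
        tags = HASHTAG_RE.findall(text)
        if not case_sensitive:
            # Optional: treat '#Euphoria' and '#euphoria' as the same tag
            tags = [t.lower() for t in tags]
        counts.update(tags)
    return counts

sample = [
    "its #Euphoria day!!!!",
    "Between #AndJustLikeThat and #Euphoria @HBO is insisting",
    "DONT U DARE #MADDYPEREZ #maddy #euphoria #edit",
]
print(count_hashtags(sample).most_common(3))
print(count_hashtags(sample, case_sensitive=False)["#euphoria"])
```

Because nothing is exploded, the tweet table keeps one row per tweet and no later de-duplication is needed for this particular count.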

In [13]:
tweets_df['updated_tags']=tweets_df['hash_tags'].str.split(' ')

In [14]:
tweets_df.head()

Unnamed: 0,tweet_full_text,tweet_created_at,tweet_id,tweet_username,tweet_user_screename,tweet_userid_str,tweet_user_location,tweet_user_verified,tweet_user_followers_count,tweet_user_friends_count,tweet_retweet_count,tweet_favorite_count,country_name,country_code,hash_tags,updated_tags
0,Really want a fez / Lexi parallel scene to when Spencer / Toby from PLL played Scrabble for the first time in the motel room and slept in the same bed. Is that just too much to ask ??? #Euphoria,2022-01-23 23:59:50,1.485402e+18,Katy,katykitty628,1.471191e+18,No Location,False,36.0,82.0,0.0,0.0,,,#Euphoria,[#Euphoria]
1,#Euphoria I hope Maddy pulls the hair out that girl's scalp tonight,2022-01-23 23:59:33,1.485402e+18,𝐕𝐢𝐫𝐠𝐞𝐚𝐮𝐱,_njauu,9.510643e+17,here,False,1586.0,221.0,0.0,0.0,India,IND,#Euphoria,[#Euphoria]
2,Between #AndJustLikeThat and #Euphoria @HBO is insisting on showing me more white dick then I've ever cared to see in my life. Sheesh.,2022-01-23 23:59:04,1.485402e+18,Steph Swinton,StephSwinton,23906170.0,New York,False,339.0,729.0,0.0,2.0,United States,USA,#AndJustLikeThat #Euphoria,"[#AndJustLikeThat, #Euphoria]"
3,its #Euphoria day!!!!,2022-01-23 23:59:04,1.485402e+18,felic-titty,lilbootybbg,1.173761e+18,in therapy,False,49.0,145.0,0.0,2.0,,,#Euphoria,[#Euphoria]
4,YESSSIR DOING A QUICK STREAM BEFOE TONIGHT'S #euphoria Episode LETS BECOME THE ULTIMATE WEAPON LIVE @ 5:00PM #prototypegame #gamer #gaming #StreamersConnected #Streamcaster #SupportSmallerStreamers #SupportSmallStreams #Twitch #Youtube #twitchstreamer #roadtoaffiliate https://t.co/HMTCSzdC5Y,2022-01-23 23:58:56,1.485402e+18,Chris RossVlogs,ChrisRossvlogs,1.347083e+18,Mr. Hollywood,False,116.0,276.0,1.0,3.0,United States,USA,#euphoria #prototypegame #gamer #gaming #StreamersConnected #Streamcaster #SupportSmallerStreamers #SupportSmallStreams #Twitch #Youtube #twitchstreamer #roadtoaffiliate,"[#euphoria, #prototypegame, #gamer, #gaming, #StreamersConnected, #Streamcaster, #SupportSmallerStreamers, #SupportSmallStreams, #Twitch, #Youtube, #twitchstreamer, #roadtoaffiliate]"


In [15]:
tweets_df[['updated_tags']].head()

Unnamed: 0,updated_tags
0,[#Euphoria]
1,[#Euphoria]
2,"[#AndJustLikeThat, #Euphoria]"
3,[#Euphoria]
4,"[#euphoria, #prototypegame, #gamer, #gaming, #StreamersConnected, #Streamcaster, #SupportSmallerStreamers, #SupportSmallStreams, #Twitch, #Youtube, #twitchstreamer, #roadtoaffiliate]"


In [16]:
tweets_df = tweets_df.explode("updated_tags")

In [17]:
tweets_df.head()

Unnamed: 0,tweet_full_text,tweet_created_at,tweet_id,tweet_username,tweet_user_screename,tweet_userid_str,tweet_user_location,tweet_user_verified,tweet_user_followers_count,tweet_user_friends_count,tweet_retweet_count,tweet_favorite_count,country_name,country_code,hash_tags,updated_tags
0,Really want a fez / Lexi parallel scene to when Spencer / Toby from PLL played Scrabble for the first time in the motel room and slept in the same bed. Is that just too much to ask ??? #Euphoria,2022-01-23 23:59:50,1.485402e+18,Katy,katykitty628,1.471191e+18,No Location,False,36.0,82.0,0.0,0.0,,,#Euphoria,#Euphoria
1,#Euphoria I hope Maddy pulls the hair out that girl's scalp tonight,2022-01-23 23:59:33,1.485402e+18,𝐕𝐢𝐫𝐠𝐞𝐚𝐮𝐱,_njauu,9.510643e+17,here,False,1586.0,221.0,0.0,0.0,India,IND,#Euphoria,#Euphoria
2,Between #AndJustLikeThat and #Euphoria @HBO is insisting on showing me more white dick then I've ever cared to see in my life. Sheesh.,2022-01-23 23:59:04,1.485402e+18,Steph Swinton,StephSwinton,23906170.0,New York,False,339.0,729.0,0.0,2.0,United States,USA,#AndJustLikeThat #Euphoria,#AndJustLikeThat
2,Between #AndJustLikeThat and #Euphoria @HBO is insisting on showing me more white dick then I've ever cared to see in my life. Sheesh.,2022-01-23 23:59:04,1.485402e+18,Steph Swinton,StephSwinton,23906170.0,New York,False,339.0,729.0,0.0,2.0,United States,USA,#AndJustLikeThat #Euphoria,#Euphoria
3,its #Euphoria day!!!!,2022-01-23 23:59:04,1.485402e+18,felic-titty,lilbootybbg,1.173761e+18,in therapy,False,49.0,145.0,0.0,2.0,,,#Euphoria,#Euphoria


In [18]:
tweets_df.tail()

Unnamed: 0,tweet_full_text,tweet_created_at,tweet_id,tweet_username,tweet_user_screename,tweet_userid_str,tweet_user_location,tweet_user_verified,tweet_user_followers_count,tweet_user_friends_count,tweet_retweet_count,tweet_favorite_count,country_name,country_code,hash_tags,updated_tags
16980,Unique Rotating Ultra-Thin Steel Watch👉https://t.co/M3LvHmeHAf\nChoose The Best Gift For Your Valentine👉https://t.co/xoGrEKdDoi\nVisit Our website👉https://t.co/TYPGNjIVy1\nSubscribe Our Link👉https://t.co/HKL1DTiG4S\n\n#watches #Valentine #Euphoria #Odell #lssonlinemart #lsstrends https://t.co/z2wKCRxAUv,2022-02-15 00:00:01,1.493375e+18,Lucky Super Store,LSS_Store,1.062288e+18,"715 Main St, Asbury Park, NJ",False,1507.0,1477.0,0.0,0.0,United States,USA,#watches #Valentine #Euphoria #Odell #lssonlinemart #lsstrends,#Valentine
16980,Unique Rotating Ultra-Thin Steel Watch👉https://t.co/M3LvHmeHAf\nChoose The Best Gift For Your Valentine👉https://t.co/xoGrEKdDoi\nVisit Our website👉https://t.co/TYPGNjIVy1\nSubscribe Our Link👉https://t.co/HKL1DTiG4S\n\n#watches #Valentine #Euphoria #Odell #lssonlinemart #lsstrends https://t.co/z2wKCRxAUv,2022-02-15 00:00:01,1.493375e+18,Lucky Super Store,LSS_Store,1.062288e+18,"715 Main St, Asbury Park, NJ",False,1507.0,1477.0,0.0,0.0,United States,USA,#watches #Valentine #Euphoria #Odell #lssonlinemart #lsstrends,#Euphoria
16980,Unique Rotating Ultra-Thin Steel Watch👉https://t.co/M3LvHmeHAf\nChoose The Best Gift For Your Valentine👉https://t.co/xoGrEKdDoi\nVisit Our website👉https://t.co/TYPGNjIVy1\nSubscribe Our Link👉https://t.co/HKL1DTiG4S\n\n#watches #Valentine #Euphoria #Odell #lssonlinemart #lsstrends https://t.co/z2wKCRxAUv,2022-02-15 00:00:01,1.493375e+18,Lucky Super Store,LSS_Store,1.062288e+18,"715 Main St, Asbury Park, NJ",False,1507.0,1477.0,0.0,0.0,United States,USA,#watches #Valentine #Euphoria #Odell #lssonlinemart #lsstrends,#Odell
16980,Unique Rotating Ultra-Thin Steel Watch👉https://t.co/M3LvHmeHAf\nChoose The Best Gift For Your Valentine👉https://t.co/xoGrEKdDoi\nVisit Our website👉https://t.co/TYPGNjIVy1\nSubscribe Our Link👉https://t.co/HKL1DTiG4S\n\n#watches #Valentine #Euphoria #Odell #lssonlinemart #lsstrends https://t.co/z2wKCRxAUv,2022-02-15 00:00:01,1.493375e+18,Lucky Super Store,LSS_Store,1.062288e+18,"715 Main St, Asbury Park, NJ",False,1507.0,1477.0,0.0,0.0,United States,USA,#watches #Valentine #Euphoria #Odell #lssonlinemart #lsstrends,#lssonlinemart
16980,Unique Rotating Ultra-Thin Steel Watch👉https://t.co/M3LvHmeHAf\nChoose The Best Gift For Your Valentine👉https://t.co/xoGrEKdDoi\nVisit Our website👉https://t.co/TYPGNjIVy1\nSubscribe Our Link👉https://t.co/HKL1DTiG4S\n\n#watches #Valentine #Euphoria #Odell #lssonlinemart #lsstrends https://t.co/z2wKCRxAUv,2022-02-15 00:00:01,1.493375e+18,Lucky Super Store,LSS_Store,1.062288e+18,"715 Main St, Asbury Park, NJ",False,1507.0,1477.0,0.0,0.0,United States,USA,#watches #Valentine #Euphoria #Odell #lssonlinemart #lsstrends,#lsstrends


In [19]:
tweets_df['updated_tags'].nunique()

17348

In [20]:
tweets_df[['updated_tags']].head(20)

Unnamed: 0,updated_tags
0,#Euphoria
1,#Euphoria
2,#AndJustLikeThat
2,#Euphoria
3,#Euphoria
4,#euphoria
4,#prototypegame
4,#gamer
4,#gaming
4,#StreamersConnected


In [21]:
tweets_df.shape

(305309, 16)

In [22]:
tweets_df['updated_tags'].value_counts().head(10)

#Euphoria           141639
#EuphoriaHBOMax      25743
#euphoria            17957
#EuphoriaHBO         14327
#EuphoriaFinale       5015
#EuphoriaSeason2      4973
#tvtime               3195
#EUPHORIA             2725
#EuphoriaDay          1914
#Zendaya              1425
Name: updated_tags, dtype: int64

In [23]:
tag_counts = tweets_df.updated_tags.value_counts(normalize = True).mul(100).round(2)

In [24]:
tag_stats = pd.concat([tweets_df['updated_tags'].value_counts(), tag_counts], axis = 1)

In [25]:
tag_stats

Unnamed: 0,updated_tags,updated_tags.1
#Euphoria,141639,46.39
#EuphoriaHBOMax,25743,8.43
#euphoria,17957,5.88
#EuphoriaHBO,14327,4.69
#EuphoriaFinale,5015,1.64
...,...,...
#hannahmontana,1,0.00
#BheemlaNaayak,1,0.00
#ES_F,1,0.00
#dildaara,1,0.00


In [26]:
tag_stats = tag_stats.reset_index()

In [27]:
tag_stats.columns = ['hashtag', 'value_count', 'percentage']

In [28]:
top_ten_tags=tag_stats.head(10)

In [29]:
top_ten_tags

Unnamed: 0,hashtag,value_count,percentage
0,#Euphoria,141639,46.39
1,#EuphoriaHBOMax,25743,8.43
2,#euphoria,17957,5.88
3,#EuphoriaHBO,14327,4.69
4,#EuphoriaFinale,5015,1.64
5,#EuphoriaSeason2,4973,1.63
6,#tvtime,3195,1.05
7,#EUPHORIA,2725,0.89
8,#EuphoriaDay,1914,0.63
9,#Zendaya,1425,0.47


In [30]:
tweets_df = tweets_df.drop(columns = ['updated_tags'])

In [31]:
tweets_df.duplicated().sum()

118616

In [32]:
tweets_df.shape

(305309, 15)

In [33]:
tweets_df = tweets_df.drop_duplicates()

In [34]:
tweets_df.head()

Unnamed: 0,tweet_full_text,tweet_created_at,tweet_id,tweet_username,tweet_user_screename,tweet_userid_str,tweet_user_location,tweet_user_verified,tweet_user_followers_count,tweet_user_friends_count,tweet_retweet_count,tweet_favorite_count,country_name,country_code,hash_tags
0,Really want a fez / Lexi parallel scene to when Spencer / Toby from PLL played Scrabble for the first time in the motel room and slept in the same bed. Is that just too much to ask ??? #Euphoria,2022-01-23 23:59:50,1.485402e+18,Katy,katykitty628,1.471191e+18,No Location,False,36.0,82.0,0.0,0.0,,,#Euphoria
1,#Euphoria I hope Maddy pulls the hair out that girl's scalp tonight,2022-01-23 23:59:33,1.485402e+18,𝐕𝐢𝐫𝐠𝐞𝐚𝐮𝐱,_njauu,9.510643e+17,here,False,1586.0,221.0,0.0,0.0,India,IND,#Euphoria
2,Between #AndJustLikeThat and #Euphoria @HBO is insisting on showing me more white dick then I've ever cared to see in my life. Sheesh.,2022-01-23 23:59:04,1.485402e+18,Steph Swinton,StephSwinton,23906170.0,New York,False,339.0,729.0,0.0,2.0,United States,USA,#AndJustLikeThat #Euphoria
3,its #Euphoria day!!!!,2022-01-23 23:59:04,1.485402e+18,felic-titty,lilbootybbg,1.173761e+18,in therapy,False,49.0,145.0,0.0,2.0,,,#Euphoria
4,YESSSIR DOING A QUICK STREAM BEFOE TONIGHT'S #euphoria Episode LETS BECOME THE ULTIMATE WEAPON LIVE @ 5:00PM #prototypegame #gamer #gaming #StreamersConnected #Streamcaster #SupportSmallerStreamers #SupportSmallStreams #Twitch #Youtube #twitchstreamer #roadtoaffiliate https://t.co/HMTCSzdC5Y,2022-01-23 23:58:56,1.485402e+18,Chris RossVlogs,ChrisRossvlogs,1.347083e+18,Mr. Hollywood,False,116.0,276.0,1.0,3.0,United States,USA,#euphoria #prototypegame #gamer #gaming #StreamersConnected #Streamcaster #SupportSmallerStreamers #SupportSmallStreams #Twitch #Youtube #twitchstreamer #roadtoaffiliate


In [35]:
tweets_df.shape

(186693, 15)

In [36]:
tweets_df.dtypes

tweet_full_text                object
tweet_created_at               object
tweet_id                      float64
tweet_username                 object
tweet_user_screename           object
tweet_userid_str              float64
tweet_user_location            object
tweet_user_verified              bool
tweet_user_followers_count    float64
tweet_user_friends_count      float64
tweet_retweet_count           float64
tweet_favorite_count          float64
country_name                   object
country_code                   object
hash_tags                      object
dtype: object

## Cast Member Name Extraction

In [37]:
# Top cast as per IMDb
cast_names = ['rue', 'jules', 'fez', 'fezco', 'nate', 'lexi', 'maddy', 'kat', 'cassie', 'ash', 'ashtray', 'leslie', 'cal', 'suze']

In [38]:
# Testing RegEx on sample text
reg_test = re.compile(r"\L<options>",re.I, options=cast_names)

In [39]:
text_sample = "#Euphoria I hope Maddy pulls the hair out that girl's scalp tonight"

In [40]:
reg_test.findall(text_sample)

['Maddy', 'cal']

The example above, using the variable `text_sample`, shows that this regex isn't going to work as written: one of the Euphoria characters is named Cal, and the pattern matched the 'cal' inside 'scalp' as Cal.

I need to do some testing to figure out how to get the results back without a list

In [41]:
x = re.compile(r'maddy|cal')

In [42]:
y = x.search("#Euphoria I hope Maddy pulls the hair out that girl's scalp tonight")

In [43]:
y.group()

'cal'

In [44]:
# New Try
reg_test_update = re.compile(r"\b\L<options>\b",re.I, options=cast_names)

In [45]:
reg_test_update.findall(text_sample)

['Maddy']

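With the `\b` word boundaries added on both sides, the pattern only matches whole words, so 'scalp' is no longer mistaken for Cal. The modified pattern is good to go.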
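
The `\L<options>` named-list syntax is specific to the third-party `regex` package. As a hedged sketch, the same word-boundary behavior can be reproduced with the standard library `re` module by joining the names into an alternation. The variable names `alternation`, `unbounded`, and `bounded` are introduced here for illustration only.

```python
import re

cast_names = ["rue", "jules", "fez", "fezco", "nate", "lexi", "maddy", "kat",
              "cassie", "ash", "ashtray", "leslie", "cal", "suze"]

# Build 'rue|jules|fez|...' and escape each name in case it contains regex characters
alternation = "|".join(re.escape(name) for name in cast_names)

# Without boundaries, names can match inside longer words
unbounded = re.compile(alternation, re.I)
# With \b on both sides, only whole words match
bounded = re.compile(r"\b(?:" + alternation + r")\b", re.I)

text_sample = "#Euphoria I hope Maddy pulls the hair out that girl's scalp tonight"
print(unbounded.findall(text_sample))
print(bounded.findall(text_sample))
```

The non-capturing group `(?:...)` matters here: without it, the `\b` anchors would bind only to the first and last names in the alternation.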

In [52]:
# Find every cast name mentioned in a tweet and return the matches as one comma-separated string;
# the frequency of these names is used afterwards to ascertain who appears the most
cast_names = ["rue", 'jules', 'fez', 'fezco', 'nate', 'lexi', 'maddy', 'kat', 'cassie', 'ash', 
              'ashtray', 'leslie',"cal", 'suze']
# Compile once, outside the function, so the pattern is not rebuilt for every tweet
reg_char = re.compile(r"\b\L<options>\b",re.I, options=cast_names)

def most_popular_character(tweet):
    # Whole-word, case-insensitive matches, in the order they appear in the tweet
    names = reg_char.findall(tweet)
    return ', '.join(names)

In [53]:
tweets_df['characters'] = tweets_df['tweet_full_text'].apply(most_popular_character)

In [59]:
tweets_df.head(10)

Unnamed: 0,tweet_full_text,tweet_created_at,tweet_id,tweet_username,tweet_user_screename,tweet_userid_str,tweet_user_location,tweet_user_verified,tweet_user_followers_count,tweet_user_friends_count,tweet_retweet_count,tweet_favorite_count,country_name,country_code,hash_tags,characters
0,Really want a fez / Lexi parallel scene to when Spencer / Toby from PLL played Scrabble for the first time in the motel room and slept in the same bed. Is that just too much to ask ??? #Euphoria,2022-01-23 23:59:50,1.485402e+18,Katy,katykitty628,1.471191e+18,No Location,False,36.0,82.0,0.0,0.0,,,#Euphoria,"fez, Lexi"
0,Really want a fez / Lexi parallel scene to when Spencer / Toby from PLL played Scrabble for the first time in the motel room and slept in the same bed. Is that just too much to ask ??? #Euphoria,2022-01-23 23:59:50,1.485402e+18,Katy,katykitty628,1.471191e+18,No Location,False,36.0,82.0,0.0,0.0,,,#Euphoria,"fez, Lexi"
1,#Euphoria I hope Maddy pulls the hair out that girl's scalp tonight,2022-01-23 23:59:33,1.485402e+18,𝐕𝐢𝐫𝐠𝐞𝐚𝐮𝐱,_njauu,9.510643e+17,here,False,1586.0,221.0,0.0,0.0,India,IND,#Euphoria,Maddy
2,Between #AndJustLikeThat and #Euphoria @HBO is insisting on showing me more white dick then I've ever cared to see in my life. Sheesh.,2022-01-23 23:59:04,1.485402e+18,Steph Swinton,StephSwinton,23906170.0,New York,False,339.0,729.0,0.0,2.0,United States,USA,#AndJustLikeThat #Euphoria,
3,its #Euphoria day!!!!,2022-01-23 23:59:04,1.485402e+18,felic-titty,lilbootybbg,1.173761e+18,in therapy,False,49.0,145.0,0.0,2.0,,,#Euphoria,
4,YESSSIR DOING A QUICK STREAM BEFOE TONIGHT'S #euphoria Episode LETS BECOME THE ULTIMATE WEAPON LIVE @ 5:00PM #prototypegame #gamer #gaming #StreamersConnected #Streamcaster #SupportSmallerStreamers #SupportSmallStreams #Twitch #Youtube #twitchstreamer #roadtoaffiliate https://t.co/HMTCSzdC5Y,2022-01-23 23:58:56,1.485402e+18,Chris RossVlogs,ChrisRossvlogs,1.347083e+18,Mr. Hollywood,False,116.0,276.0,1.0,3.0,United States,USA,#euphoria #prototypegame #gamer #gaming #StreamersConnected #Streamcaster #SupportSmallerStreamers #SupportSmallStreams #Twitch #Youtube #twitchstreamer #roadtoaffiliate,
5,That scene where Kat imagines those girls is literally the comment section when someone opens up about their lack of self esteem 😭😭like just yelling LOVE YOURSELF! doesn't change anything\n#EuphoriaHBOMax #Euphoria #EuphoriaSeason2 #kat,2022-01-23 23:58:39,1.485402e+18,Ga€ll€,herbabymama,1.3802e+18,No Location,False,2.0,34.0,0.0,1.0,,,#EuphoriaHBOMax #Euphoria #EuphoriaSeason2 #kat,"Kat, kat"
5,That scene where Kat imagines those girls is literally the comment section when someone opens up about their lack of self esteem 😭😭like just yelling LOVE YOURSELF! doesn't change anything\n#EuphoriaHBOMax #Euphoria #EuphoriaSeason2 #kat,2022-01-23 23:58:39,1.485402e+18,Ga€ll€,herbabymama,1.3802e+18,No Location,False,2.0,34.0,0.0,1.0,,,#EuphoriaHBOMax #Euphoria #EuphoriaSeason2 #kat,"Kat, kat"
6,Just a few more hours till ep 3 comes out 🥳 #Euphoria,2022-01-23 23:58:29,1.485402e+18,🤍,ayeshhaa_al,1.373359e+18,"Detroit, MI",False,101.0,98.0,0.0,0.0,United States,USA,#Euphoria,
7,DONT U DARE MENTION THE 3D TUNNEL I KNOW IT SUCKS. #MADDYPEREZ #maddy #euphoria #edit https://t.co/BtoDe7G4Ww,2022-01-23 23:58:10,1.485402e+18,noelle (taylor's version),sbtuarry,1.295025e+18,genderfluid :) + it / it's,False,578.0,455.0,1.0,2.0,,,#MADDYPEREZ #maddy #euphoria #edit,maddy


In [58]:
tweets_df=tweets_df.explode('characters')


In [56]:
tweets_df['characters'].value_counts()

          99715
Rue       11411
Cassie     7218
Cal        5032
Nate       4307
          ...  
RuE           1
cASSIE        1
JuLES         1
mADDY         1
caSsiE        1
Name: characters, Length: 4991, dtype: int64

In [60]:
tweets_df['characters'].tolist()

['fez, Lexi',
 'fez, Lexi',
 'Maddy',
 '',
 '',
 '',
 'Kat, kat',
 'Kat, kat',
 '',
 'maddy',
 '',
 '',
 'Nate, Ash',
 'Nate, Ash',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 'Nate, Jules',
 'Nate, Jules',
 '',
 '',
 '',
 '',
 'Rue',
 '',
 'Kat',
 '',
 '',
 '',
 'rue',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 'Nate',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 'maddy',
 '',
 '',
 '',
 '',
 '',
 'Fez',
 '',
 '',
 '',
 'Lexi',
 '',
 '',
 'Fez, Lexi',
 'Fez, Lexi',
 'Nate',
 '',
 '',
 '',
 '',
 'fezco, lexi, fez',
 'fezco, lexi, fez',
 'fezco, lexi, fez',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 'maddy, cassie',
 'maddy, cassie',
 '',
 '',
 '',
 '',
 '',
 'fezco',
 'Lexi, Fez, Maddy, Fez',
 'Lexi, Fez, Maddy, Fez',
 'Lexi, Fez, Maddy, Fez',
 'Lexi, Fez, Maddy, Fez',
 '',
 '',
 'Nate, Cassie, Nate, Maddy',
 'Nate, Cassie, Nate, Maddy',
 'Nate, Cassie, Nate, Maddy',
 'Nate, Cassie, Nate, Maddy',
 'Lexi, Fez',
 'Lexi, Fez',
 '',
 'fez, lexi',
 'fez, lexi',
 '',
 '',


In [None]:
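def clean_alt_list(list_):
    # Turn 'fez, Lexi' into the list literal '["fez","Lexi"]'; an empty string becomes '[]'
    if not list_:
        return '[]'
    return '["' + list_.replace(', ', '","') + '"]'

In [None]:
tweets_df['col'] = tweets_df['characters'].apply(clean_alt_list)


In [None]:
import ast
# Parse the list literals in 'col' into real Python lists (literal_eval is safer than eval)
tweets_df["col"] = tweets_df["col"].apply(ast.literal_eval)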


Run the function to get two character columns: one for finding the most popular character and another for positive/negative character sentiment.

Explode one column to get the most common names, and keep another column that simply holds the names as a plain string rather than a list. That second column will come in handy for the sentiment step, to ascertain which character has the most positive or negative sentiment.
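
A minimal sketch of the counting step, assuming the `characters` values look like the comma-separated strings shown earlier (for example `'Kat, kat'`). The `ALIASES` map and both helper names are hypothetical. Folding `fezco` into `fez` and `ashtray` into `ash` is an assumption about which names refer to the same character, not something this notebook establishes.

```python
from collections import Counter

# Assumed alias map (not from the notebook): nicknames folded into one canonical name
ALIASES = {"fezco": "fez", "ashtray": "ash"}

def normalize_names(characters):
    """Turn 'Kat, kat' into ['kat']: lowercase, fold aliases, drop repeats."""
    seen = []
    for raw in characters.split(","):
        name = raw.strip().lower()
        if not name:
            continue  # skip tweets with no character match
        name = ALIASES.get(name, name)
        if name not in seen:
            seen.append(name)
    return seen

def count_characters(character_column):
    # Count each character at most once per tweet
    counts = Counter()
    for characters in character_column:
        counts.update(normalize_names(characters))
    return counts

sample = ["fez, Lexi", "Maddy", "", "Kat, kat", "fezco, lexi, fez", "Nate, Cassie, Nate, Maddy"]
print(count_characters(sample).most_common())
```

Lowercasing first collapses variants such as `RuE` and `cASSIE` into a single entry. Duplicate tweet rows would still inflate these counts, so de-duplication should happen before this step.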

## Most Viral Tweets 

## How many negative tweets did Cassie get as opposed to Nate?