## Morbius - Twitter data

In [1]:
import numpy as np
import pandas as pd
import tweepy

In [2]:
# insert the keys here
consumer_key = '' 
consumer_secret = ''
access_token = ''
access_token_secret = ''
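
Hard-coded keys are easy to leak when a notebook is shared. One alternative, sketched here, reads them from environment variables. The variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_twitter_credentials` are hypothetical choices for this example, not something tweepy requires.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
CREDENTIAL_VARS = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_twitter_credentials(env=None):
    """Return the four credentials as a dict, raising if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [var for var in CREDENTIAL_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[var] for name, var in CREDENTIAL_VARS.items()}
```

The returned dict could then be unpacked into the four variables used in the rest of the notebook.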

The next step is to create an OAuthHandler instance. We pass it the consumer key and consumer secret defined above, then set the access token and access token secret on it.

In [3]:
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

Next, we pass the OAuthHandler instance to the ‘tweepy.API’ constructor.

In [4]:
api = tweepy.API(auth)

Next, we initialize lists for the fields we want to analyze. For now, these are the tweet text, the user name, and the time of the tweet.

We then write a for loop over a tweepy ‘Cursor’ object. To the ‘Cursor’ we pass the ‘api.search_tweets’ method, the query string (q="Morbius") to search for, and ‘count’, the number of tweets requested per page. The standard search endpoint returns at most 100 tweets per request, so larger values are capped. Finally, the ‘items()’ method turns the ‘Cursor’ into an iterable, and its argument (1000) limits the total number of tweets we fetch, which helps us stay within the Twitter rate limit.
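
To make the relationship between the page size and the total concrete, here is a small arithmetic sketch. The helper `requests_needed` is hypothetical, and it assumes every page comes back full, which a real search does not guarantee.

```python
import math

MAX_PAGE_SIZE = 100  # per-request cap of the standard search endpoint


def requests_needed(total_items, count):
    """Estimate how many search requests are needed to collect total_items tweets."""
    page_size = min(count, MAX_PAGE_SIZE)  # values above the cap have no effect
    return math.ceil(total_items / page_size)


print(requests_needed(1000, 1000))  # 10
print(requests_needed(1000, 15))    # 67
```

Under these assumptions, asking for 1000 tweets costs about ten requests with the largest page size, and considerably more with a smaller one.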

To simplify the analysis, we drop retweets and keep only tweets in English.

In [6]:
'''
twitter_users = []
tweet_time = []
tweet_string = []

for tweet in tweepy.Cursor(api.search_tweets, q="Morbius", count=100).items(1000):
    if (not tweet.retweeted) and ('RT @' not in tweet.text):
        if tweet.lang == "en":
            twitter_users.append(tweet.user.name)
            tweet_time.append(tweet.created_at)
            tweet_string.append(tweet.text)

print([tweet.user.name,tweet.created_at,tweet.text])
'''

['Jingly-Jangly🚶\u200d♀️🐕🐦🐦', datetime.datetime(2022, 3, 31, 10, 19, 1, tzinfo=datetime.timezone.utc), 'RT @jrvsscarlet: Be careful out there everyone. I had 2 Morbius tickets in my car and someone broke in and left 4 more. https://t.co/AUtwDe…']
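
Note that the tweet printed above is a retweet: the ‘print’ call sits outside the loop, so it shows the last tweet the cursor returned, whether or not it passed the filter. The filter itself can be tried without API access. The sketch below pulls the condition into a hypothetical helper, `is_original_english`, and runs it on made-up sample texts. It omits the ‘tweet.retweeted’ check, which in the Twitter API reflects whether the authenticating user retweeted the tweet, so the ‘RT @’ test is the part that removes other users' retweets.

```python
def is_original_english(text, lang):
    """Return True for English tweets whose text does not look like a retweet."""
    return lang == "en" and "RT @" not in text


# Made-up (text, language) pairs, for illustration only.
samples = [
    ("RT @someone: Morbius tickets", "en"),
    ("Morbius was fine, actually", "en"),
    ("Morbius estuvo bien", "es"),
]

kept = [text for text, lang in samples if is_original_english(text, lang)]
print(kept)  # ['Morbius was fine, actually']
```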


Next, we store the query results in a dataframe. To do this, let’s define a function that takes a keyword as an argument, searches up to 1000 tweets related to it, and returns the ones that pass the filter as a dataframe. The function also saves the dataframe as a CSV file.

In [6]:
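def get_related_tweets(keyword):
    """Search up to 1000 tweets for `keyword`; return the English non-retweets as a dataframe."""
    twitter_users = []
    tweet_time = []
    tweet_string = []

    # count is the page size (at most 100 tweets per request);
    # items(1000) caps the total number of tweets we iterate over.
    for tweet in tweepy.Cursor(api.search_tweets, q=keyword, count=100).items(1000):
        # Skip retweets; the 'RT @' check catches retweets made by any user.
        if (not tweet.retweeted) and ('RT @' not in tweet.text):
            if tweet.lang == "en":
                twitter_users.append(tweet.user.name)
                tweet_time.append(tweet.created_at)
                tweet_string.append(tweet.text)

    df = pd.DataFrame({'Name':twitter_users, 'Time':tweet_time, 'Tweet':tweet_string})
    # Save a copy next to the notebook; the ../data/ directory must already exist.
    df.to_csv(f'../data/{keyword}.csv')
    return df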
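
The dataframe-building step can be exercised without credentials. The sketch below uses `SimpleNamespace` objects as stand-ins for tweets; the helper `tweets_to_frame` and the sample values are hypothetical, and only the attributes used in this notebook (`user.name`, `created_at`, `text`) are mimicked.

```python
from datetime import datetime, timezone
from types import SimpleNamespace

import pandas as pd


def tweets_to_frame(tweets):
    """Build a Name/Time/Tweet dataframe from tweet-like objects."""
    rows = [
        {"Name": t.user.name, "Time": t.created_at, "Tweet": t.text}
        for t in tweets
    ]
    # Passing columns explicitly keeps the layout stable even for an empty list.
    return pd.DataFrame(rows, columns=["Name", "Time", "Tweet"])


# Made-up stand-ins for the objects the search would return.
fake_tweets = [
    SimpleNamespace(
        user=SimpleNamespace(name="user one"),
        created_at=datetime(2022, 4, 3, 9, 26, tzinfo=timezone.utc),
        text="first sample tweet",
    ),
    SimpleNamespace(
        user=SimpleNamespace(name="user two"),
        created_at=datetime(2022, 4, 3, 9, 27, tzinfo=timezone.utc),
        text="second sample tweet",
    ),
]

frame = tweets_to_frame(fake_tweets)
print(frame.shape)  # (2, 3)
```

Separating collection from dataframe construction in this way is a design choice, not part of the original function; it mainly makes the second half testable offline.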

Calling the function with "Morbius", assigning its return value to a dataframe, and printing the first five rows gives:

In [7]:
df_morbius = get_related_tweets("Morbius")
df_morbius.head()

Unnamed: 0,Name,Time,Tweet
0,The Ohioan,2022-04-03 09:26:45+00:00,. @beaconjournal movie reviewer @ByGeorgeThoma...
1,no one you’re following.,2022-04-03 09:26:41+00:00,Es poo tryna get my to die via morbius screening
2,Randy Gummerman,2022-04-03 09:26:41+00:00,Morbius Post Credit Scene and Spider-Man No Wa...
3,Leslie Wai,2022-04-03 09:26:28+00:00,"@KermodeMovie @ScalaRadio Hi Mark, will you be..."
4,stinky little man,2022-04-03 09:26:10+00:00,MORBIUS SUCKED\nWTF
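
The `Unnamed: 0` header in the output above is what pandas typically produces when a CSV that was written with its index is read back without declaring an index column. This is an assumption about how the displayed table was produced. The sketch below reproduces the effect with a made-up two-row dataframe and an in-memory buffer in place of a file.

```python
import io

import pandas as pd

df = pd.DataFrame({"Name": ["a", "b"], "Tweet": ["x", "y"]})

buffer = io.StringIO()
df.to_csv(buffer)  # the index is written as a first column with an empty header

buffer.seek(0)
reloaded = pd.read_csv(buffer)
print(list(reloaded.columns))  # ['Unnamed: 0', 'Name', 'Tweet']

buffer.seek(0)
clean = pd.read_csv(buffer, index_col=0)  # treat the first column as the index
print(list(clean.columns))  # ['Name', 'Tweet']
```

Reading with `index_col=0`, or writing with `index=False`, are two common ways to avoid the extra column.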


We do the same for the TV show "Moon Knight":

In [7]:
df_moon_knight = get_related_tweets("Moon Knight")
df_moon_knight.head()

Unnamed: 0,Name,Time,Tweet
0,Jason: Human Retweet Machine,2022-04-01 11:51:00+00:00,@dinoMADN3SS Never forget only Moon Knight wou...
1,lily | moon knight era,2022-04-01 11:50:53+00:00,people are talking shit about oscar isaac’s br...
2,Avia,2022-04-01 11:50:38+00:00,Moon Knight is pretty boring so far. The only ...
3,Jacob Kirk || REBORN: FROM DARKNESS,2022-04-01 11:50:13+00:00,Moon Knight ep.1 is probably the best thing in...
4,Jetpack_steve,2022-04-01 11:49:58+00:00,@StephenByrne86 they are so cute and i really ...
