## Wilde and Styles: A Twitter Text Report on the "Don't Worry Darling" Relationship
#### Jenna Bal

"Don't Worry Darling" is a movie starring Harry Styles, Florence Pugh, and Chris Pine and directed by Olivia Wilde. When the film was released in late September, drama surrounding the cast took social media by storm when it became evident that Harry Styles and Olivia Wilde were [dating](https://www.usmagazine.com/celebrity-news/pictures/olivia-wilde-and-harry-styles-relationship-timeline/). The two had met on the set of the movie three months earlier, one month before Wilde split from her then-fiancé.

Now that the drama surrounding the film has settled, I want to know what people are saying about the pair. Does the general public still have a strong opinion? Or did the fascination die down alongside the movie's hype?


**I started by importing the libraries I needed.**

In [1]:
import pandas as pd
import json
import requests
import urllib.parse

In [2]:
bearer_token = pd.read_csv("Twitter_Token_9-22", header = 0, sep ='\t')

**I displayed the bearer token to make sure it had loaded correctly, then cleared the output since bearer tokens should stay private.**

In [None]:
bearer_token['Bearer_token'].iloc[0]

In [3]:
headers = {'Authorization':'Bearer {}'.format(bearer_token['Bearer_token'].iloc[0])}

**I used the recent search endpoint to collect the data, as the project requirements state.**

In [4]:
endpoint = 'https://api.twitter.com/2/tweets/search/recent'

My query includes the names of both Olivia Wilde and Harry Styles because I want to know about the pair together, not apart. I also gave the query several words relating to relationships, love, and dating. Because it was difficult to collect the 300 tweets this project requires, I could already tell this was not as hot a topic as it was when the movie first came out. However, after adding a few more words such as "married" and "together," I reached the required tweet count, which suggests that people are still talking about the couple.

In [5]:
query_param = urllib.parse.quote('("Olivia Wilde" OR "Wilde") ("Harry Styles" OR "Styles") ("romance" OR "romantic" OR "relationship" OR "dating" OR "date" OR "together" OR "love" OR "in love" OR "married") -is:retweet')
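To see what `urllib.parse.quote` does to a query, here is a minimal sketch on a shortened query. The `short_query` string is a made-up example for illustration, not the query used in this report.

```python
import urllib.parse

# A shortened, illustrative query (not the full one used in this report)
short_query = '("Olivia Wilde") -is:retweet'

# quote() percent-encodes characters that are not safe inside a URL
encoded = urllib.parse.quote(short_query)
print(encoded)

# unquote() reverses the encoding, which is handy for checking the result
decoded = urllib.parse.unquote(encoded)
print(decoded == short_query)
```

Parentheses, quotation marks, spaces, and the colon are all replaced with `%` codes, which is why the final URL looks so different from the query as typed.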

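**I added both tweet fields and user fields to receive the required information for this project. However, I could not get the author name to appear in the results. In the Twitter API v2, user details are only returned when the request also includes `expansions=author_id`, which this query does not. Since I wanted to look at what people were tweeting about the couple, I included the text field so I would be able to see the content of the tweets in my dataframe.**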

In [6]:
tweet_fields = 'author_id,text,public_metrics,created_at'

In [7]:
user_fields = 'username,id'

In [8]:
query_url = endpoint + '?query={}&tweet.fields={}&user.fields={}'.format(query_param, tweet_fields, user_fields)

In [9]:
query_url

'https://api.twitter.com/2/tweets/search/recent?query=%28%22Olivia%20Wilde%22%20OR%20%22Wilde%22%29%20%28%22Harry%20Styles%22%20OR%20%22Styles%22%29%20%28%22romance%22%20OR%20%22romantic%22%20OR%20%22relationship%22%20OR%20%22dating%22%20OR%20%22date%22%20OR%20%22together%22%20OR%20%22love%22%20OR%20%22in%20love%22%20OR%20%22married%22%29%20-is%3Aretweet&tweet.fields=author_id,text,public_metrics,created_at&user.fields=username,id'
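On the missing author name: in the Twitter API v2, user details come back in a separate `includes` section, and only when the request asks for `expansions=author_id`. The sketch below shows one way the URL could be built with that parameter and how usernames could then be matched to tweets. The shortened query, the `sample_response` dictionary, and its usernames are all invented stand-ins for illustration, not data from this project.

```python
import urllib.parse
import pandas as pd

endpoint = 'https://api.twitter.com/2/tweets/search/recent'

# urlencode() builds and percent-encodes the whole query string at once
params = {
    'query': '"Olivia Wilde" "Harry Styles" -is:retweet',  # shortened example
    'tweet.fields': 'author_id,text,public_metrics,created_at',
    'expansions': 'author_id',  # asks the API to include the authors
    'user.fields': 'username,id',
}
url_with_users = endpoint + '?' + urllib.parse.urlencode(params)

# Hand-made stand-in for a response; the usernames are invented
sample_response = {
    'data': [
        {'id': '1', 'author_id': '100', 'text': 'first tweet'},
        {'id': '2', 'author_id': '200', 'text': 'second tweet'},
    ],
    'includes': {
        'users': [
            {'id': '100', 'username': 'example_user_a'},
            {'id': '200', 'username': 'example_user_b'},
        ]
    },
}

tweets = pd.DataFrame(sample_response['data'])
users = pd.DataFrame(sample_response['includes']['users'])

# Match each tweet's author_id to the user's id to attach the username
users = users.rename(columns={'id': 'author_id'})
tweets_with_names = tweets.merge(users, on='author_id', how='left')
print(tweets_with_names[['text', 'username']])
```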

In [10]:
response = requests.get(query_url, headers = headers)

**I checked the status code to make sure the request worked.**

In [64]:
response.status_code

200

In [None]:
response.text

**I used code from our 10-18 class to explore the dictionaries and keys of the content that would eventually go into dataframes.**

In [66]:
response_dict = json.loads(response.text)

In [67]:
response_dict.keys()

dict_keys(['data', 'meta'])

In [68]:
type(response_dict['data'][0])

dict

In [16]:
response_dict['data'][0].keys()

dict_keys(['edit_history_tweet_ids', 'text', 'author_id', 'created_at', 'id', 'public_metrics'])

In [17]:
response_dict['data'][0]['text']

'Olivia Wilde and Harry Styles pick up lunch together in\xa0LA https://t.co/GQynsRnMMD'

In [18]:
type(response_dict['data'][0]['public_metrics'])

dict

**I called the keys of the `public_metrics` dictionary to see what "public metrics" encompasses.**

In [20]:
response_dict['data'][0]['public_metrics'].keys()

dict_keys(['retweet_count', 'reply_count', 'like_count', 'quote_count'])

In [21]:
response_dict['data'][0]['public_metrics']['like_count']

0

In [22]:
response_dict['data'][0]['id']

'1585042580219338755'

**I created a dataframe to organize the information into a table.**

In [27]:
response_df = pd.DataFrame(response_dict['data'])

In [28]:
response_df.head()

| | edit_history_tweet_ids | text | author_id | created_at | id | public_metrics |
|---|---|---|---|---|---|---|
| 0 | [1585042580219338755] | Olivia Wilde and Harry Styles pick up lunch to... | 1331116530000564224 | 2022-10-25T22:56:14.000Z | 1585042580219338755 | {'retweet_count': 0, 'reply_count': 0, 'like_c... |
| 1 | [1585034959206551553] | @harryscowgirI https://t.co/cY6hNdynB5 | 1286868444495962113 | 2022-10-25T22:25:57.000Z | 1585034959206551553 | {'retweet_count': 0, 'reply_count': 1, 'like_c... |
| 2 | [1585025321778442241] | https://t.co/pN1KYna1C4 | 1229412320880865282 | 2022-10-25T21:47:39.000Z | 1585025321778442241 | {'retweet_count': 1, 'reply_count': 0, 'like_c... |
| 3 | [1585021471411216384] | Olivia Wilde &amp; Harry Styles Pick Up Lunch ... | 1526200453696040964 | 2022-10-25T21:32:21.000Z | 1585021471411216384 | {'retweet_count': 0, 'reply_count': 0, 'like_c... |
| 4 | [1585014750378795008] | Olivia Wilde grabs food with Harry Styles whil... | 19538986 | 2022-10-25T21:05:39.000Z | 1585014750378795008 | {'retweet_count': 6, 'reply_count': 2, 'like_c... |


**Then I created another dataframe specifically for the "public metrics" of each tweet.**

In [38]:
public_metrics_df = pd.DataFrame(list(response_df['public_metrics']))

Looking at the public metrics can also tell me what people think about the couple, because tweets that are more popular and more widely agreed with will have more likes. So even if people are no longer posting about the relationship, they could still be liking and retweeting.

In [39]:
public_metrics_df

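| | retweet_count | reply_count | like_count | quote_count |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 1 | 0 |
| 2 | 1 | 0 | 3 | 0 |
| 3 | 0 | 0 | 1 | 0 |
| 4 | 6 | 2 | 30 | 7 |
| 5 | 0 | 0 | 0 | 0 |
| 6 | 1 | 1 | 5 | 0 |
| 7 | 0 | 0 | 0 | 0 |
| 8 | 0 | 0 | 2 | 0 |
| 9 | 0 | 0 | 1 | 0 |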


**I combined both dataframes into one, essentially tacking the "public metrics" onto the end of the previous dataframe I created.**

In [42]:
response_df['retweets'] = public_metrics_df['retweet_count']
response_df['replies'] = public_metrics_df['reply_count']
response_df['likes'] = public_metrics_df['like_count']
response_df['quotes'] = public_metrics_df['quote_count']

In [43]:
response_df.head()

| | edit_history_tweet_ids | text | author_id | created_at | id | public_metrics | retweets | replies | likes | quotes |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | [1585042580219338755] | Olivia Wilde and Harry Styles pick up lunch to... | 1331116530000564224 | 2022-10-25T22:56:14.000Z | 1585042580219338755 | {'retweet_count': 0, 'reply_count': 0, 'like_c... | 0 | 0 | 0 | 0 |
| 1 | [1585034959206551553] | @harryscowgirI https://t.co/cY6hNdynB5 | 1286868444495962113 | 2022-10-25T22:25:57.000Z | 1585034959206551553 | {'retweet_count': 0, 'reply_count': 1, 'like_c... | 0 | 1 | 1 | 0 |
| 2 | [1585025321778442241] | https://t.co/pN1KYna1C4 | 1229412320880865282 | 2022-10-25T21:47:39.000Z | 1585025321778442241 | {'retweet_count': 1, 'reply_count': 0, 'like_c... | 1 | 0 | 3 | 0 |
| 3 | [1585021471411216384] | Olivia Wilde &amp; Harry Styles Pick Up Lunch ... | 1526200453696040964 | 2022-10-25T21:32:21.000Z | 1585021471411216384 | {'retweet_count': 0, 'reply_count': 0, 'like_c... | 0 | 0 | 1 | 0 |
| 4 | [1585014750378795008] | Olivia Wilde grabs food with Harry Styles whil... | 19538986 | 2022-10-25T21:05:39.000Z | 1585014750378795008 | {'retweet_count': 6, 'reply_count': 2, 'like_c... | 6 | 2 | 30 | 7 |
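As an alternative to copying the four metric columns over one by one, `pd.json_normalize` can expand the dictionaries in a single step. This is only a sketch of that option, not what I ran for this project. The two rows in `sample_df` reuse the ids and counts shown in the output above, and the names `sample_df`, `metrics`, and `flat_df` are made up for the example.

```python
import pandas as pd

# Two rows shaped like the tweets above (values copied from the output shown)
sample_df = pd.DataFrame({
    'id': ['1585042580219338755', '1585014750378795008'],
    'public_metrics': [
        {'retweet_count': 0, 'reply_count': 0, 'like_count': 0, 'quote_count': 0},
        {'retweet_count': 6, 'reply_count': 2, 'like_count': 30, 'quote_count': 7},
    ],
})

# json_normalize() gives each dictionary key its own column
metrics = pd.json_normalize(sample_df['public_metrics'].tolist())

# Place the new columns beside the original ones, matching rows by position
flat_df = pd.concat([sample_df.drop(columns='public_metrics'), metrics], axis=1)
print(flat_df)
```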


In [None]:
response_dict['meta']['next_token']

**Results come back one page at a time because Twitter only returns a limited number of tweets per request, so the next token is needed to request the following page.**

In [48]:
next_query_url = query_url + "&next_token={}".format(response_dict['meta']['next_token'])

In [49]:
next_response = requests.get(next_query_url, headers = headers)

In [50]:
next_response.status_code

200

In [None]:
next_response.text

In [52]:
next_response_dict = json.loads(next_response.text)

In [None]:
next_response_dict['meta']

**I wrote a function with a for loop to collect the number of tweets needed for the project (300). I used the code we went through in class to create this.**

In [54]:
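def twt_recent_search(query, num_pages, headers):
    """Request up to num_pages pages of results and return them as a list."""
    response_list = []
    next_token = ''
    for i in range(num_pages):
        # After the first page, pass the token that points to the next page
        if i > 0:
            this_query = query + "&next_token={}".format(next_token)
        else:
            this_query = query

        this_response = requests.get(this_query, headers=headers)
        print(this_response.status_code)
        this_response_dict = json.loads(this_response.text)
        response_list.append(this_response_dict)
        # The last page has no next_token, so stop instead of raising a KeyError
        next_token = this_response_dict.get('meta', {}).get('next_token')
        if next_token is None:
            break

    return response_list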

In [None]:
my_responses = twt_recent_search(query_url, 30, headers)
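The paging logic can be tried out without calling Twitter by swapping the request for a stand-in. In the sketch below, `collect_pages`, `fake_fetch`, and `FAKE_PAGES` are hypothetical names, and the three fake pages are invented data that only imitate the `data` and `meta` layout of the real responses.

```python
def collect_pages(fetch_page, max_pages):
    """Follow next_token from page to page until pages run out or max_pages is hit."""
    pages = []
    next_token = None
    for _ in range(max_pages):
        page = fetch_page(next_token)
        pages.append(page)
        # The final page carries no next_token, which signals the end
        next_token = page.get('meta', {}).get('next_token')
        if next_token is None:
            break
    return pages


# Invented stand-in for the API: three pages linked by tokens
FAKE_PAGES = {
    None: {'data': [{'id': '1'}, {'id': '2'}], 'meta': {'next_token': 'a'}},
    'a': {'data': [{'id': '3'}, {'id': '4'}], 'meta': {'next_token': 'b'}},
    'b': {'data': [{'id': '5'}], 'meta': {}},
}


def fake_fetch(next_token):
    """Return the fake page that the given token points to."""
    return FAKE_PAGES[next_token]


pages = collect_pages(fake_fetch, max_pages=30)
tweet_ids = [tweet['id'] for page in pages for tweet in page['data']]
print(len(pages), tweet_ids)
```

Even though 30 pages are requested, the loop stops after the third page because that page has no `next_token`.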

In [56]:
results_1 = pd.DataFrame.from_records(my_responses)

In [57]:
data_list = list(results_1['data'])

In [58]:
data_list_of_dfs = [pd.DataFrame(x) for x in data_list]

In [59]:
data_df = pd.concat(data_list_of_dfs)

In [69]:
data_df.head()

| | author_id | public_metrics | edit_history_tweet_ids | text | id | created_at |
|---|---|---|---|---|---|---|
| 0 | 1331116530000564224 | {'retweet_count': 0, 'reply_count': 0, 'like_c... | [1585042580219338755] | Olivia Wilde and Harry Styles pick up lunch to... | 1585042580219338755 | 2022-10-25T22:56:14.000Z |
| 1 | 1286868444495962113 | {'retweet_count': 0, 'reply_count': 1, 'like_c... | [1585034959206551553] | @harryscowgirI https://t.co/cY6hNdynB5 | 1585034959206551553 | 2022-10-25T22:25:57.000Z |
| 2 | 1229412320880865282 | {'retweet_count': 1, 'reply_count': 0, 'like_c... | [1585025321778442241] | https://t.co/pN1KYna1C4 | 1585025321778442241 | 2022-10-25T21:47:39.000Z |
| 3 | 1526200453696040964 | {'retweet_count': 0, 'reply_count': 0, 'like_c... | [1585021471411216384] | Olivia Wilde &amp; Harry Styles Pick Up Lunch ... | 1585021471411216384 | 2022-10-25T21:32:21.000Z |
| 4 | 19538986 | {'retweet_count': 6, 'reply_count': 2, 'like_c... | [1585014750378795008] | Olivia Wilde grabs food with Harry Styles whil... | 1585014750378795008 | 2022-10-25T21:05:39.000Z |
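One detail about `pd.concat`: each page's dataframe keeps its own index starting at 0, so row labels repeat across pages unless `ignore_index=True` is passed. The sketch below shows that option on invented pages. It also skips a page that has no `data` key, which I am assuming can happen when a page comes back with no results.

```python
import pandas as pd

# Invented pages shaped like the API responses; the last one has no tweets
sample_pages = [
    {'data': [{'id': '1', 'text': 'a'}, {'id': '2', 'text': 'b'}], 'meta': {}},
    {'data': [{'id': '3', 'text': 'c'}], 'meta': {}},
    {'meta': {'result_count': 0}},
]

# Skip pages without a 'data' key, then build one dataframe per page
page_dfs = [pd.DataFrame(page['data']) for page in sample_pages if 'data' in page]

# ignore_index=True renumbers the rows from 0 instead of restarting on every page
all_tweets = pd.concat(page_dfs, ignore_index=True)
print(all_tweets)
print(len(all_tweets.index))
```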


I had to broaden my query several times to produce enough tweets for this assignment. This partly answers my question, "Did the fascination surrounding the couple's relationship die down alongside the movie's hype?" Given how many synonyms for "relationship" I had to add to my query, the number of tweets posted each day about the pair appears to have declined. To answer my other question, "Does the general public still have a strong opinion on the couple?", I would have to look at the text of each tweet to see whether it is positive, negative, or neutral.

In [1246]:
len(data_df.index)

299
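One way to put a number on how often the pair is tweeted about would be to count tweets per day using `created_at`. This is a sketch of that idea rather than part of the analysis above. The first three timestamps come from the output shown earlier, the fourth is invented so that the example has a second day, and the names `sample` and `tweets_per_day` are made up.

```python
import pandas as pd

sample = pd.DataFrame({
    'created_at': [
        '2022-10-25T22:56:14.000Z',
        '2022-10-25T22:25:57.000Z',
        '2022-10-25T21:47:39.000Z',
        '2022-10-24T09:00:00.000Z',  # invented, to show a second day
    ]
})

# Parse the ISO 8601 strings into timestamps
sample['created_at'] = pd.to_datetime(sample['created_at'])

# Group by calendar date and count the tweets posted on each day
tweets_per_day = sample.groupby(sample['created_at'].dt.date).size()
print(tweets_per_day)
```

Because the recent search endpoint only reaches back about seven days, a count like this would describe the current rate of tweets, not the change since the movie's release.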