<h1>Intersection of Twitter Metrics</h1>
<p>This script assumes you already have three CSV files:</p>
    <ul>
        <li>Twitter_Followers.csv</li>
        <li>Tweet_Likes.csv</li>
        <li>Tweet_Retweeters.csv</li>
      </ul>
<p>To randomize the dataframe, follow this <a href="https://www.tutorialexample.com/understand-pandas-dataframe-sample-randomize-dataframe-by-row-python-pandas-tutorial/">tutorial using Pandas</a>.</p>
<p>This script finds the intersection of these CSV files, so the end result is the set of Twitter followers who liked AND retweeted your post. It also has cells that check other combinations if you are curious.</p>
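<p>As a rough illustration of what "intersection" means here, the sketch below chains two inner merges on <code>screen_name</code>. The three small dataframes and their names are made-up stand-ins for the CSV files, not real data, and the notebook itself builds the result with left merges and flag columns instead.</p>

```python
import pandas as pd

# Hypothetical toy data standing in for the three CSV files.
followers = pd.DataFrame({"screen_name": ["ana", "ben", "cy", "dee"]})
likes = pd.DataFrame({"screen_name": ["ben", "cy", "eve"]})
retweets = pd.DataFrame({"screen_name": ["cy", "dee", "eve"]})

# Two inner merges keep only the names present in all three tables.
in_all_three = followers.merge(likes, on="screen_name").merge(retweets, on="screen_name")

print(in_all_three["screen_name"].tolist())  # ['cy']
```

<p>Only "cy" follows, liked, and retweeted in this toy data, so it is the only name that survives both merges.</p>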

<h2>Known Bugs</h2>
<p>The dataframes are merged on the display name (the <code>screen_name</code> column) only, because the follower_ids are inconsistent across the dataframes.</p>
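<p>Because the merge key is a name rather than an ID, stray whitespace, letter case, or repeated rows can cause missed or duplicated matches. One possible mitigation, sketched below, is to tidy the key column before merging. The helper <code>normalize_names</code> is hypothetical and not part of this notebook, and lower-casing is only safe if the column holds handles that are case-insensitive. If it holds free-text display names, two different accounts can share one, and no amount of normalization fixes that.</p>

```python
import pandas as pd

def normalize_names(frame, column="screen_name"):
    """Return a copy with trimmed, lower-cased names and duplicate names dropped."""
    out = frame.copy()
    out[column] = out[column].astype(str).str.strip().str.lower()
    return out.drop_duplicates(subset=column).reset_index(drop=True)

# Hypothetical messy input: the first two rows refer to the same name.
raw = pd.DataFrame({"screen_name": [" Ana", "ana", "Ben "]})
clean = normalize_names(raw)

print(clean["screen_name"].tolist())  # ['ana', 'ben']
```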

In [None]:
import pandas as pd
import numpy as np
import os

In [None]:
# Edit the path to point at your own directory. Keep the r prefix (raw string),
# use single backslashes, and do not end the path with a backslash.
csv_path = r'C:\Users'

In [None]:
# index_col = 0 ensures row number is not read as a column "Unnamed: 0"
Likes = pd.read_csv(os.path.join(csv_path,'Tweet_Likes.csv'), index_col = 0)
Retweets = pd.read_csv(os.path.join(csv_path,'Tweet_Retweeters.csv'), index_col = 0)
Followers = pd.read_csv(os.path.join(csv_path,'Twitter_Followers.csv'), index_col = 0)

In [None]:
Likes['liked_post'] = 1
Retweets['retweeted_post'] = 1
Followers['following'] = 1

In [None]:
print(Likes)
Likes.dtypes

In [None]:
print(Retweets)

In [None]:
print(Followers)

In [None]:
Followers

In [None]:
# Load the qtconsole for debugging
%qtconsole

In [None]:
# Show up to 500 rows when displaying a dataframe
pd.set_option('display.max_rows', 500)

In [None]:
not_following = pd.merge(Followers, Likes, how='right', on=['screen_name'])
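# Assign the result back; calling fillna(inplace=True) on a selected column may not update the dataframe
not_following['following'] = not_following['following'].fillna(0)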
not_following[not_following['following'] == 0]

In [None]:
not_following_RT = pd.merge(Followers, Retweets, how='right', on=['screen_name'])
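# Assign the result back; calling fillna(inplace=True) on a selected column may not update the dataframe
not_following_RT['following'] = not_following_RT['following'].fillna(0)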
not_following_RT[not_following_RT['following'] == 0]
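<p>The right merges above mark non-followers by filling the missing <code>following</code> flag with 0. A possible alternative, shown here on made-up data, is the <code>indicator=True</code> option of <code>merge</code>, which records where each row came from so that no flag column or <code>fillna</code> is needed. The dataframe names below are illustrative only.</p>

```python
import pandas as pd

# Hypothetical toy data: "eve" liked the post but is not a follower.
followers = pd.DataFrame({"screen_name": ["ana", "ben"]})
likes = pd.DataFrame({"screen_name": ["ben", "eve"]})

# indicator=True adds a "_merge" column: "both", "left_only", or "right_only".
merged = followers.merge(likes, how="right", on="screen_name", indicator=True)

# Rows found only in the right-hand table (likes) are likers who do not follow.
likers_not_following = merged.loc[merged["_merge"] == "right_only", "screen_name"]

print(likers_not_following.tolist())  # ['eve']
```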

In [None]:
# Keep the full list of followers and add a column indicating whether each one liked the post
intersect_likes = pd.merge(Followers, Likes, how='left', on=['screen_name'])
intersect_likes.fillna(0,inplace=True)
intersect_likes.head()

In [None]:
# who liked the post
intersect_likes[intersect_likes['liked_post']==1]

In [None]:
df = pd.merge(intersect_likes,Retweets, how="left", on=['screen_name'])
df.fillna(0,inplace=True)
df

In [None]:
meet_requirements = df[(df['following']==1) & (df['liked_post']==1) & (df['retweeted_post']==1)]
print('Those who are following, liked, and retweeted the post include:\n',
      meet_requirements)
print('This includes ',meet_requirements.shape[0])

In [None]:
like_but_not_RT = df[(df['following']==1) & (df['liked_post']==1) & (df['retweeted_post']==0)]
print('Those who are following, liked, but did not retweet the post include:\n',
      like_but_not_RT)
print('This includes ',like_but_not_RT.shape[0])

In [None]:
RT_but_not_like = df[(df['following']==1) & (df['liked_post']==0) & (df['retweeted_post']==1)]
print('Those who are following and retweeted, but did not like the post include:\n',
      RT_but_not_like)
print('This includes ',RT_but_not_like.shape[0])
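<p>The three filters above each count one combination of flags. As a sketch, all combinations can also be counted at once with <code>pd.crosstab</code>. The <code>flags</code> table below is hypothetical toy data shaped like <code>df</code>, with one row per follower.</p>

```python
import pandas as pd

# Hypothetical flags table shaped like df above: one row per follower.
flags = pd.DataFrame({
    "screen_name": ["ana", "ben", "cy", "dee"],
    "liked_post": [1, 1, 0, 0],
    "retweeted_post": [1, 0, 1, 0],
})

# Rows are liked_post values, columns are retweeted_post values.
counts = pd.crosstab(flags["liked_post"], flags["retweeted_post"])

print(counts)
print("Liked and retweeted:", counts.loc[1, 1])
```

<p>The cell at row 1, column 1 corresponds to followers who both liked and retweeted, which is the group used for the raffle.</p>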

In [None]:
print('Those who liked but are not following include: \n',
     not_following[not_following['following']==0])

In [None]:
print('Those who retweeted but are not following include: \n',
     not_following_RT[not_following_RT['following']==0])

In [None]:
# Once we have found the intersection we need to randomize the dataframe to do the art raffle.
random_seed = 8 # pick a number
d = meet_requirements.sample(n = len(meet_requirements), random_state = random_seed)
print(type(d))
print(d)
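<p>The shuffle above relies on <code>random_state</code> to make the draw repeatable. The sketch below, on a made-up list of entrants, shows that two shuffles with the same seed give the same order. It uses <code>frac=1</code>, which is an equivalent way of asking for every row back.</p>

```python
import pandas as pd

# Hypothetical entrants; in the notebook this would be meet_requirements.
entrants = pd.DataFrame({"screen_name": ["ana", "ben", "cy", "dee", "eve"]})

# frac=1 returns every row in random order; random_state makes it repeatable.
first_draw = entrants.sample(frac=1, random_state=8)
second_draw = entrants.sample(frac=1, random_state=8)

same_order = first_draw["screen_name"].tolist() == second_draw["screen_name"].tolist()
print(same_order)  # True
```

<p>Writing down the seed alongside the results lets anyone rerun the cell and confirm the winner.</p>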

<div class="alert alert-block alert-success">
<b>The Winner of the Art Raffle is:</b> ...
</div>

In [None]:
print(d.iloc[0])

In [None]:
df.to_csv('Art_raffle_results.csv')