Next, we'll examine some of the data Facebook holds about us.

Download your Facebook data by following these directions: https://www.facebook.com/help/1701730696756992/?helpref=hc_fnav

Quickly: 
1. "Settings" --> "Your Facebook Information" --> "Download Your Information" --> "View".
2. Download as JSON. 
3. "Create File".

Once you have the data, decompress it (it will likely be large, especially if you downloaded photos). 
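
If you prefer to decompress the archive from Python, a minimal sketch using the standard-library `zipfile` module could look like the following. The helper name `extract_archive` and the archive name in the example call are placeholders, not names taken from Facebook's export; substitute the file you actually downloaded.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Unpack zip_path into dest_dir and return the sorted top-level entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return sorted(p.name for p in dest.iterdir())


# Example call (placeholder file name; substitute your own download):
# extract_archive("facebook-YOURNAMEHERE.zip", "./facebook-YOURNAMEHERE/")
```

The returned list is a quick way to confirm that folders such as likes_and_reactions are present before moving on.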

In [None]:
import pandas as pd
import json
from collections import defaultdict
import numpy as np
import matplotlib.pyplot as plt

In [None]:
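# Path to the decompressed download; replace YOURNAMEHERE with your own folder name.
# Keep the trailing slash, since file paths below are built by string concatenation.
facebook_dir = "./facebook-YOURNAMEHERE/"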

We will examine your likes and reactions on Facebook posts.

In [None]:
with open(facebook_dir + "likes_and_reactions/pages.json", "r") as f:
    reactions_pages = json.load(f)
with open(facebook_dir + "likes_and_reactions/posts_and_comments.json", "r") as f:
    reactions_posts = json.load(f)

We'll extract the reactor (you, name1) and the reaction recipient (name2) from each reaction record.

Note: This is a quick first pass, based on skimming the data to see what would capture information fast. A lot more language processing could be done here.

We count up the number of each reaction you had to each different person's posts, ignoring time.

In [None]:
# reaction_matrix is a dict: (k, v) = (name, dict({reaction_type: count}))
reaction_matrix = defaultdict(lambda: defaultdict(int))

verb1_set = {"likes", "liked", "reacted to"}
obj_set = {"photo", "post", "comment"}  # listed for reference; not used in the parsing below
# Basic title structure:
#   name1 + " " + verb1 + " " + name2 + "'s" + " " + obj
#   + (if "comment": " on " + name3 + "'s wall" else: "") + "."
# Alternatively, the title may end in " a post." and name no recipient.

for r in reactions_posts['reactions']:
    # First, set the reaction actor as name1.
    reaction = r['data'][0]['reaction']
    name1 = reaction['actor']

    # Then, parse the title. Strip the actor's name from the front only.
    title = r['title']
    if title.startswith(name1 + " "):
        title = title[len(name1) + 1:]

    # Strip the leading verb, if it is one we recognize.
    for v in verb1_set:
        if title.startswith(v + " "):
            title = title[len(v) + 1:]
            break

    # Whatever precedes the first "'s" is the recipient (name2).
    # Titles with no possessive (e.g. "... a post.") fall back to a placeholder.
    name2, possessive, _ = title.partition("'s")
    if not possessive or not name2:
        name2 = "NO_NAME"

    # Then, add a count of that type of reaction to name2.
    reaction_matrix[name2][reaction['reaction']] += 1
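
The same title parsing could also be expressed with a regular expression. The sketch below is one possible way to do it, not the notebook's method: `parse_recipient` is a hypothetical helper, and the sample titles are invented to match the sentence structure described in the comments above rather than copied from a real export.

```python
import re

# Leading verbs seen in reaction titles, followed by at least one space.
VERB_RE = re.compile(r"^(?:likes|liked|reacted to)\s+")


def parse_recipient(title, actor):
    """Return the name before the first "'s" in title, or "NO_NAME" if there is none."""
    # Drop the actor's name only when it is at the very start of the title.
    rest = title[len(actor):].lstrip() if title.startswith(actor) else title
    # Drop one leading verb, if present.
    rest = VERB_RE.sub("", rest, count=1)
    name, possessive, _ = rest.partition("'s")
    return name if possessive and name else "NO_NAME"


# Invented sample titles (assumed format).
samples = [
    "Jane Doe likes John Smith's post.",
    "Jane Doe liked Ann Lee's comment on Bob Ray's wall.",
    "Jane Doe reacted to a post.",
]
parsed = [parse_recipient(t, "Jane Doe") for t in samples]
```

Keeping the parsing in a small function makes it easy to try new title patterns on a handful of strings before running over the full export.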

Then we reorganize the name2/reaction counts for ease of plotting.

In [None]:
cols = ['name', 'LOVE', 'LIKE', 'SORRY', 'HAHA', 'WOW', 'ANGER']
fb_data = []
for n, v in reaction_matrix.items():
    # One row per person: the name, then the count for each reaction type
    # (0 when that reaction type was never used on their posts).
    fb_data.append([n] + [v.get(r, 0) for r in cols[1:]])

Putting everything into a DataFrame makes it easier to manipulate. We also add a total column that sums each person's reactions across all types.

In [None]:
fb_reactions_df = pd.DataFrame(data=fb_data, columns=cols)
fb_reactions_df['total'] = fb_reactions_df[['LOVE', 'LIKE', 'SORRY', 'HAHA', 'WOW', 'ANGER']].sum(axis=1)
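
As an aside, pandas can build the same kind of table directly from a nested dict. The sketch below uses a made-up `toy_matrix` in place of reaction_matrix (the names and counts are invented) and shows one possible alternative to the row-building loop, not a replacement the rest of the notebook depends on.

```python
import pandas as pd

# Toy counts standing in for reaction_matrix (names and numbers are made up).
toy_matrix = {
    "Alice": {"LIKE": 3, "LOVE": 1},
    "Bob": {"HAHA": 2},
}
reaction_types = ['LOVE', 'LIKE', 'SORRY', 'HAHA', 'WOW', 'ANGER']

toy_df = (
    pd.DataFrame.from_dict(toy_matrix, orient="index")
    .reindex(columns=reaction_types)  # fixed column order; unseen types become NaN
    .fillna(0)                        # missing reactions count as 0
    .astype(int)
)
toy_df["total"] = toy_df[reaction_types].sum(axis=1)
# Move the names out of the index into a regular 'name' column.
toy_df = toy_df.rename_axis("name").reset_index()
```

The explicit reindex keeps the column order stable even when some reaction types never appear in the data.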

For visualization purposes, we'll keep only the 25 people with the highest total reaction counts.

In [None]:
totals_df = fb_reactions_df.sort_values(by="total", ascending=False).head(25)
fb_melted_df = pd.melt(totals_df, id_vars=['name'], var_name="reaction", value_name="count")

# Stacked bar chart inputs: one slice of the melted frame per reaction type.
# 'total' comes last so that the first six entries are the types that get stacked.
reactions = ['LIKE', 'LOVE', 'HAHA', 'SORRY', 'WOW', 'ANGER', 'total']
colors    = ['blue', 'red', 'yellow', 'purple', 'orange', 'brown', 'black']
react_counts = [fb_melted_df[fb_melted_df['reaction'] == r] for r in reactions]
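
To see what the melt step does, here is a small sketch on an invented two-person table (the names and counts are made up). It turns one column per reaction type into one row per (name, reaction) pair, which is why filtering on the reaction column yields one slice per type with the names in the same order.

```python
import pandas as pd

# Wide format: one row per person, one column per reaction type.
wide = pd.DataFrame({
    "name": ["Alice", "Bob"],
    "LIKE": [3, 0],
    "HAHA": [1, 2],
})

# Long format: one row per (name, reaction) pair.
long_df = pd.melt(wide, id_vars=["name"], var_name="reaction", value_name="count")

# One slice per reaction type, with rows in the same name order as `wide`.
haha_counts = long_df[long_df["reaction"] == "HAHA"]
```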

In [None]:
# https://matplotlib.org/3.1.1/gallery/lines_bars_and_markers/bar_stacked.html

N = 6 # num of reaction types
ind = np.arange(len(react_counts[0]))    # the x locations for the groups
width = 0.35       # the width of the bars: can also be len(x) sequence

This graph shows the 25 people whose posts you have reacted to most on Facebook, based on the reactions included in your download.

In [None]:
%matplotlib inline
plt.figure(figsize=(20,10))
# list of bar charts, one per reaction type, each drawn in its color from `colors`
p = [plt.bar(ind, react_counts[0]['count'], width, color=colors[0])]
for i in range(1,N):
    # sum the counts of the layers below i to get the bottom of layer i
    subtotal_array = np.array([list(react_counts[j]['count']) for j in range(i)])
    subtotal = np.sum(subtotal_array, axis=0)
    p.append(plt.bar(ind, list(react_counts[i]['count']), width, bottom=subtotal, color=colors[i]))

plt.ylabel('reactions')
plt.title('reactions by name and type')
plt.xticks(ind, list(react_counts[0]['name']), rotation=-90)
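plt.yticks(np.arange(0, 500, 20))  # raise the upper limit if your totals exceed 500
# One legend entry per stacked reaction type ('total' is not plotted).
plt.legend([bars[0] for bars in p], reactions[:N])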

plt.show()
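
The plotting loop recomputes the running subtotal for every layer. As a sketch of an alternative, the same bottoms can be obtained in one step with a cumulative sum. The `counts` array below is invented for illustration, with one row per reaction type and one column per person.

```python
import numpy as np

# Rows: reaction types (in stacking order); columns: people. Values are made up.
counts = np.array([
    [5, 2, 0],
    [1, 3, 4],
    [0, 1, 1],
])

# Top edge of each layer is the cumulative sum down the rows;
# the bottom edge is the top edge minus the layer's own height.
tops = np.cumsum(counts, axis=0)
bottoms = tops - counts
```

Row i of `bottoms` is what would be passed as the bottom argument for layer i, and the last row of `tops` gives each person's total bar height.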