# Facebook Data Gathering

** /// WORK IN PROGRESS /// **

I am trying to build an app called "Someone Else's News Feed". It lets people plug in a few demographic traits (age, gender, location, political leanings) and see a simulation of what someone with that demographic makeup would see on their Facebook news feed.

The implementation of this requires building a database of publicly accessible Facebook data and using the data to seed future queries. So if a user searches for a 23-year-old male living in Kansas, I can query my database to find people who match these criteria and then present the user with a collapsed average of all the things on those people's news feeds.
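
As a rough illustration of the query-then-aggregate idea, here is a minimal sketch. It assumes profiles are stored as plain dicts and that each user's feed has already been reduced to a list of links; the field names (`age`, `gender`, `state`), the helper names, and the sample data are all hypothetical, and "collapsed average" is interpreted here simply as "most frequently seen links".

```python
from collections import Counter

def matching_profiles(profiles, **criteria):
    """Return the profiles whose fields equal every requested criterion."""
    return [p for p in profiles
            if all(p.get(key) == value for key, value in criteria.items())]

def most_common_links(profiles, feeds, top_n=3):
    """Count links across the feeds of the given profiles, most frequent first."""
    counts = Counter()
    for profile in profiles:
        counts.update(feeds.get(profile["id"], []))
    return counts.most_common(top_n)

# Hypothetical sample data, not real Facebook users.
profiles = [
    {"id": "1", "age": 23, "gender": "male", "state": "Kansas"},
    {"id": "2", "age": 23, "gender": "male", "state": "Kansas"},
    {"id": "3", "age": 41, "gender": "female", "state": "Ohio"},
]
feeds = {
    "1": ["example.com/a", "example.com/b"],
    "2": ["example.com/a"],
    "3": ["example.com/c"],
}

matches = matching_profiles(profiles, age=23, gender="male", state="Kansas")
top_links = most_common_links(matches, feeds)
```

Exact matching on age is probably too strict for real data; an age range would likely work better once enough profiles are collected.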

The goal of this document is to try to collect publicly available data and store it in a database.

In [68]:
# Get an access token from https://developers.facebook.com/tools/access_token/
# by copying the "User Token" value from the Access Token Tool

# Graph API explorer access token: https://developers.facebook.com/tools/explorer/145634995501895/

ACCESS_TOKEN = 'EAACEdEose0cBAE5w8TiGRzPoPM1T5hQUMgde5HXcKSmzMSeapi9b3IXBVehQhYfgxVdQdTuOgLaKK8MtHUga38SZASDIZAN5DxqCQHqjVVzEh2WaaeLnzgwKCu8Mf0ma9u5MlcEoDSja5gjomfxoCSm5q2e5LpVkYV081MZAwZDZD'

Check to see that the access token is valid.

In [3]:
import requests # pip install requests
import json
import facebook # pip install facebook-sdk

FOX_NEWS_ID = "15704546335"
base_url = 'https://graph.facebook.com/v2.6'

fields = "id,about,can_post"

url = '%s/%s?fields=%s&access_token=%s' % \
    (base_url, FOX_NEWS_ID, fields, ACCESS_TOKEN)
    
print url

# Interpret the response as JSON and convert back
# to Python data structures
content = requests.get(url).json()

# Pretty-print the JSON and display it
print json.dumps(content, indent=1)

https://graph.facebook.com/v2.6/15704546335?fields=id,about,can_post&access_token=EAACEdEose0cBAKJbHzKZCCAIg0E6ZCwKCSXZCehXbZArekxS5MKyshqWJp5wscBigqZCgBqbidLR6BwI5umsUfHP4zUkmOHkMKZAOj3UaZC0YUhL3coKrpYZBlMTeZCZCDdSchYTq7slXbWkD9Lkm4O6CGcgGuwTAE1fAiaNMsf9rqUwZDZD
{
 "about": "Welcome to the official Fox News facebook page.  Get breaking news, must see videos and exclusive interviews from the #1 name in news.", 
 "can_post": false, 
 "id": "15704546335"
}
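
A valid token produces a response like the one above. An invalid or expired token instead produces a JSON body with an `error` key, which is what the later functions test for. The helper below is a small sketch for turning such a body into a readable message; the name `graph_error_message` is made up for this example, and it assumes the error object carries `type`, `code` and `message` keys. The sample responses are hypothetical.

```python
def graph_error_message(response):
    """Return a short error description, or None if the response has no error."""
    error = response.get("error")
    if error is None:
        return None
    return "%s (code %s): %s" % (
        error.get("type", "UnknownError"),
        error.get("code", "?"),
        error.get("message", ""),
    )

# Hypothetical responses shaped like decoded Graph API JSON.
ok_response = {"id": "15704546335", "can_post": False}
bad_response = {"error": {"type": "OAuthException", "code": 190,
                          "message": "Error validating access token"}}

ok_message = graph_error_message(ok_response)
bad_message = graph_error_message(bad_response)
```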


# Getting User Information

The algorithm I will be using is a kind of depth-first search that makes use of the Facebook network structure. First, I choose a starting user (in this case myself) and generate a list of all their publicly accessible friends. Then I choose one of those friends and repeat the process until I have a long list of public Facebook users.

This section creates helper functions which allow us to learn about users. In particular, we are interested in the quality and quantity of publicly available Facebook data.
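
Every helper in this notebook builds its request URL by hand from a base URL, a node ID, a field list and the token. A sketch of how that could be factored into one function follows; `graph_url` and `BASE_URL` are names invented for this example, and the token value is a placeholder.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

BASE_URL = "https://graph.facebook.com/v2.6"

def graph_url(node, access_token, fields=None, edge=None):
    """Build a Graph API URL for a node, optionally with an edge and a field list."""
    path = "%s/%s" % (BASE_URL, node)
    if edge:
        path += "/" + edge
    params = []
    if fields:
        params.append(("fields", ",".join(fields)))
    params.append(("access_token", access_token))
    # urlencode percent-encodes the values, including the commas between fields.
    return path + "?" + urlencode(params)

profile_url = graph_url("me", "TOKEN", fields=["id", "name", "friends"])
feed_url = graph_url("15704546335", "TOKEN", edge="feed")
```

Letting `urlencode` escape the parameters avoids malformed URLs if a value ever contains characters such as `&` or spaces.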

In [4]:
# Function: returns an array of {id: `id`, name: `name`} objects that are the publicly 
# accessible friends of the user specified by ID.

def get_N_pages_of_friends_for_ID(N, ID):
    # Call the FB API
    base_url = 'https://graph.facebook.com/v2.6'
    fields = "id,name,friends"
    url = '%s/%s?fields=%s&access_token=%s' % \
        (base_url, ID, fields, ACCESS_TOKEN)
    print url

    response = requests.get(url).json() # Now we have our first page of friends
    if "error" in response:
        print "Error: %s" % response["error"]
        return []  # empty list, so callers can still call len() and iterate
    
    # The "friends" key is absent when the friends list is not visible.
    friends = response.get("friends", {})
    my_friends = friends.get("data", [])

    # Need to deal with paging in the FB response.
    if "paging" in friends and "next" in friends["paging"]:
        next_url = response["friends"]["paging"]["next"]

        for page_counter in range(1,N):
            response_2 = requests.get(next_url).json()
            if "error" in response_2:
                print "Error: %s" % response_2["error"]
                return my_friends  # keep the friends gathered so far

            
            my_friends.extend(response_2["data"])
            if "paging" in response_2 and "next" in response_2["paging"]:
                next_url = response_2["paging"]["next"]
            else:
                break
        
    return my_friends
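
The paging logic above can be separated from the HTTP call, which makes it easier to test. The sketch below assumes each page is a decoded dict with a `data` list and an optional `paging.next` URL, as in the responses shown in this notebook. `collect_pages` and the fake pages are hypothetical names and data for illustration.

```python
def collect_pages(first_page, fetch_json, max_pages):
    """Gather the "data" items from up to max_pages pages of a paged response.

    first_page is an already-decoded page ({"data": [...], "paging": {...}});
    fetch_json is any callable that maps a URL to a decoded page.
    """
    items = list(first_page.get("data", []))
    page = first_page
    for _ in range(max_pages - 1):
        next_url = page.get("paging", {}).get("next")
        if not next_url:
            break  # no further pages
        page = fetch_json(next_url)
        if "error" in page:
            break  # stop paging but keep what was collected so far
        items.extend(page.get("data", []))
    return items

# Hypothetical in-memory pages standing in for HTTP responses.
fake_pages = {
    "page2": {"data": [{"id": "3"}], "paging": {"next": "page3"}},
    "page3": {"data": [{"id": "4"}]},
}
first = {"data": [{"id": "1"}, {"id": "2"}], "paging": {"next": "page2"}}

all_ids = [item["id"] for item in collect_pages(first, fake_pages.get, 5)]
two_pages = [item["id"] for item in collect_pages(first, fake_pages.get, 2)]
```

With real requests, `fetch_json` could be `lambda url: requests.get(url).json()`.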

In [5]:
# Tests for above function
friend_list = get_N_pages_of_friends_for_ID(5, "me")
friend_list

https://graph.facebook.com/v2.6/me?fields=id,name,friends&access_token=EAACEdEose0cBAKJbHzKZCCAIg0E6ZCwKCSXZCehXbZArekxS5MKyshqWJp5wscBigqZCgBqbidLR6BwI5umsUfHP4zUkmOHkMKZAOj3UaZC0YUhL3coKrpYZBlMTeZCZCDdSchYTq7slXbWkD9Lkm4O6CGcgGuwTAE1fAiaNMsf9rqUwZDZD


[{u'id': u'37109066', u'name': u'Sandy Rogers'},
 {u'id': u'61308789', u'name': u'Deepak Shukla'},
 {u'id': u'222408570', u'name': u'Nik Brbora'},
 {u'id': u'500230810', u'name': u'Vivian Leung'},
 {u'id': u'504489854', u'name': u'Alistair Shepherd'},
 {u'id': u'506326565', u'name': u'Adrien Montcoudiol'},
 {u'id': u'10152870546851054', u'name': u'Govind Chandrasekhar'},
 {u'id': u'510754254', u'name': u'Neal Wu'},
 {u'id': u'519781220', u'name': u'Jane Thomas'},
 {u'id': u'520438231', u'name': u'Kevin G Sun'},
 {u'id': u'537362711', u'name': u'David Liu'},
 {u'id': u'538842179', u'name': u'Carl Gao'},
 {u'id': u'557298431', u'name': u'Tommy Chen'},
 {u'id': u'557364248', u'name': u'Michael Gribben'},
 {u'id': u'578093466', u'name': u'Balaji Pandian'},
 {u'id': u'580236753', u'name': u'Brandon Sim'},
 {u'id': u'586514187', u'name': u'Elakkara Krish'},
 {u'id': u'10152929222946667', u'name': u'Leila Hofer'},
 {u'id': u'10152715658522095', u'name': u'Patrick Leonard'},
 {u'id': u'1015239

In [6]:
# Function: Returns True if `friend_ID` refers to a user that has a public friends list.

def is_friends_list_public(friend_ID):
    base_url = 'https://graph.facebook.com/v2.6'
    fields = "id,name,friends"
    url = '%s/%s?fields=%s&access_token=%s' % (base_url, friend_ID, fields, ACCESS_TOKEN)
#     print url
    
    response = requests.get(url).json()
    
    if "friends" in response and "data" in response["friends"] and len(response["friends"]["data"]) != 0:
#         print "%s's (%s) friends list is public" % (response["name"], friend_ID)
        return True
    else:
#         print "%s's (%s) friends list is private" % (response["name"], friend_ID)
        return False

In [7]:
# Tests for above function
print is_friends_list_public("me") # should return True
print is_friends_list_public("792623746") # should return True
print is_friends_list_public("10101360698376343") # should return False

True
True
False


In [8]:
# Function: Returns True if `friend_ID` refers to a user that has a public news feed.

def is_newsfeed_public(friend_ID):
    base_url = 'https://graph.facebook.com/v2.6'
    url = '%s/%s/feed?access_token=%s' % \
        (base_url, friend_ID, ACCESS_TOKEN)
    response = requests.get(url).json()
#     print url
    
    # Pretty-print the JSON and display it
#     print json.dumps(response, indent=1)

    if "data" in response:
        if len(response["data"]) != 0:
#             print "%s's (%s) newsfeed is public" % (response["name"], friend_ID)
            return True

#     print "%s's (%s) newsfeed is private" % (response["name"], friend_ID)
    return False

In [9]:
# Tests for above function
print is_newsfeed_public("me") # should return True
print is_newsfeed_public("792623746") # should return False

True
False


In [10]:
# Function: Returns an array of keys (strings) specifying all the 
# information the user specified by `friend_ID` shares publicly

def which_user_data_is_public(friend_ID):
    base_url = 'https://graph.facebook.com/v2.6'
    fields = "id,name,bio,birthday,context,education,gender,"\
        "hometown,link,location,political,religion,sports,work"
    url = '%s/%s?fields=%s&access_token=%s' % (base_url, friend_ID, fields, ACCESS_TOKEN)
#     print url
    
    response = requests.get(url).json()
    return response.keys()

In [11]:
print which_user_data_is_public("me") # should return ??
print which_user_data_is_public("792623746") # should return ?? 

[u'bio', u'name', u'gender', u'sports', u'religion', u'birthday', u'link', u'location', u'context', u'hometown', u'education', u'id']
[u'link', u'id', u'context', u'name']


In [12]:
# Function: Utility function, prints everything we know to be publicly 
# available for the user specified by `friend_ID`

def print_info_for_ID(friend_ID):
    base_url = 'https://graph.facebook.com/v2.6'
    fields = "id,name,bio,birthday,education,gender,"\
        "hometown,link,location,political,religion,sports,work"
    url = '%s/%s?fields=%s&access_token=%s' % (base_url, friend_ID, fields, ACCESS_TOKEN)
    print url
    user = requests.get(url).json()

    print user.keys()
    if ("name" in user): print "--------- Name: %s" % user["name"]
    if ("link" in user): print "--------- Link: %s" % user["link"]
    if ("gender" in user): print "--------- Gender: %s" % user["gender"]
    if ("work" in user): print "--------- Work: %s" % user["work"]
    if ("education" in user): print "--------- Education: %s" % user["education"]
    if ("political" in user): print "--------- Political: %s" % user["political"]
    if ("birthday" in user): print "--------- Birthday: %s" % user["birthday"]
    if ("location" in user): print "--------- Location: %s" % user["location"]
    if ("hometown" in user): print "--------- Hometown: %s" % user["hometown"]

In [13]:
print_info_for_ID('me')

https://graph.facebook.com/v2.6/me?fields=id,name,bio,birthday,education,gender,hometown,link,location,political,religion,sports,work&access_token=EAACEdEose0cBAKJbHzKZCCAIg0E6ZCwKCSXZCehXbZArekxS5MKyshqWJp5wscBigqZCgBqbidLR6BwI5umsUfHP4zUkmOHkMKZAOj3UaZC0YUhL3coKrpYZBlMTeZCZCDdSchYTq7slXbWkD9Lkm4O6CGcgGuwTAE1fAiaNMsf9rqUwZDZD
[u'bio', u'name', u'gender', u'sports', u'religion', u'birthday', u'link', u'location', u'hometown', u'education', u'id']
--------- Name: Awais Hussain
--------- Link: https://www.facebook.com/app_scoped_user_id/10152732637689045/
--------- Gender: male
--------- Education: [{u'school': {u'id': u'110639368963540', u'name': u'Langdon High School'}, u'type': u'High School', u'id': u'10150200158139045', u'year': {u'id': u'141778012509913', u'name': u'2008'}}, {u'school': {u'id': u'298392486922848', u'name': u'Forest School'}, u'type': u'High School', u'id': u'10150200158134045', u'year': {u'id': u'142963519060927', u'name': u'2010'}}, {u'id': u'10150200158119045', u

In [15]:
# Function: Prints what we know about a list of users. 
# Input is an array of {id: `id`, name: `name`} objects

def print_info_for_friend_list(friend_list):
    print "%50s | %12s | %10s | %30s"%("User", "Friend_List", "Newsfeed", "Data")
    print 81*"-"

    for friend in friend_list:
        if "id" not in friend:
            print "Error: The friend %s does not have an ID" % friend
            continue  # skip this entry instead of exiting the whole session

        # check if their friends list is public
        bool_friend_list = is_friends_list_public(friend["id"])

        # check if their newsfeed is public
        bool_newsfeed = is_newsfeed_public(friend["id"])

        # check if you can get access to their data (name, age, location, income etc.)
        user_data = which_user_data_is_public(friend["id"])

        print "%50s | %12s | %10s | %30s"%((friend["name"],friend["id"]), bool_friend_list, bool_newsfeed, user_data)

In [16]:
# Test for the function above
print_info_for_friend_list(friend_list) # Should print a table of information

                                              User |  Friend_List |   Newsfeed |                           Data
---------------------------------------------------------------------------------
                    (u'Sandy Rogers', u'37109066') |        False |       True | [u'gender', u'link', u'id', u'context', u'name']
                   (u'Deepak Shukla', u'61308789') |         True |      False | [u'gender', u'link', u'id', u'context', u'name']
                     (u'Nik Brbora', u'222408570') |         True |      False | [u'gender', u'link', u'id', u'context', u'name']
                   (u'Vivian Leung', u'500230810') |         True |      False | [u'gender', u'link', u'id', u'context', u'name']
              (u'Alistair Shepherd', u'504489854') |         True |      False | [u'bio', u'name', u'gender', u'work', u'birthday', u'link', u'location', u'context', u'education', u'id']
             (u'Adrien Montcoudiol', u'506326565') |         True |      False | [u'gender', u'link

In [61]:
def print_info_for_ID_list(IDs):
    print "%30s | %12s | %10s | %30s"%("User", "Friend_List", "Newsfeed", "Data")
    print 81*"-"

    for ID in IDs:

        # check if their friends list is public
        bool_friend_list = is_friends_list_public(ID)

        # check if their newsfeed is public
        bool_newsfeed = is_newsfeed_public(ID)

        # check if you can get access to their data (name, age, location, income etc.)
        user_data = which_user_data_is_public(ID)

        print "%30s | %12s | %10s | %30s"%((ID), bool_friend_list, bool_newsfeed, user_data)

In [17]:
# Function: A golden person is someone who has a public newsfeed and reveals some bare 
# minimum of information publicly. At the moment the information required 
# is `gender`, `birthday`, and `location`.
# This function returns True if `ID` refers to a golden person.

def is_golden_person(ID):
    if not is_newsfeed_public(ID): return False
    public_data = which_user_data_is_public(ID)
    # check list of keys to ensure enough data is public
    result = all(x in public_data for x in ["gender","birthday","location"])
    return result
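
A `birthday` key being present does not by itself give an age, which is the trait the app actually needs. The sketch below converts a birthday string into an age. It assumes the `MM/DD/YYYY` format and treats any string without all three parts (for example, a birthday shared without its year) as unusable. `age_from_birthday` is a name made up for this example.

```python
from datetime import date

def age_from_birthday(birthday, today=None):
    """Return age in years for an 'MM/DD/YYYY' string, or None if the year is missing."""
    parts = birthday.split("/")
    if len(parts) != 3:
        return None  # not enough information to compute an age
    month, day, year = (int(part) for part in parts)
    today = today or date.today()
    # Subtract one year if this year's birthday has not happened yet.
    before_birthday = (today.month, today.day) < (month, day)
    return today.year - year - int(before_birthday)

age_before = age_from_birthday("11/03/1991", today=date(2016, 10, 18))
age_on_day = age_from_birthday("11/03/1991", today=date(2016, 11, 3))
age_unknown = age_from_birthday("11/03")
```

A stricter version of `is_golden_person` could require that this function returns a number rather than only checking that the key exists.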

In [18]:
# Tests for the above function
print is_golden_person("me") # Should return True
print is_golden_person("645815986") # Should return False
print is_golden_person("857495149") # Should return True

True
False
True


# Saving the data to a file

These are the helper functions that save data to a CSV file. Later I intend to save directly to a SQL database, but since we are just prototyping for now, a CSV file is fine.
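
For the later SQL step, a minimal sketch using the standard-library `sqlite3` module is shown below. The table layout, the column subset, and the function names `open_user_db` and `save_user` are assumptions made for this example, not a settled schema. Making `id` the primary key and using `INSERT OR IGNORE` gives duplicate checking for free.

```python
import sqlite3

def open_user_db(path=":memory:"):
    """Open (or create) a database holding one row per Facebook user ID."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "id TEXT PRIMARY KEY, name TEXT, gender TEXT, birthday TEXT)"
    )
    return conn

def save_user(conn, user):
    """Insert a user dict; a row with an already-stored ID is silently skipped."""
    conn.execute(
        "INSERT OR IGNORE INTO users (id, name, gender, birthday) VALUES (?, ?, ?, ?)",
        (user.get("id"), user.get("name"), user.get("gender"), user.get("birthday")),
    )
    conn.commit()

conn = open_user_db()  # in-memory database for the example
save_user(conn, {"id": "1", "name": "Example User", "gender": "male"})
save_user(conn, {"id": "1", "name": "Example User", "gender": "male"})  # duplicate
save_user(conn, {"id": "2", "name": "Another User"})
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```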

In [19]:
# Function: Returns a dictionary of data for the user specified by `ID`

def get_info_for_ID(ID):
    base_url = 'https://graph.facebook.com/v2.6'
    fields = "id,name,birthday,education,gender,"\
        "hometown,link,location,political,religion,sports,work"
    url = '%s/%s?fields=%s&access_token=%s' % (base_url, ID, fields, ACCESS_TOKEN)
    print url
    user = requests.get(url).json()

    d = dict()
    for key in user.keys():
        d[key] = user[key]
    
    return d

In [20]:
# Test for above function
get_info_for_ID("me")

https://graph.facebook.com/v2.6/me?fields=id,name,birthday,education,gender,hometown,link,location,political,religion,sports,work&access_token=EAACEdEose0cBAFlJr9kQ4pzS1GOGBM24pwvSmrDVcR54mI7lkD5UsKeyD5GJxp1AnZAFAVASJpOa0rTI5ZA8NERuZB01LT62GoZA9YCH4IB9im5dHRYPWGmzXE14JFUU17ALHAtkJvgpbMH3KknQyLPg7q0CZC0vzaTzkgaF1TgZDZD


{u'birthday': u'11/03/1991',
 u'education': [{u'id': u'10150200158139045',
   u'school': {u'id': u'110639368963540', u'name': u'Langdon High School'},
   u'type': u'High School',
   u'year': {u'id': u'141778012509913', u'name': u'2008'}},
  {u'id': u'10150200158134045',
   u'school': {u'id': u'298392486922848', u'name': u'Forest School'},
   u'type': u'High School',
   u'year': {u'id': u'142963519060927', u'name': u'2010'}},
  {u'concentration': [{u'id': u'109279729089828', u'name': u'Physics'},
    {u'id': u'108026662559095', u'name': u'Philosophy'}],
   u'id': u'10150200158119045',
   u'school': {u'id': u'105930651606', u'name': u'Harvard University'},
   u'type': u'College',
   u'year': {u'id': u'105576766163075', u'name': u'2015'}}],
 u'gender': u'male',
 u'hometown': {u'id': u'106078429431815', u'name': u'London, United Kingdom'},
 u'id': u'10152732637689045',
 u'link': u'https://www.facebook.com/app_scoped_user_id/10152732637689045/',
 u'location': {u'id': u'106377336067638', u'n

In [21]:
# Functions: helper functions to create and save users to csv file

import csv
import sys

fieldnames = ["id","name","birthday","education","gender","hometown",\
              "link","location","political","religion","sports","work"]
csv_name = "data_1.csv"

def start_csv_file(csv_name, fieldnames):
    with open(csv_name, 'w') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()

# Function: Input a FB user ID and this function will retrieve the necessary 
# data and save it to the working csv file.
def save_ID_to_csv(ID, csv_name, fieldnames):
    # Should do some duplication checking here, but can't be bothered.
    user = get_info_for_ID(ID)
    with open(csv_name, 'a') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        try:
            writer.writerow(user)
        except Exception as err:
            print "Error: %s" % err
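
Two things can go wrong when a raw API response is handed to `DictWriter`: nested values such as `education` or `location` are written as Python reprs, and an unexpected key (for example `error`) raises an exception. One possible way to prepare a row is sketched below; `flatten_for_csv` is a hypothetical helper and the sample user is invented.

```python
import json

def flatten_for_csv(user, fieldnames):
    """Return a row with exactly the given columns; nested values become JSON text."""
    row = {}
    for field in fieldnames:
        value = user.get(field, "")  # missing fields become empty cells
        if isinstance(value, (dict, list)):
            value = json.dumps(value, sort_keys=True)
        row[field] = value
    return row

# Hypothetical user dict shaped like a Graph API response.
sample = {
    "id": "1",
    "name": "Example User",
    "location": {"id": "2", "name": "London, United Kingdom"},
    "error": "unexpected key that is not a CSV column",
}
row = flatten_for_csv(sample, ["id", "name", "location", "work"])
```

The result can be passed to `writer.writerow(row)`, and the JSON columns can be decoded again with `json.loads` when the file is read back.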

In [21]:
start_csv_file(csv_name, fieldnames)

In [22]:
# Test for above functions
save_ID_to_csv("me", csv_name, fieldnames)
print "All done!"

https://graph.facebook.com/v2.6/me?fields=id,name,birthday,education,gender,hometown,link,location,political,religion,sports,work&access_token=EAACEdEose0cBANQbOJeZAFdbo9uC2WMz4WVrsf1JzLngKhctZBu2HUgttSvWV2035l3wl430ikv6YmzhpUyZBI3TmIZCE8ZAfwiYpZCcHayN2G18QZBjwT4eRF1b0Vpf4xkV1TRxkOQfZBZAZAW2DDg8VKlzjX3qfZAIDhCpBryoqg6pAZDZD
All done!


# Generating a List of Golden People

Now I need to generate a long list of Facebook user IDs so that I can check each one to find 'golden people'.

I'm trying a depth-first search (while checking for duplicates), since spread is more important than coverage.

In [23]:
# Procedure pulled from online

# procedure DFS(G, v):
#     label v as explored
#     for all edges e in G.incidentEdges(v) do
#         if edge e is unexplored then
#             w ← G.adjacentVertex(v, e)
#             if vertex w is unexplored then
#                 label e as a discovered edge
#                 // need to do some work here. Save w to long list of people
#                 recursively call DFS(G, w)
#             else
#                 label e as a back edge

In [22]:
import random

# I make the decision now to work exclusively with IDs and not worry about names;
# this makes my data structures simpler.

# Function: This is the main workhorse of the DFS. Specify a root node and how many 
# friends to find, and the DFS will keep iterating until it hits the limit.
def DFS_on_ID(ID, ID_list, limit):
    print "entering DFS function"
    print "length of master list: %i" % len(ID_list)

    if len(ID_list) >= limit:
        return
    
    # Add ID to master list
    if ID not in ID_list:
        print "Adding ID %s to master list" % ID
        ID_list.append(ID)
    
    friend_list = get_N_pages_of_friends_for_ID(5, ID)
    print "Got %i friends for %s"%(len(friend_list), ID)
    
    next_friend_found = False
    for friend in friend_list:
        if friend["id"] not in ID_list and is_friends_list_public(friend["id"]):
            next_friend_found = True
            DFS_on_ID(friend["id"], ID_list, limit)
            break
        else:
            continue
    
    if not next_friend_found:
        print "Going through bypass section, seeding with random id from master list"
        DFS_on_ID(ID_list[random.randint(0,len(ID_list)-1)], ID_list, limit)
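
For comparison, here is a sketch of the same traversal written iteratively with an explicit stack. It is not the function used in the runs in this notebook. It takes the friend lookup as a parameter (`get_friends`, a hypothetical callable) so it can be tried on a toy graph without calling the API, and it simply stops when there is nothing left to explore instead of re-seeding.

```python
def dfs_collect(start, get_friends, limit):
    """Collect up to `limit` IDs by depth-first traversal from `start`.

    get_friends is any callable mapping an ID to a list of friend IDs
    (an empty list when the friends list is private).
    """
    visited = []   # IDs in the order they were explored
    seen = set()   # fast duplicate check
    stack = [start]
    while stack and len(visited) < limit:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        visited.append(current)
        # Push in reverse so the first listed friend is explored first.
        for friend in reversed(get_friends(current)):
            if friend not in seen:
                stack.append(friend)
    return visited

# Hypothetical toy graph standing in for Facebook friend lists.
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["a"]}
order = dfs_collect("a", lambda node: toy_graph.get(node, []), limit=10)
isolated = dfs_collect("x", lambda node: [], limit=10)
```

Because unexplored friends stay on the stack, the search backtracks to them when it reaches a dead end, and a seed with no visible friends returns immediately with just itself.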

In [23]:
ID_master_list = []
# "10152732637689045" is "me"
# me = "10152732637689045"
DFS_on_ID("me", ID_master_list, 50)

entering DFS function
length of master list: 0
Adding ID me to master list
https://graph.facebook.com/v2.6/me?fields=id,name,friends&access_token=EAACEdEose0cBAFlJr9kQ4pzS1GOGBM24pwvSmrDVcR54mI7lkD5UsKeyD5GJxp1AnZAFAVASJpOa0rTI5ZA8NERuZB01LT62GoZA9YCH4IB9im5dHRYPWGmzXE14JFUU17ALHAtkJvgpbMH3KknQyLPg7q0CZC0vzaTzkgaF1TgZDZD
Got 50 friends for me
entering DFS function
length of master list: 1
Adding ID 61308789 to master list
https://graph.facebook.com/v2.6/61308789?fields=id,name,friends&access_token=EAACEdEose0cBAFlJr9kQ4pzS1GOGBM24pwvSmrDVcR54mI7lkD5UsKeyD5GJxp1AnZAFAVASJpOa0rTI5ZA8NERuZB01LT62GoZA9YCH4IB9im5dHRYPWGmzXE14JFUU17ALHAtkJvgpbMH3KknQyLPg7q0CZC0vzaTzkgaF1TgZDZD
Got 32 friends for 61308789
entering DFS function
length of master list: 2
Adding ID 10152732637689045 to master list
https://graph.facebook.com/v2.6/10152732637689045?fields=id,name,friends&access_token=EAACEdEose0cBAFlJr9kQ4pzS1GOGBM24pwvSmrDVcR54mI7lkD5UsKeyD5GJxp1AnZAFAVASJpOa0rTI5ZA8NERuZB01LT62GoZA9YCH4IB9im5dHRY

In [24]:
print "Is golden person?"

num_golden_people = 0
for ID in ID_master_list:
    golden = is_golden_person(ID)  # call once per ID; each call hits the API
    if golden: num_golden_people = num_golden_people + 1
    print "%s: %s"% (ID, golden)

Is golden person?
me: True
61308789: False
10152732637689045: True
222408570: False
504489854: False
500230810: False
557298431: False
510754254: False
506326565: False
557364248: False
538842179: False
537362711: False
580236753: False
10152393576818784: False
519781220: False
704446001: True
10152122559296317: False
719003798: False
578093466: False
10152975494071091: False
633215477: False
10152393368903676: False
10152929222946667: True
10204218119587067: False
645815986: False
1208777884: False
643735396: False
10202751221877677: False
10203761591374091: False
10152870546851054: False
1387154996: False
1494600179: True
1109894092: True
1256912018: True
743496822: False
10152715658522095: False
10152549225726215: True
1129283744: False
705347929: False
1232401193: False
739994128: True
1113450135: False
1255053395: False
620793652: False
1507921178: True
1371877687: False
1636860086: False
648074185: False
1518653665: False
792623746: False


In [25]:
print "******************************************************"
print "from a list of %d users, %d (or %d%%) are golden people" % \
(len(ID_master_list), num_golden_people, (float(num_golden_people) / len(ID_master_list)) * 100)
print "******************************************************"


******************************************************
from a list of 50 users, 10 (or 20%) are golden people
******************************************************


In [26]:
x = get_N_pages_of_friends_for_ID(5, "1494600179")
for f in x:
    friend_id = f["id"]  # avoid shadowing the built-in id()

    print "%s: friends list public? %s"%(friend_id, is_friends_list_public(friend_id))

https://graph.facebook.com/v2.6/1494600179?fields=id,name,friends&access_token=EAACEdEose0cBAFlJr9kQ4pzS1GOGBM24pwvSmrDVcR54mI7lkD5UsKeyD5GJxp1AnZAFAVASJpOa0rTI5ZA8NERuZB01LT62GoZA9YCH4IB9im5dHRYPWGmzXE14JFUU17ALHAtkJvgpbMH3KknQyLPg7q0CZC0vzaTzkgaF1TgZDZD
10101756499832051: friends list public? False
10101523206493777: friends list public? False
10101727397887587: friends list public? False
10100986622028354: friends list public? False
10106029898207476: friends list public? False
10104606429160856: friends list public? False
10103583775347333: friends list public? False
10101770806711018: friends list public? False
10106052136052699: friends list public? False
10103804623925058: friends list public? False
10103564021154938: friends list public? False
10101997334472637: friends list public? False
10152861447873221: friends list public? False
10152672562398737: friends list public? False
10155082731635015: friends list public? False
10155503152060377: friends list public? False
10153455

Seems like we're kind of stuck. I got to 50 people, and now it seems like I've exhausted the area around my network. Almost no one in this local area shares their friends list publicly, and the people who do share it don't lead very far before we hit a cul-de-sac.

In [117]:
# Save the data we have to the csv file

fieldnames = ["id","name","birthday","education","gender","hometown",\
              "link","location","political","religion","sports","work"]
csv_name = "data_all_master_list.csv"

start_csv_file(csv_name, fieldnames)

for ID in ID_master_list:
    save_ID_to_csv(ID, csv_name, fieldnames)

https://graph.facebook.com/v2.6/me?fields=id,name,birthday,education,gender,hometown,link,location,political,religion,sports,work&access_token=EAACEdEose0cBAFz7PHFZBuxULHj4HYxZCztwqYRElZAJGvbYA1ZBRgpB9ZC8ZA4HRNtKigFogTqlUoOk65ErCNVzTtdLlxuX5fYGV16RV6xz5i3maaMdNOXYzsP86kdDpyVCYR7ZCWafYLsIS3ZCV8PISnkNyogWrbxxXfQna2M3jgZDZD
https://graph.facebook.com/v2.6/61308789?fields=id,name,birthday,education,gender,hometown,link,location,political,religion,sports,work&access_token=EAACEdEose0cBAFz7PHFZBuxULHj4HYxZCztwqYRElZAJGvbYA1ZBRgpB9ZC8ZA4HRNtKigFogTqlUoOk65ErCNVzTtdLlxuX5fYGV16RV6xz5i3maaMdNOXYzsP86kdDpyVCYR7ZCWafYLsIS3ZCV8PISnkNyogWrbxxXfQna2M3jgZDZD
https://graph.facebook.com/v2.6/10152732637689045?fields=id,name,birthday,education,gender,hometown,link,location,political,religion,sports,work&access_token=EAACEdEose0cBAFz7PHFZBuxULHj4HYxZCztwqYRElZAJGvbYA1ZBRgpB9ZC8ZA4HRNtKigFogTqlUoOk65ErCNVzTtdLlxuX5fYGV16RV6xz5i3maaMdNOXYzsP86kdDpyVCYR7ZCWafYLsIS3ZCV8PISnkNyogWrbxxXfQna2M3jgZDZD
https://gr

# Seed with a random user?

Given that I have already exhausted my own local area of friends, maybe I can try seeding the DFS with a random user?

The problem is that a random user is unlikely to share their friends list publicly... The area each DFS can explore is very small.

In [123]:
ID_list = []
DFS_on_ID("100006437354964", ID_list, 50)

entering DFS function
length of master list: 0
Adding ID 100006437354964 to master list
https://graph.facebook.com/v2.6/100006437354964?fields=id,name,friends&access_token=EAACEdEose0cBAFz7PHFZBuxULHj4HYxZCztwqYRElZAJGvbYA1ZBRgpB9ZC8ZA4HRNtKigFogTqlUoOk65ErCNVzTtdLlxuX5fYGV16RV6xz5i3maaMdNOXYzsP86kdDpyVCYR7ZCWafYLsIS3ZCV8PISnkNyogWrbxxXfQna2M3jgZDZD
Got 0 friends for 100006437354964
Going through bypass section, seeding with random id from master list
entering DFS function
length of master list: 1
https://graph.facebook.com/v2.6/100006437354964?fields=id,name,friends&access_token=EAACEdEose0cBAFz7PHFZBuxULHj4HYxZCztwqYRElZAJGvbYA1ZBRgpB9ZC8ZA4HRNtKigFogTqlUoOk65ErCNVzTtdLlxuX5fYGV16RV6xz5i3maaMdNOXYzsP86kdDpyVCYR7ZCWafYLsIS3ZCV8PISnkNyogWrbxxXfQna2M3jgZDZD
Got 0 friends for 100006437354964
Going through bypass section, seeding with random id from master list
entering DFS function
length of master list: 1
https://graph.facebook.com/v2.6/100006437354964?fields=id,name,friends&access_toke

KeyboardInterrupt: 

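The above technique is not really working: the seed user returns 0 friends, so the bypass section keeps re-seeding the search with the only ID in the list and the recursion never makes progress.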

# End Notes

## Get info

In [28]:
ID = 'me'

base_url = 'https://graph.facebook.com/v2.6'
fields = "id,name,bio,birthday,education,gender,"\
    "hometown,link,location,political,religion,sports,work"
url = '%s/%s?fields=%s&access_token=%s' % (base_url, ID, fields, ACCESS_TOKEN)

print url

https://graph.facebook.com/v2.6/me?fields=id,name,bio,birthday,education,gender,hometown,link,location,political,religion,sports,work&access_token=EAACEdEose0cBAKp9i6qqJBHm9dUIqs9uUBVR0mbz1lJ7LDzdPCwZCPjuCeC3SOPTB0tHKhreq8eNFt4ceNMhPTluWysWbVmbZCHBG5sIZAYcFYkQo3OGZCObXetmbe5w9capN3do0gqOvpOAzQo4O9JcFmKHdZBSm4rUlisTtKAZDZD


## Get feed

In [24]:
ID = 'me'

base_url = 'https://graph.facebook.com/v2.6'
url = '%s/%s/feed?access_token=%s' % (base_url, ID, ACCESS_TOKEN)
print url

https://graph.facebook.com/v2.6/me/feed?access_token=EAACEdEose0cBAPtacXrPZBvh38In7FJc87Y2AtvKXqgsUjE8h1CIbSIL6xssT0ebxvMdft9tka7W5nyNm8EBkzGARzflPocVZC9vJaHZBayuc8q5ECn0jQMtCJwZCzRzszQD5BsE7aZCVZCh5PG0bViAupwhqIZBsFsilzXF14sVgZDZD


## Lookup post id

In [25]:
post_ID = "10152732637689045_10155205567189045"
base_url = 'https://graph.facebook.com/v2.8'
fields = "id,admin_creator,application,caption,created_time,description,from,icon,link,message,name,object_id,permalink_url,picture,place,privacy,properties,shares,source,status_type,story,to,type,updated_time"
url = '%s/%s?fields=%s&access_token=%s' % (base_url, post_ID, fields, ACCESS_TOKEN)
print url

https://graph.facebook.com/v2.8/10152732637689045_10155205567189045?fields=id,admin_creator,application,caption,created_time,description,from,icon,link,message,name,object_id,permalink_url,picture,place,privacy,properties,shares,source,status_type,story,to,type,updated_time&access_token=EAACEdEose0cBAPtacXrPZBvh38In7FJc87Y2AtvKXqgsUjE8h1CIbSIL6xssT0ebxvMdft9tka7W5nyNm8EBkzGARzflPocVZC9vJaHZBayuc8q5ECn0jQMtCJwZCzRzszQD5BsE7aZCVZCh5PG0bViAupwhqIZBsFsilzXF14sVgZDZD



# A New Approach - Public Pages
   

Let's find some people who have interacted with the Fox News page recently.

In [43]:
def get_feed_for_id(ID, p = False):
    base_url = 'https://graph.facebook.com/v2.8'
    url = '%s/%s/feed?access_token=%s' % (base_url, ID, ACCESS_TOKEN)
    
    if p: print url
        
    return requests.get(url).json()

In [44]:
def get_likes_for_post(ID, p = False):
    base_url = 'https://graph.facebook.com/v2.8'
    url = '%s/%s/likes?access_token=%s' % (base_url, ID, ACCESS_TOKEN)
    if p: print url

    return requests.get(url).json()

In [45]:
def get_comments_for_post(ID, p = False):
    base_url = 'https://graph.facebook.com/v2.8'
    url = '%s/%s/comments?access_token=%s' % (base_url, ID, ACCESS_TOKEN)
    if p: print url

    return requests.get(url).json()

In [34]:
FOX_NEWS_ID = "15704546335"

content = get_feed_for_id(FOX_NEWS_ID)

# Pretty-print the JSON and display it
print json.dumps(content, indent=1)

https://graph.facebook.com/v2.8/15704546335/feed?access_token=EAACEdEose0cBAFlJr9kQ4pzS1GOGBM24pwvSmrDVcR54mI7lkD5UsKeyD5GJxp1AnZAFAVASJpOa0rTI5ZA8NERuZB01LT62GoZA9YCH4IB9im5dHRYPWGmzXE14JFUU17ALHAtkJvgpbMH3KknQyLPg7q0CZC0vzaTzkgaF1TgZDZD
{
 "paging": {
  "next": "https://graph.facebook.com/v2.8/15704546335/feed?access_token=EAACEdEose0cBAFlJr9kQ4pzS1GOGBM24pwvSmrDVcR54mI7lkD5UsKeyD5GJxp1AnZAFAVASJpOa0rTI5ZA8NERuZB01LT62GoZA9YCH4IB9im5dHRYPWGmzXE14JFUU17ALHAtkJvgpbMH3KknQyLPg7q0CZC0vzaTzkgaF1TgZDZD&limit=25&until=1476680400&__paging_token=enc_AdAP3eXS6qZAcQT6LJ5IZAUzGnzAVarHAJGGHQGmrn20Ttuk9sUZCxgPgV46OyUdHeSnp6LIbinIdhG863Ik67KCJyf", 
  "previous": "https://graph.facebook.com/v2.8/15704546335/feed?since=1476750925&access_token=EAACEdEose0cBAFlJr9kQ4pzS1GOGBM24pwvSmrDVcR54mI7lkD5UsKeyD5GJxp1AnZAFAVASJpOa0rTI5ZA8NERuZB01LT62GoZA9YCH4IB9im5dHRYPWGmzXE14JFUU17ALHAtkJvgpbMH3KknQyLPg7q0CZC0vzaTzkgaF1TgZDZD&limit=25&__paging_token=enc_AdAWtkYufet0Fde3AY2ILyufYBFnZB9DelovHAAhZAWaQnVADyXAaAw8A

In [39]:
posts = content["data"]

print json.dumps(get_likes_for_post(posts[0]["id"]), indent=1)

https://graph.facebook.com/v2.8/15704546335_10154706546376336/likes?access_token=EAACEdEose0cBAFlJr9kQ4pzS1GOGBM24pwvSmrDVcR54mI7lkD5UsKeyD5GJxp1AnZAFAVASJpOa0rTI5ZA8NERuZB01LT62GoZA9YCH4IB9im5dHRYPWGmzXE14JFUU17ALHAtkJvgpbMH3KknQyLPg7q0CZC0vzaTzkgaF1TgZDZD
{
 "paging": {
  "cursors": {
   "after": "OTk2NTI0NTA3MDI4NTI4", 
   "before": "MTQ5OTY5NTg2MzY0NTcxMQZDZD"
  }, 
  "next": "https://graph.facebook.com/v2.8/15704546335_10154706546376336/likes?access_token=EAACEdEose0cBAFlJr9kQ4pzS1GOGBM24pwvSmrDVcR54mI7lkD5UsKeyD5GJxp1AnZAFAVASJpOa0rTI5ZA8NERuZB01LT62GoZA9YCH4IB9im5dHRYPWGmzXE14JFUU17ALHAtkJvgpbMH3KknQyLPg7q0CZC0vzaTzkgaF1TgZDZD&limit=25&after=OTk2NTI0NTA3MDI4NTI4"
 }, 
 "data": [
  {
   "id": "1499695863645711", 
   "name": "Rodney Grap"
  }, 
  {
   "id": "746367348782415", 
   "name": "Leta Anderson"
  }, 
  {
   "id": "719065047264", 
   "name": "Tiffany Morgan"
  }, 
  {
   "id": "664895046879853", 
   "name": "Diane Burkart"
  }, 
  {
   "id": "10201746084958394", 
   "name"

In [48]:
print json.dumps(get_comments_for_post(posts[0]["id"]), indent=1)

{
 "paging": {
  "cursors": {
   "after": "NzU0", 
   "before": "Nzgw"
  }, 
  "next": "https://graph.facebook.com/v2.8/15704546335_10154706546376336/comments?access_token=EAACEdEose0cBAFlJr9kQ4pzS1GOGBM24pwvSmrDVcR54mI7lkD5UsKeyD5GJxp1AnZAFAVASJpOa0rTI5ZA8NERuZB01LT62GoZA9YCH4IB9im5dHRYPWGmzXE14JFUU17ALHAtkJvgpbMH3KknQyLPg7q0CZC0vzaTzkgaF1TgZDZD&limit=25&after=NzU0"
 }, 
 "data": [
  {
   "created_time": "2016-10-18T00:37:13+0000", 
   "message": "If people are unsure on who to vote for, ask yourself one question.Who is putting the American people's life at risk by letting refugees into this country? It's all good until one of your loved ones are raped or worse,, then you will ask yourself, Well i guess i voted for that...You will have blood on your hands...", 
   "from": {
    "name": "Richard Ogle", 
    "id": "867158269983597"
   }, 
   "id": "10154706546376336_10154706599071336"
  }, 
  {
   "created_time": "2016-10-18T00:35:54+0000", 
   "message": "Hillary Clinton made a lot of 

In [49]:
# Treat everyone who liked or commented on one of these posts as a fan.
# Only the first page of likes and comments is read for each post.
trump_fans = []

print "Started searching for Trump fans"
for post in posts:
    ID = post["id"]
    likers = get_likes_for_post(ID)['data']
    for liker in likers:
        trump_fans.append(liker['id'])
    comments = get_comments_for_post(ID)['data']
    for comment in comments:
        trump_fans.append(comment["from"]["id"])
print "Done searching for Trump fans"

Started searching for Trump fans
Done searching for Trump fans
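
The like and comment responses above each carry a `paging.next` URL, so the loop only sees the first page of each (the `next` links use `limit=25`). Below is a minimal sketch of following those links. `iter_all_pages` and its `fetch` and `max_pages` parameters are hypothetical names, not part of this notebook or of the Graph API; `fetch` is assumed to be any callable that takes a URL and returns the decoded JSON response.

```python
def iter_all_pages(first_page, fetch, max_pages=10):
    """Yield every item from a paged response, following paging.next.

    first_page: a decoded response dict with "data" and optional "paging".
    fetch: hypothetical callable mapping a URL to a decoded JSON dict.
    max_pages: safety cap so a long feed cannot loop indefinitely.
    """
    page = first_page
    pages_read = 0
    while True:
        for item in page.get("data", []):
            yield item
        pages_read += 1
        next_url = page.get("paging", {}).get("next")
        # Stop when there is no further page or the cap is reached.
        if not next_url or pages_read >= max_pages:
            break
        page = fetch(next_url)
```

With `requests` imported as earlier in the notebook, a call could look like `iter_all_pages(get_likes_for_post(ID), lambda url: requests.get(url).json())`. The cap keeps the number of API calls bounded for posts with many likes.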


In [51]:
trump_fans = list(set(trump_fans))
print len(trump_fans)

1188


In [64]:
num_golden_fans = 0
print "checking for golden fans"
for ID in trump_fans:
    # Call is_golden_person once per ID and reuse the result
    golden = is_golden_person(ID)
    print "%s: %s" % (ID, golden)
    if golden:
        num_golden_fans += 1

print "Done finding golden Trump fans!"

checking for golden fans
578030265661727: False
10152446890627849: False
10205036478446756: False
10203224342117458: False
571177653014929: False
10153501532826692: False
847727081910816: False
10152857246901665: False
1048665595159795: False
10201922459703461: False
814456311945673: False
872802132782952: False
1638479233040879: False
902708156420733: False
786090178097660: False
10204103933746112: False
10203323683511437: False
10154609350593312: False
116951435441439: False
1453055944938832: False
1597394393853229: False
1207311689294318: False
10204965218268983: False
513297158798977: False
712712428818025: False
10152911554301180: False
867158269983597: False
10203497804940216: False
839141946166384: False
814661031964076: False
10102067208230315: False
496593107133140: False
785290911514971: False
806739409397700: False
124926414621058: False
1015956115088272: False
10204128359789826: False
10201769281065559: False
962775750409274: False
10202031193001756: False
10202134561986811

In [65]:
print "******************************************************"
print "from a list of %d users, %d (or %d%%) are golden people" % \
(len(trump_fans), num_golden_fans, (float(num_golden_fans) / len(trump_fans)) * 100)
print "******************************************************"


******************************************************
from a list of 1188 users, 0 (or 0%) are golden people
******************************************************


In [63]:
print_info_for_ID_list(trump_fans[600:699])

                          User |  Friend_List |   Newsfeed |                           Data
---------------------------------------------------------------------------------
               602175256579575 |        False |      False | [u'link', u'id', u'context', u'name']
              1037178099639154 |        False |      False | [u'link', u'id', u'context', u'name']
             10205434968576489 |        False |      False | [u'link', u'id', u'context', u'name']
               207009622976499 |        False |      False | [u'link', u'id', u'context', u'name']
             10203381969003093 |        False |      False | [u'link', u'id', u'context', u'name']
             10154048801048243 |        False |      False | [u'link', u'id', u'context', u'name']
               778772568849691 |        False |      False | [u'link', u'id', u'context', u'name']
             10152374736254617 |        False |      False | [u'link', u'id', u'context', u'name']
             10202485884888087 |  

In [72]:
FACEBOOK_ID = "19292868552"

content = get_feed_for_id(FACEBOOK_ID, True)
posts = content["data"]

FB_fans = []

print "Started searching for FB fans"
for post in posts:
    ID = post["id"]
    likers = get_likes_for_post(ID)['data']
    for liker in likers:
        FB_fans.append(liker['id'])
    comments = get_comments_for_post(ID)['data']
    for comment in comments:
        FB_fans.append(comment["from"]["id"])
print "Done searching for FB fans"

FB_fans = list(set(FB_fans))

https://graph.facebook.com/v2.8/19292868552/feed?access_token=EAACEdEose0cBAE5w8TiGRzPoPM1T5hQUMgde5HXcKSmzMSeapi9b3IXBVehQhYfgxVdQdTuOgLaKK8MtHUga38SZASDIZAN5DxqCQHqjVVzEh2WaaeLnzgwKCu8Mf0ma9u5MlcEoDSja5gjomfxoCSm5q2e5LpVkYV081MZAwZDZD
Started searching for FB fans
Done searching for FB fans
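
The cell above repeats the same collection loop used for the Trump fans. A minimal sketch of factoring it into one function is shown here. `collect_fan_ids` is a hypothetical name; it assumes `get_likes` and `get_comments` behave like `get_likes_for_post` and `get_comments_for_post`, taking a post ID and returning a dict with a `data` list. Skipping comments that have no `from` field is also an assumption, added as a guard.

```python
def collect_fan_ids(posts, get_likes, get_comments):
    """Return the set of user IDs that liked or commented on any post.

    get_likes / get_comments: callables taking a post ID and returning
    a decoded response dict with a "data" list (assumed shape).
    """
    fan_ids = set()  # a set removes duplicates as IDs are added
    for post in posts:
        post_id = post["id"]
        for liker in get_likes(post_id).get("data", []):
            fan_ids.add(liker["id"])
        for comment in get_comments(post_id).get("data", []):
            author = comment.get("from")
            if author:  # skip comments without author information
                fan_ids.add(author["id"])
    return fan_ids
```

Usage would then be `FB_fans = list(collect_fan_ids(posts, get_likes_for_post, get_comments_for_post))`, which replaces both the loop and the later `list(set(...))` step.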


In [70]:
num_golden_fb_fans = 0
print "checking for golden fans"
for ID in FB_fans:
#     print "%s: %s"% (ID, is_golden_person(ID))
    if is_golden_person(ID):
        num_golden_fb_fans += 1

print "Done finding golden FB fans!"

checking for golden fans
Done finding golden FB fans!


In [71]:
print "******************************************************"
print "from a list of %d users, %d (or %d%%) are golden people" % \
    (len(FB_fans), num_golden_fb_fans, (float(num_golden_fb_fans) / len(FB_fans)) * 100)
print "******************************************************"

******************************************************
from a list of 612 users, 0 (or 0%) are golden people
******************************************************
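
Both summary cells compute the same count and percentage by hand, so it is easy to divide by the wrong list. A minimal sketch of a shared helper follows. `summarize_golden` is a hypothetical name, and `is_golden` is assumed to behave like `is_golden_person`, taking a user ID and returning a boolean. Returning 0.0 for an empty list is an assumed choice made to avoid a division by zero.

```python
def summarize_golden(ids, is_golden):
    """Return (total, golden_count, golden_percent) for a list of IDs.

    is_golden: callable taking a user ID and returning True or False
    (assumed to match is_golden_person from this notebook).
    """
    total = len(ids)
    golden = sum(1 for user_id in ids if is_golden(user_id))
    # The denominator always comes from the same list that was counted.
    percent = 100.0 * golden / total if total else 0.0
    return total, golden, percent
```

For example, `summarize_golden(FB_fans, is_golden_person)` returns the three values that the summary line formats, with one `is_golden` call per ID.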
