# Using Facepy to collect and analyze Facebook data

## Introduction

Facebook is one of the largest and most popular social networks in the world, and it holds a huge amount of data for social network analysis. Knowing how to access that data is therefore helpful if you want to study human networks or even global trends. This tutorial introduces how to collect data from Facebook for further analysis. It covers the [Facebook API](https://developers.facebook.com/docs/graph-api/overview) along with basic web scraping techniques, and the use of [Facepy](https://facepy.readthedocs.io/en/latest/), a Python package that makes interaction with the Facebook API much easier. 

### Tutorial Content
- [Example of collecting data from Facebook without APIs](#Example-of-collecting-data-from-Facebook-without-APIs)
- [Facebook API access](#Facebook-API-access)
- [Overview of Facebook API](#Overview-of-Facebook-API)
- [Easier usage of Facebook API using Facepy](#Easier-usage-of-Facebook-API-using-Facepy)
- [Example Application: Retrive and process information of URLs](#Example-Application:-Retrive-and-process-information-of-URLs)
- [References](#References)

## Example of collecting data from Facebook without APIs

Before getting to know the Facebook API, let's try to collect some Facebook data directly by processing the HTML page. 

In [68]:
# Example: https://www.facebook.com/CocaColaUnitedStates/?brand_redir=427298164109099 (page of Coca-Cola)
# Caution: This is a fragile piece of code. It might not work for some Facebook pages. 
import requests
from bs4 import BeautifulSoup

#like count of this page
likeCount = 0
# follow count of this page
followCount = 0

html = requests.get('https://www.facebook.com/CocaColaUnitedStates/?brand_redir=427298164109099').content
soup = BeautifulSoup(html, 'html.parser')
engagementRelated = soup.find_all('div', class_='_4bl9')
for tag in engagementRelated:
    # print(cur)
    cur = tag.text.strip()
    if 'like' in cur:
        cur = cur.replace(',', '')
        likeCount = int(cur.split()[0])
    if 'follow' in cur:
        cur = cur.replace(',', '')
        followCount = int(cur.split()[0])
        
print('Like Count:', likeCount)
print('Follow Count:', followCount, '\n')

# get posts
posts = soup.find_all('div', class_='userContent')
for p in posts:
    print(p.text)

Like Count: 107427548
Follow Count: 107300006 

There's no one quite like you. Or her. Or him. Or them. The world is filled with over 7 billion unique yous. And while we're all different, there's a Coke for every single one of us. #EnjoyYours
No wrong choice in this #FinalFour. #EnjoyYours #MarchMadness
The ultimate score. #EnjoyYours #MarchMadness
Sipping while swishing. #EnjoyYours #MarchMadness
When #SpringBreak is so close, you can taste it. #CocaColaLife
Go for the crisp, refreshing gold. #EnjoyYours #WinterOlympics #PyeongChang2018 🏅❄️
True love is sharing your Coke. #ValentinesDay ❤️️
Three mural characters, three bottles of Coke, three winners. #EnjoyYours
There's a Coke for you. #EnjoyYours
Cheers to @AlabamaFTBL for taking home their 17th #NationalChampionship title! Congratulations on an incredible season. #RollTide
#HappyNewYears from @MadilynBailey and all of us here at Coca-Cola. ❤️️
Ice-cold duties deserve an ice-cold reward.  Who’s the loved one you’d like to thank for 

As this short snippet shows, getting information directly from a Facebook page is not easy: you have to look up the div, class and similar details by hand, and most of the time you still need extra work to separate out the information you want. The filtering in this example also breaks easily if a keyword (e.g. like, follow) appears in other text with the same class name. In addition, not all the information we need about the social network is displayed directly on a page (e.g. the number of times a URL is shared). 
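
One way to make the keyword filter above less brittle is to match the whole counter phrase rather than a bare keyword. The following is a minimal sketch, not part of the original scraper: the helper name `parse_engagement` is hypothetical, and the phrasing "N people like this" / "N people follow this" is an assumption about how the page renders its counters.

```python
import re

# Assumed counter phrasing; Facebook may render these differently.
COUNT_PATTERN = re.compile(r'^([\d,]+)\s+people\s+(like|follow)\s+this\b', re.IGNORECASE)

def parse_engagement(text):
    """Return (kind, count) if text looks like an engagement counter, else None."""
    match = COUNT_PATTERN.match(text.strip())
    if match is None:
        return None
    count = int(match.group(1).replace(',', ''))
    return match.group(2).lower(), count

print(parse_engagement('107,427,548 people like this'))
print(parse_engagement('People would like a number here'))
```

Text that merely contains the word "like" no longer overwrites the count, although the scraper still depends on class names and wording that Facebook can change at any time.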

Therefore, it is easier and more reliable to collect data from Facebook through the APIs it provides than by scraping Facebook pages directly. Next, we will go through how to use them. 

## Facebook API access

To access the Facebook API, follow the instructions below to create a Facebook app and obtain your app ID and app secret (https://developers.facebook.com/docs/apps/register). 

1. Create a Facebook account (if you don't have one) or log in to your Facebook account
2. [Upgrade Facebook personal account to developer account](https://developers.facebook.com/docs/apps/register#developer-account) (if you don't have one)
3. [Create a new app](https://developers.facebook.com/apps/) and obtain your app ID and secret (App Dashboard -> Settings -> Basic). Put the app ID and secret in a file, on the first and second lines respectively.
4. For ease of use, we also [get a user access token](https://developers.facebook.com/tools/explorer/) under this app and put it on the third line of the file. (Usually this token should be obtained by [setting up a login flow](https://developers.facebook.com/docs/facebook-login/manually-build-a-login-flow), but for convenience we create and save it directly in this tutorial. Saving an access token in a file is a very bad idea for real apps, and the token expires after a while.) 
<img src='token_1.png'>
<img src='token_2.png'>

5. For comparison, add your personal access token (also obtained with the [Graph API explorer](https://developers.facebook.com/tools/explorer/)) as the last line of the previous file (we will compare the two access tokens in a later section). 

In [1]:
"""
This is a helper function to get app ID and secret from the file you just created. 

Args: 
    filepath (string): path to the file in which you saved app ID and secret
    
Returns:
    appID (string)
    appSecret (string)
    accessToken (string)
    personalToken (string)
"""
def getAppIdSecret(filepath):
    
    with open(filepath, 'r') as f:
        # split() ignores the trailing newline that split('\n') would turn into a fifth, empty value
        appID, appSecret, accessToken, personalToken = f.read().split()
        return (appID, appSecret, accessToken, personalToken)

# substitute the 'appid.txt' with your own filename
# accessToken is a user token issued under the app, while personalToken is a plain user token (more on this later)
appID, appSecret, accessToken, personalToken = getAppIdSecret('appid.txt')
# print(appID)
# print(appSecret)
# print(accessToken)
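
Since storing tokens in a file is discouraged for real apps, a common alternative is to read them from environment variables. The sketch below is only an illustration: the function name `get_credentials_from_env` and the four variable names are hypothetical, not something Facebook or Facepy prescribes.

```python
import os

def get_credentials_from_env(env=os.environ):
    """Read the four values from environment variables instead of a file.

    The variable names below are hypothetical; pick whatever fits your setup.
    """
    names = ['FB_APP_ID', 'FB_APP_SECRET', 'FB_ACCESS_TOKEN', 'FB_PERSONAL_TOKEN']
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError('missing environment variables: ' + ', '.join(missing))
    return tuple(env[name] for name in names)

# A plain dict stands in for the real environment in this demonstration.
demo_env = {
    'FB_APP_ID': 'demo-id',
    'FB_APP_SECRET': 'demo-secret',
    'FB_ACCESS_TOKEN': 'demo-token',
    'FB_PERSONAL_TOKEN': 'demo-personal-token',
}
print(get_credentials_from_env(demo_env))
```

Called without arguments, the function reads from the real process environment and fails loudly if any value is missing.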

## Overview of Facebook API

The primary API for collecting data from the Facebook platform is the [Graph API](https://developers.facebook.com/docs/graph-api). Other Facebook APIs that need this data usually interact with, or are built on, the Graph API.  

The Graph API sees the Facebook social network as a graph containing nodes and edges. 
- node: An object in the social graph (for example, a user such as "me"). It has fields (node properties) holding the node's data, including the node ID, a unique identifier that can be used to query the node. 
- edge: A connection between objects (for example, a photo (node) belongs to "me" (node)). 

The Graph API is HTTP-based, so we can use HTTP GET requests to retrieve the desired data (the host URL is https://graph.facebook.com in most cases; for detailed documentation, see https://developers.facebook.com/docs/graph-api/using-graph-api). 

### Query a node and its fields

To read a node, you need its unique ID (for a URL, the unique ID can be the URL itself). The GET request takes the following form: `GET https://graph.facebook.com/{node id}?access_token={token}` (most requests to the Graph API require an access token; here we use the token we obtained in the previous step).

Below are examples using urllib to establish GET requests to the API and query the information of a node 'me' and a node of a [Wikipedia page](https://en.wikipedia.org/wiki/Yuzuru_Hanyu) respectively (the return result has two fields: name and id, and is presented in json format):

In [26]:
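# import the submodules explicitly; a bare `import urllib` does not guarantee they are loaded
import urllib.parse
import urllib.request
import json

host = "https://graph.facebook.com"
# we limit it to retrieving only five entries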
defaultParamDict = {"access_token": accessToken, "limit":5}

"""
This is a helper function for establishing HTTP GET request to Graph API and format the result to json object. 

Args:
    path (string): this is the desired query end point following the host URL
    paramDict (dict): this is a dictionary containing key, value pairs after the '?' in query. 
    (default is only access token we previously got and the limit of entries retrieved)
    
Returns:
    Result (json object): this is a json object containing the response from the API
"""
def apiQuery(path, paramDict=defaultParamDict):
    
    try:
        parameter = urllib.parse.urlencode(paramDict)
        # https://graph.facebook.com/{path}?access_token={token}
        url = "{host}{path}?{params}".format(host=host, path=path, params=parameter)
        r = urllib.request.urlopen(url).read()
        return json.loads(r)
    
    except Exception as e: 
        print(e)
        

# GET https://graph.facebook.com/me?access_token={token}
print(apiQuery('/me'))
# can also use {node ID} to query the same node
# GET https://graph.facebook.com/1884456408272942?access_token={token}
print(apiQuery('/1884456408272942'))
# GET https://graph.facebook.com/https://en.wikipedia.org/wiki/Yuzuru_Hanyu?access_token={token}
print(apiQuery('/https://en.wikipedia.org/wiki/Yuzuru_Hanyu'))

{'name': 'Jenny Wu', 'id': '1884456408272942'}
{'name': 'Jenny Wu', 'id': '1884456408272942'}
{'id': 'https://en.wikipedia.org/wiki/Yuzuru_Hanyu'}


As mentioned earlier, each node has fields related to it. To access that information, we query the node and specify the desired [field names](https://developers.facebook.com/docs/graph-api/reference/page). Just add 'fields' as the key and the list of wanted field names as the value at the end of the previous request. It takes this form: `GET https://graph.facebook.com/{node id}?access_token={token}&fields={names of field}`. The example here shows how to query 'birthday' and 'hometown' of node 'me'.

In [3]:
fields = 'birthday, hometown'
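# note: this is an alias, not a copy; keys set on fieldDict stay in defaultParamDict for later calls
fieldDict = defaultParamDict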
fieldDict['fields'] = fields
# GET https://graph.facebook.com/me?access_token={token}&fields=birthday, hometown
print(apiQuery('/me', paramDict=fieldDict))

{'birthday': '07/28/1995', 'hometown': {'id': '115217241824342', 'name': 'Hsinchu, Taiwan'}, 'id': '1884456408272942'}
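
In the cell above, `fieldDict = defaultParamDict` binds a second name to the same dictionary, so the 'fields' entry stays in the defaults for every later call. If that is not wanted, one option is to merge per-request parameters into a copy. The sketch below assumes a hypothetical helper `build_url` and a placeholder token; it only builds the URL and does not contact the API.

```python
from urllib.parse import urlencode

HOST = 'https://graph.facebook.com'
DEFAULT_PARAMS = {'access_token': 'YOUR_TOKEN', 'limit': 5}  # placeholder token

def build_url(path, **extra):
    """Merge per-request parameters into a copy of the defaults."""
    params = dict(DEFAULT_PARAMS)   # copy, so DEFAULT_PARAMS is never modified
    params.update(extra)
    return '{}{}?{}'.format(HOST, path, urlencode(params))

print(build_url('/me', fields='birthday,hometown'))
print(DEFAULT_PARAMS)  # unchanged: still only access_token and limit
```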


Field information can get complicated, since a field can itself contain objects. For instance, a URL node has an og_object field, which holds an Open Graph object. The code here demonstrates a more complicated query for the fields of a URL node.
- engagement{share_count}: Engagement contains social information. This provides the number of times the URL is shared. 
- [og_object](https://developers.facebook.com/docs/graph-api/reference/v2.12/url): 
 - title
 - engagement{count}: This is the number of likes the URL received. 
 - image: an image object with size and its own URL

In [4]:
fields = 'engagement{share_count}, og_object{title, engagement{count}, image}'
fieldDict = defaultParamDict
fieldDict['fields'] = fields
# GET https://graph.facebook.com/https://en.wikipedia.org/wiki/Yuzuru_Hanyu?access_token={token}&fields=engagement{share_count}, og_object{title, engagement{count}, image}
print(apiQuery('/https://en.wikipedia.org/wiki/Yuzuru_Hanyu', paramDict=fieldDict))

{'engagement': {'share_count': 434}, 'og_object': {'title': 'Yuzuru Hanyu - Wikipedia', 'engagement': {'count': 516}, 'image': [{'height': 400, 'url': 'https://upload.wikimedia.org/wikipedia/commons/9/93/Yuzuru_Hanyu-Sochi_2014.jpg', 'width': 267}], 'id': '1141550705861692'}, 'id': 'https://en.wikipedia.org/wiki/Yuzuru_Hanyu'}
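
Nested field strings like the one above are easy to mistype by hand. As a rough sketch, they can be generated from a nested Python structure; the helper `build_fields` and its spec format (a list of names or `(name, sub_spec)` tuples) are hypothetical conventions chosen for this illustration.

```python
def build_fields(spec):
    """Turn a nested spec into a Graph API 'fields' string.

    Each item is either a field name (str) or a (name, sub_spec) tuple.
    """
    parts = []
    for item in spec:
        if isinstance(item, str):
            parts.append(item)
        else:
            name, sub_spec = item
            # doubled braces produce literal '{' and '}' around the sub-fields
            parts.append('{}{{{}}}'.format(name, build_fields(sub_spec)))
    return ','.join(parts)

spec = [
    ('engagement', ['share_count']),
    ('og_object', ['title', ('engagement', ['count']), 'image']),
]
print(build_fields(spec))
```

The resulting string can be passed as the 'fields' value in the same way as the hand-written one.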


### Query an edge

Reading an edge with the Graph API means finding the collection of nodes connected to a specified node through a specific kind of connection. Therefore, to query an edge, we need to specify the node and the edge name. The request to the API takes this form: `GET https://graph.facebook.com/{node id}/{edge name}?access_token={token}`. The example shows how to retrieve the 'photos' of 'me' ('photos' is the edge name). The result is a set of photo nodes that are connected to 'me'. 

In [5]:
# GET https://graph.facebook.com/me/photos?access_token={token}
photos = apiQuery('/me/photos')
print(photos['data'], '\n')
print(photos['paging']['cursors'])

[{'id': '693101264127390'}, {'id': '693099920794191'}, {'id': '693098307461019'}, {'id': '693096817461168'}, {'id': '693096500794533'}] 

{'before': 'Tmprek1UQXhNalkwTVRJM016a3dPakUwTXpZAMk1UWTRNVEE2TXprME1EZAzVOalF3TmpRM09ETTIZD', 'after': 'Tmprek1EazJOVEF3TnprME5UTXpPakUwTXpZAMk1UWTRNVEE2TXprME1EZAzVOalF3TmpRM09ETTIZD'}


We can query each returned photo by its ID to get the link to the photo. 

In [6]:
fields = 'link'
fieldDict = defaultParamDict
fieldDict['fields'] = fields
for photo in photos['data']:
    result = apiQuery('/'+photo['id'], paramDict=fieldDict)
    print(result)

{'link': 'https://www.facebook.com/photo.php?fbid=693101264127390&set=p.693101264127390&type=3', 'id': '693101264127390'}
{'link': 'https://www.facebook.com/photo.php?fbid=693099920794191&set=p.693099920794191&type=3', 'id': '693099920794191'}
{'link': 'https://www.facebook.com/photo.php?fbid=693098307461019&set=p.693098307461019&type=3', 'id': '693098307461019'}
{'link': 'https://www.facebook.com/photo.php?fbid=693096817461168&set=p.693096817461168&type=3', 'id': '693096817461168'}
{'link': 'https://www.facebook.com/photo.php?fbid=693096500794533&set=p.693096500794533&type=3', 'id': '693096500794533'}


When we request the 'photos' of 'me', the returned json has a 'paging' section containing 'before' and 'after' cursors, as you can see in the example. Paging limits the size of a single query so that we are not overwhelmed with data, and the cursors indicate where to query next (think of the results as grouped into 'pages': each response returns one page and a pointer to the next). Therefore, if we want more entries of 'photos', we can follow the pointers and request the data on the next page. 

In [7]:
after = photos['paging']['cursors']['after']
fieldDict = defaultParamDict
fieldDict['after'] = after
nextPage = apiQuery('/me/photos', paramDict=fieldDict)
print(nextPage['data'], '\n')
print(nextPage['paging']['cursors'])

[{'link': 'https://www.facebook.com/photo.php?fbid=693096497461200&set=p.693096497461200&type=3', 'id': '693096497461200'}, {'link': 'https://www.facebook.com/photo.php?fbid=693058730798310&set=p.693058730798310&type=3', 'id': '693058730798310'}, {'link': 'https://www.facebook.com/ustrobotics/photos/a.854557934624910.1073741873.138596202887757/854559064624797/?type=3', 'id': '854559064624797'}, {'link': 'https://www.facebook.com/ustrobotics/photos/a.854557934624910.1073741873.138596202887757/854558777958159/?type=3', 'id': '854558777958159'}, {'link': 'https://www.facebook.com/ustrobotics/photos/a.854557934624910.1073741873.138596202887757/854558767958160/?type=3', 'id': '854558767958160'}] 

{'before': 'Tmprek1EazJORGszTkRZAeE1qQXdPakUwTXpZAMk1UWTRNVEE2TXprME1EZAzVOalF3TmpRM09ETTIZD', 'after': 'T0RVME5UVTROelkzT1RVNE1UWXdPakUwTWprMk5qazRNVFk2TXprME1EZAzVOalF3TmpRM09ETTIZD'}
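
Following the 'after' cursor by hand, as above, can be wrapped in a generator. This is a minimal sketch under stated assumptions: `iter_pages` is a hypothetical helper, `fetch` stands for any function with the same call shape as `apiQuery(path, paramDict)`, and the fake pages below replace the real API so the example runs offline.

```python
def iter_pages(fetch, path, params, max_pages=None):
    """Yield the 'data' list of each page, following 'after' cursors.

    Iteration stops on an empty page, a missing cursor, or after max_pages pages.
    """
    params = dict(params)           # work on a copy of the caller's parameters
    pages = 0
    while max_pages is None or pages < max_pages:
        response = fetch(path, params)
        data = response.get('data', []) if response else []
        if not data:
            return
        yield data
        pages += 1
        after = response.get('paging', {}).get('cursors', {}).get('after')
        if after is None:
            return
        params['after'] = after


# Stand-in for the real API: maps an 'after' cursor to a canned response.
FAKE_PAGES = {
    None: {'data': [{'id': '1'}, {'id': '2'}], 'paging': {'cursors': {'after': 'c1'}}},
    'c1': {'data': [{'id': '3'}], 'paging': {'cursors': {'after': 'c2'}}},
    'c2': {'data': []},
}

def fake_fetch(path, params):
    return FAKE_PAGES[params.get('after')]

for page in iter_pages(fake_fetch, '/me/photos', {'limit': 2}):
    print([node['id'] for node in page])
```

With the tutorial's own helper, the call would look like `iter_pages(apiQuery, '/me/photos', defaultParamDict)`; `max_pages` is a safety limit against very long edges.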


### Search a term in the graph

The Graph API also allows searching for terms among public objects in the graph. The request should be in this format: `GET https://graph.facebook.com/search?q={search term}&type={specified types to search for}`. Seven types were initially supported (post, user, page, event, group, place and checkin). However, due to tightened privacy policy, post and checkin were deprecated with the upgrade to Graph API v2.0. 

In [8]:
"""
This is a helper function that uses the search API Facebook provides. 

Args:
    q (string): the term to search
    searchType (string): the type of objects to search for
    
Returns:
    Result (list): the data field in the return json (a list of nodes)
"""
def searchTerm(q, searchType):
    
    fieldDict = defaultParamDict
    fieldDict['q'] = q
    fieldDict['type'] = searchType
    return apiQuery('/search', paramDict=fieldDict)['data']

print(searchTerm('Mark', 'user'), '\n')
print(searchTerm('disney', 'place'))

[{'link': 'https://www.facebook.com/app_scoped_user_id/10104732454694651/', 'id': '10104732454694651'}, {'link': 'https://www.facebook.com/app_scoped_user_id/1744771328894798/', 'id': '1744771328894798'}, {'link': 'https://www.facebook.com/app_scoped_user_id/10213864897728765/', 'id': '10213864897728765'}, {'link': 'https://www.facebook.com/app_scoped_user_id/1583583445079012/', 'id': '1583583445079012'}, {'link': 'https://www.facebook.com/app_scoped_user_id/10107952721743683/', 'id': '10107952721743683'}] 

[{'link': 'https://www.facebook.com/DisneyAulani/', 'id': '147898618589980'}, {'link': 'https://www.facebook.com/WaltDisneyWorld/', 'id': '155669083273'}, {'link': 'https://www.facebook.com/DisneySprings/', 'id': '140898365947669'}, {'link': 'https://www.facebook.com/pages/Disneyland-Los-Angeles-California/440437566338476', 'id': '440437566338476'}, {'link': 'https://www.facebook.com/pages/Disney-on-Ice/163314890924761', 'id': '163314890924761'}]


### Privacy and permission issues

Every query so far has been sent together with an access token. This is because of the privacy restrictions Facebook enforces on its users' data. If you request data that the access token does not grant access to, the API returns either a bad request error or an empty result. The example here shows an empty list returned because we do not have access rights to friends' user information with "an app's user access token".  

In [27]:
print(apiQuery('/me/friendlists'))

{'data': []}
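
Because a missing permission can show up either as an error or as an empty result, it can help to tell the two apart explicitly. The sketch below assumes that error responses carry a top-level `error` object with a `message`; the helper `classify_response` is hypothetical and the sample bodies are made up for illustration, not captured from the API.

```python
import json

def classify_response(body):
    """Classify a raw JSON response body as 'error', 'empty', or 'ok'."""
    parsed = json.loads(body)
    if 'error' in parsed:
        return 'error', parsed['error'].get('message', '')
    if parsed.get('data') == []:
        return 'empty', ''
    return 'ok', ''

# Illustrative bodies only.
print(classify_response('{"data": []}'))
print(classify_response('{"error": {"message": "Invalid OAuth access token.", "type": "OAuthException", "code": 190}}'))
print(classify_response('{"name": "Example", "id": "1"}'))
```

An 'empty' result does not prove that no data exists; as in the example above, it may simply mean the token lacks the needed permission.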


Facebook tightened its policies so that only friends who have also given this app the user_friends permission are shown in the query result above. Therefore, even if the user of this access token gives the app the user_friends permission, the app still can't get his/her friends' information. We can compare this with the result of querying with the same user's access token, but not under the app. 

In [30]:
# use user access token under app
print(apiQuery('/me/friends'), '\n')
# use personal user access token
personalAccess = {"access_token": personalToken}
personalResult = apiQuery('/me/friends', paramDict=personalAccess)
print(personalResult['data'], personalResult['summary'])

{'data': [], 'summary': {'total_count': 647}} 

[{'name': 'Harrison Ng', 'id': '568403382'}, {'name': 'Sung Kim', 'id': '10152332572534521'}, {'name': 'Eric Hsieh', 'id': '642156660'}, {'name': 'Peter Chung', 'id': '689157111'}, {'name': 'Ryan Lei', 'id': '800267179'}, {'name': 'Ken Cheng', 'id': '10203115537031599'}, {'name': 'Frances Lee', 'id': '10203190656355932'}, {'name': 'Long Hoang', 'id': '1569352433'}, {'name': 'Ruey-Lin Jahn', 'id': '100000056523864'}, {'name': 'Heron Yang', 'id': '100000094588500'}, {'name': 'Camille Hsu', 'id': '100000193686984'}, {'name': 'Yi-Chu Chen', 'id': '100000435919091'}, {'name': 'Pei-Chen Cheng', 'id': '926316524061466'}, {'name': 'Frank Shyu', 'id': '100000654759521'}, {'name': 'Jeff Hu', 'id': '625572670850662'}, {'name': 'Jonathan Beaulieu', 'id': '849612951800618'}, {'name': 'Emily Tu', 'id': '923183164444474'}, {'name': 'Jenny Kang', 'id': '579770425483139'}] {'total_count': 647}


## Easier usage of Facebook API using Facepy

Facepy is a Python wrapper that eases interaction with the Facebook API. It consists of three main parts: [Graph API](https://facepy.readthedocs.io/en/latest/usage/graph-api.html), [signed request](https://facepy.readthedocs.io/en/latest/usage/signed-requests.html) and [utilities](https://facepy.readthedocs.io/en/latest/usage/utilities.html). Since our focus is on collecting data from the Facebook platform, Graph API is the key part. 

### Installation of Package

Before getting started, we have to install the package from the Python Package Index (for detailed installation information, see https://facepy.readthedocs.io/en/latest/installation.html#installation). It can be installed with the `pip` command as follows (uncomment and run `!pip install facepy`): 

In [11]:
# Run pip install only for the first time!
# !pip install facepy

### Example usage of Facepy

Facepy simplifies the work we have to do to interact with the Facebook API. Let's redo the examples from the previous part, this time using Facepy. This will make it clear how Facepy makes the interaction easier and more intuitive. 

To use the Graph API, we initialize a graph using `facepy.GraphAPI` with the access token. This initialization sets up the access token, the host URL, and SSL certificate verification, so there is no need to construct the URL on every API access as we did in the previous part. 
```python
class facepy.GraphAPI(oauth_token=False, url='https://graph.facebook.com', verify_ssl_certificate=True)
```

After initializing `facepy.GraphAPI`, we can request the nodes, edges or fields we need through this graph, using the `get` function. The path parameter is basically the same as the one for apiQuery. Other Graph API parameters, such as fields or limit, are passed in through options. The returned item is either a Python dictionary holding the json-formatted response (single item) or a generator (paged items). When accessing the Graph API directly, we have to construct a new query from the retrieved next-page information (see the example in the previous section). Facepy makes this easier by returning a Python generator object when the page parameter is set to `True`. You can get the next page simply by calling next on the generator ("photos with pages" below is a simple example of how this works). 
```python
facepy.GraphAPI.get(path='', page=False, retry=3, **options)
```

In [2]:
from facepy import GraphAPI

# initialize a graph
graph = GraphAPI(accessToken)

# get node 'me'
me = graph.get('/me')
print(type(me))
print(me['id'], me['name'], '\n')

# get 'birthday, hometown' fields of 'me'
fieldsExample = graph.get('/me', fields='birthday, hometown')
print(fieldsExample['birthday'], fieldsExample['hometown'], '\n')

# get edge photos connected with me
photos = graph.get('/me/photos', limit='5')
print(photos['data'], '\n')

# photos with pages
photosWithPage = graph.get('/me/photos', page=True)
print(type(photosWithPage))
print(next(photosWithPage)['data'][:5])
print(next(photosWithPage)['data'][:5])

<class 'dict'>
1884456408272942 Jenny Wu 

07/28/1995 {'id': '115217241824342', 'name': 'Hsinchu, Taiwan'} 

[{'created_time': '2015-04-22T02:19:20+0000', 'id': '854559064624797'}, {'created_time': '2015-04-22T02:19:08+0000', 'id': '854558777958159'}, {'created_time': '2015-04-22T02:19:08+0000', 'id': '854558767958160'}, {'created_time': '2014-09-03T02:21:22+0000', 'id': '704696932944345'}, {'created_time': '2014-07-09T07:58:45+0000', 'name': 'Team Photo with CMA School!', 'id': '674960632584642'}] 

<class 'generator'>
[{'created_time': '2015-04-22T02:19:20+0000', 'id': '854559064624797'}, {'created_time': '2015-04-22T02:19:08+0000', 'id': '854558777958159'}, {'created_time': '2015-04-22T02:19:08+0000', 'id': '854558767958160'}, {'created_time': '2014-09-03T02:21:22+0000', 'id': '704696932944345'}, {'created_time': '2014-07-09T07:58:45+0000', 'name': 'Team Photo with CMA School!', 'id': '674960632584642'}]
[{'created_time': '2014-06-12T03:22:45+0000', 'name': 'The 13th annual MATE int
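
Because the paged result is an ordinary Python generator, standard tools such as `itertools` can be applied to it. The sketch below collects the first few items across pages; `take_items` is a hypothetical helper, and `fake_pages` is a stand-in for `graph.get('/me/photos', page=True)` so the example runs without a token.

```python
from itertools import chain, islice

def take_items(pages, n):
    """Return the first n items across a generator of page dicts."""
    items = chain.from_iterable(page['data'] for page in pages)
    return list(islice(items, n))

def fake_pages():
    # Stand-in for graph.get('/me/photos', page=True)
    yield {'data': [{'id': 'a'}, {'id': 'b'}]}
    yield {'data': [{'id': 'c'}, {'id': 'd'}]}
    yield {'data': [{'id': 'e'}]}

print(take_items(fake_pages(), 3))
```

Pages are pulled lazily, so the generator is only advanced as far as needed to collect `n` items.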

For the search function in the Graph API, Facepy also provides a `search` method on the class. The options parameter makes it much easier to pass additional query parameters than forming the URL by hand (e.g. the center and distance parameters for the place type). 
```python
facepy.GraphAPI.search(term, type, page=False, retry=3, **options)
```

In [44]:
print(graph.search('Mark', 'user')['data'][:5], '\n')
print(graph.search('disney', 'place', fields='about, category_list, fan_count')['data'][:5])

[{'name': 'Mark Zuckerberg', 'id': '10104732454694651'}, {'name': 'Mark Florence Arcamo', 'id': '1449214531874129'}, {'name': 'Mark', 'id': '1421895541248968'}, {'name': 'Mark Suster', 'id': '10155174658496816'}, {'name': 'Mark Jenkins', 'id': '10155384035983715'}] 

[{'about': 'Welcome to the official Facebook Page of the Disneyland Resort!   For questions and comments, please contact us at Disneyland Today: http://di.sn/60088HvPn', 'category_list': [{'id': '220626791295805', 'name': 'Amusement & Theme Park'}], 'fan_count': 17915292, 'id': '11081890741'}, {'about': 'Aloha from Aulani! You can contact us by phone at 866-443-4763 or visit us online at http://www.disneyaulani.com', 'category_list': [{'id': '187686707929197', 'name': 'Hotel Resort'}], 'fan_count': 634259, 'id': '147898618589980'}, {'about': "Explore a themed retail, dining, and entertainment complex inspired by Florida's charming waterfront towns and filled with unique boutiques, world-class restaurants, and cutting-edge 

## Example Application: Retrive and process information of URLs

Suppose we are about to create a new website. Before building our own, we would like to analyze which keywords can make a website "popular". Therefore, we collect social engagement data from Facebook for a set of URLs. To make later analysis convenient, we load the data into a pandas dataframe and export it to a .csv file. 

In [90]:
import pandas as pd

# our example set of URLs
URL_list = ['https://developers.facebook.com/docs/graph-api/overview/', 
            'https://towardsdatascience.com/how-to-use-facebook-graph-api-and-extract-data-using-python-1839e19d6999', 
            'https://www.finalwebsites.com/facebook-api-php-tutorial/', 
            'https://code.tutsplus.com/tutorials/wrangling-with-the-facebook-graph-api--net-23059', 
            'https://hackernoon.com/graphapi-get-query-fetch-public-facebook-page-feed-3-step-tutorial-example-access-token-auth-post-d7403c717fbf'
            ]

columns = ['URL', 'title', 'type', 'share', 'likes', 'update time', 'words']
data = []

for page in URL_list:
    
    result = graph.get(page, fields='engagement{share_count}, og_object{title, description, type, updated_time, engagement{count}}')
    resultOG = result['og_object']
    newRow = [page, resultOG['title'], resultOG['type'], result['engagement']['share_count'], resultOG['engagement']['count'], resultOG['updated_time'], resultOG['description']]
    data.append(newRow)
    
resultData = pd.DataFrame(data=data, columns=columns)
# keep only URLs with more than 10 likes (filter out the very unpopular ones)
resultData['likes'] = resultData['likes'].astype(int)
resultData.query('likes>10', inplace=True)
print(resultData)
resultData.to_csv('dataToAnalyze.csv')

                                                 URL  \
0  https://developers.facebook.com/docs/graph-api...   
2  https://www.finalwebsites.com/facebook-api-php...   
4  https://hackernoon.com/graphapi-get-query-fetc...   

                                               title     type  share  likes  \
0  Overview - Graph API - Documentation - Faceboo...  article    164   1018   
2  Facebook API Tutorial for PHP (Open Graph) | f...  article     56     56   
4  [GraphAPI] Fetch public Facebook page’s feed b...  article     10     11   

                update time                                              words  
0  2018-03-28T12:47:01+0000  The Graph API is the primary way to get data i...  
2  2018-03-20T11:59:58+0000  Creating applications with the Facebook PHP SD...  
4  2018-02-13T05:59:42+0000  This is a example to show how easy to query da...  
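
As a possible next step for the keyword analysis, each word in a description can be scored by the likes of the pages it appears on. This is only a sketch of one simple scoring idea, not the tutorial's method: `weighted_keywords`, the stopword list and the two sample rows are all made up for illustration.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list for illustration only.
STOPWORDS = {'the', 'is', 'to', 'a', 'an', 'and', 'of', 'in', 'with', 'this', 'how', 'for'}

def weighted_keywords(rows, top=5):
    """Sum 'likes' per distinct word across (description, likes) pairs."""
    scores = Counter()
    for description, likes in rows:
        words = set(re.findall(r'[a-z]+', description.lower())) - STOPWORDS
        for word in sorted(words):      # sorted for a stable insertion order
            scores[word] += likes
    return scores.most_common(top)

# Placeholder rows, not real query results.
rows = [('Graph API overview', 10), ('PHP tutorial for the Graph API', 5)]
print(weighted_keywords(rows, top=3))
```

With the dataframe built above, the rows could be supplied as `zip(resultData['words'], resultData['likes'])`.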


## References

1. https://gist.github.com/jkuruzovich/b8485a368f80a3b88df46326cf54bbce
2. https://developers.facebook.com/docs/graph-api 
3. https://facepy.readthedocs.io/en/latest/