# Connecting to the Facebook API

Written by Kat Chuang [@katychuang](http://twitter.com/katychuang)

## Objective

The goal of this exercise is to connect to the Facebook Graph API to collect information about my most recent posts, and also to collect each post's subsequent comments and likes.

## Setup

- I first created a virtual environment for my notebooks: `mkvirtualenv --python=/usr/local/bin/python3 dataAnalysis`
- Then I installed the notebook server on the machine. Instructions can be [found here](http://jupyter.org/install.html)
- Finally, I started the server from the root directory: `jupyter notebook`

In addition to getting the console ready, I saved my user ID and access token in a separate local folder, `_keys`, in a file named `facebook.py`. Inside, save two string variables like the following (the request below imports both):

```
USER_ID="XXXXXX"
ACCESS_TOKEN="XXXXXX"
```


## Making a request

In [1]:
from _keys.facebook import USER_ID, ACCESS_TOKEN
import requests

u = 'https://graph.facebook.com/v2.8/{}/posts?access_token={}'.format(USER_ID, ACCESS_TOKEN)
r = requests.get(u)
data = r.json()  # reuse the response instead of sending a second request

print(data["data"][0], "\n")



{'message': "Yesterday morning I woke up surprisingly early after going to sleep late coding. Did weekly laundry and made breakfast, it's been awhile since I've cooked. Had mushrooms leftover from making pasta earlier in the week so I used it up for breakfast. \n\nHot sauce covered scrambled eggs with mushroom and bacon with a large cup of coffee to code more. I like to use a mug warmer to keep my beverages hot 😁\n\nOne of these days I'm going to learn to make omurice!! With brown rice instead of white and hot sauce instead of ketchup. \n\nhttps://www.youtube.com/watch?v=bcJlmhoYNfI&t=6s", 'story': 'Kat Chuang added a new photo to album: MMXVII.', 'created_time': '2017-03-19T22:23:36+0000', 'id': '10103434986805726_10106726635894566'} 
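Formatting the token straight into the URL works as long as it contains no characters that need escaping. A minimal sketch of building the same URL with the standard library's `urlencode`, which percent-encodes the token, is below. The helper name `build_posts_url` and the placeholder values are assumptions for illustration, not part of the original notebook.

```python
from urllib.parse import urlencode

GRAPH_ROOT = "https://graph.facebook.com"  # same host as the request above


def build_posts_url(user_id, access_token, version="v2.8"):
    """Return the /posts URL for a user, with the token percent-encoded."""
    query = urlencode({"access_token": access_token})
    return "{}/{}/{}/posts?{}".format(GRAPH_ROOT, version, user_id, query)


# Placeholder values, not real credentials.
print(build_posts_url("12345", "abc|def"))
```

The resulting string can be passed to `requests.get` exactly like `u` in the cell above.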



The JSON response is saved in the variable `data`, a dictionary whose `"data"` key holds a list of dictionaries, one per post, so we can use ordinary list and dictionary indexing to access information for each story. Here are the two most recent posts' IDs and timestamps.

In [2]:
# Two most recent posts:
print(data['data'][0]['id'], data['data'][0]['created_time'])
print(data['data'][1]['id'], data['data'][1]['created_time'])

10103434986805726_10106726635894566 2017-03-19T22:23:36+0000
10103434986805726_10106726518130566 2017-03-19T21:45:38+0000
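Indexing each post by hand gets repetitive beyond two posts. The sketch below loops over the list instead and stops early if the payload carries an error. The `sample` dictionary reuses the two IDs and timestamps printed above; the helper name `recent_posts` and the shape of the error payload (an `"error"` key holding a `"message"`) are assumptions.

```python
# Sample shaped like the response above; the IDs and timestamps are the
# two values printed in the previous cell.
sample = {
    "data": [
        {"id": "10103434986805726_10106726635894566",
         "created_time": "2017-03-19T22:23:36+0000"},
        {"id": "10103434986805726_10106726518130566",
         "created_time": "2017-03-19T21:45:38+0000"},
    ]
}


def recent_posts(payload, n=2):
    """Return (id, created_time) pairs for the first n posts in payload."""
    if "error" in payload:
        # Assumed error shape: {"error": {"message": "..."}}
        raise RuntimeError(payload["error"].get("message", "unknown error"))
    return [(p["id"], p["created_time"]) for p in payload.get("data", [])[:n]]


for post_id, created in recent_posts(sample):
    print(post_id, created)
```

With the live response, `recent_posts(data)` would take the place of `recent_posts(sample)`.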


## Parsing data

The next step is to save this data for analysis. We could save it into a text file or into a database. Since this is a small amount of data (n=25), I chose to iterate quickly on the code by loading the JSON response directly into a pandas DataFrame.
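As a rough sketch of that loading step, assuming pandas 1.0 or later is installed, `pd.json_normalize` turns a list of post dictionaries into one row per post. The two records below are shaped like the response shown earlier; the second one deliberately omits `story` to show that missing keys become empty cells.

```python
import pandas as pd

# Two records shaped like posts in the response; only a few keys are kept.
records = [
    {"id": "10103434986805726_10106726635894566",
     "created_time": "2017-03-19T22:23:36+0000",
     "story": "Kat Chuang added a new photo to album: MMXVII."},
    {"id": "10103434986805726_10106726518130566",
     "created_time": "2017-03-19T21:45:38+0000"},
]

df = pd.json_normalize(records)  # one row per post, one column per key
df["created_time"] = pd.to_datetime(df["created_time"])  # strings to timestamps
print(df.shape)
print(df.dtypes)
```

With the live response, the same call would be `pd.json_normalize(data["data"])`.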

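The pandas function `json_normalize` can flatten the JSON response into one row per post. A useful column to derive along the way is the day of the week of each post's `created_time`: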


In [3]:
from dateutil import parser

# Parse a timestamp string and return the name of its day of the week
def scrub(timestamp):
    d = parser.parse(timestamp)
    return dow(d)

# Return the day-of-week name for a date (date.weekday() gives 0 for Monday)
def dow(date):
    days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
    return days[date.weekday()]

[scrub(post['created_time']) for post in data['data']]

['Sunday',
 'Sunday',
 'Sunday',
 'Sunday',
 'Friday',
 'Thursday',
 'Wednesday',
 'Sunday',
 'Sunday',
 'Saturday',
 'Saturday',
 'Friday',
 'Friday',
 'Friday',
 'Thursday',
 'Thursday',
 'Sunday',
 'Sunday',
 'Saturday',
 'Thursday',
 'Wednesday',
 'Tuesday',
 'Wednesday',
 'Friday',
 'Saturday']
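
A list of 25 day names is hard to read at a glance. One possible follow-up, sketched here with the standard library's `collections.Counter`, is to tally posts per weekday. The `days` list is copied from the output above; the variable names are illustrative.

```python
from collections import Counter

# Day names copied from the output above (25 posts, in response order).
days = [
    "Sunday", "Sunday", "Sunday", "Sunday", "Friday",
    "Thursday", "Wednesday", "Sunday", "Sunday", "Saturday",
    "Saturday", "Friday", "Friday", "Friday", "Thursday",
    "Thursday", "Sunday", "Sunday", "Saturday", "Thursday",
    "Wednesday", "Tuesday", "Wednesday", "Friday", "Saturday",
]

counts = Counter(days)

# Print weekdays from most to least frequent.
for day, n in counts.most_common():
    print(day, n)
```

In the notebook itself, `days` would be the list returned by the previous cell rather than a hard-coded copy.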