# Facebook Ads Analysis 
---

**Limitations of using the Facebook Graph API to access the Facebook Ads library and perform data analysis on advertisement practices**

This notebook describes how a general user can access Facebook ads data through the Facebook Ads Library service and, more importantly, how to connect to the Graph API and collect advertisement data directly for data analysis.

## Accessing the Facebook API as **General user and Marketers**  
---

[Facebook Ads Library](https://www.facebook.com/ads/library/)

The service provides a search engine for looking up ads in the Facebook database. The search has two filters, **Country** and **Ad Type**. Important limitations:
- Only one country can be selected at a time.
- The service **only** allows searching `All ads` at once or the `Issues, Elections or Politics` category.

**Unfortunately, Facebook does not make the rest of its defined categories public; this should be followed up in data-related regulation conversations.**

Interface example:

<img src="assets/example2.png" alt="drawing" width="700"/>

Once the search is defined, one can browse all the ads related to the keywords entered. Additional filters then appear: Active/Inactive, Advertiser, Platform and Impressions by date.

<img src="assets/example3.png" alt="drawing" width="700"/>

One can see the details of each ad, but note that these details only show the ad identifier, the link to the page, and the ad content. **Information about the demographics of the audiences that the ad was shown to is NOT presented by Facebook.**

<img src="assets/example4.png" alt="drawing" width="800"/>

The last query cannot be reproduced through the API: Facebook **only** makes the `POLITICAL_AND_ISSUE_ADS` value of the `ad_type` parameter public, so the rest of the ads are not accessible via the API.

<img src="assets/example5.png" alt="drawing" width="800"/>

The following EU ads transparency technical report documents how the Facebook Graph API was used to analyse general election ads: https://adtransparency.mozilla.org/eu/methods/

**The main aim is to be able to collect evidence of unfair or illegal practices in advertising**


## Accessing the Facebook API as **Developer**  
---

[Facebook Graph API](https://www.facebook.com/ads/library/api)

The official documentation is found [here](https://developers.facebook.com/docs/apis-and-sdks) (APIs and SDKs).

UI for developers: https://www.facebook.com/ads/library/api

Further reading: https://towardsdatascience.com/how-to-use-facebook-graph-api-and-extract-data-using-python-1839e19d6999

1. Register as developer
2. Create an app
3. Create a Token for the new app
4. Define the Graph API Node to use: `ads_archive`
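
As a rough illustration of step 4, the sketch below assembles a request URL for the `ads_archive` node without sending it. The helper name `build_ads_archive_url` and the API version string are assumptions made for this example (check the current version in the developer documentation); the query parameters are the ones used later in this notebook.

```python
import json
from urllib.parse import urlencode

GRAPH_URL = "https://graph.facebook.com"


def build_ads_archive_url(token, search_terms, countries, fields, api_version="v12.0"):
    """Build (but do not send) a request URL for the ads_archive node.

    api_version is an assumed placeholder, not a recommendation.
    """
    params = {
        "access_token": token,
        "search_terms": search_terms,
        # The only ad type discussed in this notebook as publicly available.
        "ad_type": "POLITICAL_AND_ISSUE_ADS",
        # Countries are passed here as a JSON-encoded list, e.g. '["NL"]'.
        "ad_reached_countries": json.dumps(countries),
        "fields": ",".join(fields),
    }
    return f"{GRAPH_URL}/{api_version}/ads_archive?{urlencode(params)}"


# "TOKEN" is a placeholder for the token created in step 3.
url = build_ads_archive_url("TOKEN", "klimaat", ["NL"], ["page_id", "page_name"])
```

Building the URL separately makes it easy to inspect the exact query before spending any of the API rate limit on it.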


Interface example:

![](assets/example1.png)

In [1]:
import facebook
import requests
import urllib3
import json
import pandas as pd
import csv

Related project: https://github.com/minimaxir/facebook-ad-library-scraper

In [89]:
def main():
    # Replace "TOKEN" with the access token generated for your app.
    token = "TOKEN"
    graph = facebook.GraphAPI(token)
    # Define the parameters for the request to the ads_archive node.
    fields = ('page_id,page_name,ad_snapshot_url,ad_creative_body,'
              'region_distribution,demographic_distribution')
    profile = graph.get_object('ads_archive',
                               fields=fields,
                               search_terms='klimaat',
                               ad_type='POLITICAL_AND_ISSUE_ADS',
                               ad_reached_countries=['NL'])
    # Uncomment to inspect the raw response:
    # print(json.dumps(profile, indent=4))
    # Save the results as a JSON file.
    with open('data.json', 'w', encoding='utf-8') as f:
        json.dump(profile, f, ensure_ascii=False, indent=4)

if __name__ == '__main__':
    main()

# Reload the saved response (same encoding it was written with)
# and export the ad records to CSV.
with open('data.json', encoding='utf-8') as f:
    results = json.load(f)
df = pd.DataFrame(results["data"])
df.to_csv('data.csv')
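
The cell above stores a single response. Graph API responses are paginated: when more results exist, the response includes a `paging.next` URL. The sketch below follows those links until none is left. The names `collect_all_pages`, `fetch_json` and `max_pages` are hypothetical and introduced only for this example; `fetch_json` stands for any function that takes a URL and returns the decoded JSON, for instance a thin wrapper around `requests.get(url).json()`. Fake pages are used here so the example runs without a token.

```python
def collect_all_pages(first_page, fetch_json, max_pages=50):
    """Gather the "data" records from a first response and its follow-up pages."""
    records = list(first_page.get("data", []))
    next_url = first_page.get("paging", {}).get("next")
    pages = 1
    # max_pages is a safety cap so a misbehaving API cannot loop forever.
    while next_url and pages < max_pages:
        page = fetch_json(next_url)
        records.extend(page.get("data", []))
        next_url = page.get("paging", {}).get("next")
        pages += 1
    return records


# Fake two-page response standing in for the real API.
fake_pages = {"page2": {"data": [{"id": "3"}], "paging": {}}}
first = {"data": [{"id": "1"}, {"id": "2"}], "paging": {"next": "page2"}}

all_ads = collect_all_pages(first, fake_pages.__getitem__)
```

Passing the fetch function in as an argument keeps the paging logic testable offline and independent of the HTTP client used.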

In [90]:
# Transform to ads table

data = results["data"]
df_ads = pd.DataFrame(data)
df_ads = df_ads[['id','ad_creative_body','ad_snapshot_url','page_name','page_id']]

# One row per (ad, region); ads without a region breakdown are skipped.
rows = []
for ad in data:
    for reg in ad.get('region_distribution', []):
        rows.append((ad['id'], reg['percentage'], reg['region']))
df_reg_dist = pd.DataFrame(rows, columns=['id', 'perc_region', 'region'])

# One row per (ad, age range, gender); ads without a demographic breakdown are skipped.
rows = []
for ad in data:
    for dem in ad.get('demographic_distribution', []):
        rows.append((ad['id'], dem['percentage'], dem['age'], dem['gender']))
df_dem_dist = pd.DataFrame(rows, columns=['id', 'perc_demographic', 'age', 'gender'])

# Attach the ad details and convert the percentages to floats.
df_reg_dist = df_reg_dist.merge(df_ads, on='id', how='left')
df_reg_dist['perc_region'] = df_reg_dist['perc_region'].astype(float)
df_dem_dist = df_dem_dist.merge(df_ads, on='id', how='left')
df_dem_dist['perc_demographic'] = df_dem_dist['perc_demographic'].astype(float)

In [91]:
df_reg_dist.groupby('id').count()

| id | perc_region | region | ad_creative_body | ad_snapshot_url | page_name | page_id |
|---|---|---|---|---|---|---|
| 1023754321480469 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1034254717099339 | 12 | 12 | 12 | 12 | 12 | 12 |
| 175689150744699 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1865817646933817 | 12 | 12 | 12 | 12 | 12 | 12 |
| 217052436841392 | 12 | 12 | 12 | 12 | 12 | 12 |
| 348735506562218 | 12 | 12 | 12 | 12 | 12 | 12 |
| 3740557662648493 | 12 | 12 | 12 | 12 | 12 | 12 |
| 422025382390044 | 12 | 12 | 12 | 12 | 12 | 12 |
| 780399602577376 | 12 | 12 | 12 | 12 | 12 | 12 |


In [92]:
df_reg_dist[df_reg_dist['id'] == '175689150744699']

| | id | perc_region | region | ad_creative_body | ad_snapshot_url | page_name | page_id |
|---|---|---|---|---|---|---|---|
| 61 | 175689150744699 | 1.0 | Zuid-Holland | Het klimaatakkoord van Parijs is een opdracht,... | https://www.facebook.com/ads/archive/render_ad... | CDA | 320374518118 |


In [78]:
df_reg_dist[df_reg_dist['id'] == '1034254717099339'].sort_values('perc_region',ascending = False)

| | id | perc_region | region | ad_creative_body | ad_snapshot_url | page_name | page_id |
|---|---|---|---|---|---|---|---|
| 63 | 1034254717099339 | 0.264574 | Noord-Holland | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 66 | 1034254717099339 | 0.251121 | Zuid-Holland | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 68 | 1034254717099339 | 0.103139 | Gelderland | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 70 | 1034254717099339 | 0.103139 | North Brabant | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 64 | 1034254717099339 | 0.071749 | Utrecht | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 69 | 1034254717099339 | 0.049327 | Groningen | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 71 | 1034254717099339 | 0.049327 | Overijssel | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 61 | 1034254717099339 | 0.040359 | Drenthe | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 67 | 1034254717099339 | 0.026906 | Flevoland | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 62 | 1034254717099339 | 0.022422 | Limburg | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |


In [81]:
df_dem_dist[df_dem_dist['id'] == '1034254717099339'].sort_values('perc_demographic',ascending = False)

| | id | perc_demographic | age | gender | ad_creative_body | ad_snapshot_url | page_name | page_id |
|---|---|---|---|---|---|---|---|---|
| 74 | 1034254717099339 | 0.130045 | 25-34 | female | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 73 | 1034254717099339 | 0.121076 | 35-44 | male | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 82 | 1034254717099339 | 0.121076 | 25-34 | male | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 75 | 1034254717099339 | 0.103139 | 35-44 | female | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 78 | 1034254717099339 | 0.089686 | 45-54 | male | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 71 | 1034254717099339 | 0.080717 | 18-24 | female | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 77 | 1034254717099339 | 0.080717 | 45-54 | female | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 83 | 1034254717099339 | 0.080717 | 55-64 | male | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 79 | 1034254717099339 | 0.071749 | 18-24 | male | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |
| 81 | 1034254717099339 | 0.058296 | 55-64 | female | 💡 Wij willen kernenergie! Want het klimaat is ... | https://www.facebook.com/ads/archive/render_ad... | VVD | 121264564551002 |


## Analysis

- Proportion of unique ads by candidate (or page)
- Unique targets (ads targeted 100% at one group or region)
- Number of ads by page (top advertisers)
- Most shown by gender, most shown by age range
- n-grams per region (see https://www.cbc.ca/news/politics/facebook-political-ads-canadian-federal-election-1.5246710)
- Similarity of messages?
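
A few of the questions above can be sketched with pandas group-by operations. The rows below are hypothetical sample data that only mirror the column names of `df_reg_dist` and `df_dem_dist`; they are not results from the API, and the variable names are illustrative.

```python
import pandas as pd

# Hypothetical rows shaped like df_reg_dist (one row per ad and region).
sample_reg = pd.DataFrame({
    "id": ["a1", "a2", "a2", "a3"],
    "page_name": ["Page A", "Page B", "Page B", "Page A"],
    "region": ["Utrecht", "Utrecht", "Limburg", "Drenthe"],
    "perc_region": [1.0, 0.6, 0.4, 1.0],
})

# Hypothetical rows shaped like df_dem_dist (one row per ad, age range and gender).
sample_dem = pd.DataFrame({
    "id": ["a1", "a1", "a2", "a2"],
    "age": ["25-34", "35-44", "25-34", "25-34"],
    "gender": ["female", "male", "female", "male"],
    "perc_demographic": [0.7, 0.3, 0.55, 0.45],
})

# Number of unique ads per page, largest advertiser first.
ads_per_page = (
    sample_reg.groupby("page_name")["id"].nunique().sort_values(ascending=False)
)

# Ads whose whole audience was in a single region.
single_region = sample_reg[sample_reg["perc_region"] >= 1.0]

# For each ad, the age/gender group with the largest audience share.
top_group = sample_dem.loc[sample_dem.groupby("id")["perc_demographic"].idxmax()]
```

Counting with `nunique` on `id` matters here because the distribution tables hold several rows per ad, so a plain row count would overstate the number of ads.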

https://opop999.github.io/election_monitoring_fb_ads/  
https://github.com/robroc/facebook-political-ads/blob/master/facebook_political_ads_canada.ipynb  

Methodology layout (from the CBC analysis linked above):

CBC collected data on more than 36,000 ads using Facebook's Ad Library API. We searched the API by advertiser ID numbers, which were published in Facebook's daily Ad Library Report. Major non-political advertisers (those that sell products or services and ran at least 50 ads or spent at least $30,000 in 2019) were excluded from the analysis. These advertisers were mistakenly classified as political by Facebook's algorithms because their ad texts may contain words associated with political issues like "environment," "guns" and "economy."

Data analysis was done using the Python programming language. Word frequencies were found using the Natural Language Toolkit.

Laura Edelson of New York University and Jason Chuang of Mozilla provided valuable expertise.

The ad data and code used for the analysis can be found on GitHub. Got any questions about this story? Contact the reporter at roberto.rocha@cbc.ca or through Twitter: @robroc
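
The word-frequency step mentioned in the methodology can be approximated as follows. This is a minimal sketch that uses only the Python standard library rather than the Natural Language Toolkit, with a simple regular-expression tokenizer; the function name `ngram_counts` and the sample texts are hypothetical.

```python
import re
from collections import Counter


def ngram_counts(texts, n=2):
    """Count word n-grams across a collection of ad texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and split on word characters; punctuation is dropped.
        tokens = re.findall(r"\w+", text.lower())
        # Zip the token list against shifted copies of itself to form n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Hypothetical ad texts, for illustration only.
sample_texts = ["Clean energy now", "Clean energy for all"]
bigrams = ngram_counts(sample_texts, n=2)
```

To get n-grams per region, the same function could be applied to the `ad_creative_body` values of each region group separately.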

