# Fetch ads archive from Facebook API

Test on the Graph explorer:
* https://developers.facebook.com/tools/explorer
* `ads_archive?fields=ad_creative_body%2Cad_creation_time%2Cad_creative_link_caption%2Cad_creative_link_description%2Ccurrency%2Cfunding_entity%2Cimpressions%2Cad_snapshot_url%2Cpage_id%2Cpage_name%2Cspend&search_terms=''&ad_reached_countries=['FR']&limit=25`

Documentation:
* https://www.facebook.com/ads/library/?active_status=all&ad_type=political_and_issue_ads&country=FR
* https://www.facebook.com/ads/library/api/?source=archive-landing-page
    

In [1]:
import collections
import pprint

import pandas

import fetch

## Fetch using a single request with an empty search

In [4]:
ads_active = fetch.fetch(country_code='GB', search_params={'search_terms': "''"})
len(ads_active)

Got 2500 ads
Got 307 ads
Got 0 ads


2807

In [5]:
ads_all = fetch.fetch(country_code='GB', search_params={'search_terms': "''", 'ad_active_status': 'ALL'})
len(ads_all)

Got 2500 ads
Got 2529 ads
Got 2537 ads
Got 2501 ads
Got 2501 ads
Got 2536 ads
Got 2501 ads
Got 2501 ads
Got 2501 ads
Got 2572 ads
Got 2501 ads
Got 2501 ads
Got 2501 ads
Got 2501 ads
Got 2501 ads
Got 2500 ads
Got 2503 ads
Got 2520 ads
Got 2501 ads
Got 2502 ads
Got 2138 ads
Got 0 ads


52348

In [7]:
unknown_ads = [
    ad
    for ad in ads_active
    if ad not in ads_all
]
print('A few ads ({}) are not found in the global search.'.format(len(unknown_ads)))

A few ads (17) are not found in the global search.
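
The comprehension above scans `ads_all` once per active ad, so it performs up to 2807 × 52348 dict comparisons. Below is a minimal sketch of a set-based variant, assuming each ad is a JSON-serializable dict (as the API responses appear to be). `ad_key` and `find_missing` are hypothetical helper names and are not part of the `fetch` module.

```python
import json


def ad_key(ad):
    """Return a hashable fingerprint of an ad dict, independent of key order."""
    return json.dumps(ad, sort_keys=True)


def find_missing(candidates, reference):
    """Return the ads in `candidates` that have no exact match in `reference`."""
    reference_keys = {ad_key(ad) for ad in reference}
    return [ad for ad in candidates if ad_key(ad) not in reference_keys]


# Toy data standing in for ads_active / ads_all (assumption: same dict shape).
sample_all = [{'page_id': '1', 'spend': {'lower_bound': '0', 'upper_bound': '99'}}]
sample_active = sample_all + [
    {'page_id': '2', 'spend': {'lower_bound': '0', 'upper_bound': '99'}}
]
missing = find_missing(sample_active, sample_all)
print(len(missing))
```

Like the original check, this compares whole records, so an ad also counts as missing when any field (for instance the impressions range) changed between the two requests. That might account for some of the 17 ads, although the notebook does not confirm it.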


In [14]:
df = pandas.DataFrame(ads_all)
df

Unnamed: 0,ad_creation_time,ad_creative_body,ad_creative_link_caption,ad_creative_link_description,ad_creative_link_title,ad_delivery_start_time,ad_delivery_stop_time,ad_snapshot_url,currency,demographic_distribution,funding_entity,impressions,page_id,page_name,region_distribution,spend
0,2019-05-16T05:58:42+0000,"Want to work with animals, as a Zookeeper at a...",,,,2019-05-16T05:58:42+0000,2019-05-19T12:35:56+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.413681', 'age': '13-17', 'g...",Jason Alexander Hendry,"{'lower_bound': '0', 'upper_bound': '999'}",2345328912417791,African Wildlife Courses,"[{'percentage': '0.878594', 'region': 'England...","{'lower_bound': '0', 'upper_bound': '99'}"
1,2019-05-16T05:55:48+0000,Don't miss the Boat ! Vote for an Independent ...,,,,2019-05-16T05:55:52+0000,2019-05-17T05:55:48+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.298246', 'age': '25-34', 'g...",Robert John Davis,"{'lower_bound': '0', 'upper_bound': '999'}",2354803371253384,Jason McMahon MEP Campaign Page,"[{'percentage': '0.982456', 'region': 'England...","{'lower_bound': '0', 'upper_bound': '99'}"
2,2019-05-16T05:50:37+0000,"Want to work with animals, as a Zookeeper at a...",,,,2019-05-16T05:50:37+0000,2019-05-19T12:35:56+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.414634', 'age': '13-17', 'g...",Jason Alexander Hendry,"{'lower_bound': '0', 'upper_bound': '999'}",2345328912417791,African Wildlife Courses,"[{'percentage': '0.95122', 'region': 'England'...","{'lower_bound': '0', 'upper_bound': '99'}"
3,2019-05-16T04:46:18+0000,Labour are working with the Tories to deliver ...,,,Vote Change UK on the 23rd of May,2019-05-16T04:46:18+0000,2019-05-22T22:59:18+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.034976', 'age': '25-34', 'g...",Change UK - The Independent Group,"{'lower_bound': '0', 'upper_bound': '999'}",395151614382309,Change UK - The Independent Group,"[{'percentage': '0.832037', 'region': 'England...","{'lower_bound': '0', 'upper_bound': '99'}"
4,2019-05-16T04:46:18+0000,Labour are working with the Tories to deliver ...,,,Vote Change UK on the 23rd of May,2019-05-16T04:46:18+0000,2019-05-22T22:59:18+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.03003', 'age': '65+', 'gend...",Change UK - The Independent Group,"{'lower_bound': '0', 'upper_bound': '999'}",395151614382309,Change UK - The Independent Group,"[{'percentage': '0.906585', 'region': 'England...","{'lower_bound': '0', 'upper_bound': '99'}"
5,2019-05-16T04:46:18+0000,Labour are working with the Tories to deliver ...,,,Vote Change UK on the 23rd of May,2019-05-16T04:46:18+0000,2019-05-22T22:59:18+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.199815', 'age': '65+', 'gen...",Change UK - The Independent Group,"{'lower_bound': '1000', 'upper_bound': '4999'}",395151614382309,Change UK - The Independent Group,"[{'percentage': '0.875229', 'region': 'England...","{'lower_bound': '0', 'upper_bound': '99'}"
6,2019-05-16T04:46:18+0000,Labour are working with the Tories to deliver ...,,,Vote Change UK on the 23rd of May,2019-05-16T04:46:18+0000,2019-05-22T22:59:18+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.090631', 'age': '55-64', 'g...",Change UK - The Independent Group,"{'lower_bound': '1000', 'upper_bound': '4999'}",395151614382309,Change UK - The Independent Group,"[{'percentage': '0.867513', 'region': 'England...","{'lower_bound': '0', 'upper_bound': '99'}"
7,2019-05-16T04:46:18+0000,Labour are working with the Tories to deliver ...,,,Vote Change UK on the 23rd of May,2019-05-16T04:46:18+0000,2019-05-22T22:59:18+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.115', 'age': '45-54', 'gend...",Change UK - The Independent Group,"{'lower_bound': '1000', 'upper_bound': '4999'}",395151614382309,Change UK - The Independent Group,"[{'percentage': '0.84623', 'region': 'England'...","{'lower_bound': '0', 'upper_bound': '99'}"
8,2019-05-16T04:46:12+0000,Chuka kicked off our first election rally in L...,,,Chuka Election Rally,2019-05-16T04:46:12+0000,2019-05-22T22:59:18+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.000249', 'age': '65+', 'gen...","The Independent Group (TIG) Ltd, company numbe...","{'lower_bound': '5000', 'upper_bound': '9999'}",395151614382309,Change UK - The Independent Group,"[{'percentage': '0.851747', 'region': 'England...","{'lower_bound': '0', 'upper_bound': '99'}"
9,2019-05-16T04:46:12+0000,Heidi speaking at our first UK rally in London...,,,,2019-05-16T04:46:12+0000,2019-05-22T22:59:18+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.103425', 'age': '18-24', 'g...",Change UK - The Independent Group,"{'lower_bound': '5000', 'upper_bound': '9999'}",395151614382309,Change UK - The Independent Group,"[{'percentage': '0.823167', 'region': 'England...","{'lower_bound': '0', 'upper_bound': '99'}"


In [15]:
df[df['page_name']=='Disabled Lives Matter']

Unnamed: 0,ad_creation_time,ad_creative_body,ad_creative_link_caption,ad_creative_link_description,ad_creative_link_title,ad_delivery_start_time,ad_delivery_stop_time,ad_snapshot_url,currency,demographic_distribution,funding_entity,impressions,page_id,page_name,region_distribution,spend
296,2019-05-14T21:12:52+0000,https://www.change.org/p/give-victims-support-...,change.org,Episode 13 - Are HM Employment Judges Involved...,ARE HM EMPLOYMENT JUDGES INVOLVED IN CORRUPTIO...,2019-05-14T21:13:04+0000,2019-05-24T21:12:52+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.101695', 'age': '18-24', 'g...",Craig Chant,"{'lower_bound': '0', 'upper_bound': '999'}",195394467932052,Disabled Lives Matter,"[{'percentage': '1', 'region': 'England'}]","{'lower_bound': '0', 'upper_bound': '99'}"
1076,2019-05-11T20:54:07+0000,Please show some care & respect for the disabl...,change.org,Following the previous video covering my autis...,Solicitor & Barrister Threaten Autistic Disabl...,2019-05-11T20:54:20+0000,2019-05-21T20:54:07+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.015288', 'age': '55-64', 'g...",Craig Chant,"{'lower_bound': '1000', 'upper_bound': '4999'}",195394467932052,Disabled Lives Matter,"[{'percentage': '1', 'region': 'England'}]","{'lower_bound': '0', 'upper_bound': '99'}"
1726,2019-05-10T13:51:04+0000,Following the previous video covering my autis...,change.org,Following the previous video covering my autis...,Solicitor & Barrister Threaten Autistic Disabl...,2019-05-10T13:51:08+0000,2019-05-30T13:51:04+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.00207', 'age': '55-64', 'ge...",Craig Chant,"{'lower_bound': '0', 'upper_bound': '999'}",195394467932052,Disabled Lives Matter,"[{'percentage': '1', 'region': 'England'}]","{'lower_bound': '0', 'upper_bound': '99'}"
5903,2019-04-30T19:59:45+0000,#DisabilityNotDisturbing #DisabledLivesMatter ...,,,,2019-04-30T19:59:47+0000,2019-05-10T19:59:45+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.001515', 'age': '35-44', 'g...",Craig Chant,"{'lower_bound': '1000', 'upper_bound': '4999'}",195394467932052,Disabled Lives Matter,"[{'percentage': '0.664201', 'region': 'England...","{'lower_bound': '0', 'upper_bound': '99'}"
12350,2019-04-15T17:27:07+0000,The fight for justice continues\nDISABLED LIVE...,change.org,Police &amp; Crime Commissioner Campaign Updat...,Police & Crime Commissioner Campaign Update - ...,2019-04-15T17:27:16+0000,2019-04-25T17:27:07+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.000515', 'age': '25-34', 'g...",Craig Chant,"{'lower_bound': '1000', 'upper_bound': '4999'}",195394467932052,Disabled Lives Matter,"[{'percentage': '0.851206', 'region': 'England...","{'lower_bound': '0', 'upper_bound': '99'}"
14679,2019-04-07T19:56:04+0000,The fight for justice continues\nDISABLED LIVE...,change.org,Police &amp; Crime Commissioner Campaign Updat...,Police & Crime Commissioner Campaign Update - ...,2019-04-07T19:56:04+0000,2019-04-15T12:35:00+0000,https://www.facebook.com/ads/archive/render_ad...,GBP,"[{'percentage': '0.002208', 'age': '45-54', 'g...",Craig Chant,"{'lower_bound': '1000', 'upper_bound': '4999'}",195394467932052,Disabled Lives Matter,"[{'percentage': '0.018262', 'region': 'Norther...","{'lower_bound': '0', 'upper_bound': '99'}"
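
The `impressions` and `spend` columns hold dicts with string bounds such as `{'lower_bound': '0', 'upper_bound': '999'}`, which cannot be sorted or aggregated directly. The sketch below flattens such a column into two numeric columns. It assumes every value has the shape shown in the output above and tolerates a missing bound by producing `NaN`; `add_bound_columns` is a hypothetical helper, and the sample frame is toy data.

```python
import pandas


def add_bound_columns(frame, column):
    """Return a copy of `frame` with numeric `<column>_lower` / `<column>_upper` columns."""
    result = frame.copy()
    for bound in ('lower', 'upper'):
        # Pull the string bound out of each dict, then convert it to a number.
        raw = result[column].apply(
            lambda value: value.get(bound + '_bound') if isinstance(value, dict) else None
        )
        result['{}_{}'.format(column, bound)] = pandas.to_numeric(raw, errors='coerce')
    return result


# Toy frame standing in for `df` (assumption: same dict shape as the API output).
sample = pandas.DataFrame({
    'page_name': ['A', 'B'],
    'impressions': [
        {'lower_bound': '0', 'upper_bound': '999'},
        {'lower_bound': '1000', 'upper_bound': '4999'},
    ],
})
flat = add_bound_columns(sample, 'impressions')
print(flat[['page_name', 'impressions_lower', 'impressions_upper']])
```

With the notebook's data, `add_bound_columns(df, 'spend')` would be the analogous call.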


## Search by page

In [5]:
page_ids = set(df['page_id'])
page_names = set(df['page_name'])
print('The global search returned {} different pages'.format(len(page_ids)))

The global search returned 248 different pages


In [48]:
# Warning: this is likely to trigger the rate limiting
for page_id in page_ids:
    print('Search for page {}'.format(page_id))

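    # `fetch` is the imported module, so call fetch.fetch() as in the cells above
    ads_page = fetch.fetch(country_code='GB', search_params={'search_page_ids': page_id})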
    
    nb_ads_global_search = len(df[df['page_id']==page_id])
    nb_ads_page_search = len(ads_page)

    if nb_ads_global_search != nb_ads_page_search:
        page_name = list(df[df['page_id']==page_id]['page_name'])[0]
        print('The numbers do not match for page {}: {} in page search vs {} in global search'.format(
            page_name, nb_ads_page_search, nb_ads_global_search
        ))


305531639643278
1934566156780439
407402162736656
167962303683318
1834548586875556
138435322972255
298398717018170
365430600210847
741181696045328
2317233945166352
176178532420023
883296735214769
102734363097680
678120165658753
30239959348
360322167647280
137385490131
310039442421596
452897548398251
201951136994785
228249960960213
386752004789933
435340366822684
2235255886787452
367006363463686
329262860438866
249840792393105
450568105101647
6587671199
491862484305741
485789368166508
146383495502578
111257452230362
1799561767025031
1361286460616795
211043423125695
17373130431
555226237822956
308313479373888
685145698338214
2257604454310952
105732052794210
1807970796086865
1168773666629907
226189004178864
199108670575850
324070087738168
396013737832767
327226704149899
Discrepancy!!
327226704149899 East Herts Green Party 1 0
111956985487415
123817579216
1106278742785368
474613052643462
9250349228
1793510380925494
535129226536200
679879869119530
2041107186182812
628368003989904
15713517832

AssertionError: 

In general, searching for a specific page does not seem to return more results than the global query.
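
Because the per-page loop is likely to trigger rate limiting, one option is to count the ads per page once and pause between requests. The sketch below illustrates that idea only: `compare_page_counts`, `fake_fetch_page`, and the fixed pause are assumptions, and the actual rate limits of the API are not documented in this notebook.

```python
import collections
import time


def compare_page_counts(global_counts, fetch_page, pause_seconds=1.0):
    """Return {page_id: (page_search_count, global_count)} for pages whose counts differ.

    `fetch_page` is any callable that takes a page id and returns a list of ads.
    """
    mismatches = {}
    for page_id, global_count in global_counts.items():
        page_count = len(fetch_page(page_id))
        if page_count != global_count:
            mismatches[page_id] = (page_count, global_count)
        time.sleep(pause_seconds)  # crude throttle between requests
    return mismatches


# Toy data; in the notebook the counts would be computed from ads_all.
sample_ads = [{'page_id': '1'}, {'page_id': '1'}, {'page_id': '2'}]
global_counts = collections.Counter(ad['page_id'] for ad in sample_ads)


def fake_fetch_page(page_id):
    """Stand-in for a real page search: returns one extra ad for page '2'."""
    extra = 1 if page_id == '2' else 0
    return [{'page_id': page_id}] * (global_counts[page_id] + extra)


mismatches = compare_page_counts(global_counts, fake_fetch_page, pause_seconds=0)
print(mismatches)
```

With real data, the callable could wrap the call used in the earlier cells, for example `lambda page_id: fetch.fetch(country_code='GB', search_params={'search_page_ids': page_id})`, assuming `fetch.fetch` accepts those arguments for page searches as well.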

## Most common values

In [18]:
def find_most_common(field):
    """Print the 20 most frequent values of `field` across all fetched ads."""
    values = [
        ad[field]
        for ad in ads_all
        if field in ad
    ]
    counter = collections.Counter(values)
    pprint.pprint(counter.most_common(20))

In [19]:
find_most_common('funding_entity')

[('The Conservative Party', 4597),
 ("Britain's Future", 4318),
 ("People's Vote", 3545),
 ('the Liberal Democrats', 3181),
 ('Friends of the Earth', 1652),
 ('The Labour Party', 908),
 ('Best for Britain', 866),
 ('Conservatives', 767),
 ('John Mccabe', 693),
 ('Influence Digital Ltd', 639),
 ('38 Degrees', 632),
 ('Change UK - The Independent Group', 558),
 ('Right to Vote', 358),
 ('Our West Lancashire', 350),
 ('Global Justice Now', 347),
 ('We are the 52%', 278),
 ('Terence Brotheridge', 269),
 ('Bedford Liberal Democrats', 226),
 ('The Independent Group (TIG) Ltd, company number 11770529, a registered '
  'company in England and Wales.',
  223),
 ('Friends of the Earth England, Wales and Northern Ireland', 189)]


In [20]:
find_most_common('page_name')

[("Britain's Future", 4488),
 ('Liberal Democrats', 3998),
 ("People's Vote UK", 3547),
 ('Friends of the Earth', 2412),
 ('Conservatives', 2224),
 ('Best For Britain', 882),
 ('Change UK - The Independent Group', 746),
 ('McCabe 4 Mayor', 693),
 ('The Labour Party', 663),
 ('38 Degrees', 656),
 ('Primavera Sound', 637),
 ('Andy Street', 635),
 ('Paul Bristow', 376),
 ('Global Justice Now', 365),
 ('Right To Vote', 358),
 ('Our West Lancashire', 350),
 ('Property & Business Info', 287),
 ('Andrew Stephenson MP', 280),
 ('We are the 52%', 278),
 ('Amber Rudd MP', 265)]
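
The two rankings do not line up one to one: in the table above, the page "Change UK - The Independent Group" appears with two different `funding_entity` values. The sketch below groups funding entities by page to surface such cases. `funders_by_page` is a hypothetical helper, and the records are toy data rather than results from the API.

```python
import collections


def funders_by_page(ads):
    """Map each page_name to a Counter of the funding_entity values seen on its ads."""
    funders = collections.defaultdict(collections.Counter)
    for ad in ads:
        # Skip records that lack either field, as find_most_common does.
        if 'page_name' in ad and 'funding_entity' in ad:
            funders[ad['page_name']][ad['funding_entity']] += 1
    return funders


# Toy records; with the notebook's data one would pass ads_all instead.
sample_ads = [
    {'page_name': 'Page A', 'funding_entity': 'Funder 1'},
    {'page_name': 'Page A', 'funding_entity': 'Funder 2'},
    {'page_name': 'Page A', 'funding_entity': 'Funder 1'},
    {'page_name': 'Page B', 'funding_entity': 'Funder 3'},
    {'page_name': 'Page C'},
]
result = funders_by_page(sample_ads)
for page, counter in result.items():
    if len(counter) > 1:
        print(page, counter.most_common())
```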
