The goal of metatargetr is to parse targeting information from the
Meta Ad Targeting dataset and retrieve data from the Audience tab in
the Meta Ad Library. It also includes helper functions for Meta Ad
Library data and integrates data from the Google Transparency Report.
💡 Support Open-Source Development
If metatargetr has been helpful to you, consider supporting the
project! Every contribution
keeps the maintenance work going and helps me develop new features 😊
- 🚀 Installation
- 📦 Load in Package
- 🎯 Get Targeting Criteria
- ⏳ Last 30 Days
- 🗓️ Last 7 Days
- 🕰️ Retrieve Historical Targeting Data
- 🗂️ Retrieve Historical Report Data
- ℹ️ Get Page Info
- 🔍 Retrieve Targeting Metadata
- 🖼️ Get Ad Snapshots (Images, Videos, and Metadata)
- 📄 Fetch and Cache Ad HTML
- 🔗 Get Deeplink Data
- 📊 Google Transparency Report
- ✍️ Citing metatargetr
You can install the development version of metatargetr like so:
remotes::install_github("favstats/metatargetr")
library(metatargetr)
The following code retrieves the targeting criteria used by the main page of the VVD (a Dutch party) in the last 30 days of available data.
Just put in the right Page ID. These can be found in the Meta Ad Library or the Meta Ad Library Report. You can also retrieve historical report data from the maintained database.
last30 <- get_targeting(id = "121264564551002",
                        timeframe = "LAST_30_DAYS")
head(last30, 5)
#> # A tibble: 0 × 7
#> # ℹ 7 variables: ds <chr>, main_currency <chr>, total_num_ads <int>,
#> # total_spend_formatted <chr>, is_30_day_available <lgl>,
#> # is_90_day_available <lgl>, page_id <chr>
The following code retrieves the targeting criteria used by the main page of the VVD (a Dutch party) in the last 7 days. Just put in the right Page ID.
last7 <- get_targeting(id = "121264564551002",
                       timeframe = "LAST_7_DAYS")
head(last7, 5)
#> # A tibble: 0 × 7
#> # ℹ 7 variables: ds <chr>, main_currency <chr>, total_num_ads <int>,
#> # total_spend_formatted <chr>, is_30_day_available <lgl>,
#> # is_90_day_available <lgl>, page_id <chr>
Unfortunately, get_targeting only returns the targeting criteria for
the last 7, 30, and 90 day windows. However, I have set up scrapers
that retrieve the daily targeting data for every single page in the
world that runs advertisements, in order to archive this data. You can
use the function below to retrieve it.
Be aware: the scrapers sometimes fail, so some pages may be missing. You can use the
get_targeting_metadata function to check which countries and days have data.
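Because some days can be missing, it may help to look up which snapshot dates exist before requesting one. The sketch below is a minimal illustration using toy data; `closest_available_ds()` is a hypothetical helper and not part of metatargetr, and it assumes the available dates are ISO-formatted strings like the `ds` values used elsewhere in this README.

```r
# Hypothetical helper (not part of metatargetr): pick the archived date
# closest to a requested one from a vector of available dates.
closest_available_ds <- function(available_ds, wanted_ds) {
  available <- as.Date(available_ds)
  wanted <- as.Date(wanted_ds)
  if (length(available) == 0) {
    return(NA_character_)
  }
  # Distance in days between each archived date and the requested date
  gaps <- abs(as.numeric(available - wanted))
  format(available[which.min(gaps)], "%Y-%m-%d")
}

# Toy stand-in for the `ds` column of the metadata
available <- c("2024-10-22", "2024-10-23", "2024-10-26")
chosen_ds <- closest_available_ds(available, "2024-10-25")
chosen_ds
```

With the package loaded, `available` could come from the `ds` column returned by get_targeting_metadata, and `chosen_ds` could then be passed as the date argument to get_targeting_db.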
# Set some parameters
the_cntry <- "DE"
tf <- 30
ds <- "2024-10-25"
# Call the function
latest_data <- get_targeting_db(the_cntry, tf, ds)
# Inspect the data
head(latest_data)
#> # A tibble: 6 × 37
#> internal_id no_data tstamp page_id cntry page_name partyfacts_id
#> <chr> <lgl> <dttm> <chr> <chr> <chr> <chr>
#> 1 <NA> NA 2024-10-27 19:12:35 7440553… DE CDU-Frak… 1375
#> 2 <NA> NA 2024-10-27 19:12:35 7440553… DE CDU-Frak… 1375
#> 3 <NA> NA 2024-10-27 19:12:35 7440553… DE CDU-Frak… 1375
#> 4 <NA> NA 2024-10-27 19:12:35 7440553… DE CDU-Frak… 1375
#> 5 <NA> NA 2024-10-27 19:12:35 7440553… DE CDU-Frak… 1375
#> 6 <NA> NA 2024-10-27 19:12:35 7440553… DE CDU-Frak… 1375
#> # ℹ 30 more variables: sources <chr>, country <chr>, party <chr>,
#> # left_right <dbl>, tags <glue>, tags_ideology <chr>, disclaimer <chr>,
#> # amount_spent_eur <chr>, number_of_ads_in_library <chr>, date <chr>,
#> # path <chr>, tf <chr>, remove_em <lgl>, total_n <int>, amount_spent <dbl>,
#> # value <chr>, num_ads <int>, total_spend_pct <dbl>, type <chr>,
#> # location_type <chr>, num_obfuscated <int>, is_exclusion <lgl>, ds <chr>,
#> # main_currency <chr>, total_num_ads <int>, total_spend_formatted <dbl>, …
Using get_report_db, you can retrieve archived advertising reports for
specific pages, countries, and timeframes. Reports are stored in a
repository and can be downloaded and read directly into R.
Note: While we strive to keep the archive complete, occasional scraper failures may lead to missing data for certain days.
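Since individual days can be missing, a loop over several dates is more robust if failures are skipped rather than allowed to abort the whole run. The following is a minimal sketch with a toy fetcher; `fetch_days()` is a hypothetical helper, and it assumes the fetching function raises an error for a missing day (this behaviour is an assumption, not confirmed for get_report_db).

```r
# Hypothetical helper: call `fetch_fun` for each date, skip dates that
# raise an error, and return the successful results as a named list.
fetch_days <- function(dates, fetch_fun) {
  results <- list()
  for (d in dates) {
    out <- tryCatch(fetch_fun(d), error = function(e) NULL)
    if (!is.null(out)) {
      results[[d]] <- out
    }
  }
  results
}

# Toy fetcher standing in for a real download: fails for one date
toy_fetch <- function(d) {
  if (d == "2024-10-24") stop("no data for this day")
  data.frame(date = d, n = 1L)
}

dates <- format(seq(as.Date("2024-10-23"), as.Date("2024-10-25"), by = "day"))
fetched <- fetch_days(dates, toy_fetch)
names(fetched)
```

With the package loaded, `fetch_fun` could be something like `function(d) get_report_db("DE", 30, d)`, and the resulting list could be combined into one table afterwards.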
# Set some parameters
the_cntry <- "DE"
tf <- 30
ds <- "2024-10-25"
# Call the function
latest_data <- get_report_db(the_cntry, tf, ds)
# Inspect the data
head(latest_data)
#> # A tibble: 6 × 9
#> page_id page_name disclaimer amount_spent_eur number_of_ads_in_lib…¹ date
#> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 2781178155… EU Justi… EU Justic… 296508 28 2024…
#> 2 1706886445… UNICEF D… UNICEF De… 78283 79 2024…
#> 3 1891313609… LIQID In… LIQID Inv… 76300 88 2024…
#> 4 1918513976… VETO – T… VETO - Ti… 71581 218 2024…
#> 5 23216224900 Plan Int… Plan Inte… 62605 62 2024…
#> 6 1612732458… Save the… Save the … 59891 195 2024…
#> # ℹ abbreviated name: ¹number_of_ads_in_library
#> # ℹ 3 more variables: path <chr>, tf <chr>, cntry <chr>
You can also retrieve some page info for the page that you are interested in.
page_info <- get_page_insights("121264564551002", include_info = "page_info")
str(page_info)
#> 'data.frame': 1 obs. of 20 variables:
#> $ page_name : chr "VVD"
#> $ is_profile_page : chr "FALSE"
#> $ page_is_deleted : chr "FALSE"
#> $ page_is_restricted : chr "FALSE"
#> $ has_blank_ads : chr "FALSE"
#> $ hidden_ads : chr "0"
#> $ page_profile_uri : chr "https://facebook.com/VVD"
#> $ page_id : chr "121264564551002"
#> $ page_verification : chr "BLUE_VERIFIED"
#> $ entity_type : chr "PERSON_PROFILE"
#> $ page_alias : chr "VVD"
#> $ likes : chr "109676"
#> $ page_category : chr "Political party"
#> $ ig_verification : chr "TRUE"
#> $ ig_username : chr "vvd"
#> $ ig_followers : chr "54188"
#> $ shared_disclaimer_info: chr "[]"
#> $ about : chr "Een sterk en veilig Nederland. Met rust in je portemonnee en alle ruimte voor jou om te groeien en iets op te b"| __truncated__
#> $ event : chr "CREATION: 2010-04-23 21:05:02"
#> $ no_address : logi TRUE
The get_targeting_metadata function is designed to retrieve metadata
about targeting data releases from a GitHub repository to see which data
is present (or not). It extracts and organizes information such as file
names, sizes, timestamps, and tags for a specified country and
timeframe. This metadata provides an overview of the available
targeting data without downloading the actual files.
- country_code (Character): The ISO country code (e.g., "DE" for Germany, "US" for the United States).
- timeframe (Character): The timeframe for the targeting data. Acceptable values are:
  - "7": Last 7 days.
  - "30": Last 30 days.
  - "90": Last 90 days.
- base_url (Character, default: "https://github.com/favstats/meta_ad_targeting/releases/expanded_assets/"): The base URL for the GitHub repository hosting the targeting data.
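The accepted timeframe values can be checked up front so that a typo fails early with a readable message. This is a minimal sketch; `check_timeframe()` is a hypothetical helper and not part of metatargetr.

```r
# Hypothetical helper: validate a timeframe and return it as a character
# string, which is the type the parameter list above describes.
check_timeframe <- function(timeframe) {
  allowed <- c("7", "30", "90")
  # Accept numbers such as 30 as well by converting to character first
  tf <- as.character(timeframe)
  if (length(tf) != 1 || !tf %in% allowed) {
    stop("`timeframe` must be one of: ", paste(allowed, collapse = ", "))
  }
  tf
}

checked_tf <- check_timeframe(30)
checked_tf
```

The returned string could then be passed on as the timeframe argument, so numeric and character inputs behave the same way.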
# Retrieve metadata for Germany for the last 30 days
metadata <- get_targeting_metadata("DE", "30")
print(metadata)
#> # A tibble: 695 × 3
#> cntry ds tframe
#> <chr> <chr> <chr>
#> 1 DE 2026-02-15 last_30_days
#> 2 DE 2026-02-14 last_30_days
#> 3 DE 2026-02-13 last_30_days
#> 4 DE 2026-02-12 last_30_days
#> 5 DE 2026-02-11 last_30_days
#> 6 DE 2026-02-10 last_30_days
#> 7 DE 2026-02-09 last_30_days
#> 8 DE 2026-02-08 last_30_days
#> 9 DE 2026-02-07 last_30_days
#> 10 DE 2026-02-06 last_30_days
#> # ℹ 685 more rows
get_ad_snapshots() retrieves snapshot data for a Facebook ad from the
Ad Library, including images, videos, cards, body text, page info, and
more. It uses headless Chrome (via chromote) to bypass Facebook’s
JavaScript-based bot detection.
This piece of code was created in collaboration with Philipp Mendoza.
snap <- get_ad_snapshots("561403598962843")
For best performance when processing multiple ads, use a persistent browser session. This passes Facebook’s JS challenge once during startup, so all subsequent calls are fast.
# ad_ids is a character vector of ad IDs, for example:
ad_ids <- c("561403598962843", "1103135646905363")
browser_session_start()
results <- map_dfr_progress(ad_ids, ~get_ad_snapshots(.x))
browser_session_close()
If Chrome becomes unresponsive mid-batch, use
browser_session_restart() to recover without restarting R.
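One way to automate that recovery is to wrap each call so that a failure triggers a restart followed by a retry. The sketch below is an assumption about how this could be structured, not the package's own mechanism; `with_restart()` is a hypothetical helper that takes the work and the restart action as plain functions, and the demonstration uses toy stand-ins instead of a real browser.

```r
# Hypothetical helper: run `task`; on error, call `restart` and try again,
# up to `max_retries` times, then re-raise the last error.
with_restart <- function(task, restart, max_retries = 1) {
  attempt <- 0
  repeat {
    result <- tryCatch(task(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt >= max_retries) {
      stop(result)
    }
    attempt <- attempt + 1
    restart()
  }
}

# Toy demonstration: the task fails until the "browser" is restarted
state <- new.env()
state$healthy <- FALSE
flaky_task <- function() {
  if (!state$healthy) stop("browser unresponsive")
  "snapshot"
}
toy_restart <- function() state$healthy <- TRUE

recovered <- with_restart(flaky_task, toy_restart)
recovered
```

In a real batch, `task` could be `function() get_ad_snapshots(id)` and `restart` could be `browser_session_restart`.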
Setting download = TRUE saves images and videos to disk. Use
hashing = TRUE to deduplicate media files (recommended for large-scale
collection).
get_ad_snapshots("561403598962843", download = TRUE, hashing = TRUE, mediadir = "data/media")
get_ad_html() fetches the raw HTML for one or more ads and caches
results to disk as gzipped files. Useful for archiving or downstream
parsing with parse_ad_htmls().
paths <- get_ad_html(
  ad_ids = c("561403598962843", "1103135646905363"),
  country = "US"
)
get_deeplink() extracts the full deeplinkAdCard JSON object from an
ad page, which contains additional metadata beyond what
get_ad_snapshots() returns (e.g., fevInfo,
free_form_additional_info, learn_more_content).
dl <- get_deeplink("561403598962843")
ggl_get_spending is an R function that queries the Google
Transparency Report to retrieve information about advertising spending
for a specified advertiser. It supports a range of countries and can
provide either aggregated data or time-based spending data.
To use ggl_get_spending, you need the advertiser’s unique identifier,
the desired date range, and the country code. The function also has an
option to retrieve time-based spending data.
Retrieve aggregated spending data for a specific advertiser in the Netherlands. It returns details like currency, number of ads, ad type breakdown, advertiser details, and other metrics.
ggl_get_spending(advertiser_id = "AR18091944865565769729",
                 start_date = "2023-10-24",
                 end_date = "2023-11-22",
                 cntry = "NL")
#> # A tibble: 1 × 2
#> spend number_of_ads
#> <dbl> <dbl>
#> 1 0 0
Retrieve time-based spending data for the same advertiser and country.
If get_times is set to TRUE, it returns a tibble with date-wise
spending data.
# Retrieve time-based spending data for the same advertiser and country
timeseries_dat <- ggl_get_spending(advertiser_id = "AR18091944865565769729",
                                   start_date = "2023-10-24",
                                   end_date = "2023-11-22",
                                   cntry = "NL",
                                   get_times = TRUE)
# Plotting the time-series data
timeseries_dat %>%
  ggplot2::ggplot(ggplot2::aes(x = date, y = spend)) +
  ggplot2::geom_col() +
  ggplot2::theme_minimal()
If you use the metatargetr package or data from its database in your
research, publications, or other outputs, please ensure you provide
proper attribution. This helps recognize the effort and resources
required to maintain and provide access to these data.
Votta, Fabio, & Mendoza, Philipp. (2024).
metatargetr: A package for parsing and analyzing ad library and targeting data. GitHub. Available at: https://github.com/favstats/metatargetr
@misc{votta2024metatargetr,
  author = {Votta, Fabio and Mendoza, Philipp},
  title = {metatargetr: A package for parsing and analyzing ad library and targeting data},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/favstats/metatargetr}
}
If you use data from the metatargetr database, please include the
following acknowledgement in your work:
Data were retrieved from the metatargetr database, maintained by Fabio Votta. The database archives targeting data from the Meta Ad Library and Google Transparency Report. For more information, visit https://github.com/favstats/metatargetr.
By including these citations and acknowledgements, you help support the
continued development of metatargetr and its associated resources.
Thank you for your collaboration!
