# Scrape political party ads for the 2022 Slovenian parliamentary elections

The purpose of this document is to scrape political party ads from the 2022 Slovenian parliamentary elections. The ads will be used to test for correlations between political advertisements and EEG signals.

This document also serves as a record of which political parties are included in the test.

## Imports

In [1]:
import datetime
import pandas as pd

from lib.ad_scraper.FBAdLibraryPartyPageScraper import FBAdLibraryPartyPageScraper

## Helper functions

### Parliamentary election data

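Helper function that filters out advertisements that were not running in March or April 2022, the period leading up to the 2022 Slovenian parliamentary elections. An advertisement is kept if its delivery start or its delivery end falls inside that window.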

In [2]:
def parliamentarian_election_data(data: pd.DataFrame):
    # Work on a copy so the caller's DataFrame is not modified.
    data = data.copy()
    data['ad_delivery_start_time'] = pd.to_datetime(data['ad_delivery_start_time'])
    # Parse the end time from its own column (it was previously read from the start column).
    data['ad_delivery_end_time'] = pd.to_datetime(data['ad_delivery_end_time'])

    start_date = datetime.datetime(2022, 3, 1)
    end_date = datetime.datetime(2022, 4, 30)
    # Keep ads whose delivery start or delivery end falls inside the window.
    mask_start_date = (data['ad_delivery_start_time'] > start_date) & (data['ad_delivery_start_time'] <= end_date)
    mask_end_date = (data['ad_delivery_end_time'] > start_date) & (data['ad_delivery_end_time'] <= end_date)
    return data.loc[mask_start_date | mask_end_date]
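
To make the date-window filter concrete, here is a minimal sketch on made-up rows. The column names come from the notebook; the function name `in_election_window`, the ad IDs, and the dates are hypothetical and only serve the illustration. It assumes the intended rule is "delivery start or delivery end falls inside the window".

```python
import datetime

import pandas as pd

# Toy metadata: column names follow the notebook, the rows are invented.
ads = pd.DataFrame({
    'ad_archive_id': [101, 102, 103, 104],
    'ad_delivery_start_time': ['2022-02-10', '2022-03-15', '2022-02-20', '2022-05-02'],
    'ad_delivery_end_time': ['2022-02-20', '2022-03-20', '2022-03-05', '2022-05-10'],
})


def in_election_window(data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for the notebook's helper; works on a copy."""
    data = data.copy()
    data['ad_delivery_start_time'] = pd.to_datetime(data['ad_delivery_start_time'])
    data['ad_delivery_end_time'] = pd.to_datetime(data['ad_delivery_end_time'])

    start_date = datetime.datetime(2022, 3, 1)
    end_date = datetime.datetime(2022, 4, 30)
    # An ad is kept if its start OR its end lies in (start_date, end_date].
    starts_in_window = (data['ad_delivery_start_time'] > start_date) & (data['ad_delivery_start_time'] <= end_date)
    ends_in_window = (data['ad_delivery_end_time'] > start_date) & (data['ad_delivery_end_time'] <= end_date)
    return data.loc[starts_in_window | ends_in_window]


kept = in_election_window(ads)
kept_ids = kept['ad_archive_id'].tolist()
print(kept_ids)
```

In this toy data, ad 102 starts inside the window and ad 103 ends inside it, so both are kept, while 101 and 104 lie entirely outside. Note that under this rule an ad that started before March and ended after April would be dropped, even though it was running throughout the window.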

### Collect media files

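Helper function that collects the media files of a given political party's advertisements for the 2022 Slovenian parliamentary elections. It expects the party's metadata CSV to reference exactly one Facebook page and passes only the ads that survive the date filter to the scraper.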

In [3]:
def collect_media_political_party(political_party_abbreviation: str):
    df_party_metadata = pd.read_csv(f'../data/metadata/{political_party_abbreviation}.csv')
    # Each party is expected to advertise from exactly one Facebook page.
    if len(set(df_party_metadata['page_id'])) > 1:
        raise RuntimeError("More than 1 page for a political party.")

    # Keep only the ads that ran in the pre-election window.
    df_party_metadata_filtered = parliamentarian_election_data(data=df_party_metadata)

    scraper = FBAdLibraryPartyPageScraper(page_id=list(df_party_metadata['page_id'])[0],
                                          valid_ids=list(df_party_metadata_filtered['ad_archive_id']))
    try:
        scraper.scroll_to_bottom()
        scraper.collect_media()
    finally:
        # Close the scraper even if scraping fails or is interrupted.
        scraper.close()
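
The metadata check at the start of this helper can be tried without a browser or the real CSV files. The sketch below is an illustration under assumptions: the in-memory CSV, its values, and the function name `single_page_id` are hypothetical, and it is slightly stricter than the helper because it also rejects metadata with no page at all.

```python
import io

import pandas as pd

# Made-up metadata containing only the two columns this check needs.
CSV_TEXT = """page_id,ad_archive_id
111,9001
111,9002
"""


def single_page_id(metadata: pd.DataFrame) -> int:
    """Return the only page_id in the metadata, or raise if there is not exactly one."""
    page_ids = metadata['page_id'].unique()
    if len(page_ids) != 1:
        raise RuntimeError(f"Expected exactly 1 page, found {len(page_ids)}.")
    return int(page_ids[0])


metadata = pd.read_csv(io.StringIO(CSV_TEXT))
page_id = single_page_id(metadata)
valid_ids = metadata['ad_archive_id'].tolist()
print(page_id, valid_ids)
```

Here `page_id` and `valid_ids` correspond to the two arguments that the helper passes to `FBAdLibraryPartyPageScraper`; in the notebook, `valid_ids` is taken from the date-filtered metadata rather than from the full table.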

## Political party scraping

### Gibanje Svoboda

In [4]:
collect_media_political_party(
    political_party_abbreviation='GS'
)

[WDM] - Downloading: 100%|██████████| 8.61M/8.61M [00:00<00:00, 44.2MB/s]


KeyboardInterrupt: 