## Scrape banking apps:
- Commercial Bank of Ethiopia
- Bank of Abyssinia
- Dashen Bank Superapp

# Scrape Reviews for Three Banking Apps

This notebook demonstrates how to scrape Google Play reviews for three banking apps using the `GooglePlayReviewScraper` class from `scraper.py`, with app configuration in `utils.py`.

## 1. Import Required Libraries and Modules

Import the standard-library modules, pandas, the `GooglePlayReviewScraper` class from `scripts.scraper`, and the app configuration variables (`APP_IDS`, `APP_ID_TO_BANK_NAME`) from `scripts.utils`.

In [2]:
import os
import sys
import pandas as pd

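# Add the project root (the parent of the notebook's working directory) to sys.path
# so that the `scripts` package resolves for `scripts.scraper` and `scripts.utils`
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
if project_root not in sys.path:
    sys.path.append(project_root)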

# Import the GooglePlayReviewScraper class
from scripts.scraper import GooglePlayReviewScraper

# Import app configuration variables from utils.py
from scripts.utils import APP_IDS, APP_ID_TO_BANK_NAME

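Note: if this cell fails with `ModuleNotFoundError: No module named 'pandas'`, install pandas in the notebook's kernel environment (for example, `pip install pandas`) and rerun the cell.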

## 2. Define App IDs and Bank Name Mapping

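The `APP_IDS` list and `APP_ID_TO_BANK_NAME` dictionary should be defined in `utils.py` as follows. The package names are examples; confirm each one against the `id` parameter in the app's Google Play URL.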

In [None]:
# Example content of utils.py

APP_IDS = [
    'com.combanketh.android',      # Commercial Bank of Ethiopia
    'com.bankofabyssinia.mobile',  # Bank of Abyssinia
    'com.dashenbank.app'           # Dashen Bank Superapp
]

APP_ID_TO_BANK_NAME = {
    'com.combanketh.android': 'Commercial Bank of Ethiopia',
    'com.bankofabyssinia.mobile': 'Bank of Abyssinia',
    'com.dashenbank.app': 'Dashen Bank Superapp'
}
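
Because the list and the dictionary are maintained separately, they can drift apart. The following is a minimal sketch of a consistency check that could run before scraping; `check_app_config` is a hypothetical helper, not part of `utils.py`, and the IDs are the example values from above.

```python
def check_app_config(app_ids, app_id_to_bank_name):
    """Return a list of problems; an empty list means the config is consistent."""
    problems = []
    seen = set()
    for app_id in app_ids:
        if app_id in seen:
            problems.append(f"duplicate app ID: {app_id}")
        seen.add(app_id)
        if app_id not in app_id_to_bank_name:
            problems.append(f"no bank name for app ID: {app_id}")
    # Flag mapping entries that would never be scraped
    for app_id in app_id_to_bank_name:
        if app_id not in seen:
            problems.append(f"mapping has unused app ID: {app_id}")
    return problems


# Example values copied from the utils.py listing above
APP_IDS = [
    'com.combanketh.android',
    'com.bankofabyssinia.mobile',
    'com.dashenbank.app',
]
APP_ID_TO_BANK_NAME = {
    'com.combanketh.android': 'Commercial Bank of Ethiopia',
    'com.bankofabyssinia.mobile': 'Bank of Abyssinia',
    'com.dashenbank.app': 'Dashen Bank Superapp',
}

print(check_app_config(APP_IDS, APP_ID_TO_BANK_NAME))  # prints []
```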

## 3. Set Output Directory and Date

Create the directory where the scraped reviews will be saved (`raw_reviews` under the current working directory) and build a `YYYY-MM-DD` date string for timestamping the output.

In [None]:
from datetime import datetime

RAW_DATA_DIR = os.path.join(os.getcwd(), 'raw_reviews')
os.makedirs(RAW_DATA_DIR, exist_ok=True)

TODAY_DATE_STR = datetime.today().strftime('%Y-%m-%d')
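
How the date string ends up in the output file names depends on `scraper.py`, which is not shown here. As an illustration only, one possible naming scheme is sketched below; `build_output_path` and the `<bank>_reviews_<date>.csv` pattern are assumptions, not the scraper's actual behavior.

```python
import os
import re
from datetime import datetime


def build_output_path(raw_data_dir, bank_name, date_str):
    """Build a path like <raw_data_dir>/<bank_slug>_reviews_<date>.csv (hypothetical scheme)."""
    # Lowercase the bank name and collapse anything non-alphanumeric into underscores
    slug = re.sub(r'[^a-z0-9]+', '_', bank_name.lower()).strip('_')
    return os.path.join(raw_data_dir, f"{slug}_reviews_{date_str}.csv")


today = datetime.today().strftime('%Y-%m-%d')
print(build_output_path('raw_reviews', 'Commercial Bank of Ethiopia', today))
```

Including the date in the name keeps repeated scrapes from overwriting each other.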

## 4. Instantiate the GooglePlayReviewScraper

Create an instance of the `GooglePlayReviewScraper` using the imported configuration variables.

In [None]:
scraper = GooglePlayReviewScraper(
    app_ids=APP_IDS,
    app_id_to_bank_name=APP_ID_TO_BANK_NAME,
    raw_data_dir=RAW_DATA_DIR,
    today_date_str=TODAY_DATE_STR
)

## 5. Scrape Reviews for All Apps

Call the `scrape_all()` method to scrape reviews for all three apps and save the results. The method returns the paths of the saved files, which the cell then prints.

In [None]:
saved_files = scraper.scrape_all()
print("Scraping completed. Files saved:")
for f in saved_files:
    print(f)
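
The internals of `scrape_all()` live in `scraper.py` and are not shown in this notebook. The sketch below is one way such a method could work, assuming the third-party `google-play-scraper` package for fetching; the function names, the CSV columns, and the file naming are all illustrative assumptions rather than the class's real implementation.

```python
import csv
import os


def fetch_reviews_from_google_play(app_id, count=200):
    """Fetch recent reviews via google-play-scraper (assumed to be installed)."""
    # Imported lazily so the rest of the sketch loads without the package
    from google_play_scraper import Sort, reviews
    result, _continuation_token = reviews(
        app_id, lang='en', country='us', sort=Sort.NEWEST, count=count
    )
    return result  # list of dicts with keys such as 'content', 'score', 'at'


def scrape_all_sketch(app_ids, app_id_to_bank_name, raw_data_dir, today_date_str,
                      fetch=fetch_reviews_from_google_play):
    """Fetch reviews for each app, write one CSV per app, and return the saved paths."""
    os.makedirs(raw_data_dir, exist_ok=True)
    saved_files = []
    for app_id in app_ids:
        bank_name = app_id_to_bank_name.get(app_id, app_id)
        try:
            raw_reviews = fetch(app_id)
        except Exception as exc:  # one failing app should not stop the others
            print(f"Skipping {bank_name}: {exc}")
            continue
        if not raw_reviews:
            continue  # nothing to save for this app
        path = os.path.join(raw_data_dir, f"{app_id}_{today_date_str}.csv")
        with open(path, 'w', newline='', encoding='utf-8') as handle:
            writer = csv.writer(handle)
            writer.writerow(['review', 'rating', 'date', 'bank', 'source'])
            for item in raw_reviews:
                writer.writerow([item.get('content'), item.get('score'),
                                 item.get('at'), bank_name, 'Google Play'])
        saved_files.append(path)
    return saved_files
```

The `fetch` parameter is there so the function can be exercised with a stub instead of hitting Google Play, which is convenient for testing without network access.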

## 6. Display Scraping Results

Preview the first few rows of the first saved file to confirm that the scrape produced data.

In [None]:
# Preview the first scraped file (if available)
if saved_files:
    df = pd.read_csv(saved_files[0])
    display(df.head())
else:
    print("No files found.")
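
Beyond previewing one file, it can be useful to check how many reviews each file holds. The sketch below counts data rows per CSV using only the standard library, so it also runs where pandas is unavailable; `count_reviews_per_file` is a hypothetical helper, and it assumes each file starts with a single header row.

```python
import csv


def count_reviews_per_file(paths):
    """Return {path: number of data rows}, not counting the header row."""
    counts = {}
    for path in paths:
        with open(path, newline='', encoding='utf-8') as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the header row if the file is not empty
            counts[path] = sum(1 for _ in reader)
    return counts
```

Calling `count_reviews_per_file(saved_files)` after the scrape gives a quick view of whether any app returned unexpectedly few reviews.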