
ys-lin14/skyscraper

SkyScraper

An ETL pipeline that scrapes Google Play reviews for Sky: Children of the Light. Airflow schedules the tasks: the reviews are extracted with the google-play-scraper library, transformed with pandas, and loaded into a local MySQL database.
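The extract and transform steps could look roughly like the sketch below. It is not the code in this repository: the function names and `COLUMN_MAP` are hypothetical, plain Python stands in for pandas so the example has no dependency beyond the scraper, and mapping `reviewCreatedVersion` to `version` and `at` to `last_modified` is an assumption about how the Review Table is filled.

```python
# Assumed mapping from google-play-scraper result keys to Review Table columns.
COLUMN_MAP = {
    "reviewId": "review_id",
    "userName": "user_name",
    "content": "content",
    "score": "rating",
    "thumbsUpCount": "thumbs_up_count",
    "reviewCreatedVersion": "version",  # assumption: version the review was written on
    "at": "last_modified",              # assumption: timestamp reported by the scraper
}


def extract_reviews(app_id, count=100):
    """Fetch the newest reviews for one app (needs network access)."""
    # Imported here so the transform step can be used without the scraper installed.
    from google_play_scraper import Sort, reviews

    result, _continuation_token = reviews(
        app_id, lang="en", country="us", sort=Sort.NEWEST, count=count
    )
    return result


def transform_reviews(raw_reviews):
    """Rename keys to the table's columns and keep one row per review_id."""
    rows = {}
    for raw in raw_reviews:
        row = {column: raw.get(key) for key, column in COLUMN_MAP.items()}
        if row["review_id"] is None:
            continue  # skip records that cannot be keyed
        rows[row["review_id"]] = row  # a later duplicate overwrites an earlier one
    return list(rows.values())
```

`extract_reviews` takes the game's Google Play package ID as `app_id`; its output can be passed straight to `transform_reviews`.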

Review Table

| Column | Description |
| --- | --- |
| review_id | Google Play review ID |
| user_name | Google username |
| content | Google Play review |
| rating | Rating (1 - 5) |
| thumbs_up_count | Number of users who found the review helpful |
| version | Game version |
| last_modified | Date on which the review was last modified |
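To illustrate how rows with these columns might be loaded without creating duplicates when the pipeline runs again, here is a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for the local MySQL database so that it runs anywhere. The table name `review`, the column types, and `load_reviews` are assumptions; the actual schema lives in `sql/create_sky_database.sql`.

```python
import sqlite3

# Assumed schema: column names follow the Review Table, types are guesses.
CREATE_REVIEW_TABLE = """
CREATE TABLE IF NOT EXISTS review (
    review_id       TEXT PRIMARY KEY,
    user_name       TEXT,
    content         TEXT,
    rating          INTEGER CHECK (rating BETWEEN 1 AND 5),
    thumbs_up_count INTEGER,
    version         TEXT,
    last_modified   TEXT
)
"""

# Replacing on the primary key keeps one row per review when it is edited later.
UPSERT_REVIEW = """
INSERT OR REPLACE INTO review
    (review_id, user_name, content, rating, thumbs_up_count, version, last_modified)
VALUES
    (:review_id, :user_name, :content, :rating, :thumbs_up_count, :version, :last_modified)
"""


def load_reviews(connection, rows):
    """Create the table if needed and upsert a list of row dicts."""
    with connection:  # commits on success, rolls back on error
        connection.execute(CREATE_REVIEW_TABLE)
        connection.executemany(UPSERT_REVIEW, rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    sample = {
        "review_id": "r1", "user_name": "example user", "content": "Lovely game",
        "rating": 5, "thumbs_up_count": 0, "version": None,
        "last_modified": "2021-01-01 00:00:00",
    }
    load_reviews(conn, [sample])
    load_reviews(conn, [dict(sample, rating=4)])  # same review, edited rating
    print(conn.execute("SELECT COUNT(*), MAX(rating) FROM review").fetchone())
```

In MySQL, the comparable statements would be `REPLACE INTO` or `INSERT ... ON DUPLICATE KEY UPDATE`.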

Folder Structure

  |--- skyscraper
  |    |-- modules
  |    |   |-- ... 
  |    |-- skyscraper.py (Airflow DAG definition file)
  |
  |--- sql
       |-- create_sky_database.sql 

References
Sky [Game]. (2020). Santa Monica (California): thatgamecompany.
