Aligulac (Starcraft 2 matches) Analytics

There was an attempt at

Aligulac (Starcraft 2 matches) Analytics

Hello friends and family and other creatures of the sea.
This is an end-to-end analytics pipeline for starcraft 2 matches using the Aligulac dataset from Aligulac.

Description

Here we will attempt to explore the state of the game with regards to the pro-gamer's activity. Starcraft 2 have always had a sizeable audiance and activity around it but over the years players have noticed that its popularity is starting to wane. So here we will try to see if that is the case and if SC2 is indeed a dead game.

Tools

The pipeline comprises of multiple parts - most notably are:

PostgreSQL to restore the Aligulac database locally (inside a docker container with Mage.AI)
Mage.AI as the primary orchestrator and secondary data-transformation tool
dbt (Mage-flavoured) as the primary data transformation tool running inside Mage.
Terraformq as the IaC tool to manage cloud resources in GCP

Google Cloud Provider is the primary cloud resource being used (for now):

BigQuery as the Data Warehouse solution with Raw, Bronze, Silver and Gold stages
Cloud Storage bucket as the Data Lake to store various Parquet files and table
and, Data Studio as the dashboarding tool

Overall view and steps

The Aligulac Database extract gets downloaded and restored into the local postgresql installation
Mage is used to funnel the extracted database into a parquet file and table format within Google Cloud Storage
dbt-Mage is then used to transform the GCS stored data into BigQuery's Raw, Bronze, Silver and Gold layer
Data in the gold layer can then be consumed by Streamlit cloud

Dashboard and Visuals

Streamlit cloud is being used as a dashboarding solution (for now).
The streamlit dashboard can be accessed here Aligulac Analytics on Steamlit

Installation

.ENV file

Copy and rename the supplied dev.env file to .env

Modify the content of the file if required (default setting listed below)

PROJECT_NAME=aligulac_analytics
POSTGRES_DB=aligulac_db
POSTGRES_USER=admin
POSTGRES_PASSWORD=aligulacdb
POSTGRES_HOST=postgres_aligulac
PG_HOST_PORT=5432

Google Cloud Platform

Create Service account(s) for Mage, Terraform and Streamlit.

For both Mage and Terraform Service Accounts uses the roles listed below:

Cloud Storage Admin
BigQuery Admin

Export the keys as .json and drop them into their respective locations in

/keys/ for the Mage and dbt, next to the mage-dbt_key_location.txt placeholder
/terraform/keys/ for Terraform, next to the terraform_key_location.txt placeholder

Streamlit Service Account (optional)

Streamlit will require its own separate service account to access the datalake hosted on GCP's BigQuery.

Please follow the instructions in Connecting Streamlit to BigQuery

Terraform (optional)

Terraform code are located in the /terraform/ directory and a terraform apply will generate the requested resources listed below

Google Cloud Storage bucket - aligulac_datalake used to store parquet files and table generated by the pipeline
BigQuery datasets
- raw
- bronze
- silver
- gold

Note: In some installations - the BigQuery API may need to be activated

Mage

The Mage installation are located in /aligulac_analytics/ and a docker compose up to activate the container.

Access the Mage interface via http://localhost:6789

Local Database Seed

In the very likely scenario of the PostgreSQL Db being empty you can generate the required data by downloading the database dump file from aligulac.sql or aligulac.sql.gz

Note: the /aligulac_export/ have been attached to the postgres installation as /tmp/aligulac_export/

wget -P /aligulac_export http://static.aligulac.com/aligulac.sql

And restoring the db via

docker exec -i postgres-aligulac psql -U admin aligulac_db < tmp/aligulac_export/aligulac.sql

or

psql -U admin aligulac_db < tmp/aligulac_export/aligulac.sql

Pipelines

Within Mage there should be two "Core" pipelines.

Run the Setup pipeline to move the data from Local PostgreSQL to GCS
Run the Primarypipeline to build the Data Warehouse using the files inside GCS

The Mage-DAG uses a mix of MAGE based blocks and dbt (below)

Streamlit Cloud (optional)

Follow the Streamlit Cloud sign-up process. Streamlit Community edition (Free)

As mentioned earlier, follow the instructions in Connecting Streamlit to BigQuery

Use the streamlit__aligulac.py file in the repo to generate the Streamlit dashboard.

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
aligulac_analytics		aligulac_analytics
aligulac_export		aligulac_export
keys		keys
readme_files		readme_files
terraform		terraform
.gitignore		.gitignore
Dockerfile		Dockerfile
README.md		README.md
dev.env		dev.env
docker-compose.yml		docker-compose.yml
requirements.txt		requirements.txt
streamlit__aligulac.py		streamlit__aligulac.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Aligulac (Starcraft 2 matches) Analytics

Description

Tools

Overall view and steps

Dashboard and Visuals

Installation

.ENV file

Google Cloud Platform

Streamlit Service Account (optional)

Terraform (optional)

Mage

Local Database Seed

Pipelines

Streamlit Cloud (optional)

About

Releases

Packages

Languages

tanhtra/aligulac_analytics

Folders and files

Latest commit

History

Repository files navigation

Aligulac (Starcraft 2 matches) Analytics

Description

Tools

Overall view and steps

Dashboard and Visuals

Installation

.ENV file

Google Cloud Platform

Streamlit Service Account (optional)

Terraform (optional)

Mage

Local Database Seed

Pipelines

Streamlit Cloud (optional)

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages