In this repository, data is extracted by web scraping and processed to build a dashboard that presents general metrics about anime (Japanese animated programs). The data is stored with AWS cloud tools, and the results are displayed with Streamlit on an AWS instance.

JuanPalms/Anime_ELT_dashboard

Anime_ELT_dashboard

General description

This repository implements an end-to-end data science project. Data is scraped from the public MyAnimeList website, a data lake and a data warehouse are created in AWS, and metrics of interest are displayed in a Streamlit dashboard deployed on an AWS EC2 instance.
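The flow described above can be sketched locally as a rough illustration of the extract, load, transform order. This is not the project's code: the sample rows, field names, and function names are hypothetical, a temporary folder stands in for the AWS data lake, and an in-memory SQLite database stands in for the data warehouse.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical records standing in for scraped rows; the real field names
# and values come from the project's scraper, which is not shown here.
SAMPLE_ROWS = [
    {"title": "Example A", "genre": "Action", "score": 8.0},
    {"title": "Example B", "genre": "Action", "score": 7.0},
    {"title": "Example C", "genre": "Drama", "score": 9.0},
]


def load_to_lake(rows, lake_dir):
    """Write raw rows unchanged as JSON Lines (local stand-in for a data lake)."""
    path = Path(lake_dir) / "anime_raw.jsonl"
    with path.open("w", encoding="utf-8") as fh:
        for row in rows:
            fh.write(json.dumps(row) + "\n")
    return path


def load_to_warehouse(raw_path, conn):
    """Copy the raw file into a table (SQLite stand-in for a data warehouse)."""
    conn.execute("CREATE TABLE anime (title TEXT, genre TEXT, score REAL)")
    with Path(raw_path).open(encoding="utf-8") as fh:
        rows = [json.loads(line) for line in fh]
    conn.executemany("INSERT INTO anime VALUES (:title, :genre, :score)", rows)
    conn.commit()


def mean_score_by_genre(conn):
    """Transform step: compute a dashboard-style metric inside the warehouse."""
    cur = conn.execute(
        "SELECT genre, AVG(score) FROM anime GROUP BY genre ORDER BY genre"
    )
    return dict(cur.fetchall())


def run_pipeline(rows):
    """Run load-to-lake, load-to-warehouse, and transform in sequence."""
    with tempfile.TemporaryDirectory() as lake_dir:
        raw_path = load_to_lake(rows, lake_dir)
        conn = sqlite3.connect(":memory:")
        try:
            load_to_warehouse(raw_path, conn)
            return mean_score_by_genre(conn)
        finally:
            conn.close()


if __name__ == "__main__":
    print(run_pipeline(SAMPLE_ROWS))
```

Keeping the raw file untouched and computing metrics only after loading is what distinguishes an ELT flow from an ETL one; in a cloud deployment the two stand-ins would presumably be replaced by the AWS storage services the project uses.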

Project structure

Anime_ETL_Project
├── project_description
│   ├── pipeline_design.py # Python script that draws the project architecture using graphviz.
│   ├── project_stages.txt # Txt file that displays the steps to implement the project.
│   └── verbal_description.txt # Txt file containing a general description of every step in the architecture. 
├── README.md
└── requirements.txt
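
As an illustration of the kind of diagram `pipeline_design.py` is described as drawing, the sketch below builds Graphviz DOT source for a linear pipeline using only the standard library. It is not the contents of that script: the stage labels paraphrase the general description, while the node ids, the ordering of the data lake before the data warehouse, and the `build_dot` helper are assumptions.

```python
# Stage labels paraphrase the general description; node ids and the linear
# ordering are assumptions, not the contents of pipeline_design.py.
STAGES = [
    ("source", "MyAnimeList website"),
    ("scraper", "Web scraping"),
    ("lake", "AWS data lake"),
    ("warehouse", "AWS data warehouse"),
    ("dashboard", "Streamlit dashboard on EC2"),
]


def build_dot(stages, name="anime_pipeline"):
    """Return DOT source for a left-to-right chain of the given stages."""
    lines = [f"digraph {name} {{", "  rankdir=LR;", "  node [shape=box];"]
    # One box per stage, labelled with its human-readable name.
    for node_id, label in stages:
        lines.append(f'  {node_id} [label="{label}"];')
    # One edge between each pair of consecutive stages.
    for (src, _), (dst, _) in zip(stages, stages[1:]):
        lines.append(f"  {src} -> {dst};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_dot(STAGES))
```

If the output is saved to a file such as `pipeline.dot`, it can be rendered with the Graphviz command-line tool, for example `dot -Tpng pipeline.dot -o pipeline.png`.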

Project replicability

conda activate env

To complete
