This repository implements an end-to-end data science project. It scrapes data from the public MyAnimeList website, builds a data warehouse and a data lake in AWS, and displays metrics of interest in a Streamlit dashboard deployed on an AWS EC2 instance.
```
Anime_ETL_Project
├── project_description
│   ├── pipeline_design.py       # Python script that draws the project architecture using Graphviz.
│   ├── project_stages.txt       # Text file listing the steps to implement the project.
│   └── verbal_description.txt   # Text file with a general description of every step in the architecture.
├── README.md
└── requirements.txt
```
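
To give a feel for what an architecture-drawing script such as `pipeline_design.py` might produce, here is a minimal sketch that builds Graphviz DOT source using only the Python standard library. It is not the contents of the actual script: the `build_dot` helper, the `STAGES` list, the node labels, and the ordering of the data lake before the data warehouse are all assumptions made for illustration, based on the project summary above.

```python
# Hypothetical sketch: stage names and their order are assumptions drawn
# from the project summary, not from the real pipeline_design.py.
STAGES = [
    "MyAnimeList website",
    "Web scraping extraction",
    "AWS data lake",
    "AWS data warehouse",
    "Streamlit dashboard (EC2)",
]


def build_dot(stages):
    """Return DOT source describing a left-to-right chain of stages."""
    lines = ["digraph anime_etl {", "    rankdir=LR;", "    node [shape=box];"]
    # One box node per stage, named s0, s1, ...
    for i, name in enumerate(stages):
        lines.append(f'    s{i} [label="{name}"];')
    # One edge between each pair of consecutive stages.
    for i in range(len(stages) - 1):
        lines.append(f"    s{i} -> s{i + 1};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_dot(STAGES))
```

If the printed text is saved to a file, the Graphviz command-line tool can render it, for example with `dot -Tpng pipeline.dot -o pipeline.png` (the file names here are placeholders).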
conda activate env