The goal of this project is to help our imaginary customer launch her marketing campaign by providing analysis-ready data for identifying the top categories of trending YouTube videos.
Dataset : link
- Data ingestion
- Data cleansing
- Data transformation
- Data Catalog
- Data querying and analysis
- ETL
- Scalability through a Lambda function trigger
- Data partitioning
- Visualization - BI Dashboard
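The Lambda trigger and data partitioning items above can be illustrated with a minimal sketch. It assumes the function is invoked by an S3 event notification and that raw objects sit under Hive-style partition folders such as `region=us/`; the `cleansed/` prefix, the helper names, and the Parquet target are hypothetical and not taken from this project. The handler only plans the work; the actual read, convert, and write step is left out.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # S3 URL-encodes object keys in event notifications, so decode them first.
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((bucket, key))
    return objects


def partitioned_key(source_key, target_prefix="cleansed/"):
    """Map a raw key to a target key, keeping any Hive-style partition folders.

    The target prefix is a hypothetical default, not a name from this project.
    """
    parts = source_key.split("/")
    # Keep only folder segments of the form name=value (for example region=us).
    partitions = [p for p in parts[:-1] if "=" in p]
    filename = parts[-1].rsplit(".", 1)[0] + ".parquet"
    return target_prefix + "/".join(partitions + [filename])


def lambda_handler(event, context):
    """Entry point: list which objects would be converted and where they would go."""
    return [
        {"bucket": bucket, "source": key, "target": partitioned_key(key)}
        for bucket, key in extract_s3_objects(event)
    ]
```

Because each uploaded object produces its own invocation plan, new files are handled as they arrive rather than in one large batch, which is one way to read the scalability item.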
- Extraction:
  - CLI
  - S3
  - IAM
- Transformation:
  - S3
  - Lambda (with Trigger)
  - Glue (Crawler, Database, ETL Job)
  - Athena
  - IAM
- Loading:
  - S3
  - Glue (Database, ETL Job)
  - IAM
- Business Intelligence:
  - S3
  - Glue (Database)
  - QuickSight
  - IAM
- CSV
- JSON
- Parquet
- Miro board - Used for Data Pipeline Diagram
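
To make the goal of finding the top category concrete, here is a small local sketch of the join between the CSV and JSON inputs. It assumes the video data is a CSV with a `category_id` column and that the category reference is a JSON document shaped like `{"items": [{"id": ..., "snippet": {"title": ...}}]}`; both layouts and all function names are assumptions for illustration. In the pipeline itself this step would be handled by the Glue, Athena, and QuickSight services listed above rather than by local Python.

```python
import csv
import io
import json
from collections import Counter


def load_category_titles(json_text):
    """Build an id -> title lookup from the (assumed) category JSON layout."""
    payload = json.loads(json_text)
    return {item["id"]: item["snippet"]["title"] for item in payload["items"]}


def top_categories(csv_text, titles, n=3):
    """Count videos per category title and return the n most common."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Rows whose category_id is missing from the lookup are grouped as "unknown".
        counts[titles.get(row["category_id"], "unknown")] += 1
    return counts.most_common(n)
```

A call such as `top_categories(csv_text, load_category_titles(json_text))` returns a list of `(title, count)` pairs ordered from most to least frequent.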