The course is broken up into five sections: Data Modeling, Cloud Data Warehouses, Spark and Data Lakes, Data Pipelines with Airflow, and a Capstone Project. Each section introduces concepts through lectures, demos, and exercises, and concludes with one or two projects. The projects require designing an ETL process using song data for an imaginary company called Sparkify. Across the course, this means designing data models, building data warehouses and data lakes, automating data pipelines, and working with massive datasets.
- Course 1: Data Modeling
- Course 2: Cloud Data Warehouses
- Course 3: Spark and Data Lakes
- Course 4: Data Pipelines with Airflow
- Course 5: Capstone Project