A music streaming company, Sparkify, has decided to introduce more automation and monitoring to its data warehouse ETL pipelines and has concluded that the best tool for this is Apache Airflow.
Airflow is a platform to programmatically author, schedule, and monitor workflows.
The source data is available in S3 in JSON format:
- Log data: s3://udacity-dend/log_data
- Song data: s3://udacity-dend/song_data
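
Pipeline tasks that read from these locations usually need the bucket name and key prefix separately. The following is a minimal sketch of that split using only the Python standard library. The `SOURCES` dictionary and the `split_s3_uri` helper are hypothetical names introduced for illustration and are not part of the project code.

```python
from urllib.parse import urlparse

# Dataset locations as listed above (dictionary name is illustrative).
SOURCES = {
    "log_data": "s3://udacity-dend/log_data",
    "song_data": "s3://udacity-dend/song_data",
}


def split_s3_uri(uri):
    """Split an s3:// URI into a (bucket, key_prefix) tuple."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The bucket is the network location; the prefix is the path
    # without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


for name, uri in SOURCES.items():
    bucket, prefix = split_s3_uri(uri)
    print(f"{name}: bucket={bucket} prefix={prefix}")
```

Both datasets share one bucket, so only the prefix differs between the two sources.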