Uses Airflow's built-in functionality to create a reusable ETL pipeline. Source data resides in an S3 bucket, the pipeline includes data quality checks, and the data is processed in AWS Redshift.
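The data quality checks such a pipeline runs are typically simple row-count and null checks against the loaded tables. Below is a minimal, hedged sketch of that pattern in plain Python; the field names and thresholds are illustrative assumptions, not taken from the repo itself.

```python
def check_quality(records, required_fields, min_rows=1):
    """Fail the task if a load produced too few rows or any record is
    missing a required field -- a common Airflow data-quality pattern.

    NOTE: `records`, `required_fields`, and the error messages here are
    hypothetical; a real pipeline would query Redshift for these counts.
    """
    if len(records) < min_rows:
        raise ValueError(f"Expected at least {min_rows} rows, got {len(records)}")
    for i, rec in enumerate(records):
        missing = [f for f in required_fields if rec.get(f) is None]
        if missing:
            raise ValueError(f"Row {i} missing required fields: {missing}")
    return True

# Example: two well-formed rows pass the check.
rows = [
    {"user_id": 1, "song": "a"},
    {"user_id": 2, "song": "b"},
]
print(check_quality(rows, ["user_id", "song"]))
```

In an actual DAG this logic would live inside a task (for instance a `PythonOperator` callable) so that a failed check fails the task and halts downstream processing.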
An example repo that aggregates multiple sources of Apache Airflow DAGs from Apache Maven repositories into a single Git branch, which can be used with git-sync in the Airflow Helm Chart (User Community).
Uses Apache Airflow and a weather API to clean data and automatically save the results into separate folders; specifically, weather data for Los Angeles (LA) is used.
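The clean-then-save-into-separate-folders step described above can be sketched as a small Python function. This is an assumption about the layout (one folder per date, dropping records with missing readings), not the repo's actual implementation.

```python
import json
import os

def clean_and_save(records, out_root):
    """Drop records with a missing temperature reading, then write each
    surviving record into a folder named after its date -- mimicking the
    'separate folders per result' layout the project describes.

    The record schema (`date`, `temp` keys) is a hypothetical example.
    """
    cleaned = [r for r in records if r.get("temp") is not None]
    for r in cleaned:
        day_dir = os.path.join(out_root, r["date"])
        os.makedirs(day_dir, exist_ok=True)
        with open(os.path.join(day_dir, "weather.json"), "w") as f:
            json.dump(r, f)
    return cleaned

# Usage: two valid records land in two dated folders; the bad one is dropped.
import tempfile
out = tempfile.mkdtemp()
result = clean_and_save(
    [
        {"date": "2024-01-01", "temp": 18.5},
        {"date": "2024-01-02", "temp": None},
        {"date": "2024-01-03", "temp": 21.0},
    ],
    out,
)
print(len(result))
```

In Airflow this would typically be the callable for a daily `PythonOperator` task, with the API fetch as an upstream task.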
This project focuses on utilizing Apache Airflow to orchestrate an ETL (Extract, Transform, Load) process using data from the Stack Overflow API. The primary objective is to determine the most prominent tags on Stack Overflow for the current month.
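The "most prominent tags" aggregation at the heart of that transform step reduces to counting tag occurrences across the month's questions. A minimal sketch, assuming each item carries a `tags` list as the Stack Exchange API's `/questions` endpoint returns:

```python
from collections import Counter

def top_tags(questions, n=3):
    """Count tag occurrences across a batch of questions and return the
    n most common as (tag, count) pairs. The input shape (dicts with a
    'tags' list) mirrors the Stack Exchange API response items."""
    counts = Counter(tag for q in questions for tag in q.get("tags", []))
    return counts.most_common(n)

# Hypothetical sample of one month's questions:
sample = [
    {"tags": ["python", "airflow"]},
    {"tags": ["python", "pandas"]},
    {"tags": ["python"]},
]
print(top_tags(sample, 2))  # → [('python', 3), ('airflow', 1)]
```

In the pipeline, the extract task would page through the API for the current month and this counting would run as the transform before loading the results.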
This project presents a robust data pipeline using Apache Airflow for orchestration, Apache Kafka for real-time data streaming, and MongoDB for data storage. It automates the process of web scraping to collect large companies' data, transforms and processes this data, and then stores it efficiently.
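The transform stage in such a pipeline usually normalizes a raw scraped record into the document shape the Kafka consumer inserts into MongoDB. The sketch below illustrates that step only; the field names and cleaning rules are assumptions, not the repo's actual schema.

```python
from datetime import datetime, timezone

def to_document(raw):
    """Normalize a scraped company record into a MongoDB-ready document:
    strip stray whitespace, parse the revenue string to a float, and
    stamp the ingestion time. All field names here are hypothetical."""
    return {
        "name": raw["name"].strip(),
        "revenue_usd": float(raw["revenue"].replace("$", "").replace(",", "")),
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }

doc = to_document({"name": "  Acme Corp ", "revenue": "$1,250,000"})
print(doc["name"], doc["revenue_usd"])  # → Acme Corp 1250000.0
```

In the full pipeline, a Kafka consumer would call something like this on each streamed message before handing the document to `pymongo`'s `insert_one`.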
This project contains some sample DAGs for Airflow XComs, simple Bash operations, and table creation in Google Bigtable using Google Cloud operators; the last DAG uploads daily COVID data to Google Bigtable.