Utilizing Airflow's built-in functionalities to create a reusable ETL pipeline. Source data resides in an S3 bucket, the pipeline includes data quality checks, and the data is processed within AWS Redshift.
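A minimal sketch of what such a DAG could look like, assuming Airflow 2.x with the Amazon provider installed; the bucket, key layout, connection IDs, and `staging_events` table are hypothetical placeholders, and the row-count check stands in for the project's data quality checks:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook


def check_row_count(table, **_):
    """Fail the task if the target table is empty (simple data quality check)."""
    hook = PostgresHook(postgres_conn_id="redshift_default")
    records = hook.get_first(f"SELECT COUNT(*) FROM {table}")
    if not records or records[0] < 1:
        raise ValueError(f"Data quality check failed: {table} is empty")


with DAG(
    dag_id="s3_to_redshift_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # COPY the raw JSON files from S3 into a Redshift staging table.
    load_events = S3ToRedshiftOperator(
        task_id="load_events",
        s3_bucket="my-source-bucket",        # hypothetical bucket
        s3_key="events/{{ ds }}/",           # hypothetical key layout
        schema="public",
        table="staging_events",              # hypothetical table
        copy_options=["FORMAT AS JSON 'auto'"],
        aws_conn_id="aws_default",
        redshift_conn_id="redshift_default",
    )

    # Run the data quality check only after the load has finished.
    quality_check = PythonOperator(
        task_id="quality_check",
        python_callable=check_row_count,
        op_kwargs={"table": "staging_events"},
    )

    load_events >> quality_check
```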
This project focuses on utilizing Apache Airflow to orchestrate an ETL (Extract, Transform, Load) process using data from the Stack Overflow API. The primary objective is to determine the most prominent tags on Stack Overflow for the current month.
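The extract step could look roughly like the sketch below, which queries the public Stack Exchange API for questions asked since the start of the current month and tallies their tags; the page limits and the top-10 cutoff are assumptions for illustration:

```python
from collections import Counter
from datetime import datetime, timezone

import requests

API_URL = "https://api.stackexchange.com/2.3/questions"


def top_tags_for_current_month(page_size=100, pages=5):
    """Count tags on Stack Overflow questions asked this month."""
    now = datetime.now(timezone.utc)
    month_start = datetime(now.year, now.month, 1, tzinfo=timezone.utc)
    counts = Counter()
    for page in range(1, pages + 1):
        resp = requests.get(
            API_URL,
            params={
                "site": "stackoverflow",
                "fromdate": int(month_start.timestamp()),
                "order": "desc",
                "sort": "creation",
                "pagesize": page_size,
                "page": page,
            },
            timeout=30,
        )
        resp.raise_for_status()
        payload = resp.json()
        for question in payload.get("items", []):
            counts.update(question.get("tags", []))
        if not payload.get("has_more"):
            break
    return counts.most_common(10)


if __name__ == "__main__":
    for tag, count in top_tags_for_current_month():
        print(f"{tag}: {count}")
```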
Data Pipeline Analytics Platform is an end-to-end, generic Big Data pipeline. It involves the following tech stack: AWS S3, AWS Redshift, an AWS EMR cluster, Apache Spark, and Apache Airflow.
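As an illustration of the Spark-on-EMR stage, a small PySpark job might read raw JSON from S3, aggregate it, and write curated Parquet back to S3 for a later Redshift COPY; the bucket paths and column names here are hypothetical:

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical S3 locations; an EMR cluster with S3 access is assumed.
RAW_PATH = "s3://my-raw-bucket/events/"
CURATED_PATH = "s3://my-curated-bucket/daily_counts/"

spark = SparkSession.builder.appName("daily_event_counts").getOrCreate()

# Read the raw JSON events dropped into S3.
events = spark.read.json(RAW_PATH)

# Aggregate event counts per day and event type (assumed columns).
daily_counts = (
    events
    .withColumn("event_date", F.to_date("event_timestamp"))
    .groupBy("event_date", "event_type")
    .count()
)

# Write partitioned Parquet back to S3; a downstream COPY loads it into Redshift.
daily_counts.write.mode("overwrite").partitionBy("event_date").parquet(CURATED_PATH)

spark.stop()
```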
This project demonstrates how to build an ELT pipeline using dbt, Snowflake, and Airflow. Follow the steps below to set up your environment, configure dbt, create models, macros, and tests, and deploy on Airflow.
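One common way to deploy such a project on Airflow is to wrap `dbt run` and `dbt test` in a small DAG; this is a sketch only, assuming dbt and the Snowflake adapter are installed in the Airflow environment, and the project path is a hypothetical placeholder:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical dbt project location inside the Airflow deployment.
DBT_DIR = "/usr/local/airflow/dbt/my_project"

with DAG(
    dag_id="dbt_snowflake_elt",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Build the dbt models against Snowflake.
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command=f"dbt run --project-dir {DBT_DIR} --profiles-dir {DBT_DIR}",
    )

    # Run the dbt tests after the models have been built.
    dbt_test = BashOperator(
        task_id="dbt_test",
        bash_command=f"dbt test --project-dir {DBT_DIR} --profiles-dir {DBT_DIR}",
    )

    dbt_run >> dbt_test
```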
A Python script extracts data from Zillow and stores it in an initial S3 bucket. Then, Lambda functions handle the flow: copying the data to a processing bucket and transforming it from JSON to CSV format. The final CSV data resides in another S3 bucket, ready to be loaded into Amazon Redshift for in-depth analysis. QuickSight is used for visualizations.
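The JSON-to-CSV transformation step could be handled by a Lambda like the sketch below, triggered by an S3 put event; the target bucket name and the assumption that each object is a JSON array of listing records are illustrative, not taken from the project:

```python
import csv
import io
import json

import boto3

s3 = boto3.client("s3")

# Hypothetical destination bucket for the transformed CSV files.
TARGET_BUCKET = "zillow-transformed-csv"


def lambda_handler(event, context):
    """Triggered by an S3 put: read the JSON object, flatten it to CSV,
    and write the result to the target bucket."""
    record = event["Records"][0]["s3"]
    source_bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    obj = s3.get_object(Bucket=source_bucket, Key=key)
    listings = json.loads(obj["Body"].read())  # assumed: a JSON array of listings

    # Write all listing fields out as CSV rows.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=sorted(listings[0].keys()))
    writer.writeheader()
    writer.writerows(listings)

    csv_key = key.rsplit(".", 1)[0] + ".csv"
    s3.put_object(Bucket=TARGET_BUCKET, Key=csv_key, Body=buffer.getvalue())

    return {"statusCode": 200, "body": f"Wrote {csv_key} to {TARGET_BUCKET}"}
```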