aws-redshift

The goal of this repository is to provide good and clear examples of Amazon CLI commands together with Amazon CDK to easily create any AWS services and resources

python aws-s3 python3 aws-sqs aws-ec2 aws-iam amazon-web-services aws-rds aws-vpc amazon-aws aws-codedeploy aws-route53 aws-elasticsearch aws-redshift aws-parameter-store aws-load-balancer aws-security-group aws-systemmanager amazon-cdk

Updated Dec 22, 2019
Python

eduardofb / redshift-remove-duplicates

Star

Remove duplicates entries from a Redshift cluster

remove-duplicates redshift aws-redshift

Updated May 15, 2017
Python

mcamarad / ETL_music_streaming_app

Star

aws data-science database etl aws-s3 postgresql rdbms aws-redshift dataengineering etl-pipeline

Updated Jul 7, 2020
Python

sanogotech / aws-data-pipeline

Star

A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from locally hosted Airflow containers. The end product is a Superset dashboard and a Postgres database, hosted on an EC2 instance at this address (powered down):

python docker aws airflow terraform aws-redshift apache-superset

Updated May 14, 2022
Python

MartinKalema / mysql-kafka-s3-redshift-data-pipeline

Star

ETL pipeline

streaming kafka aws-s3 mysql-database aws-redshift

Updated Apr 9, 2024
Python

aluxh / carpark-sg-data-pipeline

Star

The goal of this project is to build data pipeline for gathering real-time carpark lots availability and weather datasets from Data.gov.sg. These data are extracted via API, and stored them in the S3 bucket before ingesting them into the Data Warehouse.

airflow redshift data-pipeline aws-redshift carpark carpark-sg carpark-availability

Updated Sep 1, 2019
Python

rigganni / AWS-RedShift-Music-Analysis

Star

Load data from the Million Song Dataset into AWS RedShift.

aws etl redshift aws-redshift dimensional-model

Updated May 8, 2020
Python

lkellermann / sparkify-dw

Star

Udacity Data Engineering Nanodegree Project #3.

aws udacity boto3 datawarehouse aws-redshift

Updated Apr 26, 2021
Python

jomavera / dataPipeline

Star

ETL pipeline with AWS Redshift orchestrated with Airflow

data-warehouse data-engineering data-pipeline aws-redshift apache-airflow

Updated Mar 10, 2021
Python

FutureTroglodyte / udacity-nd027-data_pipelines

Star

Udacity Data Engeneering Nanodegree Program - My Submission of Project: Data Pipelines

airflow aws-s3 data-engineering data-pipelines aws-redshift etl-pipeline airflow-operators

Updated Apr 12, 2021
Python

ManjinderSingh3 / ETL-Operations-using-AWS-Glue-and-Redshift

Star

Used AWS Glue to perform ETL operations and load resultant data to AWS Redshift. In the second phase used AWS CloudWatch rules and LAMBDA to automatically run GLUE Jobs

aws aws-lambda aws-redshift etl-pipeline aws-glue

Updated Feb 9, 2022
Python

marcus-repo / etl-redshift

Star

ETL Pipeline extracts JSON files from AWS S3 bucket and inserts these into an AWS Redshift Cluster.

python aws sql aws-s3 aws-redshift pyscopg2

Updated Feb 12, 2021
Python

sagardua297 / udacity-data-engineering-nd

Star

Data Pipeline Analytics Platform is an end-to-end generic Big Data pipeline. Involves following tech stack: AWS S3, AWS Redshift, AWS EMR Cluster, Apache Spark, Apache Airflow.