👷 Spark-Processing-AWS

In this project, I set up and build a big data processing pipeline using Apache Spark integrated with several AWS services (S3, EMR, EC2, VPC, IAM, and Redshift), with Terraform to provision the infrastructure and Apache Airflow to orchestrate the workflows.
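As one concrete piece of such a pipeline, the final hop, loading processed Parquet output from S3 into Redshift, is typically a `COPY` statement. A minimal sketch in Python; the table name, bucket path, and IAM role ARN below are placeholders, not this repository's actual values:

```python
def build_redshift_copy(table: str, s3_path: str, iam_role_arn: str) -> str:
    """Build a Redshift COPY statement that loads Parquet files from S3.

    All arguments are illustrative placeholders; a real pipeline would
    pass its own table, bucket path, and IAM role ARN.
    """
    return (
        f"COPY {table} "
        f"FROM '{s3_path}' "
        f"IAM_ROLE '{iam_role_arn}' "
        f"FORMAT AS PARQUET;"
    )

sql = build_redshift_copy(
    "analytics.trips",
    "s3://example-bucket/processed/trips/",
    "arn:aws:iam::123456789012:role/example-redshift-role",
)
print(sql)
```

The statement would then be executed against the Redshift cluster over a standard database connection.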

🔦 About Project

📦 Technologies

  • S3
  • EMR
  • EC2
  • Airflow
  • Redshift
  • Terraform
  • Spark
  • VPC
  • IAM
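These services hang together roughly as follows: Terraform provisions the VPC, IAM roles, S3 buckets, and the Redshift cluster, while EMR spins up EC2 instances to run the Spark jobs. As a hedged sketch of the kind of EMR job-flow configuration involved (release label, instance types, counts, and bucket names are illustrative, not this repository's values):

```python
# Illustrative EMR job-flow configuration; every name, version, and
# count here is a placeholder rather than this repository's real setup.
emr_cluster_config = {
    "Name": "spark-processing-cluster",
    "ReleaseLabel": "emr-6.15.0",            # EMR release bundling Spark
    "Applications": [{"Name": "Spark"}],
    "Instances": {
        "InstanceGroups": [
            {
                "Name": "Primary node",
                "InstanceRole": "MASTER",
                "InstanceType": "m5.xlarge",
                "InstanceCount": 1,
            },
            {
                "Name": "Core nodes",
                "InstanceRole": "CORE",
                "InstanceType": "m5.xlarge",
                "InstanceCount": 2,
            },
        ],
        # Let the cluster terminate itself once all steps finish.
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    "JobFlowRole": "EMR_EC2_DefaultRole",    # EC2 instance profile (IAM)
    "ServiceRole": "EMR_DefaultRole",        # EMR service role (IAM)
    "LogUri": "s3://example-bucket/emr-logs/",
}

# With boto3 installed and AWS credentials configured, the cluster would
# be launched with: boto3.client("emr").run_job_flow(**emr_cluster_config)
```

Keeping `KeepJobFlowAliveWhenNoSteps` false makes the cluster transient, so EC2 costs stop as soon as the Spark steps complete.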

🦄 Features

๐Ÿ‘ฉ๐Ÿฝโ€๐Ÿณ The Process

📚 What I Learned

💭 How can it be improved?

🚦 Running the Project
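On a pipeline like this, running the Spark job usually means adding a `spark-submit` step to the EMR cluster (Airflow's EMR operators do the same under the hood). A hedged sketch; the script location and cluster id are hypothetical, not this repository's:

```python
# Illustrative EMR step that runs spark-submit on the cluster; the
# script path and deploy mode are placeholders, not the repo's values.
spark_step = {
    "Name": "run-spark-processing-job",
    "ActionOnFailure": "TERMINATE_CLUSTER",
    "HadoopJarStep": {
        "Jar": "command-runner.jar",  # EMR's wrapper for shell commands
        "Args": [
            "spark-submit",
            "--deploy-mode", "cluster",
            "s3://example-bucket/scripts/process_data.py",
        ],
    },
}

# With boto3, the step would be attached to a running cluster like so:
#   emr = boto3.client("emr")
#   emr.add_job_flow_steps(JobFlowId="j-EXAMPLE12345", Steps=[spark_step])
```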
