PROJECT | AUTOMATE ETL JOBS ON AWS

Trigger a Glue crawler and a Glue ETL job every time a file is uploaded to an S3 bucket, with SNS email notifications

Introduction

Data engineering teams often spend a considerable amount of time on routine, repetitive tasks. This project attempts to remedy that. We set up Glue crawlers that run every time a file is added to a given S3 bucket. The crawler crawls the new file and adds its data to the metadata catalog (the Glue Data Catalog), creating new tables or appending to existing ones. This makes the data available for querying with Athena and Redshift Spectrum. We also run a Glue Extract-Transform-Load (ETL) job, built in Glue Studio, to clean the data before loading it into the Data Catalog tables.
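The upload-triggered crawl described above could be sketched as a Lambda handler like the one below. This is a minimal sketch, not the repository's actual notebook code: it assumes the bucket sends S3 event notifications directly to Lambda (the `Records` payload shape), and the crawler name `my-s3-crawler`, the `CRAWLER_NAME` environment variable, and the optional `glue_client` parameter are hypothetical names introduced for illustration.

```python
import os
from urllib.parse import unquote_plus

# Hypothetical crawler name; the source does not say what the crawler is called.
CRAWLER_NAME = os.environ.get("CRAWLER_NAME", "my-s3-crawler")


def uploaded_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications.
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def lambda_handler(event, context, glue_client=None):
    """Start the crawler when the event reports at least one uploaded object."""
    objects = uploaded_objects(event)
    if not objects:
        return {"started": False, "objects": []}
    if glue_client is None:
        import boto3  # provided by the AWS Lambda Python runtime

        glue_client = boto3.client("glue")
    glue_client.start_crawler(Name=CRAWLER_NAME)
    return {"started": True, "objects": objects}
```

Accepting the client as an optional argument is only a convenience for local testing with a stub; Lambda itself calls the handler with `event` and `context` alone.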

AWS Services used

S3
Glue
Simple Notification Service (SNS)
EventBridge
Lambda
Athena

Improvements

Set the S3 path dynamically so that the crawler only goes through the folder containing the new file instead of crawling the entire bucket
Include the crawler name in the EventBridge rules
Improve the format of the message sent to SNS from Lambda
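One possible shape for these improvements is sketched below as three small helpers. It is an illustration under stated assumptions, not the project's implementation: the function names are hypothetical, and the event pattern assumes the rule listens for Glue's "Glue Crawler State Change" events with `crawlerName` and `state` fields in the event detail.

```python
import posixpath


def folder_target(bucket, key):
    """Return the S3 path of the folder that holds the uploaded object."""
    prefix = posixpath.dirname(key)
    return f"s3://{bucket}/{prefix}/" if prefix else f"s3://{bucket}/"


def crawler_event_pattern(crawler_name):
    """Build an EventBridge pattern matching successful runs of one crawler."""
    return {
        "source": ["aws.glue"],
        "detail-type": ["Glue Crawler State Change"],
        "detail": {"crawlerName": [crawler_name], "state": ["Succeeded"]},
    }


def format_sns_message(detail):
    """Build a readable subject and body from an EventBridge event detail."""
    name = detail.get("crawlerName", "unknown crawler")
    state = detail.get("state", "UNKNOWN")
    subject = f"Glue crawler {name}: {state}"
    # One "field: value" line per detail entry, in a stable order.
    body = "\n".join(f"{k}: {v}" for k, v in sorted(detail.items()))
    return subject, body
```

The path from `folder_target` could be passed to the Glue client's `update_crawler(Name=..., Targets={"S3Targets": [{"Path": path}]})` before starting the crawler, and the subject and body could be passed to the SNS client's `publish(TopicArn=..., Subject=..., Message=...)`.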

Folders and files

README.md
lambda_start_glue_crawler .ipynb
lambda_start_glue_etl_job.ipynb
lambda_write_to_sns_topic.ipynb
upload_parquet_file_to_S3.ipynb

kimerajoseph/automate_etl_jobs
