This repository is for demonstrating the capability to do SQL-based UPDATES, DELETES, and INSERTS directly in the Data Lake using Amazon S3, AWS Glue and Delta Lake.

klescosia/aws-glue-delta-lake

How to do UPSERTS, DELETES and INSERTS into your Data Lake

This repository demonstrates how to run SQL-based UPDATES, DELETES, and INSERTS directly against a data lake using Amazon S3, AWS Glue, and Delta Lake.


Architecture Diagram

[Figure: Kyle-Escosia-AWS-Glue-Delta-Lake-Diagram]

athena-query.sql

Generates the schema for the Delta Table

UPSERTS.py

This code takes a Delta Lake table and does an UPSERT operation using Spark SQL.
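As a rough illustration of what an UPSERT through Spark SQL can look like, here is a minimal sketch built around Delta Lake's `MERGE INTO` statement. It is not the contents of UPSERTS.py; the function names, the S3 path, the `updates` view, and the `id` key column are all hypothetical, and `spark` is assumed to be a SparkSession already configured for Delta Lake (for example inside a Glue job).

```python
def build_merge_sql(target_path, source_view, key_column):
    """Build a Delta Lake MERGE statement that upserts source_view into the
    Delta table stored at target_path, matching rows on key_column.

    All three arguments are hypothetical placeholders and must come from
    trusted configuration, since they are interpolated into the SQL text.
    """
    return (
        f"MERGE INTO delta.`{target_path}` AS target "
        f"USING {source_view} AS source "
        f"ON target.{key_column} = source.{key_column} "
        "WHEN MATCHED THEN UPDATE SET * "      # overwrite rows that already exist
        "WHEN NOT MATCHED THEN INSERT *"       # add rows that are new
    )


def run_upsert(spark, target_path, source_view, key_column):
    """Run the MERGE through an existing SparkSession (assumed to have the
    Delta Lake extensions enabled)."""
    return spark.sql(build_merge_sql(target_path, source_view, key_column))


if __name__ == "__main__":
    # Hypothetical bucket, view, and key names, printed for inspection only.
    print(build_merge_sql("s3://my-bucket/delta/customers", "updates", "id"))
```

`UPDATE SET *` and `INSERT *` assume the source view has the same columns as the target table; when the schemas differ, the columns would need to be listed explicitly.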

DELETES.py

This code takes a Delta Lake table and does a DELETE operation using Spark SQL.
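A DELETE through Spark SQL might look roughly like the sketch below, which uses Delta Lake's `DELETE FROM ... WHERE` statement. It is an illustration rather than the contents of DELETES.py; the function names, the S3 path, and the predicate are hypothetical, and `spark` is assumed to be a Delta-enabled SparkSession.

```python
def build_delete_sql(table_path, predicate):
    """Build a Delta Lake DELETE statement for the table stored at table_path.

    An empty predicate is rejected so that a missing filter cannot silently
    turn into "delete every row". Both arguments are hypothetical and must
    come from trusted configuration, since they are interpolated into SQL.
    """
    if not predicate or not predicate.strip():
        raise ValueError("predicate must not be empty")
    return f"DELETE FROM delta.`{table_path}` WHERE {predicate.strip()}"


def run_delete(spark, table_path, predicate):
    """Run the DELETE through an existing SparkSession (assumed to have the
    Delta Lake extensions enabled)."""
    return spark.sql(build_delete_sql(table_path, predicate))


if __name__ == "__main__":
    # Hypothetical table path and filter, printed for inspection only.
    print(build_delete_sql("s3://my-bucket/delta/customers", "status = 'inactive'"))
```

Rejecting an empty predicate is a design choice of this sketch, not a Delta Lake requirement: `DELETE FROM` without a `WHERE` clause is valid SQL and removes all rows.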

INSERTS.py

This code takes a Delta Lake table and does an INSERT operation using Spark SQL.
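For the INSERT case, one possible shape is an `INSERT INTO ... SELECT` that appends the rows of a temporary view to the Delta table. The sketch below is an illustration, not the contents of INSERTS.py; the function names, the S3 path, and the `new_rows` view are hypothetical, and `spark` is assumed to be a Delta-enabled SparkSession.

```python
def build_insert_sql(table_path, source_view):
    """Build an INSERT statement that appends every row of source_view to the
    Delta table stored at table_path.

    Both arguments are hypothetical placeholders and must come from trusted
    configuration, since they are interpolated into the SQL text.
    """
    return f"INSERT INTO delta.`{table_path}` SELECT * FROM {source_view}"


def run_insert(spark, table_path, source_view):
    """Run the INSERT through an existing SparkSession (assumed to have the
    Delta Lake extensions enabled)."""
    return spark.sql(build_insert_sql(table_path, source_view))


if __name__ == "__main__":
    # Hypothetical table path and view name, printed for inspection only.
    print(build_insert_sql("s3://my-bucket/delta/customers", "new_rows"))
```

`SELECT *` assumes the view's columns line up with the table's schema. Unlike the UPSERT case, a plain INSERT does not check for existing keys, so running it twice with the same data appends the rows twice.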


Full documentation of the functions can be found here, and a more detailed version with examples can be found here.
