This repository contains two solutions: a one-time copy of data from a MongoDB collection into an Apache Iceberg table in S3, and a process that uses the MongoDB change stream to keep the Iceberg copy of the data up to date.
The architecture diagram below depicts the solution.
The repository is broken down into several sections. Each section has its own README that explains its components.
- 1_Sample_MongoDB_Data: steps 1 and 2 in the architecture diagram
- 2_Glue_Iceberg_Initial_Load: step 3 in the architecture diagram
- 3_Sample_MongoDB_Change_Stream_Data: steps 4 and 5 in the architecture diagram
- 4_Glue_Iceberg_Change_Stream: step 6 in the architecture diagram
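
To illustrate the idea behind keeping a copy up to date from a change stream, here is a minimal sketch in plain Python. It is not the Glue job from this repository: the function name `apply_change_events` and the in-memory dict standing in for the Iceberg table are assumptions made for illustration only. The event fields (`operationType`, `documentKey`, `fullDocument`) follow the shape of MongoDB change events.

```python
from typing import Any, Dict, Iterable

Document = Dict[str, Any]


def apply_change_events(
    table: Dict[Any, Document], events: Iterable[Document]
) -> Dict[Any, Document]:
    """Apply MongoDB-style change events to a copy keyed by _id.

    `table` is a hypothetical in-memory stand-in for the Iceberg copy.
    """
    for event in events:
        op = event["operationType"]
        key = event["documentKey"]["_id"]
        if op in ("insert", "update", "replace"):
            # Update events only carry fullDocument when the stream is
            # opened with full_document="updateLookup"; skip otherwise.
            doc = event.get("fullDocument")
            if doc is not None:
                table[key] = doc  # upsert the latest version
        elif op == "delete":
            table.pop(key, None)  # remove the row if present
    return table


# Usage: start from an initial load, then replay change events in order.
initial_load = {1: {"_id": 1, "name": "a"}, 2: {"_id": 2, "name": "b"}}
events = [
    {"operationType": "insert", "documentKey": {"_id": 3},
     "fullDocument": {"_id": 3, "name": "c"}},
    {"operationType": "update", "documentKey": {"_id": 1},
     "fullDocument": {"_id": 1, "name": "a2"}},
    {"operationType": "delete", "documentKey": {"_id": 2}},
]
current = apply_change_events(dict(initial_load), events)
```

In a real pipeline the upsert and delete branches would typically translate into a merge against the Iceberg table rather than dict operations, and events must be applied in the order they were emitted.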