Hello there, here is the final project for the Data Engineering Nanodegree program. My goal is to find insights about immigration to the US over 2016, or at least most of that year. I decided to use dimensional modelling to add more value to the factless fact table by adding more dimensions. The focus is on the US I-94 form, to see how many people travelled to the country, where they came from, when they travelled, and so on.
- Technologies:
- Apache Spark.
- Python.
- S3 Buckets.
- AWS EMR Cluster.
- AWS Redshift Data Warehouse.
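
To make the dimensional modelling idea from the introduction more concrete, here is a minimal sketch in plain Python (no Spark or Redshift needed). It is not the project's actual schema: the table names (`fact_arrival`, `dim_country`, `dim_date`), the column names, and the sample rows are all hypothetical and made up for illustration. It shows a factless fact table that holds only foreign keys, and how counting its rows per dimension attribute answers questions such as "how many people arrived from each country".

```python
from collections import Counter

# Hypothetical country dimension: surrogate key -> descriptive attributes.
dim_country = {
    1: {"country_name": "Country A"},
    2: {"country_name": "Country B"},
}

# Hypothetical date dimension: date key -> calendar attributes.
dim_date = {
    20160101: {"year": 2016, "month": 1},
    20160215: {"year": 2016, "month": 2},
}

# Factless fact table: one row per arrival event.
# It stores only foreign keys to the dimensions, with no numeric measures.
fact_arrival = [
    {"country_key": 1, "date_key": 20160101},
    {"country_key": 1, "date_key": 20160215},
    {"country_key": 2, "date_key": 20160215},
]


def arrivals_by(fact_rows, dimension, key_column, attribute):
    """Count fact rows grouped by one attribute of a dimension."""
    counts = Counter()
    for row in fact_rows:
        # Look up the dimension row through the foreign key, then group by the attribute.
        counts[dimension[row[key_column]][attribute]] += 1
    return dict(counts)


arrivals_per_country = arrivals_by(fact_arrival, dim_country, "country_key", "country_name")
arrivals_per_month = arrivals_by(fact_arrival, dim_date, "date_key", "month")

print(arrivals_per_country)
print(arrivals_per_month)
```

Each new dimension added this way gives one more axis to slice the same arrival events by, without changing the fact rows themselves.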