Built a distributed system which completes several objectives with given data to generate loan reports using Amazon Web Services, Apache Spark, Java and Python.
-
Updated
Nov 3, 2020 - Java
Built a distributed system which completes several objectives with given data to generate loan reports using Amazon Web Services, Apache Spark, Java and Python.
This BigData study intends to identify the most revenue-generating Taxi zones in New York City for the year 2019. Three MapReduce algorithms were developed and their performance was analyzed on different size of input datasets and different size clusters in EMR.
Implemented the PageRank algorithm in Hadoop MapReduce framework and Spark.
Add a description, image, and links to the emr-cluster topic page so that developers can more easily learn about it.
To associate your repository with the emr-cluster topic, visit your repo's landing page and select "manage topics."