A Spark application to merge small files on Hadoop
Updated Sep 7, 2020 - Scala
A fast, scalable, and distributed community detection algorithm based on the CEIL scoring function.
A large-scale distributed data processing system for streaming analytics, built on the Hadoop ecosystem (Apache Spark and HDFS) and run in the cloud for real-time spatial analytics.
🌟 Spark Ceph Connector: an implementation of the Hadoop FileSystem API for Ceph
Apache Spark Analytics Queries Benchmarking