apache-spark
Apache Spark is an open-source, general-purpose distributed cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Here are 211 public repositories matching this topic...
Analysis of live-updated weather data (CSV) with Spark Streaming
Updated Mar 30, 2022 - Java
Homework for a Big Data Computing course
Updated Jul 17, 2022 - Java
Mirror of Apache Beam
Updated Sep 26, 2017 - Java
Big Data implementations using Hadoop MapReduce and Apache Spark
Updated Jan 22, 2019 - Java
This Java program finds the most frequent word in a given file using Apache Spark
Updated Aug 14, 2018 - Java
Projects related to Big Data technologies
Updated Apr 21, 2023 - Java
Spark-based DNA error correction on distributed-memory systems
Updated May 22, 2023 - Java
Repository for a big data class.
Updated Sep 23, 2020 - Java
EDASS - Easy Data Analysis and Statistics Software
Updated Feb 25, 2024 - Java
Apache Spark application that streams Twitter feeds to estimate politicians' social media approval based on sentiment analysis of tweets
Updated Jan 10, 2018 - Java
Updated May 13, 2018 - Java
Created by Matei Zaharia
Released May 26, 2014
- Followers: 417
- Repository: apache/spark
- Website: spark.apache.org
- Wikipedia