Apache Spark
Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Here are 1,873 public repositories matching this topic...
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Updated May 26, 2024 - Scala
Coolplay Spark: Spark source code analysis, Spark libraries, and more
Updated May 18, 2022 - Scala
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Updated May 24, 2024 - Scala
Simple and Distributed Machine Learning
Updated May 23, 2024 - Scala
State of the Art Natural Language Processing
Updated May 26, 2024 - Scala
High performance data store solution
Updated May 18, 2024 - Scala
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Updated May 8, 2024 - Scala
Shenzhen Metro big-data passenger flow analysis system 🚇🚄🌟
Updated May 16, 2024 - Scala
Apache Livy is an open-source REST interface for interacting with Apache Spark from anywhere.
Updated Apr 22, 2024 - Scala
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Updated May 24, 2024 - Scala
A collection of test cases and related reference material gathered while using Scala and Spark
Updated Feb 9, 2019 - Scala
TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an AutoML library for building modular, reusable, strongly typed machine learning workflows on Apache Spark with minimal hand-tuning
Updated Sep 29, 2023 - Scala
Sparkling Water provides H2O functionality inside a Spark cluster
Updated May 24, 2024 - Scala
[PROJECT IS NO LONGER MAINTAINED] Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization format.
Updated Mar 22, 2022 - Scala
Created by Matei Zaharia
Released May 26, 2014
- Followers: 416
- Repository: apache/spark
- Website: spark.apache.org
- Wikipedia