
Starred repositories
Apache Spark - A unified analytics engine for large-scale data processing
PredictionIO, a machine learning server for developers and ML engineers.
Removes large or troublesome blobs like git-filter-branch does, but faster, and written in Scala
A Git platform powered by Scala with easy installation, high extensibility & GitHub API compatibility
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
A machine learning package built for humans.
Fault-tolerant job scheduler for Mesos which handles dependencies and ISO 8601-based schedules
Breeze is/was a numerical processing library for Scala.
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
GeoTrellis is a geographic data processing engine for high performance applications.
Livy is an open source REST interface for interacting with Apache Spark from anywhere
A connector for Spark that allows reading and writing to/from Redis cluster
Simplifying robust end-to-end machine learning on Apache Spark.
Storehaus is a library that makes it easy to work with asynchronous key value stores
Distributed decision tree ensemble learning in Scala
An efficient updatable key-value store for Apache Spark
Spark Extension: ML transformers, SQL aggregations, etc. that are missing in Apache Spark
MLeap allows for easily putting Spark ML pipelines into production
A Scala library for Bayesian Inference and Probabilistic Programming