jerheff
  • Winston Salem NC

Starred repositories

35 stars written in Scala

Apache Spark - A unified analytics engine for large-scale data processing

Scala · 40,725 stars · 28,543 forks · Updated Mar 14, 2025

PredictionIO, a machine learning server for developers and ML engineers.

Scala · 12,528 stars · 1,927 forks · Updated Jan 9, 2021

Removes large or troublesome blobs like git-filter-branch does, but faster, and written in Scala

Scala · 11,363 stars · 552 forks · Updated Jan 19, 2025

A Git platform powered by Scala with easy installation, high extensibility & GitHub API compatibility

Scala · 9,216 stars · 1,254 forks · Updated Mar 12, 2025

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

Scala · 7,870 stars · 1,795 forks · Updated Mar 13, 2025

A machine learning package built for humans.

Scala · 4,793 stars · 562 forks · Updated Sep 23, 2024

Fault tolerant job scheduler for Mesos which handles dependencies and ISO8601 based schedules

Scala · 4,383 stars · 526 forks · Updated Jun 29, 2022

Breeze is/was a numerical processing library for Scala.

Scala · 3,451 stars · 690 forks · Updated Aug 29, 2024

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Scala · 3,375 stars · 553 forks · Updated Mar 5, 2025

Abstract Algebra for Scala

Scala · 2,296 stars · 347 forks · Updated Aug 19, 2024

GeoTrellis is a geographic data processing engine for high performance applications.

Scala · 1,353 stars · 364 forks · Updated Mar 13, 2025

CSV Data Source for Apache Spark 1.x

Scala · 1,052 stars · 442 forks · Updated Dec 13, 2018

Livy is an open source REST interface for interacting with Apache Spark from anywhere

Scala · 1,006 stars · 315 forks · Updated Oct 5, 2022

A connector for Spark that allows reading and writing to/from Redis cluster

Scala · 945 stars · 370 forks · Updated Oct 22, 2024

Mirror of Apache Toree (Incubating)

Scala · 741 stars · 224 forks · Updated Feb 20, 2025

Geo Spatial Data Analytics on Spark

Scala · 531 stars · 149 forks · Updated Aug 26, 2021

Simplifying robust end-to-end machine learning on Apache Spark.

Scala · 470 stars · 117 forks · Updated Apr 18, 2017

Storehaus is a library that makes it easy to work with asynchronous key value stores

Scala · 465 stars · 85 forks · Updated Jul 17, 2020

Distributed decision tree ensemble learning in Scala

Scala · 392 stars · 50 forks · Updated Jan 9, 2019

An efficient updatable key-value store for Apache Spark

Scala · 251 stars · 78 forks · Updated Mar 11, 2017

Bayesian Networks in Scala

Scala · 205 stars · 39 forks · Updated Nov 10, 2017

Distributed t-SNE via Apache Spark

Scala · 162 stars · 37 forks · Updated Dec 9, 2017

An API for Distributed Machine Learning

Scala · 154 stars · 59 forks · Updated Sep 22, 2016

Spark Extension: ML transformers, SQL aggregations, etc. that are missing in Apache Spark

Scala · 147 stars · 37 forks · Updated Jan 26, 2016

Big Spatial Data Processing using Spark

Scala · 145 stars · 56 forks · Updated Mar 7, 2017

An opinionated modern database interface

Scala · 111 stars · 4 forks · Updated Sep 9, 2019

A STAC/OGC API Features Web Service

Scala · 80 stars · 19 forks · Updated Feb 9, 2025

MLeap allows for easily putting Spark ML pipelines into production

Scala · 78 stars · 27 forks · Updated Oct 27, 2016

functionstest

Scala · 33 stars · 11 forks · Updated Oct 25, 2016

A Scala library for Bayesian Inference and Probabilistic Programming

Scala · 33 stars · 4 forks · Updated May 22, 2024