Pinned repositories

  1. scio

    A Scala API for Apache Beam and Google Cloud Dataflow.

    Scala 1.2k 191

  2. luigi

    Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with Hadoop support built in.

    Python 9.8k 1.7k

  3. annoy

    Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

    C++ 3.9k 486

  4. apollo

    Java libraries for writing composable microservices

    Java 1.3k 177

  5. docker-client

    A simple Docker client for the JVM

    Java 1k 415

  6. docker-gc

    Docker garbage collection of containers and images

    Shell 4.4k 407

  • Provides Spotify-specific TensorFlow helpers

    Python 41 8 Apache-2.0 3 issues need help Updated Aug 17, 2018
  • Python virtualenvs in Debian packages

    Python 1,191 128 GPL-2.0 Updated Aug 17, 2018
  • A Scala API for Apache Beam and Google Cloud Dataflow.

    Scala 1,170 191 Apache-2.0 30 issues need help Updated Aug 17, 2018
  • Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

    C++ 3,945 486 Apache-2.0 Updated Aug 17, 2018
  • Utilities for working with futures in Java 8

    Java 159 19 Updated Aug 17, 2018
  • Ephemeral Hadoop clusters using Google Cloud Platform

    Java 94 25 Apache-2.0 Updated Aug 17, 2018
  • A Scala feature transformation library for data science and machine learning

    Scala 206 32 Apache-2.0 7 issues need help Updated Aug 17, 2018
  • A lightweight workflow definition library

    Java 47 14 Apache-2.0 Updated Aug 17, 2018
  • "The path to execution", Styx is a service that schedules batch data processing jobs in Docker containers on Kubernetes.

    Java 110 18 Apache-2.0 Updated Aug 17, 2018
  • Common library for serving TensorFlow, XGBoost and scikit-learn models in production.

    Java 36 9 Apache-2.0 13 issues need help Updated Aug 17, 2018
  • DBeam extracts SQL tables using JDBC and Apache Beam

    Scala 42 12 Apache-2.0 3 issues need help Updated Aug 17, 2018
  • DNS record reconciliation for Gordon: Event-driven Cloud DNS

    Python 1 Apache-2.0 Updated Aug 17, 2018
  • Homebrew formula for open-source software developed by Spotify

    Ruby 28 18 Apache-2.0 Updated Aug 16, 2018
  • A tool for data sampling, data generation, and data diffing

    Scala 151 27 Apache-2.0 2 issues need help Updated Aug 16, 2018
  • A Java implementation of the FastForward metrics agent

    Java 41 23 Apache-2.0 Updated Aug 16, 2018
  • Scala Aggregators used for ML Model metrics monitoring

    Scala 24 5 Apache-2.0 Updated Aug 17, 2018
  • The Heroic Time Series Database

    Java 631 76 Apache-2.0 3 issues need help Updated Aug 16, 2018
  • Algebraic data types in Java.

    Java 74 5 Apache-2.0 Updated Aug 16, 2018
  • A ffwd-http-client for Java

    Java Apache-2.0 Updated Aug 16, 2018
  • Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with Hadoop support built in.

    Python 9,794 1,688 Apache-2.0 Updated Aug 16, 2018
  • Java library for working with Guava futures

    Java 70 15 Updated Aug 16, 2018
  • A Giter8 template for scio

    Scala 9 7 Apache-2.0 Updated Aug 16, 2018
  • Community-supported add-ons for Scio

    Scala 4 Apache-2.0 Updated Aug 16, 2018
  • A functional reactive framework for managing state evolution and side-effects.

    Java 342 13 Apache-2.0 Updated Aug 16, 2018
  • Android Architecture Blueprint sample app implementation using Mobius

    Java 20 8 Apache-2.0 Updated Aug 16, 2018
  • Code snippets for solving common big data problems on various platforms. Inspired by Rosetta Code.

    Scala 145 15 Apache-2.0 Updated Aug 15, 2018
  • Docker container orchestration platform

    Java 1,940 221 Apache-2.0 2 issues need help Updated Aug 15, 2018
  • GCP Plugin for Gordon: Event-driven Cloud DNS

    Python 7 2 Apache-2.0 Updated Aug 15, 2018
  • A simple Docker client for the JVM

    Java 1,010 415 Apache-2.0 9 issues need help Updated Aug 14, 2018
  • Apache Cassandra cluster orchestration tool for the command line

    Python 7 2 Apache-2.0 Updated Aug 14, 2018