data-commons

Collection of Open Source libraries that enable working with data at scale

Popular repositories

  1. prep-buddy Public

    A Scala / Java / Python library for cleansing, transforming and preparing large datasets for ML operations on Apache Spark.

    Scala 8 7

  2. protectr Public

    A Scala / Java / Python library for anonymization, encryption and redaction operations for large datasets on Apache Spark.

    Scala 2

  3. pyts Public

    A library for stats module in Python

    Python 1

  4. spark-timeseries Public

    Forked from sryza/spark-timeseries

    A library for time series analysis on Apache Spark

    Scala

  5. data-commons.github.io Public

    HTML 1

  6. ApacheWombat Public

    Forked from justinmclean/ApacheWombat

    Apache worked LICENSE and NOTICE example

    HTML

Repositories

Showing 10 of 11 repositories
  • prep-buddy Public

    A Scala / Java / Python library for cleansing, transforming and preparing large datasets for ML operations on Apache Spark.

    Scala 8 Apache-2.0 7 0 1 Updated Oct 13, 2020
  • protectr Public

    A Scala / Java / Python library for anonymization, encryption and redaction operations for large datasets on Apache Spark.

    Scala 2 Apache-2.0 0 0 0 Updated Sep 29, 2018
  • forecast Public Forked from robjhyndman/forecast

    forecast package for R

    R 0 348 0 0 Updated Aug 9, 2017
  • superset Public Forked from apache/superset

    Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application

    Python 0 Apache-2.0 13,918 0 0 Updated Jun 10, 2017
  • pyts Public

    A library for stats module in Python

    Python 1 MIT 0 0 0 Updated Mar 7, 2017
  • spark-setup Public

    A simple setup for Spark using Maven.

    Scala 0 0 0 0 Updated Dec 6, 2016
  • FiloDB Public Forked from filodb/FiloDB

    Distributed. Columnar. Versioned. Streaming. SQL.

    Scala 0 Apache-2.0 230 0 0 Updated Nov 15, 2016
  • gobblin Public Forked from apache/gobblin

    Universal data ingestion framework for Hadoop.

    Java 0 Apache-2.0 802 0 0 Updated Jul 1, 2016
  • ApacheWombat Public Forked from justinmclean/ApacheWombat

    Apache worked LICENSE and NOTICE example

    HTML 0 Apache-2.0 8 0 0 Updated Jun 23, 2016
  • data-commons.github.io Public
    HTML 0 1 0 0 Updated Jun 18, 2016
