@databrickslabs

Databricks Labs

Labs projects to accelerate use cases on the Databricks Unified Analytics Platform

Pinned

  1. Databricks Terraform Provider

    Go 180 116

  2. tempo Public

    The purpose of this project is to provide an API for manipulating time series on top of Apache Spark. Functionality includes featurization using lagged time values, rolling statistics (mean, avg, s…

    Jupyter Notebook 141 21

  3. dbx Public

    CLI tool for advanced Databricks jobs management.

    Python 61 21

Repositories

  • dbx Public

    CLI tool for advanced Databricks jobs management.

    Python 61 21 7 0 Updated Dec 2, 2021
  • terraform-provider-databricks Public

    Databricks Terraform Provider

    Go 180 Apache-2.0 116 8 3 Updated Dec 2, 2021
  • migrate Public

    Scripts to help customers with one-off migrations between Databricks workspaces.

    Python 52 41 14 1 Updated Dec 1, 2021
  • overwatch Public

    Capture deep metrics on one or all assets within a Databricks workspace

  • geoscan Public

    Geospatial clustering at massive scale

    Scala 45 4 1 3 Updated Nov 29, 2021
  • tempo Public

    The purpose of this project is to provide an API for manipulating time series on top of Apache Spark. Functionality includes featurization using lagged time values, rolling statistics (mean, avg, sum, count, etc.), AS OF joins, and downsampling & interpolation. This has been tested on TB-scale historical data and is unit tested for quality pur…

    Jupyter Notebook 141 21 21 6 Updated Nov 28, 2021
  • delta-oms Public

    DeltaOMS is a solution that helps to build a centralized repository of operational metrics/statistics for your Lakehouse built on Delta Lake

    Scala 10 0 4 0 Updated Nov 19, 2021
  • dbldatagen Public

    Generate relevant data quickly for your projects. The Databricks data generator can be used to generate large simulated / synthetic data sets for test, POCs, and other uses

    Python 33 1 5 0 Updated Nov 9, 2021
  • cicd-templates Public

    Manage your Databricks deployments and CI with code.

    Python 161 75 5 1 Updated Nov 5, 2021
  • databricks-sync Public

    An experimental tool to synchronize a source Databricks deployment with a target Databricks deployment.

    Python 19 4 20 3 Updated Oct 27, 2021