@mhamilton723 released this Jun 28, 2018 · 126 commits to master since this release

New Functionality:

  • Export trained LightGBM models for evaluation outside of Spark

  • LightGBM on Spark supports multiple cores per executor

  • CNTKModel works with multi-input, multi-output models of any CNTK
    datatype

  • Added Minibatching and Flattening transformers that bring flexible
    batching logic to pipelines, deep networks, and web clients

  • Added Benchmark test API for tracking model performance across
    versions

  • Added PartitionConsolidator function for aggregating streaming data
    onto one partition per executor (for use with connection-limited or
    rate-limited HTTP services)
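
As a rough illustration of the batching idea behind the Minibatching and
Flattening transformers, the sketch below groups rows into fixed-size
batches and then flattens them back into single rows. It is plain Python,
not the MMLSpark API: the names `minibatch` and `flatten` and the batch
size of 3 are assumptions made for this example only.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def minibatch(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group consecutive rows into lists of at most batch_size items.

    Hypothetical helper for illustration; not an MMLSpark function.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def flatten(batches: Iterable[Iterable[T]]) -> Iterator[T]:
    """Undo minibatching by yielding each row of each batch in order."""
    for batch in batches:
        yield from batch


rows = list(range(7))
batches = list(minibatch(rows, 3))   # the last batch may be smaller
restored = list(flatten(batches))    # same rows, original order
```

Batching before a costly step (a deep network evaluation or an HTTP call)
and flattening afterwards leaves the row order unchanged while amortizing
the per-call overhead.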

Updates and Improvements:

  • Updated to Spark 2.3.0

  • Added Databricks notebook tests to build system

  • CNTKModel uses significantly less memory

  • Simplified example notebooks

  • Simplified APIs for MMLSpark Serving

  • Simplified APIs for CNTK on Spark

  • LightGBM stability improvements

  • ComputeModelStatistics stability improvements

Acknowledgements:

We would like to acknowledge the external contributors who helped create
this version of MMLSpark (in order of commit history):