
@mhamilton723 mhamilton723 released this Mar 6, 2019 · 20 commits to master since this release

New Features

New Examples

Updates and Improvements


  • MMLSpark Image Schema now unified with Spark Core
  • Bug fixes for Text Analytics services
  • PageSplitter now propagates nulls
  • HTTP on Spark now supports socket and read timeouts
  • HyperparamBuilder Python wrappers now return idiomatic Python objects

LightGBM on Spark

  • Added multiclass classification
  • Added multiple types of boosting (Gradient Boosting Decision Tree, Random Forest, Dropouts meet Multiple Additive Regression Trees, Gradient-based One-Side Sampling)
  • Added Windows OS support/bugfix
  • LightGBM version bumped to 2.2.200
  • Added native support for categorical columns, either through Spark's StringIndexer, MMLSpark's ValueIndexer or list of indexes/slot names parameter
  • Added isUnbalance parameter for unbalanced datasets
  • Added boost from average parameter
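As a rough illustration of how the LightGBM options listed above might be combined from Python, here is a minimal sketch. The keyword names boostingType, boostFromAverage and categoricalSlotIndexes, the short boosting codes (gbdt, rf, dart, goss), and the import path `from mmlspark import LightGBMClassifier` are assumptions inferred from the wording of these notes and from LightGBM's own naming; only isUnbalance is named verbatim above. The helper functions are hypothetical, so check everything against the API documentation for this release before relying on it.

```python
# Short codes LightGBM uses for the four boosting types listed in these notes
# (assumed to be the values the Spark wrapper accepts).
BOOSTING_TYPES = ("gbdt", "rf", "dart", "goss")


def lightgbm_params(boosting="gbdt", unbalanced=False, categorical_slots=None):
    """Collect keyword arguments for a LightGBM classifier.

    Hypothetical helper: the keyword names below are assumptions, not
    confirmed by the release notes (apart from isUnbalance).
    """
    if boosting not in BOOSTING_TYPES:
        raise ValueError("unknown boosting type: %r" % (boosting,))
    params = {
        "boostingType": boosting,      # assumed parameter name
        "isUnbalance": unbalanced,     # named in the release notes
        "boostFromAverage": True,      # assumed parameter name
    }
    if categorical_slots:
        # Assumed name for the "list of indexes" option mentioned above.
        params["categoricalSlotIndexes"] = list(categorical_slots)
    return params


def build_classifier(**options):
    """Construct the estimator; needs MMLSpark and a Spark session at call time."""
    # Imported lazily so lightgbm_params() works without MMLSpark installed.
    from mmlspark import LightGBMClassifier  # assumed import path
    return LightGBMClassifier(**lightgbm_params(**options))
```

With MMLSpark on the cluster, something like build_classifier(boosting="dart", unbalanced=True, categorical_slots=[0, 3]) would be the intended usage. Keeping parameter assembly separate from construction means the option handling can be checked without a Spark session.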


We would like to acknowledge the developers and contributors, both internal and external, who helped create this version of MMLSpark.

  • Ilya Matiach, Casey Hong, Daniel Ciborowski, Karthik Rajendran, Dalitso Banda, Manon Knoertzer, Sudarshan Raghunathan, Anand Raman, Markus Cozowicz, The Microsoft AI Development Acceleration Program, Cognitive Search Team, Azure Search Team