
[SPARK-10409] [ML] Add Multilayer Perceptron Regression to ML #13617

Closed
wants to merge 19 commits

Conversation

JeremyNixon
Contributor

@JeremyNixon JeremyNixon commented Jun 11, 2016

What changes were proposed in this pull request?

This is a pull request adding support for Multilayer Perceptron Regression, the counterpart to the Multilayer Perceptron Classifier (hereafter MLPR and MLPC).

Outline

  1. Major Changes
  2. API Decisions
  3. Automating Scaling
  4. Naming
  5. Features
  6. Reference Resources
  7. Testing

Convenience Link to JIRA: https://issues.apache.org/jira/browse/SPARK-10409

Major Changes

There are two major differences between MLPR and MLPC. The first is the use of a linear (identity) activation function and a sum-of-squared-error cost function in the last layer of the network. The second is the requirement to scale the labels to [0,1] and back, which makes it easy for the weights to fit a value in the proper range.

Linear, Relu, Tanh Activations

In the forward pass, the linear activation passes the value from the fully connected layer straight through to become the network prediction. During weight adjustment in the backward pass, its derivative is one. All regression models will use the linear activation in the last layer, and so there is no option (as there is in MLPC) to use a different activation function or cost function in the last layer.

Relu and Tanh are activation functions that will benefit the accuracy and convergence speed of both MLPC and MLPR. Tanh zero-centers the data passed to neurons, which aids optimization. Relu avoids saturating the gradients.
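
For concreteness, here is a minimal sketch of the three activations as forward/derivative pairs, loosely modeled on the private ActivationFunction machinery in org.apache.spark.ml.ann; the trait shape and names below are assumptions for illustration, not the exact internal API:

```scala
// Sketch only: the trait shape is an assumption modeled on Spark's private
// ml.ann internals, not the actual internal API.
trait ActivationFunction extends Serializable {
  def eval: Double => Double        // forward pass
  def derivative: Double => Double  // backward pass, written in terms of the output z
}

object LinearFunction extends ActivationFunction {
  override def eval: Double => Double = x => x
  override def derivative: Double => Double = _ => 1.0  // identity: slope is one
}

object TanhFunction extends ActivationFunction {
  override def eval: Double => Double = x => math.tanh(x)
  override def derivative: Double => Double = z => 1.0 - z * z  // d/dx tanh(x) = 1 - z^2
}

object ReluFunction extends ActivationFunction {
  override def eval: Double => Double = x => math.max(0.0, x)
  override def derivative: Double => Double = z => if (z > 0.0) 1.0 else 0.0
}
```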

Automated Scaling

The data scaling is done through min-max scaling: the minimum label is subtracted from every value (giving a range of [0, max - min]), and the result is then divided by max - min to get a scale from 0 to 1. The corner case where max - min = 0 is resolved by omitting the division step.
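
As a concrete sketch of the transform and its inverse (plain Scala for illustration; not the PR's actual code):

```scala
// Min-max label scaling with the max == min corner case handled by
// omitting the division. Illustrative only; not this PR's implementation.
def scaleLabel(y: Double, min: Double, max: Double): Double = {
  val range = max - min
  if (range == 0.0) y - min        // all labels equal: shift only
  else (y - min) / range           // maps [min, max] onto [0, 1]
}

def unscaleLabel(yScaled: Double, min: Double, max: Double): Double = {
  val range = max - min
  if (range == 0.0) yScaled + min  // inverse of the shift-only case
  else yScaled * range + min       // maps [0, 1] back onto [min, max]
}
```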

Motivating Examples

[Two screenshots (2016-06-10) illustrating the motivating examples.]

API Decisions

The API is identical to MLPC except for softmaxOnTop: there is no option for the last-layer activation function or for the cost function (MLPC gives a choice between cross entropy and sum of squared error). This API has the user call MLPR with a set of layers that represents the topology of the network. The number of hidden layers is inferred from the layers parameter and is equal to the total number of layers - 2. Each hidden layer is a feedforward layer with a sigmoid activation function, up to the output layer with its linear activation.
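
For illustration, usage under the proposed API might look like the following; the class name and setters are assumed by analogy with MultilayerPerceptronClassifier rather than taken verbatim from this PR:

```scala
// Hypothetical usage of the proposed MLPR API; names assumed by analogy
// with MultilayerPerceptronClassifier.
import org.apache.spark.ml.regression.MultilayerPerceptronRegressor

val mlpr = new MultilayerPerceptronRegressor()
  .setLayers(Array(10, 5, 4, 1)) // 10 inputs, hidden layers of 5 and 4, 1 output
  .setMaxIter(100)
  .setSeed(42L)

val model = mlpr.fit(train)      // train: DataFrame with "features" and "label"
val predictions = model.transform(test)
```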

Input/Output Layer Argument

For MLPR, the output count will always be 1, and the number of inputs will always be equal to the number of features in the training dataset. One API choice could be to omit the input and output counts, have the user supply only the number of neurons in the hidden layers, and infer the input and output counts from the training data. At the very least, it makes sense to validate the user's layers parameter and display a helpful error message instead of the error from the data stacker that currently appears when an improper number of inputs or outputs is provided.
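
A minimal sketch of that validation (a hypothetical helper, assuming a Vector-typed "features" column; not code from this PR):

```scala
// Hypothetical validation helper; not code from this PR.
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.sql.DataFrame

def validateLayers(layers: Array[Int], dataset: DataFrame): Unit = {
  val numFeatures = dataset.select("features").head.getAs[Vector](0).size
  require(layers.length >= 2, "layers must include the input and output sizes.")
  require(layers.head == numFeatures,
    s"Input layer size ${layers.head} must equal the feature count $numFeatures.")
  require(layers.last == 1, s"MLPR output layer size must be 1, got ${layers.last}.")
}
```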

Modular API

It also would make sense for the API to be modular. A user will want the flexibility to use the linear layer at different points in the network (in MLPC as well), and will certainly want to be able to use new activation functions (tanh, relu) that are added to improve the performance of these models. That flexibility allows a user to tune the network to their dataset and will be particularly important for convnets or recurrent nets in the future. For the time being, we should decide on the best way to enable tanh and relu activations in this algorithm and in the classifier.

Automating Scaling

Current behavior is to automatically scale the data for the user. This makes a pass over the data. There are a few options. We could:

  1. Scale data internally, always.
  2. Scale data internally unless user provides min/max themselves.
  3. Create argument that turns internal scaling off/on. Default it to one or the other. Warn user if running on unscaled data.

There are also all the variants combining autoscaling or not, adding an argument or not, and warning the user or not.

The algorithm will run quite poorly on unscaled data, and so it makes sense to safeguard the user from this. But the same is true of data that is not centered and scaled, and we don't provide that automatically (though it may not be a bad idea as an option, given how sensitive this function, which is non-convex whenever there are hidden layers, can be to unscaled data). So there's a question of how much we hold the user's hand. I advocate for helpful defaults that can be overridden: scale automatically, give an option to run without scaling, and skip autoscaling when both the min and max are provided by the user, as sketched below.
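
A sketch of that default behavior (the user-supplied min/max parameters here are hypothetical, not part of this PR's API):

```scala
// "Autoscale unless the user supplies the range": skip the extra pass over
// the data when both bounds are given. Hypothetical parameter handling;
// assumes a DoubleType "label" column.
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, max => sqlMax, min => sqlMin}

def resolveLabelRange(dataset: DataFrame,
                      userMin: Option[Double],
                      userMax: Option[Double]): (Double, Double) =
  (userMin, userMax) match {
    case (Some(lo), Some(hi)) => (lo, hi)  // user-supplied: no data pass needed
    case _ =>
      val row = dataset.agg(sqlMin(col("label")), sqlMax(col("label"))).head
      (row.getDouble(0), row.getDouble(1))
  }
```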

Naming

Lastly, there's the naming of the multiLayerPerceptron / multiLayerPerceptronRegression functions in the FeedForwardTopology class in Layer.scala. For consistency it may make sense to rename multiLayerPerceptron to multiLayerPerceptronClassifier.

Features

There are a few features that have been checked:

  1. Integrates cleanly with pipeline API
  2. Model save/load is enabled
  3. Example data: the popular Boston housing dataset (scikit-learn's load_boston), scaled.
  4. Example code is included

Reference Resources

Christopher M. Bishop. Neural Networks for Pattern Recognition.
Patrick Nicolas. Scala for Machine Learning, Chapter 9.
Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning, Chapter 6.

How was this patch tested?

The unit tests follow MLPC with the addition of a test for gradient descent. There are unit tests for:

  1. L-BFGS behavior on toy data
  2. Gradient descent on toy data
  3. Input Validation
  4. Set Weights Parameter
  5. Save/Load Functionality working
  6. Read / Write returns a model with similar layers and weights
  7. Support for all Numeric Types


@MLnick
Contributor

MLnick commented Jun 11, 2016

jenkins add to whitelist

@SparkQA

SparkQA commented Jun 11, 2016

Test build #60347 has finished for PR 13617 at commit 46783ac.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 12, 2016

Test build #60354 has finished for PR 13617 at commit 138fd25.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member

Thanks for the PR (with a great description)! FYI review on a big new feature may need to wait until 2.0 QA is done.

Also, @mengxr @avulanov have discussed regression before, and IIRC it was unclear if there were many use cases relative to classification. It may make sense to focus on merging improvements to classification first, though please push back if you can cite some use cases. Thanks!

@JeremyNixon
Contributor Author

JeremyNixon commented Jun 20, 2016

@jkbradley, of course you’re welcome for the PR! I’d be happy to discuss a few use cases.

Among MLlib algorithms, MLPR has the unique ability to generalize to unseen feature values that have a nonlinear relationship with the output. Examples of learning relationships such as x^2 = y and x1*x2 = y show that its performance is substantially better on such problems whenever the features leave the range the model was trained on. These types of relationships show up in almost every important modeling problem.

Anyone looking to put a model into production who wants it to perform well on new data that isn't well represented in training needs an algorithm that can generalize to that range. Below is a classic example: two variables in the dataset interact to predict the outcome variable well. Within the range of the training data, MLPR's performance is on par with gradient boosting, random forests, and linear regression. But outside the range of the training data, the tree-based models are incapable of generalizing. Linear regression can only generalize simple linear relationships, so it forces the user to manually encode the complex relationships they want modeled. Because MLPR automatically models the target as a nonlinear function of the features with a structure that generalizes well, it outperforms every other algorithm in MLlib in a context like this.
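
As a sketch of how such an extrapolation experiment can be set up (illustrative code, not from this PR; assumes a spark-shell style SparkSession in scope): train on x in [-10, 10] and evaluate on x outside that range.

```scala
// Toy data for the x^2 = y extrapolation experiment. Illustrative only.
import org.apache.spark.ml.linalg.Vectors
import spark.implicits._

val train = (-100 to 100).map { i =>
  val x = i / 10.0
  (x * x, Vectors.dense(x))        // label = x^2, with x in [-10, 10]
}.toDF("label", "features")

val test = (110 to 200).map { i =>
  val x = i / 10.0                 // x in [11, 20]: outside the training range
  (x * x, Vectors.dense(x))
}.toDF("label", "features")
```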

[Screenshot (2016-06-19): predictions within vs. beyond the training range for MLPR, tree-based models, and linear regression.]

MLPR also shows consistent, robust performance on standard datasets. Below are examples of its performance relative to other models on Boston housing, Diabetes, and Iris (available here: http://scikit-learn.org/stable/datasets/#toy-datasets). All models use their default parameters (tanh activations and 50 neurons in a single hidden layer for MLPR) and are evaluated using RMSE. The train/test split is a random 70/30 split with no validation set. All data is scaled (mean and std) in preprocessing.

Boston Housing
NN - 3.87
DT - 4.17
RF - 3.23
GBT - 4.34
L2 LR - 4.4

Diabetes
NN - 51.3
DT - 65.2
RF - 55.6
GBT - 67.4
L2 LR - 52.24

Iris (Predicting Sepal Length)
NN - 0.376
DT - 0.451
RF - 0.386
GBT - 0.444
LR - 0.295

Together these properties (generalization to unseen feature values + consistent performance) make it a valuable algorithm to have in a production system that demands robust predictions. It learns a very different type of structure from the decision-tree-based models already in MLlib, and so has value as part of an ensemble whether or not it has the highest predictive score on the validation data. Situations where it does have the best predictive score are clear use cases.

You bring up improvements to classification as well. One downside to the current implementation of MLPC is that it forces users to use a sigmoid activation function, which has the unfortunate property of saturating the gradients. I provide support here for the more modern Tanh, Relu, and Linear activations, giving the user options that are zero-centered or avoid killing gradients, which can speed up convergence dramatically and improve accuracy. These benefits apply to both MLPR and MLPC, and should be included regardless of the decision on the MLPR API.

With a linear activation/layer and squared-error loss included, the library has all the functionality necessary to run MLPR. That functionality effectively exists in the library already: all of the critical components, from the topology to the optimizer to the activation functions, are already supported and maintained in MLlib. All we require is an API to call the algorithm.

That API could be as minimal as a single parameter to MLPC that replaces the last layer with a linear layer with squared error.

The downside to that is inconsistency with the rest of MLlib, and skimping on automated scaling would put users through a lot more work or risk them getting extremely poor results from misuse. The naming may also lead to confusion, where the user would be doing regression with an algorithm named for classification.

The currently proposed API is consistent with the rest of MLlib and with MLPC. It enables automated scaling and gives users a consistent experience, and so I recommend it. I can understand wanting the algorithm without having to support another API, so we can entertain more flexible options if that looks attractive.

I entirely understand w.r.t. 2.0 QA - I look forward to hearing the thoughts of @avulanov and @mengxr!

/**
 * Creates a multi-layer perceptron for regression
 *
 * @param layerSizes sizes of layers including input and output size
Member

Does it include output size? I think your output size is always 1?

Contributor

Yeah, I wanted to check - is there a reason we can't support n-dimensional output? It's less common, but the MLP can support it.

Contributor Author

That's right @viirya, the output is almost always 1. There are important corner cases with multiple outputs, like Overfeat's state-of-the-art object localization predictions (see section 4.2). Instead of training 1 linear regression model on top of the network's generated features, it trains 4 linear regression models that each predict a corner of a box that surrounds an object in the image. I think keeping the flexibility makes sense, but we'll need to add SPARK-9120.

@SparkQA

SparkQA commented Jun 20, 2016

Test build #60844 has finished for PR 13617 at commit 2dc114f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@MLnick
Contributor

MLnick commented Jun 20, 2016

As per @avulanov's comment on SPARK-15581, if we do indeed plan to add the "essentials" for DL to Spark (e.g. MLP, CNN, autoencoder), then MLPR seems like it should be in there too. Especially since this PR is mostly "wrapper" code to expose the DF-based API, example, and tests. The core changes are minimal and open up a powerful model for users - I guess what I am saying is the "risk vs reward" here seems good.

Also, FWIW this is in scikit-learn dev (http://scikit-learn.org/dev/modules/neural_networks_supervised.html)

@sethah
Contributor

sethah commented Jun 21, 2016

@JeremyNixon Does it make sense to pull the new activation functions out of this PR and into a standalone one? I know this PR depends on some of them, but since it's a WIP and the other change is smaller, it can likely be merged before this one.

Regarding the use cases, you mention that MLPR has advantages in generalizing and learning non-linear relationships (advantages over what is currently in MLlib, anyway). Linear regression can be used to model non-linear relationships with some feature engineering, though it can be cumbersome and is not always practical. MLPR should be better, but presumably takes longer to train. It would be nice to show an example where the output is non-linear in the features, comparing MLPR against LR with polynomial expansion in spark.ml on a large dataset, as sketched below. Comparing predictive performance and algorithm runtimes would help paint a clearer picture of the tradeoffs. At some point, the number of features makes modeling higher-order interactions with linear regression impractical, but I'm not sure exactly where that point is or how well MLPR performs on the same data.
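
A sketch of the linear regression side of that comparison (PolynomialExpansion and LinearRegression are existing spark.ml classes; the MLPR side would use the API proposed in this PR):

```scala
// LinearRegression over polynomially expanded features, for comparison
// against the proposed MLPR on the same train/test split.
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.feature.PolynomialExpansion
import org.apache.spark.ml.regression.LinearRegression

val poly = new PolynomialExpansion()
  .setInputCol("features")
  .setOutputCol("polyFeatures")
  .setDegree(2)                      // squares and pairwise interactions

val lr = new LinearRegression().setFeaturesCol("polyFeatures")
val lrModel = new Pipeline().setStages(Array(poly, lr)).fit(train)

val rmse = new RegressionEvaluator()
  .setMetricName("rmse")
  .evaluate(lrModel.transform(test))
```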

@jkbradley
Member

@JeremyNixon Thanks for your thoughts. I agree this should get in, but want to make sure the priorities are clear.

With respect to examples of improvements, I really meant either (a) research papers showing the importance or (b) industry use cases. One can always construct examples where an algorithm is helpful, and I agree that feature engineering is likely a good use case. But references are very helpful for guidance.

+1 for separating the activation functions out into another PR

About scaling: I'd say this should mimic LinearRegression's standardization API.
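
For reference, the existing param on LinearRegression that the MLPR scaling option could mirror:

```scala
// Existing spark.ml API; setStandardization defaults to true.
import org.apache.spark.ml.regression.LinearRegression

val lr = new LinearRegression()
  .setStandardization(true) // standardize features before fitting; the model
                            // is still returned on the original scale
```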

@avulanov
Contributor

@JeremyNixon Thank you for your PR! Actually, regression was in the original Multilayer perceptron PR: a226133. However, we removed it after discussion with @mengxr. The reason is that regression needs to have only one output to be consistent with RegressionModel in Spark ML. We did not find evidence that a multilayer perceptron with one output is widely used in research or in industry. We posted a JIRA issue indicating that use cases are needed to justify the implementation of this model: https://issues.apache.org/jira/browse/SPARK-10409. There was no discussion until now, and I am glad that we finally have it. I think we are still missing some strong motivating use cases. Could you provide a few references to research papers or industrial applications that rely on MLP regression?

The other way of addressing this problem would be to implement multilayer perceptron regression with multiple outputs. Justifying its usefulness might be simpler. We might need to implement a multivariate regression interface beforehand: https://issues.apache.org/jira/browse/SPARK-9120

+1 for separating the activation functions into another PR. Currently, there is no public API to specify activation functions in hidden layers.

@JeremyNixon
Contributor Author

@avulanov Great to hear from you! I'd love to give you a short tour of MLPR's use cases.
@jkbradley Wonderful to hear that you agree this should get in, and I'm happy to provide a few applications and results from academia and industry.

  1. Computer Vision
    a. Object Localization / Detection as DNN Regression
    b. Human Pose Regression
  2. Finance
    a. Currency Exchange Rate
    b. Stock Price Prediction
    c. Forecasting Financial Time Series
    d. Crude Oil Price Prediction
  3. Atmospheric Sciences
    a. Air Quality Prediction
    b. Carbon Dioxide Pollution Prediction
    c. Ozone Concentration Modeling
    d. Sulphur Dioxide Concentration Prediction
  4. Infrastructure
    a. Road Tunnel Cost Estimation
    b. Highway Engineering Cost Estimation
  5. Geophysics
    a. Meteorology and Oceanography Applications
    b. Pacific Sea Surface Temperature Prediction
    c. Hydrological Modeling
  6. Summary

Computer Vision

(Assumes we include convolutional and pooling layer types)

Detection as DNN Regression - Object Localization Detection

Precise object localization is necessary to track an object's shape or movement. The approach includes a regression layer which generates an object binary mask, a binary representation of the object in the image. This creates an object detector, learning the location of an object or even specific parts of an object in an image.
http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf

ImageNet winning solution for Object Localization

Overfeat: http://arxiv.org/pdf/1312.6229v4.pdf

It would be nice to support multiple outputs for an application like object localization -
“4.2, Regressor Training: The regression network takes as input the pooled feature maps from layer 5. It has 2 fully-connected hidden layers of size 4096 and 1024 channels, respectively. The final output layer has 4 units which specify the coordinates for the bounding box edges.”

Pose Regression

Estimates the pose of humans in video, with results significantly better than the previous state of the art. The method is able to detect sign language and generalizes to finding the location of elbows/hands/head, etc.
https://www.robots.ox.ac.uk/~vgg/publications/2014/Pfister14a/pfister14a.pdf

Finance

Currency Exchange Rate

Neural Network Regression for forecasting the exchange rate between currencies. NN outperforms standard ARIMA methodology for forecasting.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.52.2442

Accurate Currency Exchange Rate Forecasting using MLPR
http://liawww.epfl.ch/uploads/project_reports/report_282.pdf

Stock Price Prediction: Comparison of Methods

Neural Network Regression outperforms other regression methods in stock price prediction.
https://arxiv.org/pdf/1003.1457.pdf

Forecasting Financial Time Series

Applying deep regression networks to forecast market prices.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.15.8688&rep=rep1&type=pdf

Crude Oil Price Prediction

Spot price forecasting for world crude oil.
http://www.sciencedirect.com/science/article/pii/S0140988308000765

Atmospheric Sciences

Overview

There are numerous applications across the atmospheric sciences, where highly nonlinear relationships need to be appropriately modeled.
https://www.researchgate.net/publication/263416087_Artificial_Neural_Networks_The_Multilayer_Perceptron_-_A_Review_of_Applications_in_the_Atmospheric_Sciences

Air Quality Prediction

Modeling nonlinear relationship between meteorology and pollution for surface ozone concentrations in industrialized areas.
https://www.researchgate.net/profile/VR_Prybutok/publication/8612909_Prybutok_R._A_neural_network_model_forecasting_for_prediction_of_daily_maximum_ozone_concentration_in_an_industrialized_urban_area._Environ._Pollut._92(3)_349-357/links/0deec53babcab9c32f000000.pdf

Air Pollution Prediction - Carbon Dioxide

Neural Network Regression outperforms multiple linear regression for carbon dioxide air pollution prediction in China.
http://202.116.197.15/cadalcanton/Fulltext/21276_2014319_102457_186.pdf

Atmospheric Sulphur dioxide concentrations

Many applications of Neural Network Regression to air pollution, including predicting sulfur dioxide concentration.
http://cdn.intechweb.org/pdfs/17396.pdf

Ozone Concentration Comparison

Neural Networks for Regression outperform decision trees and linear regression for modeling nonlinear relationships required to predict ozone concentration.
https://www.researchgate.net/publication/263416130_Statistical_Surface_Ozone_Models_An_Improved_Methodology_to_Account_for_Non-Linear_Behaviour

Infrastructure

Road Tunnel Cost Estimation

Regression Neural Network leads to accurate cost estimation for road tunnels.
http://ascelibrary.org/doi/abs/10.1061/(ASCE)CO.1943-7862.0000479

Highway Engineering Cost Estimation

Neural Networks reliably predict the cost of highway construction projects.
http://www.jcomputers.us/vol5/jcp0511-19.pdf

Geophysics

Pacific Sea Surface Temperature

Sea surface temperature dynamics are nonlinear; presents an MLPR outperforming linear regression models over this domain.
http://www.ncbi.nlm.nih.gov/pubmed/16527455

Meteorology and Oceanography

Improving neural network methods for many tasks in meteorology and oceanography, including seasonal climate forecasting, various time series, satellite imagery analysis, ocean acoustics and more.
https://open.library.ubc.ca/cIRcle/collections/facultyresearchandpublications/32536/items/1.0041821

Hydrological Modeling

River flow forecasting from satellite data with neural networks.
http://hydrol-earth-syst-sci.net/13/1607/2009/hess-13-1607-2009.pdf
Modeling of nonlinear hydrological relationships for river basin (watershed) management.
http://jh.iwaponline.com/content/ppiwajhydro/10/1/3.full.pdf

@JeremyNixon
Contributor Author

+1 for multiple outputs. Deep NN regression with multiple outputs has achieved state-of-the-art performance on object localization tasks.

Let's have a conversation about the public API for activations but also for flexible neural network models in general - I'll put together a design doc and ping you on JIRA as well as break the activation functions into another PR.

@avulanov
Contributor

avulanov commented Jul 12, 2016

@JeremyNixon Thanks for the comprehensive list of references!

The internal API of Spark ANN is designed to be flexible and can handle different types of layers. However, only a part of the API is made public. We have to limit the number of public classes in order to make it simpler to support other languages. This forces us to use (String or Number) parameters instead of introducing new public classes. One of the options for specifying the architecture of an ANN is to use a text configuration with a layer-wise description. We have considered using the Caffe format for this. It gives the benefit of compatibility with a well-known deep learning tool and simplifies the support of other languages in Spark. Implementing a parser for a subset of the Caffe format might be the first step towards the support of general ANN architectures in Spark. However, other ANN features are of higher priority for Spark ML right now: https://issues.apache.org/jira/browse/SPARK-15581. In particular, Autoencoder and CNN. It would be great if you could help with them. For example, review the Autoencoder PR: #13621

With regards to the advanced ANN features, we are currently building a package that is intended to support them. Eventually, some of them might find their place in the main branch. It would be great to collaborate on this effort.

@avulanov
Contributor

avulanov commented Sep 9, 2016

@JeremyNixon I have released version 1.0.0 of the scalable-deeplearning package. This package is based on the implementation of artificial neural networks in Spark ML. It is intended for new Spark deep learning features that have not yet been merged to Spark ML or that are too specific to be merged. Contributions are very welcome. I think we can merge your MLP regression proposal after some modifications. Are you interested?

@JeremyNixon
Contributor Author

@avulanov I am interested - how about I replicate this PR at github.com/avulanov/scalable-deeplearning and we discuss details there?

@SparkQA

SparkQA commented Sep 12, 2016

Test build #65278 has finished for PR 13617 at commit 509cb23.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@avulanov
Contributor

@JeremyNixon Sounds good!

@SparkQA

SparkQA commented Nov 1, 2016

Test build #67929 has finished for PR 13617 at commit a5d9972.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 1, 2016

Test build #67932 has finished for PR 13617 at commit 322f3bd.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 2, 2016

Test build #67934 has finished for PR 13617 at commit be4c5ea.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 2, 2016

Test build #67933 has finished for PR 13617 at commit f3a1193.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 7, 2017

Test build #74122 has finished for PR 13617 at commit 16b7bc2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class MultilayerPerceptronRegressor @Since("2.0.2") (

@Nickersoft

@JeremyNixon @avulanov Any update on this? I noticed neither this PR nor the one on the deeplearning package was ever merged, and it is the only resource I can find regarding neural net-based regression in Spark.

@yolile

yolile commented Jan 12, 2018

@JeremyNixon @avulanov @MLnick @mengxr @jkbradley any update? Is this PR going to be merged?

@Neuw84

Neuw84 commented Jun 4, 2018

Hi,
Another important use case for the MLP regressor is forecasting machine sensor data. One of our clients uses this approach for predictive maintenance on Industry 4.0 assets. We were hoping to replace their custom implementation, built on an ad-hoc library, with the Spark ML implementation, but we are blocked until this gets merged.

Can't go into too much detail about the use case, but it's in production in industrial environments. The general approach is to predict one sensor value based on others.

Any updates?

P.S. Anyway, just in case it helps anyone, I have backported the pull request code to the 2.3 branch on my fork.

Neuw84 added a commit to Neuw84/spark that referenced this pull request Jun 5, 2018
Neuw84 added a commit to Neuw84/spark that referenced this pull request Jun 7, 2018
@d-kulikov

Hi, does anyone have an idea why this was abandoned?

@srowen
Member

srowen commented Oct 14, 2019

Generally speaking, I'd say this was superseded by third-party deep learning packages, several of which can be used on top of Spark.

@srowen srowen closed this Oct 26, 2019