IGNITE-5217: Add Gradient Descent and QR-based trainers for Linear Regression #3308

dmitrievanthony · 2017-12-28T09:22:33Z

No description provided.

zaleslaw

We should think about the OLS, QR, and SGD model-trainer hierarchy.

zaleslaw · 2017-12-28T09:31:46Z

...n/ml/org/apache/ignite/examples/ml/regression/linear/DistributedLinearRegressionExample.java

+     * @param model linear regression model
+     * @return formatted string representation
+     */
+    private static String formatLinearRegressionModelPrettyPrint(LinearRegressionModel model) {


Better to add this to LinearRegressionModel. I like the pretty print. You could also cap the output at 30 or 50 variables when printing the model.

Added to the toString method of LinearRegressionModel.
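For illustration, a capped pretty print of this kind could look roughly like the sketch below. It is only a sketch under stated assumptions: `PrettyModel`, its fields, and the `MAX_PRINTED_WEIGHTS` limit of 30 are hypothetical names chosen here, not the actual LinearRegressionModel code from this pull request.

```java
import java.util.Locale;

/** Hypothetical stand-in for a linear model; not the Ignite LinearRegressionModel. */
public class PrettyModel {
    /** Assumed cap on how many weights are printed (the review suggests 30 or 50). */
    private static final int MAX_PRINTED_WEIGHTS = 30;

    /** Model weights, one per input variable. */
    private final double[] weights;

    /** Intercept term. */
    private final double intercept;

    public PrettyModel(double[] weights, double intercept) {
        this.weights = weights.clone();
        this.intercept = intercept;
    }

    /** Prints the model as a formula, truncating after MAX_PRINTED_WEIGHTS terms. */
    @Override public String toString() {
        StringBuilder sb = new StringBuilder("y = ");
        int shown = Math.min(weights.length, MAX_PRINTED_WEIGHTS);

        for (int i = 0; i < shown; i++)
            sb.append(String.format(Locale.ROOT, "%.4f*x%d + ", weights[i], i));

        // Summarize the terms that were cut off instead of printing them all.
        if (shown < weights.length)
            sb.append("... (").append(weights.length - shown).append(" more terms) + ");

        sb.append(String.format(Locale.ROOT, "%.4f", intercept));
        return sb.toString();
    }
}
```

Keeping the formatting inside `toString` means examples and tests can simply print the model, and the cap keeps the output readable for wide datasets.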

zaleslaw · 2017-12-28T09:33:54Z

...n/ml/org/apache/ignite/examples/ml/regression/linear/DistributedLinearRegressionExample.java

+                System.out.println(">>> ---------------------------------");
+                System.out.println(">>> | Prediction\t| Ground Truth\t|");
+                System.out.println(">>> ---------------------------------");
+                for (double[] observation : data) {


The best approach here is to use LabeledDataset (but that can be a separate ticket).

Good idea, this will be done in a separate task.

zaleslaw · 2017-12-28T09:35:57Z

modules/ml/src/main/java/org/apache/ignite/ml/optimization/GradientDescent.java

+     * @return This gradient descent instance
+     */
+    public GradientDescent withMaxIterations(int maxIterations) {
+        if (maxIterations < 0)


We mostly use asserts to check input arguments. Let's discuss it with @ybabak @YuriBabak.

Replaced with asserts.

zaleslaw · 2017-12-28T09:37:02Z

modules/ml/src/main/java/org/apache/ignite/ml/optimization/GradientDescent.java

+        if (convergenceTol < 0)
+            throw new IllegalArgumentException("Convergence tolerance must be non-negative but got " + convergenceTol);
+        this.convergenceTol = convergenceTol;
+        return this;


We mostly don't use chained methods, only setters. Let's discuss it with @ybabak @YuriBabak.
Personally, I like chaining.

As discussed, we will keep this approach with chained methods.
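A minimal sketch of what the agreed style might look like, combining chained `with...` methods and assert-based argument checks. `DescentSettings`, its default values, and its getters are hypothetical names used only for illustration; this is not the GradientDescent class from the pull request.

```java
/** Hypothetical settings holder; not the Ignite GradientDescent class. */
public class DescentSettings {
    /** Maximum number of iterations (default value is an assumption). */
    private int maxIterations = 1000;

    /** Convergence tolerance (default value is an assumption). */
    private double convergenceTol = 1e-8;

    /** Sets the iteration limit and returns this instance for chaining. */
    public DescentSettings withMaxIterations(int maxIterations) {
        assert maxIterations >= 0 : "Max iterations must be non-negative but got " + maxIterations;

        this.maxIterations = maxIterations;
        return this;
    }

    /** Sets the convergence tolerance and returns this instance for chaining. */
    public DescentSettings withConvergenceTol(double convergenceTol) {
        assert convergenceTol >= 0 : "Convergence tolerance must be non-negative but got " + convergenceTol;

        this.convergenceTol = convergenceTol;
        return this;
    }

    public int maxIterations() {
        return maxIterations;
    }

    public double convergenceTol() {
        return convergenceTol;
    }
}
```

Usage would read as `new DescentSettings().withMaxIterations(50).withConvergenceTol(1e-6)`. Note that Java `assert` statements are only evaluated when the JVM runs with assertions enabled (`-ea`), so these checks do not guard production runs the way an `IllegalArgumentException` would.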

zaleslaw · 2017-12-28T09:38:08Z

modules/ml/src/main/java/org/apache/ignite/ml/optimization/GradientDescent.java

+     */
+    public Vector optimize(Matrix data, Vector initialWeights) {
+        Vector weights = initialWeights, oldWeights = null, oldGradient = null;
+        if (data instanceof SparseDistributedMatrix) {


Could we refactor it to one method to avoid copy-paste?

Good idea, done.

zaleslaw · 2017-12-28T09:45:21Z

modules/ml/src/main/java/org/apache/ignite/ml/regressions/linear/LinearRegressionQRTrainer.java

+        long ts1 = System.currentTimeMillis();
+        Vector groundTruth = extractGroundTruth(data);
+        Matrix inputs = extractInputs(data);
+        long ts2 = System.currentTimeMillis();


Don't forget to remove these timestamps.

zaleslaw · 2017-12-28T09:52:35Z

...s/ml/src/test/java/org/apache/ignite/ml/regressions/linear/ArtificialRegressionDatasets.java

+        83.6427296424, 27.4571268153, 73.5881193584, 27.1465364511, 79.4095449062}, -5.14077007134);
+
+    /** */
+    public static class Dataset {


Please rename this class to TestDataset or something similar, or use LabeledDataset after the update.

zaleslaw · 2017-12-28T09:56:24Z

...ml/org/apache/ignite/yardstick/ml/regression/IgniteOLSMultipleLinearRegressionBenchmark.java

-
-        // Check expected residuals from R
-        mdl.estimateResiduals();
+        LinearRegressionSGDTrainer trainer = new LinearRegressionSGDTrainer(100_000, 1e-12);


But this benchmark was about OLS regression, and you use SGDTrainer. That's strange.

Generally speaking, SGDTrainer also performs OLS regression, simply because it uses least squares as its loss function. Anyway, I replaced it with the QR trainer for better transparency.
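The point about the loss function can be illustrated with a small sketch: minimizing the squared-error loss by gradient descent should arrive at the same coefficients as a direct least-squares solution. The class `LeastSquaresDemo` and both methods are hypothetical and written for a single input variable. The sketch uses plain batch gradient descent and the textbook closed-form formula, not the SGD or QR trainers from this pull request.

```java
/** Hypothetical demo: gradient descent on squared loss vs. the closed-form OLS fit. */
public class LeastSquaresDemo {
    /** Closed-form simple linear regression. Returns {slope, intercept}. */
    static double[] closedForm(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0, meanY = 0;

        for (int i = 0; i < n; i++) {
            meanX += x[i] / n;
            meanY += y[i] / n;
        }

        double sxy = 0, sxx = 0;

        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }

        double slope = sxy / sxx;
        return new double[] {slope, meanY - slope * meanX};
    }

    /** Batch gradient descent on mean squared error. Returns {slope, intercept}. */
    static double[] gradientDescent(double[] x, double[] y, double learningRate, int iterations) {
        int n = x.length;
        double w = 0, b = 0;

        for (int it = 0; it < iterations; it++) {
            double gradW = 0, gradB = 0;

            // Gradient of (1/n) * sum((w * x + b - y)^2) with respect to w and b.
            for (int i = 0; i < n; i++) {
                double err = w * x[i] + b - y[i];
                gradW += 2 * err * x[i] / n;
                gradB += 2 * err / n;
            }

            w -= learningRate * gradW;
            b -= learningRate * gradB;
        }

        return new double[] {w, b};
    }
}
```

With a small enough learning rate and enough iterations, `gradientDescent` and `closedForm` agree to within numerical tolerance on well-conditioned data, which is why either kind of trainer can be described as fitting an OLS model.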

zaleslaw · 2017-12-28T09:57:05Z

modules/ml/src/test/resources/datasets/regression/boston.csv

+20.6, 4.83567, 0.0, 18.1, 0.0, 0.583, 5.905, 53.2, 3.1523, 24.0, 666.0, 20.2, 388.22, 11.45
+15.2, 0.15086, 0.0, 27.74, 0.0, 0.609, 5.454, 92.7, 1.8209, 4.0, 711.0, 20.1, 395.09, 18.06
+7.0, 0.18337, 0.0, 27.74, 0.0, 0.609, 5.414, 98.3, 1.7554, 4.0, 711.0, 20.1, 344.05, 23.97
+8.1, 0.20746, 0.0, 27.74, 0.0, 0.609, 5.093, 98.0, 1.8226, 4.0, 711.0, 20.1, 318.43, 29.68


Don't forget to add a README file with a short description of the attributes, licenses, or a link to the dataset repository.

README.txt with dataset descriptions added.

zaleslaw · 2017-12-28T09:58:36Z

...n/ml/org/apache/ignite/examples/ml/regression/linear/DistributedLinearRegressionExample.java

+                SparseDistributedMatrix distributedMatrix = new SparseDistributedMatrix(data);
+
+                System.out.println(">>> Create new linear regression trainer object.");
+                Trainer<LinearRegressionModel, Matrix> trainer = new LinearRegressionQRTrainer();


OLS can be solved by more than just QR decomposition.

Added separate examples for QRTrainer and SGDTrainer.

dmitrievanthony and others added 15 commits December 22, 2017 18:08

IGNITE-5217 Added initial version of Gradient Descent trainer of OLS regression

e95d8b6

Merge branch 'master' of https://github.com/apache/ignite into ignite-5217

1d67f26

IGNITE-5217 Move gradient descent to optimization package, add test datasets, add BarzilaiBorwein gradient descent updater.

bd76022

IGNITE-5217 Add artificial regression datasets and refactor SGD.

254dc78

IGNITE-5217 Improve code style in accordance with guidelines.

0b588c1

IGNITE-5217 Make LinearRegressionModel exportable.

22fb769

IGNITE-5217 Add simple linear regression trainer based on QR decomposition.

7c0f8a4

IGNITE-5217 Linear regression tests refactoring.

f2a9d34

Merge remote-tracking branch 'origin/master' into ignite-5217

92e64e9

IGNITE-5217 Add distributed matrices support to linear regression.

f29d385

IGNITE-5217 Add distributed gradient calculator.

ed6cd77

IGNITE-5217 Add SparseDistributedMatrixMapReducer.

053d751

IGNITE-5217 Revert accidental changes.

0792bee

IGNITE-5217 Revert accidental changes.

06fb44c

IGNITE-5217 Remove old classes of OLS regression.

bbf8183

zaleslaw reviewed Dec 28, 2017

dmitrievanthony added 5 commits December 28, 2017 15:10

IGNITE-5217 Fixes after pull request review.

6313f39

IGNITE-5217 Fixed typo in RegressionsTestSuite.

5917c80

IGNITE-5217 Fixes after pull request review.

f0f735a

IGNITE-5217 Fixes of datasets licence headers.

531f2b2

IGNITE-5217 Add missing licence header.

db860b2

asfgit closed this in b206085 Dec 28, 2017

dmitrievanthony deleted the ignite-5217 branch February 6, 2018 07:37
