add least squares regression cookbook #3064

sanuj · 2016-03-12T15:03:29Z

I'm not sure whether I should include the shogun-data subproject update in this commit.
@karlnapf Please review.

karlnapf · 2016-03-12T15:17:32Z

data

@@ -1 +1 @@
-Subproject commit c70a2dc726f7dfb6d60813e87fbd5a3fe3372069
+Subproject commit 0d5c0f60c839d5ddbbff61735a713cf17c209e9a


that does the job for data

karlnapf · 2016-03-12T15:18:17Z

The above build automatically generates this

karlnapf · 2016-03-12T15:21:26Z

I will review this soon, gotta run now

sanuj · 2016-03-12T18:06:23Z

doc/cookbook/source/examples/regression/least_squares_regression.rst

+Example
+-------
+
+Imagine we have files with training and test data. We create `CDenseFeatures` (here 64 bit floats aka RealFeatures) and :sgclass:`CRegressionLabels` as


Using :sgclass:`CDenseFeatures` gives a link error. It is not used in knn.sg either.

http://www.shogun-toolbox.org/CDenseFeatures redirects to http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CDenseFeatures.html which does not exist. The correct link for CDenseFeatures is http://www.shogun-toolbox.org/doc/en/latest/singletonshogun_1_1CDenseFeatures.html

Yes, that is a bug in Doxygen. For now we cannot link against CDenseFeatures, but that is easy to fix via grep later.

karlnapf · 2016-03-12T20:48:35Z

doc/cookbook/source/examples/regression/least_squares_regression.rst

+Least Squares Regression
+========================
+
+A Linear regression model can be defined as :math:`y_i = \bf{w}.\bf{x_i}` where :math:`y_i` is the predicted value, :math:`\bf{x_i}` is a feature vector and :math:`\bf{w}` is the weight vector. We aim to find the linear function that best explains the data, i.e. minimizes the error or loss function :math:`E(\bf{w})` by finding appropriate :math:`\bf{w}`.


Either use \cdot or, even better, w^\top x_i (also below).

karlnapf · 2016-03-12T21:10:30Z

Good page!
Sorry about all the change requests ... I am still making up my mind about how to do this best. But I think we are almost there.

BTW, please also add a test file at data/testsuite/meta/regression/least_squares_regression.dat in the data repository.

karlnapf · 2016-03-12T21:26:43Z

doc/cookbook/source/examples/regression/least_squares_regression.rst

+One can differentiate :math:`E(\bf{w})` with respect to :math:`\bf{w}` and equate to zero to determine the :math:`\bf{w}` that minimizes :math:`E(\bf{w})`. This leads to solution of the form:
+
+.. math::
+    {\bf w} = \left(\sum_{i=1}^N{\bf x}_i{\bf x}_i^T\right)^{-1}\left(\sum_{i=1}^N y_i{\bf x}_i\right)


btw here you used ^T, which is not consistent with the above. Also in LaTeX, this is ^\top
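
For reference, the step described in the quoted hunk can be written out as follows. This is a sketch that assumes :math:`E({\bf w})` is the plain sum of squared errors (no regulariser), which the quoted lines do not show explicitly, and it uses ^\top throughout:

```latex
\begin{aligned}
E({\bf w}) &= \sum_{i=1}^N \left(y_i - {\bf w}^\top {\bf x}_i\right)^2 \\
\nabla_{\bf w} E({\bf w}) &= -2 \sum_{i=1}^N \left(y_i - {\bf w}^\top {\bf x}_i\right) {\bf x}_i = 0 \\
\left(\sum_{i=1}^N {\bf x}_i {\bf x}_i^\top\right) {\bf w} &= \sum_{i=1}^N y_i {\bf x}_i \\
{\bf w} &= \left(\sum_{i=1}^N {\bf x}_i {\bf x}_i^\top\right)^{-1} \left(\sum_{i=1}^N y_i {\bf x}_i\right)
\end{aligned}
```

The third line uses :math:`({\bf w}^\top {\bf x}_i)\,{\bf x}_i = {\bf x}_i {\bf x}_i^\top {\bf w}`, and the last line requires the summed outer-product matrix to be invertible.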

karlnapf · 2016-03-12T21:53:19Z

OK one more thing. This class is just a wrapper for linear ridge regression with 0 regulariser.
Could you just write a page for the ridge regression, and then mention that there is a wrapper class called LeastSquaresRegression which just sets the regulariser to 0?
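
To illustrate the relationship described above, here is a minimal NumPy sketch. It is not Shogun's implementation: it assumes the standard closed-form ridge solution :math:`{\bf w} = (X^\top X + \tau I)^{-1} X^\top {\bf y}` with one sample per row of `X`, and the name `ridge_fit` and the synthetic data are hypothetical, chosen only for this example.

```python
import numpy as np


def ridge_fit(X, y, tau=0.0):
    """Closed-form ridge weights (hypothetical helper, not a Shogun API).

    X has shape (N, D) with one sample per row; y has shape (N,).
    With tau == 0 this reduces to ordinary least squares.
    """
    n_features = X.shape[1]
    # Regularised normal equations: (X^T X + tau * I) w = X^T y
    A = X.T @ X + tau * np.eye(n_features)
    b = X.T @ y
    return np.linalg.solve(A, b)


# Synthetic data, made up for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=50)

w_ls = ridge_fit(X, y, tau=0.0)                 # least squares as the tau = 0 case
w_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]  # NumPy's least-squares solver
w_ridge = ridge_fit(X, y, tau=10.0)             # positive tau shrinks the weights
```

With `tau=0.0` the result should agree with `np.linalg.lstsq` up to numerical precision, while a positive `tau` gives a weight vector with a smaller norm.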

sanuj · 2016-03-13T19:48:07Z

@karlnapf Updated the commit.

karlnapf · 2016-03-13T23:15:45Z

doc/cookbook/source/examples/regression/least_squares_regression.rst

+
+where :math:`t_i` is a true label and :math:`N` is the number of testing samples.
+
+:sgclass:`CLeastSquaresRegression` is a wrapper class for :sgclass:`CLinearRidgeRegression` with regularization coefficient :math:`\tau = 0`.


sorry you got me wrong here. I meant, rename the example to be one for CLinearRidgeRegression (including the math, which is almost the same), and then mention that LeastSquares is a special case -- rather than the other way around. Get what I mean?

karlnapf · 2016-03-13T23:23:09Z

Sweet, this is starting to look good.
I wrote some more comments, mostly minor -- but I really want to get the first few pages right so that people have an easier time writing the next ones.
Most importantly, make this an example for ridge regression. No need to have one for least squares. You can just mention this in the text. Ah I just realise that the data file then also has to be renamed and re-calculated -- I should not have merged it

sanuj · 2016-03-14T09:31:47Z

@karlnapf Sorry, I didn't pay attention to the MSE/accuracy thing. Everything has been corrected now.

karlnapf · 2016-03-14T12:27:13Z

doc/cookbook/source/examples/regression/linear_ridge_regression.rst

+A linear ridge regression model can be defined as :math:`y_i = \bf{w}^\top\bf{x_i}` where :math:`y_i` is the predicted value, :math:`\bf{x_i}` is a feature vector and :math:`\bf{w}` is the weight vector. We aim to find the linear function that best explains the data, i.e. minimizes the error or loss function :math:`E(\bf{w})` by finding appropriate :math:`\bf{w}`.
+
+.. math::
+    E({\bf{w}}) = \sum_{i=1}^N(y_i-{\bf w}^\top {\bf x}_i)^2 + \tau||{\bf w}||^2


This should go, see my earlier comment

karlnapf · 2016-03-14T12:28:27Z

doc/cookbook/source/examples/regression/linear_ridge_regression.rst

+
+.. sgexample:: linear_ridge_regression.sg:create_features
+
+We create an instance of :sgclass:`CLinearRidgeRegression` classifier, passing it training data and labels.


just mention that you pass \tau. See knn http://buildbot.shogun-toolbox.org/cookbook_pr/cbdbb335010353edc98a37ebc2713fc0bdf00762/examples/classifier/knn.html

karlnapf · 2016-03-14T12:31:59Z

All comments are minor. The data file does not have to be touched anymore. I'll merge data and wait for the final update on this one.

NICE! :)

sanuj · 2016-03-14T14:36:18Z

@karlnapf Updated, but one of the Travis builds got terminated.

vigsterkr · 2016-03-14T16:20:20Z

@sanuj There's something else wrong here, because I've re-run the gcc task and it's still timing out.

karlnapf · 2016-03-14T17:29:07Z

doc/cookbook/source/examples/regression/linear_ridge_regression.rst

+Linear Ridge Regression
+=======================
+
+A linear ridge regression model can be defined as :math:`y_i = \bf{w}^\top\bf{x_i}` where :math:`y_i` is the predicted value, :math:`\bf{x_i}` is a feature vector and :math:`\bf{w}` is the weight vector. We aim to find the linear function that best explains the data, i.e. minimizes the error or loss function :math:`E(\bf{w})` by finding appropriate :math:`\bf{w}`. One can show the solution can be written as:


E(w) is not defined and should not be used. Just say "minimizes the squared loss plus a :math:`L_2` regularisation term"

(also remove: by finding appropriate w)

karlnapf · 2016-03-14T21:27:13Z

no idea what is wrong with travis?

karlnapf · 2016-03-14T21:28:40Z

the knn example times out ...

karlnapf · 2016-03-14T21:37:01Z

I just tried this locally and it did work. No idea :(
Just compiling locally with the same cmake options

karlnapf · 2016-03-14T23:12:54Z

Very weird. travis passed after a few restarts....I'll merge for now

@vigsterkr any ideas?

karlnapf · 2016-03-14T23:57:45Z

Something cheesy is going on there. The merged build failed.
I then managed to reproduce this freeze locally, running cmake with only -DENABLE_TESTING=ON
@sanuj Can you investigate this a bit? I will look into it tomorrow as well.

karlnapf · 2016-03-14T23:57:58Z

This has high priority as the develop build is broken

vigsterkr · 2016-03-15T05:36:23Z

i told you not to merge this....

sanuj · 2016-03-15T06:29:48Z

All tests pass with -DENABLE_TESTING=ON:

.
.
.
        Start 275: generated_cpp-kernel_ridge_regression
275/275 Test #275: generated_cpp-kernel_ridge_regression ..............................   Passed    0.01 sec

100% tests passed, 0 tests failed out of 275

Total Test time (real) =  15.84 sec

With -DENABLE_TESTING=ON and -DPythonModular=ON, a total of 209 tests fail (python modular and generated python) on my local machine. This has happened before and resolved itself, but it is happening again (I don't know why).
In both the cases, sometimes the build gets stuck on Unit-LaRank.
@karlnapf do you think updating the data dir with new commits can fix it?

karlnapf · 2016-03-15T10:07:26Z

@sanuj @vigsterkr the timeout has nothing to do with this cookbook page here -- it happened before and it is the knn that times out. @sanuj It also has nothing to do with modular tests failing.

sanuj force-pushed the cookbook branch 3 times, most recently from 7879a8a to 4ca7374 Compare March 12, 2016 19:03

sanuj force-pushed the cookbook branch 3 times, most recently from cba74f3 to c6b85d6 Compare March 13, 2016 19:22

sanuj force-pushed the cookbook branch from c6b85d6 to cbdbb33 Compare March 14, 2016 08:57

sanuj force-pushed the cookbook branch from cbdbb33 to 3a527ea Compare March 14, 2016 12:57

add linear ridge regression in cookbook

4b6eb6b

sanuj force-pushed the cookbook branch from 3a527ea to 4b6eb6b Compare March 14, 2016 17:43

karlnapf added a commit that referenced this pull request Mar 14, 2016

Merge pull request #3064 from sanuj/cookbook

32270be

add least squares regression cookbook

karlnapf merged commit 32270be into shogun-toolbox:develop Mar 14, 2016

karlnapf mentioned this pull request Mar 15, 2016

Add cookbook entry for Gaussian Naive Bayes example #3066

Closed
