add least squares regression cookbook #3064
@@ -1 +1 @@
Subproject commit c70a2dc726f7dfb6d60813e87fbd5a3fe3372069
Subproject commit 0d5c0f60c839d5ddbbff61735a713cf17c209e9a
that does the job for data
The above build automatically generates this
I will review this soon, gotta run now
Example
-------

Imagine we have files with training and test data. We create `CDenseFeatures` (here 64 bit floats aka RealFeatures) and :sgclass:`CRegressionLabels` as
Putting :sgclass:`CDenseFeatures` gives a link error. It's not there in knn.sg either.
http://www.shogun-toolbox.org/CDenseFeatures
redirects to http://www.shogun-toolbox.org/doc/en/latest/classshogun_1_1CDenseFeatures.html
which does not exist. The correct link for CDenseFeatures is http://www.shogun-toolbox.org/doc/en/latest/singletonshogun_1_1CDenseFeatures.html
yes, that is a bug in doxygen. For now we cannot link against CDenseFeatures. But that is easy to fix via grep later
7879a8a to 4ca7374
Least Squares Regression
========================

A Linear regression model can be defined as :math:`y_i = \bf{w}.\bf{x_i}` where :math:`y_i` is the predicted value, :math:`\bf{x_i}` is a feature vector and :math:`\bf{w}` is the weight vector. We aim to find the linear function that best explains the data, i.e. minimizes the error or loss function :math:`E(\bf{w})` by finding appropriate :math:`\bf{w}`.
either use \cdot or even better w^\top x_i (also below)
Good page! BTW, also please add a test file in
One can differentiate :math:`E(\bf{w})` with respect to :math:`\bf{w}` and equate to zero to determine the :math:`\bf{w}` that minimizes :math:`E(\bf{w})`. This leads to solution of the form:

.. math::
    {\bf w} = \left(\sum_{i=1}^N{\bf x}_i{\bf x}_i^T\right)^{-1}\left(\sum_{i=1}^N y_i{\bf x}_i\right)
btw here you used ^T, which is not consistent with the above. Also in LaTeX, this is ^\top
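For readers following the math in this hunk, the closed-form solution quoted above can be checked numerically. The following is a minimal NumPy sketch, not Shogun code; the function name `least_squares_weights` and the toy data are assumptions made purely for illustration.

```python
import numpy as np

def least_squares_weights(X, y):
    """Solve (sum_i x_i x_i^T) w = sum_i y_i x_i for w.

    X holds one sample per row (shape N x D); y has shape (N,).
    Hypothetical helper for illustration, not part of Shogun.
    """
    A = X.T @ X   # equals sum_i x_i x_i^T, shape D x D
    b = X.T @ y   # equals sum_i y_i x_i, shape D
    # Solving the linear system avoids forming the explicit inverse.
    return np.linalg.solve(A, b)

# Toy data generated without noise from the weights (2, -3); assumed values.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = X @ np.array([2.0, -3.0])
w = least_squares_weights(X, y)
```

Because the toy labels are noise free, the recovered `w` should match the generating weights up to floating-point error.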
OK one more thing. This class is just a wrapper for linear ridge regression with 0 regulariser.
cba74f3 to c6b85d6
@karlnapf Updated the commit.

where :math:`t_i` is a true label and :math:`N` is the number of testing samples.

:sgclass:`CLeastSquaresRegression` is a wrapper class for :sgclass:`CLinearRidgeRegression` with regularization coefficient :math:`\tau = 0`.
sorry you got me wrong here. I meant, rename the example to be one for CLinearRidgeRegression (including the math, which is almost the same), and then mention that LeastSquares is a special case -- rather than the other way around. Get what I mean?
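The relationship under discussion, least squares as ridge regression with :math:`\tau = 0`, can be illustrated outside Shogun. This is a minimal NumPy sketch, assuming the objective is the sum of squared errors plus :math:`\tau||{\bf w}||^2`, whose minimizer satisfies :math:`(X^\top X + \tau I){\bf w} = X^\top {\bf y}`. The name `ridge_weights` and the data are hypothetical, and nothing here claims to mirror how `CLinearRidgeRegression` is implemented internally.

```python
import numpy as np

def ridge_weights(X, y, tau):
    """Minimize sum_i (y_i - w^T x_i)^2 + tau * ||w||^2 in closed form.

    Hypothetical helper for illustration, not part of Shogun.
    """
    D = X.shape[1]
    A = X.T @ X + tau * np.eye(D)   # tau = 0 gives the plain normal equations
    return np.linalg.solve(A, X.T @ y)

# Small made-up regression problem with slightly noisy labels.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y = np.array([2.1, -2.9, -1.2, 1.0])

w_ls = ridge_weights(X, y, tau=0.0)     # ordinary least squares
w_ridge = ridge_weights(X, y, tau=1.0)  # shrunk towards zero
```

With `tau=0.0` the result coincides with ordinary least squares, while a positive `tau` shrinks the norm of the weight vector.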
Sweet, this is starting to look good.
@karlnapf sorry i didn't pay attention to the mse-accuracy thing. Everything has been corrected now.
A linear ridge regression model can be defined as :math:`y_i = \bf{w}^\top\bf{x_i}` where :math:`y_i` is the predicted value, :math:`\bf{x_i}` is a feature vector and :math:`\bf{w}` is the weight vector. We aim to find the linear function that best explains the data, i.e. minimizes the error or loss function :math:`E(\bf{w})` by finding appropriate :math:`\bf{w}`.

.. math::
    E({\bf{w}}) = \sum_{i=1}^N(y_i-{\bf w}^\top {\bf x}_i)^2 + \tau||{\bf w}||^2
This should go, see my earlier comment

.. sgexample:: linear_ridge_regression.sg:create_features

We create an instance of :sgclass:`CLinearRidgeRegression` classifier, passing it training data and labels.
just mention that you pass \tau. See knn http://buildbot.shogun-toolbox.org/cookbook_pr/cbdbb335010353edc98a37ebc2713fc0bdf00762/examples/classifier/knn.html
All comments are minor. The data file does not have to be touched anymore. I'll merge data and wait for the final update on this one. NICE! :)
@karlnapf updated but the one travis build got terminated.
@sanuj there's something else with this because i've re-ran the gcc task and it's still timing out.
Linear Ridge Regression
=======================

A linear ridge regression model can be defined as :math:`y_i = \bf{w}^\top\bf{x_i}` where :math:`y_i` is the predicted value, :math:`\bf{x_i}` is a feature vector and :math:`\bf{w}` is the weight vector. We aim to find the linear function that best explains the data, i.e. minimizes the error or loss function :math:`E(\bf{w})` by finding appropriate :math:`\bf{w}`. One can show the solution can be written as:
E(w) is not defined and should not be used. Just say "minimizes the squared loss plus a :math:`L_2` regularisation term" (also remove: "by finding appropriate w")
no idea what is wrong with travis?
the knn example times out ...
I just tried this locally and it did work. No idea :(
Very weird. travis passed after a few restarts....I'll merge for now @vigsterkr any ideas?
add least squares regression cookbook
Something cheesy is going on there. The merged build failed.
This has high priority as the develop build is broken
i told you not to merge this....
All tests pass with
With
@sanuj @vigsterkr the timeout has nothing to do with this cookbook page here -- it happened before and it is the knn that times out. @sanuj It also has nothing to do with modular tests failing.
I'm not sure if i should add the shogun-data subproject thing in this commit.
@karlnapf Please review.