add least squares regression cookbook
sanuj committed Mar 12, 2016
1 parent fd151b5 commit 7879a8a
Showing 4 changed files with 66 additions and 31 deletions.
@@ -0,0 +1,38 @@
========================
Least Squares Regression
========================

A linear regression model can be defined as :math:`y_i = {\bf w}\cdot{\bf x}_i`, where :math:`y_i` is the predicted value, :math:`{\bf x}_i` is a feature vector and :math:`{\bf w}` is the weight vector. We aim to find the linear function that best explains the data, i.e. the :math:`{\bf w}` that minimizes the error (loss) function :math:`E({\bf w})`.

.. math::

   E({\bf w}) = \sum_{i=1}^N(y_i-{\bf w}\cdot {\bf x}_i)^2

One can differentiate :math:`E({\bf w})` with respect to :math:`{\bf w}` and set the derivative to zero to determine the :math:`{\bf w}` that minimizes :math:`E({\bf w})`. This leads to a solution of the form:

.. math::

   {\bf w} = \left(\sum_{i=1}^N{\bf x}_i{\bf x}_i^T\right)^{-1}\left(\sum_{i=1}^N y_i{\bf x}_i\right)

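
The closed-form solution above can be sketched in a few lines of plain Python. This is only an illustration of the formula, not Shogun's implementation: the function names ``fit_least_squares`` and ``predict`` and the toy data are assumptions made for this example, and, like the model above, the sketch has no bias term.

```python
def fit_least_squares(xs, ys):
    """Return w solving (sum_i x_i x_i^T) w = sum_i y_i x_i."""
    d = len(xs[0])
    # Accumulate A = sum_i x_i x_i^T and b = sum_i y_i x_i.
    A = [[0.0] * d for _ in range(d)]
    b = [0.0] * d
    for x, y in zip(xs, ys):
        for r in range(d):
            b[r] += y * x[r]
            for c in range(d):
                A[r][c] += x[r] * x[c]
    # Solve A w = b by Gaussian elimination with partial pivoting
    # instead of forming the inverse explicitly.
    M = [row + [rhs] for row, rhs in zip(A, b)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("sum of x_i x_i^T is singular; no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, d):
            factor = M[r][col] / M[col][col]
            for c in range(col, d + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    w = [0.0] * d
    for r in range(d - 1, -1, -1):
        s = sum(M[r][c] * w[c] for c in range(r + 1, d))
        w[r] = (M[r][d] - s) / M[r][r]
    return w


def predict(w, x):
    """Evaluate the linear model y = w . x for one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical toy data generated from y = 2*x1 - 1*x2 without noise.
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
ys = [2.0, -1.0, 1.0, 3.0]
w = fit_least_squares(xs, ys)
```

Because the toy labels are noise-free, the recovered weights match the generating coefficients up to floating-point error. Solving the linear system directly, rather than inverting the matrix, is a common choice for numerical stability.
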
-------
Example
-------

Imagine we have files with training and test data. We create :sgclass:`CDenseFeatures` (here with 64-bit floats, also known as RealFeatures) and :sgclass:`CRegressionLabels` as follows:

.. sgexample:: least_squares_regression.sg:create_features

We create an instance of :sgclass:`CLeastSquaresRegression`, passing it the training features and labels.

.. sgexample:: least_squares_regression.sg:create_instance

Then we call ``train`` on the :sgclass:`CLeastSquaresRegression` instance and apply the trained model to the test data.

.. sgexample:: least_squares_regression.sg:train_and_apply

After prediction, we can extract the predicted :math:`y_i` and compute the mean squared error using :sgclass:`CMeanSquaredError`.

.. sgexample:: least_squares_regression.sg:ouput_and_calculate_error
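
For reference, the mean squared error is the average of the squared differences between predictions and true labels. The following plain-Python sketch uses that standard definition; the helper name ``mean_squared_error`` and the sample values are assumptions for illustration and are not part of Shogun.

```python
def mean_squared_error(predicted, actual):
    """Average of squared differences between predictions and labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one sample is required")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical predictions and ground-truth labels.
labels_predict = [1.0, 2.0, 4.0]
labels_test = [1.0, 3.0, 2.0]
mse = mean_squared_error(labels_predict, labels_test)
```

A lower value indicates that the predictions lie closer to the true labels; a value of zero means they agree exactly.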

----------
References
----------
:wiki:`Ordinary_least_squares`
27 changes: 27 additions & 0 deletions examples/meta/src/regression/least_squares_regression.sg
@@ -0,0 +1,27 @@
CSVFile f_feats_train("../data/regression_1d_linear_features_train.dat")
CSVFile f_feats_test("../data/regression_1d_linear_features_test.dat")
CSVFile f_labels_train("../data/regression_1d_linear_labels_train.dat")
CSVFile f_labels_test("../data/regression_1d_linear_labels_test.dat")

#![create_features]
RealFeatures features_train(f_feats_train)
RealFeatures features_test(f_feats_test)
RegressionLabels labels_train(f_labels_train)
RegressionLabels labels_test(f_labels_test)
#![create_features]

#![create_instance]
LeastSquaresRegression lsr(features_train, labels_train)
#![create_instance]

#![train_and_apply]
lsr.train()
Labels labels_predict = lsr.apply(features_test)
#![train_and_apply]


#![ouput_and_calculate_error]
MeanSquaredError mse()
real accuracy = mse.evaluate(labels_predict, labels_test)
RealVector output = labels_predict.get_values()
#![ouput_and_calculate_error]

This file was deleted.
