Commit: add least squares regression cookbook
Showing 4 changed files with 66 additions and 31 deletions.
Submodule data updated: 8 files
doc/cookbook/source/examples/regression/least_squares_regression.rst (38 additions, 0 deletions)
@@ -0,0 +1,38 @@
========================
Least Squares Regression
========================

A linear regression model can be defined as :math:`y_i = {\bf w}\cdot{\bf x}_i`, where :math:`y_i` is the predicted value, :math:`{\bf x}_i` is a feature vector and :math:`{\bf w}` is the weight vector. We aim to find the linear function that best explains the data, i.e. the :math:`{\bf w}` that minimizes the error, or loss, function :math:`E({\bf w})`:

.. math::

    E({\bf w}) = \sum_{i=1}^N (y_i - {\bf w}\cdot{\bf x}_i)^2

Differentiating :math:`E({\bf w})` with respect to :math:`{\bf w}` and equating to zero gives the minimizing :math:`{\bf w}` in closed form:

.. math::

    {\bf w} = \left(\sum_{i=1}^N {\bf x}_i{\bf x}_i^T\right)^{-1}\left(\sum_{i=1}^N y_i{\bf x}_i\right)
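The closed-form solution above can be illustrated outside of Shogun with a short NumPy sketch (a hypothetical stand-alone example with synthetic data, not part of the cookbook listing):

```python
import numpy as np

# Synthetic data: labels generated by a known weight vector plus small noise
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
X = rng.uniform(-1.0, 1.0, size=(200, 2))      # rows are the feature vectors x_i
y = X @ true_w + 0.01 * rng.normal(size=200)   # labels y_i

# w = (sum_i x_i x_i^T)^{-1} (sum_i y_i x_i), i.e. solve (X^T X) w = X^T y
w = np.linalg.solve(X.T @ X, X.T @ y)
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit matrix inverse, which is numerically preferable; the recovered `w` should be close to `true_w`.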

-------
Example
-------

Imagine we have files with training and test data. We create :sgclass:`CDenseFeatures` (here 64-bit floats, aka RealFeatures) and :sgclass:`CRegressionLabels` as

.. sgexample:: least_squares_regression.sg:create_features

We create an instance of the :sgclass:`CLeastSquaresRegression` classifier, passing it training data and labels.

.. sgexample:: least_squares_regression.sg:create_instance

Then we train the :sgclass:`CLeastSquaresRegression` instance and apply it to the test data.

.. sgexample:: least_squares_regression.sg:train_and_apply

After prediction, we can output the predicted :math:`y_i` and compute the mean squared error using :sgclass:`CMeanSquaredError`.

.. sgexample:: least_squares_regression.sg:ouput_and_calculate_error
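For orientation, the mean squared error reported here is the average squared residual between predictions and true labels; a minimal NumPy sketch with made-up numbers (independent of the Shogun API):

```python
import numpy as np

# Hypothetical predictions and ground-truth labels, for illustration only
labels_predict = np.array([1.1, 1.9, 3.2])
labels_test = np.array([1.0, 2.0, 3.0])

# Mean squared error: (1/N) * sum_i (yhat_i - y_i)^2
mse = np.mean((labels_predict - labels_test) ** 2)
# mse is 0.02 for these values: (0.01 + 0.01 + 0.04) / 3
```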

----------
References
----------

:wiki:`Ordinary_least_squares`
@@ -0,0 +1,27 @@
CSVFile f_feats_train("../data/regression_1d_linear_features_train.dat")
CSVFile f_feats_test("../data/regression_1d_linear_features_test.dat")
CSVFile f_labels_train("../data/regression_1d_linear_labels_train.dat")
CSVFile f_labels_test("../data/regression_1d_linear_labels_test.dat")

#![create_features]
RealFeatures features_train(f_feats_train)
RealFeatures features_test(f_feats_test)
RegressionLabels labels_train(f_labels_train)
RegressionLabels labels_test(f_labels_test)
#![create_features]

#![create_instance]
LeastSquaresRegression lsr(features_train, labels_train)
#![create_instance]

#![train_and_apply]
lsr.train()
Labels labels_predict = lsr.apply(features_test)
#![train_and_apply]

#![ouput_and_calculate_error]
MeanSquaredError mse()
real error = mse.evaluate(labels_predict, labels_test)
RealVector output = labels_predict.get_values()
#![ouput_and_calculate_error]
examples/undocumented/python_modular/regression_least_squares_modular.py (0 additions, 30 deletions)
This file was deleted.